<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Testing a Citation and Text-Based Framework for Retrieving Publications for Literature Reviews</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>M. Janina Sarol</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Linxi Liu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jodi Schneider</string-name>
          <email>jodig@illinois.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Illinois at Urbana-Champaign</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>22</fpage>
      <lpage>33</lpage>
      <abstract>
        <p>We propose a citation- and text-based framework for conducting literature review searches. Given a small set of articles included in a literature review (i.e. seed articles), the first step of the framework retrieves articles that are connected to the seed articles in the citation network. The next step filters these retrieved articles using hybrid citation- and text-based criteria. In this paper, we evaluate a first implementation of this framework (code available at https://github.com/janinaj/lit-review-search) by comparing it to the conventional search methods used to retrieve the included studies of 6 published systematic reviews. Using different combinations of 3 seed articles, on average we retrieved 71.2% of the total included studies in the published reviews and 82.33% of the studies available in the search database (Scopus). Our best combinations retrieved 87% of the total included studies, which comprised 100% of the studies available in Scopus. In 5 of the 6 reviews, we reduced the number of results by 34–88%, which in practice would save reviewers significant time, since fewer search results would need to be manually screened. These results suggest that our framework is a promising approach to improving the literature review search process.</p>
      </abstract>
      <kwd-group>
        <kwd>citation relationships</kwd>
        <kwd>text mining</kwd>
        <kwd>literature review</kwd>
        <kwd>systematic search</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Scholarly output is large and fast-growing: as of 2018, Scopus alone covers 69
million publications, comprising journals, conference proceedings, and books<sup>1</sup>,
and this may double by 2027 as scholarly output grows about 8% each year [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Staying up to date in such an environment is difficult, especially with an increase
in interdisciplinary work. This makes literature reviews important, but
time-consuming to conduct.
      </p>
      <p>
        There are multiple types of literature reviews, and each type has different
specific goals [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. For instance, a state-of-the-art review may focus on current
literature and emerging priorities, while rapid reviews may support policymaking
by assessing what is already known on a practical topic. Systematic searching
is useful for all types of literature reviews, but it is fundamental for systematic
reviews, which seek 100% recall. Systematic reviews try to find all available
evidence pertaining to a given research question. Thus, they become increasingly
time-consuming and difficult to conduct as the literature grows. All retrieved
search results must be screened, typically manually, and classified as
relevant or irrelevant.
      </p>
      <p>
        Even small improvements in the search process for literature reviews could
help researchers retrieve relevant publications more efficiently. Ross-White and
Godfrey [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] studied the precision of high-recall searches used for 8 systematic
reviews. They calculated that an average of 142 results needed to be screened
to find 1 relevant paper. The 8 reviews they described screened a total of 17,378
abstracts to find 122 relevant articles. The time required for screening can be
substantial. Bannach-Brown et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] suggested that 1 person can screen an
estimated 1,879 results per month. Librarians reported routinely spending 40-60
hours to develop search queries that still return thousands of results that need
to be manually screened to find a handful of relevant articles [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Alternative or complementary approaches to conventional term- and
concept-based search methods are needed, and current work in this area is promising.
For instance, CitNetExplorer was originally designed to study the evolution of
science, but its citation network visualizations can also help systematically
retrieve publications [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. New approaches can also take advantage of additional
publication data, which is increasingly available for electronic access and
efficient retrieval. Scopus, a large scientific database, provides citation information
for indexed articles. A public domain corpus of citation information,
OpenCitations [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], reportedly contains reference lists for 50% of Crossref-indexed
publications as of 2018.<sup>2</sup> Meanwhile, many publishers provide full-text access to their
content, and text mining of licensed content is increasingly feasible.<sup>3</sup> These
additional data sources allow for the development of novel techniques that leverage
different kinds of information.
      </p>
      <p>We propose a citation- and text-based framework for conducting literature
review searches. Our approach differs from conventional search methods in that
we use publications ("seed articles") as our starting point, rather than identifying
search strings. We also use the citation network of seed articles as our search and
retrieval space. We then filter the results by removing publications with weak
citation and topical relationships with the seed articles.</p>
      <p>We envision this framework to be useful for different types of literature
reviews. In this paper, we test a first implementation of our framework on 6
systematic reviews.</p>
      <p>In Section 2, we review related work on both citation- and text-based
information retrieval. In Section 3, we describe our framework, a sample
implementation, and an experimental evaluation. In Section 4 we report our results, which
we analyze and discuss in Section 5. Finally, in Section 6, we conclude the paper.</p>
      <p><sup>2</sup> https://i4oc.org/#faqs</p>
      <p><sup>3</sup> e.g. through the Crossref Text and Data Mining APIs, http://tdmsupport.crossref.org</p>
    </sec>
    <sec id="sec-2">
      <sec id="sec-2-1">
        <title>Related Work</title>
        <sec id="sec-2-1-1">
          <title>Text-based Techniques for Information Retrieval</title>
          <p>
            Topic modeling is one of the text mining techniques that has been frequently
used for information retrieval-based tasks. Wang, McCallum, and Wei found that
the use of topical phrases can improve the performance of information retrieval
systems [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ]. Combining collaborative filtering and topic modeling has also been
shown to be a promising approach in recommending scientific publications [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ].
          </p>
          <p>
            Text mining has also been used specifically for systematic review tasks. A
2015 systematic review by O'Mara-Eves et al. [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ] provides a detailed discussion
of proposed solutions for screening documents. More recent approaches include
a text-mining framework for screening documents for systematic reviews
introduced by Li et al. [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ] and a semi-supervised approach for screening relevant
documents developed by Kontonatsios et al. [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ].
          </p>
        </sec>
        <sec id="sec-2-1-2">
          <title>Citation-based Techniques for Information Retrieval</title>
          <p>
            Citation-based methods have also been proposed for retrieving and ranking
relevant scientific publications. In a field study using real searches in health science
libraries in the early 1990s, Pao [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ] found that citation searching added
an average of 24% recall. Recent approaches include using term
frequency-inverse document frequency metrics, commonly used for text-based ranking, to
rank co-cited papers [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ] and citation proximity analysis to recommend scientific
publications [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ].
          </p>
          <p>
            Belter [
            <xref ref-type="bibr" rid="ref16">16</xref>
            ] explored a citation-based approach for retrieving studies for
inclusion in systematic reviews, with promising results, in
particular substantial increases in precision. Our implementation bases its search and
citation-based filtering steps on Belter's approach; we add text-based
filtering and further automation. Belter's test set also inspired the experiment
we describe below. We use 6 of the 14 systematic reviews in Belter's study [
            <xref ref-type="bibr" rid="ref16">16</xref>
            ].
          </p>
        </sec>
        <sec id="sec-2-1-3">
          <title>Hybrid Techniques for Information Retrieval</title>
          <p>
            Wolfram [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ] emphasized the synergy between information retrieval,
bibliometrics, and natural language processing. Adopting and integrating methods across
these domains seems natural, especially with the increasing availability of
citation data and full-text papers.
          </p>
          <p>
            Glänzel [
            <xref ref-type="bibr" rid="ref18">18</xref>
            ] proposed the use of bibliometrics-aided retrieval and hybrid
methods for studying scholarly disciplines. Silva et al. [
            <xref ref-type="bibr" rid="ref19">19</xref>
            ] demonstrated the
utility of a hybrid citation- and text-based approach for science mapping.
However, we were not able to find prior frameworks that combine citation- and
text-based methods to aid literature review searches. We hypothesize that such
a hybrid approach would also be useful in searching for studies for literature
reviews.
          </p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>Methods</title>
        <sec id="sec-2-2-1">
          <title>Proposed Framework</title>
          <p>We propose a three-step framework for searching and filtering articles for
literature reviews, starting from one or more seed articles.
1. Select seed article(s): Identify 1 or more publications relevant for inclusion
in the review to use as seed articles.
2. Search: Collect papers connected by citation relationships to at least one
seed article.</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>3. Filter:</title>
          <p>(a) Citation-based: Remove papers with weak citation relationships to the
seed articles.
(b) Text-based: Filter the list of papers using keywords or topics found in
the set of all seed articles.</p>
          <p>These two filtering methods can be interchanged or combined.</p>
        </sec>
        <sec id="sec-2-2-3">
          <title>A Sample Implementation</title>
          <p><bold>Select seed article(s).</bold> We use all possible combinations of 1, 2, or 3 seed
articles.</p>
          <p><bold>Search.</bold> We retrieved the references, citations, co-citing papers, and co-cited
papers of all seed articles. These relationships to the seed article are shown in
Figure 1. References (RP) are publications cited by a seed article (i.e. usually
listed at the end of the article), while citations (CP) are publications that cite a
seed article. Co-citing papers (CC) are papers that cite the same articles
that the seed article cites, while co-cited papers (CR) are papers that are
cited by the same articles that cite the seed article. For the rest of this paper,
we refer to this set of articles as the citation space of the seed article. We used
the Scopus APIs<sup>4</sup> to retrieve the citation spaces.</p>
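          <p>The four relationships above can be sketched over a plain list of citation pairs. This is a minimal illustration, not the implementation used in the experiment: the function name citation_space, the (citing, cited) edge format, and the toy paper identifiers are assumptions, and no Scopus API call is made.</p>
          <preformat><![CDATA[
```python
from collections import defaultdict


def citation_space(seed, edges):
    """Return the RP, CP, CC and CR sets for `seed`.

    `edges` is an iterable of (citing, cited) pairs (assumed format).
    """
    refs_of = defaultdict(set)   # paper -> papers it cites
    cited_by = defaultdict(set)  # paper -> papers that cite it
    for citing, cited in edges:
        refs_of[citing].add(cited)
        cited_by[cited].add(citing)

    rp = set(refs_of[seed])      # references of the seed
    cp = set(cited_by[seed])     # papers citing the seed
    # Co-citing: papers that cite at least one of the seed's references.
    cc = {p for r in rp for p in cited_by[r]} - {seed}
    # Co-cited: papers cited by at least one paper that cites the seed.
    cr = {p for c in cp for p in refs_of[c]} - {seed}
    return {"RP": rp, "CP": cp, "CC": cc, "CR": cr}


# Toy network: S is the seed article.
edges = [("S", "R1"), ("S", "R2"), ("X", "R1"),
         ("C1", "S"), ("C1", "Y"), ("C2", "S")]
space = citation_space("S", edges)
```
]]></preformat>
          <p>In this toy network, X is co-citing (it shares the reference R1 with the seed) and Y is co-cited (it is cited alongside the seed by C1).</p>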
          <p><bold>Filter.</bold> We implemented a two-step filtering approach: we first removed the
articles that did not pass our citation-based criteria, then further filtered the
list of papers using keywords of the seed articles. The resulting list contains the
final set of retrieved papers.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <p><sup>4</sup> https://dev.elsevier.com/sc_apis.html</p>
      <p><bold>Citation-Based Filtering.</bold> Our citation-based filtering removes from the retrieved set all papers that do
not meet at least one of these criteria:</p>
      <list list-type="bullet">
        <list-item><p>paper A cites paper B</p></list-item>
        <list-item><p>paper A is cited by paper B</p></list-item>
        <list-item><p>paper A shares at least 10% of its references with paper B</p></list-item>
        <list-item><p>paper A shares at least 10% of its citations with paper B.</p></list-item>
      </list>
      <p>
        We chose 10% as Belter [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] reported promising results with this threshold.
Given constraints on our API usage (10,000 abstracts and 20,000 citations per
week), filtering by citations first enabled us to retrieve a smaller number of abstracts
for text-based filtering.
      </p>
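      <p>The four criteria can be written as a single predicate. The sketch below rests on assumptions that the text does not spell out: paper A is taken to be a retrieved candidate and paper B a seed article, and "shares at least 10%" is read as the overlap divided by the size of A's own reference (or citation) list. The function name and the dictionary inputs are hypothetical.</p>
      <preformat><![CDATA[
```python
def passes_citation_filter(a, b, refs, cites, threshold=0.10):
    """Return True if candidate `a` meets at least one criterion w.r.t. seed `b`.

    refs[p]  = set of papers that p cites (assumed input format)
    cites[p] = set of papers that cite p  (assumed input format)
    """
    refs_a, refs_b = refs.get(a, set()), refs.get(b, set())
    cites_a, cites_b = cites.get(a, set()), cites.get(b, set())

    # Criteria 1 and 2: A cites B, or A is cited by B.
    if b in refs_a or a in refs_b:
        return True

    def shared_fraction(own, other):
        # Fraction of A's own list that also appears in B's list.
        return len(own & other) / len(own) if own else 0.0

    # Criteria 3 and 4: at least `threshold` of A's references or citations
    # are shared with B.
    return (shared_fraction(refs_a, refs_b) >= threshold
            or shared_fraction(cites_a, cites_b) >= threshold)


# Toy data: A shares 1 of its 10 references with seed B.
refs = {"A": {"r%d" % i for i in range(10)}, "B": {"r0", "x"}}
keep_a = passes_citation_filter("A", "B", refs, cites={})
```
]]></preformat>
      <p>A paper with missing reference or citation data can only pass through the criteria for which data exist, which is one reason incomplete records matter for this kind of filter.</p>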
      <p><bold>Text-Based Filtering.</bold> To get the final set of retrieved papers, we filtered the
remaining papers based on phrases extracted from the abstracts. We deemed a
paper relevant if its abstract contained at least one bigram or trigram phrase
found in any of the seed articles' abstracts.</p>
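      <p>The relevance rule above can be illustrated with a simplified phrase extractor. This sketch is a stand-in and does not reproduce RAKE: it only splits text at punctuation and stopwords and keeps the resulting two- and three-word phrases, with no scoring. The stopword list, function names, and example abstracts are assumptions made for illustration, and matching is done on whole extracted phrases, which is one possible reading of "contained".</p>
      <preformat><![CDATA[
```python
import re

# Tiny illustrative stopword list (an assumption; real lists are far longer).
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on",
             "the", "to", "we", "with"}


def candidate_phrases(text):
    """Split text at punctuation and stopwords; keep 2- and 3-word phrases."""
    phrases = set()
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split() + [None]:  # None flushes the last phrase
            if word is None or word in STOPWORDS:
                if 2 <= len(current) <= 3:
                    phrases.add(" ".join(current))
                current = []
            else:
                current.append(word)
    return phrases


def passes_text_filter(abstract, seed_abstracts):
    """True if the abstract shares at least one phrase with any seed abstract."""
    seed_phrases = set().union(*(candidate_phrases(s) for s in seed_abstracts))
    return bool(candidate_phrases(abstract) & seed_phrases)


seeds = ["Electronic cigarettes for smoking cessation in adults."]
kept = passes_text_filter("Smoking cessation is the goal of nicotine patches.", seeds)
dropped = passes_text_filter("Antibiotic regimens for intra-amniotic infection.", seeds)
```
]]></preformat>
      <p>Because unigrams are never emitted, a paper that shares only single common words with the seeds is not kept.</p>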
      <p>
        We used the Scopus Abstract Retrieval API to retrieve the abstracts. Then,
phrases were extracted from abstracts using an available Python
implementation<sup>5</sup> of the Rapid Automatic Keyword Extraction (RAKE) algorithm [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
RAKE's graph-based approach to extracting phrases has been tested on scientific
abstracts, and its strength is retaining phrases that include stopwords (enabling
it to find complex concepts, e.g. "curse of dimensionality"). We found that
the unigram output from RAKE contained uninformative words, including verbs (e.g.
needed), conjunctive adverbs (e.g. however), and nouns (e.g. studies), so we
omitted unigrams.
      </p>
      <p><sup>5</sup> https://pypi.python.org/pypi/rake-nltk</p>
    </sec>
    <sec id="sec-4">
      <sec id="sec-4-1">
        <title>Experiment on the Sample Implementation</title>
        <sec id="sec-4-1-1">
          <title>Aim of Experiment</title>
          <p>The aim of the experiment was to test our implementation against conventional
search procedures used in systematic reviewing. Systematic reviews aim to find
all available evidence pertaining to a given research question (i.e. achieve 100% recall
on that question), and reviewers typically screen search results manually. Maintaining
recall while increasing precision (i.e. returning fewer results for manual screening) would
save reviewers time. Therefore, for a given systematic review, our goal was
twofold: (1) retrieve all the designated major publications included in the review
and (2) reduce the total number of retrieved papers.</p>
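          <p>The two goals correspond to two simple quantities: recall over the included studies and the relative reduction in the number of results to screen. The sketch below shows one way to compute them; the function names and the example values are hypothetical and are not results from the experiment.</p>
          <preformat><![CDATA[
```python
def recall(retrieved, included):
    """Fraction of the included studies that appear among the retrieved papers."""
    included = set(included)
    if not included:
        return 0.0
    return len(included & set(retrieved)) / len(included)


def reduction(n_conventional, n_framework):
    """Relative reduction in results to screen versus the conventional search."""
    return 1 - n_framework / n_conventional


# Hypothetical example values, for illustration only.
example_recall = recall(["a", "b", "x"], ["a", "b", "c", "d"])
example_reduction = reduction(1000, 250)
```
]]></preformat>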
        </sec>
        <sec id="sec-4-1-2">
          <title>Ground Truth from Conventional Search Methods</title>
          <table-wrap>
            <table>
              <thead>
                <tr><th>Review</th><th>Article Title</th></tr>
              </thead>
              <tbody>
                <tr><td>1</td><td>Antibiotic regimens for management of intra-amniotic infection [<xref ref-type="bibr" rid="ref21">21</xref>]</td></tr>
                <tr><td>2</td><td>Interventions for preventing and ameliorating cognitive deficits in adults treated with cranial irradiation [<xref ref-type="bibr" rid="ref22">22</xref>]</td></tr>
                <tr><td>3</td><td>Co-enzyme Q10 supplementation for the primary prevention of cardiovascular disease [<xref ref-type="bibr" rid="ref23">23</xref>]</td></tr>
                <tr><td>4</td><td>Intermittent self-dilatation for urethral stricture disease in males [<xref ref-type="bibr" rid="ref24">24</xref>]</td></tr>
                <tr><td>5</td><td>Electronic cigarettes for smoking cessation and reduction [<xref ref-type="bibr" rid="ref25">25</xref>]</td></tr>
                <tr><td>6</td><td>Long-term proton pump inhibitor (PPI) use and the development of gastric pre-malignant lesions [<xref ref-type="bibr" rid="ref26">26</xref>]</td></tr>
              </tbody>
            </table>
          </table-wrap>
          <p>For citation retrieval, we approximated the search date specified in each
review by using the year. For example, if the search was reported as conducted
in February 2013, we retrieved citations to the seed articles that were published
prior to or during 2013. It should also be noted that not all of the
studies were indexed by Scopus. We return to this point in Section 5.</p>
          <p>For each review, the major publications are used as both seed articles and
retrieval targets. In each case, the goal was, given some set of major publications
as seed articles, to retrieve all of the remaining major publications as retrieval
targets. In the following, for simplicity, we refer to the major publications from
a review as its studies or included studies.</p>
          <p>We tested our method on all possible 1-, 2-, and 3-seed combinations. For
instance, review #1 has 10 included studies indexed in Scopus: there are 10
1-seed combinations, 45 2-seed combinations, and 120 3-seed combinations.
Consequently, our implementation tested a total of 175 seed combinations for
review #1.</p>
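          <p>The counts for review #1 follow from the binomial coefficients C(10, 1) + C(10, 2) + C(10, 3) = 10 + 45 + 120 = 175. A short sketch of the enumeration is given below; the generator name seed_combinations and the placeholder study identifiers are assumptions for illustration.</p>
          <preformat><![CDATA[
```python
from itertools import combinations
from math import comb


def seed_combinations(studies, max_seeds=3):
    """Yield every combination of 1 to `max_seeds` studies as a tuple."""
    for k in range(1, max_seeds + 1):
        yield from combinations(studies, k)


# Placeholder identifiers standing in for the 10 Scopus-indexed studies.
studies = ["study_%d" % i for i in range(1, 11)]
counts = [comb(len(studies), k) for k in (1, 2, 3)]      # [10, 45, 120]
total = sum(1 for _ in seed_combinations(studies))       # 175
```
]]></preformat>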
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>Results</title>
        <p>[Table: "Avg" values for 1, 2, and 3 seeds; values in source order: 4, 6.91, 8.4; 1.25, 2.83, 3.75; 2, 3.7, 4.5; 4.45, 7.13, 8.4; 4, 6.36, 7.72; 3.38, 5.46, 6.75.]</p>
        <p>One of the advantages of this framework is that it can be largely automated.
While the seeds need to be selected manually, the retrieval of the citation space
and the filtering steps can be done programmatically, assuming that the data is
available.</p>
        <p>However, one of the limitations of this approach is that it relies on the
completeness of the available data. In our experiment, not all of the 55 included
studies were in Scopus: only 48 of the included studies were primary documents
indexed in Scopus; 6 were secondary documents not indexed in Scopus (but
containing title and citation data in Scopus); and 1 document (a meeting abstract
published in the appendix of a journal) had no information in Scopus. In
addition, of the 48 primary documents in Scopus, 1 had no abstract and 9 were
missing reference data.</p>
        <p>
          Further testing how the framework can be integrated into current literature
review processes is warranted. While our framework cannot guarantee 100%
recall all the time (although this is also the case with conventional methods),
we envision that it can be easily integrated in the processes for developing and
updating systematic reviews. The framework could also be used to estimate the
number of included studies when developing reviews. It has particular promise for
finding recent studies when updating systematic reviews, using the previously
included studies as seeds. Further, Belter [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] suggested that a citation-based
approach may retrieve articles that are not retrieved by the search methods
used by the reviewers, so our approach could also be used to seek additional
studies for inclusion in a review.
        </p>
        <p>
          While we have shown that our framework can work well for systematic
reviews, we also plan on testing our framework on different kinds of literature
reviews, such as scoping reviews. Our future work will explore how different
variations in citation space definitions and filtering criteria work for various kinds
of literature reviews. We also want to explore how we can use the framework
to rank the retrieved publications. This could be tested on the CLEF eHealth
2018 Task 2; as in 2017 [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ], given a Boolean query and its MEDLINE search
results for 20 Cochrane Diagnostic Test Accuracy reviews, systems will rank titles
and abstracts and determine a screening threshold. While the ranking of results
may not be as important in systematic reviews, it may be very useful for other
reviews, such as scoping reviews and state-of-the-art reviews.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>Conclusion</title>
        <p>In this paper, we presented a citation- and text-based framework for retrieving
publications for literature reviews. Our proposed framework retrieves papers
from a set of seed articles through citation relationships, then filters the papers
using citation- and text-based methods. Our experiment on an implementation of
the framework showed that we can achieve up to 100% recall within the limits of
the data while improving precision, but a careful selection of seeds is required.
Further testing of the performance and utility of the framework is warranted, but
our preliminary results suggest that a hybrid citation- and text-based approach
can be a useful strategy in supporting literature reviews.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Acknowledgements</title>
        <p>Linxi Liu's work on this project was funded by the Illinois Informatics Institute
undergraduate research program.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bornmann</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mutz</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references</article-title>
          .
          <source>Journal of the Association for Information Science and Technology</source>
          <volume>66</volume>
          (
          <issue>11</issue>
          ) (
          <year>2015</year>
          )
          <fpage>2215</fpage>
          –
          <lpage>2222</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Grant</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Booth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A typology of reviews: an analysis of 14 review types and associated methodologies</article-title>
          .
          <source>Health Information &amp; Libraries Journal</source>
          <volume>26</volume>
          (
          <issue>2</issue>
          ) (
          <year>2009</year>
          )
          <fpage>91</fpage>
          –
          <lpage>108</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ross-White</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Godfrey</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Is there an optimum number needed to retrieve to justify inclusion of a database in a systematic review search?</article-title>
          <source>Health Information and Libraries Journal</source>
          <volume>34</volume>
          (
          <issue>3</issue>
          ) (
          <year>2017</year>
          )
          <fpage>217</fpage>
          –
          <lpage>224</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bannach-Brown</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Przybyla</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rice</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ananiadou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macleod</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          :
          <article-title>The use of text-mining and machine learning algorithms in systematic reviews: reducing workload in preclinical biomedical sciences and reducing human screening error</article-title>
          . bioRxiv:255760 (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Hoang</surname>
            ,
            <given-names>L.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schneider</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Opportunities for computer support for systematic reviewing-a gap analysis</article-title>
          .
          In:
          <source>iConference 2018 Proceedings</source>
          , iSchools (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Van Eck</surname>
            ,
            <given-names>N.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Waltman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Systematic retrieval of scientific literature based on citation relations: Introducing the CitNetExplorer tool</article-title>
          . In:
          <source>International Workshop on Bibliometric-enhanced Information Retrieval (BIR 2014) at the European Conference on Information Retrieval</source>
          . (
          <year>2014</year>
          )
          <fpage>13</fpage>
          –
          <lpage>20</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Peroni</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shotton</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vitali</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>One year of the OpenCitations corpus</article-title>
          . In: International Semantic Web Conference, Springer (
          <year>2017</year>
          )
          <fpage>184</fpage>
          –
          <lpage>192</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCallum</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Topical n-grams: Phrase and topic discovery, with an application to information retrieval</article-title>
          .
          In:
          <source>Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007)</source>
          , October 28-31, 2007, Omaha, Nebraska, USA. (
          <year>2007</year>
          )
          <fpage>697</fpage>
          –
          <lpage>702</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blei</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          :
          <article-title>Collaborative topic modeling for recommending scientific articles</article-title>
          .
          In:
          <source>Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , ACM (
          <year>2011</year>
          )
          <fpage>448</fpage>
          –
          <lpage>456</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>O'Mara-Eves</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McNaught</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miwa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ananiadou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Using text mining for study identification in systematic reviews: a systematic review of current approaches</article-title>
          .
          <source>Systematic Reviews</source>
          <volume>4</volume>
          (
          <issue>1</issue>
          ) (
          <year>2015</year>
          )
          <fpage>5</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sohn</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murad</surname>
            ,
            <given-names>M.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>A text-mining framework for supporting systematic reviews</article-title>
          .
          <source>American Journal of Information Management</source>
          <volume>1</volume>
          (
          <issue>1</issue>
          ) (
          <year>2016</year>
          )
          <fpage>1</fpage>
          –
          <lpage>9</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Kontonatsios</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brockmeier</surname>
            ,
            <given-names>A.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Przybyla</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McNaught</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goulermas</surname>
            ,
            <given-names>J.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ananiadou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>A semi-supervised approach using label propagation to support citation screening</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>72</volume>
          (
          <year>2017</year>
          )
          <fpage>67</fpage>
          –
          <lpage>76</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Pao</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          :
          <article-title>Term and citation retrieval: A field study</article-title>
          .
          <source>Information Processing and Management</source>
          <volume>29</volume>
          (
          <issue>1</issue>
          ) (
          <year>1993</year>
          )
          <fpage>95</fpage>
          –
          <lpage>112</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>White</surname>
            ,
            <given-names>H.D.</given-names>
          </string-name>
          :
          <article-title>Bag of works retrieval: TF*IDF weighting of co-cited works</article-title>
          .
          <source>In: International Workshop on Bibliometric-enhanced Information Retrieval at the European Conference on Information Retrieval</source>
          . (
          <year>2016</year>
          )
          <fpage>63</fpage>
          –
          <lpage>72</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Knoth</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khadka</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Can we do better than co-citations? Bringing citation proximity analysis from idea to practice in research article recommendation</article-title>
          .
          <source>In: Proceedings of the 2nd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2017)</source>
          . (
          <year>2017</year>
          )
          <fpage>14</fpage>
          –
          <lpage>25</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Belter</surname>
            ,
            <given-names>C.W.</given-names>
          </string-name>
          :
          <article-title>Citation analysis as a literature search method for systematic reviews</article-title>
          .
          <source>Journal of the Association for Information Science and Technology</source>
          <volume>67</volume>
          (
          <issue>11</issue>
          ) (
          <year>2016</year>
          )
          <fpage>2766</fpage>
          –
          <lpage>2777</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Wolfram</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Bibliometrics, information retrieval and natural language processing: Natural synergies to support digital library research</article-title>
          .
          <source>In: Proceedings of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2016)</source>
          . (
          <year>2016</year>
          )
          <fpage>6</fpage>
          –
          <lpage>13</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18. Glanzel, W.:
          <article-title>Bibliometrics-aided retrieval: Where information retrieval meets scientometrics</article-title>
          .
          <source>Scientometrics</source>
          <volume>102</volume>
          (
          <issue>3</issue>
          ) (
          <year>2015</year>
          )
          <fpage>2215</fpage>
          –
          <lpage>2222</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>F.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amancio</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bardosova</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Costa</surname>
            ,
            <given-names>L.d.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliveira</surname>
            ,
            <given-names>O.N.</given-names>
          </string-name>
          :
          <article-title>Using network science and text analytics to produce surveys in a scientific topic</article-title>
          .
          <source>Journal of Informetrics</source>
          <volume>10</volume>
          (
          <issue>2</issue>
          ) (
          <year>2016</year>
          )
          <fpage>487</fpage>
          –
          <lpage>502</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Rose</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Engel</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cramer</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cowley</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Automatic keyword extraction from individual documents</article-title>
          .
          <source>Text Mining: Applications and Theory</source>
          (
          <year>2010</year>
          )
          <fpage>1</fpage>
          –
          <lpage>20</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Chapman</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reveiz</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Illanes</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonfill Cosp</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Antibiotic regimens for management of intra-amniotic infection</article-title>
          .
          <source>Cochrane Database of Systematic Reviews</source>
          <volume>12</volume>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Day</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zienius</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gehring</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grosshans</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taphoorn</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grant</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brown</surname>
            ,
            <given-names>P.D.</given-names>
          </string-name>
          :
          <article-title>Interventions for preventing and ameliorating cognitive deficits in adults treated with cranial irradiation</article-title>
          .
          <source>Cochrane Database of Systematic Reviews</source>
          <volume>12</volume>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Flowers</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartley</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rees</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Co-enzyme Q10 supplementation for the primary prevention of cardiovascular disease</article-title>
          .
          <source>Cochrane Database of Systematic Reviews</source>
          <volume>2</volume>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Jackson</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veeratterapillay</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harding</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dorkin</surname>
            ,
            <given-names>T.J.</given-names>
          </string-name>
          :
          <article-title>Intermittent self-dilatation for urethral stricture disease in men</article-title>
          .
          <source>Cochrane Database of Systematic Reviews</source>
          <volume>12</volume>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>McRobbie</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bullen</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartmann-Boyce</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hajek</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Electronic cigarettes for smoking cessation and reduction</article-title>
          .
          <source>Cochrane Database of Systematic Reviews</source>
          <volume>12</volume>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Long-term proton pump inhibitor (PPI) use and the development of gastric pre-malignant lesions</article-title>
          .
          <source>Cochrane Database of Systematic Reviews</source>
          <volume>12</volume>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Higgins</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Green</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , eds.:
          <article-title>Cochrane handbook for systematic reviews of interventions</article-title>
          . Volume
          <volume>5.1.0</volume>
          . John Wiley &amp; Sons (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Kanoulas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Azzopardi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spijker</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>CLEF 2017 technologically assisted reviews in empirical medicine overview</article-title>
          .
          <source>In: Working Notes of CLEF 2017 - Conference and Labs of the Evaluation Forum</source>
          . Volume CEUR
          <volume>1866</volume>
          . (
          <year>2017</year>
          )
          <fpage>1</fpage>
          –
          <lpage>29</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>