<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Evaluating Pretrained Transformer Models for Citation Recommendation</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Rodrigo</forename><surname>Nogueira</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Tandon School of Engineering</orgName>
								<orgName type="institution">New York University</orgName>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department" key="dep1">David R. Cheriton</orgName>
								<orgName type="department" key="dep2">School of Computer Science</orgName>
								<orgName type="institution">University of Waterloo</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Zhiying</forename><surname>Jiang</surname></persName>
							<affiliation key="aff1">
								<orgName type="department" key="dep1">David R. Cheriton</orgName>
								<orgName type="department" key="dep2">School of Computer Science</orgName>
								<orgName type="institution">University of Waterloo</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kyunghyun</forename><surname>Cho</surname></persName>
							<affiliation key="aff2">
								<orgName type="department">Courant Institute of Mathematical Sciences</orgName>
								<orgName type="institution">New York University</orgName>
							</affiliation>
							<affiliation key="aff3">
								<orgName type="department">Center for Data Science</orgName>
								<orgName type="institution">New York University</orgName>
							</affiliation>
							<affiliation key="aff4">
								<orgName type="department">Facebook AI Research</orgName>
							</affiliation>
							<affiliation key="aff5">
								<orgName type="institution">CIFAR Azrieli Global Scholar</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jimmy</forename><surname>Lin</surname></persName>
							<affiliation key="aff1">
								<orgName type="department" key="dep1">David R. Cheriton</orgName>
								<orgName type="department" key="dep2">School of Computer Science</orgName>
								<orgName type="institution">University of Waterloo</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Evaluating Pretrained Transformer Models for Citation Recommendation</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">C200933CB0E1750F300B042DAE5742E8</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T21:40+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Citation recommendation systems for the scientific literature, to help authors find papers that should be cited, have the potential to speed up discoveries and uncover new routes for scientific exploration. We treat this task as a ranking problem, which we tackle with a two-stage approach: candidate generation followed by re-ranking. Within this framework, we adapt to the scientific domain a proven combination based on "bag of words" retrieval followed by re-scoring with a BERT model. We experimentally show the effects of domain adaptation, both in terms of pretraining on in-domain data and exploiting in-domain vocabulary. In addition, we evaluate eleven pretrained transformer models and analyze some unexpected failure cases. On three different collections from different scientific disciplines, our models perform close to or at the state of the art in the citation recommendation task.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The volume of scientific publications is growing at an incredible rate. For example, over 900,000 papers are added per year to MEDLINE, a database of the life sciences and biomedical literature. 1 A recent study estimates that 3M papers are published annually in the English language, with a growth rate of 3-5% per year <ref type="bibr" target="#b17">[18]</ref>. This flood of information has made it nearly impossible for researchers to keep abreast of discoveries and innovations, both in their specific sub-field as well as more broadly. Furthermore, there is an overwhelming amount of material that a scientist entering a new field of study needs to read before becoming familiarized with common concepts, methods, and other foundations.</p><p>A number of tools have come along to help researchers cope with this deluge. For example, keyword-based literature search engines (Google Scholar, Microsoft Academic, PubMed, and Semantic Scholar) and citation recommendation tools <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b26">27,</ref><ref type="bibr" target="#b20">21,</ref><ref type="bibr" target="#b13">14]</ref> help scientists find relevant articles, often exploiting citation networks to identify what's important in a particular field. Methods to automatically populate scientific knowledge bases <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b33">34,</ref><ref type="bibr" target="#b34">35]</ref> form another broad approach to tackling this challenge.</p><p>In this work, we investigate the potential of deep pretrained transformer models such as BERT <ref type="bibr" target="#b6">[7]</ref> and large scientific datasets such as Open Research <ref type="bibr" target="#b0">[1]</ref> to improve scientific search tools. 
More concretely, we tackle the task of scientific literature recommendation, where a paper (title and abstract) is given as a query, and the system's task is to find papers that should be cited. We use a standard keyword search engine (based on inverted indexes) with BM25 ranking <ref type="bibr" target="#b32">[33]</ref> to initially retrieve candidate documents and evaluate various pretrained transformer models as re-rankers.</p><p>We find that this simple pipeline is more effective than previous cluster-based methods <ref type="bibr" target="#b31">[32,</ref><ref type="bibr" target="#b3">4]</ref>. To summarize, our main contributions are as follows:</p><p>-We evaluate eleven pretrained ranking models and find that pretraining on the target domain and using domain-specific vocabulary lead to large improvements over a general-purpose model.</p><p>-We find that despite the effectiveness of the pretrained transformer models as query-document relevance estimators, they perform poorly when the term overlap between the query and candidate documents is low. To address this issue, we train with more query-candidate pairs that have low term overlap, but interestingly, such a model performs poorly, even on the training set (see Section 5.2).</p><p>-Contrary to our expectation given the symmetric nature of query and candidate documents, we find that query terms are more important than candidate document terms for relevance estimation (see Section 5.3).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>Most early methods for scientific literature search and recommendation take advantage of keyword-based retrieval <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b21">22]</ref>. These techniques suffer from the term mismatch problem, which is common in "bag-of-words" retrieval methods, but the issue is aggravated by the diversity of scientific vocabulary <ref type="bibr" target="#b16">[17,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b28">29]</ref>.</p><p>As the number of users grows, popular search engines can exploit interaction signals to learn better ranking models <ref type="bibr" target="#b27">[28,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b9">10]</ref>. However, the reported gains are relatively small compared to classic ranking methods such as BM25. Another common approach in scientific recommendation systems is collaborative filtering <ref type="bibr" target="#b26">[27,</ref><ref type="bibr" target="#b23">24,</ref><ref type="bibr" target="#b5">6]</ref>. These methods typically suffer from the cold-start problem, in which there is not enough evidence about new items (or users) to make accurate predictions.</p><p>More recently, cluster-based methods have started to become competitive with traditional retrieval-based methods in this task. Kanakia et al. <ref type="bibr" target="#b18">[19]</ref> cluster papers based on their word embedding representation and use co-citations to alleviate the cold-start problem. However, they perform human evaluations on a private dataset, which precludes an empirical comparison with our approach.</p><p>Perhaps closest to our work is Eto <ref type="bibr" target="#b8">[9]</ref>, who uses a combination of proximity measures from the graph of co-citations to score candidate documents. The edges in the graph are weighted by the distance between the two citations in the citing document. 
This method requires access to the full text of the citing document, which is often not available (for example, due to paywalled content). Our method, on the other hand, predicts citations using only article abstracts, which are widely available in scientific corpora.</p><p>The methods described so far and our work fall in the category of global methods, which aim at recommending citations for the entire paper. Another category comprises local methods, which aim at recommending citations for a specific sentence or paragraph in the document <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b25">26,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b15">16]</ref>. We do not compare our method to these as we do not assume access to the full text.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Methods</head><p>This work tackles the task of citation recommendation: given a partially written paper, the system's task is to return all papers that should be cited in it. The input query q is the title and abstract of a paper (and not the full text). We argue that this assumption is crucial to building a useful tool as authors might desire recommendations of relevant citations prior to writing most of their paper.</p><p>Our method comprises two phases, Retrieval and Ranking. In the first phase, the top-k papers D are retrieved by a keyword search engine when queried with query q. In the second phase, we compute the probability p(d|q) of each paper d ∈ D being relevant to q. For this, we use a BERT <ref type="bibr" target="#b6">[7]</ref> re-ranker model based on Nogueira and Cho <ref type="bibr" target="#b29">[30]</ref>. Using the same notation as Devlin et al., we feed the query tokens as sequence A and the candidate paper tokens as sequence B.</p><p>In our setup, both the query and the candidate are the concatenation of the title and abstract of each paper, resulting in an input sequence that is often longer than the maximum input length allowed by the model (typically 512 tokens). To handle this, we allocate 256 tokens to the query and 256 to the candidate, truncating as necessary. At inference time, we use the model as a binary classifier: we feed the representation of the [CLS] token to a single-layer neural network to obtain p(d|q). The output of our method is a list of papers D ranked by p(d|q). Training details are provided in Section 4.2.</p></div>
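The input construction described in Section 3 can be sketched in a few lines of plain Python. This is an illustration rather than the authors' code: the helper name build_input and the way special tokens are charged against the 512-token budget are our own assumptions.

```python
# Sketch of the re-ranker input: query (sequence A) and candidate (sequence B)
# share the model's 512-token input. Helper name and the handling of
# special tokens within the budget are illustrative assumptions.

def build_input(query_tokens, cand_tokens, max_len=512, query_budget=256):
    q = query_tokens[:query_budget]
    # Reserve room for [CLS] and the two [SEP] tokens within the overall limit.
    d = cand_tokens[:max_len - len(q) - 3]
    tokens = ["[CLS]"] + q + ["[SEP]"] + d + ["[SEP]"]
    # Segment ids: 0 for the query (sequence A), 1 for the candidate (sequence B).
    segment_ids = [0] * (len(q) + 2) + [1] * (len(d) + 1)
    return tokens, segment_ids
```

At inference time, the output at the [CLS] position would then be fed to the single-layer classifier that produces p(d|q).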
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experimental Setup</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Datasets</head><p>Open Research. We train and evaluate our models on the Open Research corpus <ref type="bibr" target="#b0">[1]</ref>,<ref type="foot" target="#foot_0">2</ref> comprising 7.2M computer science and biomedical paper abstracts and their references. We closely follow the data processing steps from Bhagavatula et al. <ref type="bibr" target="#b3">[4]</ref> to create the training, development, and test sets. We remove papers that do not cite any other paper or that have no year of publication. Finally, we remove citations of papers that are not in the corpus or whose year of publication is later than that of the citing paper. Table <ref type="table" target="#tab_0">1</ref> shows the statistics of the final dataset after all processing steps. Note that although our dataset statistics do not match those reported in Bhagavatula et al. <ref type="bibr" target="#b3">[4]</ref>, they match the output of the evaluation script provided by the authors. <ref type="foot" target="#foot_1">3</ref> The difference is that the authors report statistics before the filtering steps (e.g., removing papers without references). Thus, our corpus and dataset splits match exactly, and our results are comparable.</p><p>DBLP and PubMed. The DBLP and PubMed datasets were introduced by Ren et al. <ref type="bibr" target="#b31">[32]</ref> and comprise papers from computer science and biomedicine, respectively. We apply the same data processing steps from Bhagavatula et al., and the resulting dataset statistics are summarized in Table <ref type="table" target="#tab_0">1</ref>.</p><p>Once processed in the manner described above, the citations within each paper serve as the ground truth for that paper. That is, using a specific paper as a query, the perfect result set comprises the actual citations in that paper.</p><p>When evaluating our method on DBLP and PubMed, we use models trained on Open Research's training set as this yields better results than training on the much smaller DBLP and PubMed training sets. 
To avoid leaking training data into the evaluation sets, we use the following method to remove documents in Open Research's training set that appear in the development and test sets of PubMed and DBLP: We remove special characters from the title and use Jaccard similarity (on unigrams) to calculate the closeness of two documents, filtering with a threshold of 0.7. This method results in approximately half of the papers in the development and test sets of PubMed and DBLP being removed from the training set of Open Research.</p></div>
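The de-duplication test described above is straightforward to sketch. The function names below are ours, and the exact normalization used for "removing special characters" is an assumption:

```python
import re

def jaccard_unigrams(title_a, title_b):
    """Jaccard similarity between the unigram sets of two titles.
    Lowercasing and stripping non-alphanumerics is our assumed
    interpretation of 'removing special characters'."""
    def norm(t):
        return set(re.sub(r"[^0-9a-z]+", " ", t.lower()).split())
    a, b = norm(title_a), norm(title_b)
    if not a or not b:
        return 0.0
    return len(a.intersection(b)) / len(a.union(b))

def is_near_duplicate(title_a, title_b, threshold=0.7):
    # Training-set documents whose titles exceed the 0.7 threshold
    # against a dev/test document are removed.
    return jaccard_unigrams(title_a, title_b) >= threshold
```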
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Re-ranker Training</head><p>To obtain the positive and negative examples used to train our binary classification models, we retrieve the top 10 papers for each query (title + abstract) using the Anserini IR toolkit<ref type="foot" target="#foot_2">4</ref>  <ref type="bibr" target="#b35">[36,</ref><ref type="bibr" target="#b36">37]</ref> with BM25 ranking. Among these, approximately 6% on average are relevant papers (positive examples). We do not balance positive and negative examples; see additional discussions about this decision in Section 5.2.</p><p>Starting with a pretrained BERT model, we fine-tune it to our task using cross-entropy loss:</p><formula xml:id="formula_0">L = − Σ_{j ∈ J_pos} log p(d_j|q) − Σ_{j ∈ J_neg} log(1 − p(d_j|q)),<label>(1)</label></formula><p>where J_pos and J_neg are the index sets of the relevant and non-relevant papers and p(d_j|q) is the relevance probability the model assigns to the j-th paper. We examine several BERT variants, detailed in Section 5.1.</p><p>All models are fine-tuned using Google's TPU v3-8 with a batch size of 128 (128 sequences × 512 tokens = 65,536 tokens/batch) for 300k iterations, which takes approximately three days. This corresponds to training on 38.4M (300k × 128) query-candidate pairs, or 1.1 epochs. We do not see any improvements on the development set when training for another 700k iterations, which is equivalent to 3.8 epochs. We use Adam <ref type="bibr" target="#b19">[20]</ref> with the initial learning rate set to 3 × 10^−6, β_1 = 0.9, β_2 = 0.999, L2 weight decay of 0.01, learning rate warmup over the first 10,000 steps, and linear decay of the learning rate. We use a dropout probability of 0.1 in all layers.</p></div>
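Equation (1) is an ordinary binary cross-entropy summed over the retrieved candidates. As a sanity check, it can be written out directly; this plain-Python sketch takes the probabilities p(d_j|q) as precomputed inputs, and the function name is ours:

```python
import math

def reranker_loss(pos_probs, neg_probs):
    """Cross-entropy loss of Eq. (1). pos_probs holds p(d_j|q) for the
    relevant papers (J_pos), neg_probs for the non-relevant ones (J_neg)."""
    loss = 0.0
    for p in pos_probs:
        loss -= math.log(p)          # relevant papers should score near 1
    for p in neg_probs:
        loss -= math.log(1.0 - p)    # non-relevant papers should score near 0
    return loss
```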
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Inference and Metrics</head><p>At inference time, we first retrieve the top 1000 candidate documents with the title and abstract as the query using BM25 ranking in Anserini. These documents are further re-ranked with one of the variants of the fine-tuned BERT models (see Section 5.1 for more details). Following Bhagavatula et al. <ref type="bibr" target="#b3">[4]</ref>, we evaluate the results using F1 of the top 20 retrieved papers (F1@20) and Mean Reciprocal Rank (MRR) of the top 1000 retrieved papers. We additionally report Recall@1000 (R@1000) to assess the effectiveness of our keyword search in isolation, which provides an upper bound on re-ranking effectiveness. </p></div>
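For concreteness, the per-query metrics can be sketched as follows (helper names are ours; the reported corpus-level numbers average these over all queries):

```python
def f1_at_k(ranked_ids, relevant_ids, k=20):
    """F1 of the top-k ranked papers against the paper's actual citations."""
    top = set(ranked_ids[:k])
    rel = set(relevant_ids)
    hits = len(top.intersection(rel))
    if hits == 0:
        return 0.0
    precision = hits / len(top)
    recall = hits / len(rel)
    return 2 * precision * recall / (precision + recall)

def reciprocal_rank(ranked_ids, relevant_ids):
    """Reciprocal rank of the first relevant paper; MRR is its mean over queries."""
    rel = set(relevant_ids)
    for rank, pid in enumerate(ranked_ids, start=1):
        if pid in rel:
            return 1.0 / rank
    return 0.0

def recall_at_k(ranked_ids, relevant_ids, k=1000):
    """Fraction of relevant papers among the top-k candidates (R@1000)."""
    rel = set(relevant_ids)
    if not rel:
        return 0.0
    return len(set(ranked_ids[:k]).intersection(rel)) / len(rel)
```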
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Results</head><p>Our main results are shown in Table <ref type="table" target="#tab_1">2</ref> with SciBERT-Large as the ranking model, selected based on the experiments in Section 5.1. On the Open Research dataset, our best configuration (BM25 + SciBERT-Large) improves upon the best previous result in terms of both F1@20 and MRR. On the smaller DBLP and PubMed datasets, our method is on par with the state of the art. Note that our BERT-based models are trained only on Open Research as this achieves better results than training on the smaller datasets. Interestingly, our baseline BM25 implementation using Anserini out of the box, denoted "BM25 (Anserini)" in Table <ref type="table" target="#tab_1">2</ref>, is 3-7 points higher in F1@20 than the BM25 implementation of Bhagavatula et al. This is likely due to the choice of the query form that we use for "bag of words" retrieval, which is analyzed in Section 5.3, and perhaps a better implementation of BM25 in Anserini (which is based on Lucene).</p><p>Our method appears to be as effective as, and more scalable than, a cluster-based approach. For example, Bhagavatula et al.'s model requires at least 100 GB of RAM to search the 7M documents in the Open Research corpus, <ref type="foot" target="#foot_3">5</ref> whereas keyword search has far more modest memory requirements.</p><p>In the next sections, we investigate the effectiveness of our method by evaluating various pretrained transformer models, as well as the effects of class imbalance and different query forms.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">In- vs. Out-Domain Pretraining</head><p>Here we investigate how different pretraining configurations affect effectiveness on the target task. The results, shown in Table <ref type="table" target="#tab_2">3</ref>, are from fine-tuning the pretrained models on Open Research's training set for 300k iterations with a batch size of 128, which corresponds to approximately 1.1 epochs. In the remainder of this paper, we call an in-domain corpus a collection whose majority of documents are from the same domains as those in Open Research (i.e., biomedicine and computer science), and we call an out-domain corpus a collection whose majority of papers are not from those domains.</p><p>The models pretrained on an in-domain corpus, i.e., BioBERT <ref type="bibr" target="#b22">[23]</ref> (row 7) and SciBERT <ref type="bibr" target="#b2">[3]</ref> (rows 8-11), yield significant improvements in the target task over models pretrained on a corpus of a similar size but a different domain (rows 3-5). Pretraining on an out-domain corpus ten times the size of the in-domain corpus results in lower effectiveness on the target task; compare RoBERTa <ref type="bibr" target="#b24">[25]</ref>, row 6 vs. row 10. We conclude that, at least for the task of citation recommendation, pretraining on a smaller in-domain corpus is more effective than pretraining on a larger out-domain corpus.</p><p>When pretraining settings are kept the same except for the vocabulary, the use of in-domain vocabulary gives a 5-10% improvement over out-domain vocabulary (row 8 vs. 9 and row 10 vs. 11). This makes intuitive sense, and Beltagy et al. <ref type="bibr" target="#b2">[3]</ref> report a similar finding in other tasks as well.</p><p>The NCBI models <ref type="bibr" target="#b30">[31]</ref> (rows 1 and 2) are pretrained on an in-domain corpus but produce worse results than models pretrained on an out-domain corpus of a similar size (rows 3-5). 
They also underperform when compared to SciBERT-Base (row 8), which is pretrained on an in-domain corpus of a similar size that comprises full papers instead of abstracts. As also noted by Beltagy et al. <ref type="bibr" target="#b2">[3]</ref>, this result suggests that pretraining with longer documents improves effectiveness on the target task.</p><p>We find that model size appears to be even more important than document length. Our SciBERT-Large models (rows 10 and 11) have higher effectiveness than the SciBERT-Base models (rows 8 and 9) despite being pretrained on a smaller corpus of 7M paper abstracts (1.4B tokens) as opposed to 1M full-text papers (3.2B tokens).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Class Imbalance</head><p>Because we only use the top 10 papers returned by BM25 as training examples, the BERT-based models in this work are trained with more negative examples than positive ones (94% vs. 6%). In a separate experiment, to balance these classes, we include in the training phase pairs of query and relevant papers not retrieved by BM25, but this results in F1@20 and MRR close to zero on both the training and development sets. We obtain a similar result when adding to the training set negative candidates randomly sampled from the corpus.</p><p>What explains these findings? We hypothesize that although BERT is a strong model for document ranking, it still partly relies on exact term match to learn relevance. Thus, when we sample training documents without an exact term match method such as BM25, fewer terms between the query and the candidate paper match, which makes learning relevance harder. Further studies should investigate if this limitation applies to other tasks as well.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3">Query Analysis</head><p>In the citation recommendation task, the "query" used for initial retrieval can take many forms, such as the title of the paper, the concatenation of title and abstract, or keywords extracted from the text. Here we investigate how these query forms impact the effectiveness of a keyword-based retrieval method.</p><p>In Table <ref type="table" target="#tab_3">4</ref>, we show the effectiveness of BM25 on the Open Research development set. For Key Terms, we follow Bhagavatula et al. <ref type="bibr" target="#b3">[4]</ref> and use Whoosh<ref type="foot" target="#foot_4">6</ref> to first create an index and then extract key terms from the title and abstract with Whoosh's key_terms_from_text method. Despite being faster due to having fewer query terms, the results show that this method has lower effectiveness than simply concatenating the title and abstract of the paper.</p><p>Fig. <ref type="figure">1</ref>: F1@20 on the development set when varying the number of tokens allocated to the input sequence (whose limit is 512 tokens) for the query (as opposed to the candidate document). The query is the concatenation of the title and abstract.</p><p>One of the limitations of transformer-based models (including BERT) is that memory consumption increases quadratically with the number of tokens in the input sequence. On modern hardware such as TPU v3s or V100 GPUs, the maximum sequence length at which we can efficiently train a BERT-Large model is approximately 512 tokens. In our task, since the concatenation of query and candidate tokens is typically longer than this limit, there is a trade-off in how many tokens we allocate to each sequence.</p><p>In Figure <ref type="figure">1</ref>, we show how effectiveness changes as we allocate more tokens to the query than to the candidate document while limiting the sum of the two sequences to 512 tokens. 
These results are obtained with BM25 + SciBERT-Base (for faster experimental turnaround). The curve shows that query terms are more important to the re-ranker model, as increasing query tokens from 64 to 256 increases F1@20 by 2 points. Decreasing candidate document tokens from 256 to 64 barely changes F1@20. This result is somewhat surprising as one expects the two sequences to have equal importance in the task of query-document relevance estimation. Note that in all previous experiments (Table <ref type="table" target="#tab_1">2</ref>), we used 256 tokens for the query and 256 for the candidate; this suggests that our main results might be even higher had we tuned this hyperparameter as well. Future work should investigate whether this is particular to citation recommendation or whether it also occurs in other retrieval tasks with long queries.</p></div>
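The trade-off studied above amounts to a one-dimensional sweep over the query budget. A sketch of the bookkeeping, assuming (as in Section 3) that [CLS] and the two [SEP] tokens also count against the 512-token limit:

```python
def candidate_budget(query_budget, max_len=512, num_special=3):
    """Tokens left for the candidate document after the query and the
    [CLS]/[SEP] special tokens are charged against the input limit.
    The special-token accounting is our assumption."""
    return max(max_len - query_budget - num_special, 0)

# Query budgets on the x-axis of Fig. 1 and the resulting candidate budgets.
splits = [(q, candidate_budget(q)) for q in (64, 128, 256, 384, 448)]
```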
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusions</head><p>We provide an extensive evaluation of pretrained transformer models for the scientific literature recommendation task. We find that in-domain pretraining and domain-specific vocabulary greatly improve effectiveness. Additionally, we present an unexpected finding: despite the symmetry of the two inputs when estimating the relevance of a candidate article to a query article, terms from the query article are more important than terms from the candidate article when allocating "space" in the BERT input. Future work should investigate this observation in more detail.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Statistics of the datasets.</figDesc><table><row><cell></cell><cell cols="3">Open Research DBLP PubMed</cell></row><row><cell>Total # of docs</cell><cell cols="3">6,892,252 50,227 47,347</cell></row><row><cell>Total # of citations</cell><cell cols="3">44,400,729 156,807 825,371</cell></row><row><cell>Avg. # citations per doc</cell><cell>6.45</cell><cell>3.12</cell><cell>17.43</cell></row><row><cell>Avg. len. per doc (char)</cell><cell cols="2">1,391 1,193</cell><cell>1,504</cell></row><row><cell>Queries -Train</cell><cell cols="3">3,343,809 27,322 26,793</cell></row><row><cell>-Dev</cell><cell cols="2">487,582 8,324</cell><cell>2,768</cell></row><row><cell>-Test</cell><cell>464,449</cell><cell>931</cell><cell>8,815</cell></row><row><cell>q/rel. doc pairs -Train</cell><cell cols="3">32,470,673 106,011 558,674</cell></row><row><cell>-Dev</cell><cell cols="3">5,985,787 38,628 66,655</cell></row><row><cell>-Test</cell><cell cols="3">5,944,269 12,168 200,042</cell></row></table><note>Following Bhagavatula et al. <ref type="bibr" target="#b3">[4]</ref>, we create the training, development, and test sets by sorting papers by publication year and using the oldest 80% for training (1991-2014), the next 10% for development (2014-2015), and the most recent 10% for testing (2015-2016). Since the development and test sets are too large (400k+ papers), we randomly sample 20k examples from each set.</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>Main results on Open Research, DBLP, and PubMed.</figDesc><table><row><cell></cell><cell cols="2">F1@20</cell><cell></cell><cell>MRR</cell><cell cols="2">R@1000</cell></row><row><cell></cell><cell cols="2">Dev Test</cell><cell cols="2">Dev Test</cell><cell cols="2">Dev Test</cell></row><row><cell>Open Research</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell>BM25 [4]</cell><cell>-</cell><cell>0.058</cell><cell>-</cell><cell>0.218</cell><cell>-</cell><cell>-</cell></row><row><cell>BM25 (Anserini)</cell><cell cols="2">0.082 0.089</cell><cell cols="2">0.279 0.312</cell><cell cols="2">0.424 0.421</cell></row><row><cell>Citeomatic [4]</cell><cell>-</cell><cell>0.125</cell><cell>-</cell><cell>0.330</cell><cell>-</cell><cell>-</cell></row><row><cell>BM25 + SciBERT-Large</cell><cell cols="2">0.136 0.132</cell><cell cols="2">0.430 0.431</cell><cell cols="2">0.424 0.421</cell></row><row><cell>DBLP</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell>BM25 [4]</cell><cell>-</cell><cell>0.119</cell><cell>-</cell><cell>0.425</cell><cell>-</cell><cell>-</cell></row><row><cell>BM25 (Anserini)</cell><cell cols="2">0.105 0.194</cell><cell cols="2">0.352 0.585</cell><cell cols="2">0.669 0.691</cell></row><row><cell>ClusCite [32]</cell><cell>-</cell><cell>0.237</cell><cell>-</cell><cell>0.548</cell><cell>-</cell><cell>-</cell></row><row><cell>Citeomatic [4]</cell><cell cols="2">-0.303</cell><cell>-</cell><cell>0.689</cell><cell>-</cell><cell>-</cell></row><row><cell>BM25 + SciBERT-Large</cell><cell cols="2">0.149 0.272</cell><cell cols="2">0.472 0.714</cell><cell cols="2">0.669 0.691</cell></row><row><cell>PubMed</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell>BM25 
[4]</cell><cell>-</cell><cell>0.209</cell><cell>-</cell><cell>0.574</cell><cell>-</cell><cell>-</cell></row><row><cell>BM25 (Anserini)</cell><cell cols="2">0.299 0.268</cell><cell cols="2">0.793 0.721</cell><cell cols="2">0.794 0.765</cell></row><row><cell>ClusCite [32]</cell><cell>-</cell><cell>0.274</cell><cell>-</cell><cell>0.578</cell><cell>-</cell><cell>-</cell></row><row><cell>Citeomatic [4]</cell><cell cols="2">-0.329</cell><cell>-</cell><cell>0.771</cell><cell>-</cell><cell>-</cell></row><row><cell>BM25 + SciBERT-Large</cell><cell cols="2">0.326 0.304</cell><cell cols="2">0.835 0.792</cell><cell cols="2">0.794 0.765</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3 .</head><label>3</label><figDesc>Results on Open Research's development set of BERT-based models pretrained under different settings. All models are fine-tuned for approximately one epoch on the training set.</figDesc><table><row><cell cols="2">Pretrained Model Size Pretraining Corpus</cell><cell>Tokens Vocabulary</cell><cell>Cased F1@20 MRR</cell></row><row><cell>(1) NCBI</cell><cell>Base PubMed+MIMIC</cell><cell>4.5B Wiki+Books</cell><cell>0.093 0.315</cell></row><row><cell>(2) NCBI</cell><cell>Large PubMed+MIMIC</cell><cell>4.5B Wiki+Books</cell><cell>0.105 0.352</cell></row><row><cell>(3) Google</cell><cell>Base Wiki+Books</cell><cell>3.3B Wiki+Books</cell><cell>0.113 0.374</cell></row><row><cell>(4) Google</cell><cell>Large Wiki+Books</cell><cell>3.3B Wiki+Books</cell><cell>0.115 0.373</cell></row><row><cell>(5) Google WWM</cell><cell>Large Wiki+Books</cell><cell>3.3B Wiki+Books</cell><cell>0.121 0.399</cell></row><row><cell>(6) RoBERTa</cell><cell>Large Various (Non-Scientific)</cell><cell>33B (Non-Scientific)</cell><cell>0.125 0.409</cell></row><row><cell>(7) BioBERT v1.1</cell><cell>Base Wiki+Books+PubMed+PMC</cell><cell>21.3B PubMed+PMC</cell><cell>0.128 0.417</cell></row><row><cell>(8) SciBERT</cell><cell>Base Open Research (1M Full Papers)</cell><cell>3.2B Wiki+Books</cell><cell>0.125 0.409</cell></row><row><cell>(9) SciBERT</cell><cell>Base Open Research (1M Full Papers)</cell><cell>3.2B Open Research</cell><cell>0.131 0.423</cell></row><row><cell>(10) SciBERT</cell><cell>Large Open Research (7M Abstracts)</cell><cell>1.4B Wiki+Book</cell><cell>0.135 0.420</cell></row><row><cell>(11) SciBERT</cell><cell>Large Open Research (7M Abstracts)</cell><cell>1.4B Open Research</cell><cell>0.137 0.430</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4 .</head><label>4</label><figDesc>BM25 results on Open Research's development set when different query forms are used. BERT-based re-ranking is not applied in these experiments.</figDesc><table><row><cell></cell><cell cols="3">Open Research</cell><cell cols="3">PubMed</cell><cell cols="3">DBLP</cell></row><row><cell>Query Type</cell><cell>F1@20</cell><cell>MRR</cell><cell>R@1000</cell><cell>F1@20</cell><cell>MRR</cell><cell>R@1000</cell><cell>F1@20</cell><cell>MRR</cell><cell>R@1000</cell></row><row><cell>Key Terms (Whoosh)</cell><cell>0.065</cell><cell>0.251</cell><cell>0.282</cell><cell>0.201</cell><cell>0.595</cell><cell>0.604</cell><cell>0.130</cell><cell>0.425</cell><cell>0.510</cell></row><row><cell>Title</cell><cell>0.063</cell><cell>0.244</cell><cell>0.287</cell><cell>0.199</cell><cell>0.584</cell><cell>0.654</cell><cell>0.133</cell><cell>0.424</cell><cell>0.551</cell></row><row><cell>Title and Abstract</cell><cell>0.095</cell><cell>0.351</cell><cell>0.363</cell><cell>0.268</cell><cell>0.720</cell><cell>0.765</cell><cell>0.194</cell><cell>0.585</cell><cell>0.691</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head>Fig.</head><figDesc>F1@20 as a function of the number of query tokens (64, 128, 256, 384, 448).</figDesc></figure>
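The tables above report results in terms of F1@20, MRR, and R@1000. As a minimal illustrative sketch (not code from the paper), these ranking metrics can be computed for a single query as follows, where `ranked` is the system's ordered list of candidate document IDs and `relevant` is the set of true cited documents:

```python
def recall_at_k(ranked, relevant, k=1000):
    # R@k: fraction of the relevant documents found in the top k results.
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)

def f1_at_k(ranked, relevant, k=20):
    # F1@k: harmonic mean of precision@k and recall@k.
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    if hits == 0:
        return 0.0
    precision = hits / k
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)

def mrr(ranked, relevant):
    # MRR contribution of one query: reciprocal rank of the first
    # relevant document (0 if none is retrieved).
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0
```

Corpus-level numbers such as those in Tables 2-4 would then be averages of these per-query values over the evaluation set.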
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">https://s3-us-west-2.amazonaws.com/ai2-s2-research-public/open-corpus/ 2017-02-21/papers-2017-02-21.zip</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">https://github.com/allenai/citeomatic/blob/master/citeomatic/scripts/ evaluate.py</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">http://anserini.io/ BIR 2020 Workshop on Bibliometric-enhanced Information Retrieval</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_3">https://github.com/allenai/citeomatic#citeomatic-evaluation</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_4">https://whoosh.readthedocs.io/en/latest/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This research was supported in part by the Canada First Research Excellence Fund, the Natural Sciences and Engineering Research Council (NSERC) of Canada, NVIDIA, and eBay. Additionally, we would like to thank Google for computational resources in the form of Google Cloud credits.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Construction of the literature graph in Semantic Scholar</title>
		<author>
			<persName><forename type="first">W</forename><surname>Ammar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Groeneveld</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bhagavatula</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Beltagy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Crawford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Downey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dunkelberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Elgohary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Feldman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kinney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kohlmeier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Murray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">H</forename><surname>Ooi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Peters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Power</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Skjonsberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wilhelm</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Van Zuylen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Etzioni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="84" to="91" />
		</imprint>
	</monogr>
	<note>Industry Papers</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Technical paper recommendation: A study in combining multiple information sources</title>
		<author>
			<persName><forename type="first">C</forename><surname>Basu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hirsh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">W</forename><surname>Cohen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Nevill-Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Artificial Intelligence Research</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="231" to="252" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><surname>Beltagy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Cohan</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1903.10676</idno>
		<title level="m">SciBERT: Pretrained contextualized embeddings for scientific text</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Bhagavatula</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Feldman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Power</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Ammar</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1802.08301</idno>
		<title level="m">Content-based citation recommendation</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A system for automatic personalized tracking of scientific literature on the web</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">D</forename><surname>Bollacker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lawrence</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Giles</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourth ACM conference on Digital Libraries (DL &apos;99)</title>
				<meeting>the Fourth ACM conference on Digital Libraries (DL &apos;99)</meeting>
		<imprint>
			<date type="published" when="1999">1999</date>
			<biblScope unit="page" from="105" to="113" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Research paper recommender systems on big scholarly data</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">T</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Pacific Rim Knowledge Acquisition Workshop</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="251" to="260" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Combining global and local semantic contexts for improving biomedical information retrieval</title>
		<author>
			<persName><forename type="first">D</forename><surname>Dinh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Tamine</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European Conference on Information Retrieval</title>
				<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="375" to="386" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Extended co-citation search: Graph-based document retrieval on a cocitation network containing citation context information</title>
		<author>
			<persName><forename type="first">M</forename><surname>Eto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing &amp; Management</title>
		<imprint>
			<biblScope unit="volume">56</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page">102046</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Best Match: New relevance search for PubMed</title>
		<author>
			<persName><forename type="first">N</forename><surname>Fiorini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Canese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Starchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kireev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Osipov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kholodov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ismagilov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mohan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ostell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PLoS Biology</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page">e2005343</biblScope>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">How user intelligence is improving PubMed</title>
		<author>
			<persName><forename type="first">N</forename><surname>Fiorini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Leaman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">J</forename><surname>Lipman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nature Biotechnology</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="issue">10</biblScope>
			<biblScope unit="page">937</biblScope>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Swan: A distributed knowledge infrastructure for Alzheimer disease research</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kinoshita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Seaborne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cayzer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Clark</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Web Semantics: Science, Services and Agents on the World Wide Web</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="222" to="228" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">First steps towards electronic research communication</title>
		<author>
			<persName><forename type="first">P</forename><surname>Ginsparg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers in Physics</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="390" to="396" />
			<date type="published" when="1994">1994</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Context-aware citation recommendation</title>
		<author>
			<persName><forename type="first">Q</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Pei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kifer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Giles</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 19th International Conference on World Wide Web</title>
				<meeting>the 19th International Conference on World Wide Web</meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="421" to="430" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Recommending citations: Translating papers into references</title>
		<author>
			<persName><forename type="first">W</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kataria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Caragea</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Giles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Rokach</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM &apos;12)</title>
				<meeting>the 21st ACM International Conference on Information and Knowledge Management (CIKM &apos;12)</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="1910" to="1914" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">A neural probabilistic model for context based citation recommendation</title>
		<author>
			<persName><forename type="first">W</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Giles</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Twenty-Ninth AAAI Conference on Artificial Intelligence</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Information needs of clinical teams: Analysis of questions received by the clinical informatics consult service</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">N</forename><surname>Jerome</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">B</forename><surname>Giuse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">W</forename><surname>Gish</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">A</forename><surname>Sathe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Dietrich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Bulletin of the Medical Library Association</title>
		<imprint>
			<biblScope unit="volume">89</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page">177</biblScope>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Johnson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Watkinson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mabe</surname></persName>
		</author>
		<title level="m">The STM report: An overview of scientific and scholarly publishing</title>
				<imprint>
			<publisher>International Association of Scientific, Technical and Medical Publishers</publisher>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A scalable hybrid research paper recommender system for Microsoft Academic</title>
		<author>
			<persName><forename type="first">A</forename><surname>Kanakia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Eide</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The World Wide Web Conference</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="2893" to="2899" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">Adam: A method for stochastic optimization</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">P</forename><surname>Kingma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ba</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1412.6980</idno>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Conceptual recommender system for CiteSeerX</title>
		<author>
			<persName><forename type="first">A</forename><surname>Kodakateri Pudhiyaveetil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gauch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Luong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Eno</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third ACM Conference on Recommender Systems</title>
				<meeting>the Third ACM Conference on Recommender Systems</meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="241" to="244" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Indexing and retrieval of scientific literature</title>
		<author>
			<persName><forename type="first">S</forename><surname>Lawrence</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bollacker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Giles</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 8th ACM International Conference on Information and Knowledge Management (CIKM &apos;99)</title>
				<meeting>the 8th ACM International Conference on Information and Knowledge Management (CIKM &apos;99)</meeting>
		<imprint>
			<date type="published" when="1999">1999</date>
			<biblScope unit="page" from="139" to="146" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Yoon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">H</forename><surname>So</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1901.08746</idno>
		<title level="m">BioBERT: A pre-trained biomedical language representation model for biomedical text mining</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Context-based collaborative filtering for citation recommendation</title>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Kong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M</forename><surname>Bekele</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Xia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="1695" to="1703" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">RoBERTa: A Robustly Optimized BERT Pretraining Approach</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1907.11692</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Recommending citations with translation model</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Shan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 20th ACM International Conference on Information and Knowledge Management (CIKM &apos;11)</title>
				<meeting>the 20th ACM International Conference on Information and Knowledge Management (CIKM &apos;11)</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="2017" to="2020" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">On the recommending of citations for research papers</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Mcnee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Albert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cosley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Gopalkrishnan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Lam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Rashid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Konstan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Riedl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2002 ACM Conference on Computer Supported Cooperative Work</title>
				<meeting>the 2002 ACM Conference on Computer Supported Cooperative Work</meeting>
		<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="116" to="125" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Deep learning for biomedical information retrieval: Learning textual relevance from click logs</title>
		<author>
			<persName><forename type="first">S</forename><surname>Mohan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Fiorini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">BioNLP</title>
		<imprint>
			<biblScope unit="page" from="222" to="231" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Improved biomedical term selection in pseudo relevance feedback</title>
		<author>
			<persName><forename type="first">M</forename><surname>Nabeel Asim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wasim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Usman Ghani Khan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Mahmood</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Database</title>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Nogueira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1901.04085</idno>
		<title level="m">Passage re-ranking with BERT</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1906.05474</idno>
		<title level="m">Transfer learning in biomedical natural language processing: An evaluation of BERT and ELMo on ten benchmarking datasets</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">ClusCite: Effective citation recommendation by information network-based clustering</title>
		<author>
			<persName><forename type="first">X</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Khandelwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Han</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</title>
				<meeting>the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="821" to="830" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Okapi at TREC-3</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">E</forename><surname>Robertson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Walker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hancock-Beaulieu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gatford</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 3rd Text REtrieval Conference (TREC-3)</title>
				<meeting>the 3rd Text REtrieval Conference (TREC-3)<address><addrLine>Gaithersburg, Maryland</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1994">1994</date>
			<biblScope unit="page" from="109" to="126" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Automated hypothesis generation based on mining scientific literature</title>
		<author>
			<persName><forename type="first">S</forename><surname>Spangler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">D</forename><surname>Wilkins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">J</forename><surname>Bachman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Nagarajan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Dayaram</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">J</forename><surname>Haas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Regenbogen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">R</forename><surname>Pickering</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Comer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">N</forename><surname>Myers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">R</forename><surname>Stanoi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lelescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Labrie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Parikh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Lisewski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Donehower</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Lichtarge</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</title>
				<meeting>the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1877" to="1886" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Moliere: Automatic biomedical hypothesis generation system</title>
		<author>
			<persName><forename type="first">J</forename><surname>Sybrandt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Shtutman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Safro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</title>
				<meeting>the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1633" to="1642" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Anserini: Enabling the use of Lucene for information retrieval research</title>
		<author>
			<persName><forename type="first">P</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Fang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017)</title>
				<meeting>the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017)</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1253" to="1256" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Anserini: Reproducible ranking baselines using Lucene</title>
		<author>
			<persName><forename type="first">P</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Fang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Data and Information Quality</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page">16</biblScope>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
