<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AUEB-NLP at BioASQ 8: Biomedical Document and Snippet Retrieval</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dimitris Pappas</string-name>
          <email>dpappas@ilsp.gr</email>
          <email>pappasd@aueb.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Petros Stavropoulos</string-name>
          <email>petros.stavropoulos@athenarc.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ion Androutsopoulos</string-name>
          <email>ion@aueb.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Informatics, Athens University of Economics and Business</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Language and Speech Processing, Research Center 'Athena'</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present the submissions of AUEB's NLP group to the BIOASQ 8 document and snippet retrieval tasks. We relied mostly on JPDRMM, our top-performing model of BIOASQ 7, but we also tested feeding JPDRMM with word embeddings obtained by applying a graph node embedding method to a biomedical co-occurrence graph; the latter approach was competitive with using biomedical WORD2VEC embeddings in JPDRMM. We also experimented with neural methods to encode, index, and directly retrieve snippets (sentences) and indirectly documents containing the retrieved snippets, instead of relying on conventional information retrieval to pre-fetch possibly relevant documents and invoking JPDRMM to re-rank the pre-fetched documents and their snippets; conventional BM25-based pre-fetching, however, was far better. Our JPDRMM-based document and snippet retrieval methods scored at the top or near the top for all test batches of BIOASQ 8.</p>
      </abstract>
      <kwd-group>
        <kwd>Biomedical Document Retrieval</kwd>
        <kwd>Biomedical Snippet Retrieval</kwd>
        <kwd>BioASQ</kwd>
        <kwd>Deep Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        BIOASQ [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] is an annual competition for biomedical document classification
(Task A), as well as document, snippet, structured data retrieval, question
answering, and summarization (Task B).<sup>3</sup> This work pertains to Task B, which
consists of two ‘phases’. In Phase A, systems are provided with English biomedical
questions and are required to retrieve relevant documents and document
snippets from a collection of MEDLINE/PUBMED articles.<sup>4</sup> In Phase B, systems are
provided with English biomedical questions, along with gold relevant
documents and gold document snippets per question; they are required to respond
with ‘exact answers’ (e.g., named entities) and ‘ideal’ answers, i.e.,
paragraph-sized summaries. Here we provide an overview of the submissions of AUEB’s
NLP group to the document and snippet retrieval tasks (parts of Task 8b, Phase
A). We also participated in exact answer extraction (part of Task 8b, Phase B)
this year, but we describe this aspect of our work very briefly, because it was
only a quick attempt to reuse work from cloze-style biomedical machine
reading comprehension (MRC) [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], which led to poor results. By contrast, our best
document and snippet retrieval systems scored at the top or near the top for all
test batches of BIOASQ 8, as in BIOASQ 6 and 7 [
        <xref ref-type="bibr" rid="ref28 ref4">4, 28</xref>
        ].
      </p>
      <p>
        For document and snippet retrieval, we employed our JPDRMM model [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ],
which had achieved top performance in BIOASQ 7. This year, we tested JPDRMM
with biomedical WORD2VEC embeddings [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], as in our previous work, but also
with word embeddings obtained by applying a graph node embedding method
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] to a biomedical entity co-occurrence graph; both approaches were equally
good. We also experimented with neural methods to encode, index, and
directly retrieve relevant snippets (sentences) and indirectly retrieve documents
containing the retrieved snippets, instead of relying on conventional
information retrieval to pre-fetch possibly relevant documents and invoking JPDRMM
to re-rank the pre-fetched documents and their snippets; conventional
BM25-based pre-fetching, however, followed by JPDRMM re-ranking was far better.<sup>5</sup>
      </p>
      <p>
        As already noted, we also participated (for the first time) in exact answer
extraction this year. We experimented with SCIBERT-MAX-READER, a
SCIBERT-based model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] that we recently introduced for cloze-style biomedical MRC
[
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. Although SCIBERT-MAX-READER has been found to reach or even exceed
human expert performance in biomedical cloze-style MRC [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], it performed
poorly in BIOASQ 8, indicating that BIOASQ’s exact answer extraction task is not
as similar as we had hoped to the MRC task and dataset SCIBERT-MAX-READER
was developed for. Hence, we do not provide here any further information
about this aspect of our work, though we hope to examine in future work if
it can be used to pre-train exact answer extraction components, which will be
subsequently fine-tuned on BIOASQ exact answer extraction training instances.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 JPDRMM-based Models for Document and Snippet Retrieval</title>
      <p>
        Following last year’s successful approach [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ], we used JPDRMM as our main
model for document and snippet retrieval. JPDRMM is actually a (neural)
re-ranking model; it is fed with the top N documents retrieved by a
conventional (and computationally more efficient) information retrieval engine, and
it is trained to jointly re-rank the top N documents and their snippets.
<fn><label>5</label><p>As in recent BIOASQ editions, gold snippets are almost always sentences in BIOASQ
8. Hence, we take snippets to be sentences in our experiments. When gold snippets
contain multiple sentences, we break them into multiple single-sentence snippets.</p></fn>
We do not discuss JPDRMM further here, since it is presented in detail in our previous
work [
        <xref ref-type="bibr" rid="ref28">28</xref>
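        ]. We note, however, that JPDRMM computes one loss for document
re-ranking (<inline-formula><tex-math>L_{doc}</tex-math></inline-formula>) and another one for snippet re-ranking (<inline-formula><tex-math>L_{snip}</tex-math></inline-formula>). In our previous
work, we simply added the two losses, but this year we used a linear
combination of the two losses, <inline-formula><tex-math>L = \lambda_{doc} L_{doc} + \lambda_{snip} L_{snip}</tex-math></inline-formula>, and we tuned the two loss
weights (<inline-formula><tex-math>\lambda_{doc}, \lambda_{snip}</tex-math></inline-formula>) by performing a 10-fold cross-validation on training data.<sup>6</sup>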
      </p>
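      <p>The weighted combination of the two losses and the search over candidate weights can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the <monospace>evaluate</monospace> callback (standing in for training under cross-validation and returning a validation score, higher being better) are hypothetical; only the candidate weight values are taken from the paper's footnote on tuning.</p>
      <preformat><![CDATA[
```python
from itertools import product

# Candidate weights, as listed in the paper's footnote on loss-weight tuning.
WEIGHT_GRID = [0, 0.001, 0.01, 0.1, 0.2, 1.0, 5.0, 10.0, 100.0]


def joint_loss(doc_loss, snip_loss, w_doc, w_snip):
    """Linear combination of the document and snippet re-ranking losses."""
    return w_doc * doc_loss + w_snip * snip_loss


def select_weights(evaluate, grid=WEIGHT_GRID):
    """Return the (w_doc, w_snip) pair with the highest validation score.

    `evaluate` is a hypothetical callback: it is assumed to train the model
    with the given weights (e.g. under cross-validation) and to return a
    score where higher is better.
    """
    return max(product(grid, repeat=2), key=lambda pair: evaluate(*pair))
```
]]></preformat>
      <p>With a real training loop, <monospace>evaluate</monospace> would be the expensive step, since each weight pair requires a full cross-validation run; the exhaustive grid is only practical because it is small.</p>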
      <p>
        Furthermore, in our previous work JPDRMM had been used with biomedical
WORD2VEC embeddings [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. Here we also experimented with word
embeddings obtained by applying a graph node embedding method [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] to a
biomedical entity co-occurrence graph. The node embedding method we used is an
extension of NODE2VEC [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] that considers both the topology of the graph it is
applied to and text associated with each node of the graph. In our case, nodes
are biomedical entities and the text of each node is the (often multi-word)
English name of the corresponding entity. Roughly speaking, the graph node
embedding method uses an RNN to obtain a node embedding from the word
embeddings of the text (name) of the node, and then applies graph convolutions to
make sure that the embeddings of nodes with common neighbors are close to
each other. In effect, the embeddings of two nodes (entities) end up being close
to each other if the two nodes have similar names (e.g., ‘acute cardiomyopathy’,
‘cardiomyopathy’) or similar neighbors. To construct the entity co-occurrence
graph, we used PUBTATOR [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] to identify the biomedical entities in a
randomly selected set of approx. 5 million PUBMED abstracts. Whenever a
biomedical entity was found in the same abstract with another one, a link between
the two entities was added to the graph. We then pruned links corresponding
to co-occurrences with frequencies lower than 10. Although the graph
embedding method is primarily intended to generate node embeddings, it also
generates word embeddings, which we used as an alternative to WORD2VEC
embeddings. The intuition was that nodes (entities) with similar neighborhoods
in the co-occurrence graph are probably related, the graph node embedding
method places their embeddings close to each other, and this might also help
place close to each other the embeddings of the words of the names of the two
related nodes, since node embeddings are based on the word embeddings of
the node names. We call GRAPH-JPDRMM the JPDRMM version that uses word
embeddings obtained via the graph embedding method, and W2V-JPDRMM the
original JPDRMM version with biomedical WORD2VEC embeddings.
      </p>
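      <p>The construction of the entity co-occurrence graph, including the pruning of rare links, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: entity recognition (PUBTATOR in the paper) is assumed to have been run already, entities are represented as plain strings, each abstract contributes at most one co-occurrence per entity pair, and the function name is hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(abstract_entities, min_count=10):
    """Build an undirected entity co-occurrence graph.

    abstract_entities: iterable of collections of entity ids, one collection
    per abstract (assumed to be the output of an entity recogniser).
    Returns a dict mapping (entity_a, entity_b) edges to co-occurrence counts,
    keeping only edges observed in at least `min_count` abstracts.
    """
    counts = Counter()
    for entities in abstract_entities:
        # Sorting gives every unordered pair a single canonical key.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_count}
```
]]></preformat>
      <p>The default threshold of 10 mirrors the pruning of links with co-occurrence frequencies lower than 10 described above.</p>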
    </sec>
    <sec id="sec-4">
      <title>3 SEMantic Indexing for SEntence Retrieval (SEMISER)</title>
      <p>
        Our JPDRMM-based models of the previous section rely on conventional
information retrieval to obtain a set of N possibly relevant documents from the
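document collection, and then invoke JPDRMM to re-rank the retrieved N
documents and their snippets.
<fn><label>6</label><p>We tuned for <inline-formula><tex-math>\lambda_{snip}</tex-math></inline-formula> and <inline-formula><tex-math>\lambda_{doc}</tex-math></inline-formula> in <inline-formula><tex-math>\{0, 0.001, 0.01, 0.1, 0.2, 1.0, 5.0, 10.0, 100.0\}</tex-math></inline-formula>, which led to
<inline-formula><tex-math>\lambda_{snip} = 0.01</tex-math></inline-formula> and <inline-formula><tex-math>\lambda_{doc} = 1.0</tex-math></inline-formula>.</p></fn>
Instead, in this section we use a neural encoder to
map each sentence of the document collection to a sentence embedding, and we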
index the sentences of the document collection by their sentence embeddings.
We use a similar encoder to map each query to a query embedding, and
approximate k-NN retrieval algorithms [
        <xref ref-type="bibr" rid="ref20 ref3">3, 20</xref>
        ] to retrieve the sentences of the
document collection whose embeddings are closest to the query embedding; the
retrieved sentences are ranked by increasing distance to the query embedding.
When required to retrieve documents too, we simply report the documents that
contained the retrieved sentences; the relevance score of each document is the
minimum query-sentence distance over all the sentences of the document.<sup>7</sup> The
encoder of the sentences and the encoder of the queries are jointly trained in a
‘self-supervised’ manner, detailed below, which does not require manually
labeled gold relevant documents and snippets per training query. The resulting
method, called SEMISER (SEMantic Indexing for SEntence Retrieval), is a new
deep learning model for semantic indexing of sentences [
        <xref ref-type="bibr" rid="ref40">40</xref>
        ].
      </p>
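      <p>The retrieval step described above (rank sentences by increasing distance to the query embedding, then score each document by its closest sentence) can be sketched as follows. This is a simplified illustration, not the SEMISER implementation: it assumes a single embedding per sentence and per query, uses cosine distance, and replaces approximate k-NN search with an exact scan; all names are hypothetical.</p>
      <preformat><![CDATA[
```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def retrieve(query_vec, sentence_index, k):
    """Return the k nearest sentences and the documents that contain them.

    sentence_index: list of (sentence_id, doc_id, embedding) triples.
    An exact scan stands in for approximate k-NN search here.
    """
    scored = sorted(
        (cosine_distance(query_vec, emb), sent_id, doc_id)
        for sent_id, doc_id, emb in sentence_index
    )[:k]
    # A document's score is the minimum distance over its retrieved sentences.
    doc_scores = {}
    for dist, _, doc_id in scored:
        doc_scores[doc_id] = min(dist, doc_scores.get(doc_id, math.inf))
    sentences = [(sent_id, dist) for dist, sent_id, _ in scored]
    documents = sorted(doc_scores.items(), key=lambda item: item[1])
    return sentences, documents
```
]]></preformat>
      <p>Because documents are derived from the retrieved sentences, a document can only be returned if at least one of its sentences is among the k nearest neighbours.</p>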
      <p>
        SEMISER takes a sentence and a query as input (Fig. 1). Each word of the
sentence and query is mapped to the corresponding word embedding. In BIOASQ 8,
we used the same biomedical WORD2VEC embeddings as in the JPDRMM
methods.<sup>8</sup> Two stacked trigram convolutional layers (with tanh activations) are used
to obtain a context-aware embedding for each word, and a self-attention layer
(different for sentences and queries) then computes the sentence and query
embeddings. The self-attention layer actually produces two sentence embeddings
(vectors) and two query embeddings. The intuition is that the two vectors will
capture different views of the sentence and query, respectively, similar in spirit
to the multiple representations obtained when using multiple attention heads
in Transformer-based models [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]. To force the two sentence (or query)
embeddings to learn different views of the sentence (or query), we compute a cosine
similarity loss between the two sentence (or query) embeddings during
training. We also compute the maximum cosine similarity over all four pairs of
sentence and query embeddings, and require it to be 1 (or 0) when SEMISER is given
a query and a relevant (or irrelevant) sentence, using binary cross-entropy loss.<sup>9</sup>
We simply added the three losses, but we plan to tune the loss weights in
future work. Although SEMISER can be trained in a supervised manner, by using
pairs consisting of queries and relevant (or irrelevant) sentences as positive (or
negative) training instances, BIOASQ provides relatively few training instances
by today’s standards (approx. 2.6k training queries, with approx. 1.24 relevant
snippets per query on average). Instead, we opted for a ‘self-supervised’
approach, using an auxiliary training task for which very large numbers of
training instances can be obtained without manual annotation.
      </p>
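      <p>The scoring rule and the three summed losses described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' model code: the paper does not spell out the exact form of the cosine similarity loss between the two views, so the clipped cosine used in <monospace>diversity_loss</monospace> is an assumption, and all function names are hypothetical.</p>
      <preformat><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def relevance_score(sentence_vecs, query_vecs):
    """Maximum cosine over all sentence/query embedding pairs, clipped at 0."""
    best = max(cosine(s, q) for s in sentence_vecs for q in query_vecs)
    return max(0.0, best)  # ReLU: negative similarities become zero


def binary_cross_entropy(score, label, eps=1e-7):
    """BCE between a score in [0, 1] and a 0/1 label (clamped for stability)."""
    p = min(max(score, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def diversity_loss(vec_a, vec_b):
    """Assumed form: penalise the two views of one text for being similar."""
    return max(0.0, cosine(vec_a, vec_b))


def total_loss(sentence_vecs, query_vecs, label):
    """Unweighted sum of the relevance loss and the two diversity losses."""
    return (
        binary_cross_entropy(relevance_score(sentence_vecs, query_vecs), label)
        + diversity_loss(*sentence_vecs)
        + diversity_loss(*query_vecs)
    )
```
]]></preformat>
      <p>With two embeddings per sentence and two per query, <monospace>relevance_score</monospace> takes the maximum over exactly four pairs.</p>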
      <p>
        <fn><label>7</label><p>We also maintain an index that maps sentences to their documents.</p></fn>
<fn><label>8</label><p>The word embeddings are not updated during training, in any of the methods we
experimented with.</p></fn>
<fn><label>9</label><p>We replace all negative cosine similarity values by zeros, using a RELU activation.</p></fn>
For the auxiliary training task, we used sentences from 50k randomly
selected PUBMED documents. The positive training instances were pairs
consisting of one of the sentences and a (possibly multi-word) keyterm extracted from
the sentence using SGRANK [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], an unsupervised keyterm extraction method.
The negative training instances were pairs consisting of one of the sentences
and a keyterm extracted from another randomly selected sentence. This process
led to approx. 2.3 million training instances; we generated an equal number of
positive and negative instances. In effect, the auxiliary task requires SEMISER
to be able to generate sentence and query embeddings containing enough
information to decide if a sentence contains a keyterm (treated as a query) or
not. The intuition is that in most cases relevant sentences contain keyterms of
the queries, hence being able to predict if a keyterm is included in a sentence is
important. By forcing keyterms (more generally queries) to be represented by
low-dimensional embeddings, we also hope that similar queries will end up
being close in the vector space, and that similar sentences will also end up being
close in a similar manner.
      </p>
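      <p>The generation of balanced positive and negative instances for the auxiliary task can be sketched as follows. This is a minimal illustration, not the authors' code: keyterm extraction (SGRANK in the paper) is assumed to have been done already and its output is passed in by the caller; the function name and the fixed random seed are hypothetical.</p>
      <preformat><![CDATA[
```python
import random


def make_training_pairs(sentence_keyterms, seed=0):
    """Create one positive and one negative instance per sentence.

    sentence_keyterms: list of (sentence, keyterm) pairs, where each keyterm
    is assumed to have been extracted from its own sentence.
    Returns (sentence, keyterm, label) triples; label 1 marks a keyterm taken
    from the sentence itself, label 0 a keyterm taken from another sentence.
    """
    if len(sentence_keyterms) < 2:
        raise ValueError("need at least two sentences to sample negatives")
    rng = random.Random(seed)
    instances = []
    for i, (sentence, keyterm) in enumerate(sentence_keyterms):
        instances.append((sentence, keyterm, 1))
        # Sample an index different from i, uniformly over the other sentences.
        j = rng.randrange(len(sentence_keyterms) - 1)
        if j >= i:
            j += 1
        instances.append((sentence, sentence_keyterms[j][1], 0))
    return instances
```
]]></preformat>
      <p>The sketch does not check whether a sampled negative keyterm happens to occur in the sentence as well; whether such cases were filtered in the original experiments is not stated.</p>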
      <p>Having trained SEMISER, we use its left part (Fig. 1) to obtain and index
(offline) sentence embeddings, and the right part to convert queries (on the fly)
to query embeddings. To retrieve sentences (and the documents that contain
them), we query the index of sentence embeddings (using approximate k-NN
matching) to obtain the sentences with the most similar sentence embeddings.
For each retrieved sentence, we compute again the maximum similarity score
over all four pairs of sentence-query embeddings.
      </p>
    </sec>
    <sec id="sec-5">
      <title>4 Overall System Architecture</title>
      <p>In the JPDRMM-based methods (Section 2), we use ElasticSearch<sup>10</sup> to index the
approx. 30 million PUBMED ‘documents’ (concatenated titles and abstracts) of
BIOASQ 8. Figure 2 illustrates the overall architecture of our JPDRMM-based
methods. Given a question, we submit it as a query to ElasticSearch to retrieve
the N documents with the best BM25 scores; in our experiments, N = 100. Then
JPDRMM (W2V-JPDRMM or GRAPH-JPDRMM) jointly re-ranks the N documents
and their snippets, assigning relevance scores to each one. We return the <inline-formula><tex-math>n_d</tex-math></inline-formula>
documents with the highest relevance scores, and the <inline-formula><tex-math>n_s</tex-math></inline-formula> snippets with the highest
relevance scores among the snippets of the <inline-formula><tex-math>n_d</tex-math></inline-formula> documents. We set <inline-formula><tex-math>n_d = n_s = 10</tex-math></inline-formula>,
as required by BIOASQ 8.
<fn><label>10</label><p><ext-link ext-link-type="uri" xlink:href="https://www.elastic.co/elasticsearch/">https://www.elastic.co/elasticsearch/</ext-link></p></fn></p>
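      <p>The final selection step of the JPDRMM-based pipeline (keep the best-scoring documents, then the best-scoring snippets among the snippets of those documents) can be sketched as follows. The sketch assumes that re-ranking scores are already available as dictionaries and that higher scores mean higher relevance; the function name and data layout are hypothetical.</p>
      <preformat><![CDATA[
```python
def select_results(doc_scores, snippet_scores, n_docs=10, n_snips=10):
    """Pick the top documents, then the top snippets within those documents.

    doc_scores: {doc_id: score}
    snippet_scores: {(doc_id, snippet_id): score}
    Scores are assumed to come from a re-ranker; higher means more relevant.
    """
    top_docs = sorted(doc_scores, key=doc_scores.get, reverse=True)[:n_docs]
    allowed = set(top_docs)
    # Only snippets of the selected documents are eligible.
    candidates = [key for key in snippet_scores if key[0] in allowed]
    top_snips = sorted(candidates, key=snippet_scores.get, reverse=True)[:n_snips]
    return top_docs, top_snips
```
]]></preformat>
      <p>Note that a high-scoring snippet is dropped if its document does not make it into the selected documents, which is the behaviour the paragraph above describes.</p>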
      <p>When using SEMISER (Section 3), we sentence-split<sup>11</sup> all the PUBMED
‘documents’ of BIOASQ 8, we map each sentence to its two sentence embeddings (left
part of Fig. 1, already trained, see also Fig. 3), and we index (offline) all the
sentence embeddings in a single index. Given a BIOASQ question, we map it on the
fly to its two query embeddings (right part of Fig. 1, already trained, see also
Fig. 3), and we query the index of sentence embeddings (using approximate
k-NN matching) to retrieve the <inline-formula><tex-math>n_s</tex-math></inline-formula> sentences with the highest similarity scores.<sup>12</sup>
The similarity (relevance) score of each sentence is the maximum cosine
similarity over all four pairs of sentence-query embeddings (as in the lower part
of Fig. 1). To retrieve documents, we assign to each document the score of its
best (most relevant) snippet, and return the <inline-formula><tex-math>n_d</tex-math></inline-formula> documents with the best scores.
Again, we set <inline-formula><tex-math>n_d = n_s = 10</tex-math></inline-formula>, as required by BIOASQ 8. Since multiple snippets
may come from the same document, we actually initially retrieve more than 10
snippets, to always be able to return 10 documents.</p>
      <p>Unfortunately, when used on its own to directly retrieve sentences and
documents (Fig. 3), SEMISER performed poorly. Hence, we also experimented with
combinations of SEMISER with other methods (e.g., applying SEMISER only to
documents retrieved by ElasticSearch), which are discussed in the next section.
<fn><label>11</label><p>We use NLTK’s English sentence splitter; see <ext-link ext-link-type="uri" xlink:href="https://www.nltk.org/api/nltk.tokenize.html">https://www.nltk.org/api/nltk.tokenize.html</ext-link>.</p></fn>
<fn><label>12</label><p>We use HNSWLIB [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] for approximate k-NN matching.</p></fn></p>
      <p>
        The BIOASQ 8 document collection consists of approx. 31M ‘documents’
(concatenated titles and abstracts) from the openly available ‘MEDLINE/PubMed
Baseline 2020’ collection.<sup>13</sup> We discarded approx. 11M articles that contained
only titles. The average ‘document’ length is 197.19 words, the minimum length
is 11 words, and the maximum length is 1,500 words. There are 2,647 training
questions, from which we held out 100 for development. The average training
question length is 9.02 words, and the maximum length is 30 words. Each one
of the five (not publicly available) test batches contains 100 questions.
      </p>
      <sec id="sec-5-1">
        <title>5.1 AUEB-NLP Submissions</title>
        <p>We submitted the following five systems to BIOASQ 8 (Task 8b, Phase A). In all
cases, we used BM25 when scoring documents with ElasticSearch.</p>
        <p>AUEB-NLP-1: W2V-JPDRMM for document and snippet retrieval, with BM25 for
initial document retrieval.</p>
        <p>AUEB-NLP-2 (batches 2–5 only): Same as AUEB-NLP-1, but with GRAPH-JPDRMM
instead of W2V-JPDRMM.</p>
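        <p>AUEB-NLP-3: Same as AUEB-NLP-1, but we use SEMISER to re-score the
sentences of the <inline-formula><tex-math>n_d</tex-math></inline-formula> documents that W2V-JPDRMM retrieves. Each one of the <inline-formula><tex-math>n_d</tex-math></inline-formula>
documents is then re-ranked by the score of its best snippet.</p>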
        <p>AUEB-NLP-4: SEMISER for document and snippet retrieval (Fig. 1) in batches
1–2. An ensemble of AUEB-NLP-1 and AUEB-NLP-2 in batches 3–5. In batches
3–4, the ensemble summed the scores of the two models (both when scoring
documents and snippets); in batch 5, it used the maximum score of the two
models.</p>
        <p>AUEB-NLP-5: BM25 for document retrieval, then SEMISER for snippet retrieval.</p>
        <p>The last three systems were intended to test the performance of SEMISER,
when used on its own for both document and snippet retrieval (AUEB-NLP-4,
batches 1–2), when pipelined after BM25 (AUEB-NLP-5), or when used as an
additional rescoring mechanism after W2V-JPDRMM (AUEB-NLP-3). Since SEMISER
performed very poorly when used on its own (AUEB-NLP-4, batches 1–2), in the
last three batches we used the slot of AUEB-NLP-4 to experiment with ensembles
of our two best systems (AUEB-NLP-1 and AUEB-NLP-2).
<fn><label>13</label><p>Available from <ext-link ext-link-type="uri" xlink:href="ftp://ftp.ncbi.nlm.nih.gov/pubmed/baseline/">ftp://ftp.ncbi.nlm.nih.gov/pubmed/baseline/</ext-link>.</p></fn></p>
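        <p>The two simple ensembling schemes mentioned for AUEB-NLP-4 (summing the scores of two systems, or taking their maximum) can be sketched as follows. This is an illustration only: it assumes that the two systems produce scores on comparable scales and that an item missing from one system contributes a score of 0, neither of which is specified in the text; the function name is hypothetical.</p>
        <preformat><![CDATA[
```python
def ensemble_scores(scores_a, scores_b, mode="sum"):
    """Combine two {item_id: score} dicts by summing or taking the maximum.

    Items returned by only one system get a default score of 0.0 from the
    other system (an assumption made for this sketch).
    """
    if mode not in ("sum", "max"):
        raise ValueError("mode must be 'sum' or 'max'")
    combine = (lambda a, b: a + b) if mode == "sum" else max
    items = set(scores_a) | set(scores_b)
    return {
        item: combine(scores_a.get(item, 0.0), scores_b.get(item, 0.0))
        for item in items
    }
```
]]></preformat>
        <p>The same function can be applied separately to document scores and to snippet scores, after which items are re-sorted by their combined score.</p>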
      </sec>
      <sec id="sec-5-2">
        <title>5.2 Results</title>
        <p>The official BIOASQ evaluation measure for document and snippet retrieval is
Mean Average Precision (MAP). Table 1 reports the official MAP scores of our
systems for batches 1–5, along with the best score achieved by other
participants in each batch. We also report system rankings, again based on MAP.</p>
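        <p>For reference, the textbook definition of Mean Average Precision can be sketched as follows. The official BIOASQ evaluation may use a different normalization (for example, a fixed denominator tied to the length of the returned list), so this sketch should not be read as the official scorer; it also assumes that a ranked list contains no duplicate items.</p>
        <preformat><![CDATA[
```python
def average_precision(ranked, relevant):
    """Textbook average precision of one ranked list.

    ranked: list of item ids in rank order (assumed free of duplicates).
    relevant: collection of gold relevant item ids.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


def mean_average_precision(runs):
    """runs: non-empty list of (ranked_list, relevant_items) pairs, one per query."""
    return sum(average_precision(r, g) for r, g in runs) / len(runs)
```
]]></preformat>
        <p>With this normalization (division by the number of gold relevant items), the average precision of a single query can never exceed 1.</p>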
        <p>A first observation is that AUEB-NLP-1 and AUEB-NLP-2, which use
W2V-JPDRMM and GRAPH-JPDRMM respectively, performed particularly well in
snippet retrieval. In batches 1–4, they were the top two systems in snippet retrieval,
largely outperforming all other systems in MAP, and they ranked 2nd and 3rd
in batch 5, where their MAP was close to that of the best system.<sup>14</sup> The two
systems also performed well in document retrieval, where they were ranked in
the top 8 positions in all batches among more than 20 participants. These
document and snippet retrieval results also indicate that JPDRMM works equally
well with the original biomedical WORD2VEC embeddings (W2V-JPDRMM) and
the word embeddings we obtained from the entity co-occurrence graph via the
graph node embedding method (GRAPH-JPDRMM).</p>
        <p>
          Another key observation is that SEMISER, which uses self-supervised neural
encoders to index and retrieve sentences and indirectly documents
(AUEB-NLP-4, batches 1–2 only), performed poorly, both in document and snippet retrieval.
When SEMISER was used only to score the sentences of documents retrieved by
BM25 (AUEB-NLP-5), its snippet MAP improved substantially (see batches 1–2),
but remained well below the snippet MAP of the best systems. For document
retrieval, AUEB-NLP-5 uses BM25, hence the corresponding document MAP results
show the performance of conventional information retrieval. When SEMISER
was used to re-score sentences and documents retrieved by BM25 and
W2V-JPDRMM (AUEB-NLP-3), both document MAP and snippet MAP were lower than
those of AUEB-NLP-5. Overall, we were unable to obtain benefits by including
SEMISER in any of our systems. We note, however, that SEMISER is still in early
development stages. We plan to investigate its failures further and try to
improve it. The simplistic ensembles (summing or taking the maximum score) of
AUEB-NLP-1 and AUEB-NLP-2 that we experimented with (AUEB-NLP-4, batches
3–5) did not improve document MAP and, more surprisingly, led to much worse
snippet MAP compared to the scores of the systems we combined. We also need
to investigate these results further.
<fn><label>14</label><p>We are surprised by the fact that the official MAP scores occasionally exceed 100%,
which may be due to using a wrong normalization.</p></fn>
        </p>
        <p>
          Neural models for re-ranking have shown promising results in multiple
domains [
          <xref ref-type="bibr" rid="ref1 ref39">1, 39</xref>
          ]. The introduction of BERT sparked the creation of more deep
learning approaches for document retrieval and re-ranking [
          <xref ref-type="bibr" rid="ref12 ref22 ref25 ref27 ref41">12, 22, 25, 27, 41</xref>
          ].
Recently several deep learning models have also been introduced for neural
document retrieval using document vector representations [
          <xref ref-type="bibr" rid="ref12 ref19 ref24 ref36 ref44 ref6 ref7">6, 7, 12, 19, 24, 36, 44</xref>
          ].
        </p>
        <p>
          In the document retrieval task, several approaches that benefit from data
structured as graphs have been proposed [
          <xref ref-type="bibr" rid="ref21 ref32 ref44 ref45 ref9">9, 21, 32, 44, 45</xref>
          ]. A model that
benefits from structured data in biomedical IR is GRAPHENE [
          <xref ref-type="bibr" rid="ref46">46</xref>
          ]. It uses
graph-augmented document representation learning, query expansion, and
representation learning to rank biomedical articles. Its creators concatenated the titles
and abstracts of biomedical articles from the TREC Precision Medicine track [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]
to create a pool of documents. They then used the MESH terms of the articles as
queries, aiming to retrieve articles labeled with the MESH terms of each query.
GRAPHENE managed to surpass other deep learning models [
          <xref ref-type="bibr" rid="ref17 ref24">17, 24</xref>
          ] in this task,
which resembles the auxiliary task of SEMISER.
        </p>
      </sec>
      <sec id="sec-5-3">
        <title>6.2 Snippet Retrieval</title>
        <p>
          Several deep learning approaches have been proposed for biomedical snippet
retrieval [
          <xref ref-type="bibr" rid="ref10 ref26 ref43">10, 26, 43</xref>
          ]. TANDA [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] is a BERT-based deep learning model for answer
sentence selection. The authors fine-tuned a pre-trained BERT model, using
non-biomedical data obtained from the Natural Questions dataset [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Their tuning
led to a BERT model trained for answer sentence selection on Wikipedia articles.
A second fine-tuning step adapts the obtained model to the specific target
domain of each dataset they use for testing. Their model outperformed the
previous state-of-the-art model on the TREC-QA dataset [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ], which partly consists of
biomedical documents. Yoon et al. [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ] proposed a clustering-based method for
sentence selection. Their clustering significantly improved performance,
leading to state-of-the-art results on WIKIQA [
          <xref ref-type="bibr" rid="ref42">42</xref>
          ] and TREC-QA [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] when it
was published, but was later surpassed by TANDA.
        </p>
        <p>
          The COVID-19 pandemic and the need to keep up with rapidly increasing
related biomedical literature led to new document collections and retrieval
challenges for COVID-19 [
          <xref ref-type="bibr" rid="ref30 ref37">30, 37</xref>
          ]. These datasets, however, provide only gold
documents, not snippets, and methods proposed for them focus only on document
retrieval [
          <xref ref-type="bibr" rid="ref16 ref18 ref41 ref5">5, 16, 18, 41</xref>
          ], some also on summarization [
          <xref ref-type="bibr" rid="ref13 ref33">13, 33</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>7 Conclusions and Future Work</title>
      <p>In this short system description paper, we presented our submissions to the
document and snippet retrieval tasks of BIOASQ 8. Our joint JPDRMM model
scored at the top or near the top in both tasks across all batches, as in BIOASQ
7. Its performance was equally good when its biomedical WORD2VEC
embeddings were replaced by word embeddings obtained from a biomedical entity
co-occurrence graph via a graph embedding method. In future work, we also
plan to experiment with JPDRMM fed with random word embeddings, to
investigate to what extent JPDRMM is affected by the word embeddings used. We
also experimented with self-supervised neural encoders to index and directly
retrieve snippets and indirectly documents, instead of initially retrieving
documents using conventional information retrieval and then re-ranking the
documents and their snippets with JPDRMM. This approach performed poorly, but
is still in early stages and we hope to improve it further.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ahmad</surname>
            ,
            <given-names>W.U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>K.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Context attentive document ranking and query suggestion</article-title>
          .
          <source>Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Beltagy</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lo</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Scibert: Pretrained language model for scientific text</article-title>
          .
          <source>In: EMNLP</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Boytsov</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Novak</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malkov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nyberg</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Off the beaten path: Let's replace term-based retrieval with k-nn search</article-title>
          .
          <source>Computing Research Repository (CoRR)</source>
          abs/1610.10001 (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Brokos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liosis</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McDonald</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pappas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Androutsopoulos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>AUEB at BioASQ 6: Document and Snippet Retrieval</article-title>
          .
          <source>In: Proceedings of the 6th BioASQ Workshop</source>
          . pp.
          <fpage>30</fpage>
          -
          <lpage>39</lpage>
          . Brussels, Belgium (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allot</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Keep up with the latest coronavirus research</article-title>
          .
          <source>Nature</source>
          <volume>579</volume>
          (
          <issue>7798</issue>
          ),
          <fpage>193</fpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Dai</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Callan</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          :
          <article-title>Context-aware sentence/passage term importance estimation for first stage retrieval</article-title>
          . ArXiv abs/1910.10687 (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Dai</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Callan</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          :
          <article-title>Context-aware document term weighting for ad-hoc search</article-title>
          .
          <source>Proceedings of The Web Conference 2020</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Danesh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sumner</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          :
          <article-title>SGRank: Combining statistical and graphical methods to improve the state of the art in unsupervised keyphrase extraction</article-title>
          .
          <source>In: Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Farhi</surname>
            ,
            <given-names>S.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boughaci</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Graph based model for information retrieval using a stochastic local search</article-title>
          .
          <source>Pattern Recognition Letters</source>
          <volume>105</volume>
          ,
          <fpage>234</fpage>
          -
          <lpage>239</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Garg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moschitti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>TANDA: Transfer and adapt pre-trained transformer models for answer sentence selection</article-title>
          .
          <source>In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020</source>
          , New York, NY, USA, February 7-12, 2020. pp.
          <fpage>7780</fpage>
          -
          <lpage>7788</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Grover</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leskovec</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>node2vec: Scalable feature learning for networks</article-title>
          .
          <source>In: KDD</source>
          . pp.
          <fpage>855</fpage>
          -
          <lpage>864</lpage>
          . ACM (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Khattab</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaharia</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>ColBERT: Efficient and effective passage search via contextualized late interaction over BERT</article-title>
          . ArXiv abs/2004.12832 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Kieuvongngam</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Niu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Automatic text summarization of COVID-19 medical research articles using BERT and GPT-2</article-title>
          . ArXiv abs/2006.01997 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Kotitsas</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pappas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Androutsopoulos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McDonald</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Apidianaki</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Embedding biomedical ontologies by jointly encoding network structure and textual node descriptors</article-title>
          .
          <source>In: Proceedings of the 18th BioNLP Workshop and Shared Task</source>
          . pp.
          <fpage>298</fpage>
          -
          <lpage>308</lpage>
          . Association for Computational Linguistics, Florence, Italy (Aug
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Kwiatkowski</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palomaki</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Redfield</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Collins</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parikh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alberti</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Epstein</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polosukhin</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelcey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dai</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uszkoreit</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petrov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Natural questions: a benchmark for question answering research</article-title>
          .
          <source>Transactions of the Association for Computational Linguistics</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jeong</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sung</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yoon</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sung</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Choi</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ko</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>S.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Answering domain-specific questions in real-time for COVID-19 research</article-title>
          . arXiv (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>A deep architecture for matching short texts</article-title>
          .
          <source>In: Advances in Neural Information Processing Systems</source>
          <volume>26</volume>
          , pp.
          <fpage>1367</fpage>
          -
          <lpage>1375</lpage>
          . Curran Associates, Inc. (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>MacAvaney</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goharian</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>SLEDGE: A simple yet effective baseline for coronavirus scientific knowledge search</article-title>
          . ArXiv abs/2005.02365 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>MacAvaney</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardini</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perego</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tonellotto</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goharian</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frieder</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Efficient document re-ranking for transformers by precomputing term representations</article-title>
          .
          <source>ArXiv</source>
          abs/2004.14255 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Malkov</surname>
            ,
            <given-names>Y.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yashunin</surname>
            ,
            <given-names>D.A.</given-names>
          </string-name>
          :
          <article-title>Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs</article-title>
          .
          <source>Computing Research Repository (CoRR)</source>
          abs/1603.09320 (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Malliaros</surname>
            ,
            <given-names>F.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vazirgiannis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Graph-based text representations: Boosting text mining, NLP and information retrieval with graphs</article-title>
          .
          <source>In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts</source>
          . Association for Computational Linguistics, Copenhagen, Denmark (Sep
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Mass</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carmeli</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roitman</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konopnicki</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Unsupervised FAQ retrieval with question generation and BERT</article-title>
          .
          <source>In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          . pp.
          <fpage>807</fpage>
          -
          <lpage>812</lpage>
          . Association for Computational Linguistics, Online (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>McDonald</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brokos</surname>
            ,
            <given-names>G.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Androutsopoulos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Deep Relevance Ranking Using Enhanced Document-Query Interactions</article-title>
          .
          <source>In: Proc. of the Conf. on Empirical Methods in Natural Language Processing</source>
          . Brussels, Belgium (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Mohan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fiorini</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>A fast deep learning model for textual relevance in biomedical information retrieval</article-title>
          .
          <source>Computing Research Repository (CoRR)</source>
          abs/1802.10078 (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Nogueira</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Passage re-ranking with BERT</article-title>
          .
          <source>Computing Research Repository (CoRR)</source>
          abs/1901.04085 (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Ozyurt</surname>
            ,
            <given-names>I.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bandrowski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grethe</surname>
            ,
            <given-names>J.S.</given-names>
          </string-name>
          :
          <article-title>Bio-AnswerFinder: a system to find answers to questions from biomedical texts</article-title>
          .
          <source>Database: the journal of biological databases and curation</source>
          2020 (January
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Padigela</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zamani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Croft</surname>
            ,
            <given-names>W.B.</given-names>
          </string-name>
          :
          <article-title>Investigating the successes and failures of BERT for passage re-ranking</article-title>
          .
          <source>Computing Research Repository (CoRR)</source>
          abs/1905.01758 (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Pappas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brokos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McDonald</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Androutsopoulos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>AUEB at BioASQ 7: Document and Snippet Retrieval</article-title>
          . In: BioASQ (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Pappas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stavropoulos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Androutsopoulos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McDonald</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>BioMRC: A dataset for biomedical machine reading comprehension</article-title>
          .
          <source>In: Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alam</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bedrick</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lo</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soboroff</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hersh</surname>
            ,
            <given-names>W.R.</given-names>
          </string-name>
          :
          <article-title>TREC-COVID: Rationale and structure of an information retrieval shared task for COVID-19</article-title>
          .
          <source>Journal of the American Medical Informatics Association: JAMIA</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hersh</surname>
            ,
            <given-names>W.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bedrick</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lazar</surname>
            ,
            <given-names>A.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pant</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meric-Bernstam</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Overview of the TREC 2017 Precision Medicine Track</article-title>
          .
          <source>In: TREC</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Rospocher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corcoglioniti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dragoni</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Boosting document retrieval with knowledge extraction and linked data</article-title>
          .
          <source>Semantic Web</source>
          <volume>10</volume>
          ,
          <fpage>753</fpage>
          -
          <lpage>778</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Siddique</surname>
            ,
            <given-names>F.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barezi</surname>
            ,
            <given-names>E.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fung</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>CAiRE-COVID: A question answering and multi-document summarization system for COVID-19 research</article-title>
          . ArXiv abs/2005.03975 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Tsatsaronis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balikas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malakasiotis</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Partalas</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zschunke</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alvers</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weissenborn</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krithara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petridis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polychronopoulos</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Almirantis</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pavlopoulos</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baskiotis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gallinari</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Artieres</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ngonga</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heino</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaussier</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barrio-Alvers</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schroeder</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Androutsopoulos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paliouras</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>An overview of the BioASQ Large-Scale Biomedical Semantic Indexing and Question Answering Competition</article-title>
          .
          <source>BMC Bioinformatics</source>
          <volume>16</volume>
          (
          <issue>138</issue>
          ) (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Vaswani</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shazeer</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parmar</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uszkoreit</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gomez</surname>
            ,
            <given-names>A.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaiser</surname>
            ,
            <given-names>Ł.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polosukhin</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Attention is all you need</article-title>
          .
          <source>In: Advances in Neural Information Processing Systems</source>
          <volume>30</volume>
          , pp.
          <fpage>5998</fpage>
          -
          <lpage>6008</lpage>
          . Curran Associates, Inc. (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>An end-to-end pseudo relevance feedback framework for neural document retrieval</article-title>
          .
          <source>Information Processing and Management</source>
          <volume>57</volume>
          (
          <issue>2</issue>
          ) (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lo</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chandrasekhar</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reas</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eide</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Funk</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kinney</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Merrill</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mooney</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murdick</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rishi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheehan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shen</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stilson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wade</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wilhelm</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xie</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raymond</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weld</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etzioni</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kohlmeier</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>CORD-19: The COVID-19 open research dataset</article-title>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          38.
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allot</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leaman</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>PubTator central: automated concept annotation for biomedical full text articles</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>47</volume>
          (
          <issue>W1</issue>
          ),
          <fpage>W587</fpage>
          -
          <lpage>W593</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          39.
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nallapati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Passage ranking with weak supervision</article-title>
          . ArXiv abs/1905.05910 (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          40.
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yin</surname>
            ,
            <given-names>X.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>B.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hao</surname>
            ,
            <given-names>H.W.</given-names>
          </string-name>
          :
          <article-title>Semantic indexing with deep learning: a case study</article-title>
          .
          <source>Big Data Analytics</source>
          <volume>1</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          41.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Simple applications of BERT for ad hoc document retrieval</article-title>
          .
          <source>Computing Research Repository (CoRR)</source>
          abs/1903.10972 (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          42.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yih</surname>
            ,
            <given-names>W.t.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meek</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>WikiQA: A challenge dataset for open-domain question answering</article-title>
          .
          <source>In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <fpage>2013</fpage>
          -
          <lpage>2018</lpage>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , Lisbon, Portugal (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          43.
          <string-name>
            <surname>Yoon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dernoncourt</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bui</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jung</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>A compare-aggregate model with latent clustering for answer selection</article-title>
          .
          <source>In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management</source>
          . pp.
          <fpage>2093</fpage>
          -
          <lpage>2096</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          44.
          <string-name>
            <surname>Zamani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dehghani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Croft</surname>
            ,
            <given-names>W.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Learned-Miller</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kamps</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>From neural re-ranking to neural ranking: Learning a sparse representation for inverted indexing</article-title>
          .
          <source>In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          45.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xie</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>A graph based document retrieval method</article-title>
          .
          <source>In: 2018 IEEE 22nd International Conference on Computer Supported Cooperative Work in Design (CSCWD)</source>
          . pp.
          <fpage>426</fpage>
          -
          <lpage>432</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          46.
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sboner</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Graphene: A precise biomedical literature retrieval engine with graph augmented deep learning and external knowledge empowerment</article-title>
          .
          <source>In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>