<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Neural IR for Domain-Specific Tasks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Discussion Paper</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Óscar Espitia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gabriella Pasi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Milano-Bicocca</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <institution>Italy Department of Informatics, Systems, and Communication (DISCo) Information and Knowledge Representation, Retrieval, and Reasoning (IKR3) Lab https:// ikr3.disco.unimib.it</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <abstract>
        <p>In domain-specific retrieval tasks (e.g., in law and healthcare), several specific features, such as the volume of data, document size, document structure, jargon, and the way information needs are defined, justify retrieval models that do not handle all information equally. Neural IR models can deal with such features, especially those related to contextual elements (e.g., expert knowledge). The ongoing project considers domain-specific embeddings as a means to contextualize document retrieval within a neural ranking setup for domain-specific tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Domain-specific search</kwd>
        <kwd>Neural IR</kwd>
        <kwd>Contextual IR</kwd>
        <kwd>Embeddings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Traditional information retrieval (IR) models are designed as generic tools applicable to different
retrieval tasks. However, several specific features suggest that retrieval models should not handle
all information equally when it comes to domain-specific (DS) retrieval tasks (e.g., in law and
healthcare) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The volume of data, document size, document structure, jargon, and the way
information needs are defined, among others, are features that define gaps between generic
search tools and DS search tasks.
      </p>
      <p>Neural IR is a growing field; neural networks (NN) represent a tool for learning text
representations, defining ranking models capable of handling different kinds of documents, and
even introducing additional elements, such as DS features, into the retrieval process.</p>
      <p>
        Some of those features can be considered as contextual elements, i.e., as factors influencing
how an IR system is used and how its performance should be accordingly evaluated. The concept
of context in IR has been extensively studied, giving rise to the so-called contextual IR, which
aims at optimizing the retrieval performance by defining the search context and by taking it
into account in the information selection process and in assessing the search outcome [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        In recent years, there have been developments in representing DS corpora, e.g., LEGAL-BERT
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and BioBERT [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which perform well in representing texts that come from the legal and
health domains, respectively. Even though these models are not strictly developed within
contextual IR, using such representation models in retrieval frameworks fits the idea of
tailoring the retrieval process with contextual factors; in this case the context is represented by
specialized corpora, which help to disambiguate terms and capture DS jargon.
      </p>
      <p>The main aim of the ongoing project is to define DS embeddings that can serve the purpose
of contextualizing search within a neural ranking setup in DS tasks. More specifically, our aim
is to evaluate the effectiveness of pre-trained embeddings with respect to traditional lexical
matching-based models, and to boost the performance of neural models by fine-tuning the DS
embeddings in a semantic-based retrieval scenario.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Neural IR</title>
      <p>
        A generalized learning-to-rank problem focuses on finding an optimal ranking function;
according to existing neural IR models [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], such a function can be abstracted by the formulation
f(q, d) = g(Φ(q), Φ(d)), where q is a query and d is a document from a collection. Depending
on whether a model focuses on defining g, Φ, or both, one obtains different model anatomies.
A first kind of model focuses on learning representations of the input texts: it takes as input
some basic representation, such as a one-hot encoding at the character, term, or n-graph level,
and learns more complex and dense text representations, or embeddings, by using NN; in this
case g is usually a similarity measure. In contrast, another kind of model assumes the
representation function Φ as the input layer, which can be either a simple or a complex
representation, finds interactions between the inputs, and from those interactions learns
relevance patterns for ranking; thus g = r ∘ η is composed of an interaction function η(·)
and deep models for ranking r(·).
      </p>
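      <p>To make the two anatomies concrete, the following minimal sketch (ours, in PyTorch; all
names are illustrative, not taken from any of the cited models) contrasts a representation-focused
scorer, where g is a fixed cosine similarity over learned embeddings, with an interaction-focused
scorer, where the interaction matrix η(q, d) is computed first and a small network r learns
relevance patterns from it.</p>
      <preformat>
# Minimal sketch (illustrative, not a published model) of the two anatomies.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepresentationScorer(nn.Module):
    """Learns Phi end to end; g is a fixed similarity (cosine)."""
    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)

    def forward(self, q_ids, d_ids):
        q = self.emb(q_ids).mean(dim=0)          # Phi(q): mean term embedding
        d = self.emb(d_ids).mean(dim=0)          # Phi(d)
        return F.cosine_similarity(q, d, dim=0)  # g = similarity measure

class InteractionScorer(nn.Module):
    """Phi is the input layer (frozen embeddings); g = r ∘ η is learned."""
    def __init__(self, pretrained, hidden=32):
        super().__init__()
        self.emb = nn.Embedding.from_pretrained(pretrained, freeze=True)
        self.r = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(),
                               nn.Linear(hidden, 1))

    def forward(self, q_ids, d_ids):
        eta = self.emb(q_ids) @ self.emb(d_ids).T  # interaction matrix η(q, d)
        strongest = eta.max(dim=1).values          # best match per query term
        return self.r(strongest.unsqueeze(-1)).sum()  # aggregate into a score
      </preformat>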
    </sec>
    <sec id="sec-3">
      <title>3. DS tasks examples</title>
      <p>This section briefly introduces two examples of DS search tasks on which we are working.</p>
      <p>
        Patient-clinical trials matching: clinical trials are experiments conducted in the development
of new medical treatments, drugs, or devices. Recruiting candidates for a trial motivates the
task of matching eligible patients to clinical trials [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        Legal case retrieval: this task involves finding precedent cases that are relevant to a given
case, i.e., that could support the decision concerning it, within a set of candidate cases [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>Compared to traditional ad-hoc text retrieval, the above tasks are more challenging,
since the query is much longer and more complex than common keyword-based queries. Besides
that, the relevance of a document to a query goes beyond general topical relevance,
and as such its assessment requires expert knowledge.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Building blocks for designing DS neural search models</title>
      <p>This section presents some strategies that have the potential to address DS search tasks and the
building blocks for model design.</p>
      <p>
        Several works have tried to learn text representations (Φ) that incorporate contextual elements
into the model by learning embeddings from text corpora that gather information potentially
useful to better represent the information need in a given context [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]. On the other hand, several DS embeddings, such as LEGAL-BERT and BioBERT, were
created with the motivation of having good representations of DS texts.
      </p>
      <p>Given the representations of both a query and a document, one can match different parts of
the query with different parts of the document and then aggregate the values as partial evidence
of relevance. Interaction-based approaches model this matching using an interaction matrix
formed by sweeping both the query and the document representations with a sliding window
that instantiates an aggregation function. Each instance of the window over the query
interacts with each instance of the window over the document. This process can capture local
interactions at different matching levels (e.g., word level and passage level). Defining
the input representations, the aggregation function, and the matching level affects the model’s
performance; hence, all these parameters must be defined according to the specific task and its
features.</p>
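      <p>As a minimal sketch (ours; the sizes and the mean aggregation are illustrative assumptions),
the following NumPy code builds such a windowed interaction matrix, where the window size sets
the matching level (1 for word level, larger values for passage level):</p>
      <preformat>
# Illustrative windowed interaction matrix; rows of q_emb / d_emb are
# term embeddings, and the window size sets the matching level.
import numpy as np

def windows(emb, size, stride):
    """Slide a window over term embeddings and aggregate it (here: mean)."""
    return np.stack([emb[i:i + size].mean(axis=0)
                     for i in range(0, len(emb) - size + 1, stride)])

def interaction_matrix(q_emb, d_emb, size=1, stride=1):
    """Each query window interacts (cosine) with each document window."""
    Q, D = windows(q_emb, size, stride), windows(d_emb, size, stride)
    Q /= np.linalg.norm(Q, axis=1, keepdims=True)
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    return Q @ D.T  # entry (i, j): match of query window i vs. doc window j

rng = np.random.default_rng(0)
M = interaction_matrix(rng.normal(size=(8, 64)),    # toy query: 8 terms
                       rng.normal(size=(100, 64)),  # toy document: 100 terms
                       size=3)                      # passage-level windows
print(M.shape)  # (6, 98)
      </preformat>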
      <p>Finally, there are different kinds of neural networks that are especially designed for
purposes matching some of the issues described above and that can be used to build a
ranking model r(·):</p>
      <p>Pooling layers sweep a kernel (with no weights) across the entire input, similarly to a
convolutional layer. The kernel applies an aggregation function, which either selects the
element with the maximum value to send to the output array (max-pooling) or computes the
average or the sum (average-pooling and sum-pooling, respectively) within the area covered by
the kernel as it moves across the input, leading to different architectures. Intuitively, pooling
layers perform dimensionality reduction, decreasing the number of parameters in the input. This
operation is useful to reduce complexity, improve efficiency, and limit the risk of overfitting.</p>
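      <p>A one-dimensional sketch of these variants (ours, in NumPy) shows how the same weightless
kernel yields different architectures depending on the aggregation function, while reducing the
input from six elements to three:</p>
      <preformat>
# Pooling: sweep a weightless kernel over the input and aggregate each
# covered area; the aggregation function picks the architecture.
import numpy as np

def pool(x, kernel, stride, agg):
    return np.array([agg(x[i:i + kernel])
                     for i in range(0, len(x) - kernel + 1, stride)])

x = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 0.0])
print(pool(x, 2, 2, np.max))   # max-pooling     -> [3.  5.  4. ]
print(pool(x, 2, 2, np.mean))  # average-pooling -> [2.  3.5 2. ]
print(pool(x, 2, 2, np.sum))   # sum-pooling     -> [4.  7.  4. ]
      </preformat>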
      <p>Encoders process the input and compress the information into a fixed-length context vector
(also known as a sentence embedding). This representation is expected to be a
good summary of the meaning of the whole input. Encoders are typically recurrent neural networks,
e.g., LSTM or GRU units, which can model sequential data.</p>
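      <p>For instance, in the following sketch (ours; all sizes are arbitrary) a GRU encoder
compresses inputs of any length into a context vector of the same fixed dimension, namely its
last hidden state:</p>
      <preformat>
# An encoder: a GRU reads the input sequence and its last hidden state
# serves as the fixed-length context vector (sentence embedding).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True)

    def forward(self, token_ids):               # (batch, seq_len)
        _, h_n = self.gru(self.emb(token_ids))  # h_n: (1, batch, hidden)
        return h_n.squeeze(0)                   # fixed-length context vector

enc = Encoder(vocab_size=10_000)
short = enc(torch.randint(0, 10_000, (1, 12)))   # 12-token input
long_ = enc(torch.randint(0, 10_000, (1, 300)))  # 300-token input
print(short.shape, long_.shape)  # both torch.Size([1, 128])
      </preformat>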
      <p>
        Attention mechanisms have become an integral part of sequence models in several tasks,
allowing the model to learn dependencies without regard to their distance in the input or output
sequences. Attention mechanisms are usually used in conjunction with recurrent networks, but
also as the core of the transformer architecture [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. An attention function outputs a weighted
sum of the values, where the weight assigned to each value is computed by a compatibility
function of a given context vector with the values in the input. In this way, the attention
mechanism is used to infer the importance of each position of the input.
      </p>
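      <p>
        The following sketch (ours, in NumPy) implements one common compatibility function, the
scaled dot product of [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], for a single context vector attending over the input positions:
      </p>
      <preformat>
# Scaled dot-product attention for one query (context) vector: the
# compatibility scores are softmax-normalized into weights, and the
# output is the weighted sum of the values.
import numpy as np

def attention(query, keys, values):
    scores = keys @ query / np.sqrt(query.shape[-1])  # compatibility
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # softmax over positions
    return weights @ values                           # weighted sum of values

rng = np.random.default_rng(1)
keys = values = rng.normal(size=(10, 64))  # 10 input positions
context = attention(rng.normal(size=64), keys, values)
print(context.shape)  # (64,)
      </preformat>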
    </sec>
    <sec id="sec-5">
      <title>5. Aim of the project</title>
      <p>The ongoing project hypothesizes that the building blocks described above can be exploited to
deal with the challenges of DS search. On this basis, we can design a neural ranker for DS tasks that
leverages contextual information. The contextual information can be included by analysing DS
features in search tasks and text representations that pay attention to the expert knowledge
contained within DS corpora. Two retrieval scenarios are considered, both following a re-ranking
technique: in the first, a lexical-based retrieval phase is followed by a re-ranking phase based on
semantic similarity; in the second, both the first phase and the re-ranking phase are based on
semantic search. These scenarios will allow us to explore different features of the models,
for example, the value of exact matching and the effect of the vocabulary mismatch problem.
In addition, different training strategies will be considered for text representation and
text classification to build a more suitable semantic space for the specific task and to learn
interactions between the inputs for re-ranking. This will allow us to evaluate whether the tasks can
benefit from fine-tuning text representations and whether the learned representations are suitable for
first-stage retrieval and re-ranking.</p>
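      <p>As a minimal sketch of the first scenario (ours; the rank_bm25 and sentence-transformers
packages, the model name, and the candidate depth are illustrative assumptions, not the project’s
final choices), a lexical first stage feeds a semantic re-ranking phase:</p>
      <preformat>
# Two-stage retrieval sketch: BM25 over the whole corpus, then semantic
# re-ranking of the top candidates by embedding cosine similarity.
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

docs = ["patient with type 2 diabetes ...", "trial of insulin therapy ..."]
query = "recruit diabetic patients for an insulin trial"

# Stage 1: lexical retrieval (cheap exact matching over the collection).
bm25 = BM25Okapi([d.split() for d in docs])
candidates = np.argsort(bm25.get_scores(query.split()))[::-1][:100]

# Stage 2: semantic re-ranking (mitigates the vocabulary mismatch problem).
model = SentenceTransformer("all-MiniLM-L6-v2")
q_vec = model.encode(query, normalize_embeddings=True)
d_vecs = model.encode([docs[i] for i in candidates],
                      normalize_embeddings=True)
reranked = candidates[np.argsort(d_vecs @ q_vec)[::-1]]
print(reranked[:10])  # final ranking
      </preformat>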
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work is supported by the EU Horizon 2020 ITN/ETN on Domain Specific Systems for
Information Extraction and Retrieval (H2020-EU.1.3.1., ID: 860721).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hanbury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lupu</surname>
          </string-name>
          ,
          <article-title>Toward a Model of Domain-Specific Search</article-title>
          ,
          <source>OAIR '13: Open Research Areas in Information Retrieval</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z. A.</given-names>
            <surname>Merrouni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Frikh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ouhbi</surname>
          </string-name>
          ,
          <article-title>Toward Contextual Information Retrieval: A Review and Trends</article-title>
          ,
          <source>Procedia Computer Science</source>
          <volume>148</volume>
          (
          <year>2019</year>
          )
          <fpage>191</fpage>
          -
          <lpage>200</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>I.</given-names>
            <surname>Chalkidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fergadiotis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Malakasiotis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Aletras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Androutsopoulos</surname>
          </string-name>
          ,
          <article-title>LEGAL-BERT: The Muppets straight out of Law School</article-title>
          ,
          <source>Findings of the Association for Computational Linguistics: EMNLP 2020</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yoon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>So</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <article-title>BioBERT: a pre-trained biomedical language representation model for biomedical text mining</article-title>
          ,
          <source>Bioinformatics</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zamani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Croft</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <article-title>A Deep Look into neural ranking models for information retrieval</article-title>
          ,
          <source>Information Processing and Management</source>
          <volume>57</volume>
          (
          <year>2020</year>
          )
          <fpage>102067</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Koopman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zuccon</surname>
          </string-name>
          ,
          <article-title>A test collection for matching patients to clinical trials</article-title>
          ,
          <source>SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Rabelo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Goebel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yoshioka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Satoh</surname>
          </string-name>
          ,
          <article-title>A Summary of the COLIEE 2019 Competition</article-title>
          . In:
          <string-name>
            <surname>Sakamoto</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Okazaki</surname>
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mineshima</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Satoh</surname>
            <given-names>K.</given-names>
          </string-name>
          (eds.), New Frontiers in Artificial Intelligence,
          <source>JSAI-isAI 2019, Lecture Notes in Computer Science 12331 LNAI</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <article-title>Employing Personal Word Embeddings for Personalized Search</article-title>
          ,
          <source>SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <article-title>Encoding History with Context-aware Representation Learning for Personalized Search</article-title>
          ,
          <source>SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention Is All You Need</article-title>
          ,
          <source>Advances in Neural Information Processing Systems 30</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>