=Paper=
{{Paper
|id=Vol-2167/short12
|storemode=property
|title=Implicit-Explicit Representations for Case-Based Retrieval
|pdfUrl=https://ceur-ws.org/Vol-2167/short12.pdf
|volume=Vol-2167
|authors=Stefano Marchesin
|dblpUrl=https://dblp.org/rec/conf/desires/Marchesin18
}}
==Implicit-Explicit Representations for Case-Based Retrieval==
Stefano Marchesin
Department of Information Engineering
University of Padua, Italy
stefano.marchesin@unipd.it
ABSTRACT
We propose an IR framework to combine the implicit representations of documents, identified using distributional representation techniques, with their explicit representations, derived from external knowledge sources, to improve medical case-based retrieval. Combining implicit-explicit representations of documents aims at enriching the semantic understanding of documents and reducing the semantic gap between documents and queries.

CCS CONCEPTS
• Information systems → Document representation; Query representation; Information extraction;

KEYWORDS
Relation-based information retrieval; Semantic gap

1 MOTIVATIONS AND METHODOLOGY
The volume of medical literature published every year keeps growing drastically, and clinicians have limited time to retrieve relevant information from it. Standard Information Retrieval (IR) systems cannot cope with the amount of literature and the limited time available to clinicians. Therefore, there has been strong interest in Clinical Decision Support (CDS) systems designed to produce effective and timely knowledge that can help clinicians in the decision-making process. Such systems are known as case-based retrieval systems: given a medical case of interest, a case-based retrieval system should retrieve highly related publications from a large collection of medical literature.

A key characteristic of the medical literature is the extensive use of synonyms and context-specific expressions. Such characteristics increase the semantic gap between documents and queries.

To tackle these problems, both deep representation learning methods and external knowledge sources have been used. Deep representation learning effectively discovers hidden structures that relate, through latent semantic features, the different textual components, be they words, sentences, or documents [3]. External knowledge resources, such as ontologies and knowledge bases, provide factual knowledge about the meaning of words and their semantic relationships.

However, by formalizing the semantic relationships between different concepts, knowledge sources provide a partial (and specific) representation of the world. Therefore, knowledge sources do not necessarily capture implicit relations that appear in documents. By learning distributional representations of words and phrases, implicit relations within documents can be considered too. Thus, distributional and knowledge-based representations of complex semantics (i.e. words, sentences, and documents) identify complementary semantic aspects of the underlying documents.

We propose to integrate documents' knowledge-based representations into case-based retrieval [2], as a form of complementary refinement for distributional representations. Recent approaches exploit semantic relations to enhance the quality of learned word or concept representations [1, 4]; we propose instead to explicitly leverage semantic relations to model document representations. Therefore, our approach extracts concepts from documents (and queries) and connects them using the semantic relations contained within a reference knowledge source, creating a knowledge graph representation for the document (or query). The intuition is that semantic relations carry high informative power that can boost precision.

The knowledge graph representation can reduce the contextual dependency of distributional representations and help discriminate semantically similar texts from non-similar ones more effectively. Besides, since the only concepts considered are those extracted from a document or a query, the problem of topic drift, which occurs when the query is expanded with concepts that are not pertinent to the information need, is reduced.

We propose to combine implicit and explicit representations for case-based retrieval in two different ways: (i) considering document-level knowledge graphs as additional inputs for end-to-end neural scoring models that learn the relevance of document-query pairs via semantic features; (ii) considering document-level knowledge graphs with pseudo relevance feedback to boost top-ranked documents whose graphs are more similar to the query graph. Both approaches aim at reducing the semantic gap between queries and documents.

ACKNOWLEDGMENTS
Supported by the CDC-STARS project of the University of Padua.

REFERENCES
[1] Manaal Faruqui, Jesse Dodge, Sujay K. Jauhar, Chris Dyer, Eduard Hovy, and Noah A. Smith. 2014. Retrofitting Word Vectors to Semantic Lexicons. arXiv preprint arXiv:1411.4166 (2014).
[2] Stefano Marchesin. 2018. Case-Based Retrieval Using Document-Level Semantic Networks. In 41st ACM SIGIR (SIGIR '18). ACM, New York, NY, USA, 1451.
[3] Bhaskar Mitra and Nick Craswell. 2017. Neural Models for Information Retrieval. arXiv preprint arXiv:1705.01509 (2017).
[4] Gia-Hung Nguyen, Lynda Tamine, Laure Soulier, and Nathalie Souf. 2018. A Tri-Partite Neural Document Language Model for Semantic Information Retrieval. In European Semantic Web Conference. Springer, 445–461.
DESIRES 2018, August 2018, Bertinoro, Italy
© 2018 Copyright held by the author(s).
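As an illustration of the second strategy (pseudo relevance feedback over document-level knowledge graphs), the following Python sketch shows how top-ranked documents might be boosted by the similarity of their knowledge graph to the query graph. The concept extractor, the relation table, the interpolation weight `alpha`, and the choice of edge-set Jaccard similarity are all illustrative assumptions standing in for a real medical knowledge source (e.g. UMLS) and for whatever similarity the paper's actual system would use.

```python
# Sketch of strategy (ii): PRF re-ranking with document-level knowledge graphs.
# The extractor, relation table, and Jaccard similarity are toy assumptions,
# not the paper's actual method.

def extract_concepts(text, vocabulary):
    """Toy concept extraction: keep vocabulary terms occurring in the text."""
    return set(text.lower().split()) & vocabulary

def build_graph(concepts, relations):
    """Connect extracted concepts via the knowledge source's relations."""
    return {(a, r, b) for (a, r, b) in relations if a in concepts and b in concepts}

def graph_similarity(g1, g2):
    """Jaccard overlap between the two edge sets (0 if both are empty)."""
    if not g1 and not g2:
        return 0.0
    return len(g1 & g2) / len(g1 | g2)

def rerank(query_graph, ranking, doc_graphs, alpha=0.5, k=10):
    """Boost the top-k documents whose graphs resemble the query graph."""
    reranked = []
    for rank, (doc_id, score) in enumerate(ranking):
        if rank < k:
            score = (1 - alpha) * score + alpha * graph_similarity(
                query_graph, doc_graphs[doc_id])
        reranked.append((doc_id, score))
    return sorted(reranked, key=lambda pair: pair[1], reverse=True)

# Toy knowledge source: a concept vocabulary and semantic relations.
vocab = {"aspirin", "ibuprofen", "fever", "inflammation"}
relations = {
    ("aspirin", "treats", "fever"),
    ("aspirin", "treats", "inflammation"),
    ("ibuprofen", "treats", "fever"),
}

query_graph = build_graph(extract_concepts("aspirin for fever", vocab), relations)
doc_graphs = {
    "d1": build_graph(extract_concepts("ibuprofen reduces fever", vocab), relations),
    "d2": build_graph(extract_concepts("aspirin lowers fever fast", vocab), relations),
}
ranking = [("d1", 0.9), ("d2", 0.8)]  # initial retrieval scores
print(rerank(query_graph, ranking, doc_graphs))  # d2 is boosted above d1
```

Here the shared edge ("aspirin", "treats", "fever") makes d2's graph identical to the query graph, so d2 overtakes d1 despite its lower initial retrieval score, which is exactly the precision-oriented effect the semantic relations are meant to provide.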