=Paper= {{Paper |id=Vol-1276/MedIR-SIGIR2014-overview |storemode=property |title=Report on the SIGIR 2014 Workshop on Medical Information Retrieval (MedIR) |pdfUrl=https://ceur-ws.org/Vol-1276/MedIR-SIGIR2014-overview.pdf |volume=Vol-1276 |dblpUrl=https://dblp.org/rec/conf/sigir/GoeuriotKJMZ14 }} ==Report on the SIGIR 2014 Workshop on Medical Information Retrieval (MedIR)== https://ceur-ws.org/Vol-1276/MedIR-SIGIR2014-overview.pdf
Report on the SIGIR 2014 Workshop on Medical Information Retrieval (MedIR)

Lorraine Goeuriot, Dublin City University, Ireland, lorraine.goeuriot@imag.fr
Liadh Kelly, Dublin City University, Ireland, liadh.kelly@scss.tcd.ie
Gareth J.F. Jones, Dublin City University, Ireland, gjones@computing.dcu.ie
Henning Müller, HES-SO Valais, Switzerland, henning.mueller@hevs.ch
Justin Zobel, University of Melbourne, Australia, jzobel@unimelb.edu.au

Copyright is held by the author/owner(s).
MedIR 2014, July 11, 2014, Gold Coast, Australia.

ABSTRACT
The workshop on Medical Information Retrieval took place at SIGIR 2014 in Gold Coast, Australia on July 11, 2014. The workshop included eight oral presentations of refereed papers and an invited keynote presentation, which left time for lively discussions among the participants. These discussions showed the significant interest in the medical information retrieval domain and the many research challenges arising in this space. These challenges need to be addressed to give added value to the wide variety of users who can profit from medical information search, such as patients, general health professionals, and specialist groups such as radiologists, who mainly search for images and image-related information.

1. INTRODUCTION
Medical information retrieval refers to methodologies and technologies that seek to improve access to medical information archives via a process of information retrieval (IR). Such information is now potentially accessible from many sources including the general web, social media, journal articles, and hospital records. Health-related content is one of the most searched-for topics on the Internet, and as such this is an important domain for research in information retrieval.

Medical information is of interest to a wide variety of users, including patients and their families, researchers, general practitioners and clinicians, and practitioners with specific expertise such as radiologists. There are several dedicated services that seek to make this information more easily accessible, such as Health on the Net's medical search systems for the general public and medical practitioners: http://www.hon.ch/. Despite the popularity of the medical domain for users of search engines, and current interest in this topic within the IR research community, development of search and access technologies remains particularly challenging and under-explored.

One of the central issues in medical information search is the diversity of the users of these services, with corresponding differences in the types and scopes of their individual needs. Their information needs will be associated with varied categories and purposes, they will typically have widely varying levels of medical knowledge, and, important in some settings, they will have differing language skills.

These challenges can be summarized as follows:

   1. Varying information needs: While a patient with a recently diagnosed condition will generally benefit most from simple or introductory information on the disease and its treatment, a patient living with or managing a condition over a longer term will generally be looking for more advanced information, or perhaps support groups and forums. In a similar way, a general practitioner might require basic information quickly while advising a patient, but more detailed information when deciding on a course of treatment, while a specialist clinician might look for an exhaustive list of similar cases or research papers relating to the condition of a patient that they are currently seeking to advise. Understanding the various types of users and their information needs is one of the cornerstones of medical information search, while adapting IR to best address these needs and develop effective, potentially personalized systems is one of its greatest challenges.

   2. Varying medical knowledge: The different categories of users of medical information search systems will have widely varying levels of medical knowledge, and indeed the medical knowledge of different individuals within a user category can also vary greatly. This affects the way in which individuals pose search queries to systems, the level of complexity of the information which should be returned to them, and the type of support they will require in understanding and disambiguating returned material.

   3. Varying language skills: Given that much medical content is written only in the English language, research to date in medical information search has predominantly focused on monolingual English retrieval. However, given the large number of non-English speakers on the Internet and the lack of content in their native language, effective support for them to search English language sources is highly desirable. The Internet in particular has affected the patient-physician relationship, and providing relevant, reliable information to patients in their own language is key to alleviating such challenging situations and reducing instances of phenomena such as cyberchondria.

In addition, the format, reliability, and quality of biomedical and medical information vary greatly. A single health record can contain clinical notes, technical pathology data, images, and patient-contributed histories, and may be linked by a physician to research papers. The importance of health and medical topics and their impact on people's everyday lives makes the need for retrieval of accurate and reliable information especially important. Determining the likely reliability of available information is challenging. Finally, as with IR in general, the evaluation of medical search tools is vital and challenging. For example, there are no established or standardized baselines or evaluation metrics, and limited availability of test collections. Further discussion and progress on this topic would be beneficial to the community.

2. THEME AND PURPOSE OF THE WORKSHOP
The objective of the workshop was to provide a forum to enable the progression of research in medical IR seeking to provide enhanced search services for all users with interests in medical information search. The workshop aimed to bring together researchers interested in medical information search with the goal of identifying specific research challenges that need to be addressed to advance the state of the art, and to foster interdisciplinary collaborations towards meeting these challenges. To enable this, we encouraged participation from researchers in all fields related to medical information search, including mainstream IR, but also natural language processing, multilingual text processing, and medical image analysis.

Topics of interest included but were not limited to:

   • Users and information needs
   • Semantics and natural language processing (NLP) for medical IR
   • Reliability and trust in medical IR
   • Personalised search
   • Evaluation of medical IR
   • Multilingual issues in medical IR
   • Multimedia technologies in medical IR
   • The role of social media in medical IR

3. KEYNOTE - DR KARIN VERSPOOR
The keynote talk was given by Dr Karin Verspoor (University of Melbourne, Australia), on "Practice-based Evidence in Medicine: Where Information Retrieval Meets Data Mining" [7]. A new approach in medical practice is emerging thanks to the increasing availability of large-scale clinical data in electronic form. In practice-based evidence, the clinical record is mined to identify patterns of health characteristics, such as diseases that co-occur, side-effects of treatments, or more subtle combinations of patient attributes that might explain a particular health outcome. This approach contrasts with what has been the standard of care in medicine, evidence-based practice, in which treatment decisions are based on (quantitative) evidence derived from targeted research studies, specifically randomised controlled trials. Advantages of consulting the clinical record for evidence rather than relying solely on structured research include avoiding the selection bias of the inclusion criteria for a clinical trial, and monitoring of longer-term outcomes and effects. The two approaches are, of course, complementary: a hypothesis derived from large-scale data mining could in turn form the starting point for the design of a clinical trial to rigorously investigate that hypothesis.

Information retrieval can play an important role in both approaches to collecting medical evidence. However, the use of information retrieval methods in collecting practice-based evidence requires moving away from traditional document-oriented retrieval as the end goal in itself, to viewing that retrieval as an intermediate step towards knowledge discovery and population-scale data mining. Furthermore, it may require the development of more context-specific retrieval strategies, designed to identify specific characteristics of interest and support particular tasks in the medical context.

4. PRESENTED PAPERS
Of the twenty papers submitted to the workshop, eight were selected for inclusion in the workshop proceedings and for presentation at the workshop:

   • Patrick Cheong-Iao Pang, Karin Verspoor, Shanton Chang and Jon Pearce. Designing for Health Exploratory Seeking Behaviour [5]
   • Miji Choi, Karin Verspoor and Justin Zobel. Evaluation of Coreference Resolution for Biomedical Text [1]
   • Yihan Deng, Matthaeus Stoehr and Kerstin Denecke. Retrieving Attitudes: Sentiment Analysis from Clinical Narratives [2]
   • Bevan Koopman and Guido Zuccon. Why Assessing Relevance in Medical IR is Demanding [3]
   • Dimitrios Markonis, Roger Schaer and Henning Müller. Multi-modal relevance feedback for medical image retrieval
   • Liqiang Nie, Mohammad Akbari, Tao Li and Tat-Seng Chua. A Joint Local-Global Approach for Medical Terminology Assignment [4]
   • Rajendra Prasath and Philip O'Reilly. Exploring Clustering Based Knowledge Discovery towards Improved Medical Diagnosis [6]
   • Guido Zuccon and Bevan Koopman. Integrating Understandability in the Evaluation of Consumer Health Search Engines [9]

5. DISCUSSION SESSION
The discussion sessions started with a brainstorming activity to identify the key challenges in medical IR. The two main areas identified were the lack of available data sets and the need for better evaluation. Two groups were formed to discuss these two topics.

The first group discussed the lack of data sets. One of the reasons for this is the limited amount of publicly available data (e.g. clinical data and query logs). Aside from the patient-related issues of confidentiality and privacy, medical data are very varied and constantly changing, which makes obtaining representative and up-to-date data sets very challenging. These variations can be found at different levels. The first is the level of specialization and the targeted readers: consumer information differs greatly from clinical practice information. The second is linguistic variation, such as shifting vocabulary, which affects information extraction (IE) and IR results. In order to deal with these changing characteristics, what could the value of abstraction into controlled vocabularies be? A controlled vocabulary would also help in alleviating ambiguity, but how can it be efficiently incorporated into a retrieval approach? Concept-based representation of data and indexing are being investigated, but their efficiency in IR is still to be proven. Finally, some modalities are very specific to the medical domain, such as temporality, negation, and patients' characteristics in clinical data such as age, gender, and co-morbidities. To understand and automatically process these, training data is necessary (raw data and gold standard annotations), but is difficult and expensive to obtain.

The second group focused on the evaluation of medical information retrieval. They identified as the main issues the lack of evaluation campaigns and benchmarks for medical IR, and the lack of information on the few existing campaigns. Based on the experience of the group members, the discussion focused on a few key challenges in obtaining more benchmarks and improving their quality. Firstly, it is crucial to design realistic tasks, which requires a deep understanding of the users and their needs. Only once that has been done can the dataset be built, with realistic data. Along with the task and use case scenario, the evaluation scheme and the definition of relevance need to be very carefully planned, in order to maximize the outcome of the task. In [8], relevance is modelled according to several dimensions: understandability, topicality, novelty, scope and reliability. For instance, a task focusing on IR for patients or laypeople would define relevance mainly in terms of topicality, reliability and understandability. These factors need to be taken into account during the relevance judgement and results evaluation stages [8]. This would allow personalization of the search by characterizing the users and their information needs. Lastly, benchmark creators should be encouraged to make their datasets available to the public, for specific tasks or any related research work.

6. CONCLUSIONS
This was the first SIGIR workshop on 'Medical Information Retrieval', and followed on nicely from the SIGIR 2013 medical workshop on 'Health Search and Discovery: Helping Users and Advancing Medicine'. The volume of interest in the workshop, shown both by the number of paper submissions and by the large number of workshop participants, highlights both the activity and the interest in the medical information retrieval space within the community. The workshop provided greater insights into the active areas of research within this space and helped in making progress on the many challenges facing it. Special attention was paid to evaluation within this space, and to possibilities for progress within the data set creation and benchmarking initiatives discussed.

7. ACKNOWLEDGEMENTS
We would like to thank SIGIR 2014 for hosting the workshop. Thanks also go to the program committee (Eiji Aramaki, Kyoto University, Japan; Celia Boyer, Health on the Net, Switzerland; Ben Carterette, University of Delaware, USA; Allan Hanbury, Vienna University of Technology, Austria; William Hersh, Oregon Health and Science University, USA; Jung-Jae Kim, Nanyang Technological University, Singapore; Gang Luo, University of Utah, USA; Iadh Ounis, University of Glasgow, UK; Patrick Ruch, HES-SO, Switzerland; Stefan Schulz, Medical University Graz, Austria; Karin Verspoor, NICTA, Australia; Ellen Voorhees, NIST, USA; Ryen White, Microsoft Research, USA; Elad Yom-Tov, Microsoft Research, USA), paper authors and workshop attendees, without whom the workshop would not have been the success it was.

8. REFERENCES
[1] M. Choi, K. Verspoor, and J. Zobel. Evaluation of coreference resolution for biomedical text. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.
[2] Y. Deng, M. Stoehr, and K. Denecke. Retrieving attitudes: Sentiment analysis from clinical narratives. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.
[3] B. Koopman and G. Zuccon. Why assessing relevance in medical IR is demanding. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.
[4] L. Nie, M. Akbari, T. Li, and T.-S. Chua. A joint local-global approach for medical terminology assignment. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.
[5] P. C.-I. Pang, K. Verspoor, S. Chang, and J. Pearce. Designing for health exploratory seeking behaviour. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.
[6] R. Prasath and P. O'Reilly. Exploring clustering based knowledge discovery towards improved medical diagnosis. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.
[7] K. Verspoor. Practice-based evidence in medicine: Where information retrieval meets data mining. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), keynote, 2014.
[8] Y. Zhang, J. Zhang, M. Lease, and J. Gwizdka. Multidimensional relevance modeling via psychometrics and crowdsourcing. In Proceedings of SIGIR 2014, 2014.
[9] G. Zuccon and B. Koopman. Integrating understandability in the evaluation of consumer health search engines. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.