=Paper=
{{Paper
|id=Vol-1276/MedIR-SIGIR2014-overview
|storemode=property
|title=Report on the SIGIR 2014 Workshop on Medical Information Retrieval (MedIR)
|pdfUrl=https://ceur-ws.org/Vol-1276/MedIR-SIGIR2014-overview.pdf
|volume=Vol-1276
|dblpUrl=https://dblp.org/rec/conf/sigir/GoeuriotKJMZ14
}}
==Report on the SIGIR 2014 Workshop on Medical Information Retrieval (MedIR)==
'''Authors'''
* Lorraine Goeuriot, Dublin City University, Ireland, lorraine.goeuriot@imag.fr
* Liadh Kelly, Dublin City University, Ireland, liadh.kelly@scss.tcd.ie
* Gareth J.F. Jones, Dublin City University, Ireland, gjones@computing.dcu.ie
* Henning Müller, HES-SO Valais, Switzerland, henning.mueller@hevs.ch
* Justin Zobel, University of Melbourne, Australia, jzobel@unimelb.edu.au

===ABSTRACT===
The workshop on Medical Information Retrieval took place at SIGIR 2014 in Gold Coast, Australia on July 11. The workshop included eight oral presentations of refereed papers and an invited keynote presentation. This allowed time for lively discussions among the participants. These showed the significant interest in the medical information retrieval domain and the many research challenges arising in this space which need to be addressed to give added value to the wide variety of users that can profit from medical information search, such as patients, general health professionals and specialist groups such as radiologists who mainly search for images and image-related information.

Copyright is held by the author/owner(s). MedIR 2014, July 11, 2014, Gold Coast, Australia.

===1. INTRODUCTION===
Medical information retrieval refers to methodologies and technologies that seek to improve access to medical information archives via a process of information retrieval (IR). Such information is now potentially accessible from many sources including the general web, social media, journal articles, and hospital records. Health-related content is one of the most searched-for topics on the Internet, and as such this is an important domain for research in information retrieval.

Medical information is of interest to a wide variety of users, including patients and their families, researchers, general practitioners and clinicians, and practitioners with specific expertise such as radiologists. There are several dedicated services that seek to make this information more easily accessible, such as Health on the Net's medical search systems for the general public and medical practitioners: http://www.hon.ch/. Despite the popularity of the medical domain for users of search engines, and current interest in this topic within the IR research community, development of search and access technologies remains particularly challenging and underexplored.

One of the central issues in medical information search is the diversity of the users of these services, with corresponding differences in the types and scopes of their individual needs. Their information needs will be associated with varied categories and purposes, they will typically have widely varying levels of medical knowledge, and, importantly in some settings, they will have differing language skills.

These challenges can be summarized as follows:

# Varying information needs: While a patient with a recently diagnosed condition will generally benefit most from simple or introductory information on the disease and its treatment, a patient living with or managing a condition over a longer term will generally be looking for more advanced information, or perhaps support groups and forums. In a similar way, a general practitioner might require basic information quickly while advising a patient, but more detailed information if deciding on a course of treatment, while a specialist clinician might look for an exhaustive list of similar cases or research papers relating to the condition of a patient that they are currently seeking to advise. Understanding various types of users and their information needs is one of the cornerstones of medical information search, while adapting IR to best address these needs to develop effective, potentially personalized systems is one of its greatest challenges.
# Varying medical knowledge: The different categories of users of medical information search systems will have widely varying levels of medical knowledge, and indeed the medical knowledge of different individuals within a user category can also vary greatly. This affects the way in which individuals pose search queries to systems and also the level of complexity of information which should be returned to them, or the type of support in understanding and disambiguating returned material which will be required.
# Varying language skills: Given that much medical content is written only in the English language, research to date in medical information search has predominantly focused on monolingual English retrieval. However, given the large number of non-English speakers on the Internet and the lack of content in their native language, effective support for them to search English language sources is highly desirable. The Internet in particular has affected the patient-physician relationship, and providing relevant, reliable information to patients in their own language is key to alleviating such challenging situations and reducing instances of phenomena such as cyberchondria.

In addition, the format, reliability, and quality of biomedical and medical information varies greatly. A single health record can contain clinical notes, technical pathology data, images, and patient-contributed histories, and may be linked by a physician to research papers. The importance of health and medical topics and their impact on people's everyday lives makes the need for retrieval of accurate and reliable information especially important. Determining the likely reliability of available information is challenging. Finally, as with IR in general, the evaluation of medical search tools is vital and challenging. For example, there are no established or standardized baselines or evaluation metrics, and limited availability of test collections. Further discussion and progression on this topic would be beneficial to the community.

===2. THEME AND PURPOSE OF THE WORKSHOP===
The objective of the workshop was to provide a forum to enable the progression of research in medical IR seeking to provide enhanced search services for all users with interests in medical information search. The workshop aimed to bring together researchers interested in medical information search with the goal of identifying specific research challenges that need to be addressed to advance the state of the art and to foster interdisciplinary collaborations towards meeting these challenges. To enable this, we encouraged participation from researchers in all fields related to medical information search including mainstream IR, but also natural language processing, multilingual text processing, and medical image analysis.

Topics of interest included but were not limited to:
* Users and information needs
* Semantics and natural language processing (NLP) for medical IR
* Reliability and trust in medical IR
* Personalised search
* Evaluation of medical IR
* Multilingual issues in medical IR
* Multimedia technologies in medical IR
* The role of social media in medical IR

===3. KEYNOTE - DR KARIN VERSPOOR===
The keynote talk was given by Dr Karin Verspoor (University of Melbourne, Australia), on "Practice-based Evidence in Medicine: Where Information Retrieval Meets Data Mining" [7]. A new approach in medical practice is emerging thanks to the increasing availability of large-scale clinical data in electronic form. In practice-based evidence, the clinical record is mined to identify patterns of health characteristics, such as diseases that co-occur, side-effects of treatments, or more subtle combinations of patient attributes that might explain a particular health outcome. This approach contrasts with what has been the standard of care in medicine, evidence-based practice, in which treatment decisions are based on (quantitative) evidence derived from targeted research studies, specifically, randomised controlled trials. Advantages of consulting the clinical record for evidence rather than relying solely on structured research include avoiding the selection bias of the inclusion criteria for a clinical trial and monitoring of longer-term outcomes and effects. The two approaches are, of course, complementary: a hypothesis derived from large-scale data mining could in turn form the starting point for the design of a clinical trial to rigorously investigate that hypothesis.

Information retrieval can play an important role in both approaches to collecting medical evidence. However, the use of information retrieval methods in collecting practice-based evidence requires moving away from traditional document-oriented retrieval as the end goal in itself, to viewing that retrieval as an intermediate step towards knowledge discovery and population-scale data mining. Furthermore, it may require the development of more context-specific retrieval strategies, designed to identify specific characteristics of interest and to support particular tasks in the medical context.

===4. PRESENTED PAPERS===
Of the twenty papers submitted to the workshop, eight were selected for inclusion in the workshop proceedings and for presentation at the workshop:
* Patrick Cheong-Iao Pang, Karin Verspoor, Shanton Chang and Jon Pearce. Designing for Health Exploratory Seeking Behaviour [5]
* Miji Choi, Karin Verspoor and Justin Zobel. Evaluation of Coreference Resolution for Biomedical Text [1]
* Yihan Deng, Matthaeus Stoehr and Kerstin Denecke. Retrieving Attitudes: Sentiment Analysis from Clinical Narratives [2]
* Bevan Koopman and Guido Zuccon. Why Assessing Relevance in Medical IR is Demanding [3]
* Dimitrios Markonis, Roger Schaer and Henning Müller. Multi-modal relevance feedback for medical image retrieval
* Liqiang Nie, Mohammad Akbari, Tao Li and Tat-Seng Chua. A Joint Local-Global Approach for Medical Terminology Assignment [4]
* Rajendra Prasath and Philip O'Reilly. Exploring Clustering Based Knowledge Discovery towards Improved Medical Diagnosis [6]
* Guido Zuccon and Bevan Koopman. Integrating Understandability in the Evaluation of Consumer Health Search Engines [9]

===5. DISCUSSION SESSION===
The discussion sessions started with a brainstorming activity to identify the key challenges in medical IR. The two main areas identified were the lack of available data sets and the need for better evaluation. Two groups were formed to discuss these two topics.

The first group discussed the lack of data sets. One of the reasons for this is the limited amount of publicly available data (e.g. clinical data and query logs). Aside from the patient-related issues of confidentiality and privacy, medical data are very varied and constantly changing, which makes obtaining representative and up-to-date data sets very challenging. These variations can be found at different levels. The first is the level of specialization and the targeted readers: consumer information varies greatly from clinical practice information. Second, linguistic variations such as shifting vocabulary affect information extraction (IE) and IR results. In order to deal with these changing characteristics, what could the value of abstraction into controlled vocabularies be? Moreover, a controlled vocabulary would help in alleviating ambiguity. But how can it be efficiently incorporated into a retrieval approach? Concept-based representation of data and indexing are being investigated, but their efficiency in IR is still to be proven. Finally, some modalities are very specific to the medical domain, such as temporality, negativity, and patients' characteristics in clinical data such as age, gender, and co-morbidities. To understand and automatically process these, training data is necessary (raw data and gold standard annotations), but is difficult and expensive to obtain.

The second group focused on the evaluation of medical information retrieval. They identified as the main issues the lack of evaluation campaigns and benchmarks for medical IR, and the lack of information on the few existing campaigns. Based on the experience of the group members, the discussion focused on a few key challenges that must be addressed to obtain more benchmarks and improve their quality. Firstly, it is crucial to design realistic tasks, which requires a deep understanding of the users and their needs. Only once that has been done can the dataset be built, with realistic data. Along with the task and use case scenario, the evaluation scheme and the definition of relevance need to be very carefully planned, in order to maximize the outcome of the task. In [8], relevance is modelled according to several relevance dimension factors: understandability, topicality, novelty, scope and reliability. For instance, a task focusing on IR for patients or laypeople would define relevance as mainly based on topicality, reliability and understandability. These factors need to be taken into account during the relevance judgement and results evaluation stages [8]. This would allow personalization of the search, characterizing the users and their information needs. Lastly, benchmark creators should be encouraged to make their datasets available to the public, for specific tasks or any related research work.

===6. CONCLUSIONS===
This was the first SIGIR workshop on 'Medical Information Retrieval', and followed on nicely from the SIGIR 2013 medical workshop on 'Health Search and Discovery: Helping Users and Advancing Medicine'. The volume of interest in the workshop, shown both by the number of paper submissions and by the large number of workshop participants, highlights both the activity and the interest in the medical information retrieval space within the community. The workshop provided greater insights into the active areas of research within this space and helped advance work on the many challenges facing it. Special attention was paid to evaluation within this space, and possibilities for progression within data set creation and benchmarking initiatives were discussed.

===7. ACKNOWLEDGEMENTS===
We would like to thank SIGIR 2014 for hosting the workshop. Thanks also go to the program committee (Eiji Aramaki, Kyoto University, Japan; Celia Boyer, Health on the Net, Switzerland; Ben Carterette, University of Delaware, USA; Allan Hanbury, Vienna University of Technology, Austria; William Hersh, Oregon Health and Science University, USA; Jung-Jae Kim, Nanyang Technological University, Singapore; Gang Luo, University of Utah, USA; Iadh Ounis, University of Glasgow, UK; Patrick Ruch, HES-SO, Switzerland; Stefan Schulz, Medical University Graz, Austria; Karin Verspoor, NICTA, Australia; Ellen Voorhees, NIST, USA; Ryen White, Microsoft Research, USA; Elad Yom-Tov, Microsoft Research, USA), paper authors and workshop attendees, without whom the workshop would not have been the success it was.

===8. REFERENCES===
[1] M. Choi, K. Verspoor, and J. Zobel. Evaluation of coreference resolution for biomedical text. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.

[2] Y. Deng, M. Stoehr, and K. Denecke. Retrieving attitudes: Sentiment analysis from clinical narratives. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.

[3] B. Koopman and G. Zuccon. Why assessing relevance in medical IR is demanding. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.

[4] L. Nie, M. Akbari, T. Li, and T.-S. Chua. A joint local-global approach for medical terminology assignment. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.

[5] P. C.-I. Pang, K. Verspoor, S. Chang, and J. Pearce. Designing for health exploratory seeking behaviour. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.

[6] R. Prasath and P. O'Reilly. Exploring clustering based knowledge discovery towards improved medical diagnosis. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.

[7] K. Verspoor. Practice-based evidence in medicine: Where information retrieval meets data mining. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), keynote, 2014.

[8] Y. Zhang, J. Zhang, M. Lease, and J. Gwizdka. Multidimensional relevance modeling via psychometrics and crowdsourcing. In Proceedings of SIGIR 2014, 2014.

[9] G. Zuccon and B. Koopman. Integrating understandability in the evaluation of consumer health search engines. In Proceedings of the SIGIR workshop on Medical Information Retrieval (MEDIR 2014), 2014.