=Paper=
{{Paper
|id=Vol-1737/T5-6
|storemode=property
|title=Relevance Detection and Argumentation Mining in Medical Domain
|pdfUrl=https://ceur-ws.org/Vol-1737/T5-6.pdf
|volume=Vol-1737
|authors=Vijayasaradhi Indurthi,Subba Reddy Oota
|dblpUrl=https://dblp.org/rec/conf/fire/IndurthiO16
}}
==Relevance Detection and Argumentation Mining in Medical Domain==
Vijayasaradhi Indurthi, IIIT Hyderabad, India, vijaya.saradhi@students.iiit.ac.in

Subba Reddy Oota, IIIT Hyderabad, India, oota.subba@students.iiit.ac.in

===ABSTRACT===
In this paper we describe a method to determine the relevance of a sentence in a document to a query in the medical domain. We also describe a method to determine whether the given sentence supports the query, opposes the query or is neutral with respect to the query. This work is part of the CHIS shared task at FIRE 2016.

===Keywords===
Information retrieval, argument mining, relevancy detection

===1. INTRODUCTION===
The World Wide Web is increasingly being used by consumers as an aid for health decision making and for self-management of chronic illnesses, as evidenced by the fact that one in every 20 searches on Google is about health. Information access mechanisms for factual health information retrieval have matured considerably, with search engines providing fact-checked Health Knowledge Graph search results to factual health queries. It is pretty straightforward to get an answer to the query “what are the symptoms of Diabetes” from the search engines.

However, retrieval of relevant multiple perspectives for complex health search queries which do not have a single definitive answer still remains elusive with most of the general purpose search engines. The presence of multiple perspectives with different grades of supporting evidence (which is dynamically changing over time due to the arrival of new research and practice evidence) makes it all the more challenging for a lay searcher.

===2. SHARED TASKS===
We use the term “Consumer Health Information Search” (CHIS) to denote such information retrieval search tasks, for which there is “No Single Best Correct Answer”; instead, multiple and diverse perspectives/points of view (which very often are contradictory in nature) are available on the web regarding the queried information. The goal of the CHIS track is to research and develop techniques to support users in complex multi-perspective health information queries.

Given a CHIS query, and a document/set of documents associated with that query, the FIRST task is to classify the sentences in the document as relevant to the query or not [4]. The relevant sentences are those from that document which are useful in providing the answer to the query. The SECOND task is to classify these relevant sentences as supporting the claim made in the query, or opposing the claim made in the query [4].

Example query: Does daily aspirin therapy prevent heart attack?

Sentence 1: “Many medical experts recommend daily aspirin therapy for preventing heart attacks in people of age fifty and above.” [Affirmative/Support]

Sentence 2: “While aspirin has some role in preventing blood clots, daily aspirin therapy is not for everyone as a primary heart attack prevention method”. [Disagreement/Oppose]

===3. DESCRIPTION===
For the shared tasks described above, we adopt a deep learning approach. Deep learning is a method which allows computers to learn from experience and understand the world in terms of a hierarchy of concepts, with each concept defined in terms of its relation to simpler concepts. By gathering knowledge from experience, this approach avoids the need for human operators to formally specify all of the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones. We train a deep neural network on the sentences.

Figure 1. The architecture of a deep neural network

The problems described above are modeled as a supervised learning task [1][4]. For a given query, we have been given a document consisting of a set of sentences. For each sentence we have been provided with the ground truths, i.e. whether the sentence is relevant to the query, and whether the sentence supports, opposes or is neutral to the query. We have trained a deep neural network [2] for this supervised learning task.

===4. FEATURES===
We have selected a binary bag-of-phrases [3] representation of the document. Since not all words in a sentence are relevant, we have identified the most important features manually and used these phrases to create the feature matrix. Some of the features included the presence of supporting words like ‘evidence’, ‘cause’, ‘exhibit’, ‘abnormal’, ‘nonetheless’. Opposing words like ‘oppose’, ‘does not’, ‘least’, ‘less’, ‘nothing’, ‘harmless’ were also used as features, as these words contribute to determining that the sentence opposes the given query. If a feature phrase is present in the given text, the value for that feature is 1. Otherwise, the value of the feature is 0. All our features are binary. In the preprocessing phase, all upper-case text was converted to lower case and all numbers were deleted. Some of the feature words and phrases are documented in Table 1.

{| class="wikitable"
|+ Table 1. Some relevant phrases used as features
|-
| Increase || Intense || Evidence || Harmful
|-
| However || Nonetheless || Oppose || Does not
|-
| Safe || Healthier || Harmless || Decreased
|-
| Inversed || Weak || Deadly || Cancer
|-
| Disease || Overdose || Dangerous || Risk
|-
| Adverse || Hazard || Poison || Prohibit
|-
| Overdose || Irritate || How safe || Associated
|-
| Suppress || Side effect || Oppose || Disorder
|-
| Incidence || Deficit || Though || Whereas
|-
| Nonetheless || Shorten || Reduce || Prevent
|-
| Protect || Wards off || Effective || Fewer
|-
| Questionable || Benefit || Disagree || Unsupported
|-
| Inconclusive || Unjustified || Myth || Not recommend
|-
| Viral || Evidence || No increase || Good choice
|-
| Flawed || Counteract || Lessen || Cause pain
|-
| Still high || Effective || Bothersome || No longer
|-
| Inadvisable || Strengthens || Lessens || Fighting
|-
| Unlikely || Still high || Good choice || Alarming
|}

Table 2 shows the number of features used for each dataset.

{| class="wikitable"
|+ Table 2. Features used for each dataset
|-
! Query !! Number of Features
|-
| Q1-Skincare || 81
|-
| Q2-MMR || 64
|-
| Q3-HRT || 105
|-
| Q4-ECIG || 95
|-
| Q5-Vit C || 124
|}

===5. ARCHITECTURE===
We use a deep neural network for both tasks. The input layer has as many neurons as there are input features. Task 1 is a binary classification problem, indicating whether the sentence is relevant to the query or not. Task 2 is a multi-class classification problem, which indicates whether the sentence supports, opposes or is neutral to the query. Table 3 shows the architecture of the neural network for both of the CHIS tasks [2][5].

{| class="wikitable"
|+ Table 3. Neural architectures for CHIS tasks 1 and 2
|-
! Task !! Hidden layers !! Neurons in hidden layers !! Activations
|-
| Task 1 || 2 || 120, 8 || relu, sigmoid
|-
| Task 2 || 2 || 150, 150 || tanh, tanh
|}

For task 1, the classification is a binary classification problem with a binary cross entropy layer at the output. For task 2, it is a multi-class classification problem, and hence a softmax layer is used at the output. For training the deep neural network, we used Keras. Keras is an open source neural network library written in Python. It is capable of running on top of either TensorFlow or Theano. Designed to enable fast experimentation with deep neural networks, it focuses on being minimal, modular and extensible. We train both neural networks for 150 epochs for convergence.

===6. RESULTS===
The following are the results obtained on the test set. Table 4 shows the average precision, recall and F1 score of the classifier for task 1. Table 5 shows the average precision, recall and F1 score of the classifier for task 2.

{| class="wikitable"
|+ Table 4. Task 1 precision, recall and F1-score on the test set
|-
! Query !! Precision !! Recall !! F1-score
|-
| Q1-Skincare || 0.80 || 0.78 || 0.78
|-
| Q2-MMR || 0.84 || 0.79 || 0.81
|-
| Q3-HRT || 0.89 || 0.89 || 0.89
|-
| Q4-ECIG || 0.79 || 0.66 || 0.68
|-
| Q5-Vit C || 0.73 || 0.73 || 0.71
|}

{| class="wikitable"
|+ Table 5. Task 2 precision, recall and F1-score on the test set
|-
! Query !! Precision !! Recall !! F1-score
|-
| Q1-Skincare || 0.76 || 0.74 || 0.75
|-
| Q2-MMR || 0.55 || 0.45 || 0.47
|-
| Q3-HRT || 0.66 || 0.54 || 0.53
|-
| Q4-ECIG || 0.54 || 0.52 || 0.52
|-
| Q5-Vit C || 0.52 || 0.50 || 0.49
|}

===7. OBSERVATIONS===
Predicting the relevance and determining whether a sentence supports the given query is not a trivial problem and needs knowledge of Natural Language Processing and Information Retrieval techniques. In this paper we proposed a fast deep learning method to predict both using a deep neural network. We observe that the average precision for task 1 is 77.03% and for task 2 is 54.86%. Task 2 is a multi-class problem and is more difficult than task 1.

===8. FUTURE WORK===
In this paper, we have used a select set of phrases as features. Since both the sentences and the query are short text segments, features derived with Natural Language Processing techniques such as POS tagging can be used to augment the existing features and improve the precision and recall [6]. Although we have identified the features manually, they could instead be obtained by selecting the adjectives and adverbs using any of the existing NLP toolkits. This would make the solution scalable and generic, so that it can be applied to other similar datasets.

===9. CODE===
All the code is available at https://github.com/saradhix/chis for research and academic purposes.

===10. REFERENCES===
[1] Robert Gaizauskas, Mark Hepple, and Mark Greenwood. 2004. Information retrieval for question answering a SIGIR 2004 workshop. SIGIR Forum 38, 2 (December 2004), 41-44. DOI=http://dx.doi.org/10.1145/1041394.1041403.

[2] Yoshua Bengio. 2009. Learning Deep Architectures for AI. Found. Trends Mach. Learn. 2, 1 (January 2009), 1-127. DOI=http://dx.doi.org/10.1561/2200000006.

[3] Maria Fernanda Caropreso and Stan Matwin. 2006. Beyond the bag of words: a text representation for sentence selection. In Proceedings of the 19th international conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence (AI'06), Luc Lamontagne and Mario Marchand (Eds.). Springer-Verlag, Berlin, Heidelberg, 324-335. DOI=http://dx.doi.org/10.1007/11766247_28.

[4] Andrenucci, A. 2008. Automated Question-Answering Techniques and the Medical Domain. In HEALTHINF (2) (pp. 207-212).

[5] Andreas Buja, Werner Stuetzle, and Yi Shen. 2005. Loss Functions for Binary Class Probability Estimation and Classification: Structure and Applications. Doctoral Thesis. University of Pennsylvania.

[6] Veselin Stoyanov, Claire Cardie, and Janyce Wiebe. 2005. Multi-perspective question answering using the OpQA corpus. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT '05). Association for Computational Linguistics, Stroudsburg, PA, USA, 923-930. DOI=http://dx.doi.org/10.3115/1220575.1220691.
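The binary bag-of-phrases features and the preprocessing described in Section 4 (lower-casing, deleting numbers, one 0/1 value per phrase) can be illustrated with a minimal sketch. It is not the authors' implementation: the phrase list is a small hypothetical subset of Table 1, the example sentence is invented, and plain substring matching is an assumption, since the paper does not say how phrases were matched.

```python
import re

# Hypothetical subset of the Table 1 phrases; the per-query lists used in the
# paper are longer (see Table 2) and are not reproduced here.
FEATURE_PHRASES = ["evidence", "does not", "harmless", "side effect", "prevent"]


def preprocess(text):
    """Lower-case the text and delete all digits, as described in Section 4."""
    text = re.sub(r"\d+", "", text.lower())
    # Collapse the whitespace left behind by deleted numbers.
    return re.sub(r"\s+", " ", text).strip()


def phrase_features(sentence, phrases=FEATURE_PHRASES):
    """Return one binary feature per phrase: 1 if present, otherwise 0.

    Assumption: a phrase counts as present when it occurs as a substring.
    """
    cleaned = preprocess(sentence)
    return [1 if phrase in cleaned else 0 for phrase in phrases]


# Invented example sentence, for illustration only.
sentence = "Daily aspirin does NOT prevent 100 percent of heart attacks."
vector = phrase_features(sentence)
print(vector)  # [0, 1, 0, 0, 1]
```

Stacking one such vector per sentence gives the feature matrix that is fed to the input layer, whose width therefore equals the number of phrases chosen for the query.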
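The layer sizes and activations of Table 3, together with the output layers described in Section 5, can be sketched as a plain forward pass. The authors trained their networks with Keras; the dependency-free Python below only illustrates the shapes involved. The single sigmoid output unit for task 1, the three softmax units for task 2 (support, oppose, neutral), the random untrained weights, and all function names are assumptions made for this illustration.

```python
import math
import random

ACTIVATIONS = {
    "relu": lambda z: max(0.0, z),
    "sigmoid": lambda z: 1.0 / (1.0 + math.exp(-z)),
    "tanh": math.tanh,
    "linear": lambda z: z,
}


def dense(inputs, weights, biases, activation):
    """Fully connected layer: activation(W x + b), one weight row per neuron."""
    act = ACTIVATIONS[activation]
    return [act(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw scores into class probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def build(n_features, hidden, n_outputs, rng):
    """Create random (untrained) weights for input -> hidden layers -> output."""
    sizes = [n_features] + [n for n, _ in hidden] + [n_outputs]
    return [([[rng.uniform(-0.1, 0.1) for _ in range(a)] for _ in range(b)],
             [0.0] * b)
            for a, b in zip(sizes, sizes[1:])]


def forward(x, layers, hidden, output):
    """Run x through the hidden layers, then the sigmoid or softmax output."""
    for (weights, biases), (_, activation) in zip(layers, hidden):
        x = dense(x, weights, biases, activation)
    weights, biases = layers[-1]
    if output == "softmax":
        return softmax(dense(x, weights, biases, "linear"))
    return dense(x, weights, biases, output)


def binary_cross_entropy(y_true, p):
    """Task 1 loss for one sentence with label y_true in {0, 1}."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


def categorical_cross_entropy(class_index, probs):
    """Task 2 loss for one sentence whose true class is class_index."""
    return -math.log(probs[class_index])


TASK1_HIDDEN = [(120, "relu"), (8, "sigmoid")]  # Table 3
TASK2_HIDDEN = [(150, "tanh"), (150, "tanh")]   # Table 3

rng = random.Random(0)
n_features = 81  # number of features for Q1-Skincare in Table 2
x = [rng.choice([0.0, 1.0]) for _ in range(n_features)]  # a binary feature vector

task1 = build(n_features, TASK1_HIDDEN, 1, rng)
task2 = build(n_features, TASK2_HIDDEN, 3, rng)
p_relevant = forward(x, task1, TASK1_HIDDEN, "sigmoid")[0]
p_classes = forward(x, task2, TASK2_HIDDEN, "softmax")
```

With untrained weights the outputs carry no meaning; the point is only that task 1 yields a single probability scored with binary cross entropy, while task 2 yields a probability distribution over three classes scored with categorical cross entropy.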