=Paper=
{{Paper
|id=Vol-2872/short02
|storemode=property
|title=Applications of Reinforcement learning for Medical Decision Making
|pdfUrl=https://ceur-ws.org/Vol-2872/short02.pdf
|volume=Vol-2872
|authors=Neel Gandhi,Shakti Mishra
|dblpUrl=https://dblp.org/rec/conf/rtacsit/GandhiM21
}}
==Applications of Reinforcement learning for Medical Decision Making==
Neel Gandhi, Shakti Mishra

School of Technology, Pandit Deendayal Petroleum University, Gandhinagar, Gujarat 382007

Proceedings of RTA-CSIT 2021, May 2021, Tirana, Albania. Email: neel.gict18@sot.pdpu.ac.in (N. Gandhi); shakti.mishra@sot.pdpu.ac.in (S. Mishra). ORCID: 0000-0003-1758-1764 (N. Gandhi); 0000-0002-5961-3114 (S. Mishra). © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.

===Abstract===
Reinforcement learning (RL) makes decisions by interacting with uncertain or complex environments, following a policy that aims to maximize long-term reward and using evaluative feedback to improve. RL has advantages over other forms of learning in medical decision making: it focuses on long-term rewards and can handle long, complex sequential decision-making tasks with sampled, delayed, and exhaustive feedback. It has emerged as a suitable method for developing satisfactory solutions in the healthcare domain. The healthcare system can be improved by integrating traditional healthcare practices with RL methods that take the patient's health status into account. In this paper, we discuss applications of RL that can support effective decisions on patient treatment, prognosis, diagnosis, and condition. RL can be effective across healthcare, from medical diagnosis to critical decision-making tasks. The paper provides a broad view of RL applications in the healthcare sector and illustrates applications that could improve the existing healthcare sector while handling complex medical decision-making tasks efficiently.

Keywords: Reinforcement Learning, Medical Decision Making, Healthcare, Medical Diagnosis

===1. Introduction===
In recent years, reinforcement learning has emerged as a crucial area of artificial intelligence, with impact on healthcare including diagnosis, prognosis, and other medical treatments. Reinforcement learning methods have long been useful for sequential decision-making tasks in robotics, gaming, and simulation, and likewise in healthcare, where they solve long and complicated decision-making tasks by using policies that aim to maximize reward as their final goal. Initially, RL was used for the treatment of patients in a closed-loop manner, which has various advantages over supervised learning. Supervised learning algorithms traditionally work on labeled data, whereas RL finds patterns in the given problem and learns from its own experience. The evolution of RL has also made it capable of handling issues such as the trade-off between exploration and exploitation and credit assignment, while maximizing reward under the optimal policy for a specific medical decision-making task. RL has gained popularity among practitioners dealing with dynamic treatment regimes, medical diagnosis, and other decision-making tasks. Reinforcement learning has been applied to simulations in healthcare domains such as drug dosage, examination time, and assessment of a patient's health status, among others. Applying RL to medical decision making has to account for patient health concerns, because a decision made by an RL method carries risks for the medical treatment. It is therefore imperative to choose an optimal method for treating the specified disease or medical condition.

===2. Applications of Reinforcement Learning in Medical Decision Making===
Reinforcement learning in healthcare follows a fixed sequence of steps: an agent (a medical device, computer, piece of equipment, or system) takes an action in the medical environment according to a defined policy, receives a reward, and then uses evaluative feedback to improve its performance [1]. RL provides various methods for solving sequential decision-making problems with the goal of maximizing reward through trial-and-error interaction with the environment. The agent both explores and exploits the environment, using evaluative feedback to make decisions and learning effective strategies in the process. Reinforcement learning has emerged as a prominent solution for decision-making tasks in the medical sector, with applications ranging from dynamic treatment regimes and medical diagnosis to various other complicated and cognitive decision-making tasks.

1. Dynamic Treatment Regimes (DTR): DTRs are designed for sequential decision-making problems. Reinforcement learning methods are used to develop policies that automate the process of developing treatment regimes for patients, taking long-term health benefits into account [2].

a) Chronic Diseases: Chronic diseases persist for a long period of time. Hence, practitioners follow the chronic care model (CCM), a sequence of medical interventions, to assess patient health status [3]. RL can help practitioners with continuous decision making in the treatment of chronic diseases including anemia, cancer, diabetes, human immunodeficiency virus (HIV), and mental illnesses, among many other long-lasting diseases.

i. Cancer: Q-learning with support vector regression and extremely randomized trees is used for cancer treatment [4], for example for cell cancer, chemotherapy effect, and other cancer conditions.

ii. Diabetes: Reinforcement learning methods are applied to determine a proper sequential dosage of insulin, with specified time and amount, in cases of Type 2 diabetes [5], with the aim of long-term health benefit.

iii. Anemia: Anemia is a lack of red blood cells (RBC) that can be controlled by an RL method [6]. The control input is the amount of erythropoietin administered, and the target under control is the hemoglobin level, which also affects iron storage in the patient's body. Hemoglobin serves as the state component, and the method avoids damage to the patient's body when administering an erythropoiesis-stimulating agent.

iv. HIV: HIV (human immunodeficiency virus) [7] is treated with a combination of anti-HIV drugs referred to as highly active antiretroviral therapy (HAART). It requires long-term treatment, and the resulting decision-making problem can be handled effectively with reinforcement learning algorithms such as batch RL.

v. Mental illness: Mental illness usually persists for a long period of time and requires significant adaptations in dosage as well as treatment type, which involves a very complex decision-making process. RL approaches can therefore be used to address depression [8] and schizophrenia [9], among many other mental health issues.

b) Intensive/Critical Care: RL methods can be helpful in critical care treatment such as mechanical ventilation [10], in the treatment of diseases such as sepsis [11], and in other critical care treatment.

i. Sepsis: Model-based reinforcement learning techniques [11] with improved policies have led to better treatment of sepsis in patients.

ii. Anesthesia: Anesthesia is the process of using specific drugs to reduce sensation in the body. RL-based control methods [12], such as temporal difference learning, are used to detect the distribution of the drug in the patient's body.

iii. Other Critical Situations: Because RL methods handle decision making in uncertain environments, they can be effective in critical situations such as mechanical ventilation [10] and heparin dosing [13], among other critical care conditions.

2. Medical Diagnosis: RL supports decision making in medical diagnosis [14] using medical condition data in the form of image and text data.

a) Computer Vision, Medical Image: Medical image data obtained from various computer vision techniques is used together with RL algorithms for feature extraction, image segmentation, localization, tracing, and object detection [15].

b) Natural Language Processing, Clinical Text Data: Clinical text data has also been used for the treatment of patients with RL methods [14], such as the DQ method, that are able to draw diagnostic inferences.

c) Human-Computer Interface, Dialogue Systems, Chat-bots, and Advanced Interfaces: Multi-agent systems were found effective in monitoring clinical data, using RL methods to develop a user interface that adapts itself to the specific user [16].

3. Other Medical Decision-making Tasks for Healthcare Systems

a) Resource Scheduling and Task Allocation: Resource allocation problems are usually modeled as a Markov decision process and solved with reinforcement learning, using appropriate policies to provide better service to the patient [17].

b) Optimal Process Control: Healthcare tasks such as simulation of surgical operations, adaptive control for medical video streaming, and policy control for functional electrical stimulation use RL methods such as Q-learning, IRL, and DRL, among others, to achieve the desired results [18].

c) Drug Discovery: De novo design [19] has led to the development of RL methods for the structural evolution and development of drugs using generative and predictive neural networks.

d) Patient Health Management: A personalized health recommendation system [20] has been developed using RL methods to deal with consultation, dosage, nutrition, and health activities.

===3. Conclusion===
This paper provides an overview of applications of reinforcement learning to a variety of decision-making problems in the healthcare domain. Across the healthcare problems surveyed, reinforcement learning was found effective in providing optimal solutions for decision making, from chronic diseases and medical diagnosis to various other healthcare decision-making problems. It was also found effective in dealing with medical data by using an optimal policy that maximizes long-term reward. Applying reinforcement learning in healthcare will improve the performance of the existing healthcare system by increasing the efficiency, safety, and robustness of handling real-time data for decision making in the healthcare sector.

===References===
[1] Y. Li, Deep reinforcement learning: An overview, arXiv preprint arXiv:1701.07274 (2017).

[2] J. Zhang, E. Bareinboim, Designing optimal dynamic treatment regimes: A causal reinforcement learning approach, Technical Report, Technical Report R-57, Causal Artificial Intelligence Lab, Columbia . . . , 2020.

[3] V. Barr, S. Robinson, B. Marin-Link, L. Underhill, A. Dotts, D. Ravensdale, S. Salivaras, The expanded chronic care model, Hosp Q 7 (2003) 73–82.

[4] Y. Zhao, M. R. Kosorok, D. Zeng, Reinforcement learning design for cancer clinical trials, Statistics in medicine 28 (2009) 3294–3315.

[5] E. Yom-Tov, G. Feraru, M. Kozdoba, S. Mannor, M. Tennenholtz, I. Hochberg, Encouraging Physical Activity in Patients With Diabetes: Intervention Using a Reinforcement Learning System, Journal of medical Internet research 19 (2017) e338. doi:10.2196/jmir.7994.

[6] A. E. Gaweda, M. K. Muezzinoglu, G. R. Aronoff, A. A. Jacobs, J. M. Zurada, M. E. Brier, Individualization of pharmacological anemia management using reinforcement learning, Neural Networks 18 (2005) 826–834.

[7] D. Ernst, G.-B. Stan, J. Goncalves, L. Wehenkel, Clinical data based optimal sti strategies for hiv: a reinforcement learning approach, in: Proceedings of the 45th IEEE Conference on Decision and Control, IEEE, 2006, pp. 667–672.

[8] C. Chen, T. Takahashi, S. Nakagawa, T. Inoue, I. Kusumi, Reinforcement learning in depression: a review of computational research, Neuroscience & Biobehavioral Reviews 55 (2015) 247–267.

[9] J. A. Waltz, M. J. Frank, B. M. Robinson, J. M. Gold, Selective Reinforcement Learning Deficits in Schizophrenia Support Predictions from Computational Models of Striatal-Cortical Dysfunction, Biological Psychiatry 62 (2007) 756–764. doi:10.1016/j.biopsych.2006.09.042.

[10] N. Prasad, L.-F. Cheng, C. Chivers, M. Draugelis, B. E. Engelhardt, A reinforcement learning approach to weaning of mechanical ventilation in intensive care units, arXiv preprint arXiv:1704.06300 (2017).

[11] A. Raghu, M. Komorowski, S. Singh, Model-Based Reinforcement Learning for Sepsis Treatment (2018). URL: http://arxiv.org/abs/1811.09602. arXiv:1811.09602.

[12] B. L. Moore, L. D. Pyeatt, V. Kulkarni, P. Panousis, K. Padrez, A. G. Doufas, Reinforcement learning for closed-loop propofol anesthesia: A study in human volunteers, Journal of Machine Learning Research 15 (2014) 655–696.

[13] S. Nemati, M. M. Ghassemi, G. D. Clifford, Optimal medication dosing from suboptimal clinical examples: A deep reinforcement learning approach, in: 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, 2016, pp. 2978–2981.

[14] Y. Ling, S. A. Hasan, V. Datla, A. Qadir, K. Lee, J. Liu, O. Farri, Diagnostic inferencing via improving clinical concept extraction with deep reinforcement learning: A preliminary study, in: Machine Learning for Healthcare Conference, 2017, pp. 271–285.

[15] F. Sahba, H. R. Tizhoosh, M. M. a. Salama, for Medical Image Segmentation (2006) 1238–1244.

[16] E. M. Shakshuki, M. Reid, T. R. Sheltami, An adaptive user interface in healthcare, Procedia Computer Science 56 (2015) 49–58.

[17] Z. Huang, W. M. van der Aalst, X. Lu, H. Duan, Reinforcement learning based resource allocation in business process management, Data & Knowledge Engineering 70 (2011) 127–145.

[18] J. Shin, T. A. Badgwell, K.-H. Liu, J. H. Lee, Reinforcement learning–overview of recent progress and implications for process control, Computers & Chemical Engineering 127 (2019) 282–294.

[19] M. Popova, O. Isayev, A. Tropsha, Deep reinforcement learning for de novo drug design, Science advances 4 (2018) eaap7885.

[20] J. Mulani, S. Heda, K. Tumdi, J. Patel, H. Chhinkaniwala, J. Patel, Deep reinforcement learning based personalized health recommendations, in: Deep Learning Techniques for Biomedical and Health Informatics, Springer, 2020, pp. 231–255.
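
The agent, policy, reward, and evaluative-feedback loop described in Section 2 can be illustrated with tabular Q-learning, one of the methods the paper names. The following is a minimal sketch and is not taken from the paper or its references: the five-level biomarker state, the three dose actions, the `step` dynamics, the reward, and all hyperparameters are hypothetical choices made only for illustration and carry no clinical meaning.

```python
import random

# Hypothetical toy problem (assumption, not from the paper):
# the state is a discretized biomarker level, 0 (too low) .. 4 (too high).
N_STATES = 5
TARGET = 2            # desired biomarker level
ACTIONS = (0, 1, 2)   # 0 = withhold dose, 1 = maintenance dose, 2 = higher dose


def step(state, action):
    """Toy deterministic dynamics: the level shifts by (action - 1), clipped to range."""
    next_state = min(N_STATES - 1, max(0, state + action - 1))
    reward = -abs(next_state - TARGET)  # penalize distance from the target level
    return next_state, reward


def greedy(q_row):
    """Return the action with the highest estimated value (first one on ties)."""
    return max(ACTIONS, key=lambda a: q_row[a])


def train(episodes=2000, horizon=10, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)  # each episode starts at a random level
        for _ in range(horizon):
            # Exploration versus exploitation.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q[state])
            next_state, reward = step(state, action)
            # Evaluative feedback: move the estimate toward the one-step target.
            td_target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (td_target - q[state][action])
            state = next_state
    return q


q_table = train()
policy = [greedy(row) for row in q_table]
print(policy)
```

In this toy setting the learned greedy policy should raise the dose when the level is below target, hold it at target, and withhold it above target. A real treatment problem would need a clinically validated state, action, and reward definition, and typically learning from logged data rather than free exploration on patients.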