=Paper= {{Paper |id=Vol-2872/short02 |storemode=property |title=Applications of Reinforcement learning for Medical Decision Making |pdfUrl=https://ceur-ws.org/Vol-2872/short02.pdf |volume=Vol-2872 |authors=Neel Gandhi,Shakti Mishra |dblpUrl=https://dblp.org/rec/conf/rtacsit/GandhiM21 }} ==Applications of Reinforcement learning for Medical Decision Making== https://ceur-ws.org/Vol-2872/short02.pdf
Applications of Reinforcement learning for
Medical Decision Making
Neel Gandhi (a), Shakti Mishra (a)
(a) School of Technology, Pandit Deendayal Petroleum University, Gandhinagar, Gujarat 382007



Abstract
Reinforcement Learning (RL) is used for decision-making by interacting with uncertain or complex environments, with the aim of maximizing long-term reward by following a policy and using evaluative feedback for improvement. RL is advantageous in medical decision making compared to other forms of learning because it focuses on long-term rewards and can handle long and complex sequential decision-making tasks with sampled, delayed, and exhaustive feedback. It has emerged as a suitable method for developing satisfactory solutions in the healthcare domain. The healthcare system can be improved by integrating traditional healthcare practices with RL methods that take the health status of the patient into account. In this paper, we discuss various applications of RL that would help provide effective decisions for improving patient treatment, prognosis, diagnosis, and condition. RL could be effective across healthcare, from medical diagnosis to various critical decision-making tasks. The paper provides a broad view of RL applications that could improve the existing healthcare sector while handling complex medical decision-making tasks efficiently.

Keywords
Reinforcement Learning, Medical Decision Making, Healthcare, Medical Diagnosis


Proceedings of RTA-CSIT 2021, May 2021, Tirana, Albania
Email: neel.gict18@sot.pdpu.ac.in (N. Gandhi); shakti.mishra@sot.pdpu.ac.in (S. Mishra)
ORCID: 0000-0003-1758-1764 (N. Gandhi); 0000-0002-5961-3114 (S. Mishra)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

In recent years, Reinforcement Learning (RL) has emerged as a crucial area of artificial intelligence, impacting healthcare fields including diagnosis, prognosis, and other medical treatments. Reinforcement learning methods have long been useful for sequential decision-making tasks in robotics, gaming, and simulation, as well as in healthcare, where long and complicated decision-making tasks are solved with policies that aim to maximize reward as their final goal. Initially, RL was used for treating patients in a closed-loop manner, which has various advantages compared to supervised learning. Traditionally, supervised learning algorithms work on labeled data, whereas RL has the distinctive feature of finding patterns in a given problem and learning from its own experience. The evolution of RL has also made it capable of handling issues such as exploration and exploitation and credit assignment, while maximizing reward using the optimal policy for a specific medical decision-making task.

RL has gained popularity among practitioners dealing with dynamic treatment regimes, medical diagnosis, and other decision-making tasks. Reinforcement learning has been applied to simulations in healthcare domains such as drug dosage, examination time, and assessment of a patient's health status, among others. Applying RL to medical decision making must address patient health concerns, because a treatment decision made by an RL method carries risk. It is therefore imperative to choose an optimal method for treating the specified disease or medical condition.

2. Applications of Reinforcement Learning in medical decision making

Reinforcement learning in healthcare follows certain steps: an agent (a medical device, computer, piece of equipment, or system) takes an action in the medical environment according to a defined policy, receives a reward, and then uses evaluative feedback to improve its performance [1]. RL provides various methods for solving sequential decision-making problems with the goal of maximizing reward through trial-and-error interaction with the environment. The agent explores and exploits the environment, uses its evaluative feedback to make decisions, and learns effective strategies in the process. Reinforcement Learning has emerged as a prominent solution for decision-making tasks in the medical sector, with applications ranging from Dynamic Treatment Regimes and medical diagnosis to various other complicated and cognitive decision-making tasks.

   1. Dynamic Treatment Regimes (DTR) - DTRs are designed for sequential decision-making problems. Reinforcement learning methods are used to develop policies that automate the design of treatment regimes for patients while considering long-term health benefits [2].
      a) Chronic Diseases - Chronic diseases persist for a long period of time. Hence, practitioners follow the chronic care model (CCM), a sequence of medical interventions, to assess patient health status [3]. RL would help practitioners with continuous decision-making in the treatment of chronic diseases including anemia, cancer, diabetes, human immunodeficiency virus (HIV), and mental illnesses, among many other long-lasting diseases.
         i. Cancer - Q-learning with support vector regression and extremely randomized trees is used for the treatment of cancers [4], covering cell cancer, chemotherapy effects, and other cancer conditions.
         ii. Diabetes - Reinforcement learning methods are applied to determine the proper sequential dosage of insulin, with specified time and amount, in cases of Type 2 diabetes [5] for long-term health benefit.
         iii. Anemia - Anemia, a lack of red blood cells (RBC), can be controlled using an RL method [6]. The control input is the amount of erythropoietin, and the target under control is the hemoglobin level, which also has an impact on iron storage in the patient's body. Hemoglobin is the state component, and the erythropoiesis-stimulating agent is administered so as to avoid any damage to the patient's body.
         iv. HIV - Human immunodeficiency virus (HIV) [7] is treated with a combination of anti-HIV drugs referred to as highly active antiretroviral therapy (HAART). This requires long-term treatment, and the decision-making involved could be handled effectively with reinforcement learning algorithms such as Batch RL.
         v. Mental illness - Mental illness usually persists for a long period of time and requires significant adaptations in dosage as well as treatment type, involving a very complex decision-making process. RL approaches can therefore be applied to problems such as depression [8] and schizophrenia [9], among many other mental health issues.
      b) Intensive/Critical Care - RL methods would prove helpful in critical care treatment such as mechanical ventilation [10], in the treatment of diseases such as sepsis [11], and in other critical care treatment.
         i. Sepsis - Model-based reinforcement learning techniques [11] with improved policies have led to better treatment of sepsis in patients.
         ii. Anesthesia - Anesthesia is the process of using specific drugs to reduce sensation in the body. RL-based control methods [12], such as temporal difference learning, are used to detect the distribution of the drug in the patient's body.
         iii. Other Critical Situations - Because RL methods handle decision making in uncertain environments, they would be effective in critical situations such as mechanical ventilation [10] and heparin dosing [13], among other critical care conditions.
   2. Medical Diagnosis - RL supports decision-making in medical diagnosis [14] using data on the medical condition in the form of images and text.
      a) Computer Vision, Medical Image - Medical image data obtained from various computer vision techniques are used together with RL algorithms for feature extraction, image segmentation, localization, tracing, and object detection [15].
      b) Natural Language Processing, Clinical text data - Clinical text data have also been used in the treatment of patients through RL methods [14], such as the DQ method, that are able to make diagnostic inferences.
      c) Human-Computer Interface: Dialogue Systems, Chat-bots, and Advanced Interfaces - Multi-agent systems were found effective in monitoring clinical data, using RL methods to develop a user interface that adapts itself to a specific user [16].
   3. Other Medical Decision-making Tasks for healthcare systems
      a) Resource scheduling and task allocation - Resource allocation problems are usually modeled as a Markov Decision Process, with reinforcement learning using appropriate policies to provide better service to the patient [17].
      b) Optimal Process Control - Healthcare tasks such as simulation of surgical operations, adaptive control for medical video streaming, and control policies for functional electric simulations are handled with RL methods such as Q-learning, IRL, and DRL, among
         others, in the best possible way to achieve the desired results [18].
      c) Drug Discovery - De novo design [19] has led to the development of RL methods for the structural evolution and development of drugs using generative and predictive neural networks.
      d) Patient Health Management - A Personalized Health Recommendation System [20] has been developed using RL methods to deal with consultation, dosage, nutrition, and health activities.
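
The loop described in Section 2, in which an agent acts under a policy, receives a reward, and improves from evaluative feedback while balancing exploration and exploitation, can be illustrated with tabular Q-learning, the method named in the cancer item of the list above. The following Python sketch is only a minimal illustration under invented assumptions: the five marker levels, the target level, the three dose-change actions, the deterministic dynamics in `step`, and all hyperparameters are hypothetical and are not taken from the paper or from any cited study.

```python
import random

# Hypothetical toy setting (not from the paper): a patient marker takes
# levels 0..4 and the clinical target is level 2.
LEVELS = 5
TARGET = 2
ACTIONS = (-1, 0, 1)  # lower the dose, keep it, raise it (invented effect)


def step(level, action):
    """Invented deterministic dynamics: the dose change shifts the marker."""
    next_level = min(LEVELS - 1, max(0, level + action))
    reward = -abs(next_level - TARGET)  # evaluative feedback: distance to target
    return next_level, reward


def train(episodes=2000, horizon=10, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(LEVELS)]
    for _ in range(episodes):
        level = rng.randrange(LEVELS)  # each episode starts from a random level
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore: random action
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[level][i])  # exploit
            next_level, reward = step(level, ACTIONS[a])
            # Move Q(s, a) toward the reward plus the discounted best next value.
            target = reward + gamma * max(q[next_level])
            q[level][a] += alpha * (target - q[level][a])
            level = next_level
    return q


def greedy_policy(q):
    """Best dose change per level according to the learned Q-table."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Under these assumptions, the learned greedy policy is expected to raise the dose below the target level, hold it at the target, and lower it above the target. A real dynamic treatment regime would involve stochastic patient responses, far larger state spaces, and safety constraints that this toy model ignores.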
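
The resource scheduling item above states that allocation problems are usually modeled as a Markov Decision Process. As a hedged illustration of what such a model can look like, the Python sketch below defines a tiny clinic queue and solves it with value iteration. Value iteration is a dynamic-programming method that assumes the transition model is known; an RL method would instead learn from sampled transitions. The queue capacity, arrival probability, costs, and staffing options are all hypothetical values chosen for the example and do not come from the paper or from [17].

```python
# Hypothetical clinic model (not from the paper): the state is the number of
# waiting patients and the action is how many staff members are on shift.
MAX_QUEUE = 3
STAFF_OPTIONS = (1, 2)
ARRIVAL_PROB = 0.5   # assumed chance that one new patient arrives per period
WAIT_COST = 1.0      # assumed cost per patient left waiting
STAFF_COST = 0.4     # assumed cost per staff member on shift
GAMMA = 0.9          # discount factor


def transitions(queue, staff):
    """Return (reward, [(probability, next_queue), ...]) for one period."""
    remaining = max(0, queue - staff)  # each staff member serves one patient
    reward = -(WAIT_COST * remaining + STAFF_COST * staff)
    with_arrival = min(MAX_QUEUE, remaining + 1)
    return reward, [(ARRIVAL_PROB, with_arrival), (1.0 - ARRIVAL_PROB, remaining)]


def q_value(queue, staff, values):
    """Expected discounted return of choosing `staff` in state `queue`."""
    reward, outcomes = transitions(queue, staff)
    return reward + GAMMA * sum(p * values[nxt] for p, nxt in outcomes)


def value_iteration(tol=1e-9):
    """Iterate the Bellman optimality update until the values stop changing."""
    values = [0.0] * (MAX_QUEUE + 1)
    while True:
        new_values = [max(q_value(s, a, values) for a in STAFF_OPTIONS)
                      for s in range(MAX_QUEUE + 1)]
        if max(abs(n - v) for n, v in zip(new_values, values)) < tol:
            break
        values = new_values
    # Read off the staffing level that is greedy with respect to the values.
    policy = [max(STAFF_OPTIONS, key=lambda a: q_value(s, a, new_values))
              for s in range(MAX_QUEUE + 1)]
    return new_values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    print(policy)
```

In this toy model a second staff member only pays off once at least two patients are waiting, so the resulting policy is a simple threshold rule. Realistic scheduling problems would need richer states (patient urgency, multiple resources) and would typically be too large to solve exactly this way.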
3. Conclusion

This paper aims to provide a detailed overview of applications of reinforcement learning for solving a variety of decision-making problems in the healthcare domain. Reinforcement learning applied to various healthcare ailments was found effective in providing optimal solutions for decision making in healthcare tasks ranging from chronic diseases and medical diagnosis to various other healthcare decision-making problems. Reinforcement learning was also found effective in dealing with medical data by using an optimal policy that maximizes long-term rewards. Applying reinforcement learning in healthcare can improve the performance of the existing healthcare system by increasing the efficiency, safety, and robustness of handling real-time data for decision making in the healthcare sector.

References

[1] Y. Li, Deep reinforcement learning: An overview, arXiv preprint arXiv:1701.07274 (2017).
[2] J. Zhang, E. Bareinboim, Designing optimal dynamic treatment regimes: A causal reinforcement learning approach, Technical Report R-57, Causal Artificial Intelligence Lab, Columbia . . . , 2020.
[3] V. Barr, S. Robinson, B. Marin-Link, L. Underhill, A. Dotts, D. Ravensdale, S. Salivaras, The expanded chronic care model, Hosp Q 7 (2003) 73–82.
[4] Y. Zhao, M. R. Kosorok, D. Zeng, Reinforcement learning design for cancer clinical trials, Statistics in medicine 28 (2009) 3294–3315.
[5] E. Yom-Tov, G. Feraru, M. Kozdoba, S. Mannor, M. Tennenholtz, I. Hochberg, Encouraging Physical Activity in Patients With Diabetes: Intervention Using a Reinforcement Learning System, Journal of medical Internet research 19 (2017) e338. doi:10.2196/jmir.7994.
[6] A. E. Gaweda, M. K. Muezzinoglu, G. R. Aronoff, A. A. Jacobs, J. M. Zurada, M. E. Brier, Individualization of pharmacological anemia management using reinforcement learning, Neural Networks 18 (2005) 826–834.
[7] D. Ernst, G.-B. Stan, J. Goncalves, L. Wehenkel, Clinical data based optimal STI strategies for HIV: a reinforcement learning approach, in: Proceedings of the 45th IEEE Conference on Decision and Control, IEEE, 2006, pp. 667–672.
[8] C. Chen, T. Takahashi, S. Nakagawa, T. Inoue, I. Kusumi, Reinforcement learning in depression: a review of computational research, Neuroscience & Biobehavioral Reviews 55 (2015) 247–267.
[9] J. A. Waltz, M. J. Frank, B. M. Robinson, J. M. Gold, Selective Reinforcement Learning Deficits in Schizophrenia Support Predictions from Computational Models of Striatal-Cortical Dysfunction, Biological Psychiatry 62 (2007) 756–764. doi:10.1016/j.biopsych.2006.09.042.
[10] N. Prasad, L.-F. Cheng, C. Chivers, M. Draugelis, B. E. Engelhardt, A reinforcement learning approach to weaning of mechanical ventilation in intensive care units, arXiv preprint arXiv:1704.06300 (2017).
[11] A. Raghu, M. Komorowski, S. Singh, Model-Based Reinforcement Learning for Sepsis Treatment (2018). URL: http://arxiv.org/abs/1811.09602. arXiv:1811.09602.
[12] B. L. Moore, L. D. Pyeatt, V. Kulkarni, P. Panousis, K. Padrez, A. G. Doufas, Reinforcement learning for closed-loop propofol anesthesia: A study in human volunteers, Journal of Machine Learning Research 15 (2014) 655–696.
[13] S. Nemati, M. M. Ghassemi, G. D. Clifford, Optimal medication dosing from suboptimal clinical examples: A deep reinforcement learning approach, in: 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, 2016, pp. 2978–2981.
[14] Y. Ling, S. A. Hasan, V. Datla, A. Qadir, K. Lee, J. Liu, O. Farri, Diagnostic inferencing via improving clinical concept extraction with deep reinforcement learning: A preliminary study, in: Machine Learning for Healthcare Conference, 2017, pp. 271–285.
[15] F. Sahba, H. R. Tizhoosh, M. M. a. Salama, for Medical Image Segmentation (2006) 1238–1244.
[16] E. M. Shakshuki, M. Reid, T. R. Sheltami, An adaptive user interface in healthcare, Procedia Computer Science 56 (2015) 49–58.
[17] Z. Huang, W. M. van der Aalst, X. Lu, H. Duan, Reinforcement learning based resource allocation in business process management, Data & Knowledge Engineering 70 (2011) 127–145.
[18] J. Shin, T. A. Badgwell, K.-H. Liu, J. H. Lee, Reinforcement learning–overview of recent progress and implications for process control, Computers & Chemical Engineering 127 (2019) 282–294.
[19] M. Popova, O. Isayev, A. Tropsha, Deep reinforcement learning for de novo drug design, Science advances 4 (2018) eaap7885.
[20] J. Mulani, S. Heda, K. Tumdi, J. Patel, H. Chhinkaniwala, J. Patel, Deep reinforcement learning based personalized health recommendations, in: Deep Learning Techniques for Biomedical and Health Informatics, Springer, 2020, pp. 231–255.