A Dynamic Human-in-the-loop Recommender System for Evidence-based Clinical Staging of COVID-19

Yogatheesan Varatharajah, University of Illinois at Urbana-Champaign, varatha2@illinois.edu
Haotian Chen, University of Illinois at Urbana-Champaign, hc19@illinois.edu
Andrew Trotter, University of Illinois at Chicago, trottera@uic.edu
Ravishankar Iyer, University of Illinois at Urbana-Champaign, rkiyer@illinois.edu

ABSTRACT
In this position paper, we discuss the potential use of a reinforcement learning (RL)-based human-in-the-loop recommender system to support clinical management of COVID-19. COVID-19 is a disease of extraordinary complexity that even the most experienced clinicians are struggling to understand. There is an urgent need for an evidence-based model for predicting the severity of COVID-19 and its complications that can guide individual clinical management decisions. Such a model will utilize a diverse set of information to determine a patient’s disease severity and associated risk of complications. An immediate application would be a clinical protocol tailored for COVID-19 patient care; this is a critical need both today and for future studies of potential treatments.

CCS CONCEPTS
• Human-centered computing → Human computer interaction (HCI); • Applied computing → Health care information systems.

KEYWORDS
COVID-19; reinforcement learning; human-in-the-loop; staging

ACM Reference Format:
Yogatheesan Varatharajah, Haotian Chen, Andrew Trotter, and Ravishankar Iyer. 2020. A Dynamic Human-in-the-loop Recommender System for Evidence-based Clinical Staging of COVID-19. In 5th International Workshop on Health Recommender Systems co-located with 14th ACM Conference on Recommender Systems (HealthRecSys’20), Online, Worldwide, September 26, 2020. 2 pages.

HealthRecSys’20, September 26, 2020, Online, Worldwide
© 2020 Copyright for the individual papers remains with the authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). This volume is published and copyrighted by its editors.

1 INTRODUCTION
The emergence of the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) poses significant challenges to the livelihood of the affected nations and, in the absence of directed treatment or a vaccine, requires drastic public health measures which have crippled national and international economies [2]. Preliminary data have shown that there is a spectrum of disease severity for which the disease mechanisms, patient characteristics, and risk factors are poorly understood. There is an urgent need for an evidence-based model to predict the severity of COVID-19 disease and its complications which can guide individual clinical management decisions and anticipate system-level allocation of health resources [3]. Such a model will require consideration of diverse patient characteristics and clinical variables in order to determine the patient’s disease severity and associated risk of complications, including death. It could also be applied to estimate the demands placed on medical and staff resources, and the definition of a disease staging system would be a critical tool in future studies of potential treatments [1]. At present, patients are triaged predominantly using clinical assessments based on other respiratory illnesses, which may not accurately reflect the trajectories patients follow under COVID-19. This disease is very new, and there is a scarcity of research defining risk factors for severe disease or methods to predict which patients are at risk of rapid decline in their health. It is critical to develop dynamically evolving analytical tools that can make accurate recommendations using limited and readily available baseline data. These analytic tools should also adaptively incorporate new information prospectively from current encounters in order to clinically stage disease severity at baseline and throughout disease progression.

2 SYSTEM OVERVIEW
Our proposed system (shown in Figure 1) is based on a human-in-the-loop RL algorithm that leverages both the knowledge of clinical experts and data-driven analytics. Our system will operate as follows. Consider a situation in which the algorithm is challenged with a patient who presents to the emergency department with a defined set of symptoms, laboratory variables, clinical measurements and imaging results. The learning algorithm will be able to provide an estimate of the patient’s probability of experiencing serious complications (such as the need for mechanical ventilation, or death) using the patient’s baseline characteristics. Based on this prognosis, a decision algorithm would recommend admission or discharge home and, if admitted, the level of medical care required (e.g., general ward, step down unit, intensive care unit). However, the final decision regarding the level of hospital care and administration of supportive and directed treatments (such as anti-viral drugs) will be determined by a clinical expert, using both the algorithm’s recommendation and his/her own clinical judgment (a human-in-the-loop ML model). The individual patient’s clinical outcome (e.g., need for ventilator support, symptom severity, time spent in ICU, treatment response, recovery or death, side effects) will be used to reinforce the prognostic algorithm. The framework will adapt to continuously refine decisions based on new data and expert-clinician reinforcement.
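The operating loop described in the system overview can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors’ implementation: the paper does not specify a model class, so an online logistic regression stands in for the prognostic learner, and every name here (PrognosticModel, recommend_care_level, triage_encounter) as well as the risk thresholds are hypothetical placeholders with no clinical meaning.

```python
import math
from dataclasses import dataclass, field

# Care levels named in the text, ordered from least to most intensive.
CARE_LEVELS = ["discharge home", "general ward", "step-down unit", "intensive care unit"]


@dataclass
class PrognosticModel:
    """Assumed stand-in learner: online logistic regression for
    P(serious complication | baseline features)."""
    n_features: int
    learning_rate: float = 0.1
    weights: list = field(default_factory=list)
    bias: float = 0.0

    def __post_init__(self):
        if not self.weights:
            self.weights = [0.0] * self.n_features

    def predict_risk(self, features):
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def reinforce(self, features, had_complication):
        """One gradient step on the log-loss using the observed outcome."""
        error = self.predict_risk(features) - (1.0 if had_complication else 0.0)
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]
        self.bias -= self.learning_rate * error


def recommend_care_level(risk, thresholds=(0.1, 0.3, 0.6)):
    """Map a risk estimate to a care level (thresholds are illustrative only)."""
    level = sum(risk >= t for t in thresholds)
    return CARE_LEVELS[level]


def triage_encounter(model, features, clinician_decide, observe_outcome):
    """One pass through the loop: predict, recommend, defer to the expert, learn."""
    risk = model.predict_risk(features)
    recommendation = recommend_care_level(risk)
    # The clinical expert always makes the final decision.
    final_decision = clinician_decide(features, risk, recommendation)
    had_complication = observe_outcome(final_decision)
    model.reinforce(features, had_complication)
    return recommendation, final_decision


model = PrognosticModel(n_features=2)
recommendation, final_decision = triage_encounter(
    model,
    [1.0, 0.5],                                         # hypothetical baseline features
    clinician_decide=lambda f, risk, rec: "general ward",  # expert overrides the model
    observe_outcome=lambda decision: True,              # a complication was observed
)
print(recommendation, "->", final_decision)
```

In this sketch the model never acts on its own: its recommendation is one input to the clinician’s callback, and only the observed outcome feeds back into the weights.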
[Figure 1: A Dynamic ML-based Clinical Staging Scheme for COVID-19. Diagram labels: biomarker data; learning environment; online optimizer; disease stage prediction (mild, moderate, severe, critical); expert decision; treatment; patient outcome.]

3 CHALLENGES
There are several challenges in developing a successful human-in-the-loop reinforcement-learning framework that generalizes across the entire disease severity spectrum.

Extracting actionable intelligence from heterogeneous and incomplete data: Owing to the complexity of COVID-19, the identification of distinct clinical stages of COVID-19 progression and patient trajectories requires the integration of multiple data sources. We believe that domain-guided models that integrate machine learning methods and clinical insights will be beneficial. Specifically, probabilistic graphical models can represent the domain-driven relationships between different information sources and can be transformed into discriminative models that can be trained using the available data. The goals are to improve outcomes, appropriately allocate healthcare resources, and reduce mortality rates while directed treatments and vaccines are being developed.

Quantifying the uncertainty in model predictions: Since data are limited in the beginning, uncertainty in the prognostics will be high in the early stages and will gradually decrease as the model is updated using new data. The ability to quantify such uncertainty is critical in order for clinicians to accurately gauge the importance of their own assessments relative to the model’s predictions. We recommend the use of Bayesian methods and attribution-based approaches to quantify the uncertainty and interpret model predictions, respectively, both of which will inform the clinicians.

Time-frames for model development and reinforcement: It is unclear how many data points are required to train an accurate initial model. A potential measure that can guide this decision is the convergence of class probabilities to their respective population means. In addition, there have been multiple reports indicating varied lengths of hospitalization. Such variability depends on multiple factors, including disease severity, comorbidities, health provider policies, decisions regarding withdrawal of life support, and cost of treatments. Clearly, some of these factors are non-deterministic, and data on such factors are typically unavailable to the machine learning model. Learning a sufficiently accurate and robust decision scheme with those difficult-to-measure elements and determining when to reinforce the learning algorithm remain challenging tasks.

Modeling the human-in-the-loop decision process: A typical RL approach relies on an effective balance between exploration and exploitation, such that the algorithm is allowed sufficient exploration of the input space prior to basing predictions primarily on the space that it has already explored. However, that paradigm is not usable in this setting, because treatment decisions are a matter of life and death; we cannot take exploratory actions that would violate medical ethics. Therefore, our approach requires the presence of a clinical expert who will make decisions after appraising the model’s predictions in light of his/her own assessments.

4 CONCLUSION
In this paper, we described a novel domain-guided human-in-the-loop RL framework to assist physicians in clinical decision-making to stage COVID-19 patients across the disease severity spectrum. Going forward, the clinical stages as defined by this approach could form the basis for evaluating the efficacy of existing and new drugs for patients at different stages of disease progression. While the proposed model is specifically designed and trained for COVID-19, its underlying paradigm, i.e., human-in-the-loop RL, affords the adaptivity to be applicable to other respiratory illnesses and future pandemics, with re-calibration.

5 ACKNOWLEDGEMENTS
This project has been funded by the Jump ARCHES endowment through the Health Care Engineering Systems Center at the University of Illinois.

REFERENCES
[1] Ezekiel J. Emanuel et al. 2020. Fair Allocation of Scarce Medical Resources in the Time of Covid-19. New England Journal of Medicine 382, 21 (2020), 2049–2055. https://doi.org/10.1056/NEJMsb2005114
[2] Center for Disease Control. 2020. Severe Outcomes Among Patients with Coronavirus Disease 2019 (COVID-19) — United States, February 12–March 16, 2020. Morb Mortal Wkly Rep 69 (2020), 343–346. https://doi.org/10.15585/mmwr.mm6912e2
[3] Hasan K Siddiqi and Mandeep R Mehra. 2020. COVID-19 illness in native and immunosuppressed states: A clinical–therapeutic staging proposal. The Journal of Heart and Lung Transplantation 39, 5 (2020), 405. https://doi.org/10.1016/j.healun.2020.03.012
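To illustrate the uncertainty challenge raised in Section 3 (high uncertainty while data are scarce, shrinking as outcomes accumulate), the following Python sketch tracks a Beta posterior over the complication rate of one hypothetical patient group. The Beta-Bernoulli model is an assumption chosen here as one of the simplest Bayesian methods; the paper recommends Bayesian methods in general and does not name a specific one. The class name BetaRiskEstimate, the uniform prior, and the outcome sequence are all invented for illustration.

```python
import math


class BetaRiskEstimate:
    """Beta posterior over the probability of a serious complication
    for one patient group (assumed Beta-Bernoulli model)."""

    def __init__(self, prior_alpha=1.0, prior_beta=1.0):
        # Beta(1, 1) is a uniform prior: no knowledge before any outcome is seen.
        self.alpha = prior_alpha
        self.beta = prior_beta

    def update(self, had_complication):
        """Conjugate update with one observed patient outcome."""
        if had_complication:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    @property
    def std(self):
        """Posterior standard deviation, used here as the uncertainty measure."""
        total = self.alpha + self.beta
        return math.sqrt(self.alpha * self.beta / (total ** 2 * (total + 1.0)))


estimate = BetaRiskEstimate()
print(f"before any data: risk = {estimate.mean:.2f} +/- {estimate.std:.2f}")

# Hypothetical outcomes for eight patients (True = serious complication).
for outcome in [True, False, False, True, False, False, False, False]:
    estimate.update(outcome)

print(f"after 8 patients: risk = {estimate.mean:.2f} +/- {estimate.std:.2f}")
```

With these made-up outcomes the posterior standard deviation falls from about 0.29 to about 0.14, which is the kind of signal a clinician could use to decide how much weight to give the model’s estimate relative to his/her own assessment.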