Technical Feasibility, Financial Viability, and Clinician Acceptance: On the Many Challenges to AI in Clinical Practice

Nur Yildirim,1 John Zimmerman,1 Sarah Preum2
1 Human-Computer Interaction Institute, Carnegie Mellon University
2 Computer Science Department, Dartmouth College
yildirim@cmu.edu, johnz@cs.cmu.edu, sarah.masud.preum@dartmouth.edu

Copyright © 2021, for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

Artificial intelligence (AI) applications in healthcare offer the promise of improved decision making for clinicians, and better healthcare outcomes for patients. While technical AI advances in healthcare showcase impressive performances in lab settings, they seem to fail when moving to clinical practice. In this position paper, we reflect on our experiences of designing for AI acceptance and discuss three interrelated challenges to AI in clinical practice: technical feasibility, financial viability, and clinician acceptance. We discuss each challenge and its implications for future research in clinical AI. We encourage the research community to take on these lenses in collaboratively tackling the challenges of moving AI systems into real-world healthcare applications.

Introduction

Over the last decade, there has been lots of excitement about what AI might do for healthcare. AI offers the promise of improved cancer diagnosis, faster discovery of new drugs, and even personalization of patients’ healthcare experiences. The transition to electronic health records has produced a wealth of data ripe for mining. Interestingly, AI systems that work great in computer labs largely fail when they move to clinical practice, and the number one reason they fail is the lack of adoption by clinicians (Yang, Zimmerman, and Steinfeld 2015).

Our team has been investigating how AI might function more effectively in the enterprise: how AI systems might help professionals both make better decisions and also feel like they are becoming better at their jobs. We have done work in education, business, and healthcare. Currently, we are working on a project to identify how AI might improve ICU care. Can it automate mundane tasks? Can it discover low-value care? Can it detect deviations between standards of care and actual practice?

For this position paper, we reflect on what we have learned about AI acceptance in the workplace across many domains, and we tailor our insights to aspects most relevant to clinical practice. In our experience, AI only flourishes when it is technically feasible, financially viable, and acceptable or even desired by end-users. Research on clinical AI has largely focused on some aspects of technical feasibility. Researchers have made many stunning technical advances that confirm this is a great space for innovation. Little to no work has investigated how AI systems might pay for themselves within the complex landscape of healthcare reimbursement, and little work has explored when, where, and in what form AI inferences might be viewed as valuable by clinicians. Below we touch on each of these three areas that we feel must be addressed for AI to thrive in the clinic.

Technical Feasibility

While there have been great technical advances around AI in healthcare, much of the work is not clinically relevant (Seneviratne, Shah, and Chu 2020); research has largely reproduced human decision making, and researchers have tended to focus on difficult problems rather than searching for low-hanging fruit. When we use the term “clinically relevant” and apply it to AI innovation, we are talking about research that offers empirical evidence that clinicians want the AI output researchers are developing. To take examples from our work on the ICU, researchers have created systems that predict medication (Suresh et al. 2017), predict a patient will need a ventilator (Suresh et al. 2017), predict if a patient will die (Song et al. 2018) or be discharged (Zhang et al. 2020), and predict the onset of conditions like sepsis (Nemati et al. 2018), tachycardia (Liu et al. 2021), or hypotension (Yoon et al. 2020). Researchers detail the importance of this information to a patient’s health, but they do not provide evidence that clinicians currently find the use of their own expertise to make these predictions challenging, that clinicians make a high rate of errors, or that clinicians have expressed an explicit desire to know this prediction.

Technology follows a familiar adoption process. First, there is a need for technical capabilities. Once capabilities exist, there is a need to map these capabilities to situations that might benefit from them. Finally, there is a need for evidence that applying the new capability actually created a desirable benefit. Almost all AI innovation in healthcare is working on new capabilities. That is an important first step. But now, we need more work showing the capabilities actually map to authentic needs. When we have a body of work showing AI can address real needs, the next step will be deployment studies showing the new technology has real-world impact, that it improves health outcomes and lowers the cost of delivering care.

One driver of the disconnect between AI capability and clinical need is the fact that most AI innovations in health focus on reproducing the work of human decision-makers. These are often approachable problems from a machine learning perspective because the diagnosis (human decision) and selected treatment (human decision) are well documented. The labeled data exists for training a system. Most healthcare decisions like these are “textbook”; they are obvious to a clinical expert. And an AI system trained on this data will work the best for textbook cases. This is not where clinicians need help, unless the intent is to automate clinicians out of existence. Clinicians need the most help with the unusual cases, cases with high clinician uncertainty (Yang et al. 2016). Clinicians also need help with the social aspects of their work, with getting a team to work better together in order to benefit from the collective intelligence.

In the ICU, patients on a ventilator receive input from the Interventionist, the ICU nurse, and the Respiratory Therapists (RTs). The RTs will perform breathing tests on ventilated patients in the early morning. When the Interventionist arrives, they use the results of this test to decide if a patient should be extubated. But RTs and Interventionists do not always agree on who should get a breathing test, leading to situations where the doctor wants the results but the test has not been conducted. An AI system that tries to predict expert disagreement could raise this issue ahead of time, allowing the experts to make a decision before the window for decision making has closed. Situations like expert disagreement are only indirectly captured in EHRs, but they show real moments where clinicians would benefit from a machine prediction.

In our experience, AI researchers working in healthcare are most interested in working on difficult challenges. This helps researchers publish, as they can offer clear evidence that they have advanced the state of the art for AI and machine learning. However, by doing this, they most often overlook the low-hanging fruit, situations where a little bit of well-known AI might actually help accelerate or enhance clinical practice. In our ICU work, we noticed that clinicians must frequently input orders for new medication. This is not a difficult task, but it is a tedious task. As they type, they see a list of possible medications and doses they might be looking for, shown as an alphabetical list. Sometimes this helps. In examining this mundane, tedious task, we noticed that a list of medications ordered by frequency as opposed to alphabetically significantly reduced the task completion time by more than 50%. This is not a “sexy” innovation nor even a use of AI. But it does help to illustrate how the mundane labor of interaction with IT systems is largely ignored by data scientists and AI researchers.

Financial Viability

In our work on AI innovation in the enterprise, we have observed that the biggest barrier for getting a new AI capability off of the whiteboard and onto a product roadmap is the need for a strong business case. Software product managers want to know that the value of the innovation will be much higher than the development costs and the operational costs for the innovation. This can be challenging due to the near-monopoly held by the tiny number of EHR vendors and to the software development culture that has shifted to lean-agile, with a focus on making an MVP, the minimum viable product. Technical AI healthcare research never addresses how the advance will make money, how it will pay for itself. The work does not detail the changes needed to current data pipelines. It does not talk about the increased amount of computing required. It does not specify how this will save time, allowing clinicians to treat more people in the same amount of time, and it does not detail how clinicians might be able to charge more, because the quality of the decision-making should get better. The flow of money in healthcare is complicated, from patient to insurance company to the many intersecting clinicians delivering care. Unlike with consumer goods, a better product (healthcare decision) does not directly relate to increased demand or higher price. While this challenge may seem out of scope for technical AI innovation, it still constitutes a significant barrier to AI adoption compared to other industries.

EHRs are an expensive problem. Many hospitals will have 10 or more different systems that all seem designed not to interoperate with one another (Reisman 2017; Glaser 2020). Clinicians and healthcare providers are not software companies, yet they must constantly make large IT investments to make systems run and to get them to trade data with one another (He et al. 2019). Adding on an AI system in this environment is expensive in ways that are not the case in other industries. The development cost never really ends, as each time an EHR vendor offers a major upgrade, almost all of the additional code and enhancements a healthcare provider previously developed must be rewritten.

An additional financial barrier comes from the current culture of software development. With the rapid growth of agile development, software development has become much more risk averse, with development teams searching for clear evidence of value. Increasingly, teams are working toward defining and rapidly deploying an MVP that can produce clear evidence of its beneficial impact in weeks. This is fine for retail companies that might want to try out new personalization approaches, where they can deploy and run A/B studies within a few weeks to collect evidence of the positive impact of their innovation. This seems to be out of reach for almost any AI healthcare system that focuses on high-risk clinician decision-making. Healthcare is unprepared for A/B testing, and we are not suggesting this would be a good thing. Our point is that new employees coming into software development bring a mindset that is in conflict with the pace AI in healthcare will need.

Clinician Acceptance and Desire

Unfortunately, very little research documents the details of clinician decision-making and identifies the time and place an AI inference might be experienced as most valuable or explores which forms of AI output are most useful (Cai et al. 2019; Yang et al. 2016). It does seem like increased funding for the human-AI interaction aspects of AI in healthcare is one way this might be addressed. The lack of a human-centered approach in healthcare AI innovation is evident. Currently, the assumptions made by AI researchers working on healthcare systems often indicate a lack of understanding of clinical practice workflows (Topol 2019). Many if not most research systems are built with the assumption that clinicians recognize that they need help with a decision, and that in their “free time” the clinicians will walk up and use a separate IT system to get advice on what they should do (Yang et al. 2016). The reality is that clinicians have no free time and they are unsure when a smart system might help. Their main experience with clinical decision support systems mostly involves continuous, irrelevant alerts that distract them from their work, and that provide a negative orientation to spending time on the computer (Rajkomar, Dean, and Kohane 2019).

The lack of a human-centered approach to AI innovation means that many innovation avenues are under-investigated. For example, little work has been done to apply techniques such as business process mining to healthcare. This would help reveal what the actual standard of care is, and clinicians could use this type of insight on their own behaviors to better understand and identify areas for improvement. As we mentioned previously, predicting events like expert disagreement empowers human decision-makers to reflect and consider before they have committed to a path of action. Healthcare practice has many goals; for example, rounding often mixes goals of patient care and health worker training. More human-learning focused AI systems might capture the dialog and provide feedback to an attending physician on the quality of their rounding, such as waiting longer after asking a question and monitoring for implicit bias in who they ask questions of and who they compliment. Systems could also help with orchestration, the work of effectively coordinating work across the many experts. For example, in the ICU, a system might recommend an order for visiting patients during rounding based on the estimated time needed, the importance of early decision making for a patient (will they likely be extubated), and the physical layout of the rooms. There are many types of insights around human performance and processes that have been largely ignored by current technically focused research.

Conclusion

In this position paper, we elaborated on the interrelated challenges of feasibility, viability, and acceptance for moving clinical AI into the real world. These challenges will require thinking about not only the AI capabilities that are ripe for application, but also the business of healthcare, and the needs and desires of the frontline healthcare workers. We envision a future where researchers from AI, HCI, healthcare, design, and business research communities work together to take on these challenges. HCI and design researchers can focus on how technical advances in clinical AI might match the current needs and workflows of clinicians. AI and business researchers can work on low-hanging fruit: worker needs and desires, revealed by design research, that are likely to be solved with well-known AI capabilities and existing healthcare data and infrastructure. We invite clinical AI researchers to advance collaborative research practices, effectively bridging the gap between research communities.

Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. 2007501 and work supported by the National Institutes of Health (R35HL144804). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the National Institutes of Health.

References

Cai, C. J.; Winter, S.; Steiner, D.; Wilcox, L.; and Terry, M. 2019. “Hello AI”: Uncovering the Onboarding Needs of Medical Practitioners for Human-AI Collaborative Decision-Making. Proceedings of the ACM on Human-Computer Interaction 3(CSCW): 1–24.

Glaser, J. 2020. It’s time for a new kind of electronic health record. Harvard Business Review. June 12.

He, J.; Baxter, S. L.; Xu, J.; Xu, J.; Zhou, X.; and Zhang, K. 2019. The practical implementation of artificial intelligence technologies in medicine. Nature Medicine 25(1): 30–36.

Liu, X.; Liu, T.; Zhang, Z.; Kuo, P.-C.; Xu, H.; Yang, Z.; Lan, K.; Li, P.; Ouyang, Z.; Ng, Y. L.; et al. 2021. TOP-Net Prediction Model Using Bidirectional Long Short-term Memory and Medical-Grade Wearable Multisensor System for Tachycardia Onset: Algorithm Development Study. JMIR Medical Informatics 9(4): e18803.

Nemati, S.; Holder, A.; Razmi, F.; Stanley, M. D.; Clifford, G. D.; and Buchman, T. G. 2018. An interpretable machine learning model for accurate prediction of sepsis in the ICU. Critical Care Medicine 46(4): 547.

Rajkomar, A.; Dean, J.; and Kohane, I. 2019. Machine learning in medicine. New England Journal of Medicine 380(14): 1347–1358.

Reisman, M. 2017. EHRs: the challenge of making electronic data usable and interoperable. Pharmacy and Therapeutics 42(9): 572.

Seneviratne, M. G.; Shah, N. H.; and Chu, L. 2020. Bridging the implementation gap of machine learning in healthcare. BMJ Innovations 6(2).

Song, H.; Rajan, D.; Thiagarajan, J. J.; and Spanias, A. 2018. Attend and diagnose: Clinical time series analysis using attention models. In Thirty-second AAAI Conference on Artificial Intelligence.

Suresh, H.; Hunt, N.; Johnson, A.; Celi, L. A.; Szolovits, P.; and Ghassemi, M. 2017. Clinical intervention prediction and understanding using deep networks. arXiv preprint arXiv:1705.08498.

Topol, E. J. 2019. High-performance medicine: the convergence of human and artificial intelligence. Nature Medicine 25(1): 44–56.

Yang, Q.; Zimmerman, J.; and Steinfeld, A. 2015. Review of Medical Decision Support Tools: Emerging Opportunity for Interaction Design. IASDR 2015 Interplay Proceedings.

Yang, Q.; Zimmerman, J.; Steinfeld, A.; Carey, L.; and Antaki, J. F. 2016. Investigating the heart pump implant decision process: opportunities for decision support tools to help. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 4477–4488.

Yoon, J. H.; Jeanselme, V.; Dubrawski, A.; Hravnak, M.; Pinsky, M. R.; and Clermont, G. 2020. Prediction of hypotension events with physiologic vital sign signatures in the intensive care unit. Critical Care 24(1): 1–9.

Zhang, D.; Thadajarassiri, J.; Sen, C.; and Rundensteiner, E. 2020. Time-Aware Transformer-based Network for Clinical Notes Series Prediction. In Machine Learning for Healthcare Conference, 566–588. PMLR.