NLP for the Institute: Developing and Deploying an NLP Capability to Accelerate Cancer Research

Aaron Cohen
Oregon Health & Science University
Portland, Oregon, United States
cohenaa@ohsu.edu

Abstract— It has been well documented that a great deal of data useful for medical research is present in clinical narrative text. The OHSU Knight Cancer Institute has begun a program to create a natural language processing (NLP) capability to extract, store, and link data from free text sources at the patient level, and make this data available to researchers in a continuous, reusable, efficient and timely manner through services delivery from the Translational Research Hub (TRH). This talk will present the challenges, progress, and future goals of our program to build NLP capabilities that can help us use free text from the EHR to first support the transformation of cancer research with the hopes of positively impacting clinical care in the future.

Keywords— Text mining; Cancer research; Translational medicine

There is perhaps less discussion about how often what was structured data at its origin has become inaccessible except in free text form. This problem is further compounded in tertiary care institutions, like the OHSU Knight Cancer Institute, where the entire history of a referred patient's condition may only be present in the electronic health record (EHR) as free text.

At the same time, future medical advances, such as in cancer research, will require much more complete patient data than has been previously available. Such advances include the discovery of new cures, expanding early detection, and realizing the promise of precision medicine. Phenotype description and outcome characterization are two areas in particular where text sources could greatly supplement our current data.