Proceedings of the 9th International Conference on Biological Ontology (ICBO 2018), Corvallis, Oregon, USA
ICBO 2018 August 7-10, 2018

TOCSOC: A temporal ontology for comparing the survival outcomes of clinical trials in oncology

Deendayal Dinakarpandian, Michaela Liedtke, Mark A. Musen
Department of Medicine, Stanford University
Stanford, CA
dinakar@stanford.edu

Bhavish Dinakar
College of Chemistry, University of California, Berkeley
Berkeley, CA

Abstract: The outcome of clinical trials for cancer is typically summarized in terms of survival. However, different trials for the same disease may use different measures of survival, or use differing vocabulary to refer to the same outcome measure. This makes it harder to automate an objective comparison of treatments. We propose a temporal ontology of survival outcome measures that a) helps to standardize the vocabulary for reporting survival outcomes and b) makes it possible to automatically rank the relative efficacy of different treatments. The approach is illustrated with examples from the oncology literature. The temporal ontology and the accompanying reasoner are freely available on GitHub (https://github.com/pdddinakar/TOCSOC).

Keywords: temporal ontology; survival outcome; oncology; clinical trials; reasoning

I. INTRODUCTION

The outcome of clinical trials for cancer is often summarized in terms of survival. This may be a rate, for example a 5-yr survival of 50%, or a duration, for example a median survival time of 4 years. Ideally, if all potential treatments for a specific cancer were compared in terms of a common metric, it would be straightforward to rank them in terms of their effectiveness. In reality, clinical trials often use a wide variety of survival outcome measures. The scientific, ethical and pragmatic reasons for this heterogeneity are listed below:

A. Variation in study design. Long term studies may use survival measures over longer periods of time than short term studies.

B. Differences in life expectancy. Life expectancy after diagnosis varies greatly among cancers. For instance, the 5-yr survival rate for malignant melanoma exceeds 90% but is less than 20% for lung cancer (1). Thus, studies to improve treatment might look at longer time periods for melanoma than for lung cancer.

C. Tracking disease control. For cancers that are incurable, the pragmatic goal is sometimes to retard their progress. In such cases, progression-free survival rather than measures of mortality may be used as a metric to capture phases of stable disease.

D. Consolidating gains in therapy. In contrast to incurable cancers, the availability of highly effective treatments for some cancers makes it possible to induce longer periods of remission (potentially a cure) where there is no evidence of disease. Rather than measures of mortality, measures like disease-free survival are useful in such cases.

E. Limited recruitment and retention in studies. Patients are prone to drop out of studies, particularly in cancer. Progressive attrition of participants sometimes forces investigators to use short term measures to report outcomes rather than wait for the originally planned longer term measures. For example, 2 or 3 yr. survival statistics might be reported instead of 5 yr. statistics.

F. Early termination on ethical grounds. If a therapy is highly successful compared to standard therapy, a decision to terminate the study and publish early might be made. Conversely, if the treatment itself causes unacceptable harm to trial participants, the trial may be terminated prematurely. In both cases, measures of shorter term survival may be included in the corresponding publication.

Even when the same survival measure is used, different studies use different terms to refer to the same concept, and different papers use the same term to refer to differing outcome measures. Oncologists typically use their expert knowledge to resolve these ambiguities and evaluate the relative merits of different therapies. This could be in the context of drafting best practice guidelines or for individualized patient care.

This paper proposes the use of a temporal ontology of terms for summarizing the results of clinical trials in oncology. The use of an ontology can reduce the ambiguity in specifying results. Additionally, the inclusion of temporal relationships within the ontology can help partially automate the comparison between treatments whose effectiveness has been summarized with different but related measures. We first describe the source of the vocabulary and the process used to create the temporal ontology. This is followed by a description of the reasoning used to rank treatments for a specific cancer. We give examples from real world data and conclude with a discussion of limitations and future plans.

II. CREATION OF THE TEMPORAL ONTOLOGY

Overall survival (OS) is a commonly used measure of the effectiveness of cancer therapy. It is defined as the length of time from either the date of diagnosis or the start of treatment that patients are still alive (2). In other words, even such a commonly used term has two different interpretations, a distinction that is obvious only to a human reader. We searched the BioPortal (3) collection of ontologies for a perfect match to the term “Overall survival.” The following four independent resources include OS as a term: “National Cancer Institute Thesaurus (NCIT) (4),” “Experimental Factor Ontology (EFO) (5),” “Cancer Care: Treatment Outcome Ontology (CCTOO) (6)” and “Interlinking Ontology for Biological Concepts (IOBC) (7).” As CCTOO (6) is specific to cancer treatment, we selected this ontology for further exploration.

Out of a total of 1133 terms in the ontology, we found 35 terms (first column in Table 1) containing the token “survival,” which were scattered throughout the ontology. CCTOO is based on IS_A and IS_ASSESSED_BY relationships between terms. In contrast, our goal was to create a temporal ontology with the relationship NOT_GREATER_THAN (NGT) between the terms. The rationale for this is that many events in cancer outcomes that usually precede one another could also be simultaneous. For example, though several symptoms (events) of cancer may not be fatal, the timing of some symptoms may coincide with death.

An exhaustive approach to determine if an NGT relationship exists between every pair of the 35 terms would require 595 comparisons. To do this more efficiently, we first sorted the terms based on their suffixes to group related concepts together: the terms were reversed, sorted based on the reversed strings, and reversed again to obtain the original terms. This procedure resulted in a sorted list of terms (second column in Table 1), such that neighboring terms sharing suffixes were more likely to have a temporal relationship with each other. For example, the first five terms in the second column in Table 1 are all survival rates, and all types of “Progression-free survival” are grouped together.

TABLE I.

TERMS FROM CCTOO                            SUFFIX SORTED TERMS
Distant recurrence-free survival            Disease free survival rate
Biochemical relapse-free survival           Relapse-free survival rate
Long term survival                          Progression-free survival rate
Local relapse-free survival                 Event-free survival rate
Event-free survival rate                    Overall survival rate
Invasive disease-free survival              Breast cancer specific survival
Failure-free survival                       Disease-specific survival
Metastasis-free survival                    Prostate cancer-specific survival
Overall survival rate                       Regional recurrence free survival
Treatment-free survival                     PSA progression free survival
Distant failure-free survival               Symptomatic skeletal event free survival
Locoregional failure-free survival          Recurrence-free survival
PSA progression free survival               Local recurrence-free survival
Overall survival                            Distant recurrence-free survival
Disease-specific survival                   Failure-free survival
Progression-free survival                   Locoregional failure-free survival
Symptomatic skeletal event free survival    Distant failure-free survival
Local progression-free survival             Disease-free survival
Distant disease-free survival               Invasive disease-free survival
Immune-related progression-free survival    Biochemical disease-free survival
Radiographic progression-free survival      Distant disease-free survival
Relapse-free survival                       Relapse-free survival
Progression-free survival rate              Biochemical relapse-free survival
Event-free survival                         Local relapse-free survival
Disease-free survival                       Progression-free survival
Clinical progression-free survival          Radiographic progression-free survival
Local recurrence-free survival              Immune-related progression-free survival
Regional recurrence free survival           Biochemical progression-free survival
Biochemical progression-free survival       Clinical progression-free survival
Prostate cancer-specific survival           Local progression-free survival
Relapse-free survival rate                  Metastasis-free survival
Disease free survival rate                  Treatment-free survival
Recurrence-free survival                    Event-free survival
Biochemical disease-free survival           Overall survival
Breast cancer specific survival             Long term survival

The sorted terms were manually checked and arranged into a hierarchical list, where each indent corresponds to the NGT relationship. Since definitions were missing for most of the CCTOO terms, we referred to the following resources, in order, to establish and add the meanings of the terms: the NCI dictionary (2), the NCI Outcome Measures Glossary (8, 9), the DATECAN initiative (10) and finally PubMed (11) searches for papers containing the terms. We edited the hierarchy based on the following criteria:

A. Highly specific terms were removed, e.g., Breast cancer specific survival. Since the intended use of the proposed temporal ontology is in the context of a specified disease, it is redundant to explicitly include disease names in the names of survival measures.

B. Synonyms were merged together, e.g., “Disease-free survival” was chosen as the canonical term for “Relapse-free survival.”

C. Ambiguous terms not useful for comparing durations or rates were removed, e.g., Long term survival. Since the time duration is expected to be explicitly stated in summarizing an outcome, “Long term survival” is not a useful concept to standardize.

D. A clear distinction between period and rate was made. It is common practice in publications to use the term “survival” to refer to both a duration of time, e.g., median survival time, and a rate, e.g., proportion alive after a period of time has elapsed. The reader has to infer this from the context. However, this distinction needs to be explicit in an ontology. Therefore, we added the suffix “time” to all terms to indicate the first interpretation and the suffix “rate” to all terms to indicate the second interpretation.

E. Missing terms were added, e.g., only 5 ‘rate’ terms were present in CCTOO. A corresponding ‘rate’ term was created for each ‘time’ term.

The resulting temporally related hierarchy contains 44 terms related by NOT_GREATER_THAN (NGT) relationships.
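The suffix-sorting step used earlier in this section to group related terms can be sketched as follows. This is a minimal illustration in Python, not the authors' code: the names `CCTOO_SURVIVAL_TERMS` and `suffix_sort` are hypothetical, the input is the first column of Table 1, and Python's default code-point string ordering is assumed. Under that assumption the five 'rate' terms sort to the front, which is consistent with the second column of Table 1.

```python
from math import comb

# The 35 CCTOO terms containing "survival" (first column of Table 1).
CCTOO_SURVIVAL_TERMS = [
    "Distant recurrence-free survival", "Biochemical relapse-free survival",
    "Long term survival", "Local relapse-free survival",
    "Event-free survival rate", "Invasive disease-free survival",
    "Failure-free survival", "Metastasis-free survival",
    "Overall survival rate", "Treatment-free survival",
    "Distant failure-free survival", "Locoregional failure-free survival",
    "PSA progression free survival", "Overall survival",
    "Disease-specific survival", "Progression-free survival",
    "Symptomatic skeletal event free survival",
    "Local progression-free survival", "Distant disease-free survival",
    "Immune-related progression-free survival",
    "Radiographic progression-free survival", "Relapse-free survival",
    "Progression-free survival rate", "Event-free survival",
    "Disease-free survival", "Clinical progression-free survival",
    "Local recurrence-free survival", "Regional recurrence free survival",
    "Biochemical progression-free survival",
    "Prostate cancer-specific survival", "Relapse-free survival rate",
    "Disease free survival rate", "Recurrence-free survival",
    "Biochemical disease-free survival", "Breast cancer specific survival",
]


def suffix_sort(terms):
    """Reverse each term, sort the reversed strings, then reverse back."""
    reversed_sorted = sorted(term[::-1] for term in terms)
    return [term[::-1] for term in reversed_sorted]


# Exhaustive pairwise checking of 35 terms needs 35 * 34 / 2 = 595 comparisons.
n_pairs = comb(len(CCTOO_SURVIVAL_TERMS), 2)

sorted_terms = suffix_sort(CCTOO_SURVIVAL_TERMS)
print(n_pairs)
for term in sorted_terms[:5]:
    print(term)
```

Sorting on reversed strings is a cheap way to bring terms with a shared suffix next to each other, so that only neighboring terms need a manual check for an NGT relationship.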
The 44 terms comprise 22 concepts, each expressed as both a duration (Fig. 1) and a rate (only the first few rate rows are shown at the bottom of the figure for brevity). The full version is available as an OWL file created with the help of Protégé (12). While the distinction between rate and time may be clear to a human reader from the context, it is necessary to separate these concepts for machine interpretation. Also, since the motivating goal is to compare treatments, definitions of the concepts [...]

Note that the terms “Overall survival time (OS)” and “Disease-specific survival time (DSS)” are in bold on the far right as the deepest concepts. These refer to the longest periods. All terms are NGT DSS, and OS is NGT DSS. This is because OS is agnostic of health or treatment status, while DSS is longer because it excludes deaths from causes unrelated to the disease or its treatment. At the other extreme, “Treatment-free survival time” has the shortest duration and has an NGT relationship with all terms; cancer is likely to return earliest when all treatments, including maintenance, are discontinued. The final hierarchy was checked for accuracy by author M.L., who is an oncologist.

Fig. 1. The TOCSOC temporal hierarchy (terms in figure order; the indentation that encodes NGT nesting is not reproduced here).

Treatment-free survival time
Failure-free survival time
Distant failure-free survival time
Regional failure-free survival time
Local failure-free survival time
Disease-free survival time
Event-free survival time
Invasive disease-free survival time
Symptomatic skeletal event-free survival time
Biochemical disease-free survival time
Recurrence-free survival time
Distant recurrence-free survival time
Regional recurrence-free survival time
Local recurrence-free survival time
Locoregional recurrence-free survival time
Progression-free survival time
Radiographic progression-free survival time
Biochemical progression-free survival time
Clinical progression-free survival time
Local progression-free survival time
Overall survival time
Disease-specific survival time

Treatment-free survival rate
Failure-free survival rate
...

III. TEMPORAL REASONING FOR TREATMENT COMPARISON

The temporal ontology shown in Fig. 1 may be interpreted as longer time durations from left to right. This temporal ordering of types of survival outcomes can be exploited based on the following key TOCSOC reasoning principle:

Consider treatments T1 and T2 with respective outcome measures O1 and O2, such that O1 has an NGT relationship with O2. If the observed value of O1 is at least as large as the value of O2, then T1 is likely better than T2.

We present several representative cases below to illustrate specific scenarios of reasoning derived from the general TOCSOC rule.

A. Identical measure with different values. If treatments x and y have overall survival times (often reported as medians) of 5 and 6 years respectively, then it is trivial to conclude that y is better than x. Now consider a treatment p for the same type of cancer where the group was followed for only 5 years, at which point more than half the subjects were still alive. This is usually referred to as median not reached, implying that the overall survival for this group is greater than 5 years. This implies that p is likely better than x, but not guaranteed to be better than y.

B. Measures of same type but differing in duration. If treatment x results in a 5-yr OS rate of 80% while treatment y results in a 4-yr OS rate of 70%, then x is better than y: the proportion alive cannot increase over time, so the 5-yr OS rate for y is at most 70%.

C. Temporally related measures. This is the specific scenario that TOCSOC was envisioned to handle. If treatment x results in a (median) progression-free survival (PFS) of 5 years and treatment y results in a median OS of 4 years, then x has an OS of at least 5 years (inferred from TOCSOC, since PFS is NGT OS) and is therefore better than y.

D. Comparing rates with periods. When available, survival times should be compared with survival times and rates with rates. However, it may sometimes be necessary to compare rates with times. This is possible to a limited extent. Measures that end with “survival time” are typically the median survival time within a group. For example, if 4 subjects with treatment x have survival times {1,2,4,5}, then the (median) survival time with treatment x is 3 years. This may be interpreted as a survival rate of 50% at 3 years. To be strictly correct, this corresponds to a survival rate of at most 50%, since the median for survival times {1,3,3,3} is also 3, even though this is also the maximum survival time; there are no survivors past 3 years.

E. Replicate measures. Different studies may report different outcomes for the same treatment. One option to deal with this situation is to use an average value for each treatment that is weighted by the size of the replicate studies. Another option is to compare treatments based on a bounded range of reported performances, though this is likely to underestimate the difference between treatments.

F. Indeterminable comparisons. Sibling terms (successive terms at the same level of indentation in Fig. 1) are incomparable by definition. For example, “Biochemical progression-free survival time” may be greater than “Clinical progression-free survival time” in some individuals, but the other way around in others. Even when comparable, it is hard to reach a conclusion if one treatment has an OS of 90% at 2 years and an alternative treatment has a PFS of 50% at 4 years. A plethora of data can also paradoxically lead to an inconclusive result. If multiple metrics are available for each treatment, then rankings might be different or even reversed based on the choice of metric. The pragmatic strategy for this is to report all rankings along with the rationale, thus serving more as an objective summary of evidence than a ranker.

Based on the above considerations, we implemented a reasoner that takes a temporal ontology and a set of treatments with corresponding survival outcomes as input, and outputs a ranking of treatments. The survival outcome input is specified as either a rate (time period of observation and proportion) or a duration (survival time). The temporal ontology is represented internally as a directed acyclic graph in an adjacency matrix. A second directed graph is created corresponding to the ranking of treatments. In silent mode, only unambiguous rankings are returned. In verbose mode, undeterminable rankings (cycles in the graph) are also included in the output. Since the ontology is read dynamically, the reasoner can be used with alternate versions of ontologies based on NGT relationships.

IV. ILLUSTRATIONS FROM LITERATURE

Consider the results of two treatments (the exact details are not relevant) for high risk multiple myeloma (HRMM) shown in the table below:

Trial Reference   Treatment   Disease   Metric      Value
(13)              A           HRMM      OS 5-yr     55%
(14)              AA+B        HRMM      OS 4-yr     54%

Since treatment AA+B has a 4-yr OS that is lower than the 5-yr OS for treatment A, its 5-yr OS can be at most 54%, and it cannot be better than treatment A.

Now consider the following comparison of treatment AA+B with AAsib that exploits the structure of TOCSOC. The observed outcome for AAsib corresponds to 50% OS at 4.25 years. Since the 4-yr PFS for AA+B is 52%, we can conclude that the 4-yr OS for AA+B is at least 52%, and likely considerably higher (OS is typically well above PFS), and that AA+B is therefore likely better than AAsib.

Trial Reference   Treatment   Disease   Metric      Value
(14)              AA+B        HRMM      PFS 4-yr    52%
(15)              AAsib       HRMM      Median OS   4.25 yrs.

V. LIMITATIONS & FUTURE PLANS

We have shown the value of recasting an existing ontology into one based on temporal relationships for comparing the effectiveness of different treatments for cancer. This can help rank different treatments for each cancer, especially as multiple new treatments are increasingly becoming available for several cancers. However, it is important to acknowledge that this approach only ranks treatments; it is far from a treatment ‘recommender.’ Several other considerations often drive choice of therapy. A treatment with a shorter survival time may be selected for reasons of toxicity, cost or patient age. A treatment that is better at preventing distant recurrences than local recurrences may be preferred. The result of comparing a set of treatments may not be valid because of heterogeneity of the underlying disease. Despite diligent efforts to conduct randomized clinical trials, study populations often turn out to contain a mixture of cancers at the molecular level. To improve the rationale of decision making, advances in disease subtyping also need to be taken into account. Each study is likely to have selection biases, both known and unknown, in its choice of subjects. While treatment outcomes are often summarized as an average estimate of effectiveness, it is important to take into account the confidence intervals of estimates when comparing them. Further, expanded individual profiles are likely to be taken into account in the era of personalized and molecular medicine.

The present study could be improved in terms of both the ontology employed and the power of the reasoner. This paper restricted itself to using terms from a pre-existing ontology in the useful but narrow perspective of ‘survival.’ As medical care improves to the point where many more cancers are curable, temporal metrics for the quality of life are likely to become more important. Further, different types of cancer may use specialized metrics to evaluate outcomes. As terms are used more consistently in the literature, more precise temporal relationships could be used. While using a detailed temporal ontology like the W3C OWL Time Ontology (16) would be overkill, it would be helpful to add a few more relationships, e.g., STRICTLY_LESS_THAN could be added where applicable. As such, the first version of TOCSOC is best viewed as an upper ontology. More terms can be incorporated by mining trials registered at sites like “clinicaltrials.gov” for primary and secondary endpoints that have temporal dependencies, some of which may be specific only to certain cancers.

The reasoner is currently conservative in being largely deterministic; it could be enhanced by a Bayesian mode that takes into account prior distributions of the outcomes as well as the temporal relationship between them. Instead of point estimates, full distributions could be taken into account to combine multiple weak signals into more robust evidence for rankings.

ACKNOWLEDGMENTS

This work was conducted using the Protégé resource, which is supported by grant GM10331601 from the National Institute of General Medical Sciences of the United States National Institutes of Health.

REFERENCES

1. Surveillance, Epidemiology, and End Results Program [Internet]. Available from: https://seer.cancer.gov/.
2. NCI Dictionary of Cancer Terms [Internet]. Available from: https://www.cancer.gov/publications/dictionaries/cancer-terms.
3. Whetzel PL, Noy NF, Shah NH, Alexander PR, Nyulas C, Tudorache T, et al. BioPortal: enhanced functionality via new Web services from the National Center for Biomedical Ontology to access and use ontologies in software applications. Nucleic Acids Res. 2011;39(Web Server issue):W541-5.
4. NCI Thesaurus [Internet]. Available from: https://ncit.nci.nih.gov/ncitbrowser/.
5. Malone J, Holloway E, Adamusiak T, Kapushesky M, Zheng J, Kolesnikov N, et al. Modeling sample variables with an Experimental Factor Ontology. Bioinformatics. 2010;26(8):1112-8.
6. Lin FP-Y, Groza T, Kocbek S, Antezana E, Epstein RJ. The Cancer Care Treatment Outcomes Ontology (CCTO): A computable ontology for profiling treatment outcomes of patients with solid tumors. Journal of Clinical Oncology. 2017;35(15_suppl):e18137-e.
7. Kushida T, Kozaki K, Tateisi Y, Watanabe K, Masuda T, Matsumura K, et al., editors. Efficient construction of a new ontology for life sciences by subclassifying related terms in the Japan Science and Technology Agency thesaurus. International Conference on Biomedical Ontology; 2017.
8. National Cancer Institute. Outcome Measures Glossary. Available from: https://wiki.nci.nih.gov/display/CRF/Outcome+Measures+Glossary.
9. Punt CJ, Buyse M, Kohne CH, Hohenberger P, Labianca R, Schmoll HJ, et al. Endpoints in adjuvant treatment trials: a systematic review of the literature in colon cancer and proposed definitions for future trials. J Natl Cancer Inst. 2007;99(13):998-1003.
10. Gourgou-Bourgade S, Cameron D, Poortmans P, Asselain B, Azria D, Cardoso F, et al. Guidelines for time-to-event end point definitions in breast cancer trials: results of the DATECAN initiative (Definition for the Assessment of Time-to-event Endpoints in CANcer trials). Ann Oncol. 2015;26(5):873-9.
11. PubMed [Internet]. Available from: https://www.ncbi.nlm.nih.gov/pubmed/.
12. Musen MA, Protege T. The Protege Project: A Look Back and a Look Forward. AI Matters. 2015;1(4):4-12.
13. Bjorkstrand B, Iacobelli S, Hegenbart U, Gruber A, Greinix H, Volin L, et al. Tandem autologous/reduced-intensity conditioning allogeneic stem-cell transplantation versus autologous transplantation in myeloma: long-term follow-up. J Clin Oncol. 2011;29(22):3016-22.
14. Green DJ, Maloney DG, Storer BE, Sandmaier BM, Holmberg LA, Becker PS, et al. Tandem autologous/allogeneic hematopoietic cell transplantation with bortezomib maintenance therapy for high-risk myeloma. Blood Adv. 2017;1(24):2247-56.
15. Giaccone L, Storer B, Patriarca F, Rotta M, Sorasio R, Allione B, et al. Long-term follow-up of a comparison of nonmyeloablative allografting with autografting for newly diagnosed myeloma. Blood. 2011;117(24):6721-7.
16. W3C. Time Ontology in OWL. 2017. Available from: https://www.w3.org/TR/owl-time/.
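
As a supplementary illustration of the TOCSOC reasoning principle in Section III, the following minimal Python sketch applies the rule to survival durations. It is not the authors' reasoner. The abbreviations (TFS for treatment-free survival time, PFS, OS, DSS), the function names, and the three NGT edges are assumptions chosen for illustration; the edges are a small subset taken from statements in the text (treatment-free survival is NGT all terms, PFS is treated as NGT OS in case C, and OS is NGT DSS). The NGT relation is held in an adjacency matrix and closed transitively, in the spirit of the internal representation described in Section III.

```python
from itertools import product

# Illustrative subset of the hierarchy (assumption: only edges stated in the text).
TERMS = ["TFS", "PFS", "OS", "DSS"]
NGT_EDGES = [("TFS", "PFS"), ("PFS", "OS"), ("OS", "DSS")]


def ngt_closure(terms, edges):
    """Return (index map, adjacency matrix) for the reflexive-transitive
    closure of NGT, computed with Warshall's algorithm."""
    idx = {term: i for i, term in enumerate(terms)}
    n = len(terms)
    reach = [[i == j for j in range(n)] for i in range(n)]
    for a, b in edges:
        reach[idx[a]][idx[b]] = True
    # product() varies its first element slowest, so k is the outer loop.
    for k, i, j in product(range(n), repeat=3):
        if reach[i][k] and reach[k][j]:
            reach[i][j] = True
    return idx, reach


def likely_better(t1, t2, idx, reach):
    """TOCSOC rule for durations: t1 = (measure, years) is likely better
    than t2 if measure1 is NGT measure2 and value1 >= value2."""
    (m1, v1), (m2, v2) = t1, t2
    return reach[idx[m1]][idx[m2]] and v1 >= v2


def compare(t1, t2, idx, reach):
    """Return 'first', 'second', or None when the ranking is indeterminable
    (neither direction holds, or both hold, which would form a cycle)."""
    forward = likely_better(t1, t2, idx, reach)
    backward = likely_better(t2, t1, idx, reach)
    if forward and not backward:
        return "first"
    if backward and not forward:
        return "second"
    return None


idx, reach = ngt_closure(TERMS, NGT_EDGES)
case_a = compare(("OS", 5), ("OS", 6), idx, reach)   # same measure
case_c = compare(("PFS", 5), ("OS", 4), idx, reach)  # temporally related
unclear = compare(("OS", 5), ("PFS", 4), idx, reach)  # no inference possible
print(case_a, case_c, unclear)
```

Returning None rather than forcing an order mirrors the conservative behavior described for the reasoner, where undeterminable rankings are reported only in verbose mode. Rates would need a separate comparison that also accounts for the observation period, which this sketch does not attempt.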