=Paper=
{{Paper
|id=Vol-1692/paperC
|storemode=property
|title=Challenges and opportunities for system biology standards and tools in medical research
|pdfUrl=https://ceur-ws.org/Vol-1692/paperC.pdf
|volume=Vol-1692
|authors=Matthias König,Anika Oellrich,Dagmar Waltemath,Richard J. B. Dobson,Tim J. P. Hubbard,Olaf Wolkenhauer
|dblpUrl=https://dblp.org/rec/conf/odls/KonigOWDHW16
}}
==Challenges and opportunities for system biology standards and tools in medical research==
Matthias König (1,†), Anika Oellrich (2,†), Dagmar Waltemath (3,†,∗), Richard JB Dobson (2,4), Tim JP Hubbard (5) and Olaf Wolkenhauer (3)

(1) Humboldt University Berlin, Institute for Theoretical Biology, D-10115 Berlin, Germany; (2) King's College London, IoPPN, London, SE5 8AF, UK; (3) University of Rostock, Department of Systems Biology and Bioinformatics, D-18051 Rostock, Germany; (4) UCL Institute of Health Informatics, Farr Institute of Health Informatics Research, University College London, London, WC1E 6BT, UK; (5) King's College London, Department of Medical & Molecular Genetics, London, SE1 9RT, UK

† Authors contributed equally. ∗ To whom correspondence should be addressed.

===ABSTRACT===
Kinetic models are increasingly relevant in medical research. In systems biology, more than 10 years of experience exists with the development of standards and tools to construct and analyse kinetic models. This has supported the sharing of kinetic models, increased their reuse, and thereby helped to reproduce and validate scientific results. Given this expertise, it seems natural to consider the application and development of standards and tools to meet the requirements of medical scientists.

In this paper, we discuss challenges and opportunities for standards and tools from systems biology in medical research, and we suggest criteria for the safe use of simulations. We conclude that standards, tools and infrastructure need to be extended to ensure the quality, reliability and safety required when working with medical and patient data. This will foster the adoption of modelling in the clinic, providing tools for improved diagnosis, prognosis and therapy.

Contact: dagmar.waltemath@uni-rostock.de

===1 INTRODUCTION===
In modern medicine, technologies complement conventional clinical data with molecular and genetic information. Patient-specific molecular profiling provides opportunities for earlier diagnosis, more accurate prognoses and optimised therapeutic decisions [1]. The data generated from these new technologies have led to a rise of computational approaches in medicine [2].

'Personalised Medicine' and 'Systems Medicine' are two terms that are frequently used to capture this trend for interdisciplinary approaches in which clinical research, molecular and cell biology, medical informatics, bioinformatics, biostatistics and systems biology approaches join forces. Personalised medicine uses marker-assisted diagnosis and targeted therapies derived from an individual's molecular profile and patient data [3]. Systems medicine aims to bring computational models closer to the clinic to shed light on the dynamic complexity of human physiology and disease [4]. In this context, the focus has been on the modelling of phenomena where an understanding of processes (kinetics) is crucial. This includes the response of cells, tissues and organs to drugs [5]; the simulation of disease progression [6]; and the understanding of mechanisms, as opposed to just predicting outcomes [7].

With new technologies available to provide the data to identify and characterise disease-relevant components, there is an increasing demand for methodologies that enable us to study the interactions of molecular and cellular components in a patient. Arguably, the success of systems and personalised medicine then relies on the application of kinetic models in the clinic [8].

The construction of such models requires an integration of clinical and patient-specific molecular data with public databases such as Ensembl [9] and ENCODE [10]. This process effectively brings together the two worlds of basic research and clinical practice. For this union to succeed, ontologies will play a crucial role. Standards to encode information, together with ontologies to unambiguously characterise domain knowledge, form the basis for the development of tools that can analyse kinetic models. These tools in turn support the sharing and reuse of models, which is also a means to validate results and generally improve reproducibility in medical research.

Here we illustrate the challenges that need to be overcome in future work to achieve trustworthy systems that can be integrated easily into a clinical environment. The remaining sections of this paper are structured as follows. In section 2, we outline the challenges that exist when planning to use state-of-the-art systems biology tools and standards in clinical environments. Following on from that, in section 3 we suggest criteria that address these challenges and need to be taken into consideration when building clinically applicable solutions. We present a summary of our findings in the last section of this paper.

===2 CHALLENGES IN APPLYING SYSTEMS BIOLOGY STANDARDS AND TOOLS===
====2.1 Access to clinical data====
Almost no clinical data sets are available for integration with models, nor are the existing data sets sufficiently documented in a formalised manner. Consequently, the process of selecting clinical data for a given model (and vice versa) is hindered. This is partly due to patient data being sensitive, which limits its accessibility for analysis, but mainly due to missing incentives, guidelines and requirements to provide data access upon publication of clinical studies.

Clinical data sets are required for testing as well as prediction purposes. While a recurring complaint is the lack of suitable data sets to test a model with, this problem is hard to overcome given that patient data needs to be secured against unauthorised access at all times or anonymised in a proper manner. Some efforts, such as the 100,000 genome project conducted by Genomics England (https://www.genomicsengland.co.uk/the-100000-genomes-project/) and the openEHR project (http://www.openehr.org), aim to provide access to structured, semantically annotated clinical data for research purposes. However, the amount of available data is still too limited to test models and computational simulations reliably.

In practice, most research data are neither shared nor recycled outside the original project team [11]. Models are instead being developed and used within a single clinic, e.g. by collaborative projects that incorporate clinical research groups and computational biology groups located in the same institution. In these settings, however, modelling has already been applied successfully, for example to study melanoma resistance to immunotherapy [12].

====2.2 Good quality models and documentation====
In addition to relevant clinical data being accessible, it must be represented in a way that it can be integrated and interpreted by both humans and machines. This requires a dialogue not only between healthcare providers and researchers, but also with staff recording the data and policy makers regulating patient data records.

Currently, the majority of published models are not available in standard formats, and the model quality is not sufficiently documented. While promoting the reuse of such virtual experiments would vastly improve the usefulness and relevance of computational models in biomedical endeavours [13], even the computational code underlying a model is often inaccessible. Without the ability to reproduce the models, however, they cannot be exploited for clinical use. SED-ML is a standard for the encoding of simulation setups, the specification of possible parametrisations and the definition of analyses [14]. However, SED-ML to date encodes only a subset of the experiments performed in clinical research. Further extensions are needed in the standard itself.

In addition, available models are not fully annotated, i.e. descriptions of model components and parameters are missing, hindering interpretation and integration with other models and clinical data. Model provenance information is not kept, leading to misinterpretations and even irreproducibility of the original findings. Ongoing efforts such as curation processes in BioModels (http://www.ebi.ac.uk/biomodels-main/), or the provision of fully reproducible archives of virtual experiments in the Physiome Model Repository [15] or in the JWS Online database [16], improve this situation. However, curation is very slow due to the manual labour involved and is seldom performed after a model has been published. Moreover, concerted efforts for model validation, annotation, and conversion into computable formats are lacking.

====2.3 Standardised representation of models and data====
The systems biology community developed a set of interoperable standards for modelling in biology, including the Systems Biology Markup Language (SBML), CellML, Synthetic Biology Open Language (SBOL), NeuroML, Simulation Experiment Description Markup Language (SED-ML), and BioPAX [17]. As a consequence, sharing and/or integrating models within communities is feasible. However, model reuse across communities can be challenging, as different standards are used for the representation and annotation of the data.

Even within communities, there is no consensus on which ontologies to use for data and model representation. Nor is it defined to which degree of detail models and data need to be annotated, creating further obstacles to integrating models for simulation purposes. Extensive cross-domain initiatives are required to take decisions on ontologies and standards that are not only convenient for model developers, curators and researchers, but that are also practical (implementation, costs, etc.) in a clinical application scenario.

====2.4 Validated predictions in a clinical context====
A major hurdle for the translation of computational models into medical research is the difficulty of proving the efficiency and predictive value of the model. Every recommendation determined by a clinical decision support system needs to be in line with the policies for medical care providers as issued by the health authorities in the respective country. In order to prove health economic efficiency, extensive, potentially double-blinded, clinical trials are required that compare model-based treatment decisions with unsupported decisions by clinical staff. These clinical trials have to span all areas of clinical application, i.e. cover different types of diseases as well as ranges of treatments and patients in differing health conditions, to assess clinical safety. Every in silico model provides an estimation of pathological processes and therefore naturally contains errors. These errors can potentially lead to wrong treatment decisions, which is why great care needs to be taken when transferring systems biology models, standards and tools into clinical practice.

Sustained software support is equally important. Software libraries for standards should be stable and well-tested, and they should support the complete standard correctly. Such implementations will facilitate the update of standards by the community and tool developers and thus provide shareable data and models.

===3 CRITERIA FOR REUSABLE SIMULATION MODULES AND SEMANTIC DATA===
The reproducibility and reusability of models and model-based results have been discussed in several essays over the past years [8, 18]. One conclusion of these essays is that the reusability of simulation models needs to be ensured before computational models can be considered for predictive processes in the clinic. Four important aspects that determine reusability are discussed in the following subsections.

====3.1 Semantic annotation via biomedical ontologies====
An essential step to ensure reusability of models is a thorough semantic annotation to biomedical ontologies. An ontology formally defines concepts and relations between concepts in a knowledge domain [19]. In the context of this paper, semantic annotation describes the process of linking the entities and processes of a model to terms in relevant ontologies. These semantic descriptions allow researchers and tools alike to describe the data used in experimental studies and models. They enable not only the integration of different types of data but also reasoning over the data, thus connecting data items (or models) to existing knowledge. Systems biology established a system for semantic annotations of models, using RDF together with standardised relationships [20] and resource identifiers [21]. Recently, composite annotations have been proposed as a means to provide exact descriptions of the model entities [22].

In order to implement models in the clinic, the systems biology data must be linked to biomedical data, biomedical measurements and personalised patient data. An integration on the syntactical level is not expressive enough to allow for automatisation, but integration on the semantic level holds the promise of overcoming this limitation. Figure 1 illustrates the necessary steps for the semantic integration of patient data, computational models, and external data for the benefit of patients and clinical staff.

Figure 1. A) Illustration of the integration process of computational models and data from different sources. The integration strongly relies on the availability and detail of the ontologies used for the semantic annotations. User interfaces need to provide access to the simulation modules, but restrict the change of parameters to ranges that are safe w.r.t. a clinical application. SBML and CellML are standards used to encode models in a computable format. Electronic Health Records (EHRs) refers to any data recorded in a hospital or GP practice. B) Example workflow for the application of a simulation module to the prediction of the Galactose Elimination Capacity (GEC), a key liver function parameter. Semantically annotated patient data is used as input to the simulation module based on the defined module interface. The module performs individual predictions and risk estimation based on the input data, which can be evaluated within the context of the reference ranges of the module. A proof-of-principle is available at https://www.livermetabolism.com/gec_app/. The example model is a regression model for the prediction of hepatic galactose clearance based on the independent variables gender, age, height, and weight as input parameters. The predicted GEC value and its variability (based on the uncertainty of the model prediction) are then used, together with the measured GEC value, to classify the subject as healthy or diseased. Within the figure, the presented key challenges (C) and important solutions (S) for systems biology standards and tools and medical research are marked: (C1) Access to clinical data. High quality clinical data must be integrated with the models. These are required for validation and for prediction; (C2) Good quality models and documentation. Requirement for representation in standard formats and description of model components and parameters; (C3) Standardised representation of models and data; (C4) Validated predictions in a clinical context. Efficiency and predictive value of the model have to be shown. Policies of medical health care providers have to be fulfilled; (C5) Detailed documentation of virtual experiments. Simulation settings are necessary to reproduce and verify the results; (S1) Semantic annotation via biomedical ontologies; (S2) Generation of safe simulation modules; (S3) Testing procedures to ensure safety. Functional curation of models; (S4) Standardised and secure software interfaces. Safe simulation of models via validation of input parameters and definition of allowed values.

Many biomedical ontologies are maintained in online portals, such as BioPortal or the Open Biomedical Ontologies (OBO) Foundry web page, which provide search interfaces, web services, version control, and mappings between ontologies [23, 24, 25]. However, different ontologies are used for semantic representation, e.g. due to differences in the medical systems used in different countries, which requires reliable mappings between these ontologies.

One effort addressing the mapping between terminologies and ontologies is the Unified Medical Language System (UMLS) [26], which to date harmonises over 150 terminologies and ontologies (https://www.nlm.nih.gov/pubs/factsheets/umls.html, accessed 14 June 2016). For example, the Human Phenotype Ontology [27], the International Classification of Diseases (http://www.who.int/classifications/icd/en/) and SNOMED CT [28] are all integrated in UMLS. While resources such as UMLS allow the transfer from one ontology to another, it is important to be aware that this transfer largely depends on the quality of the mapping and the quality of the annotations that have been assigned in the first place. Moreover, as ontologies go through several development cycles, the mappings need to be updated, which in itself can change the quality of the mapping and consequently the alignment of data and models in clinical applications. Furthermore, research into mappings and similarity measures for terms within and across bio-ontologies should be taken into account [29]. For example, it can be valuable to determine the similarity of data sets that are annotated to different ontologies.

Another set of ontologies to consider for this endeavour are those encoding information about model versions, as well as provenance and evidence of data encoded in the model. For example, PROV-O [30] is an ontology of provenance terms that could potentially be adapted to attach provenance to model data. Similarly, the Provenance, Authoring and Versioning Ontology (PAV) [31] can be used to add provenance information for collected data and representations chosen in simulation models/modules. Another effort in this direction is the Ontology of Biomedical AssociatioN (OBAN), used for provenance information on disease-phenotype associations text-mined through EuropePMC (http://europepmc.org/) [32]. Furthermore, the Evidence Ontology (ECO) [33] captures terms that can be used to trace biomedical evidence in data as well as models. Despite these ongoing efforts, further work is needed to allow for the integration of computational models with a variety of independent data resources.

====3.2 Generation of safe simulation modules====
Reusability depends on the availability of all model-related data [8]. For studies performed by medical researchers, it is particularly important to provide full documentation of safe parameter ranges and test case scenarios. This requires tailor-made standards for reporting. The data description must ensure that it is straightforward to interpret the output from simulation modules without expertise in modelling.

In this context, a simulation module encapsulates a computational model that has been tested, documented, annotated, and certified to meet safety requirements. A module suitable for inclusion in a diagnostic tool needs to provide extensive documentation and safe, standardised software interfaces (e.g. for resetting simulation parameters or accessing and interpreting simulation results; see section 3.4 for more details). The requirements for documentation of a model are clearly defined in a Minimum Information guideline (MIRIAM) [34]. We argue that the documentation of a simulation module for medical research needs to be extended to also cover information on applicable virtual experiments, allowed applications, and conditions under which the data are applicable in simulations.

In addition to these factors, the development, testing and management of software used for medical purposes will need to follow rules issued by regulatory agencies to ensure the safety of patients and their related data. As medical software apps have become more prevalent, guidance has been developed by a number of national agencies, including those of Germany ("Medizinproduktegesetz", http://www.bfarm.de/DE/Medizinprodukte/Abgrenzung/medical_apps/_node.html), the US (http://www.fda.gov/downloads/medicaldevices/deviceregulationandguidance/guidancedocuments/ucm263366.pdf) and the UK (https://www.gov.uk/government/publications/medical-devices-software-applications-apps). These include definitions of what software constitutes a "medical device" and which regulations apply. However, frameworks to regulate sophisticated software systems for medicine, such as simulation modules, will need considerably more development.

====3.3 Testing procedures to ensure safety====
Due to the sheer amount of data necessary to model the physiology of a human being, the development of future diagnostic tools will rely on previously developed, standardised simulation modules and on thorough semantic annotation. Before models, and consequently modules, can be consulted in medical predictions they need to be tested thoroughly. This is, in theory, possible for a subset of models in systems biology. For example, all models in the curated branch of BioModels should be able to reproduce at least one behavior observed and described in the reference publication.

For a module to be considered safe in a clinical environment, the encapsulated model predictions must be medically reliable, i.e. they must not only capture the underlying disease mechanisms but also adapt to the uniqueness of each individual patient. This requirement entails that the error rate for predictions needs to be very small and that under no circumstances can exceptions lead to failure in the intermediate computation. Due to the diversity of data that is included in a model, physical units, error ranges and data mappings have to be handled with special care. It is crucial that the patient-specific data to be simulated with the module matches the requirements of the model parameters such that a reliable prediction can be ensured. For this purpose, standardised tests need to be in place and continuously be passed throughout development. The electrophysiology web lab [35] is one example of a web-based tool to check the reliability of models relating to the physiology of the heart. It features a set of published models in CellML format and applies several virtual experiments to them. The tests check how each model reproduces the expected behavior of a real heart under a variety of conditions. This procedure is referred to as functional curation of the model [36].

Tests facilitate model evaluation and are thus an important component of a module. The test data consist of simulation inputs and outputs, which allow users to evaluate the predictive error, sensitivity and specificity of a module. Furthermore, users require access to the tests with which the parameter ranges and prediction outcomes have been assessed during model development.

The documentation released with a simulation module should detail how simulation results are to be correctly interpreted. This is particularly relevant for the classification of results in terms of quantiles within patient cohorts. In order to verify whether a module is safe for use, information detailing the history, developer(s), input data and test results is strictly necessary. Only if this information is provided can one evaluate whether the latest version of a module is safe for application and how the changes made over time have affected the error rates of predictions as well as edge cases in simulation scenarios. Systems biology already offers tools for model version control (e.g. [37]). However, we note that the potential of model provenance has not yet been fully explored, and the description of model parameters as well as a model's quality (in terms of applicability and reliability) is so far neither satisfactory nor standardised.

====3.4 Standardised and secure software interfaces====
In order to apply modules in clinical practice, standardised software interfaces are required that enable the safe simulation of models (e.g. through restricted parameter ranges), validation of input parameters, support for allometric scaling (of parameters like organ sizes or blood flow), and the evaluation of simulation results in terms of confidence intervals.

It is likely that a model used through a diagnostic tool is administered by a clinician, nurse or other medical staff. The simulation module must hence include a safe mode in which only defined properties of the model/module can be adapted. However, these defined properties need to cover, at the same time, the uniqueness of each patient so that the simulation can be truly personalised. An adaptation of the above web lab can help to provide clinicians with an overview of possible behaviors of a system given different sets of patient data and clinical investigations.

Software tools such as the Taverna Workflow Suite [38] or Galaxy [39] are used for various data analysis tasks in bioinformatics. Once constructed, the workflows are reusable. Executable protocols can be shared, reused and repurposed. Similarly, high-quality workflows could be provided for standard procedures in the clinic that involve virtual experiments. Tested and trusted workflows can save clinicians time as they automate processes that would otherwise take a long time to learn.

Moreover, tool and model developers have to safeguard the data that is used as input to the computational model so that patient data cannot be used for purposes other than the treatment of this patient. Otherwise, obtaining consent from patients to employ their data for medical purposes will be impossible. There is arguably potential for the models to be improved over time, as the patient data itself can help to tune model parameters, but this would have to be covered by each patient's consent.

===4 CONCLUSION===
With kinetic models being increasingly used and reused for the prediction of disease risks, the monitoring of disease progression, or for drug development, the quality and reliability of models becomes a major concern. In this situation, medical research can benefit from the experiences in systems biology by incorporating existing standards, tools and infrastructure. Standards and standard-compliant tools increase the exchangeability of models and enable researchers to reproduce published results. As computational models can be readily parameterised with individual patient and cohort data, they are well-suited for personalisation. Moreover, the models can be embedded in pharmacokinetics and pharmacodynamics applications used during drug development.

However, before modelling can be fully incorporated into medical workflows, additional requirements should be met. Among these are further standards to represent the provenance of a model and to document valid parameter ranges under certain conditions. Furthermore, solutions for high-quality annotation of models and for the curation of data need to be developed. Other challenges, like the representation of uncertainties, restricted model changes and personalisation, are as yet unsolved and have to be addressed in future research. A specific focus of future work should be on the definition of a minimal semantic interface that patient data has to fulfil for a model to be applicable, i.e. a minimal set of semantically encoded data the model requires as input. For instance, in the case of a regression model, all independent variables of the model must exist.

Finally, models used in the clinic need to fulfil safety requirements and adhere to data privacy guidelines. For example, at no point would it be acceptable to mix data from several patients or to give a patient or other unauthorised staff access to patients' data.

We conclude that systems biology research focuses on the development of (predictive) models. These models are mainly set in a research environment and use batch samples and flexible timetables. Many of the achievements towards reproducibility of simulation studies in systems biology can be reused to establish an infrastructure for reusable models in the clinic. However, the existing infrastructure needs to be evaluated thoroughly, and it needs to be extended to meet clinical standards when working with patient data.

===ACKNOWLEDGEMENT===
MK is supported by the Federal Ministry of Education and Research (BMBF, Germany) within the research network Systems Medicine of the Liver (LiSyM) (grant number 031L0054). AO and RJBD would like to acknowledge the NIHR Biomedical Research Centre for Mental Health, the Biomedical Research Unit for Dementia at the South London and Maudsley NHS Foundation Trust and King's College London. RJBD's work is also supported by researchers at the National Institute for Health Research (NIHR) University College London Hospitals Biomedical Research Centre, and by awards establishing the Farr Institute of Health Informatics Research at UCLPartners, from the Medical Research Council, Arthritis Research UK, British Heart Foundation, Cancer Research UK, Chief Scientist Office, Economic and Social Research Council, Engineering and Physical Sciences Research Council, National Institute for Health Research, National Institute for Social Care and Health Research, and Wellcome Trust (grant MR/K006584/1). TJPH would like to acknowledge King's College London and the NIHR Biomedical Research Centre at Guy's and St Thomas' NHS Foundation Trust and the NIHR Biomedical Research Centre for Mental Health. DW is funded through the BMBF e:Bio program (grant no. 0316194). The authors acknowledge support through CaSyM, the EC FP7 coordinating action Coordinating Systems Medicine across Europe.

===REFERENCES===
[1] Leroy Hood et al. Systems biology and new technologies enable predictive and preventative medicine. Science, 306(5696):640–643, 2004.

[2] Raimond L Winslow et al. Computational medicine: translating models to clinical care. Science Translational Medicine, 4(158):158rv11–158rv11, 2012.

[3] Geoffrey S Ginsburg et al. Personalized medicine: revolutionizing drug discovery and patient care. Trends in Biotechnology, 19(12):491–496, 2001.

[4] Olaf Wolkenhauer et al. The road from systems biology to systems medicine. Pediatric Research, 73(4-2):502–507, 2013.

[5] William E Evans et al. Pharmacogenomics: translating functional genomics into rational therapeutics. Science, 286(5439):487–491, 1999.

[6] K Romero et al. The future is now: Model-based clinical trial design for Alzheimer's disease. Clinical Pharmacology & Therapeutics, 97(3):210–214, 2015.

[7] Jessica Nasica-Labouze et al. Amyloid β protein and Alzheimer's disease: When computer simulations complement experimental studies. Chemical Reviews, 115(9):3518–3563, 2015.

[8] Dagmar Waltemath et al. How modeling standards, software, and initiatives support reproducibility in systems biology and systems medicine. IEEE Transactions on Biomedical Engineering, June 2016.

[9] Fiona Cunningham et al. Ensembl 2015. Nucleic Acids Research, 43(D1):D662–D669, 2015.

[10] Kate R Rosenbloom et al. ENCODE data in the UCSC Genome Browser: year 5 update. Nucleic Acids Research, 41(D1):D56–D63, 2013.

[11] Taavi Tillmann et al. Systems medicine 2.0: potential benefits of combining electronic health care records with systems science models. Journal of Medical Internet Research, 17(3):e64, 2015.

[12] Guido Santos et al. Model-based genotype-phenotype mapping used to investigate gene signatures of immune sensitivity and resistance in melanoma micrometastasis. Scientific Reports, 6, 2016.

[13] Jonathan Cooper et al. A call for virtual experiments: accelerating the scientific process. Progress in Biophysics and Molecular Biology, 117(1):99–106, 2015.

[14] Dagmar Waltemath et al. Reproducible computational biology experiments with SED-ML - the Simulation Experiment Description Markup Language. BMC Systems Biology, 5(1):1, 2011.

[15] Tommy Yu et al. The Physiome Model Repository 2. Bioinformatics, 27(5):743–744, 2011.

[16] Brett G Olivier and Jacky L Snoep. Web-based kinetic modelling using JWS Online. Bioinformatics, 20(13):2143–2144, 2004.

[17] Falk Schreiber et al. Specifications of standards in systems and synthetic biology. J. Int. Bioinformatics, 12(258.10):2390, 2015.

[18] Leonard P Freedman et al. The economics of reproducibility in preclinical research. PLOS Biology, 13(6):e1002165, 2015.

[19] Victoria Uren et al. Semantic annotation for knowledge management: Requirements and a survey of the state of the art. Web Semantics: Science, Services and Agents on the World Wide Web, 4(1):14–28, 2006.

[20] Chen Li et al. BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models. BMC Systems Biology, 4(1):92, 2010.

[21] Nick Juty et al. Identifiers.org and MIRIAM Registry: community resources to provide persistent identification. Nucleic Acids Research, 40(D1):D580–D586, 2012.

[22] John H Gennari et al. Multiple ontologies in action: composite annotations for biosimulation models. Journal of Biomedical Informatics, 44(1):146–154, 2011.

[23] Manuel Salvadores et al. BioPortal as a dataset of linked biomedical ontologies and terminologies in RDF. Semantic Web, 4(3):277–284, 2013.

[24] Barry Smith et al. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology, 25(11):1251–1255, 2007.

[25] Anika Gross et al. How do computed ontology mappings evolve? A case study for life science ontologies. In Joint Workshop on Knowledge Evolution and Ontology Dynamics, 2012.

[26] Olivier Bodenreider. The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Research, 32(suppl 1):D267–D270, 2004.

[27] Sebastian Köhler et al. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Research, 42(D1):D966–D974, 2014.

[28] Kevin Donnelly. SNOMED-CT: The advanced terminology and coding system for eHealth. Studies in Health Technology and Informatics, 121:279, 2006.

[29] Michael Hartung et al. Effective composition of mappings for matching biomedical ontologies. In Extended Semantic Web Conference, pages 176–190. Springer, 2012.

[30] Timothy Lebo et al. PROV-O: The PROV Ontology. W3C Recommendation, 30, 2013.

[31] Paolo Ciccarese et al. PAV ontology: provenance, authoring and versioning. Journal of Biomedical Semantics, 4(1):1, 2013.

[32] Sirarat Sarntivijai et al. Linking rare and common disease: mapping clinical disease-phenotypes to ontologies in therapeutic target validation. Journal of Biomedical Semantics, 7(8), 2016.

[33] Marcus C Chibucos et al. Standardized description of scientific evidence using the Evidence Ontology (ECO). Database, 2014:bau075, 2014.

[34] Nicolas Le Novère et al. Minimum information requested in the annotation of biochemical models (MIRIAM). Nature Biotechnology, 23(12):1509–1515, 2005.

[35] Jonathan Cooper et al. The Cardiac Electrophysiology Web Lab. Biophysical Journal, 110(2):292–300, 2016.

[36] Jonathan Cooper et al. High-throughput functional curation of cellular electrophysiology models. Progress in Biophysics and Molecular Biology, 107(1):11–20, 2011.

[37] Martin Scharm et al. An algorithm to detect and communicate the differences in computational models describing biological systems. Bioinformatics, page btv484, 2015.

[38] Katherine Wolstencroft et al. The Taverna workflow suite: designing and executing workflows of web services on the desktop, web or in the cloud. Nucleic Acids Research, page gkt328, 2013.

[39] Jeremy Goecks et al. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biology, 11(8):1, 2010.