=Paper=
{{Paper
|id=Vol-2929/paper6
|storemode=property
|title=COVIDGraph: Connecting Biomedical COVID-19 Resources and Computational Biology Models
|pdfUrl=https://ceur-ws.org/Vol-2929/paper6.pdf
|volume=Vol-2929
|authors=Martin Preusse,Alexander Jarasch,Tim Bleimehl,Sebastian Muller,Jamie Munro,Lea Gutebier,Ron Henkel,Dagmar Waltemath
|dblpUrl=https://dblp.org/rec/conf/vldb/PreusseJBMMGHW21
}}
==COVIDGraph: Connecting Biomedical COVID-19 Resources and Computational Biology Models==
<pdf width="1500px">https://ceur-ws.org/Vol-2929/paper6.pdf</pdf>
<pre>
 COVIDGraph: Connecting biomedical COVID-19 resources and
              computational biology models
Lea Gütebier, University Medicine Greifswald, Greifswald, Germany, lea.guetebier@stud.uni-greifswald.de
Ron Henkel, University Medicine Greifswald, Greifswald, Germany, ron.henkel@uni-greifswald.de
Alexander Jarasch, German Center for Diabetes Research, Munich, Germany, jarasch@dzd-ev.de
Tim Bleimehl, German Center for Diabetes Research, Munich, Germany, tim.bleimehl@helmholtz-muenchen.de
Sebastian Müller, yWorks, Tübingen, Germany, sebastian.mueller@yworks.com
Jamie Munro, Munro Consulting, London, UK, jamie@munro.consulting
Martin Preusse, and the HealthEcco Team, Kaiser & Preusse, Freiburg, Germany, martin@kaiser-preusse.com
Dagmar Waltemath, University Medicine Greifswald, Greifswald, Germany, dagmar.waltemath@uni-greifswald.de

ABSTRACT
The COVID-19 pandemic has changed life across the globe. In January 2020, little was known
about SARS-CoV-2, but the rapidly increasing number of infections and the uncontrolled spread
demanded fast medical action. Within a year, over 4 million publications relating to COVID-19
appeared in the scientific literature. Additionally, patents have been registered, ontologies
have been extended, simulation studies for prediction of disease spread and underlying
bioinformatics mechanisms have been built, and health studies have been designed. To support
the exploration of COVID-19 data, the CovidGraph project was initiated as a non-profit,
collaborative and open project driven by researchers, software developers, data scientists
and medical professionals. In this article we outline the history, goals and scope of
CovidGraph. Using the example of computational biology models, we show how additional
resources can be integrated with the knowledge graph to extend the scope of the CovidGraph,
for example, to systems biology data.

Reference Format:
Lea Gütebier, Ron Henkel, Alexander Jarasch, Tim Bleimehl, Sebastian Müller, Jamie Munro,
Martin Preusse, and the HealthEcco Team, and Dagmar Waltemath. COVIDGraph: Connecting
biomedical COVID-19 resources and computational biology models. In the 2nd Workshop on
Search, Exploration, and Analysis in Heterogeneous Datastores (SEA Data 2021).

PVLDB Artifact Availability:
The source code, data, and/or other artefacts have been made available at
https://github.com/covidgraph/documentation.

Copyright © 2021 for the individual papers by the papers’ authors. Copyright © 2021
for the volume as a collection by its editors. This volume and its papers are published
under the Creative Commons License Attribution 4.0 International (CC BY 4.0).
Published in the Proceedings of the 2nd Workshop on Search, Exploration, and Analysis
in Heterogeneous Datastores, co-located with VLDB 2021 (August 16-20, 2021,
Copenhagen, Denmark) on CEUR-WS.org.

1   INTRODUCTION
CovidGraph is a research and communication platform that encompasses publications, case
statistics, genes and functions, molecular data and more. It is developed and maintained by
HealthECCO, a non-profit collaboration of researchers, software developers, data scientists
and medical professionals (https://healthecco.org/). Our aim is to help researchers quickly
and efficiently find their way through COVID-19 datasets using tools that implement
artificial intelligence methods, advanced visualisation techniques, and intuitive user
interfaces. Through CovidGraph users can explore papers, patents, treatments and medications
covering the family of coronaviruses. In addition to literature data we connect information
from biological entities, namely genes, proteins and their function, spanning a network of
unparalleled size and knowledge. The latest addition to the CovidGraph is a set of systems
biology models (Fig. 1).

Figure 1: Overview: CovidGraph data sources with the integrated systems biology nodes (cyan box).
   In recent years, NoSQL approaches such as key-value stores, BigTable, document databases,
triple stores, or graph databases [1], together with semantic web applications, have become
more popular within the life sciences. Graph databases offer a storage concept based on nodes,
(directed) edges, properties and labels. Nodes can be labelled and are connected by edges, and
both can contain properties. They also allow easy horizontal scaling and fast graph traversal.
Finally, graph databases are schema-optional, a feature that is much appreciated when storing
heterogeneous, highly connected, cross-domain data items from different sources. The
HealthECCO project integrates such heterogeneous resources and compiles a knowledge base
targeted at COVID-19 data (https://healthecco.org/covidgraph/), and potentially other diseases
in future versions. The underlying graph database is Neo4j [18].

2   DATA RESOURCES
Previous versions of the CovidGraph already integrated data from five categories
(Fig. 2 (A)): Patents, Papers, BioMedical (ontologies and controlled vocabularies), Clinical
Trials and Statistical & Geographic. Categories are cross-linked by relationships. For
example, items from the "Papers" category are linked to items from the "Patents" category.
   One paper source is the COVID-19 Open Research Dataset (CORD-19), a collection of research
papers relating to COVID-19 (and coronaviruses) [24]. It is the main data source for
information about papers in the CovidGraph and contains publications from PubMed, medRxiv and
bioRxiv. Papers and related information are stored and linked in multiple nodes in the
CovidGraph. Each paper node has author nodes connected to affiliation nodes that, in turn, are
linked to location nodes. Papers can be linked to COVID-19 patents. The Lens
(https://about.lens.org/covid-19/) provides datasets of patent documents and literature
concerning human coronaviruses and COVID-19.
   The CovidGraph furthermore contains information about clinical COVID-19 studies from the
ClinicalTrials.gov registry. Studies are represented as clinical trials nodes which are linked
to multiple other nodes representing more detailed information about each study.
   Also included in the CovidGraph are case statistics and case data from Johns Hopkins
University [7] and population estimates from the United Nations World Population Prospects
(https://population.un.org/wpp/). Nodes include city, country, province, daily report and age
group.
   Biomedical data encodes information about genes, proteins, pathways and different diseases
associated with COVID-19. The data comprises information from various biological and
biomedical resources and is connected to Gene Ontology terms. The Gene Ontology is a resource
for computational representation of the function of genes and gene products [4]. Information
about genes from the NCBI Gene Database [2] is stored in Gene nodes which are connected to
other nodes describing the underlying biology. The connected nodes include Gene Symbols
according to the Ensembl Genome Browser, a genome database [10]. The gene symbols are mapped
to synonyms. Since genes are expressed in various tissues, the gene nodes are linked to Gtex
Tissue nodes containing gene expression data from the GTEx Portal [14]. For genes that are
part of a pathway, there is a relationship between the corresponding gene node and pathway
node. The data included in the CovidGraph describes which genes are members of a pathway
according to the Reactome pathway knowledgebase, a database for molecular information about
biological pathways [11]. In the human transcription and translation processes, genes code for
transcripts, which in turn code for proteins. In the CovidGraph these processes are described
by relationships between gene nodes, transcript nodes and protein nodes. The data for the
transcript nodes is taken from the NCBI Reference Sequence Database [17]; the Universal
Protein Resource (UniProt) provides a resource of protein sequences and annotation data [5].
Proteins associated with annotation data from the Gene Ontology are linked to GO term nodes.
The last node type connected with gene nodes is the disease node. Disease nodes are in turn
associated with anatomy nodes. The corresponding data is provided by Hetionet, an integrative
network of biomedical data including connections between diseases and anatomies [9].
   Knowledge is primarily centred around the domain of coronaviruses but is steadily extended
to other connected diseases as part of the HealthECCO project. The latest addition to
CovidGraph is a resource of computational biology models. We will introduce the systems
biology node in detail in Section 4.

3   COVIDGRAPH FRAMEWORK
The CovidGraph infrastructure is built as a labelled property graph based on the Neo4j
Enterprise edition v4.2. Textual information, such as publications, clinical studies or
ontology term descriptions, is enriched by a pipeline based on natural language processing and
named entity recognition (BioBERT [13]). At the time of writing, the graph contains 36 million
nodes and 59 million relationships, and it is still growing because the modular software
framework encourages the addition and integration of new data sources. Server-wise,
CovidGraph relies on Docker containers. To be integrated, a new data source needs to be
wrapped in a container and needs to provide information such as connection data and mapping
information (https://github.com/covidgraph/data_template). An ETL process
(https://git.connect.dzd-ev.de/dzdtools/motherlode) subsequently extracts the data from the
new source, transforms the data in accordance with the provided mapping information, and loads
the data into the main CovidGraph.

4   INTEGRATION OF SIMULATION STUDIES
Via the aforementioned ETL process, we connected the CovidGraph and the Management System for
Models and Simulations (MaSyMoS, [8]). MaSyMoS is a Neo4j graph database for storing and
retrieving data items describing biomedical simulation studies. The data is extracted from
repositories for computational biology models (BioModels [15] and Physiome Model Repository 2
[25]) and integrated in a single graph (Fig. 2 (B)). We consider a computational biology model
to be a mathematical model written in a formal machine-readable language, such that it can be
systematically parsed and employed by simulation and analysis software without further human
translation [12]. We consider a biomedical simulation study to be any calculation performed on
a model that describes the evolution of the represented biological system, for instance over
spatial and/or temporal dimensions [23]. MaSyMoS links simulation studies, their results and
corresponding models. Curated simulation studies are furthermore annotated with meta-data,
primarily
reference publications and ontological terms from bio-ontologies [4–6, 11]. MaSyMoS provides
access to over 1000 manually curated simulation studies originally published in BioModels.
This set contains highly curated studies targeting COVID-19 disease and spreading
(https://www.ebi.ac.uk/biomodels/covid-19). The resulting knowledge graph offers
domain-specific retrieval and similarity measures, and it enables efficient access and reuse.
As all models have been shown to reproduce the published results, they are a valuable resource
for biomedical investigations.

Figure 2: (A) Original CovidGraph data model with data from i) Patents, ii) an index for
biomedical terms (BioBERT [13]), iii) BioMedical Ontologies [2, 4, 5, 9–11, 17, 21, 22],
iv) COVID-19 related papers [3, 24], v) Clinical Trials [26], and vi) Statistical & Geographic
information [7, 16]. (B) A simplified MaSyMoS [8] meta graph containing i) simulation models
formerly encoded in SBML and CellML (not shown) [20], ii) simulation descriptions formerly
encoded in SEDML [20], iii) bioontologies encoded in OWL, and iv) links to publications in
PubMed.

   The integration of MaSyMoS data with CovidGraph was twofold: First we matched papers
(publications) from both domains. Then we connected biomedical ontology terms from both
resources, thereby linking disease knowledge and biomedical simulation studies. The Paper data
set (cf. Fig. 2 (A)) in CovidGraph is represented by different nodes (e.g., the abstract,
authors, paper ID). In MaSyMoS a paper is represented by a single publication node containing
the same set of information about a publication. Consequently, we mapped the corresponding IDs
(PubMedID and DOI) from CovidGraph paper ID nodes and MaSyMoS publication nodes, thus
connecting relevant publications from both data sets. This mapping resulted in 19 connections.
This result is in our expected range, as the underlying publication corpora cover different
areas of interest (e.g. cell cycle, MAPK and apoptosis for simulation models; clinical trials,
respiratory studies and diseases for CovidGraph). The BioMedical data set in the CovidGraph
represents different ontologies with relevance for COVID-19 research. These ontologies have
possible connections and overlap with ontological terms used to annotate simulation studies in
MaSyMoS (cf. Figure 2 (B)). Our analyses showed that most overlap can be observed in gene
information, chemical entities, proteins and diseases. Consequently, we mapped ontological
terms in MaSyMoS and CovidGraph for Gene Ontology (1810 connections), ChEBI (1211
connections), UniProt (911 connections) and Disease Ontology (72 connections) by their IDs
(cf. Figure 1). For Gene Ontology, ChEBI and Disease Ontology more than 94% of the terms
stored in MaSyMoS were connected to terms in the CovidGraph. The UniProt coverage reached 41%.

   Example: COVID-19 spread in Wuhan city. The simulation study by Roda et al. [19]
investigates the COVID-19 spread in Wuhan city at the beginning of 2020. Figure 3 shows a
Neo4j excerpt of the model in MaSyMoS and the association to disease information in the
CovidGraph. The association is built by a matching reference publication and a matching
ontology entry from the Disease Ontology. More specifically, the model is linked (in the
middle, dark green) to several resources (pink). For example, one annotation refers to an
ontology term from the Disease Ontology and is associated with the corresponding entry in the
CovidGraph (on the right, brown). Another example is the reference publication, which links to
the corresponding publication in the CovidGraph (on the right, blue). We consider this example
a first step towards bridging the gap between medical research and systems biology.

5   TAKEAWAYS & FUTURE WORK
The CovidGraph project integrates COVID-related data from heterogeneous data sources, mainly
from the medical and health domains, into a single knowledge graph. We demonstrate that even
for fairly distinct scientific domains such as computational biology modelling and clinical
research, it is possible to link knowledge graphs and thereby quickly provide new data
sources. The presented version of CovidGraph provides a tool set and a single access point to
previously disconnected data sources. Biomedical and clinical scientists can explore a rich
set of data items, which are not connected in any other resource. CovidGraph is only one
example of rapid integration of knowledge. The HealthECCO infrastructure offers solutions for
integration and exploration of other diseases, building on the same integration workflow
showcased in this paper.
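
The extract, transform and load steps described in Section 3 can be illustrated with a small,
self-contained sketch. It is not the motherlode pipeline and does not talk to Neo4j: the
Graph class, the mapping dictionary layout and all field names are hypothetical stand-ins
chosen for this illustration, and the records are toy data.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """In-memory stand-in for the target graph (hypothetical, not Neo4j)."""
    nodes: dict = field(default_factory=dict)  # node id -> (label, properties)


def extract(source_rows):
    """Extract: yield raw records from the new data source."""
    yield from source_rows


def transform(record, mapping):
    """Transform: rename source fields to graph properties using the mapping."""
    props = {
        target: record[source]
        for source, target in mapping["properties"].items()
        if source in record
    }
    return mapping["label"], record[mapping["id_field"]], props


def load(graph, label, node_id, props):
    """Load: merge the node into the graph, so repeated loads do not duplicate it."""
    _, existing = graph.nodes.get(node_id, (label, {}))
    graph.nodes[node_id] = (label, {**existing, **props})


def run_etl(source_rows, mapping, graph):
    """Run the three steps in order for every record of the source."""
    for record in extract(source_rows):
        load(graph, *transform(record, mapping))
    return graph


# Hypothetical mapping information for a toy publication source.
mapping = {
    "label": "Publication",
    "id_field": "pmid",
    "properties": {"pmid": "pubmed_id", "title": "title"},
}
rows = [
    {"pmid": "111", "title": "Model A"},
    {"pmid": "222", "title": "Model B"},
    {"pmid": "111", "title": "Model A"},  # duplicate record, merged on load
]
graph = run_etl(rows, mapping, Graph())
print(len(graph.nodes))
```

Keeping the mapping as data rather than code mirrors the idea that a new source only has to
supply connection and mapping information; the loading logic itself stays unchanged.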
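
The ID-based matching from Section 4 and the reported coverage figures can be sketched as
follows. This is a minimal illustration under stated assumptions, not the authors' procedure:
the function names, the dictionary layout, the DOI normalisation step and all identifiers in
the toy data are hypothetical. Coverage is computed as the share of MaSyMoS terms that also
occur in CovidGraph, which is how the percentages in the text are phrased.

```python
def normalise_doi(doi):
    """Lower-case a DOI and strip a resolver prefix (an assumed cleaning step)."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def id_keys(ids):
    """Return the comparable (kind, value) keys for one record."""
    keys = []
    if ids.get("pmid"):
        keys.append(("pmid", ids["pmid"]))
    if ids.get("doi"):
        keys.append(("doi", normalise_doi(ids["doi"])))
    return keys


def match_publications(covidgraph_papers, masymos_publications):
    """Return (paper, publication) pairs that share a PubMed ID or a DOI."""
    index = {}
    for paper, ids in covidgraph_papers.items():
        for key in id_keys(ids):
            index[key] = paper
    links = set()
    for publication, ids in masymos_publications.items():
        for key in id_keys(ids):
            if key in index:
                links.add((index[key], publication))
    return links


def coverage(target_terms, source_terms):
    """Share of source terms that also occur among the target terms."""
    if not source_terms:
        return 0.0
    return len(source_terms & target_terms) / len(source_terms)


# Toy identifiers, invented for this example.
covidgraph_papers = {
    "paper-1": {"pmid": "0000001", "doi": "10.0000/toy.1"},
    "paper-2": {"doi": "10.0000/TOY.2"},
    "paper-3": {"pmid": "0000003"},
}
masymos_publications = {
    "pub-a": {"pmid": "0000001"},
    "pub-b": {"doi": "https://doi.org/10.0000/toy.2"},
    "pub-c": {"pmid": "0000009"},
}
links = match_publications(covidgraph_papers, masymos_publications)

masymos_terms = {"DOID:t1", "DOID:t2", "DOID:t3", "DOID:t4"}
covidgraph_terms = {"DOID:t1", "DOID:t2", "DOID:t3", "GO:t9"}
share = coverage(covidgraph_terms, masymos_terms)
print(sorted(links), share)
```

Matching on either identifier keeps a link even when one source lacks a PubMed ID or a DOI;
using a set for the links avoids counting a pair twice when both identifiers agree.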
[Graph figure residue. Legible node and relationship labels: BIOMD…, mu, rho, Rate Law for, Susce…, Kausthu…, http://id…, MASYMOS_HAS_MODEL, MASYMOS_BELONGS_TO, MASYMOS_DOID_DESCRIBES_…, COVID-19]


                                                                                                                                         _B
                                                                                                                                       YM
                                                                                                                                                                                                                                                             _h


                                                                                                                                                                                                                                                                                            _IS
                                                        _P


                                                                                                                                                                                                                                                                                   NG…
                                                                                                                                                                                                                                                                                                                    ersio


                                                                     AM
                                          SY


                                                                      LO
                                                                                                                                                                                                                                                               as


                                                                                                                                      OS
                                                           AR                                                                                                                                                                                   OS


                                                                                                                                                                                                                                                                                          AS
                                                                                                                                     AS
                                                                                                                                                                                                                                                                                                               S_isV


                                                                                                                                                                                                                                                                                         OS
                                            MO                                                                                                                                                                                                                   Ta


                                                                                                                                                                                                                                                                      is
                                                             AM


                                                                         NG
                                                                                                                                                                                                                                                  _B                                                YMO


                                                                         ET


                                                                                                                                   YM


                                                                                                                                                                                                                                                                                         M
                                               S_                                                                                                                                                                                                                  xo


                                                                                                                                                                                                                                                                                       YM
                                      MASY                     ET                                                                                                                                                                                         EL         n                           MAS


                                                                            ER
                                           MOS_BEL


                                                                            S_


                                                                                                                                 AS
                    beta                                         ER                                                                                                                        TION                                                              O                                                                        S_TO


                                                                                                                                                                                                                                                                                     AS
                                                 HAS_                                                                                                                    MASYMOS_HAS_ANNOTA                                                                   N…


                                                                              TO
                                                   ONPA


                                                                                                                                M
                                                                                                                                                                                                                                                                          Mon                                        LONG


                                                                                                                                                                                                                                                                                    M
                                                      GSRAME                                                                                                                                                                                                                                                     S_BE
                                                        _T    TER
                                                                                                                                                                                                                                                                         Jul 13                        YMO
                                     MASY
                                                          O                                                                                                                                                                                                                                         MAS
                                          MOS_                                                                                                                                                                                                                        n 19:19:55                        MA
                                               BELO                                               Roda2020                                                                                                                                                          xo    CE…                              S
                                                    NGS_
                                                         TO                                                                                                                                                                                                   s   Ta                                           YM
                                                                                                                                                                                                                                                                                                                    OS
                                                                                                 - SIR model                                                                                                                                                ha                                                           _B
                                                                                                                                                                                                                                                          S_


                                                                                                                                                                                                                                                                                   S_ha…
                                                                                                                                                                                                                                                                       BE…
                                                                                                 of COVID-…                                                                                                                                                                                                                 E
                                                                                                                                                     MA
                                                                                                                                                       SY                                                                                             O                 N…                       MA                             LO
                                                                                                                                                                                                                                                                                                                                     NG
                                                                                                                                                          MO               MASYMOS_BELONGS_TO                                                    YM              LO                                 S   YM                              S   _T
                                                                                                                                                            S_H                                                                                AS             BE


                                                                                                                                                                                                                                                                    MOS_
                                                                                                                                                                                                                                                                                                          OS                                  O
                                                                                                                                                                                                                                                            S_                                                 _is
                                                                                                             N
                                                                                                                                                                AS


                                                                                                                                                                                                                                                                                     MO
                                                                       N                                                                                           _C                                                                         M                                                                   De                                                                                                                   Why is it
                                                                                                        CTIO

                                                                 IO                                                                                                  OM                                                                                    O                                                        scri
                                                          CT                                                                                                                                                                                       YM


                                                                                                                                                                                                                                                                                MASY
                                                                                                                                                                       PA                                                                                                                                               bed                                                                                                           difficult to


                                                                                                                                                                                                                                                                   MASY
                                                        EA                                                                                                               RT                                                                      AS
                                                                                                       TO
                                                                                                                                                                                                                                                                                                                           By


                                                                                                                                        MA
                                                      _R                                                                                                                   ME                                                                                                                                                                                                                                                         accurately
                                                                                                    REA


                                                                                                                                                                             NT                                                                 M
                                                                                                                           MA
                                                                                                                                                     MA
                                                                                                   GS_


                                                    AS                                                                                                                                                                                                                                                                                                                                                            PAPER_HAS_PAPERID


                                                                                                                                           SY
                                                  _H
                                                                                                                                                        SY
                                                                                                                                                          MO                                                                     http://id…                                                                                                         http://id…    MASYMOS_RESOURCE_DESCRIBES_PAPERID   32289100                       predict the
                                                                                                                              SY
                                                                                                HAS_


                                                                             TO

                                                                                                                                              MO
                                            S                                                                                                                S_B                                                                                                                                                                                                                                                                      COVID-19
                                                                                               LON


                                           O                               S_
                                                                                                                                 MO

                                                                                                                                                                ELO
                                         YM                                                                                                                                                                                                                                                                                                                                                                                           epidemi…
                                                                                                                                                 S
                                       AS                    NG                                                                                                     NG

                                                                                                                                  _B
                                                                                            OS_


                                                                                                                                    S
                                                                                           S_BE


                                                          LO                                                                                                          S_                                                                                    http://id…
                                                                                                                                   _H MASYMOS_BEL


                                      M                                                                                                                                 TO
                                                                   IES


                                                                                                                                    EL
                                                        BE


                                                                                                                                     M
                                                                                                                                     AS M
                                                                                   MASYMOS_HAS_SPECI SYM


                                                      S_
                                                                                 YM


                                                                                                                                      MA S_TO
                                                                                                                                      ON


                                                                                                                                      AS
                                                                 EC


                                                                                        YMO


                                                     O
                                                                                                                                        _R AS


                                                                                                                                        YM
                                                                                                                                         SY
                                                   YM                                                                                                                                                 Wuhan
                                                                             MAS


                                                                                                                                         G
                                                                 _SP


                                                 AS
                                                                                                                                          EA YM


                                                                                                                                          O
                                                                                     _TO


                                                                                                                                           MO A


                                                                                                                                            S_
                                                                                    MAS


                                                M
                                                                                                                                             CT O
                                                               AS


                                                                                                                                               BE
                                                                                                                                                S_ SY
                                                                                   GS


                                                                                                                                                 M
                                                                                                                                                 IONS_


                                                                                                                                                 LO
                                                            S_H


                                                                                                                                                   BE M


                                                                                                                                                                                                        MASY
                                                                                       ON


                                                                                                                                                    NG
                                                                                                                                                     LOO
                                                             O


    [Figure 3: graph rendering not reproduced. Legible node labels: Suscept…, Infected, Confirm…, Recover…]

    Figure 3: Simulation study by Roda et al. [19] represented in MaSyMoS (model in light blue) with links to CovidGraph.


   The CovidGraph team hopes to motivate other data providers to link up
with our resource, and we would also like to discuss the applicability of
our graph database infrastructure to existing data silos.

ACKNOWLEDGMENTS
The work presented here is the result of the HealthEcco Team
(https://healthecco.org/team/). The COVID-19 collection in BioModels was
built with the help of EOSC COVID-19 Fast Track funding.

REFERENCES
 [1] Renzo Angles and Claudio Gutierrez. 2008. Survey of graph database models.
     ACM Computing Surveys (CSUR) 40, 1 (2008), 1.
 [2] Garth R Brown, Vichet Hem, Kenneth S Katz, Michael Ovetsky, Craig Wallin,
     Olga Ermolaeva, Igor Tolstoy, Tatiana Tatusova, Kim D Pruitt, Donna R Maglott,
     et al. 2015. Gene: a gene-centered information resource at NCBI. Nucleic acids
     research 43, D1 (2015), D36–D42. https://doi.org/10.1093/nar/gku1055
 [3] Kathi Canese and Sarah Weis. 2013. PubMed: the bibliographic database. The
     NCBI Handbook 2 (2013), 1.
 [4] The Gene Ontology Consortium. 2021. The Gene Ontology resource: enriching a
     GOld mine. Nucleic Acids Research 49, D1 (2021), D325–D334.
     https://doi.org/10.1093/nar/gkaa1113
 [5] UniProt Consortium. 2019. UniProt: a worldwide hub of protein knowledge.
     Nucleic acids research 47, D1 (2019), D506–D515.
     https://doi.org/10.1093/nar/gky1049
 [6] Paula de Matos, Adriano Dekker, Marcus Ennis, Janna Hastings, Kenneth Haug,
     Steve Turner, and Christoph Steinbeck. 2010. ChEBI: a chemistry ontology and
     database. Journal of cheminformatics 2, 1 (2010), 1–1.
 [7] Ensheng Dong, Hongru Du, and Lauren Gardner. 2020. An interactive web-based
     dashboard to track COVID-19 in real time. The Lancet infectious diseases 20, 5
     (2020), 533–534. https://doi.org/10.1016/S1473-3099(20)30120-1
 [8] Ron Henkel, Olaf Wolkenhauer, and Dagmar Waltemath. 2015. Combining
     computational models, semantic annotations and simulation experiments in a
     graph database. Database 2015 (2015), bau130.
 [9] Daniel Scott Himmelstein, Antoine Lizee, Christine Hessler, Leo Brueggeman,

     Pedro Mendes, et al. 2005. Minimum information requested in the annotation of
     biochemical models (MIRIAM). Nature biotechnology 23, 12 (2005), 1509–1515.
[13] Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim,
     Chan Ho So, and Jaewoo Kang. 2020. BioBERT: a pre-trained biomedical language
     representation model for biomedical text mining. Bioinformatics 36, 4 (2020),
     1234–1240.
[14] John Lonsdale, Jeffrey Thomas, Mike Salvatore, Rebecca Phillips, Edmund Lo,
     Saboor Shad, Richard Hasz, Gary Walters, Fernando Garcia, Nancy Young, et al.
     2013. The genotype-tissue expression (GTEx) project. Nature genetics 45, 6 (2013),
     580–585. https://doi.org/10.1038/ng.2653
[15] Rahuman S Malik-Sheriff, Mihai Glont, Tung VN Nguyen, Krishna Tiwari,
     Matthew G Roberts, Ashley Xavier, Manh T Vu, Jinghao Men, Matthieu Maire,
     Sarubini Kananathan, et al. 2020. BioModels—15 years of sharing computational
     models in life science. Nucleic acids research 48, D1 (2020), D407–D415.
[16] United Nations. 2019. World population prospects 2019: highlights.
[17] Kim D Pruitt, Tatiana Tatusova, and Donna R Maglott. 2007. NCBI reference
     sequences (RefSeq): a curated non-redundant sequence database of genomes,
     transcripts and proteins. Nucleic acids research 35, suppl_1 (2007), D61–D65.
     https://doi.org/10.1093/nar/gki025
[18] Ian Robinson, Jim Webber, and Emil Eifrem. 2013. Graph Databases. O’Reilly
     Media, CA, USA.
[19] Weston C Roda, Marie B Varughese, Donglin Han, and Michael Y Li. 2020. Why
     is it difficult to accurately predict the COVID-19 epidemic? Infectious Disease
     Modelling 5 (2020), 271–281.
[20] Falk Schreiber, Björn Sommer, Tobias Czauderna, Martin Golebiewski, Thomas E
     Gorochowski, Michael Hucka, Sarah M Keating, Matthias König, Chris Myers,
     David Nickerson, et al. 2020. Specifications of standards in systems and synthetic
     biology: status and developments in 2020. Journal of integrative bioinformatics
     17, 2-3 (2020).
[21] Lynn Marie Schriml, Cesar Arze, Suvarna Nadendla, Yu-Wei Wayne Chang,
     Mark Mazaitis, Victor Felix, Gang Feng, and Warren Alden Kibbe. 2012. Disease
     Ontology: a backbone for disease semantic integration. Nucleic acids research 40,
     D1 (2012), D940–D946.
[22] The GTEx Portal. 2020. GTEx Portal Documentation.
     https://gtexportal.org/home/documentationPage. Online, accessed 12 October 2020.
[23] Dagmar Waltemath, Richard Adams, Daniel A Beard, Frank T Bergmann,
     Upinder S Bhalla, Randall Britten, Vijayalakshmi Chelliah, Michael T Cooling,
     Jonathan Cooper, Edmund J Crampin, et al. 2011. Minimum information about a
     Sabrina L Chen, Dexter Hadley, Ari Green, Pouya Khankhanian, and Sergio E                                                                                                                                                                                                                               simulation experiment (MIASE). PLoS computational biology 7, 4 (2011), e1001122.
     Baranzini. 2017. Systematic integration of biomedical knowledge prioritizes drugs                                                                                                                                                                                                                  [24] Lucy Lu Wang, Kyle Lo, Yoganand Chandrasekhar, Russell Reas, Jiangjiang Yang,
     for repurposing. eLife 6 (Sept. 2017), e26726. https://doi.org/10.7554/elife.26726                                                                                                                                                                                                                      Darrin Eide, Kathryn Funk, Rodney Kinney, Ziyang Liu, William Merrill, et al.
[10] Tim Hubbard, Daniel Barker, Ewan Birney, Graham Cameron, Yuan Chen, L                                                                                                                                                                                                                                   2020. Cord-19: The covid-19 open research dataset. ArXiv arXiv2004. (2020),
     Clark, Tony Cox, J Cuff, Val Curwen, Thomas Down, et al. 2002. The Ensembl                                                                                                                                                                                                                              10706v2.
     genome database project. Nucleic acids research 30, 1 (2002), 38–41. https:                                                                                                                                                                                                                        [25] Tommy Yu, Catherine M Lloyd, David P Nickerson, Michael T Cooling, Andrew K
     //doi.org/10.1093/nar/30.1.38                                                                                                                                                                                                                                                                           Miller, Alan Garny, Jonna R Terkildsen, James Lawson, Randall D Britten, Peter J
[11] Bijay Jassal, Lisa Matthews, Guilherme Viteri, Chuqiao Gong, Pascual Lorente,                                                                                                                                                                                                                           Hunter, et al. 2011. The physiome model repository 2. Bioinformatics 27, 5 (2011),
     Antonio Fabregat, Konstantinos Sidiropoulos, Justin Cook, Marc Gillespie, Robin                                                                                                                                                                                                                         743–744.
     Haw, et al. 2020. The reactome pathway knowledgebase. Nucleic acids research                                                                                                                                                                                                                       [26] Deborah A Zarin, Tony Tse, Rebecca J Williams, Robert M Califf, and Nicholas C
     48, D1 (2020), D498–D503. https://doi.org/10.1093/nar/gkz1031                                                                                                                                                                                                                                           Ide. 2011. The ClinicalTrials.gov results database - update and key issues. New
[12] Nicolas Le Novère, Andrew Finney, Michael Hucka, Upinder S Bhalla, Fabien                                                                                                                                                                                                                               England Journal of Medicine 364, 9 (2011), 852–860.
     Campagne, Julio Collado-Vides, Edmund J Crampin, Matt Halstead, Edda Klipp,

</pre>