Structured review on Huntington’s disease iron hypothesis
Karolis Cremers1,∗, Marco Roos1, Katy Wolstencroft2, Eleni Mina1 and Núria Queralt-Rosinach1,∗
1 Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, The Netherlands
2 Leiden Institute of Advanced Computer Science, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands


Abstract
Here we present a Structured Review (SR) on the relationships of iron with Huntington’s Disease.
We include relationship predictions made by different edge prediction models, the effect of including
the Gene Ontology structure on relationship predictions, and the representation of experimental data
within the SR.

Keywords
Structured Review, Knowledge graph, Ontologies, FAIR, Network Analysis, Linked Data


Motivation
Structured Reviews (SRs) organize and semantically represent the current knowledge around a
research hypothesis in a structured manner, enabling semantic querying and data mining [1].
   In this work we present the application of a SR to explore the relationship of iron with
Huntington’s Disease (HD). HD (OMIM:143100) is a heritable rare neurodegenerative disease
caused by an elongated CAG repeat within the huntingtin (HTT, HGNC:4851) gene. The exact
mechanisms that lead to disease pathogenesis remain unclear; however, one of the current
hypotheses implicates the accumulation of iron in the HD brain. Abnormal accumulation of iron in
the brain has been associated with several other neurodegenerative diseases. Therefore, current
therapies often include iron chelators to combat iron build-up.
   Our SR is a knowledge graph that includes information surrounding the iron hypothesis in
HD. We constructed a HD knowledge graph that integrates genes, anatomy, genotypes, variants,
physiology and disorders as concepts and their relationships such as “role of” (RO_0000081)
and “in homology relationship with” (RO_HOM0000001). Every instance of concepts and
relationships within the SR is annotated with references, similar to a normal review article.
These concepts and relationships integrated in the SR are extracted from two curated sources.
First, the Monarch Initiative knowledge base is queried for the retrieval of relevant gene and
phenotype information. Second, the TFtargets library is used to obtain transcription regulatory
information. In addition to the curated databases, the SR integrates gene expression information
from a HD RNA-seq experiment (GSE64810) to include genes that are specifically altered in
HD. Finally, we include a set of concepts and relationships of interest pre-selected by domain
experts.

SWAT4HCLS 2023: The 14th International Conference on Semantic Web Applications and Tools for Health Care and Life
Sciences
∗ Corresponding author.
Email: k.m.p.cremers@lumc.nl (K. Cremers); n.queralt_rosinach@lumc.nl (N. Queralt-Rosinach)
ORCID: 0000-0003-0169-8159 (K. Cremers); 0000-0002-8691-772X (M. Roos); 0000-0002-1279-5133 (K. Wolstencroft);
0000-0002-8972-9206 (E. Mina); 0000-0002-1756-3905 (N. Queralt-Rosinach)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073
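
The reference-annotated concept and relationship structure described above could be sketched as follows. This is a minimal illustration, not the SR's actual schema: the class names, the `unreferenced` helper and every identifier prefixed with `EX:` are hypothetical placeholders, and only the HTT and HD identifiers come from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Concept:
    """A node in the SR: an identifier, a readable label and a category."""
    curie: str
    label: str
    category: str


@dataclass
class Relationship:
    """An edge in the SR, carrying the references that support it."""
    subject: Concept
    predicate: str
    obj: Concept
    references: List[str] = field(default_factory=list)


def unreferenced(edges: List[Relationship]) -> List[Relationship]:
    """Return the edges that lack a supporting reference."""
    return [edge for edge in edges if not edge.references]


# HTT and HD identifiers are taken from the text.
htt = Concept("HGNC:4851", "HTT", "gene")
hd = Concept("OMIM:143100", "Huntington's Disease", "disorder")

# The predicate and the reference below are hypothetical placeholders (EX:).
edges = [
    Relationship(htt, "EX:placeholder_relation", hd, ["EX:placeholder_reference"]),
    Relationship(htt, "EX:placeholder_relation", hd),  # no reference attached
]

missing = unreferenced(edges)
print(f"{len(missing)} of {len(edges)} relationships lack a reference")
```

A check of this kind is one way to enforce the rule that every concept and relationship instance is annotated with references before it enters the graph.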
   We emphasize that a SR has utility at multiple points of the research cycle. First, it can
be constructed at the start of the research cycle to describe and explore a specific hypothesis.
Second, it can be used during research as a reference for interdisciplinary collaboration.
Finally, it can be used as a tool to contextualize experimental results at the end of the cycle.
To encourage the use of the SR throughout the cycle, it is hosted on a Wikibase server, where
Wikipedia-style pages represent node information in the knowledge graph (KG). This allows users
to review, edit and update the graph with knowledge involving the hypothesis.
   To aid computational analysis, including hypothesis generation, and knowledge exploration,
we load the SR into a Neo4j instance. We use the Graph Data Science Library
(GDSL) to apply relationship prediction algorithms that provide insight into missing information
in the SR and potential research hypotheses. In addition to the relationship predictions, we
improve the semantic richness of the SR by using neosemantics to integrate OWL ontologies.
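
To illustrate what a topology-based relationship prediction does, the sketch below scores missing edges with the Adamic-Adar index in plain Python. It is a stand-in chosen for illustration, not the model applied to the SR, and it does not call the Graph Data Science Library; the toy graph and all node names are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical toy graph; node names are placeholders, not SR content.
edges = [
    ("gene_A", "phenotype_X"),
    ("gene_B", "phenotype_X"),
    ("gene_A", "pathway_Y"),
    ("iron_concept", "pathway_Y"),
    ("iron_concept", "phenotype_X"),
]


def build_adjacency(edge_list):
    """Build an undirected adjacency map from a list of node pairs."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def adamic_adar(adjacency, u, v):
    """Sum 1 / log(degree) over the neighbours shared by u and v."""
    shared = adjacency[u] & adjacency[v]
    # A shared neighbour always has degree >= 2, so log(degree) > 0.
    return sum(1.0 / math.log(len(adjacency[w])) for w in shared)


def rank_missing_edges(adjacency):
    """Score every unconnected node pair, highest score first."""
    candidates = []
    for u, v in combinations(sorted(adjacency), 2):
        if v not in adjacency[u]:
            candidates.append((adamic_adar(adjacency, u, v), u, v))
    return sorted(candidates, key=lambda item: (-item[0], item[1], item[2]))


adjacency = build_adjacency(edges)
ranking = rank_missing_edges(adjacency)
for score, u, v in ranking:
    print(f"{u} -- {v}: {score:.3f}")
```

Pairs that share many low-degree neighbours rank highest, which is why such scores can point to relationships that are plausibly missing from a graph. A high score is a candidate for review rather than evidence of a biological relationship.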
   Ontologies are used both as independent nodes and as descriptors of concepts and
relationships within the SR. These ontologies provide consensus definitions for users, such as
data scientists, unfamiliar with the underlying biological concepts. These definitions, combined
with the concept-relationship structure of the SR, allow for sharing of contextual information
surrounding a hypothesis across disciplinary fields. This is especially important in highly
interdisciplinary activities such as disease research.
   Here, we will present our ongoing results on the HD iron SR. This will include three parts.
First, relationship predictions between disease-related genes and iron-related concepts made
by topology-based edge prediction models. Second, the Gene Ontology-based improvement of
the SR. Third, the graph-based representation of the RNA-seq experimental results and their
relationship to iron.


Acknowledgments
The work leading to this poster is supported by grants from the European Joint Programme on Rare
Diseases (EJP RD, COFUND-EJP N° 825575), the collaboration project Trusted World of Corona
(TWOC), co-funded by the PPP Allowance made available by Health Holland, Top Sector Life
Sciences & Health, to stimulate public-private partnerships, and the Leiden Center for
Computational Oncology (LCCO): a strategic initiative of the LUMC Oncology Center – Building
Individual Digital Tumor-Host Twins for Precision Medicine.


References
[1] N. Queralt-Rosinach, G. S. Stupp, et al., Structured reviews for data and knowledge-driven
    research, Database 2020 (2020). doi:10.1093/database/baaa015.