Understanding the Behaviour of Complex Biomolecular
Networks by Combining Logical and Semantic Modeling

        Ali Ayadi1,2 , Cecilia Zanni-Merk1 , and François de Bertrand de Beuvron1
      1 ICUBE/SDC Team (UMR CNRS 7357)-Pole API BP 10413, Illkirch 67412, France
2   LARODEC Laboratory, Institut Supérieur de Gestion de Tunis, University of Tunis, Rue de la
                                liberté, Bardo 2000, Tunisia
              {ali.ayadi,cecilia.zanni-merk,debeuvron}@unistra.fr


         Abstract. In the literature, most research on the behaviour of cell networks
         focuses on only some parts of the biomolecular network. However, to fully
         understand their behaviour over time, we have to integrate the different parts
         of the biomolecular network and analyse them together.
         The objective of the present study is to offer biologists a platform to simulate
         the state changes of biomolecular networks, with the hope of steering their
         behaviour. In this paper, we first present a formalism to represent the dynamic
         behaviour of biomolecular networks. This logical model is based on the three
         levels of systems theory: structural, functional and behavioural modeling. We
         then propose a semantic approach based on four ontologies to formalise the
         domain knowledge of complex biomolecular networks. Together, these approaches
         provide the necessary elements to model, analyse and understand the dynamic
         behaviour and the state transitions of these networks.
         Key words: Systems biology, complex biomolecular networks, logical modeling,
         dynamical modeling, semantic technologies.


1     Introduction

Systems biology is a comprehensive quantitative analysis of the manner in which all
the components of a biological system interact functionally over time [1]. Yet, un-
derstanding cellular behavioural variability and its evolution over time is one of the
most complex tasks facing current researchers. With the development of high-throughput
techniques such as DNA sequencing, biological experiments have produced a wealth of
knowledge about genes, proteins and metabolites. These advances enable researchers to
integrate the properties of molecular components in a powerful framework called the
complex biomolecular network. This network consists of a set of nodes, denoting the
molecular components, and a set of edges, denoting the interactions among these cellular
components. These networks are considered as systems that dynamically evolve from one
state to another so that the cell can adapt itself to changes in its environment.
    Many formalisms have been proposed in recent years for the modeling of biologi-
cal networks. Among these formalisms, we can list the ordinary differential equations
[2], the stochastic methods [3, 4], the boolean networks [5, 6], the Bayesian networks
2        A. Ayadi, C. Zanni-Merk and F. de Bertrand de Beuvron

[7], the Petri nets [8], etc. However, most of these studies focus only on modeling
isolated parts of this network, such as the metabolic network or the gene regulatory
network, and do not study the dynamics of the network as a whole. Moreover, many other
applications based on semantic technologies have focused on understanding the behaviour
of biological networks, such as the development of the Systems Biology ontology3 , the
Sequence ontology4 and BioPAX5 ; see also the works of Knüpfer et al. [9], [10].
    In this paper, we first present a logical approach for modeling the dynamic behaviour
of biomolecular networks. This formalism is based on the three levels of analysis of
systems theory: structural, functional and behavioural modeling. Second, we propose a
semantic approach to formalize the domain knowledge of complex biomolecular networks.
This approach, based on four ontologies, provides the necessary concepts for modeling
the dynamic behaviour and the state transitions of these networks.
    This paper is organised as follows: Section 2 details the logical formalisation for
describing the structure, function and behaviour of complex biomolecular networks.
Section 3 presents the semantic approach, which provides a rich description for modeling
a biomolecular network and its state changes. Finally, Section 4 provides conclusions
and future work.

2    Formal Modeling of Biomolecular Networks
In this section, we detail the mathematical formalism that allows us to translate, in
logical terms and under certain assumptions, the dynamics of a complex biomolecular
network. The first parts of this approach have been presented in [14]. Several
improvements to cope with the behavioural aspects are presented here in Section 2.1.
    Indeed, a biomolecular network is a dynamical system that can be characterised by
the triple (be, do, become) describing its structure, function and behaviour according to
the systems theory [11]. This modeling is based on three basic modeling pillars:
The structural modeling: to describe the architecture of the biomolecular network.
The functional modeling: to describe what each component of the biomolecular network
   can carry out, specifying the conditions for these activities.
The behavioural modeling: to describe how the biomolecular network and its indi-
   vidual components evolve over time.
    Thus, the biomolecular network BN can be represented by its structure SR, its func-
tion FR and its behaviour CR[t0 ,tn ] that evolves over time t (a simulation interval [t0 ,tn ]).
Therefore, mathematically, the biomolecular network BN is defined as follows:
                                    BN = (SR, FR,CR[t0 ,tn ] )
For lack of space, we will not detail the structural modeling SR and the functional
modeling FR. We only recall their basic notation. For more details about these parts,
please see our previous work [14].
    The structure of the biomolecular network SR = (M, I) is a graph defined by:
 3 http://www.ebi.ac.uk/sbo/main/
 4 http://www.sequenceontology.org/
 5 http://www.biopax.org/
                      Towards an Ontology for Understanding the Behaviour of CBN            3

 – M= {m1 , m2 , . . . , mn } denotes all the molecules composing the network. M is parti-
   tioned as: MG the set of genes, MP the set of proteins and MM the set of metabolites.
 – I= {i1 , i2 , . . . , im } denotes the set of interactions between the network’s molecules.
   For an edge i ∈ I, we denote by s(i) its starting node and by d(i) its destination
   node. The partition of the graph nodes induces a partition of the interactions by
   type. There are three types of interactions between molecular components of the
   same type and four types of interactions between nodes belonging to different
   networks. The two remaining types, IGM and IMG , are not taken into account because
   there is no direct interaction between genes and metabolites in either direction.
The partitions of M and I are detailed in [14]. Moreover, the function of the biomolecular
network, denoted by FR, associates to each graph edge i ∈ I the type of its interaction
and the condition that activates it. These types of interactions belong to the set of
concepts of the Interaction Ontology proposed by Van Landeghem et al. [13].
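
The structural and functional notation recalled above can be sketched as a small data structure. The following Python fragment is only an illustration under our own assumptions: the names Kind, Interaction, interaction_class and the toy molecules, interaction types and conditions are hypothetical and do not come from [14] or from the Interaction Ontology.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Kind(Enum):
    """Partition of M into genes (MG), proteins (MP) and metabolites (MM)."""
    GENE = "G"
    PROTEIN = "P"
    METABOLITE = "M"


@dataclass(frozen=True)
class Interaction:
    """One edge i in I, annotated with what the function FR associates to it."""
    name: str
    source: str            # s(i), the starting node
    target: str            # d(i), the destination node
    interaction_type: str  # hypothetical label standing in for an ontology concept
    condition: str         # hypothetical textual activation condition


# Gene-metabolite edges are excluded in both directions (IGM and IMG).
EXCLUDED = {(Kind.GENE, Kind.METABOLITE), (Kind.METABOLITE, Kind.GENE)}
ALLOWED = [pair for pair in product(Kind, Kind) if pair not in EXCLUDED]


def interaction_class(edge, molecules):
    """Return the block of the partition of I an edge belongs to, e.g. 'IGP'."""
    pair = (molecules[edge.source], molecules[edge.target])
    if pair in EXCLUDED:
        raise ValueError(f"no direct interaction allowed for {edge.name}")
    return "I" + pair[0].value + pair[1].value


# Toy network: every name below is illustrative only.
molecules = {"g1": Kind.GENE, "p1": Kind.PROTEIN, "m1": Kind.METABOLITE}
interactions = [
    Interaction("i1", "g1", "p1", "expression", "g1 is activated"),
    Interaction("i2", "p1", "m1", "catalysis", "concentration of p1 > 0"),
]
classes = {i.name: interaction_class(i, molecules) for i in interactions}
```

In this sketch, ALLOWED enumerates the seven admissible (source kind, destination kind) pairs: three between components of the same type and four between components of different types.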

2.1   Behavioural Modeling
Time Discretisation. Complex biomolecular networks are dynamic systems characterised
by continuous interactions. The activity of these interactions can be modeled in the
form of differential equations. However, given the large size of these networks, solving
these differential equations in continuous time is computationally heavy and difficult
to implement in practice.
    Thus, in order to model the dynamic evolution of a biomolecular network and reproduce
its behaviour over time, we proceed with a behavioural simulation in discrete time. This
simulation allows us to study the behaviour of the biomolecular network through
successive transitions. The state of a node (representing a cell component) at the next
generation is calculated according to its state and the state of its predecessors at the
current generation, as well as the possible influence of each of its incoming edges.
The node states evolve synchronously. The results of the simulation will be tested and
validated by expert biologists.

State of the Network. The state of the network at a given time is defined by a function
en(m,t) which associates to each node m its state at the moment t.
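\[
  en : (m,t) \mapsto
  \begin{cases}
    \mathit{Activation} \in \{\mathit{True}, \mathit{False}\} & \text{if } m \in M_G, \\
    [c_m(t)] \in \mathbb{R} & \text{if } m \in M_P \cup M_M.
  \end{cases}
\]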
 – For all m ∈ MP ∪ MM :     en(m,t) = [cm (t)] ∈ R   where: cm (t): the value of the
   concentration of the molecule denoted by the node m at a given time t.

 – For all m ∈ MG :     en(m,t) = Activation    where: Activation ∈ {True, False}.
   Associating a gene with a concentration is not meaningful. Instead, a gene may
   have two specific states, activated or not.
We define ER(t), the state of the biomolecular network at an instant t, as the set of
the states of all the components of the network at that time t.
                       ER(t) = ⟨en(m1 ,t), en(m2 ,t), ..., en(mn ,t)⟩

Transition of the Network State. For a node m ∈ M, we define ie(m) (resp. oe(m))
the set of incoming edges (resp. outgoing edges) on m, defined as follows:
             ie(m) = {i | d(i) = m}          and         oe(m) = {i | s(i) = m}
We also define Pred(m), the set of predecessor nodes of the node m, such that:
                    Pred(m) = {n ∈ M; ∃i ∈ I | s(i) = n and d(i) = m}
     The state of a node at time t + 1 depends on its state at time t, as well as the possible
influence of each one of its incoming edges. This influence obviously depends on the
state of the starting node of the arc in question.
     For each node m, we define an aggregate function Am (relating to the node m) which
calculates the evolution of the node status between two successive instants of the simu-
lation. This aggregate function Am depends on the current state of the node m, the state
of its predecessor nodes Pred(m) and the characteristics of its incoming edges ie(m).
                 en(m,t + 1) = Am (en(m,t), ie(m), en(n,t) ; n ∈ Pred(m))
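
As an illustration, the definitions of ie(m), oe(m), Pred(m) and of one synchronous transition can be sketched in Python as follows. This is a minimal sketch under our own assumptions: edges are reduced to (s(i), d(i)) pairs, and the three aggregate functions of the toy network are hypothetical placeholders, since the formalism leaves the concrete form of Am open.

```python
def ie(m, edges):
    """Incoming edges of m: all edges i with d(i) = m."""
    return [e for e in edges if e[1] == m]


def oe(m, edges):
    """Outgoing edges of m: all edges i with s(i) = m."""
    return [e for e in edges if e[0] == m]


def pred(m, edges):
    """Predecessor nodes of m: all n such that some edge goes from n to m."""
    return {e[0] for e in edges if e[1] == m}


def step(state, edges, aggregates):
    """Compute ER(t+1) from ER(t) synchronously.

    Every aggregate function only reads the old snapshot `state`,
    so the order in which nodes are visited does not matter.
    """
    return {
        m: aggregates[m](
            state[m],
            ie(m, edges),
            {n: state[n] for n in pred(m, edges)},
        )
        for m in state
    }


# Toy network g -> p -> m (a gene, a protein, a metabolite).
edges = [("g", "p"), ("p", "m")]
aggregates = {
    # Hypothetical rules, chosen only to make the example concrete.
    "g": lambda own, incoming, preds: own,
    "p": lambda own, incoming, preds: own + (1.0 if preds["g"] else 0.0),
    "m": lambda own, incoming, preds: own + 0.5 * preds["p"],
}

er0 = {"g": True, "p": 0.0, "m": 0.0}
er1 = step(er0, edges, aggregates)
er2 = step(er1, edges, aggregates)
```

Because the update is synchronous, the metabolite m only reacts to the protein p one step later: its value at step 1 is still computed from the concentration of p at step 0.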

Network Steering. A state transition in the network occurs when the state of at least
one of its nodes changes. The state of a node (for instance, the concentration of the
molecule) can change either through an internal stimulus (for example, due to reactions
that are internal to the cell), already captured by the aggregate function (Section
2.1), or through an external stimulus generated outside the cell (for example, a
medicine taken by the patient). A stimulus is an event that can change the state of the
molecule on which it operates and, therefore, the state of the whole biomolecular
network (changing a node modifies other network nodes in subsequent transitions).
    An external stimulus S is a triplet [t, m, ∆c ], where:

    – t is the time of introduction of the stimulus S.
    – m is the node targeted by the stimulus S.
    – ∆c is the change in concentration caused by the stimulus S and which depends on
      the type of the node:
        • If m ∈ MG , ∆c determines the activation or deactivation of a gene: ∆c ∈
            {Activated, Deactivated}.
        • Else, if m ∈ MP ∪ MM , ∆c represents the change of the concentration caused by
            the stimulus S:    ∆c ∈ R .

    We denote by ER(t), with t ∈ N, the state of the network at time T(t) = t0 + t·∆T
(where ∆T is the time step and t0 the initial time of the simulation).
    To simulate the different transition states of a biomolecular network, we give a state
ER(0) at time t0 and a time step size ∆T . Then the successive states ER(t + 1) are
calculated from the current state ER(t) according to the interactions and the aggregate
functions defined by the network, and the external stimuli.
    At a given time t + 1, for each m ∈ M we have:
    – If there is no external stimulus at time t for the node m, then:
             en(m,t + 1) = Am (en(m,t), ie(m), en(n,t))           where: n ∈ Pred(m)

    – Else
       • If m ∈ MG :
                                         en(m,t + 1) = ∆c
       • Else (if m ∈ MP ∪ MM ):
           en(m,t + 1) = Am (en(m,t), ie(m), en(n,t) ) + ∆c          where: n ∈ Pred(m)


Behaviour. The behaviour of the biomolecular network CR[t0 ,tn ] is given by the se-
quence of its successive states during the simulation time.

                          CR[t0 ,tn ] = [ER(0), ER(1), ..., ER(n)]

Indeed, the behaviour of the network extends between two distinct instants t0 and tn
forming the simulation interval [t0 ,tn ].
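
The transition rule with external stimuli and the resulting behaviour CR can be sketched as a short simulation loop. The Python fragment below is only a sketch under stated assumptions: stimuli are stored in a dictionary keyed by (t, m), at most one stimulus per node and step is supported, gene stimuli are encoded as True (Activated) or False (Deactivated), and the aggregate functions of the toy network are hypothetical.

```python
def real_time(t, t0, dt):
    """Map the step index t to the simulated time T(t) = t0 + t * dt."""
    return t0 + t * dt


def simulate(initial, edges, aggregates, stimuli, n_steps):
    """Return the behaviour [ER(0), ER(1), ..., ER(n_steps)].

    `stimuli` maps (t, m) to the change applied by an external stimulus
    introduced at step t on node m (our own encoding of the triplet).
    """
    behaviour = [dict(initial)]
    for t in range(n_steps):
        current = behaviour[-1]
        following = {}
        for m, own in current.items():
            incoming = [e for e in edges if e[1] == m]
            preds = {s: current[s] for (s, _) in incoming}
            delta = stimuli.get((t, m))
            if delta is None:
                # No external stimulus: apply the aggregate function only.
                following[m] = aggregates[m](own, incoming, preds)
            elif isinstance(own, bool):
                # Gene: the stimulus directly sets the activation.
                following[m] = delta
            else:
                # Protein or metabolite: aggregate result plus the change.
                following[m] = aggregates[m](own, incoming, preds) + delta
        behaviour.append(following)
    return behaviour


# Toy network g -> p -> m with hypothetical aggregate functions.
edges = [("g", "p"), ("p", "m")]
aggregates = {
    "g": lambda own, incoming, preds: own,
    "p": lambda own, incoming, preds: own + (1.0 if preds["g"] else 0.0),
    "m": lambda own, incoming, preds: own + 0.5 * preds["p"],
}
stimuli = {(1, "g"): False, (2, "p"): -0.5}

cr = simulate({"g": True, "p": 0.0, "m": 0.0}, edges, aggregates, stimuli, 3)
```

In this toy run, the stimulus at step 1 deactivates the gene, so the protein stops accumulating from step 2 onwards, and the stimulus at step 2 lowers the protein concentration by 0.5.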


3     A Semantic Approach for Analysing the Transittability of
      Complex Biomolecular Networks

Modeling the behaviour of complex biomolecular networks requires, first and foremost,
formalizing the domain knowledge. However, simply describing this knowledge is not
sufficient: studying the behaviour of biomolecular networks also calls for appropriate
semantic structures to describe their components. A formalized language such as
ontologies not only provides a rich description but also makes reasoning possible. To
this end, we propose a semantic architecture composed of four ontologies. Three of them
already exist in the literature: the Gene Ontology (GO) [15, 16], the Simple Event Model
Ontology (SEMO) [17] and the Time Ontology (TO) [19]. The fourth, the Biomolecular
Network Ontology (BNO), is the one we are developing. Linked together, these ontologies
provide the necessary concepts for modeling the dynamic behaviour and the transition
states of a complex biomolecular network.


3.1    The Gene Ontology

In this study, the Gene Ontology6 is considered as a core ontology. It ensures the de-
scription and the classification of cellular components. It provides a structured terminol-
ogy for the description of gene functions and processes, and the relationships between
these components [20].
    We chose to use the Gene Ontology for the following reasons: (1) it is an initiative
of several genomic databases such as the Saccharomyces Genome database (SGD), the
Drosophila genome database (FlyBase), etc. to build a generic ontology for describing
the role of genes and proteins, (2) it is the most developed and most used ontology in
biology (since 2000), and (3) it provides annotation files for a large number of
cellular entities.
6 http://www.geneontology.org

3.2    The Simple Event Model Ontology

The Simple Event Model ontology7 proposed by Van Hage et al. [17] provides the
necessary knowledge for the description of events. The ontological architecture of the
Simple Event Model ontology consists of four basic classes: Event that specifies what
is happening, Actor that indicates the participants of an event, Place that describes the
location where the event happened, and Time that describes the moment.
    We chose to use the Simple Event Model ontology because it provides the necessary
concepts to describe and model events in various subject domains.


3.3    The Time Ontology

The Time ontology8 developed by Hobbs and Pan [18], [19] enables a more intuitive
use of the time dimension while making the most of semantic knowledge. It gives a rich
vocabulary to describe the topological relationships that may exist between time points
and intervals, and also provides information about time.
     The main classes of this temporal ontology are TemporalEntity, which has two
sub-classes Instant and ProperInterval, DurationDescription, DateTimeDescription,
TemporalUnit, etc. It also contains several properties such as hasDurationDescription,
intervalStarts, hasDateTimeDescription, etc.
     We chose to use the Time Ontology because its basic structure is not specific to a
particular application and because it is simple to adapt to our context.


3.4    The Biomolecular Network Ontology

To study the dynamic behaviour and the transition states of a biomolecular network, it
is necessary to model its domain knowledge. Therefore, we developed the Biomolecular
Network ontology. This ontology is the major contribution of this paper. It is intended
to describe the field of complex biomolecular networks exhaustively by describing the
static aspect of their structure. It was defined in collaboration with domain experts.
     Figure 1 presents the Biomolecular Network ontology. We use the graphical notation
for OWL ontologies defined by Brockmans et al. [22] and Bārzdiņš et al. [23], where
boxes are OWL classes, full lines are object properties and dotted lines are data
properties. Full lines can be labelled to indicate restrictions, meaning that the range
of the relationship is specialized. For the sake of clarity, only a few of the object
property restrictions are displayed in Figure 1. This domain ontology consists of five
main classes:

    – The class Biomolecular Network: This class includes the different types of complex
      biomolecular networks. As mentioned earlier in Section 1, the complex biomolecular
      network can be composed of Gene Regulatory networks (GRNs), Protein-Protein
      Interaction networks (PPINs) and Metabolic networks (MNs), which correspond to the
      following concepts: Genomic Network, Proteomic Network and Metabolomic Network.
7 http://semanticweb.cs.vu.nl/2009/11/sem/
8 https://www.w3.org/TR/owl-time/

   These types of networks can be connected to the other ontology concepts through
   three properties, has node that depicts its cellular components, has interaction that
   describes the interactions linked to its components and the property has node only
   that specifies exactly the nature and type of its components.
 – The class Node: This class contains the different types of cellular entities M that
   constitute the biomolecular network. We can identify three sub-classes: the
   Gene which describes the set MG , the Protein which models the set MP and the
   Metabolite which describes the set MM . This class is connected with the Node State
   class through the property has state.
 – The class Interaction: This class covers all the types of interactions that can
   occur among the nodes of the biomolecular network. It consists of two sub-classes:
   Intraomic Interaction, which covers the interactions between molecular components
   of the same type, and Interomic Interaction, which describes the interactions
   between molecular components of different types. This class is connected to the
   Node class via two properties, has source and has end.
 – The class Node State contains the possible states of the nodes. This class is com-
   posed of two sub-classes, the Concentration and the Activation.
 – The class Interaction Type specifies the type and the nature of the interaction
   among cellular components. This class is linked to the BNO:Interaction class
   through the property has type. To integrate the main Interaction ontology
   concepts (IO:Activity flow and IO:Process) with the Biomolecular Network
   ontology, we create an abstract class BNO:Interaction Type to generalise
   those two Interaction ontology concepts (Figure 1).


3.5   The Relations Among the Ontologies

Concepts in the Biomolecular Network ontology are linked to the Gene ontology classes.
The concepts of the Gene Ontology are used to enrich the definitions of the concepts of
the Biomolecular Network ontology through the equivalence relation owl:equivalentClass.
For example, as described in Figure 2(a), after inference the concept BNO:Protein
will be specialized by the concept GO:beta-galactosidase (GO:0009341) because the
BNO:Node concept is equivalent to the concept GO:cellular component (GO:0005575).
The Biomolecular Network ontology is also linked with the Simple Event Model ontology
through the BNO:Node concept: an SEM:event can stimulate a molecular entity
(represented by the concept BNO:Node). The Simple Event Model ontology will
be used to describe the states of BNO:Node and its behaviour.
    Moreover, the Time ontology (TO) has been integrated in the Simple Event Model
ontology. The concept sem:Time was made equivalent to the concept TO:TemporalEntity
which represents the root of the Time ontology. Hence, the property sem:hasTime will
connect the Simple Event Model ontology to the Time ontology and, as a consequence,
the diverse types of temporal concepts will be defined as specializations of the class
sem:Time. Figure 2(b) shows a use of this principle. Thus, we can exploit the wealth of
temporal concepts provided by this temporal ontology to describe the SEM:event class.
Using these relationships, it is possible to merge these ontologies and formalize the
knowledge necessary to study the state changes of the biomolecular network.
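
The effect of these equivalence links can be illustrated with a deliberately simplified sketch. The Python fragment below stores a handful of axioms as plain triples and derives superclasses by treating owl:equivalentClass as mutual rdfs:subClassOf. It is not an OWL reasoner, the helper superclasses is hypothetical, and the small set of triples is our own illustrative selection rather than an excerpt of the actual ontologies.

```python
from collections import defaultdict

SUBCLASS = "rdfs:subClassOf"
EQUIVALENT = "owl:equivalentClass"

# Illustrative axioms only; identifiers follow the prefixes used in the text.
triples = {
    ("BNO:Gene", SUBCLASS, "BNO:Node"),
    ("BNO:Protein", SUBCLASS, "BNO:Node"),
    ("BNO:Metabolite", SUBCLASS, "BNO:Node"),
    ("BNO:Node", EQUIVALENT, "GO:cellular_component"),
    ("GO:beta-galactosidase", SUBCLASS, "GO:cellular_component"),
    ("sem:Time", EQUIVALENT, "TO:TemporalEntity"),
    ("TO:Instant", SUBCLASS, "TO:TemporalEntity"),
}


def superclasses(cls, axioms):
    """Collect every class reachable from `cls` by following subClassOf,
    with each equivalentClass axiom read as subClassOf in both directions."""
    parents = defaultdict(set)
    for subject, predicate, obj in axioms:
        if predicate == SUBCLASS:
            parents[subject].add(obj)
        elif predicate == EQUIVALENT:
            parents[subject].add(obj)
            parents[obj].add(subject)
    found, stack = set(), [cls]
    while stack:
        for parent in parents[stack.pop()]:
            if parent != cls and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


go_term_parents = superclasses("GO:beta-galactosidase", triples)
instant_parents = superclasses("TO:Instant", triples)
```

With these axioms, the Gene Ontology term becomes a specialization of BNO:Node, and TO:Instant becomes a specialization of sem:Time, which mirrors the two merging principles described above.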


               Fig. 1. The Biomolecular Network Ontology (BNO).



Fig. 2. Example of merging: 2(a) The Gene ontology concepts to the Biomolecular Network
ontology concepts. 2(b) The Time ontology within the Simple Event Model ontology.


4   Conclusion

This paper proposes an approach for analysing and understanding the behaviour of
complex biomolecular networks over time. This approach combines a logical modeling of
biomolecular networks with a semantic approach that consists of four ontologies merged
together.
    We present a logic-based approach for modeling the dynamic behaviour of biomolec-
ular networks. This formalism is based on the three levels of analysis of systems theory:
structural, functional and behavioural modeling.
    Moreover, a semantic approach based on merging different ontologies helps overcome
the difficulties of studying the state changes of complex biomolecular networks and
their behaviour. We develop the Biomolecular Network Ontology (BNO) to describe the
static structure of complex biomolecular networks and merge it with the Gene Ontology
(GO) to provide structured terminologies for the description of cellular components. We
also chose the Simple Event Model Ontology (SEMO) to describe the events and stimuli
that can act on the network’s components, and integrated the Time Ontology (TO)
to study the different states of the biomolecular network and its nodes over time.


References

1. Aderem, A.: Systems Biology: Its Practice and Challenges. Cell. 121(4), 511-513 (2005)
2. Ratushny, A.V., Ramsey, S.A., Aitchison, J.D.: Mathematical modeling of biomolecular net-
  work dynamics. Network Biology: Methods and Applications. pp. 415-433 (2011)
3. Gillespie, D.T.: A general method for numerically simulating the stochastic time evolution of
  coupled chemical reactions. Journal of computational physics. 22(4), 403-434 (1976)
4. Wilkinson, D.J.: Stochastic modelling for systems biology. CRC press. (2011)

5. Kauffman, S.A.: Metabolic stability and epigenesis in randomly constructed genetic nets.
  Journal of theoretical biology. 22(3), 437-467 (1969)
6. Zhao, Y., Kim, J., Filippone, M.: Aggregation algorithm towards large-scale boolean network
  analysis. IEEE Transactions on Automatic Control. 58(8), 1976-1985 (2013)
7. Husmeier, D.: Sensitivity and specificity of inferring genetic regulatory interactions from
  microarray experiments with dynamic bayesian networks. Bioinformatics. 19(17), 2271-2282
  (2003)
8. Chaouiya, C., Klaudel, H., Pommereau, F.: A modular, qualitative modeling of regulatory
  networks using Petri nets. In Modeling in Systems Biology. Springer London. pp. 253-279
  (2011)
9. Courtot, M., Juty, N., Knüpfer, C., Waltemath, D., Zhukova, A., Dräger, A., ... & Hoops, S.:
  Controlled vocabularies and semantics in systems biology. Molecular systems biology, 7(1),
  543 (2011)
10. Knüpfer, C., & Beckstein, C.: Function of dynamic models in systems biology: linking struc-
  ture to behaviour. Journal of biomedical semantics, 4(1), 1 (2013)
11. Le Moigne, J. L.: La théorie du système général: théorie de la modélisation. jeanlouis le
  moigne-ae mcx. (1994)
12. Wu, F.X., Wu, L., Wang, J., Liu, J., Chen, L.: Transittability of complex networks and its
  applications to regulatory biomolecular networks. Scientific reports. 4, 4819 (2014)
13. Landeghem, S.V., Parys, T.V., Dubois, M., Inzé, D., de Peer, Y.V.: Diffany: an ontology
  driven framework to infer, visualise and analyse differential molecular networks. BMC
  Bioinformatics. 17(1), 1–12 (2016)
14. Ayadi, A., Zanni-Merk, C., de Bertrand de Beuvron, F., Krichen, S.: Logical and Semantic
  Modeling of Complex Biomolecular Networks. Procedia Computer Science. vol. 396,
  pp. 475-484 (2016)
15. Smith, B., Williams, J., Schulze-Kremer, S.: The ontology of the gene ontology. In: AMIA.
  vol. 3, pp. 609–613 (2003)
16. Consortium, G.O., et al.: The gene ontology project in 2008. Nucleic acids research. 36(suppl
  1), D440–D444 (2008)
17. Van Hage, W. R., Malaisé, V., Segers, R., Hollink, L. and Schreiber, G.: Design and use
  of the simple event model (sem). Web Semantics: Science, Services and Agents on the World
  Wide Web. 9(2), 128–136 (2011)
18. Hobbs, J.R., Pan, F.: An ontology of time for the semantic web. ACM Transactions on Asian
  Language Information Processing (TALIP). 3(1), 66–85 (2004)
19. Hobbs, J.R., Pan, F.: Time ontology in owl. W3C working draft. 27, 133 (2006)
20. Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P.,
  Dolinski, K., Dwight, S.S., Eppig, J.T., et al.: Gene ontology: tool for the unification of biology.
  Nature genetics. 25(1), 25–29 (2000)
21. Duncan, J., Eilbeck, K., Narus, S.P., Clyde, S., Thornton, S., Staes, C.: Building an ontology
  for identity resolution in healthcare and public health. Online journal of public health informat-
  ics. 7(2) (2015)
22. Brockmans, S., Volz, R., Eberhart, A., Löffler, P.: Visual modeling of OWL DL ontologies
  using UML. In: International Semantic Web Conference. pp. 198–213. Springer (2004)
23. Bārzdiņš, J., Bārzdiņš, G., Čerāns, K., Liepiņš, R., Sproģis, A.: UML style graphical
  notation and editor for OWL 2. In Perspectives in Business Informatics Research.
  Springer Berlin Heidelberg, 102-114 (2010)