=Paper=
{{Paper
|id=Vol-1636/paper-09
|storemode=property
|title=Completing Signaling Networks by Abductive Reasoning with Perturbation Experiments
|pdfUrl=https://ceur-ws.org/Vol-1636/paper-09.pdf
|volume=Vol-1636
|authors=Adrien Rougny,Yoshitaka Yamamoto,Hidetomo Nabeshima,Gauvain Bourgne,Anne Poupon,Katsumi Inoue,Christine Froidevaux
|dblpUrl=https://dblp.org/rec/conf/ilp/RougnyYNBPIF15
}}
==Completing Signaling Networks by Abductive Reasoning with Perturbation Experiments==
<pdf width="1500px">https://ceur-ws.org/Vol-1636/paper-09.pdf</pdf>
<pre>
     Completing signaling networks by abductive
      reasoning with perturbation experiments

   Adrien Rougny (1), Yoshitaka Yamamoto (2,3), Hidetomo Nabeshima (2),
   Gauvain Bourgne (4), Anne Poupon (5), Katsumi Inoue (6),
   Christine Froidevaux (1)

   (1) Laboratoire de Recherche en Informatique, CNRS, Université Paris-Sud
   (2) University of Yamanashi
   (3) JST Presto
   (4) LIP6, CNRS, Université Pierre et Marie Curie
   (5) BIOS group, INRA, CNRS
   (6) National Institute of Informatics


1     Introduction

Signaling networks model the flow of information occurring in cells after they
have been stimulated by an extracellular signal (for instance, a hormone). With
the rise of available high-throughput data, such networks keep growing larger
and more complex. Consequently, automatic methods have become necessary for
their analysis. Methods relying on discrete formalisms, such as logical ones,
seem well suited, as numerical parameters are often difficult to obtain.
    One fundamental task of cell signaling biology is to test whether available
experimental data can be explained by a given signaling network and, if not, to
modify the network (by adding or removing an edge) or clean the data so that it
can be explained. Methods to accomplish this task take as input a representation
of signaling networks called an interaction graph (IG). In IGs, nodes are
molecules (or activities that originate from molecules) and arcs are positive or
negative influences that molecules have on each other.
    Existing methods [4, 10, 11, 12, 14] differ mainly in four aspects. (i) The
semantics used to interpret IGs: in [11] and [14, 12], the authors interpret the
arcs of an input IG under the path semantics introduced in [13] or under the
Sign Consistency Model [15], respectively, whereas the authors of [4, 10]
interpret IGs as causality networks, which implies the use of a more constrained
semantics. The authors of [14] also consider the problem within the Boolean
network semantics. (ii) The experimental data they take into account: the
methods of [11, 4] take as input steady-state shift experiments, whereas the
methods of [14, 12, 10] take into account perturbation experiments. (iii) The
modifications of the network or the cleaning of the data they propose in order
to explain unexplained experimental results: the methods in [11, 12] provide
possible modifications of the input network or of the data, whereas the methods
in [4, 10] allow only the completion of the input network by addition of edges.
(iv) The formalism they use: graph theory [14], integer linear programming [12],
answer set programming [11] and first-order logic [4, 10].


    In this work, we propose a method to check whether a set of perturbation
experiments can be explained by a signaling network represented in SBGN-AF
[16] and, if not, to complete the network by adding edges.
    SBGN-AF is a standard for representing signaling and gene regulation
networks. It extends the classical IG representation with logical operators (the
AND and the OR operators) that make it possible to specify logical functions
within the graph. Taking SBGN-AF maps as input, we extend the path semantics of
[13] by taking these operators into account in the definition of paths, which we
formalize in first-order logic, based on the translation of SBGN-AF maps into
predicates introduced in [8]. We also interpret perturbation experiments under a
stronger assumption than in [14, 12, 10], resulting in a more constrained
setting (cf. Section 3). We perform both the explanation and the completion
tasks within the same abductive framework by using the consequence finding
method of SOLAR [7].

2     Paths semantics with logical operators
Positive and negative paths of an SBGN-AF map are built by transitive closure
of the elementary arcs. We interpret a positive path from an activity A to an
activity B, denoted by stimulates•(A, B), as a possibility to explain an
increase (resp. decrease) of B by an increase (resp. decrease) of A.
Analogously, we interpret a negative path from A to B, denoted by
inhibits•(A, B), as a possibility to explain a decrease (resp. increase) of B by
an increase (resp. decrease) of A. The following axioms allow building positive
and negative paths using the influences and the logical operators of an SBGN-AF
map.
                                                                                    •
                                                      stimulates(A, B) → stimulates (A, B)    (1)
                                                                                •
                                                        inhibits(A, B) → inhibits (A, B)      (2)
                                         •                                          •
                               stimulates (A, B) ∧ stimulates(B, C) → stimulates (A, C)       (3)
                                             •                                      •
                                    inhibits (A, B) ∧ inhibits(B, C) → stimulates (A, C)      (4)
                                             •                                  •
                                 stimulates (A, B) ∧ inhibits(B, C) → inhibits (A, C)         (5)
                                         •                                      •
                                 inhibits (A, B) ∧ stimulates(B, C) → inhibits (A, C)         (6)
                                        •                      •                    •
    and(O) ∧ input(A, B, O) ∧ stimulates (C, A) ∧ stimulates (C, B) → stimulates (C, O)       (7)
                                                               •                •
                           and(O) ∧ input(A, B, O) ∧ inhibits (C, A) → inhibits (C, O)        (8)
                                                               •                •
                          and(O) ∧ input(A, B, O) ∧ inhibits (C, B) → inhibits (C, O)         (9)
                                                               •                    •
                         or(O) ∧ input(A, B, O) ∧ stimulates (C, A) → stimulates (C, O)      (10)
                                                               •                    •
                         or(O) ∧ input(A, B, O) ∧ stimulates (C, B) → stimulates (C, O)      (11)
                                             •                 •                •
           or(O) ∧ input(A, B, O) ∧ inhibits (C, A) ∧ inhibits (C, B) → inhibits (C, O)      (12)


   Axioms (1-6) are the main transitivity axioms, while axioms (7-9) and (10-12)
express the semantics of the AND and the OR logical operators, respectively.
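
As an illustration only, axioms (1-12) can be read as forward-chaining rules
and applied until a fixpoint is reached. The sketch below is not the paper's
implementation (which relies on SOLAR); the function name path_closure and the
encoding of arcs as Python tuples are assumptions made for this example.

```python
def path_closure(stimulates, inhibits, and_ops=(), or_ops=()):
    """Forward-chain axioms (1)-(12) until no new path is derived.

    stimulates, inhibits: sets of (source, target) elementary arcs.
    and_ops, or_ops: (A, B, O) triples standing for input(A, B, O) on an
    AND (resp. OR) operator node O (hypothetical encoding).
    Returns (pos, neg): sets of pairs for stimulates* and inhibits*.
    """
    and_ops, or_ops = list(and_ops), list(or_ops)
    pos = set(stimulates)  # axiom (1)
    neg = set(inhibits)    # axiom (2)
    changed = True
    while changed:
        new_pos, new_neg = set(), set()
        # Axioms (3)-(6): extend an existing path by one elementary arc.
        for (a, b) in pos:
            new_pos |= {(a, c) for (b2, c) in stimulates if b2 == b}  # (3)
            new_neg |= {(a, c) for (b2, c) in inhibits if b2 == b}    # (5)
        for (a, b) in neg:
            new_pos |= {(a, c) for (b2, c) in inhibits if b2 == b}    # (4)
            new_neg |= {(a, c) for (b2, c) in stimulates if b2 == b}  # (6)
        # Axioms (7)-(9): AND needs both inputs stimulated, one inhibited.
        for (a, b, o) in and_ops:
            new_pos |= {(c, o) for (c, x) in pos if x == a and (c, b) in pos}
            new_neg |= {(c, o) for (c, x) in neg if x in (a, b)}
        # Axioms (10)-(12): OR needs one input stimulated, both inhibited.
        for (a, b, o) in or_ops:
            new_pos |= {(c, o) for (c, x) in pos if x in (a, b)}
            new_neg |= {(c, o) for (c, x) in neg if x == a and (c, b) in neg}
        changed = not (new_pos <= pos and new_neg <= neg)
        pos |= new_pos
        neg |= new_neg
    return pos, neg


# Toy map (made up): s stimulates a and b, a inhibits x,
# and the AND node o over (a, b) stimulates t.
pos, neg = path_closure(
    stimulates={("s", "a"), ("s", "b"), ("o", "t")},
    inhibits={("a", "x")},
    and_ops=[("a", "b", "o")],
)
```

In this toy map, s reaches t positively only because both inputs of the AND
node are stimulated by s; dropping the arc from s to b removes that path.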

3     Formalization of experimental observations
We consider experimental observations that originate from perturbation
experiments. Such experiments consist in comparing the rate of an activity aT
between two batches of cells, each having received a particular treatment. In
the control batch, cells are stimulated by a set of molecules, whose
corresponding set of activities is denoted by S. In the experimental batch,
cells are first treated with a number of inhibitors that suppress a set of
activities denoted by KO. The cells are then stimulated as in the control batch.
We introduce a variable e that takes the value ↓ (resp. ↑) if and only if (iff)
the rate of aT is lower (resp. higher) in the experimental batch than in the
control batch. We denote such an experimental observation by the tuple
(S, KO, aT , e).
    For a given experimental observation E = (S, KO, aT , e), if e =↓, then aT is
more inhibited or less stimulated by the activities of S in the experimental batch
than in the control batch due to the suppression of at least one activity of KO.
In the cells of the experimental batch, as all activities of KO are suppressed,
they can no longer be performed by the cells. Consequently, the lower overall
stimulation of aT can only be caused by suppressing at least one positive path
from an activity of S to aT . Thus, e =↓ iff there exists at least one positive path
outgoing from an activity of S, incoming to aT , and passing through an activity
of KO. Analogous reasoning is made for e =↑, hence e =↑ iff there exists at least
one negative path from an activity of S to aT and passing through an activity
of KO.
    Here, we make the hypothesis that suppressing the activities of KO has
an effect on the pathways that link the activities of S to aT . That is not the
case in [14, 12, 10], where the authors assume that suppressing the activities
of KO only affects the pathways between the activities of KO and aT , thus not
taking into account the activities of S. As a result, our interpretation is
more constrained. Therefore, some experiments that can be explained by a
network under the interpretation made in [14, 12, 10] can no longer be
explained within our setting, which leads to the discovery of new possible arcs.
    To explicitly describe the role of S, we add one virtual activity node aS to
the prior network so that for each activity a ∈ S, we add a stimulation arc
from aS to a. According to our interpretation of perturbation experiments and
our transitivity axioms, each experimental observation E = (S, KO, aT , e) is
formalized as the following disjunction OE :


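\begin{equation}
O_E =
\begin{cases}
\displaystyle\bigvee_{a_{KO} \in KO}
  \Big[ \big(\mathit{stimulates}^{\bullet}(a_S, a_{KO}) \wedge \mathit{inhibits}^{\bullet}(a_{KO}, a_T)\big)
  \vee \big(\mathit{inhibits}^{\bullet}(a_S, a_{KO}) \wedge \mathit{stimulates}^{\bullet}(a_{KO}, a_T)\big) \Big]
  & \text{if } e = \uparrow; \\[1ex]
\displaystyle\bigvee_{a_{KO} \in KO}
  \Big[ \big(\mathit{stimulates}^{\bullet}(a_S, a_{KO}) \wedge \mathit{stimulates}^{\bullet}(a_{KO}, a_T)\big)
  \vee \big(\mathit{inhibits}^{\bullet}(a_S, a_{KO}) \wedge \mathit{inhibits}^{\bullet}(a_{KO}, a_T)\big) \Big]
  & \text{if } e = \downarrow.
\end{cases}
\tag{13}
\end{equation}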
    Given an SBGN-AF map N and an experimental observation E = (S, KO, aT , e),
we want to check if E can be explained by N . If not, we want to find a minimal
set of arcs that complete N in order to explain E. Both tasks can be realized
within the same abductive setting, presented hereafter.
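
The disjunction (13) can be evaluated directly once the positive and negative
paths of a map are known. The following is a minimal sketch under stated
assumptions: the function name explains, the "up"/"down" encoding of e, and the
node name "aS" for the virtual activity are choices made for this example, and
the path sets are assumed to have been computed beforehand from axioms (1-12).

```python
def explains(pos, neg, knocked_out, target, effect, source="aS"):
    """Evaluate the disjunction O_E of equation (13).

    pos, neg: sets of (x, y) pairs standing for stimulates*(x, y) and
    inhibits*(x, y), with the virtual node `source` already linked to S.
    knocked_out: the set KO; target: the activity aT.
    effect: "up" for e = up-arrow, "down" for e = down-arrow.
    """
    if effect not in ("up", "down"):
        raise ValueError("effect must be 'up' or 'down'")
    for ko in knocked_out:
        s_pos = (source, ko) in pos   # stimulates*(aS, aKO)
        s_neg = (source, ko) in neg   # inhibits*(aS, aKO)
        t_pos = (ko, target) in pos   # stimulates*(aKO, aT)
        t_neg = (ko, target) in neg   # inhibits*(aKO, aT)
        if effect == "up" and ((s_pos and t_neg) or (s_neg and t_pos)):
            return True  # a negative path from aS to aT runs through aKO
        if effect == "down" and ((s_pos and t_pos) or (s_neg and t_neg)):
            return True  # a positive path from aS to aT runs through aKO
    return False


# Toy path sets (made up): aS reaches k positively, k reaches t positively.
toy_pos = {("aS", "k"), ("k", "t"), ("aS", "t")}
toy_neg = set()
down_ok = explains(toy_pos, toy_neg, {"k"}, "t", "down")
up_ok = explains(toy_pos, toy_neg, {"k"}, "t", "up")
```

With these toy paths, knocking out k can account for a decrease of t but not
for an increase, which is the asymmetry the two cases of (13) encode.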


4   Abductive setting for the completion task

Let N be an SBGN-AF map and E = (S, KO, aT , e) be an experimental
observation. Let B be the background theory consisting of the translation of N
into predicates together with axioms (1-12), and let OE be the observation
formalized from E. Then, solving both the explanation and the completion tasks
consists in searching for all minimal hypotheses H such that B ∪ H ⊨ OE and
B ∪ H ⊭ ⊥. If B ⊨ OE , then clearly N explains E.
    For the computation of H, we can use the consequence finding system
SOLAR [7], which allows one to define a set of abducibles by means of a language
bias P describing the negations of desirable hypotheses, and to search for all
the subsumption-minimal hypotheses belonging to P. In the completion task, every
added influence is either a stimulation or an inhibition. In addition, we
restrict the number of added influences to at most two for each observation in
order to get more realistic hypotheses that could be tested experimentally. P is
then given in the form ⟨{¬stimulates(_, _), ¬inhibits(_, _)}, Length ≤ 2⟩, where
Length is the number of literals (i.e., instances of ¬stimulates(_, _) and
¬inhibits(_, _)) allowed in the hypothesis.
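
To make the search space concrete, the sketch below enumerates ground
hypotheses of at most two added influences by brute force and keeps the
subset-minimal ones. It is not SOLAR and does not perform consequence finding:
for ground hypotheses, subset-minimality is used here as a stand-in for
subsumption-minimality, and the callback is_explanation (an assumed name) is
taken to decide both B ∪ H ⊨ OE and the consistency of B ∪ H.

```python
from itertools import combinations


def minimal_hypotheses(candidates, is_explanation, max_len=2):
    """Return the subset-minimal hypotheses with at most max_len literals.

    candidates: iterable of ground influences, e.g. ("stimulates", "x", "y").
    is_explanation: callable taking a frozenset of influences and returning
    True when adding them to the map explains the observation.
    """
    candidates = list(candidates)
    found = []
    # Enumerate by increasing size so that smaller explanations come first.
    for size in range(max_len + 1):
        for combo in combinations(candidates, size):
            h = frozenset(combo)
            if any(prev <= h for prev in found):
                continue  # h contains a smaller explanation, so not minimal
            if is_explanation(h):
                found.append(h)
    return found


# Toy setting (made up): either arc `a` alone, or arcs `b` and `c` together,
# are enough to explain the observation.
a = ("stimulates", "x", "t")
b = ("inhibits", "y", "z")
c = ("inhibits", "z", "t")
toy_result = minimal_hypotheses(
    [a, b, c],
    lambda h: a in h or (b in h and c in h),
)
```

An empty hypothesis is tried first, so a result equal to [frozenset()] means
the map already explains the observation without any completion.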
    In general, SOLAR can produce a large number of hypotheses. To reduce
this number, we perform a first selection that operates either directly at the
generation step or during a post-filtering step. We discard hypotheses that
generate a loop in the prior map, as well as those that contain constants mapped
to logical operators or the constant aS . We then use a greedy algorithm to
select hypotheses in decreasing order of the number of experimental observations
they can explain.
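
The greedy selection is not specified further in the text. One plausible
reading, sketched here as an assumption rather than as the authors' algorithm,
is a set-cover style loop that repeatedly picks the hypothesis explaining the
largest number of observations not yet covered. The name greedy_select and the
dictionary encoding are hypothetical.

```python
def greedy_select(explained_by):
    """Greedy, set-cover style selection of hypotheses.

    explained_by: dict mapping each hypothesis to the set of observation
    identifiers it explains.
    Returns the selected hypotheses in the order they were picked.
    """
    uncovered = set()
    for observations in explained_by.values():
        uncovered |= observations
    selected = []
    while uncovered:
        # Pick the hypothesis covering the most still-uncovered observations;
        # ties go to the first such hypothesis in dictionary order.
        best = max(explained_by, key=lambda h: len(explained_by[h] & uncovered))
        selected.append(best)
        uncovered -= explained_by[best]
    return selected


# Toy input (made up): three hypotheses over four observations.
toy_selection = greedy_select({
    "h1": {1, 2, 3},
    "h2": {3, 4},
    "h3": {4},
})
```

Other readings are possible, for instance a plain ranking by the total number
of explained observations followed by a cut-off.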


5   Application: the FSHR-induced network

We applied our method to two pathways of the FSHR-induced signaling network,
namely the G protein pathway and the PI3K pathway, taken from [1] (see Fig. 1).
We built a dataset of 29 experimental observations by gathering and formalizing
reliable experimental results from the literature related to the FSHR. For
each experiment, only one activity suppressor was used. Consequently, for each
experimental observation, the set KO is a singleton.
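
With a singleton KO = {aKO}, the outer disjunction of (13) has a single
disjunct, so each observation reduces to a disjunction of two conjunctions.
The following is only a direct instantiation of (13) for that case, not an
additional result:

```latex
\begin{align*}
e = \downarrow:\quad O_E &=
  \big(\mathit{stimulates}^{\bullet}(a_S, a_{KO}) \wedge \mathit{stimulates}^{\bullet}(a_{KO}, a_T)\big)
  \vee \big(\mathit{inhibits}^{\bullet}(a_S, a_{KO}) \wedge \mathit{inhibits}^{\bullet}(a_{KO}, a_T)\big) \\
e = \uparrow:\quad O_E &=
  \big(\mathit{stimulates}^{\bullet}(a_S, a_{KO}) \wedge \mathit{inhibits}^{\bullet}(a_{KO}, a_T)\big)
  \vee \big(\mathit{inhibits}^{\bullet}(a_S, a_{KO}) \wedge \mathit{stimulates}^{\bullet}(a_{KO}, a_T)\big)
\end{align*}
```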
    Among the 29 experimental observations, 17 could be explained by the
network, and the 12 remaining ones were used to complete the network. For each
of them, we computed minimal hypotheses sufficient to explain it when added to
the network. We ran SOLAR (ver. 2) on 12 machines (Intel Xeon E-1230 V2
(3.3GHz) and 8GB RAM) in parallel, with an execution time limit of 4 hours.
    Each of the 12 observations could be explained by hypotheses containing a
single influence, although more complex hypotheses were also generated.
Consequently, we chose to focus on the hypotheses containing a single influence.
Using our greedy algorithm, we ranked the more than 250 single-influence
hypotheses generated during the abduction phase, and selected 28 of them.
Results are shown in Table 1.


Table 1: Application to the FSHR-induced network. Lines correspond to
experimental observations, columns to selected hypotheses. A cell is green if
the hypothesis explains the observation. Experimental observations that are
explained by the network, as well as hypotheses (8-28), are omitted.

Selected hypotheses (columns 1-7):
  1     stimulates(p38mapk, pi3k)
  2-4   inhibits(p38mapk, erk12), inhibits(p38mapk, mek), inhibits(p38mapk, raf1)
  5-7   inhibits(mtor, camp_epac), inhibits(rps6, camp_epac), inhibits(rps6, rap1)

Experimental observations (rows 1-12):
   1   ({camp, epac}, pi3k, akt, ↓)
   2   ({camp, epac}, pi3k, rps6, ↓)
   3   ({camp, epac}, p38mapk, akt, ↓)
   4   ({camp, epac}, pi3k, p70s6k, ↓)
   5   ({camp, epac}, pi3k, rps6, ↓)
   6   ({fsh_fshr, epac}, pka, akt, ↑)
   7   ({fsh_fshr, epac}, p38mapk, akt, ↓)
   8   ({camp, epac}, pka, p70s6k, ↑)
   9   ({fsh_fshr, epac}, p38mapk, erk12, ↓)
  10   ({fsh_fshr, epac}, camp_epac, erk12, ↓)
  11   ({fsh_fshr, epac}, mek, p38mapk, ↓)
  12   ({camp, epac}, pka, akt, ↑)

[Cell shading of Table 1 (green = hypothesis explains observation) not reproduced.]

Figure 1: The FSHR-induced network, represented in SBGN-AF. The G protein
pathway is represented in red and the PI3K pathway in blue.

    Hypothesis (1) proposes that p38MAPK could activate PI3K. In [3], the
authors hypothesize such a crosstalk in granulosa cells. Moreover, activation
of Akt in Zn2+-treated cells has been shown to pass through PI3K downstream of
p38MAPK [9]. This result shows that p38MAPK is able to trigger the PI3K pathway
in Zn2+-treated cells, which reinforces our hypothesis for FSH-stimulated
cells. Hypotheses (2-4) all suggest an inhibitory crosstalk between p38MAPK and
the RAF/MEK/ERK pathway. In [5], the authors clearly state that p38MAPK
inhibits the RAF/MEK/ERK pathway during muscle differentiation, thus suggesting
a potential influence of p38MAPK on the RAF/MEK/ERK pathway. Hypotheses (5-28)
all suggest a crosstalk between the pathway downstream of MEK and the cAMP-EPAC
pathway. A crosstalk between ERK and cAMP has indeed been evidenced in [6],
even if it involves a feedback loop (excluded in our work).
    According to our literature review, top-ranked hypotheses are more promising
than low-ranked ones, which suggests that selecting hypotheses based on the
number of observations they can explain is appropriate.
    Interestingly, experimental results (1,2,4,5,8) would have been explained by
the network under the less constrained interpretation of experimental results
given in [14, 12, 10], and would therefore not have led to any hypothesis.


6      Concluding remarks
We have proposed a logical formalization of SBGN-AF maps and transitivity
axioms that make it possible to check, given an SBGN-AF map, whether some
experimental observations can be explained by the map and, if not, to generate
hypotheses that complete the map. Application to the FSHR-induced signaling
network shows that the method leads to plausible hypotheses, some of which have
already been demonstrated in other signaling systems, and thus that it is
promising.

References
 1. Gloaguen et al.: Mapping the follicle-stimulating hormone-induced signaling
    networks. Frontiers in endocrinology 2 (2011)
 2. Choi et al.: Gonadotropin-stimulated epidermal growth factor receptor
    expression in human ovarian surface epithelial cells: involvement of cyclic
    AMP-dependent exchange protein activated by cAMP pathway. Endocrine-related
    cancer 16(1), pp. 179–188 (2009)
 3. Gonzalez-Robayna et al.: Follicle-stimulating hormone (FSH) stimulates
    phosphorylation and activation of protein kinase B (PKB/Akt) and serum and
    glucocorticoid-induced kinase (Sgk): evidence for a kinase-independent
    signaling by FSH in granulosa cells. Molecular Endocrinology 14(8),
    pp. 1283–1300 (2000)
 4. Inoue, K., Doncescu, A., Nabeshima, H.: Completing causal networks by
    meta-level abduction. Machine learning 91(2), pp. 239–277 (2013)
 5. Lee et al.: Activation of p38 MAPK induces cell cycle arrest via inhibition
    of Raf/ERK pathway during muscle differentiation. Biochemical and
    biophysical research communications 298(5), pp. 765–771 (2002)
 6. Baillie et al.: Phorbol 12-myristate 13-acetate triggers the protein kinase
    A-mediated phosphorylation and activation of the PDE4D5 cAMP
    phosphodiesterase in human aortic smooth muscle cells through a route
    involving extracellular signal regulated kinase (ERK). Molecular
    Pharmacology 60(5), pp. 1100–1111 (2001)
 7. Nabeshima, H., Iwanuma, K., Inoue, K.: SOLAR: a consequence finding system
    for advanced reasoning. In: Automated Reasoning with Analytic Tableaux and
    Related Methods, pp. 257–263. Springer (2003)
 8. Rougny et al.: Analyzing SBGN-AF networks using normal logic programs.
    Logical Modeling of Biological Systems, pp. 325–361 (2013)
 9. Wu et al.: p38 and EGF receptor kinase-mediated activation of the
    phosphatidylinositol 3-kinase/Akt pathway is required for Zn2+-induced
    cyclooxygenase-2 expression. AJP-Lung Cellular and Molecular Physiology
    289(5), pp. L883–L889 (2005)
10. Yamamoto et al.: Completing SBGN-AF networks by logic-based hypothesis
    finding. In: Formal Methods in Macro-Biology, pp. 165–179. Springer (2014)
11. Gebser et al.: Repair and Prediction (under Inconsistency) in Large
    Biological Networks with Answer Set Programming. In: KR (2010, April)
12. Melas et al.: Detecting and removing inconsistencies between experimental
    data and signaling network topologies using integer linear programming on
    interaction graphs. PLoS computational biology 9(9), p. e1003204 (2013)
13. Klamt et al.: A methodology for the structural and functional analysis of
    signaling and regulatory networks. BMC bioinformatics 7(1), p. 56 (2006)
14. Samaga et al.: The logic of EGFR/ErbB signaling: theoretical properties and
    analysis of high-throughput data. PLoS Comput Biol 5(8), p. e1000438 (2009)
15. Siegel et al.: Qualitative analysis of the relation between DNA microarray
    data and behavioral models of regulation networks. Biosystems 84(2),
    pp. 153–174 (2006)
16. Mi et al.: Systems biology graphical notation: activity flow language
    level 1. Nature Precedings (2009)



</pre>