-

Specialized Predictor for Reaction Systems with Context Properties

Roberto Barbuti

Roberta Gori

Francesca Levi

Paolo Milazzo

0 0 Dipartimento di Informatica, Università di Pisa , Italy

31 43

Reaction systems are a qualitative formalism for modeling systems of biochemical reactions characterized by the non-permanency of the elements: molecules disappear if not produced by any enabled reaction. Reaction systems execute in an environment that provides new molecules at each step. Brijder, Ehrenfeucht and Rozemberg introduced the idea of predictors. A predictor of a molecule s, for a given n, is the set of molecules to be observed in the environment in order to determine whether s is produced or not by the system at step n. We introduced the notion of formula based predictor, that is a propositional logic formula that precisely characterizes environments that lead to the production of s after n steps. In this paper we revise the notion of formula based predictor by defining a specialized version that assumes the environment to provide molecules according to what expressed by a temporal logic formula. As an application, we use specialized formula based predictors to give theoretical grounds to previously obtained results on a model of gene regulation.

In the context of Natural Computing [ 11, 6 ], reaction systems [ 10, 2 ] have been proposed as a model for the description of biochemical processes driven by the interaction among reactions in living cells. A reaction system consists of a set of objects S representing molecules and a set of rewrite rules (called reactions) representing chemical reactions. A reaction is a triple (R, I , P ), where R, I and P are sets of objects representing reactants, inhibitors and products, respectively. A state of a reaction system is a subset of S representing available molecules. The presence of an object in the state expresses the fact that the corresponding biological entity, in the real system being modeled, is present in a number of copies as high as needed. This is called the threshold supply assumption and characterizes reaction systems as a qualitative modeling formalism.

The dynamics of a reaction system is given by occurrences of reactions (i.e. rewrite rule applications) based on two opposite mechanisms: facilitation and inhibition. Facilitation means that a reaction can occur only if all its reactants are present, while inhibition means that the reaction cannot occur if any of its inhibitors is present. The threshold supply assumption ensures that different reactions never compete for their reactants, and hence all the applicable reactions in a step are always applied. The application of a set of reactions results in the introduction of all of their products in the next state of the system. Reaction systems assume the non permanency of the elements, namely the next state consists only of the products of the reactions applied in the current step.

The overall behavior of a reaction system model is driven by the (set of) contextual elements which are received from the external environment at each step. Such elements join the current state of the system and, as the other objects in the system state, can enable or disable reactions. The computation of the next state of a reaction system is a deterministic procedure. However, since the contextual elements that can be received at each step can be any subset of the support set S, the overall system dynamics is non deterministic.

Theoretical aspects of reaction systems have been studied in [ 7, 3, 8, 4, 9, 13, 12 ]. In [ 3 ] Brijder, Ehrenfeucht and Rozemberg introduced the idea of predictor. Assume that one is interested in knowing whether an object s ∈ S will be present after n steps of execution of a reaction system. Since the only source of non-determinism are the contextual elements received at each step, observing such elements can allow us to predict the production of s after n steps. In general, not all contextual elements are relevant for determining if s will be produced. A predictor is hence the subset Q of S that is actually essential to be observed among contextual elements for predicting whether s will be produced after n steps or not.

In [ 1 ] we continued the investigation on predictors by introducing the new notion of formula based predictor. Formula based predictors consist in a propositional logic formula to be satisfied by the sequence of (sets of) elements provided by the environment. Satisfaction of the logic formula precisely discriminates the cases in which s will be produced after n steps from those in which it will not.

Formula based predictors (as well as standard predictors) do not assume anything about the elements provided by the environment. However, it is very often the case that the sequences of sets of object provided by the environment follow specific patterns or, more generally, have specific dynamical properties. For example, it might happen that some object are never provided by the environment, or that some objects are provided only after some others.

In this paper we revise formula based predictors by introducing specialized formula based predictors. The revised notion is specialized with respect to a temporal logic formula expressing the dynamical properties of the environment. More specifically, a specialized formula based predictor is a propositional logical formula that predicts the production of an object s after n steps, by considering only the subset of the context sequences satisfying the given temporal logic formula. A specialized predictor can be substantially a simpler formula than the corresponding (non-specialized) formula based predictor.

We illustrate specialized formula based predictors on a model of the lac operon expression in the E. coli bacterium originally proposed in [ 5 ]. 2

Reaction Systems

In this section we recall the basic definition of reaction systems [ 10, 2 ]. Let S be a finite set of symbols, called objects. A reaction is formally a triple (R, I, P ) with R, I, P ⊆ S, composed of reactants R, inhibitors I, and products P . We assume reactants and inhibitors to be disjoint (R ∩ I = ∅), otherwise the reaction would never be applicable. Reactants and inhibitors R ∪ I of a reaction are collectively called resources of such a reaction. The set of all possible reactions over a set S is denoted by rac(S). Finally, a reaction system is a pair A = (S, A), with S being the finite background set, and A ⊆ rac(S) being its set of reactions.

The state of a reaction system is described by a set of objects. Let a = (Ra, Ia, Pa) be a reaction and T a set of objects. The result resa(T ) of the application of a to T is either Pa, if T separates Ra from Ia (i.e. Ra ⊆ T and Ia ∩ T = ∅), or the empty set ∅ otherwise. The application of multiple reactions at the same time occurs without any competition for the used reactants (threshold supply assumption). Therefore, each reaction which is not inhibited can be applied, and the result of application of multiple reactions is cumulative. Formally, given a reaction system A = (S, A), the result of application of A to a set T ⊆ S is defined as resA(T ) = resA(T ) = Sa∈A resa(T ).

The dynamics of a reaction system is driven by the contextual objects, namely the objects which are supplied to the system by the external environment at each step. An important characteristic of reaction systems is the assumption about the nonpermanency of objects. Under such an assumption the objects carried over to the next step are only those produced by reactions. All the other objects vanish, even if they are not involved any reaction.

Formally, the dynamics of a reaction system A = (S, A) is defined as an interactive process π = (γ, δ), with γ and δ being finite sequences of sets of objects called the context sequence and the result sequence, respectively. The sequences are of the form γ = C0, C1, . . . , Cn and δ = D0, D1, . . . , Dn for some n ≥ 1, with Ci, Di ⊆ S, and D0 = ∅. Each set Di, for i ≥ 1, in the result sequence is obtained from the application of reactions A to a state composed of both the results of the previous step Di−1 and the objects Ci−1 from the context; formally Di = resA(Ci−1 ∪ Di−1) for all 1 ≤ i ≤ n. Finally, the state sequence of π is defined as the sequence W0, W1, . . . , Wn, where Wi = Ci ∪ Di for all 1 ≤ i ≤ n. In the following we d’say that γ = C0, C1, . . . , Cn is a n-step context sequence. 3

Preliminaries

In order to describe the causes of a given product, we use objects of reaction systems as propositional symbols of formulas. Formally, we introduce the set FS of propositional formulas on S defined in the standard way: S ∪ {true, f alse} ⊆ FS and ¬f1, f1 ∨ f2, f1 ∧ f2 ∈ FS if f1, f2 ∈ FS .

The propositional formulas FS are interpreted with respect to subsets of the objects C ⊆ S. Intuitively, s ∈ C denotes the presence of element s and therefore the truth of the corresponding propositional symbol. The complete definition of the satisfaction is as follows.

C |= true,

C |= f1 ∨ f2 iff either C |= f1 or C |= f2, In the following ≡l stands for the logical equivalence on propositional formulas FS . Moreover, given a formula f ∈ FS we use atom(f ) to denote the set of propositional symbols that appear in f and simpl(f ) to denote the simplified version of f . The simplified version of a formula is obtained applying the standard formula simplification procedure of propositional logic converting a formula to Conjunctive Normal Form. We recall that for any formula f ∈ FS the simplified formula simpl(f ) is equivalent to f , it is minimal with respect to the propositional symbols. Thus, we have f ≡l simpl(f ) and atom(simpl(f )) ⊆ atom(f ) and there exists no formula f ′ such that f ′ ≡l f and atom(f ′) ⊂ atom(simpl(f )).

The causes of an object in a reaction system are defined by a propositional formula on the set of objects S. First of all we define the applicability predicate of a reaction a as a propositional logic formula on S describing the requirements for applicability of a, namely that all reactants have to be present and inhibitors have to be absent. This is represented by the conjunction of all atomic formulas representing reactants and the negations of all atomic formulas representing inhibitors of the considered reaction. Definition 2. Let a = (R, I, P ) be a reaction with R, I, P ⊆ S for a set of objects S. The applicability predicate of a, denoted by ap(a), is defined as follows: ap(a) = Vsr∈R sr ∧

Vsi∈I ¬si .

The causal predicate of a given object s is a propositional formula on S representing the conditions for the production of s in one step, namely that at least one reaction having s as a product has to be applicable.

Definition 3. Let A = (S, A) be a r.s. and s ∈ S. The causal predicate of s in A, denoted by cause(s, A) (or cause(s), when A is clear from the context), is defined as follows1: cause(s, A) = W{(R,I,P )∈A|s∈P } ap ((R, I, P )) .

We introduce a simple reaction system as running example.

Example 1. Let A = ({A, . . . , G}, {a1, a2, a3}) be a reaction system with 1 We assume that cause(s) = f alse if there is no (R, I, P ) ∈ A such that s ∈ P . exactly by the context sequences leading to the production of s at step n + 1. Minimal formula based predictors can be calculated in an effective way.

Given a set of objects S, we consider a corresponding set of labelled objects S × IN. For the sake of legibility, we denote (s, i) ∈ S × IN simply as si and we introduce n Sn = Si=0 Si where Si = {si | s ∈ S}. Propositional formulas on labelled objects Sn describe properties of n-step context sequences. The set of propositional formulas on Sn, denoted by FSn , is defined analogously to the set FS (presented in Sect. 3) by replacing S with Sn. The satisfaction relation of Def. 1 applies also to formulas in FSn on subsets of Sn.

A labelled object si represents the presence (or the absence, if negated) of object s in the i-th element Ci of the n-step context sequence γ = C0, C1, . . . Cn. This interpretation leads to the following definition of satisfaction relation for propositional formulas on context sequences.

Definition 4. Let γ = C0, C1, . . . Cn be a n-step context sequence and f ∈ FSn a propositional formula. The satisfaction relation γ |= f is defined as

{si | s ∈ Ci, 0 ≤ i ≤ n} |= f .

As an example, let us consider the context sequence γ = C0, C1 where C0 = {A, C} and C1 = {B}. We have that γ satisfies the formula A0 ∧ B1 (i.e. γ |= A0 ∧ B1) while γ does not satisfy the formula A0 ∧ (¬B1 ∨ C1) (i.e. γ 6|= A0 ∧ (¬B1 ∨ C1)).

The latter notion of satisfaction allows us to define formula based predictor. Definition 5 (Formula based Predictor). Let A = (S, A) be a r.s. s ∈ S and f ∈ FSn a propositional formula. We say that f f-predicts s in n + 1 steps if for any n-step context sequence γ = C0, . . . , Cn

Note that if formula f f-predicts s in n + 1 steps and if f ′ ≡l f then also f ′ f-predicts s in n + 1. More specifically, we are interested in the formulas that f-predict s in n + 1 and contain the minimal numbers of propositional symbols, so that their satisfiability can easily be verified. This is formalised by the following approximation order on FSn . Definition 6 (Approximation Order). Given f1, f2 ∈ FSn we say that f1 ⊑f f2 if and only if f1 ≡l f2 and atom(f1) ⊆ atom(f2).

It can be shown that there exists a unique equivalence class of formulas based predictors for s in n + 1 steps that is minimal w.r.t. the order ⊑f .

We now define an operator fbp that allows formula based predictors to be effectively computed. Definition 7. Let A = (S, A) be a r.s. and s ∈ S. We define a function fbp : S × IN → FSn as follows: fbp(s, n) = fbs(cause(s), n), where the auxiliary function fbs : FS × IN → FSn is recursively defined as follows: fbs(s, 0) = s0 fbs((f ′), i) = (fbs(f ′, i)) fbs(¬f ′, i) = ¬fbs(f ′, i) fbs(true, i) = true fbs(s, i) = si ∨ fbs(cause(s), i − 1) if i > 0 fbs(f1 ∨ f2, i) = fbs(f1, i) ∨ fbs(f2, i) fbs(f1 ∧ f2, i) = fbs(f1, i) ∧ fbs(f2, i) fbs(f alse, i) = f alse The function fbp gives a formula based predictor that, in general, may not be minimal w.r.t. to ⊑f . Therefore, the calculation of a minimal formula based predictor requires the application of a standard simplification procedure to the obtained logic formula. Theorem 1. Let A = (S, A) be a r.s.. For any object s ∈ S, – fbp(s, n) f-predicts s in n + 1 steps; – simpl(fbp(s, n)) f-predicts s in n + 1 steps and is minimal w.r.t. ⊑f . Example 2. Let us consider again the reaction system of Ex. 1. We are interested in the production of E after 4 steps. Hence, we calculate the logic formula that f-predicts E in 4 steps applying the function fbp: fbp(E, 3) = fbs (G ∧ ¬B) ∨ (C ∧ D), 3 = fbs(G, 3) ∧ ¬fbs(B, 3) ∨ fbs(C, 3) ∧ fbs(D, 3) = (G3) ∧ ¬(B3 ∨ fbs(A, 2))) ∨ (C3 ∧ D3) = G3 ∧ ¬B3 ∧ ¬A2 ∨ (C3 ∧ D3) A context sequence satisfies fbp(E, 3) iff the execution of the reaction system leads to the production of object E after 4 steps. Furthermore, in this case the obtained formula is also minimal given that simpl(fbp(E, 3)) = fbp(E, 3). 5

The Temporal Logic for Context Sequences

We introduce a linear temporal logic for the description of properties of finite context sequences. In the logic, propositional formulas describe the properties of single contexts (i.e. the symbols that can/cannot appear in an element of a context sequence). Hence, such formulas play the role of state formulas in traditional temporal logics. Temporal properties are expressed by variants of the usual next and until operators, and by derived eventually and globally operators.

Definition 8 (Temporal Formulas). Let S be a set of objects. The syntax of temporal logic formulas on S is defined by the following grammar: ψ ::= f ψ ∨ ψ ψ ∧ ψ

Xψ ψUkψ

Fkψ

Gkψ where f ∈ FS and k ∈ IN ∪ {∞}. We denote with T LS the set of all temporal logic formulas on S. Given a context sequence γ = C0, . . . , Cn, the simple temporal formula f states that f is satisfied by C0, according to the definition of satisfiability of the propositional logic. Formula Xψ states that ψ is satisfied by the context sequence after one step, namely by C1, . . . , Cn. Formula ψ1Ukψ2 states that there exists k′ ≤ k such that C0, . . . , Ck′−1 satisfies ψ1 and Ck′ satisfies ψ2. Formula Fkψ states that there exists k′ ≤ k such that Ck′ satisfies ψ. Formula Gkψ states that for all i ≤ k context Ci satisfies ψ. Finally, operators ∨ and ∧ are as usual.

Note that the value of n has a significant role in the satisfiability of temporal formulas. In particular, if n < k we have that ψ1Ukψ2 is satisfied also if ψ1 is satisfied by all Ci with i ≤ n (never satisfying ψ2). Moreover, if n < k we have that Fkψ is always satisfied since, intuitively, ψ may be satisfied by a context Cj , with n < j ≤ k, that is not contained in γ. An analogous reasoning holds for Xψ when n = 0 (or XXψ when n = 1, etc...). As regards Gkψ, when n < k it is equivalent to Gnψ.

As usual in temporal logics, formulas Fkψ and Gkψ are actually syntactic sugar for trueUkψ and ψUkf alse, respectively. Moreover, if k ∈ IN, we have that also ψ1Ukψ2 can be rewritten into ψ2 ∨ (ψ1 ∧ X(ψ1Uk−1ψ2)), when k > 0, and ψ2, when k = 0. Consequently, in the semantics of the temporal logic, we can omit derived operators Fk and Gk, with k ∈ IN ∪ {∞}, and Uk with k ∈ IN.

The formal definition of satisfiability of temporal logic formulas on finite n-step context sequences is as follows.

Definition 9. Let γ = C0, C1, . . . , Cn be a n-step context sequence with n ≥ 0. Given a temporal logic formula ψ ∈ T LS , the satisfaction relation γ ⊢ ψ is inductively defined as follows: γ ⊢ f iff C0 |= f

γ ⊢ Xψ iff either n = 0 or γ′ ⊢ ψ γ ⊢ ψ1 ∧ ψ2 iff γ ⊢ ψ1 and γ ⊢ ψ2 γ ⊢ ψ1 ∨ ψ2 iff either γ ⊢ ψ1 or γ ⊢ ψ2 γ ⊢ ψ1U∞ψ2 iff either γ ⊢ ψ2

or γ ⊢ ψ1 and if n > 0 then γ′ ⊢ ψ1U∞ψ2 where γ′ = C0′, . . . , Cn′−1 with Ci′ = Ci+1 for 0 ≤ i ≤ n − 1.

Finally, we present an encoding of temporal formulas on objects S into propositional formulas on labelled objects Sn. The encoding depends on the parameter n ∈ IN reporting the length of the context sequences that we want to model.

Definition 10. Let S be a set of objects and n ∈ IN. The encoding of a temporal logic formula ψ ∈ T LS into a propositional logic formula on Sn is given by [[ψ]]0n where the function [[_]]_n : T LS × IN → FSn is defined as follows: [[f ]]in =⌊f ⌋i [[Xψ]]in = ([[ψ]]in+1 true if i < n if i = n [[ψ1 ∨ ψ2]]in =[[ψ1]]in ∨ [[ψ2]]i n [[ψ1 ∧ ψ2]]in = [[ψ1]]in ∧ [[ψ2]]i n ([[ψ2]]in ∨ [[ψ1]]in ∧ [[ψ1U∞ψ2]]in+1 [[ψ2]]in ∨ [[ψ1]]i n if i < n if i = n where ⌊_⌋i : FS → FSi is a function that replaces, in a given propositional logic formula f , every s ∈ S with the corresponding labelled object si ∈ Si.

The following theorem states the main property of the encoding of temporal formulas: the result of the encoding of a temporal formula is equivalent.

Theorem 2. Let γ = C0, . . . , Cn be a n-step context sequence. For any temporal formula ψ ∈ T LS it holds that γ ⊢ ψ ⇐⇒ γ |= [[ψ]]0n . 6

Specialized Formula Based Predictors

We specialize the notion of formula based predictor (given in Def. 5) w.r.t. a subset of context sequences characterized by a temporal logic formula ψ.

Definition 11 (Specialized Formula based Predictor). Let A = (S, A) be a r.s. s ∈ S, f ∈ FSn and ψ ∈ T LS . We say that f f-predicts s in n + 1 steps with respect to ψ iff for any n-step context sequence γ = C0, . . . , Cn such that γ ⊢ ψ we have that where δ = D0, . . . , Dn is the result sequence corresponding to γ and Dn+1 = resA(Cn ∪ Dn).

It should be clear that any formula f that f-predicts s in n + 1 steps also f-predicts s in n+1 steps with respect to any logical formula ψ. In particular, the formula fbp(s, n) (and its simplified version simpl(fbp(s, n)) is a specialized predictor for s in n + 1 steps for any ψ. However, the formula simpl(fbp(s, n)) typically is too general and therefore is not a minimal (w.r.t ⊑f ) formula based predictor specialized w.r.t. ψ.

Thus, we introduce a methodology that allows us to calculate minimal formula based predictors for an object s at n + 1 step, specialized w.r.t. ψ. The idea is to use the encoding of the temporal formula ψ computed with respect to the lenght of the context sequences n. The encoding [[ψ]]0n is a propositional formula on labelled objects Sn that models the n-step context sequences satisfying ψ. A formula based predictor specialized w.r.t. ψ can be derived by exploiting the encoding [[ψ]]0n to simplify a corresponding formula based predictor.

More formally, suppose that f f-predicts s in n + 1 steps. A formula f ′ f-predicts s in n + 1 steps w.r.t a temporal formula ψ whenever f ′ is logically equivalent to f considering the context sequences satisfying ψ, that is assuming that the conditions formalized by the formula [[ψ]]0n holds. The following theorem establishes this fundamental property of specialized formula based predictors.

Theorem 3. Let A = (S, A) be a r.s., s ∈ S, and f ∈ FSn such that f f-predicts s in n + 1 steps. Given ψ ∈ T LS and f ′ ∈ FSn such that

[[ψ]]0n ⇒ f ≡l f ′ we have that f ′ f-predicts s in n + 1 steps with respect to ψ.

We introduce a minimization algorithm that allows us to calculate a minimal formula f ′ that satisfies the property of Theorem 3, given a predictor f and a temporal logic formula ψ.

Let B = {true, f alse} and consider the incompletely specified booean function F : Bn → B composed of the following two disjoint subsets Y es, DontC of Bn. Y esF = {x1, .., xn | x1, .., xn is a boolean assignment that satisfies [[ψ]]0n ∧ f } is the subset of Bn where F is evaluated to 1 and DontCF = {x1, .., xn | x1, .., xn is a boolean assignment that satisfies ¬[[ψ]]0n} is a subset of Bn where the value of F is not specified. In each point of DontCF , called don’t care, F can assume the value 1 or 0. So there are m = 2|DontCF | different functions equivalent to F that may take the place of F according to particular needs. Consider any possible extention of F , let us call them F 1, ..., F m, obtained by considering Y esF i = Y esF ∪ Di with Di ⊆ DontCF and DontCF i = ∅. Each F i is now a completely specified boolean function and it can be shown that in this case its minimal sum of product form (SOP) also contains a minimal number of different variables. Hence, any algorithm that computes the minimal SOP form of F i (i.e. any well known algorithm looking for the prime implicants of F i) can be applied in order to minimize the number of variables of F i. Then consider the f ′ corresponding to the minimal SOP form between all the minimal SOP form of F 1, ..., F m, we are guaranteed that such f ′ has the minimal number of variables.

It is worth noting that considering all possible extensions of the incompletely specified boolean function F as well computing the SOP form of a completely specified booean function F i has an exponential cost. However, some heuristic that, in general, cannot guarantee minimality can be designed in order to improve the efficiency. Example 3. Let us consider the r.s. of Ex. 1 with the following reactions: Formula ψ1 describes an invariant property which holds at any step of the context sequence. The property models the situation in which object B is not a synthesizable product, i.e., it can not be found in the environment but it has to be produced by the reactions. Moreover, if the environment supplies C it also supplies D at the same time. Similarly, formula ψ2 shows that the environment always supply the object B. By contrast, formula ψ3 says that the environment will eventually supply the object B within 3 steps.

We show the specialized formula based predictors w.r.t. the previous temporal formulas considering again the production of E in 4-steps. We recall that the corresponding minimal formula based predictor (presented in Ex. 2) is simpl(fbp(E, 3)) = fbp(E, 3) =

G3 ∧ ¬B3 ∧ ¬A2 ∨ (C3 ∧ D3). In order to calculate the specialized versions of predictor we compute the encoding of the temporal formulas obtaining: [[ψ1]]03 = (¬B0 ∧ ¬B1 ∧ ¬B2 ∧ ¬B3) ∧ (¬C0 ∨ D0) ∧ (¬C1 ∨ D1) ∧ (¬C2 ∨ D2) ∧ (¬C3 ∨ D3) , [[ψ2]]03 = B0 ∧ B1 ∧ B2 ∧ B3 ,

[[ψ3]]03 = B0 ∨ B1 ∨ B2 ∨ B3 .

By applying the minimization algorithm to fbp(E, 3) and to ψ1, ψ2 and ψ3, we obtain the correponding specialized formulas f1, f2 and f3, respectively: f1 = (G3 ∧ ¬A2) ∨ C3 f2 = (C3 ∧ D3), f3 = fbp(E, 3).

In the case of formulas ψ1 and ψ2, the specialized formulas f1 and f2 are substantially reduced w.r.t. the formula based predictor, while formula ψ3 does not lead to a reduced specialized predictor. 7

Application

In this section we introduce a more complex biological example: the lac operon expression in the E. coli bacterium. The lactose operon is a sequence of genes that are responsible for producing enzymes for lactose degradation. We borrow the formalization of this biological system as a reaction system from [ 5 ]. Let A = (S, {a1, . . . , a10}) be the reaction system where S = {lac, lacI, I, I-OP, cya, cAM P, crp, CAP, cAM P -CAP, lactose, glucose, Z, Y, A} and the reaction rules are defined as follows a1 = ({lac}, {}, {lac}) a2 = ({lacI}, {}, {lacI}) a3 = ({lacI}, {}, {I}) a4 = ({I}, {lactose}, {I-OP }) a5 = ({cya}, {}, {cya}) a6 = ({cya}, {}, {cAM P }) a7 = ({crp}, {}, {crp}) a8 = ({crp}, {}, {CAP }) (lac operon duplication) (repressor gene duplication) (repressor gene expression) (regulation mediated by lactose) (cya duplication) (cya expression) (crp duplication) (crp expression) a9 = ({cAM P, CAP }, {glucose}, {cAM P -CAP }) (regulation mediated by glucose) a10 = ({lac, cAM P -CAP }, {I-OP }, {Z, Y, A}) (lac operon expression)

The regulation process is as follows: gene LacI encodes the lac repressor I, which, in the absence of lactose, binds to gene OP (the operator). Transcription of structural genes into mRNA is performed by the RNA polymerase enzyme which transcribes the three structural genes represented by lac into a single mRNA fragment. When the lac repressor I is bound to gene OP (that is, the complex I-OP is present) it becomes an obstacle for the RNA polymerase, and transcription of the structural genes is not performed. On the other hand, when lactose is present inside the bacterium, it binds to the repressor thus inhibiting the binding of I to OP . This inhibition allows the transcription of genes represented by lac by the RNA polymerase.

Two more genes encode for the production of two particular proteins: cAM P and CAP . These genes are called, respectively, cya and crp, and they are indirectly involved in the regulation of the lac operon expression. When glucose is not present, cAM P and CAP proteins can produce the complex cAM P -CAP which can increase significantly the expression of lac genes. Also in presence of the cAM P -CAP complex, the expression of the lac genes is inhibited by I-OP .

Note that the reactions {a1, a2, a5, a7} are needed to ensure the permanency of the genes in the system.

In [ 5 ] the authors investigate the effects on the production of enzymes Z, Y and A when the environment provides both glucose and lactose, only glucose, only lactose, or none of them. The genomic elements lac, lacI, cya and crp together with the proteins I, cAM P and CAP , that are normally present in the bacterium, are supplied to the system by the starting context C0. Then, an example context sequence γ = C0, . . . , C40 is considered, in which every element Ci with 1 < i <= 40 is a subset of {glucose, lactose}. Such a context sequence represents an environment in which the supply of glucose and lactose varies over time. By observing the result states D1, . . . , D40 obtained by executing the reaction system, the authors conclude that the enzymes Z, Y and A are produced in a step i only if lactose was the only element provided to the system two steps before. Formally, this can be expressed as follows:

Z, Y, A ∈ Di iff Ci−2 = {lactose}, with i > 3.

This conclusion has been reached empirically, by observing a single execution of the system with respect to an example context sequence. The conditions for the production of Z, Y and A can instead be studied by applying the notions of predictor we defined in this paper.

For sake of simplicity we consider the formula based predictors for the enzymes Z, Y and A in 4 steps, noting that the effects of all reactions can be observed after four steps. We obtain the following formula fbp(Z, 3) =fbp(Y, 3) = fbp(A, 3) =

((lac3 ∨ lac2 ∨ lac1 ∨ lac0) ∧ (cAM P CAP3 ∨ ((cAM P2 ∨ cya1 ∨ cya0) ∧ (CAP2 ∨ crp1 ∨ crp0) ∧ ¬glucose2)) ∧ ((¬IOP3 ∧ ¬I2 ∧ ¬lacI1 ∧ ¬lacI0) ∨ lactose2)).

The obtained formula is minimal w.r.t. ⊑f and therefore does not require the application of simplification techniques. Since we are interested in studying context sequences that supply only glucose and lactose it is useful to specialize the previous formula based predictor for context sequences described by the following temporal formula:

ψ = finitial ∧ G∞¬finternal ∧ XG∞¬f ′initial where finitial = lac ∧ lacI ∧ cya ∧ crp ∧ I ∧ cAM P ∧ CAP , f ′initial = lac ∨ lacI ∨ cya ∨ crp ∨ I ∨ cAM P ∨ CAP and finternal = I ∨ IOP ∨ cAM P CAP ∨ Z ∨ Y ∨ A.

Let us focus on the production of A, noting that, since fbp(Z, 3) = fbp(Y, 3) = fbp(A, 3), the same reasoning apply to enzymes Z and Y . By applying the minimization procedure described in Sect. 6 to fbp(A, 3) and ψ, we obtain the following reduced predictor of A in 4 steps specialized w.r.t. ψ

f ′ = ¬glucose2 ∧ lactose2.

Formula f ′ clearly shows that the enzymes A is produced at step 4 only if lactose was actually the only element provided to the system two steps before.

The specialized predictor f ′ we obtained agrees with the conclusion reached in [ 5 ] by reasoning on an example of context sequence.

Note that the same conclusion was reached in [ 1 ] by simplifying predictor fbp(A, 3) on the basis of informal reasoning on the properties of context sequences. In this paper, we have been able to obtain f ′ in a fully formalized and automatic way, on the basis of fbp(A, 3) and ψ. 8

Conclusions

We have presented a revised notion of formula based predictor [ 1 ] in which the predictor is specialized with respect to a temporal logic formula ψ expressing the properties of the context sequences. More specifically, the formula based predictor models the necessary conditions for an object s to be produced in n steps, assuming that the reaction system is executed with respect to a context sequence satisfying ψ. The advantage of the specialized version of predictor is that the resulting propositional formula can be substantially reduced with respect to the corresponding formula based predictor. As a consequence, this approach is very convenient whenever we are interested to observe whether an element s will be produced after n steps or not for a certain class of context sequences rather than for any possible context sequences.

As future work we plan to consider the application of the tabling and generalization procedures described in [ 1 ] to specialized predictors.

1. Barbuti , R. , Gori , R. , Levi , F. , Milazzo , P. : Investigating dynamic causalities in reaction systems. Submitted for pubblication ( 2015 )

2. Brijder , R. , Ehrenfeucht , A. , Main , M.G. , Rozenberg , G.: A Tour of reaction Systems . Int. J. Found. Comput. Sci . 22 ( 7 ), 1499 - 1517 ( 2011 )

3. Brijder , R. , Ehrenfeucht , A. , Rozenberg , G.: A Note on Causalities in Reaction Systems . ECEASST 30 ( 2010 )

4. Brijder , R. , Ehrenfeucht , A. , Rozenberg , G.: Representing Reaction Systems by Trees . In: Computation, Physics and Beyond. LNCS , vol. 7160 , pp. 330 - 342 ( 2012 )

5. Corolli , L. , Maj , C. , Marini , F. , Besozzi , D. , Mauri , G. : An excursion in reaction systems: From computer science to biology . Theor. Comput. Sci . 454 , 95 - 108 ( 2012 )

6. De Castro , L.N. : Fundamentals of natural computing: basic concepts, algorithms, and applications . CRC Press ( 2006 )

7. Ehrenfeucht , A. , Rozenberg , G.: Events and modules in reaction systems . Theor. Comput. Sci . 376 ( 1-2 ), 3 - 16 ( 2007 )

8. Ehrenfeucht , A. , Main , M.G. , Rozenberg , G.: Functions Defined by Reaction Systems . Int. J. Found. Comput. Sci . 22 ( 1 ), 167 - 178 ( 2011 )

9. Ehrenfeucht , A. , Main , M.G. , Rozenberg , G. , Brown , A.T.: Stability and Chaos in reaction Systems . Int. J. Found. Comput. Sci . 23 ( 5 ), 1173 - 1184 ( 2012 )

10. Ehrenfeucht , A. , Rozenberg , G.: Reaction Systems . Fundam. Inform. 75 ( 1-4 ), 263 - 280 ( 2007 )

11. Kari , L. , Rozenberg , G. : The many facets of natural computing . Communications of the ACM 51 ( 10 ), 72 - 83 ( 2008 )

12. Pardini , G. , Barbuti , R. , Maggiolo-Schettini , A. , Milazzo , P. , Tini , S.: A compositional semantics of reaction systems with restriction . In: Bonizzoni, P. , Brattka , V. , Löwe , B . (eds.) The Nature of Computation . Logic, Algorithms, Applications, Lecture Notes in Computer Science, vol. 7921 , pp. 330 - 339 . Springer Berlin Heidelberg ( 2013 )

13. Pardini , G. , Barbuti , R. , Maggiolo-Schettini , A. , Milazzo , P. , Tini , S. : Compositional semantics and behavioural equivalences for reaction systems with restriction . Theor. Comput. Sci. 551 , 1 - 21 ( 2014 )