Modelling of Grey Wolf Optimization Algorithm Using 2D P Colonies

                       Daniel Valenta1 , Lucie Ciencialová1,2 , Miroslav Langer1,2 , and Luděk Cienciala1,2
                             1 Institute of Computer Science, Silesian University in Opava, Czech Republic
            2   Research Institute of the IT4Innovations Centre of Excellence, Silesian University in Opava, Czech Republic
                {daniel.valenta, lucie.ciencialova, miroslav.langer, ludek.cienciala }@fpf.slu.cz

Abstract: The P colonies are a well-established version of                motion rule, then the other rule is an evolution rule. The
the P systems, the computational device based on mem-                     number of objects inside each agent is set by definition and
brane computing. One branch of the research of the P                      it is usually a very small; one, two or three.
colonies focuses on the possibility to consider the two-                  The agent itself has information neither about its position
dimensional environment, in which the agents act, and the                 nor about the position of the other agents in the environ-
2D P colonies were introduced. 2D P colonies showed                       ment. The communication between the agents is also pos-
to be suitable for the simulations of various (not only)                  sible only via the environment by leaving special symbols
multi-agent systems, and natural phenomena, like the flash                in it. These factors are limiting the 2D P colonies in sim-
floods.                                                                   ulating more advanced communities of agents. For the re-
However, the agents of the 2D P colony are able to com-                   mote information exchange, we add the agents the possi-
municate via the environment, by leaving some special                     bility to store and read the information from a blackboard.
symbols in it, this may not be sufficient for simulating                  The blackboard is a read/write device with an unchange-
some more complex communities of the agents, for ex-                      able structure given by definition. The agents can change
ample the hunting pack of the grey wolves.                                only defined parts of the blackboard.
In this paper, we introduce a formal model of the 2D P
colonies with the blackboard as an extension of the 2D P
colonies. We also allow the agents to use simple relational               2    Grey Wolf Optimization Algorithm
operations to compare real numbers that can be stored in-
side special symbols. We use this model to simulate the                   Grey wolf optimization algorithm (GWO) (see [4]) is al-
Grey wolf algorithm.                                                      ready well–established meta-heuristic optimization tech-
                                                                          nology. It is inspired by social dynamics found in packs
Keywords: pack algorithm, P system, 2D P colony, opti-                    of grey wolves and by their ability to dynamically create
mization, multi-agent system, blackboard.                                 hierarchies in which every member has a clearly defined
                                                                          role. Each wolf can fulfil one of the following role:
1    Introduction                                                             • Alfa is the dominant pair and the pack follows their
                                                                                lead for example during hunts or while locating a
2D P colony (see [1]) is a variant of P colonies (see [2]),
                                                                                place to sleep.
a community of agents set in the two dimensional envi-
ronment. It is a very simple membrane system originally                       • Beta wolves support and respect the Alpha pair dur-
derived from the P systems (see [3]).                                           ing its decisions.
2D P colony consists of a finite number of agents living in
a shared environment. The environment of 2D P colony is                       • Delta wolves are subservient to Alpha and Beta
represented by a 2D grid of square cells. In each cell, there                   wolves, follow their orders, and control Omega
is a multiset of objects. Each agent is represented by a fi-                    wolves. We divide them into scouts – they observe
nite collection of objects enclosed with a membrane. The                        the surrounding area and warn the pack, sentinels –
agent has programs consisting of rules. These rules are of                      they protect the pack when endangered and caretak-
three types: they may change the objects of the agents and                      ers – they provide aid to old and sick wolves.
they can be used for interacting with the joint shared envi-                  • Omega wolves help to filter the aggression of the
ronment of the agents and movement rule. The direction                          pack and frustrations by serving as scapegoats.
of the movement of the agent is determined by the contents
of cells surrounding the cell in which the agent is placed.               The primary goal of the wolves is to find and hunt down
The program can contain at most one motion rule. There                    prey in their environment. The prey equals the optimal
is one more condition set to achieve the greatest simplicity              solution to the given problem. The environment is repre-
in agent’s behaviour. If the agent moves, it cannot commu-                sented by a mathematical fitness function characterizing
nicate with the environment. So, if the program contains a                the problem. The value of the function at the current po-
      Copyright c 2020 for this paper by its authors. Use permitted un-
                                                                          sition of the particular wolf represents the highest-quality
der Creative Commons License Attribution 4.0 International (CC BY         prey. The wolf with the best value is ranked as Alpha, the
4.0).                                                                     second one as Beta, third as Delta, and all the other are
Omegas.                                                            Wolves’ positions within the environment are updated
The hunting technique of a wolf pack can be divided into         based upon the estimated location of the prey using Alpha,
5 steps:                                                         Beta, and Delta wolves as guides.
                                                                   Let X~ j (i) be a position vector of wolf j in i-th iteration.
  • Search for the prey – wolves are attempting to find the
    most valuable prey with respect to the effort required       The position vector of wolf j is updated as follows:
    to hunt it successfully.                                                                     ~    ~    ~
                                                                                   ~ j (i + 1) = X1 + X2 + X3 ,
                                                                                   X
  • Exploitation of the prey – wolves are attempting to                                               3
    draw attention to themselves and to separate the prey                                                                ~1 ,
                                                                 where i is the current iteration of the algorithms, and X
    from its herd.                                               ~    ~
                                                                 X2 , X3 are new potential position vectors of Alpha, Beta,
  • Encircling prey – the attempt to push the prey into a        and Delta wolves obtained from following formulas:
    situation from which it cannot escape.                                         ~1                ~1 ∗ D~α
                                                                                   X     = X~α (i) − A
  • The prey is surrounded – it can no longer escape.                              ~2
                                                                                   X                 ~2 ∗ D
                                                                                         = X~β (i) − A     ~β
                                                                                   ~3
                                                                                   X                 ~3 ∗ D
                                                                                         = X~δ (i) − A     ~δ
  • The attack – wolves attack the weak spots of the prey
    (belly, legs, snout) until it succumbs to fatigue. Af-
                                                                 where X~α (i), X~β (i), X~δ (i) are the position vectors of Al-
    terwards, they bring it down and crush its windpipe.
                                                                 pha, Beta, and Delta wolves, they are representing the po-
 The algorithm is inspired by this process and smoothly          sitions in the environment that are closest to the optimum
 transitions between scouting and hunting phases. In the         in i-th iteration. The vectors A    ~1 , A
                                                                                                          ~2 , A
                                                                                                               ~3 are calculated in
 scouting phase, the pack extensively scouts its environ-                                      ~
                                                                 the same way as vector A. The vectors D~α , D          ~β , D
                                                                                                                             ~ δ are
 ment through many random movements so that the al-              defining the distance of the wolf j position from the prey
 gorithm does not get stuck in a local extreme, while in         as follows:
 the hunting phase, the influence of random movements
 is slowly reduced and pack members draw progressively                           D~α    =    ~1 ∗ X~α (i) − X
                                                                                             C              ~ j (i)
 closer to the discovered extreme. To maintain the diver-                        ~β
                                                                                 D      =    ~2 ∗ X~β (i) − X
                                                                                             C              ~ j (i)
 gence between those phases, each wolf is assigned vectors
~A and C.
        ~                                                                        ~δ
                                                                                 D      =    ~3 ∗ X~δ (i) − X
                                                                                             C              ~ j (i)
          ~A is a vector with components rand (−1, 1) ∗ a,
 where rand(−1, 1), generates a random number between            where |~X| is the vector whose components are the absolute
 −1 and 1 and where                                              values of the components of ~X.
                                                                                 ~1 , C
                                                                                      ~2 , C
                                                                                           ~3 are computed in the same way as
                               
                                2i
                      a = 2 − imax ,                                 The vectors C
                                                                         ~
                                                                 vector C, and they influence the weight of the estimated
 while i is the algorithm’s current iteration, and imax is the
                                                                 position of the prey X~α , X~β , X~δ , increasing or decreasing
 maximum number of iterations.
                                                                 it.
    Vector ~A is random value between −2 and 2. We can
                                                                     We can see in Fig. 2 that thanks to this principle wolves
 see its impact in the Fig. 1. With growing iterations, it is
                                                                 have the tendency to move closer towards prey and encir-
 more likely that its value will be between -1 and 1. That
                                                                 cle it (wolves approach from different directions).
 makes it more likely for a wolf to be hunting. Another


          Figure 1: Vector ~A and its impact in 1D

component supporting the scouting phase is vector
                  ~ = rand(0, 2),
                  C
where rand(0, 2), generates a random number between              Figure 2: Positional updates of Omega wolves as it is de-
0 and 2. Vector C  ~ is like vector ~A, but iterations don’t
                                                                 scribed by the mathematical formula [4]
influence it. It helps the wolves behave more naturally.
Analogously, in nature, wolves encounter various obsta-
cles which prevent them from approaching prey comfort-
                                                                 2.1   Algorithm pseudocode
ably. Vectors ~A and C
                     ~ encourages wolves to prefer scout-
ing, or hunting, and so to avoid local optima regardless of      In this subsection we describe the algorithm in pseudo-
the algorithm’s current iteration.                               code. The inputs of the algorithm are dimensions of the
environment of the problem, boundaries of the environ-               • e ∈ A is the basic environmental object of the 2D P
ment of the problem, fitness function characterizing the               colony,
problem, size of the pack (number of wolves/agents), num-
ber of iterations of the algorithm, termination criteria and         • Env is a pair (m × n, wE ), where m × n, m, n ∈ N is
criteria of the fitness function.                                      the size of the environment and wE is the initial con-
                                                                       tents of the environment, it is a matrix of size m × n of
                                                                       multisets of objects over A − {e}.
                                                                     • Bi , 1 ≤ i ≤ k, are agents, each agent is a construct
                                                                       Bi = (oi , Pi , [o, p]) , 0 ≤ o < m, 0 ≤ p < n, where
                                                                          – oi is a multiset over A, it determines the initial
                                                                            state (contents) of the agent, |oi | = 2,
                                                                                  
                                                                          – Pi = pi,1 , . . . , pi,li , l ≥ 1, 1 ≤ i ≤ k is a finite
                                                                            set of programs, where each program contains
                                                                            exactly 2 rules, which are in one of the following
                                                                            forms each:
                                                                               ∗ a → b, called the evolution rule, a, b ∈ A,
                                                                               ∗ c ↔ d, called the communication rule,
                                                                                 c, d ∈ A,
                                                                               ∗ [aq,r ] → s, 0 ≤ q, r ≤ 2, s ∈ {⇐, ⇒, ⇑, ⇓},
                  Figure 3: Algorithm steps                                      called the motion rule,
                                                                     • f ∈ A is the final object of the colony.
    The algorithms pseudocode follows:
                                                                    A configuration of the 2D P colony is given by the state
    • In the first step, agents (wolves) are randomly spread     of the environment - matrix of type m × n with multisets
      out across the environment.                                of objects over A − {e} as its elements, and by the state
                                                                 of all agents - pairs of objects from alphabet A and the
    • In each iteration i:
                                                                 coordinates of the agents. An initial configuration is given
         – calculate the fitness value of each agent and de-     by the definition of the 2D P colony.
           termine the social hierarchy – Fig. 3. part 1.           A computational step consists of three parts. The first
           The agent with the best value (closest to the op-     part lies in determining the set of applicable programs ac-
           timum) is Alpha, second best is Beta, third best      cording to the current configuration of the 2D P colony.
           is Delta, and all others are Omega.                   In the second part, we have to select from this set one pro-
                                                                 gram for each agent, in such a way that there is no collision
         – calculate the best solution found thus far by Al-     between the communication rules belonging to different
           pha, Beta and Delta (X~α (i), X~β (i), X~δ (i)) and   programs. The third part is the execution of the chosen
           average it – Fig. 3. part 2,                          programs.
         – update positions of all the wolves X j (i + 1),          A change of the configuration is triggered by the execu-
           while random vectors ~A and C
                                       ~ are updated for         tion of programs and it involves changing the state of the
           each one – Fig. 3. part 3,                            environment, contents and placement of the agents.
                                                                    A computation is non-deterministic and maximally par-
         – check the termination criterion – Fig. 3. part 4.
                                                                 allel. The computation ends by halting when there is at
           Iterations terminate when fitness function value
                                                                 least one agent that has no applicable program.
           reaches a preset value.
                                                                    The result of the computation is the number of copies of
                                                                 the final object placed in the environment at the end of the
3     2D P Colonies                                              computation.
                                                                    The aim of introducing 2D P colonies is not studying
In this section, we recall the definition of 2D P colonies       their computational power but monitoring their behaviour
and other terms related to them.                                 during the computation.

Definition 1. A 2D P colony is a construct
                Π = (A, e, Env, B1 , . . . , Bk , f ), k ≥ 1,
                                                                 4     Modelling of Grey wolf optimization
where                                                                  algorithm using 2D P colonies
    • A is an alphabet of the colony, its elements are called    As for the modelling of GWO, some similarities as well as
      objects,                                                   a few differences have been found. Both are inspired by
                                                                                    
the nature, usable for solving optimization problems, and                            y+1                       for x = 0
both are multi-agent system models. For comparison see                  fE (x, y) =   fE (x − 1, 1)             for m > 0 and n = 0
                                                                                      fE (x − 1, fE (x, y − 1)) otherwise
                                                                                    
the differences / problems:

  • Environmental problem,
                                                                                         1a     2a    3a      4a
  • Communication problem,                                                               2a     3a    4a      5a
                                                                                         3a     5a    7a      9a
  • Randomness problem.                                                                  5a    13 a   29 a   61 a
These problems are described in the Table 1.
                                                                          Figure 4: Graphical representation of the environment
Table 1: Differences between Grey wolf algorithm and P
colony
 Difference / Grey wolf algo- 2D P colony
 System            rithm                                                4.2   Communication problem solution
 Environmental The environment The environment
 problem           is represented by is represented by                  We extend the 2D P colony by adding the blackboard that
                   a mathematical a multiset of sym-                    saves the agents’ best fitness values and is always acces-
                   fitness function.  bols in each cell                 sible to read and write by all agents. To simulate GWO
                                      of 2D grid.                       agents cannot decide how to continue with hunting without
 Communication The agents have They are commu-                          knowledge of their position in the environment, because
 problem           the knowledge of nities of simple                    the fitness value is not enough for calculating the prey’s
                   their global posi- reactive agents                   estimated position. That is why the approximate position
                   tion in the envi- independently                      is stored in the blackboard. The position of each agent is
                   ronment.           living and acting                 computed by a function using moving receivers (see Sec-
                                      in a joint shared                 tion 6. for more details) We will use the blackboard as a
                                      environment.
 Randomness        Random vectors Each rule is
 problem          ~A and C~ influence deterministic,
                   the movement of the only way
                   wolves in the en- to      implement
                   vironment.         randomness      is
                                      to     randomize
                                      the      choosing                           Figure 5: Simple Blackboard example
                                      rule for identical
                                      configurations.                   means of giving feedback to agents. Agents do not need
                                                                        to know their position in the environment, all agents who
   The definition of the 2D P colony needs to be adjusted               can contribute to the search will send their solution to the
to meet described requirements. Proposed solutions for                  blackboard points called receivers, and estimation of prey
those problems are described in the following subsections.              position is calculated as an average of distances collected
                                                                        by blackboard points from wolves Alpha, Beta and Delta.
                                                                        Omega wolves can ping the blackboard if changing po-
4.1   Environmental problem solution                                    sition and get their distance from the prey. If the distance
                                                                        would decrease compared to the original distance, then the
The environment will be a pair of matrix m × n and
                                                                        wolf will move.
fitness function fE , where m × n, m, n ∈ N is the size
of the environment and fE is a mathematical function
with the initial contents of environment. Alphabet V will
contains objects
               n that can store real
                                  o Snumbers and common
                         0   00
objects, V = x , x , x , . . .        {a, b, c, d, e, f , g, . . . },
The rules of the programs, which guide the agents,
will compare the number values of the objects using
operators greater than ” > ” and greater or equal to
” ≥ ”. The agent Ai will be defined as Ai = (oi , Pi , [o, p]),
|oi | = 2. For example, environment can be defined as
Env = (4 × 4, wE , fE ), with object a in each cell, and                              Figure 6: Blackboard structure
fitness function fE is the Ackermann function
4.3   Randomness problem solution                                 on the blackboard at position [2, 1]. By execution of rule
                                                                              00                                    00
                                                                  Update( 15 , BB[0, 4]), the agent with object 15 places
The omega wolves movement is based not only on posi-
                                                                  number 15 into the fifth member in the first row of the
tion of the prey but also on random vectors ~A and C.
                                                    ~ In
                                                                  blackboard. Only one agent can read or update value in
2D P Colony model the randomness is replaced by non-
                                                                  one position at the same time. If two agents have ap-
deterministic choice between several applicable rules.
                                                                  plicable programs that affect the same position in black-
                                                                  board, then only one of them can execute its program.
5     Model of numerical 2D P colony with the                     The other agent must use another applicable program, or
      blackboard                                                  be inactive if another applicable program does not exist.
                                                                  The agent able to write on the blackboard is chosen non-
Let us consider that there is not only multiset of objects in     deterministically.
each environmental cell, but there is also one (natural or           All newly introduced rules can be combined in the pro-
real) number computed by a function (called an environ-           grams with any kind of rules, except the movement rule,
mental function) depending on the position of the cell and        that can be paired only with the rewriting rule.
time. This number can be read by the agent, but it can be
                                                                  Definition 2. A numerical 2D P colony with blackboard
changed only by the environmental function.
                                                                  is a construct
    The number can be stored inside the agent inside special
                                                                            Π = (V, e, Env, A1 , . . . , Ak , BB, f ), k ≥ 1, where
kind of objects (box-objects). To read an environmental
number the agent needs a special reading rule. Using this           • V is an alphabet of the colony, its elements are called
rule agent rewrites an object inside of it into the special           objects, there are special objects b that can contain
object that contains the number that equals the environ-              an arbitrary number,
mental number. This rule is in the form: a  x , where
a ∈ V is an object and x is the environmental number in             • e ∈ V is the basic environmental object of the 2D P
the cell, where the agent executing the rule is placed. The           colony,
environmental number can be read by more than one agent             • Env is a triplet (m × n, wE , fE ), where m × n, m, n ∈ N
at the same time, and each agent can use the reading rule             is the size of the environment, and wE is the initial
multiple times in one program. For example, let ab is the             contents of the environment. It is a matrix of size m ×
content of agent with program p : a  x ; b  x , and                 n of multisets of objects over V − {e}, and fE is an
5 is the number stored in the environmental cell where the            environmental function.
agent is placed. After execution of the program p the con-
tent of agent is 5 5 .                                              • Ai , 1 ≤ i ≤ k, are agents. Each agent is a construct
    If there are two objects with numbers inside the agent,           Ai = (oi , Pi , [o, p]) , 0 ≤ o ≤ m, 0 ≤ p ≤ n, where
an execution of particular program can be restricted by
                                                                          – oi is a multiset over V , it determines the initial
these numbers. Agent can compare the numbers, and
                                                                            state (contents) of the agent, |oi | = 2,
distinguish which rule will be applied to which object.                           
The programs with such condition look like follows:                       – Pi = pi,1 , . . . , pi,li , l ≥ 1, 1 ≤ i ≤ k is a finite
hx > y : r1 , r2 i or hx ≥ y : r1 , r2 i , where x, y ∈ R and               set of programs, where each program contains
r1 , r2 are rules that work with objects with numbers. These                exactly 2 rules. Each rule is in the following
box-objects can be of different kind but both must store a                  form:
number.                                                                         ∗ a → b, the evolution rule, a, b ∈ V ,
    The memory of 2D P colony is extended by a black-                           ∗ c ↔ d, the communication rule, c, d ∈ V ,
board. It is a pair BB = (~fBB , BM), where ~fBB is a vector                    ∗ [aq,r ] → s, aq,r ∈ V, 0 ≤ q, r ≤ 2, s ∈ {⇐, ⇒
of i functions fi , and BM is a matrix of size i × j, i, j ∈ N.                   , ⇑, ⇓}, the motion rule,
 fi is a function that fills the members of corresponding row                   ∗ a  x , x ∈ R, a, x ∈ V, the reading rule
of matrix BM with values. The function can update only
such members that are not updated by agents.                                 If the program contains evolution or communi-
    To access and affect the blackboard, the agents have the                 cation rule r1 , r2 that each works with objects
rules for reading and writing a value placed in a spec-                      with numbers, it can be extended by a condition:
ified part of the blackboard. These rules use functions                      hx > y : r1 , r2 i , hx ≥ y : r1 , r2 i ,
Get(BB[ad], b) and Update(b, BB[ad]), where BB[ad] is                     – [o, p], 1 ≤ o ≤ m, 1 ≤ p ≤ n, is an initial posi-
an address of value on the blackboard which have to                         tion of agent Ai in the 2D environment,
read or written, and b ∈ V is the box-object. The get
                                                                    • f ∈ V is the final object of the colony.
rule is in a form a → Get(BB[ad], b) and the update rule
consist only from the function Update(b, BB[ad]), where              A configuration of the 2D P colony is given by the state
a, b ∈ V . When the rule a → Get(BB[2, 1], x ) is exe-            of the environment - matrix of type m × n with pairs - mul-
cuted, the object a inside the agent is evolved into the ob-      tiset of objects over V − {e}, and a number - as its ele-
ject x , and this object is filled in with the number stored      ments, and by the states of all agents - pairs of objects
                                                                        1. e  x 0 ; e → Get(BB[alpha, x ], i x ∈ R is num-
                                                                                                               
from the alphabet V , and the coordinates of the agents. An
initial configuration is given by the definition of the 2D P               ber placed in the environmental cell [r, s], alpha is ad-
colony.                                                                    dress of current value y of alpha wolf in blackboard;
   A computational step consists of three phases. In the                   This program is to read the number from the envi-
first step, the set of the applicable programs is determined               ronmental cell and the value of current alpha wolf in
according to the current configuration of the 2D P colony.                 blackboard.
In the second step, one program from the set is chosen for
each agent, in such a way that there is no collision between            2. The following set of programs Compare is to com-
the communication rules belonging to different programs.                   pare two numbers stored inside the agent:
In the third step, chosen programs are executed, the values                      D                          E
of the environment and on the blackboard are updated.                       (a) x > y : x → x , y 0 → A – I am new Al-
   A change of the configuration is triggered by the exe-                        pha,
cution of programs, and updating values by functions. It                    (b) x ≥ y : x 0 → b, x → x
involves changing the state of the environment, contents                        D                              E
                                                                                                            00
and placement of the agents.                                                (c) x → x , b ← Get(BB[beta], x ) ,
   A computation is non-deterministic and maximally par-                        D                       E
allel. The computation ends by halting when there is no                     (d) x > y : x → x , y 00 → B – I am new Beta,
agent that has an applicable program.
                                                                            (e) x ≥ y : x 00 → c, x → x
   The result of the computation is the number of copies of                     D                                   
the final object placed in the environment at the end of the                                                    000
                                                                             (f) x → x , c ← Get(BB[BB[gamma], x ], i,
computation.                                                                    D                        E
                                                                            (g) x > y : x → x , y 000 → C – I am new
                                                                                Delta,
5.1   Numerical 2D P Colony with the Blackboard for
      GWO                                                                   (h) x ≥ y : x 000 → d, x → x iv – I am Omega,

Because of a restricted number of pages of this contribu-               3. If there is A, B or C inside the agent, the agent updates
tion we show an idea of the construction of a numerical 2D                 the blackboard using programs:
P colony with the blackboard that models GWO in finding
                                                                            (a) Update( x , BB[alpha]), A → a0 ,
extreme of a function with two variables in this section.
   Pgw = (V, e, Env, A1 , A2 , . . . , Ak , BB, f ), k ≥ 1, where:          (b) Update( x , BB[beta]), B → a0 ,
                                                                            (c) Update( x , BB[gamma]), C → a0 ,
                      0    00       000     iv     v    vi
          n                                                o
   • V= b, b , b , b , b , b , b                             ∪              (d)    x   → e, a0 → e ,
                0               0     00                0
                                               , mKO , mKO , m00KO ∪
       
     ∪ e, f , a , b, c, d,
                          h, 0h ,00h ,000mOK
     ∪{n,                                   iv                          4. If the agent is an omega wolf, the agent is supposed
        A, B, D}∪ l, l , l , l , l | Y ∈ {⇐, ⇒, ⇑, ⇓} ∪                   to move so as to approach the prey. The agent reads
     ∪ kz1 z2 z3 z4 | zi ∈ {⇐, ⇒, ⇑, ⇓} ∧ k, i ∈ {1, 2, 3, 4}
                                                                           its distance from the prey computed by the function
  • e ∈ V is the basic environmental object,                               and placed on the blackboard, and it moves in a ran-
                                                                           dom direction. The direction is generated in such a
  • Env is a triplet (i × j, we , f (x, y)), where i, j ∈ N,               way, that the agent creates the object 1 with a low in-
    wE = |ar,s |, ar,s = ε, 1 ≥ r ≥ i, 1 ≥ s ≥ j,                          dex formed from four directions in random order. 1
                                                                           means that the agent will move in the first direction.
  • A1 , A2 , . . . , Ak are the agents, Ai = (Oi , Pi , [rx , ry ]),
                                                                                                                                 v
    where:                                                                  (a) d → 1w , x iv → Get(BB[dist. from prey], x ) ,
                                                                                w = z1 z2 z3 z4 ,    zi ∈ {⇒, ⇐, ⇑, ⇓},
         – |oi | = 2,                                                           z1 6= z2 6= z3 6= z4
         – P1 = P2 = · · · = Pk , Pi rules are defined below,               (b) 1w ↔ e, x v → x v ,
                                                                                *                     
         – [r, s] are the initial coordinates,                                      e        e       e
                                                                                                                    +
                                                                            (c)  e Xz1 z2 z3 z4 e  → zX , e → zX ,
  • BB is the blackboard, to the definition of the black-                           e        e       e
    board is devoted a separate subsection of this chapter,                     X ∈ {1, 2, 3, 4}, zX ∈ {⇒, ⇐, ⇑, ⇓}

  • f is the final object, f ∈ V .                                         Then the agent puts the object with movement se-
                                                                           quence into environmental cell and reads its distance
   The initial agent’s configuration is: ee and its position               from the prey from new location.
is [r, s].
   Programs Pi associated with i-th agent are:                              (a) zX → z0X , x v ↔ e ,
      (b) hz0X → z00X , e → hi,                                       ~ is a vector with elements that can be named
                                                                    • v1
          D
                                                     vi
                                                        E             AlphaValue, BetaValue, DeltaValue, AlphaPosition,
      (c) z00X ↔ x v , h → Get(BB[dist. from prey], x ) ,
                                                                      BetaPosition, DeltaPosition, preyPosition.
                                                                      If j > 7, then elements with index greater than 7 are
 5. If the new value is smaller then value from previ-
                                                                      without a name, they are addressed by its position.
    ous location, the agent consumes object correspond-
                                                                      The first three elements are serviced by the agents, so
    ing to direction and object with movement sequence
                                                                      blackboard function for the first row only copy their
    and rewrite its content to ee.
          D                             E                             values, if they are not updated by agents in current
      (a) x ≥ y : x v → mOK , y vi → h0 ,                             step of computation.
          D                             E                                                      ~ is 0 in each element.
                                                                          – initial content of v1
      (b) x > y : x vi → mKO , y v → h0 ,
                                                                      ~
                                                                    • v2       is a vector with elements named
      (c) hmOK → mOK , h0 ↔ z00X i,
          *                                                         A00 sDistanceFromPrey,        A01 sDistanceFromPrey,
               e e e
                                          +                                     0
                                                                      . . . , Ak sDistanceFromPrey, k is number of agents
                                   00 000
      (d)  e e e  → zX , zX → zX , zX is oppo-                      (wolves), the elements without name can be
               e e e                                                  addressed only by its position in blackboard matrix.
          site movement to zX ,
                                                                                               ~ is 0 in each element.
                                                                          – initial content of v2
      (e) mOK ↔ Xw , z000  iv
                      X → zX ,
                                                                    • f cn is a vector of functions ( f nc1(i), f nc2(i)) for
                      
             e e e
          *                          +
      (f)                    iv
             e e e → zX , zX → h  00                                  manipulating the vectors v1  ~ and v2, ~ where f nc1(i)
             e e e                                                    updates i − th element of vector v1   ~ and f nc2(i) up-
                                                                      dates i − th element of vector v2,  ~ 0 ≤ i ≤ j, j is a
      (g) hXw → e, h00 → ei,                                                                ~ and v2: ~
                                                                      dimension of vectors v1
 6. If the new distance is greater then old one, the agent                                                            i = {0, 1, 2},
                                                                                       
                                                                                        identity
    consumes object corresponding to the direction and                    i f nc1(i) =   B Position                   i = {3, 4, 5},
    moves to original location and rewrite the object 1                                 f nc1(3)+ f nc1(4)+ f nc1(5)
                                                                                                        3              i = 6.
    with movement sequence into object 2 with the same
    movement sequence, and the agent continues with                       ii f nc2(i) = | f nc1(preyPosition) − BPosition |, for
    investigation. If the agent moves back and there is                      i = index of agent Ai .
    object 4 with movement sequence, it did not find a                Auxiliary function BPosition is described in the section
    smaller distance to the prey and it stops working.                Receivers.
      (a) hmKO → mKO , h0 ↔ z00X i,
      (b) mKO → m0KO , z000
                        X →e ,                                  6     Receivers - from theoretical model into
      (c)    m0KO → m0KO , e ↔ Xw ,                                   real life
      (d)    m0KO → m00KO , Xw → (X + 1)w , X ∈ {1, 2, 3}
                                               v
                                                                A communication between agents and blackboard is real-
      (e)    m00KO → Get(BB[dist. from prey], x ), Xw ↔ e ,     ized by receivers. We equip our model with two receivers
            X ∈ {2, 3, 4},                                      that are listening signals coming from the agents. The abil-
      (f) m0KO → f , e ↔ 4w ,                                   ities of receivers are crucial for functioning of the func-
                                                                tions of the blackboard, because they are providers of val-
   The computation is possibly non-halting because of the       ues of B position function. We can assume, that receivers
three agents (alpha, beta and gamma) can always find            can "see" position of each agent but for wide areas it is not
applicable program. The positions of these three agents         very realistic. For our model we choose another approach.
are important to estimate prey position by the Blackboard       We introduce time into our model - it takes some time to
(more in Section 5.2).                                          signal from agents to reach receivers.

                                                                    • rcv - the blackboard has two receivers. Both receivers
5.2   Blackboard
                                                                      are located in the environment. Their initial position
The blackboard for GWO is a structure defined as follows:             are on the opposite sides of the boundary points of the
                                                                      environment Env (positions [0, 0] and [m − 1, n − 1],
              BBGWO = ( f~nc, [v1,
                               ~ v2]),
                                   ~   where:                         where m×n, m, n ∈ N, is the size of the environment).
                                                                      Receivers’ positions are updated in each derivation
  • dimension of both vectors v1, ~ v2
                                     ~ is j = max(7, k), k ≥
                                                                      step and receivers circulate around the environment
    1 is the number of agents, in this case, a matrix of              as follows:
    type i × j, i, j ∈ N, is represented by a vector of these
    vectors.                                                              – x < 1, y < n : x = x, y = y + 1,
                                                 Figure 7: Blackboard in use


        – x < m, y > n − 1 : x = x + 1, y = y,                  way than via the environment. They also have no informa-
        – x > m − 1, y > 0 : x = x, y = y − 1,                  tion about their own position and position of others in the
                                                                environment. These deficiencies can be solved by adding
        – x > 0, y < n + 1 : x = x − 1, y = y.                  the universal communication device, the blackboard, into
     Primary task of the rcv is to collect data (messages)      the definition of the 2D P colony.
     from agents.                                               The Grey wolf optimization algorithm is meta-heuristic
                                                                optimization technology inspired by the social dynamics
        – Messages have a given structure: msg :                of the packs of grey wolves. The looking for the extreme
          (contents, agent’s index, timestamp). contents        of a function is inspired by hunting the prey by the pack of
          is request of an agent - function Get, Update         the wolves.
          or it is an empty string, timestamp corresponds       We introduced the formal definition of the 2D P colony
          to the time, when request was sent.                   equipped with a blackboard and presented the ability of
                                                                this formal model to simulate the Grey wolf algorithm.
     Receivers are listening to agents’ signal. If both re-     The blackboard serves not only as a communication de-
     ceivers receive the same message from the agent, re-       vice, but also as a device capable to set the particular roles
     ceived message is being processed in the following         of the agents simulating the wolves, and successfully find
     sequence: computation of auxiliary function BPosition ,    the extreme of the function represented by the discrete en-
     execution of contents part of message.                     vironment of the 2D P colony.
        – BPosition = x ∈ R1 ∩ R2 , where R1 , R2 are cir-
          cles with centre at receivers’ position and radius
                                                                Acknowledgements. This work was supported by The
          r = now − sent, where now is time of receiving
                                                                Ministry of Education, Youth and Sports from the Na-
          the message, and sent equals to timestamp, x is
                                                                tional Programme of Sustainability (NPU II) project
          chosen randomly in the intersect area. The in-
                                                                „IT4Innovations excellence in science - LQ1602“.
          tersection shapes are changed over time due to
                                                                Research was also supported by the SGS/11/2019 Project
          the movements of the receivers.
                                                                of the Silesian University in Opava.
   At this point, it is important to focus on the use of the
blackboard by the agents. Agents can use blackboard’s
functions above.                                                References
   If the agent concludes that it is Alpha, it rewrites field
~
v1[0]  using communication program 3 in Fig. 7 on the left       [1] Cienciala, L., Ciencialová, L., Perdek, M.: 2D P colonies.
side. In the same way, Beta and Delta wolves can rewrite             In: Csuhaj-Varjú E., Gheorghe M., Rozenberg G., Salomaa
       ~
field v1[1] using the same function. On the right side of            A., Vaszil Gy. (eds) Membrane Computing. CMC 2012.
Fig. 7, the agent concludes that it is Omega, and it will            Lecture Notes in Computer Science, vol 7762. Springer,
try to move with blackboard’s assistance using communi-              Berlin, Heidelberg, pp. 161–172 (2012)
cation program 1.                                                [2] Kelemen, J., Kelemenová, A., Păun, Gh.: Preview of P
                                                                     colonies: A biochemically inspired computing model. In:
                                                                     Workshop and Tutorial Proceedings. Ninth International
7   Conclusion                                                       Conference on the Simulation and Synthesis of Living Sys-
                                                                     tems (Alife IX). pp. 82–86. Boston, Massachusetts, USA
The 2D P colonies as defined are not able to simulate the            (September 12-15 2004)
behaviour of the complex multi-agent systems. The agents         [3] Păun, Gh.: Computing with membranes. J. Comput. Syst.
of the 2D P colony are not able to communicate in other              Sci. 61(1), 108–143 (2000)
[4] Mirjalilia, S., Mirjalilib, S.M., Lewisa, A.: Grey Wolf
    Optimizer. Advances in Engineering Software 69, 46—
    61(2014)