Design and Simulation of a Wave-like Self-
   Organization Strategy for Resource-Flow Systems
                       Jan Sudeikat,                                          Jan-Philipp Steghöfer, Hella Seebach
      Wolfgang Renz, Thomas Preisler and Peter Salchow                                 and Wolfgang Reif
             Multimedia Systems Laboratory (MMLab),                       Institute for Software and Systems Engineering,
           Faculty of Engineering and Computer Science,                                   Augsburg University,
             Hamburg University of Applied Sciences,                      Universitätsstrasse 6a, 86135 Augsburg, Germany
             Berliner Tor 7, 20099 Hamburg, Germany                                   Email: {steghoefer, seebach,
                Email: {jan.sudeikat, wolfgang.renz,                              reif}@informatik.uni-augsburg.de
          thomas.preisler, peter.salchow}@haw-hamburg.de


   Abstract—In resource-flow systems, e.g. production lines,           This paper also shows how to pragmatically combine a top-
agents are processing resources by applying capabilities to them    down approach for the design of agent-based systems with a
in a given order. Such systems profit from self-organization as     bottom-up approach for the design of inter-agent coordination.
they become easier to manage and more robust against failures.
This paper proposes a decentralized coordination process that       While the exact interplay of the two concepts is not fully
restores a system’s functionality after a failure by propagating    elaborated here, it already becomes clear that both approaches
information about the error through the system until a fitting      are not necessarily orthogonal but that it is beneficial to
agent is found that is able to perform the required function. The   combine both views.
mechanism has been designed by combining a top-down design             This paper is structured as follows: in the following section,
approach for self-organizing resource-flow system and a systemic
modeling approach for the design of decentralized, distributed      the ODP, as a conceptual model for self-organizing resource-
coordination mechanisms. The systematic conception of the inter-    flow systems, is discussed and a prominent application sce-
agent process is demonstrated. Evaluations of convergence as well   nario, i.e. production automation, is introduced. In Section
as performance are performed by simulations.                        III, a programming model for self-organization is introduced.
                                                                    Subsequently, the intended coordination dynamics of self-
                      I. I NTRODUCTION                              organizing resource-flow systems are presented (see Section
                                                                    IV) and the realization of a decentralized role allocation
   A key driver in the development of autonomous and auto-          strategy is discussed and evaluated (Section V). Finally, we
nomic systems is the handling of complexity in large appli-         conclude and give prospects for future work.
cations that consist of a great number of interacting entities.
Traditional management and failure-handling approaches are             II. D ESIGN OF S ELF -O RGANIZING R ESOURCE -F LOW
no longer applicable as they do not scale well with the size                                 S YSTEMS
of the systems and the communication required by a central             In production automation systems, resources are transported
management becomes prohibitive, even with modern high-              between machines to subject these work peaces to a specific
speed networks. Therefore, engineers and computer scientists        sequence of work steps. The sequence of machines is typically
turn to self-organization as a means to deal with large complex     static. The machines that process the resources are highly
systems and to keep up with the growth of such applications.        specialized and only have one particular capability, i.e. an
   In this paper, we present a self-organizing process for the      individual operation, they can apply to the resources. The
class of self-organizing resource-flow systems. This class can      transport of resources is fixed as well, e.g. by a static layout of
be applied to a great variety of domains such as production         conveyor belts. This rigid structure simplifies the management
automation and logistics and systems in it can be modeled           but has far-reaching implications, since reconfigurations are
with the Organic Design Pattern (ODP) [1]. The decentralized        obstructed. The complete system has to be halted when
process proposed here is analyzed and modeled with the              internal errors make a single system component inoperable.
tools provided by the SodekoVS project [2]. Changes in the          Adjustments of the production process have to be carried out
configurations of agents propagate through the system like          by stopping the system and retooling machines.
a wave until the system in its entirety has restored a stable          A visionary alternative are flexible, agent-based production
state. During reconfiguration, parts of the system that are not     lines that enable failure tolerance. Machines can autonomously
affected by the process or have already been reconfigured are       reconfigure and transports of resources are carried out by au-
still able to resume their normal work. Evaluations show the        tonomous guided vehicles (AGVs). Such a scenario is depicted
quick convergence to stable states and the reconfigurations         in Fig. 1 where robots process a car body which is transported
only affect system partitions.                                      between the processing stations by autonomous carts.
Fig. 1. Robots with different capabilities (icons to the right of the robot)
process a car by applying one of their capabilities each (highlighted icon).


   There are other domains where similar resource-flows occur,
e.g., logistic scenarios or web service orchestration, we call
this class of systems Self-Organizing Resource-Flow Systems.
Their basic structure can be described with the ODP [3]
which defines the elements that constitute the system and
their relationship as shown in Fig. 2. Task define the required
processing of resources. Processing steps are carried out by                          Fig. 2.   The elements of the ODP for Resource-Flow Systems.
agents. The states of resources are modified by applying
capabilities. Agents have a set of capability available and
exchange resources, based on the shop layout (inputs and                         hold during the entire runtime of the system. Whenever the
outputs). Which capability an agent applies and with which                       invariants are violated, the system has to be reconfigured to
agents it exchanges resources is determined by a role1 . Roles                   fulfill the invariants again (Restore Invariant Approach [3]).
have a precondition that describes where the resource is                         The individual agents are able to monitor local invariants and
coming from, which state it has, and which task has to be                        thus implement the observation part of the O/C. The controller
performed on it. They also have a postcondition that describes                   part of the O/C is then responsible to calculate a new allocation
to which agent the resource has to be given and which state and                  of roles that restore the resource-flow and ensure that each
task it has after the agent has processed it. Most importantly,                  agent has a role that fits its capabilities and its input/output
the role defines the capabilitiesToApply, i.e., what an agent is                 relationship with other agents. How this calculation is done,
supposed to do with the resource.                                                however, is not specified at this point.
   To fulfill the tasks for a resource (i.e., to apply the correct
                                                                                  III. S YSTEMIC P ROGRAMMING OF S ELF -O RGANIZATION
capabilities in the correct order), a resource-flow is established
by the allocation of roles to agents that determines how the                        In the research project ”Selbstorganisation durch Dezentrale
resource is moved through the system and processed on the                        Koordination in Verteilten Systemen”2 (SodekoVS) [2], a
way. This means the combination of the roles of the agents by                    programming technique is developed that allows to equip
their pre- and postconditions respectively, is a connected chain                 software systems with self-organizing features. The self-
of agents along which resources in the system are forwarded                      organizing inter-agent process is described by discrete design
and processed. There is usually one such chain or resource-                      elements [6]. This enforces a conceptual separation of the
flow for each task that has to be fulfilled in a system. Each                    agent functioning and the coordination, i.e. the correlation of
agent, however, can participate in more than one resource-flow                   agent activities.
and thus be involved in several tasks at the same time.                             First, a modeling level for the description of inter-agent
   The interactions between the agents to handle resources are                   self-organization is provided. This modeling level supplements
also defined on the abstract system class level and can be                       agent-oriented software engineering practices with an orthog-
inherited by applications based on the ODP. They describe,                       onal description level that concerns the dynamic properties of
amongst other things, the handover of resources and the                          agent-based software systems [6]. The driving force of self-
detection of agent failures with a heartbeat mechanism. This                     organizing dynamics are distributed feedback loops among
way, both a formal analysis of the system class [3], generalized                 system elements [7]. These result from the mutual influences
mechanisms to deal with problems in the system class such                        among system elements and control how fluctuations in the
as deadlocks [4] as well as a generic runtime environment [5]                    system context are disseminated and collectively responded to.
become feasible.                                                                 The systemic modeling level addresses the description of these
   The ODP also contains an Observer/Controller (O/C). This                      networks of influences and it has been found that the visual-
element of the system structure is the abstract extension                        ization of the mutual interdependencies of system elements
point for the self-organization or reconfiguration mechanism.                    is useful for the anticipation of the dynamics that software
Correct system behavior is defined by invariants that have to                    systems are able to exhibit [8], [6]. Using a graph-based
                                                                                 modeling approach, System Dynamics [9] modeling concepts
   1 Please note that some of the terminology used in ODP has a slightly         are specialized for describing Multi-agent systems (MAS).
different semantics than the same terms in agent-oriented software engineering
due to the historic roots of ODP.                                                  2 Self-Organisation by Decentralized Coordination in Distributed Systems
These models are given as an Agent Causal Behavior Graph
(ACBG) [10]. The nodes in this graph-based modeling level
represent system variables that characterize the macroscopic
state of a MAS. These describe the number of agents that show
a specific behavior, e.g. play a role. In addition, the current
value of an interaction rate can be denoted with a specific
node type. The links among these variables denote mutual
influences and interdependencies. In this respect, influences
denote additive or subtractive contributions to node values,
e.g. when the activity of an agent increases or decreases           Fig. 3.    The SodekoVS-Architecture for the embedding of decentralized
the stock of a warehouse. Interdependencies describe causal         coordination in MAS [17].
relations where the activities of agents are mutually linked,
e.g. the number of hypothetical service requesters in a system
is expected to be positively linked to the number of activations    coordination is exemplified in the Sections IV and V. A
of service providers. When the number of requesters increases,      configuration language [10] allows to map ACBGs to agent-
the number of activations increases as well and vice versa.         based software systems. These mappings describe the realiza-
    Secondly, a programming model that allows the enactment         tion of influences among agents, i.e. the coordination-related
of ACBG-based prescriptions of self-organization processes          logic that controls the initiation, participation, and reaction to
[11], [10] facilitates application development. The key element     interactions as well as the media that mediate interactions. The
is a distributed architecture for the enactment of decentralized    detailing of these models, as a systematic programming effort,
inter-agent processes (cf. Fig. 3) [11]. This architecture serves   is not discussed in this article but details on the configuration
as a reference model for the integration of ACBG-based              of process enactments can be found in [17].
processes in MAS. It provides a conceptual framework for
fitting in different self-organization mechanisms and follows           IV. S YSTEMIC M ODEL OF A DAPTATION DYNAMICS
a layered structure. The topmost layer (Application Layer)             In [18], the systematic integration of decentralized coordi-
contains the realization of an agent-based application. The         nation strategies in MAS has been discussed. The conception
contained agents are understood as self-contained providers         of the appropriate coordination is approached by modeling the
of functionalities (Application Functionality). The contained       problematic, unintended behavior of applications. Based on the
agents individually control their activities and an underlying      identification of the Problematic Dynamic, a corresponding
Coordination Layer enables the purposeful affecting of agents       Solution Dynamic is derived that supplements the application
to concert the localized activities and establish collective        behavior with additional interdependencies and inter-element
behaviors.                                                          feedbacks to correct the system behavior and alleviate unin-
    The Coordination Layer describes an event-based dis-            tended effects.
tributed system [12], which allows to realize mutual influ-            The Problem Dynamic of an ODP-based resource-flow
ences among system elements. These influences correspond            systems is illustrated in Fig. 4 (right). Initially, agents are
to relations in ACBG-based models of inter-agent processes,         running and one or several roles are allocated to them which
thus the layer is a means to enact the described processes in       are executed in order to process resources. Random errors
MAS. The establishment of inter-agent influences, particularly      make it impossible for the agent to apply one or more of
for the construction of self-organizing systems, is based on        its roles. The adoption of roles that can not be applied is
two types of mechanisms [13], i.e. techniques for the infor-        controlled by a fluctuating rate (RF interrupt) that is positively
mation exchange among agents (e.g. reviewed in [14]) and            influenced by the availability of running, thus breakable,
mechanisms for the (adaptive) adjustments of agents (among          agents and the changing number of error events (Error). This
others classified in [15]), due to the perceived information        rate describes the resource-flows (RF) that are interrupted, due
(see Section VI). The Coordination Layer contains two types         to the breaking of agents. These failures within individual
of functional elements for the encapsulation of these aspects.      agents limit the number of running agents (negative link), thus
Coordination Media are conceptual containers of so-related          the problematic system behavior is dominated by a negative
interaction infrastructures. Specific interaction modes, e.g. the   feedback loop (α).
mediation by an environment [16] or Linda-like tuple spaces,           If not handled, this dynamic causes the number of agents
are encapsulated and reused by a generic interface [11]. Coor-      that are not running to increase over time. The design of
dination Endpoints interact on behalf of agents via these media     an appropriate Solution Dynamic concerns the derivation of
and are able to influence the agent execution. These elements       agent behaviors that counteract this unintended effect. A very
are used to encapsulate and automate the coordination-related       general structure is given on the left hand side of Fig. 4.
activities. These activities concern the interactions vie Media,    Agents that have roles they can no longer apply are Waiting
i.e. the invitation and participation of interactions, as well as   for Reconfiguration. The rate of interrupts positively influences
the affectation of modifications in the agent models.               the increase of this variable. The system is equipped with
    The ACBG-based modeling of dynamics of inter-agent              a reconfiguration mechanism, and for each of the waiting
                                                                     to play a role that involves Cap. 1 but may also apply Cap.
                                                                     3. Two types of Cart agents represent the initial provision
                                                                     (Producer) and the final collection of the processed workpieces
                                                                     (Consumer). Due to an error Robot1 can no longer apply Cap.
                                                                     1 and sends a request for assistance (2). This request is routed
                                                                     along the resource flow till it reaches an agent that is capable
                                                                     to execute the needed capability, here Robot 2 which replies
                                                                     to the request (3). The reply is routed to the requesting Robot
                                                                     as well as the Carts that are connected to the swapping robots.
                                                                     Consequently, the Robots and Carts reconfigure their local
        Fig. 4.   The Problem and Solution Dynamic of the ODP.
                                                                     roles. The robots update their roles and the resource flow
                                                                     is reestablished by adjusting the ports in the pre- and post-
                                                                     conditions of the roles of the connected Carts to ensure that
agents a new configuration is determined. Thus the system            workpieces reach the robots in the intended sequence. In the
shows a causal relation. In absence of waiting agents, no            best case, the originating and the receiving agents can just
reconfigurations take place. Occurrences of waiting agents           switch their roles, thus restoring the resource-flow.
enforce subsequent reconfigurations (Reconfigure) to restore a          In Fig. 5, Robot 2 is able to provide capability Cap. 1 but
set of executable roles. The reconfigurations thus increase the      Robot 1 is not able to replace Robot 2 as it is missing the
number of Running agents by complementing a counteracting            currently utilized Cap. 2 (3). Thus after the swap, Robot 1
feedback loop (β).                                                   remains in a problematic state and requests assistance (4).
   This Solution Dynamic deliberately omits the concrete             This request is propagated again till it reaches a robot with
mechanism with which new role allocations are determined.            the required capability (Robot 3) and the swap proceeds as
Also the locally applied techniques to the enactment of recon-       above (5). Since Robot 1 is able to replace the currently active
figurations are abstracted. A method to express the problem of       capability of Robot 3, i.e. Cap. 3, the correct sequence of ca-
finding a fitting role allocation as a constraint-solving problem    pabilities is finally reestablished (6). Here, the reconfiguration
has been presented in [19] and solved with a centralized             logic has been described for agents that only play on role at a
approach. Whenever an agent can no longer apply one of               time and the subsequent simulations concerns this simplified
its roles or whenever an agent breaks, the resource-flow is          scenario. In principle, agents can be part of several resource-
interrupted. When the interruption is detected, the system           flows and in that case, the agents only reconfigure for roles that
reconfigures in order to restore the flow. During the course of      include the broken capability and keep processing resources
the reconfiguration process, a new allocation of roles to agents     of other tasks. Consequently, the informed Carts change only
is calculated and the roles are communicated to the agents           the ports of the affected roles accordingly.
which then apply them again. The next section describes                 A detailed ACBG of the outlined reconfiguration algorithm
an alternative reconfiguration mechanisms in which new role          is illustrated in Fig. 6. This description refines the previously
allocations are found in a decentralized fashion by propagating      given Solution Dynamic (cf. Fig. 4) as it illustrates how the
the demand for local reconfigurations through the system.            decentralized strategy relates to the dynamics of ODP. In
                                                                     addition, it indicates the system-level effects of the decentral-
   V. WAVE - LIKE D ECENTRALIZED R ECONFIGURATION                    ized reconfiguration that are examined in Section V-A. When
   A completely decentralized reconfiguration approach is            agents are Waiting for Reconfiguration due to error events,
based on the idea that a wave of role re-allocation runs through     they show two behaviors. First, they are Deficient as one or
the system in order to re-establish the resource-flow. Assuming      more roles, which are required for the processing of resources,
that each agent is capable to exhibit a set of capabilities (see     are inoperable. These roles are distinguished by the reason for
Section II), a correct resource flow can be (re-)established by      deficiency. Agents can be rendered deficient by error events
the appropriate swapping of roles. Failing agents adopt actable      (By Break) or they deliberately decided to abandon a role in
roles and in return other agents help out by providing the           order to adopt another role on behalf of another agent (By
unactable roles. The failing agent emits a wave of reallocations     Change). In the former case, the agents are rendered inefficient
by sending requests for assistance along the resource flow.          and in the latter case agents assist other agents. Secondly,
Each recipient has to decide locally if it is capable and will       these agents are considered to be Non-Active, i.e. they have
swap roles. Generally, a single swap of roles is not enough to       the capacity to play another role. Agents can concurrently play
reestablish the full sequence of activities and transitive changes   several roles, therefore, the non-activity only denotes that the
of roles are required.                                               agent is underutilized, i.e. is capable to exhibit another role.
   The operating principle is exemplified in Fig. 5 for a simple     This means that in agents that can play several roles, the
scenario that requires a transitive swap of roles. Three Robots      Running and Non-Active roles do not exclude each other, but
are connected in series (1). For each agent the set of actable       an agent can be associated to an active role (Running) and
capabilities are listed and the topmost capability is used in        still have the capacities to play additional roles (Non-Active).
the currently active role, e.g. Robot 1 is currently configured         Agents are equipped with the ability to autonomously
                                          Fig. 5.   Exemplification of the decentralized reconfiguration.


                                                                           changing agents (source) commit to. Another commonality
                                                                           is that the adjustment of a role entails the restoring of the
                                                                           flow of resources among the agents (Restore RF). When
                                                                           an agent has adjusted its role, those agents that received
                                                                           resources from it or gave resources to it need to adapt as
                                                                           well. This activity is separated from the role change as the
                                                                           agents do not deliberately decide about these changes. These
                                                                           are reconfigurations within connected agents that are enforced
                                                                           as they are consequences of the deliberate changes. These
                                                                           reconfigurations may as well by transmitted via Coordination
                                                                           Media (see Section III).
                                                                              Assisting another agent introduces new deficiencies, as the
                                                                           assisting agent is giving up one role that needs to be played
                                                                           by another agent (Deficient by Change). Thus the net amount
                                                                           of Non-Active agents is unaffected. However, these changes
                                                                           may be necessary in settings where agents can not swap
                                                                           roles directly and transitive changes of roles are required (as
                                                                           exemplified in Fig. 5). When Non-Active agents Resume to
                                                                           adopt a role, the number of Non-Active agents is reduced.
                                                                           Still, this requires the availability of Non-Active agents.
                                                                              The changing activities of agents control the overall negative
Fig. 6. ACBG for the Solution Dynamic of the wave-like, decentralized      feedback (β) that increases the number of operational agents
reconfiguration algorithm                                                  and reduces the number of deficient agents. Three auxiliary
                                                                           feedbacks influence the exhibited system dynamics (δ,γ,ǫ).
                                                                           First, agents that assist others create a reinforcing feedback
change their allocation (Change Role). Deficient robots in-                loop (δ), which originates from the fact that an assisting agent
dicate their shortcoming to other agents (communicate de-                  adds additional deficiencies to the system. If an agent resumes
ficiency) via a Coordination Medium (cf. Section III). The                 its activities, deficiencies and non-active agents are removed
medium controls the sequential reception of the request along              instead, thus instituting a balancing feedback (γ). The ability
the flow of resources in the production line. Recipients de-               to resume is limited by the amount of Non-Active agents,
cide locally whether to change their role-allocation or not,               leading to a third balancing feedback (ǫ).
based on their individual abilities. The changing behavior is
distinguished by the receiving agent that adjusts the local                A. Estimating the Effects of Adaptation Dynamics
configuration. Non-Active agents Resume the executions since                  The outlined Solution Dynamic (cf. Fig. 4) denotes an inter-
these become operational. Running agents that adjust their role            agent process that can be implemented with the systemic pro-
Assist the requesting agent. These roles have different effects            gramming model (see Section III). Prior to the detailed design
on the agent population. All changes remove deficiencies and               and embedding of this process, the effected system behavior
the annotation source.targetRole == destination.Role indicates             is anticipated. One approach to estimate the dynamics of
that only those deficiencies are removed (destination) that                self-organizing MAS is their simulation in stochastic process
algebra models [20]. It is important to note that the resulting              the assisting adjustment. Deficiency ends when another agent
stochastic models abstract from the agent implementations and                processes the request for a role change, communicated via the
their internal reasoning. Instead, stochastic terms are used                 channel reqc . This simulation model abstracts from the agents
to describe the dynamics with which specific behaviors are                   that participate in the system. The time needed to restore the
adopted or left. The number of currently active process terms                resource flow (cf. Fig. 6), i.e. the adjustments of the roles of
resembles the number of agents that show specific behavior,                  the directly connected agents to reestablish the correct flow of
i.e. the variable-values of an ACBG-based process description.               resources among machines, is part of the time delay τrc . We
   Fig. 7 illustrates the simulation model used to anticipate                assume that the rerouting of resource transportations is always
the Solution Dynamic. The model is given in stochastic π-                    possible.
calculus [21] and simulated with the Stochastic Pi Machine3                     Simulations indicate that, at a high enough level of redun-
(SPIM). Details on the simulation language and graphical                     dancy, the system reliably recovers due to the decentralized
notation can be found in [22]. Agents are modeled as processes               switching of agent roles. This process describes a structure
that communicate/interact via channels. The stochastics of                   formation as the system maintains the operational system
interactions are given by annotating processes with delays                   configuration. The fraction of recovering situations is predicted
(τ ) and assigning interaction rates to channels [21], [22]. The             by this simulation to depend on the redundancy level in a
notation x denotes the sending of data on the channel x and                  similar way as is shown in the results section below.
x denotes the reception of data via the channel x.
                                                                             B. Agent-based Realization
                                                                                After the anticipation of the affected system behavior, this
                                                                             reconfiguration strategy has been integrated in a MAS by using
                                                                             systemic programming model (see Section III). The system
                                                                             implementation (Application Layer, see Figure 3) makes use
                                                                             of the freely available Jadex4 agent framework. The Robots
                                                                             and Carts within production lines are represented by Jadex-
                                                                             agents and the exchanged workpieces are mimicked by objects
                                                                             that are exchanged via FIPA Agent Communication Language
                                                                             (ACL) messages. A realization of the Coordination Layer (see
                                                                             Fig. 3) for this agent platform is utilized [11].
                                                                                For this application scenario a tailored Medium realization
                                                                             is utilized that routes request and reply messages along the
Fig. 7. Simulation Model of the Solution Dynamic in Stochastic π-Calculus.   resource flow. Conceptually though, agents are aligned in
                                                                             a circle thus all agents can be reached independent of the
   In the simulation model, the number of agent behaviors that               location of the incapacitated agent. Endpoints encapsulate the
are exhibited are expressed by the number of active processes.               logic to coordinate the reconfiguration process and interact
Processes communicate via two channels. The channel reqb is                  via the Medium. Endpoints observe the agent-operation and
used to communicate requests for a role-switch, due to an                    initiate the reconfiguration process by sending a help request
internal error, i.e. the internal breaking of the requester. When            if an agent becomes deficient. The help request is forwarded
roles are requested to be switched in order to assist an agent,              through the medium. Each endpoint along the message path
these requests are send via the channel reqc . Operative agents              decides whether to adopt the deficient agent’s role or continues
are denoted by the running process. Internal errors occur with               forwarding the help request. If the endpoint decides to adopt
a fixed rate, as defined by the delay τb . Inoperative robots                the role, a reply is sent. The reply is sent in both directions
are represented the concurrent execution of the def icientb                  through the medium to inform all agents which are affected
and non-active processes. Deficient processes end when a                     (robots and connected carts) by the reconfiguration process.
request for re-assignment of the affected role is processed by a             Again, each endpoint receiving the reply decides to change the
recipient. The non-active processes become resume processes                  agent configuration. The reply is sent backward through the
when they receive any request for re-allocation. The time delay              medium until all affected agents are informed. If an endpoint
τrc resembles the time needed to reconfigure. Afterwards, the                receives multiple coordination messages these messages are
agent is in operation, i.e. exhibits the running process. When               queued and processed in the order of their arrival.
running agents receive a request to switch roles, they convert
                                                                             C. Implementation Test Results
to the concurrent execution of the assist and def icientc
processes. The assistance transforms back to the running                       The example system illustrated in Fig. 5 has been imple-
process, delayed by the time to reconfigure the agent (τrc ). The            mented and tested for measuring the handling of breaking
def icientc processes describe the search for another agent that             capabilities. When the robot is rendered incapacitated by such
is able to play the role that an assisting agent possessed before            an event, the associated endpoint notices this and initiates
  3 http://research.microsoft.com/en-us/projects/spim/                         4 http://jadex-agents.informatik.uni-hamburg.de/
an interaction via the Coordination Medium that triggers the                    However, these papers from operations research focus on
swapping of roles. The first swap involves the incapacitated                    optimization of a shop floor and do not take into account
agent that is Deficient by Break. In this system configuration,                 robustness and reconfiguration that happens at run-time. Also,
a second swap is required that resolves a transient Deficient                   only partial problems are investigated, either focusing on the
by Change agent behavior.                                                       routing of carts or the scheduling of production machines.
   In addition, we examine the relation of the redundancy                          The work presented here concerns the run-time reconfig-
within agents with the effectiveness of the reconfigurations.                   uration by self-organization. The maintained structure is a
The effectiveness of the reconfiguration procedure is in-                       correct sequence of agents that are perturbed by individual
fluenced by the number of alternative capabilities that are                     failures that incapacitate agents. Prominent alternatives to the
available to the individual agents. This level of redundancy                    process presented here are centralized/distributed constraint
is measured by the ratio of the number of individual ca-                        solving techniques and market-based mechanisms. The correct
pabilities (Ci ), to the absolute number of capabilities (C)                    configuration is described with constraints and an approach
that are required for the processing of a workpiece ( CCi ).                    to use centralized constraint solving to restore functionality
In the following, we assume a homogeneous setting, where                        after a failure has already been published [19]. Consequently,
the robots are equipped with an equal number of redundant                       distributed constraint solving techniques [26] can be used as
capabilities. The composition of these capabilities is normally                 well, e.g. as studied in [27]. An example for market-based
distributed. In Fig. 8, measuring results for a simple scenario                 mechanisms is given in [28]. A manufacturing line is provided
with 10 different capabilities and agents are shown. Each                       with a flexible transport mechanism and work pieces and
single run processes of a fixed number of workpieces while,                     machine agents negotiate for the execution of working steps.
at two fixed instances of time, a randomly selected agent                       In a way, the algorithm proposed in this paper is an optimistic
is incapacitated. A first estimate of the effectiveness of the                  and minimalistic version of a distributed constraint solver. By
reconfigurations is the number of exchanged messages (see                       exchanging roles, the agents collectively restore the invariants
Fig. 8). The number of messages increases quickly when the                      of the system. The strategy is minimalistic since the number
number of redundancies decreases. This measurement can be                       of required messages and the amount of shared information is
analytically fitted with (c1 ∗(1−x))2 +c2 .5 . A complementary                  reduced. It is optimistic because the assisting role changed are
measurement is the number of hops that requests for assistance                  carried out before the complete solution is calculated. Its main
travel before a swapping agents is found. This describes the                    advantage over traditional distributed constraint solvers is the
logical distance between the swapping agents. The results for                   minimal amount of calculation that is involved at the agents.
this measurement have shown the same characteristics like the                   They can therefore be very small with only minimal CPU
message count and can be fitted with the same function (but                     power and RAM, making them cheap and easily replaceable.
different constants). Thus the decentralized reconfiguration                       Here, these design alternatives are qualitatively character-
strategy is particularly suited for production lines where the                  ized by a subset of criteria from [29]. Their quantitative com-
capability types are often available.                                           parison is left for future work. A first aspect is the necessary
                                                                                communication. The presented process minimizes the message
                                                                                content and only the information that immediately necessary
                                                                                to resolve a local failure is communicated. The number of
                                                                                messages ranges from a single role swap (best case) to the
                                                                                successive swapping of roles by all agents (worst case). The
                                                                                dependence of this measure on structural properties is shown
                                                                                in Figure 8. The communications are only carried out when
                                                                                failures are present. Alternatives involve that coordination-
                                                                                related messages have to be exchanged during the normal
                                                                                system operation, e.g. as workpieces constantly negotiate their
Fig. 8. Measurement results. The averaged number of messages are plotted        further processing [28]. Also the amount of computations and
over the redundancy of capabilities.                                            the considered/exchanged knowledge about the system state
                                                                                is minimized. The participation in the process involves only
                        VI. R ELATED W ORK                                      the local consideration whether an agent is capable to play a
                                                                                required role. Now further information about the global system
   Ant Colony Optimization has been used for decentralised
                                                                                state is processed. Centralized constraint solvers require the
control in production systems. [23] uses the mechanism to
                                                                                full knowledge about the system and decentralized solvers
control autonomous vehicles, similar to our carts. The distri-
                                                                                require the information from neighboring agents. In addition,
bution of jobs to production machines has not been a concern
                                                                                the adaptations are carried out concurrently to the system
there. The more complex problem of scheduling jobs to run
                                                                                operation. Unaffected partitions of the production line continue
on certain machines has, e.g., been tackled in [24] and [25].
                                                                                to work. Finally, the accuracy of the quality of the found
  5 c , c are application dependent constants. In this case, these are 10 and
     1 2
                                                                                solution, i.e. stable configuration, varies. The process follows
12 respectively.                                                                the heuristic that role swaps of nearby agents are favored over
the swaps of (logically) distant agents. The explicitly treatment                   [6] J. Sudeikat and W. Renz, “Qualitative modeling of mas dynamics -
of the underlying constraint problem allows in principle to                             using systemic modeling to examine the intended and unintended con-
                                                                                        sequences of agent coaction,” in Agent-Oriented Software Engineering
prepare the optimization of the found solutions.                                        X. Springer, 2009, to be published.
                                                                                    [7] Y. Brun, G. di Marzo Serugendo, C. Gacek, H. Giese, H. Kienle,
                          VII. C ONCLUSIONS                                             M. Litoiu, H. Müller, M. Pezzè, and M. Shaw, Software Engineering
                                                                                        for Self-Adaptive Systems. Springer-Verlag, 2009, ch. Engineering Self-
   In this paper, we have described a decentralized recon-                              Adaptive Systems through Feedback Loops, pp. 48–70.
                                                                                    [8] W. Renz and J. Sudeikat, “Modeling feedback within mas: A systemic
figuration process to restore valid system configurations in                            approach to organizational dynamics,” in Organised Adaptation in Multi-
self-organizing resource-flow systems. The reconfiguration al-                          Agent Systems, 2008, pp. 72–89.
gorithm works by exchanging roles with neighboring agents                           [9] J. D. Sterman, Business Dynamics – Systems Thinking and Modeling for
                                                                                        a Complex World. McGraw-Hill, 2000.
and by propagating change requests in a wave-like manner                           [10] J. Sudeikat and W. Renz, “MASDynamics: Toward systemic modeling
until all of them could be satisfied. The mechanism has                                 of decentralized agent coordination,” in Kommunikation in Verteilten
been developed by combining a top-down process for the                                  Systemen, ser. Informatik aktuell, 2009, pp. 79–90.
                                                                                   [11] ——, “Decomas: An architecture for supplementing mas with systemic
description of resource-flow systems and a bottom-up process                            models of decentralized agent coordination,” in Proc. of the 2009
for the design of agent coordination. Its performance has been                          IEEE/WIC/ACM Int. Conf. on Intel. Agent Tech., 2009, pp. 104–107.
demonstrated with a number of simulations.                                         [12] G. Mühl, L. Fiege, and P. Pietzuch, Distributed Event-Based Systems.
                                                                                        Springer-Verlag New York, Inc., 2006.
   The most interesting feature of the decentralized process                       [13] J. Sudeikat and W. Renz, Applications of Complex Adaptive Systems.
proposed here is that reconfigurations are organized locally in                         IGI Global, 2008, ch. Building Complex Adaptive Systems: On Engi-
the production line, i.e. the rest of the system is not impaired                        neering Self–Organizing Multi–Agent Systems, pp. 229–256.
                                                                                   [14] T. DeWolf and T. Holvoet, “Decentralised coordination mechanisms as
by a failure. Thus, parts of the system that are not involved into                      design patterns for self-organising emergent systems,” in Engineering
a local reconfiguration can continue to run normally. The way                           Self-Organising Systems, vol. 4335/2007, 2007, pp. 28–49.
the reconfiguration propagates also ensures that only a small                      [15] G. D. M. Serugendo, M. P. Gleizes, and A. Karageorgos, “Self-
                                                                                        organisation and emergence in mas: An overview,” in Informatica,
amount of agents is in a state of non-processing resources                              vol. 30, 2006, pp. 45–54.
at an instance of time. This feature will be prominent also                        [16] H. V. D. Parunak and S. Brueckner, “Engineering swarming systems,” in
when using the wave-like algorithm in non-linear production                             Methodologies and Software Engineering for Agent Systems. Kluwer,
                                                                                        2004, pp. 341–376.
situations, which is a straight-forward generalization instance.                   [17] J. Sudeikat and W. Renz, “Programming adaptivity by complementing
Local heuristics taking into account exchange–success rates                             agent function with agent coordination: A systemic programming model
could be predefined or evolved using learning algorithms.                               and development methodology integration,” Communications of SIWN,
                                                                                        vol. 7, pp. 91–102, may 2009, iSSN 1757-4439.
Future work includes a more detailed study on the combination                      [18] ——, “On the modeling, refinement and integration of decentralized
of bottom-up design of coordination methods as proposed in                              agent coordination – a case study on dissemination processes in net-
SodekoVS and of top-down design methodologies as promoted                               works,” in Self-Organizing Architectures, 2010, pp. 251–274.
                                                                                   [19] F. Nafz, F. Ortmeier, H. Seebach, J.-P. Steghöfer, and W. Reif, “A
with the ODP. This will also include a comparison of their                              universal self-organization mechanism for role-based Organic Comput-
respective advantages and problems that occur when both                                 ing systems,” in Proceedings of the Sixth International Conference on
worlds are combined.                                                                    Autonomic and Trusted Computing (ATC-09), 2009.
                                                                                   [20] M. Casadei, L. Gardelli, and M. Viroli, “Simulating Emergent Properties
                                                                                        of Coordination in Maude: the Collective Sorting Case,” in 5th Int.
                         ACKNOWLEDGMENT                                                 Workshop on the Foundations of Coordination Languages and Software
                                                                                        Architectures (FOCLASA), 2006.
   This research is partly sponsored by the German research                        [21] C. Priami, “Stochastic π–calculus,” Computer Journal, vol. 6, pp. 578–
foundation (DFG) in the project SodekoVS and in the DFG                                 589, 1995.
special priority program “Organic Computing” (SPP 1183) in                         [22] A. Phillips, Symbolic Systems Biology: Theory and Methods. Jones and
                                                                                        Bartlett Publishers, 2010, ch. A Visual Process Calculus for Biology.
the project SAVE ORCA.                                                             [23] V. A. Cicirello and S. F. Smith, “Ant colony control for autonomous
                                                                                        decentralized shop floor routing,” in International Symposium on Au-
                              R EFERENCES                                               tonomous Decentralized Systems, 2001.
                                                                                   [24] N. Liouane, I. Saad, S. Hammadi, and P. Borne, “Ant systems & local
 [1] H. Seebach, F. Ortmeier, and W. Reif, “Design and Construction of                  search optimization for flexible job shop scheduling production,” Int.
     Organic Computing Systems,” IEEE Congress on Evolutionary Compu-                   Journal of Comp., Comm. & Control, vol. 2, no. 2, pp. 174–184, 2007.
     tation, 2007, pp. 4215–4221, Sept. 2007.                                      [25] C. Gagné, M. Gravel, and W. L. Price, “Solving real car sequencing
 [2] J. Sudeikat, L. Braubach, A. Pokahr, W. Renz, and W. Lamersdorf,                   problems with ant colony optimization,” European Journal of Opera-
     “Systematically engineering selforganizing systems: The sodekovs ap-               tional Research, vol. 174, no. 3, pp. 1427 – 1448, 2006.
     proach,” Electronic Communications of the EASST, vol. 17, 2009.               [26] M. Yokoo, E. Durfee, T. Ishida, and K. Kuwabara, “The distributed
 [3] M. Güdemann, F. Nafz, F. Ortmeier, H. Seebach, and W. Reif, “A spec-              constraint satisfaction problem: Formalization and algorithms,” IEEE
     ification and construction paradigm for Organic Computing systems,”                Transactions on Knowledge and Data Engineering, vol. 10, no. 5, pp.
     in Proceedings of the Second IEEE International Conference on Self-                673–685, 1998.
     Adaptive and Self-Organizing Systems, 2008, pp. 233–242.                      [27] G. Clair, E. Kaddoum, M.-P. Gleizes, and G. Picard, “Self-regulation in
 [4] J.-P. Steghöfer, P. Mandrekar, F. Nafz, H. Seebach, and W. Reif, “On              self-organising multi-agent systems for adaptive and intelligent manu-
     Deadlocks and Fairness in Self-organizing Resource-Flow Systems,” in               facturing control,” in SASO ’08: Proc. of the 2008 Sec. IEEE Int. Conf.
     Proceedings of the 30th International Conference on Architecture of                on Self-Adaptive and Self-Organizing Systems, 2008, pp. 107–116.
     Computing Systems (ARCS), LNCS 5974. Springer, 2010, pp. 97–100.              [28] N. R. Jennings and S. Bussmann, “Agent-based control systems,” IEEE
 [5] F. Nafz, F. Ortmeier, H. Seebach, J.-P. Steghöfer, and W. Reif, “A generic        Control Systems, vol. 23, no. 3, pp. 61–74, 2003.
     software framework for role-based Organic Computing systems,” in              [29] E. Kaddoum, M.-P. Gleizes, J.-P. Georgé, and G. Picard, “Characterizing
     SEAMS 2009: ICSE 2009 Workshop Software Engineering for Adaptive                   and evaluating problem solving self-* systems,” in COMPUTATION-
     and Self-Managing Systems, 2009.                                                   WORLD ’09: Proc. of the 2009 Computation World, 2009, pp. 137–145.