Designing Reliable Communication for Heterogeneous
                     Computer Systems

             Miroslaw Hajder, Janusz Kolbusz, and Roman Korostenskyi

            University of Information Technology and Management in Rzeszow
     miroslaw.hajder@gmail.com,{jkolbusz,rkorostenskyi}@wsiz.rzeszow.pl


       Abstract. This study describes the network design solution to the problem of
       connecting heterogeneous computer systems based on analysis of multipartite
       hypergraphs. To do this proposes a mathematical model of reliability for the two
       modes of operation of the system: with redundancy communication subsystem
       and the division of communication load. As the evaluation criteria applied solu-
       tions expected changes in processing capacity, latency communication and sys-
       tem reliability. Solution design task is sought in the collection Pareto optima,
       which describes a method for selecting a particular solution in case of equiva-
       lence with respect to the vector of the objective function.

       Key words: Multipartite hypergraphs, architecture connections, system reliabil-
       ity model, design methodology


1    Introduction
Another important feature of modern processing systems is the variety of offered ser-
vices. Nowadays, in the same network they are different, often incompatible with com-
munication services (eg. Isochronous and synchronous transfer). This issue requires a
change of quality of provided services, through the dynamic allocation of independent
communication channels to users or services present in the network specifically in par-
ticular, this applies to multimedia services rendered in real-time or services comprising
critical infrastructure. Modern distributed systems are also characterized by high dy-
namics of changes in operating parameters. Load elements of computing and commu-
nications is changing rapidly, which prevents the design and execution of the network to
meet even medium-term requirements of the users. Another disadvantage occurring in
communication subsystems ubiquity of traffic is bursty traffic, obstructing, and some-
times preventing proper functioning of the network. Currently, the solution to the above
problems is the use of load leveling, both communications and computing. Moreover,
an effective solution to most of these problems can be also providing a flexible re-
configuration of connections, preferably at the logical level, without having to modify
the hardware architecture. In this way, links can be dynamically adapted to the current
traffic pattern.
     Effective methods of reconfiguring connections should be seen in the use of modern
communication technologies, especially those that allow to realize it at the logical level
and which additionally improve the utilization of physical communication channels.
An interesting issue is the construction of multi-channel network of bus-sharing bus
                                                                                     183

logic dynamic range conversion of a set of buses to which the user is attached. Because
of that it becomes possible to adapt to the existing network architecture in the traffic
patterns. However, the use of such architecture requires solving a specific design task,
which for obvious reasons should be characterized by an acceptable time complexity
and memory. Because of the combinatorial nature of the task it is difficult to meet.
    These issue may include the design task to build large class of system configuration
tasks. This task is mostly decomposed into three basic subtasks: a. The selection of sys-
tem components; b. their deployment; c. determining the connection between them. In
previous work, component selection subtask is solved inter alia by implemented using
for this purpose methods seeking the shortest path [1], [2] Backpack block [2], [3] the
clustering multipartite graph [4], mullioned clique [5], morphological analysis. Sub-
sequently, a solution subtasks arrangement of the components, currently is the most
frequently used variants task assignment [2], [6], [7], [8]. To determine the connections
between system components there is the most commonly used a method of agglomera-
tion [9], [10] and the method solving the task of building the optimal hierarchy [11].


2   Architecture Connections and System Reliability Model

The Fig. 1 shows a distributed system consisting of KN node calculation - decoration,
each of which has a Kl retunes elements of the transceiver and the KB communica-
tion buses. Buses Bi (i = 1, . . . , KB ) are logical and can be implemented on the basis
of methods of reproduction of wave-go in one physical bus DF . The addition of logi-
cal channels to any node Ni physical bus BF is done by using a physical connection
channels l and distributors of bus channels C.


               Fig. 1. Trunk generalized computing system architecture


    Interconnect architecture of the analyzed system can be dynamically reconfigured
by tunable elements transmitter - receiver. If you change the traffic pattern elements of
the transmitter - receiver extent of the wave changes, and indeed attach themselves
184

to another logical bus. The system can operate in two modes: a. redundancy com-
munication subsystem; b. the communication load sharing. To evaluate the operating
characteristics of the system in one of these modes, the probabilistic model suggested
combinatorial, and in particular evaluating the reliability R. In this model, the kit: the
transceiver - receiving (≈), the physical connection cable (l), and a manifold physical
channel (C) are treated as a single device connection. Let pio - the probability of perfor-
mance transceiver - the receiving node computational pl - probability of physical fitness
connecting channel; pc - the probability of the distributor channel efficiency Trunk, the
pf k - probability of physical fitness BUS channel. Then, the probability of pku effi-
ciency of connecting the node to the logical channel BUS is defined as: pku = pio pl pc
and puku probability of the merger selected node with other nodes calculation is equal
to puku = pio pl pc pf k .
    Trunk consider the reliability of a distributed system with connections complete
(each compute node is connected to each bus logic) and equal rights computational
nodes, which is the most general example of this class of systems. The probability
pwe (kwe ) efficiency connection sets providing connection to the bus channel not less
       min
than kwe   compute nodes with their total number KN determined by the expression:
                                          KN
                                                                    k
                                          X
                                                  Cki piku (1 − pku ) we
                                                                           −i
                     pwe (kwe ) = pf k                                                 (1)
                                            min
                                         i=kwe

    Let KB to be the amount of bus logic (ie. The maximum multiplicity system inter-
face), psp
        we - likely performance computing node. Then, using expression (1), according
to the proposed condition for the system in redundancy, reliability R is defined by the
following formula:

         KN                                       KB
                          j              K −j                      k            K −k
         X       j
                                                  X
  R=            CK N
                     (psp         sp
                       we ) (1 − pwe )
                                       N                 k
                                                        CK  p (j) (1 − pwe (j)) B
                                                           B we
                                                                                       (2)
          min
       j=kwe                                      k=1

    Consider the current system of equal rights computational nodes working in load
sharing mode of communication. Let W (kwe , km , σ) to be the amount of system effi-
ciency states consisting of kwe compute nodes, connected with km canals system, which
can be selected with the existence of σ denials transceiver components. In order to de-
termine the amount of W (kwe , km , σ) state performance of the system in the event of
damage, etc., it is proposed to use the methods of include - exemption. H1 (σ) number
of states malfunction of the entire system in case of refusal not less than one minimum
section for a system with equal rights nodes is equal to:
                            σ−KB          2     σ−KB        PKB−1 α
            H1 (σ) = KN C(K           + CK     C(K                 CKB =
                         σ−KB
                              N −1)KB        N
                                              2
                                                  PKB −1 α α=1
                                                    N −1)KB
                                                                                 (3)
                       C(KN e −1)KB KN + CKN α=1 CKB
                                                                  Piσ       i
    Then, the value of W is equal to W (kwe , km , σ) = Ckσwe km + i=1 (−1) Hi (σ),
          σ
where: Ckwe km - total number of states σ denials system for transmitting elements -
receiving; iσ - the maximum number taken into account when assessing the minimum
cross sections; Hi (σ) - the number of states of a computing system failure, refusal
                                                                                             185

to transmit elements - receiving no less than the minimum i sections. The probability
Pku (kwe , km , σ), kwu ensures consistency nodes by km bus for refusals σ is defined
by the expression:
                                                     kwe km −σ                       σ
              Pku (kwe , km , σ) = W (kwe , km , σ) pku        (1 − pku )                    (4)
    Using the expression (4), the likelihood of Pku (kwe , km ) ensures the consistency
of computing nodes kwe , km buses, with denials of the existence of σ can be written as:
                                           (kwe −1)km
                                              X
                      Pku (kwe , km ) =                   Pku (kwe , km , σ)                 (5)
                                              σ=0

   Using the expression (5), it will determine the likelihood of consistency Puku (kwe ),
kwe compute nodes:
                   PKB        l    l             KB −l
   Puku (kwe ) = l=k     min CK pf k (1 − pf k )
                         m      B
                                                        Pku (kwe , l),
        min
where: km - the minimum required number of buses needed to provide the required
bandwidth. In this way, the reliability of calculation system is equal to:
                          KN
                                              n               K −n
                          X
                                   n
                  R=              CK N
                                       (psp         sp
                                         we ) (1 − pwe )
                                                         N
                                                                     Puku (n)                (6)
                            min
                         n=kwe

    For computing system client-server mode redundancy it will determine the likeli-
hood of Pks (KK , KS ) efficiency BUS communication channel to which, through the
operational elements transmitter - receiver including no less than kkmin customers with
their total number of KK and ksmin servers with the total number of KS :
                                        K K       K
                                                  PS       i                   K −i
                                                              pi (1 − pku ) K
                                        P
            Pks (KK , KS ) = pf k                         CK K ku
                                                                                         ·
                                          min j=k min
                                       i=kk      s
                                                                                             (7)
                                   j               K −j
                                  CK  pj (1 − pku ) S
                                     S ku


    Let psp      sp
          k and ps be the likelihood of efficiency for clients and servers nodes, respec-
tively. Then, using the expression (7), reliability Rcan be written as:
                               K K
                                                      m           K −m
                                       m
                                           (psp        sp
                               P                           K
                       R=             CK K   k ) (1 − pk )                ·
                               min
                            m=kk
                            K
                            PS        n           n            K −n
                                     CK K
                                          (psp        sp
                                            s ) (1 − ps )
                                                          S
                                                                      ·                      (8)
                           n=ksmin
                Km
                P      l                      l                           K −l
                      CK m
                           (Pks (KK , KS )) (1 − Pks (KK , KS )) m               .
                l=1

    Let’s consider the reliability of client-server system with a complete blend
of computing and communication subsystem working in load sharing mode. Let
W (ks , kk , km , σ)to be the number of states efficient system consisting of servers ks ,
kk clients, km bus, in the presence of σ denials transceiver components. For the client-
server system, the number of failure conditions the H1 (σ) no less than a minimum
cross section can be determined using the expression:
186


               H1 (σ) = (kk + ks ) Ckσ−k  m
                                       m (kk +ks −1)
                                                     + kk ks Ckσ−k m
                                                                m (kk +k
                                                                              =
                                                         km
                                                                       s −1)
                                                              −1                                  (9)
                    = Ckσ−k                                      Ckαm ,
                                                            P
                            m
                         m (kk +ks −1)
                                         kk + ks + kk ks
                                                                 α=1

      a number of states proper functioning as:
                                                            Piσ        i
                 W (kk , ks , km , σ) = Ckσm (kk +ks ) +       i=1 (−1) Hi (σ) ,                 (10)
      Where: Ckσm (kk +ks ) - the total number of computing system states that may occur
at σ refusals. The probability Pks (ks , kk , km , σ) ensures consistency ks servers and kk
clients using km bus in case of refusal elements σtransmitter - receiver can be written
as:
                                                            (k +kk )km −σ               σ
           Pku (ks , kk , km , σ) = W (ks , kk , km , σ) pkus               (1 − pku ) .         (11)
    The probability of Pku (ks , kk , km ) servers to ensure coherence ks and kk clients
using km bus in the event of a refusal elements transmitter - receiver has been defined
as:
                                           P(ks +kk )km
                   Pku (ks , kk , km ) =      s=0         Pku (ks , kk , km , σ) ,               (12)
      and the probability Pku (ks , km ) of consistency ks servers and kk clients as:

                           PKm      k   km             Km −km
        Pku (ks , kk ) =     km =1 CKm pf k (1 − pf k )       Pku (ks , kk , km ) .              (13)

      Using the expression (11), (12), and (13) the sought reliability R will be written as:
                               PKk           k   sp k       sp Kk −k
                        R=         k=kkmin CK (pk ) (1 − pk )
                                               k
                                                                      ·
                       PKs           s     sp s      sp Ks −s
                                                                                                 (14)
                           s=ksmin CKs (ps ) (1 − ps )        Pku (s, k)
    The above-described methodology we will use for further connections to network
design measuring system.


3     Task Design and Its Solution
We will consider hypergraph H = (V, E) comprising a set V = {v} of vertices and a
set of E = {e} of edges, which represent a subset of the set V, i.e. e ⊆ V . Hypergraph H
is a k-regular if each of its edge e ∈ E consists of k vertices. On the other hand, hyper-
graph H is l-partite graph, if the set of its vertices is divided into l subsetsV1 , V2 , . . . , Vl ,
in such a manner that the vertices of each of the edges e = (v1 , v2 , . . . , vl ) ∈ E belong
to different parts of the graph, ie. vi ∈ Vi , where i = 1, . . . , l. For the determination of
l- partite hypergraphs we will use record form H = (V1 , V2 , . . . , Vl ).
     Let’s consider the l- partite hypergraph        H = (V1 , V2 , . . . , Vl ). In this graph, the
part a = V1A , . . . , ViA , . . . , VlA , EA , for i = 1, . . . , l and VlA ⊆ Vl , where any two
                                             

edges e1 , e2 ∈ EA overlap in one and the same vertex v ∈ V1A and do not overlap at
any vertex v ∈ VlA , will be called star. This means that the cardinality of V1A is 1, and
                                                                                             187

the vertex v ∈ V1A , will be called the center of the star. We distinguish the simple and
complex stars. If any pair of edges e1 , e2 ∈ EA covers only in one vertex v ∈ V1A ,
then the star is called simple. Otherwise, a star will be called complex. The number of
edges of the star will be called degree. For the edge e = (v1 , v2 , . . . , vl ) ∈ E of the star
verticesv1 and vl we will call end. In turn, the vertices v2 , . . . , vl−1 will be determined
as internal. Vertices set of the part of graph V2 , . . . , Vl−1 are composed of empty pairs
of disjoint sets Vi (vj ), vj ∈ Vj , where: i = 2, . . . , l − 1, j = i + 1.
     For hypergraph H = (V, E) its subhypergraph H1 = (W, U ) we’ll be called hy-
pergraph for which set of the vertices W is the vertices subset V of hypergraph H, ie.
W ⊆ V and the edge set U is the edges subset E of the hypergraph H, wherein if the
(x, y) ∈ E and x, y ∈ W , then (x, y) ∈ U . Hypergraph cohesion component will be
called the set of its vertices, such as any of two of its elements there is a path between
them, but there is no path leading from the vertex of belonging to this collection to any
other vertex outside. If there is in the subhypergraph H1 = (V1 , E1 ) of hypergraph H
consistency of each component there is a star with center at some vertex v ∈ V1 , the
H1 we will call the coverage of hypergraph H stars.
     Fixed design task was to find such a connection structure that will ensure the maxi-
mization or minimization of operating parameters, such as communication delay, errors
in access to the communication medium as a result of his occupation, performance com-
putational processing nodes and others. Such a task can be solved by seeking the cover
at least the stars of trigeminal graph. Let’s consider the task.
     Input data. As an input data we use the design task:

 1. B = {b} - A set of logical communication bus dedicated by physical channel under
    communication using any of the methods of reproduction communication;
 2. F = {f } - A set of access protocols, which describes the functioning logical BUS
    communication [12];
 3. N = {n} - A set of customers of the system, using BUS communication channels.

    Customers are divided into groups d ∈ D taking their communication requirements
into account, where D = {d} - is a set of types of communication requirements. El-
ements of the set D shall be as follows: d = 0 – service streams sensitive to errors
and latency; d = 1 – support for latency-sensitive flows; d = 2 – service streams are
sensitive to transmission errors; d = 3 – handling sensitive streams not being sensitive
to these factors.
    The definition of design task. Each of the compute nodes n ∈ N should be as-
signed a set of M bus b, which is a subset of B, ie. M ⊆ B, each of which will operate
on the basis of one of the access protocols f ∈ F .
    The mathematical model. The task design is iteratively solved. In each of the steps
and for each of the compute nodes there is sought one bus b ∈ B providing for node
communication services using Access Protocol f ∈ F . To avoid multiple of includes
a node on the same bus at each successive step from the available buses the client is
excluded from those for whom it has already been connected.
    The mathematical model is based on the 3-partite hypergraph H = (V, E) =
(X, Y, Z, E). Busses from the set B = {b} correspond to the vertices of the first part x
(x ∈ X). Each of them (at the same time each logical bus) is assigned to the label η (x)
188

determining the transmission characteristics of the bus, in the simplest case, the number
of nodes measured, that it can handle.
    The f elements of the set F bus access protocols correspond to the vertices y of
the second part of hypergraph H (y ∈ Y ), and the elements n from N compute nodes
correspond to the vertices zof the third part of hypergraph (z ∈ Z).The set of edges
E = {e} includes all three vertices (x, y, z)such that x ∈ X_, y ∈ Y , z ∈ Z. There
are permitted only those edges, for which a selected bus the client can handle bi ∈ X
using a communication protocol fl ∈ Y2 . Collection E = {e} of all edges is determined
essentially by a set of all admissible triples e = (x, y, z). Taking account of the value of
the parameter n (x) for x ∈ Xin hypergraph H = (V, E) = (X, Y, Z, E) permissible
step design task solution will be any of his sub hypergraph β = (Vβ , Eβ ) forVβ ⊆
V and Eβ ⊆ Eof which each component represents the simple consistency star of
stage with the center in the apex x ∈ X. As S = S (H) = {s}we denote the set of all
feasible solutions of tasks covering hypergraph H stars.
    Each of edges e ∈ E of hypergraph H = (V, E) there is assigned three scales
describing the following characteristics solutions:

 1. ω (e) = φ (x, y, z) - Expected customer conversion processing performance in a
    system in which the client is supported in communication with the bus X, which
    uses a communication protocol y. To evaluate the performance characteristics of
    the proposed process there were used in [13], [14], [15]. Described measures
    therein are modified so that they reflect the change in performance which are due
    to changes in the architecture of calls. Because of the nature of the bus network
    connecting the primary measure of performance is the number of nodes, for which
    there is available transportation network [16].
 2. ξ (e) = φ (x, y, z) - Expected change in the communication delay on demand of
    the customer for these conditions. The level of changes in the delay is determined
    on the basis of the stochastic model using the method described in [17].
 3. ψ (e) = φ (x, y, z) - Expected change in system reliability for client-server pre-
    served the conditions of point. 1. To determine the reliability of the method was
    applied changes presented in §2.

    Solution design task. The rating of the solutions will be shown as a multi-criteria.
The proposed set of criteria is obviously exemplary, and his selection depends on the
needs of the designer, in particular concerning the nature of the future operation of
the network merger. For the case of under consideration there are the three functions
described below.
    Let’s consider the set A = {a} of acceptable solutions of design task. For each of
them we define the following characteristics assessing the quality of solutions:

 1. Criterion performance computing: Φ1 (a) = max min ω (e), where: Ea – sub of
                                                     a∈A e∈Ea
    edge of hypergraph H belonging to solution a. Using this criterion, we strive to
    maximize the minimum level of performance (computing or communication) sys-
    tem.                                         P
 2. Communication delay criterion: Φ2 (a) = min e∈Ea ξ (e), which provides net-
    work search junction with minimal delay summary. For systems with varying levels
                                                                                        189

    of validity of nodes, the value ξ (e) of expected changes in the communication de-
    lay is called using vertex priority.
                                                 P
 3. The criterion of reliability: Φ3 (a) = max e∈Ea ψ (e). This criterion provides a
    network architecture search for which the total reliability is maximum likewise in
    the case of a delay criterion communication Φ2 .

     The possibilities of the above method is not limited to the application of the sum-
mation as a min or max. To assess the quality of solutions it can be used any of the
method of folding (convolution) parameters, including methods taking the weight of
each sub-parameters into account. These sub-criteria are related by a function to form
Φ (a) = (Φ1 (a) , Φ2 (a) , Φ3 (a)). Multi-criteria objective function Φ (a)defines a set A
feasible solutions, a set of Pareto Ap composed of Pareto optima ap . If two solutions
a1 , a2 ∈ A vector objective function Φ (a) are equivalent, then the set of Ap is secreted
full set of alternatives AA , which is, in fact, a maximum system vectorially different
optima Pareto.


4   The Research, Results and Further Work

The work approach used to create a multi-channel design methodology destined for
fieldbus communication service systems, client-server computing. The existing method-
ology is focused on providing definite level of computing capacity of the entire system,
regardless of its reliability parameters. In the presented in the work version, methodol-
ogy seeks the optimal solution with respect to multi-criteria objective function that one
of the criteria is reliability. Depending on how sub-criteria ties and a set of restrictions
proposed methodology also allows you to specify the connection architecture, charac-
terized by: a. maximum reliability with certain: the minimum efficiency and maximum
delay network communication links; b. the minimum communication delay with the
reduction in the minimum reliability and performance; c. maximum computing perfor-
mance of a specified acceptable level of reliability and delays. There was also tested
solution, the aim of which was to design load leveling various communication buses.
For each of the solutions sought there are limited the maximum cost of construction.
We analyzed the network of connections complete and partial, flat and hierarchical.
    Simulation studies of obtained architectures for computing model client-server
based on the methodology presented in [16] showed that the use of multi-channel com-
munication systems can flexibly adapt to the current needs of the computing system.
Increasing performance computing system with unchanging resources, obtained by re-
configuring its connections reached 260%. Deviations in terms of the burden of com-
munication channels does not exceed 31% and the probability of rejecting a service
request has fallen nearly 8-fold. The algorithm of the interconnection network is char-
acterized by polynomial time complexity, which can react in real time to any change in
traffic patterns.
    Further research will focus on finding effective methods of searching for coverage
of celebrities simple multipartite hypergraph, which will allow the use of graphs in the
design process of any of valor, and this will introduce further design criteria.
190

References
1. T. Yu, Y. Zhang, and K. J. Lin, "Efficient Algorithms for Web Services Selection with End-to-
   End QoS Constraints," ACM Transaction on The Web, vol. 1, no. 1, May 2007.
2. M. R. Garey and D. S. Johnson, Computers and Interactability. The Guide to the Theory of
   NP-Completeness. San Francisco: W.H. Freeman & Company, 1979.
3. H. Kellerer, U. Pferschy, and D. Pisinger, Knapsack Problems. Berlin: Springer, 2004.
4. C. Chekuri and O. Hudry, "Optimal clustering of multipartite graphs," Discr. Appl. Math., vol.
   156, no. 8, pp. 1330-1347, 2008.
5. M. Dawande, P. Keskinocak, and J. M. Swaminathan, "On bipartite and multipartite clique
   problems," Journal of Algorithms, vol. 41, no. 2, pp. 388-403, 2001.
6. A. Scarelli and S. C. Narulla, "A multicriteria assignment problem," Journal of Multi-Criteria
   Decision Analysis, vol. 11, no. 2, pp. 65-74, 2002.
7. E. Cela, The Quadratic Assignment Problem. Dordrecht: Kluwer Academic Publishers, 1998.
8. T. Oncan, "A survey on the generalized assignment problem," INFOR, vol. 45, no. 3, pp.
   123-141, 2007.
9. A. K. Jain, M. N. Murty, and P.J. Flynn, "Data clustering: a review.," ACM Comput. Surv., vol.
   31, no. 3, pp. 264-323, 1992.
10. N. Jardine and R. Sibson, Mathematical Taxonomy. London: John Wiley & Sons, 1971.
11. S. Mishin, "Optimal organizational hierarchies in firms.," J. of Business Economics and Man-
   agement, vol. 8, no. 2, pp. 79-99, 2007.
12. R. Dutta, A. E. Kamal, and A. E. Rouskas, Traffic Grooming for Optical Networks.: Springer,
   2009.
13. B. R. Haverkort, Performance of computer communication systems : a model-based ap-
   proach. Chichester: John Wiley & Sons Ltd, 1999.
14. M. Hajder, H. Loutskii, and W. Streciwilk, Informatyka. Wirtualna podroz w swiat systemow
   i sieci komuterowych. Rzeszow: Wydawnictwo Wyzszej Szkoşy Informatyki i Zarzadzania w
   Rzeszowie, 2002.
15. S. Sahni and V. Thanvantri, "Parallel Computing: Performance Metrics and Models," IEEE
   Parallel and Distributed Technology, vol. 4, pp. 43-56, Jan. 1996.
16. M. Hajder, "Improving the efficiency of multi-channel backbones client-server," Visnyk of
   National Technical University of Ukraine, vol. 37, pp. 37-48, 2002.
17. M. Hajder and M. Kielbus, "Matematyczny model opoznien w sieci z komutacja pakietow,"
   in XV Konferencja Sieci i Systemy Informatyczne, Lodz, 2007, pp. 47-50.