<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Self-Organising System for Resource Finding in Large-Scale Computational Grids</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name><surname>Messina</surname><given-names>Fabrizio</given-names></name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <name><surname>Pappalardo</surname><given-names>Giuseppe</given-names></name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <name><surname>Santoro</surname><given-names>Corrado</given-names></name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Catania, Dept. of Mathematics and Informatics</institution>
          ,
          <addr-line>Viale Andrea Doria 6 - 95125 - Catania</addr-line>
          ,
          <country country="IT">ITALY</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents a novel approach for resource finding and job allocation in a computational Grid. The proposed system is called HYGRA, for HYperspace-based Grid Resource Allocation. The basic principle involves the construction of a virtual hyperspace in which the available amount of each resource type is used as a geometric coordinate, making each Grid node representable as a point in this virtual hyperspace. A distributed overlay construction algorithm aims at connecting each node with its nearest k nodes w.r.t. the euclidean distance defined in the hyperspace. In this system, a job request, which can also be represented as a point, navigates the hyperspace, from node to node, following the overlay links which minimize the euclidean distance between the current node and the target point representing the job itself. The paper describes the algorithms for overlay construction and resource finding and assesses their validity and performance by means of a simulation approach.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        Computational Grids [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] are computer systems whose
infrastructure, in terms of software architecture and protocols,
must feature some very important characteristics, such as
fault-tolerance, scalability and efficiency. These systems are made
of thousands of CPUs, therefore their operations must
strongly take into account the size of the system. This
aspect is particularly important for operations like job
submission, which entails finding a Grid node featuring
a desired amount of certain resources, such as number of
CPUs, CPU time, RAM and disk space, network bandwidth,
software libraries, etc.; when the overall system is composed
of thousands of nodes, a fast and efficient node finding
algorithm is mandatory, since a brute-force approach involving
all nodes is obviously neither practical nor sound.
      </p>
      <p>
        A typical solution to the problem above involves the use
of a hierarchy of special repositories, holding a distributed
database which stores the information about each Grid node
and its kind and amount of available resources. This is
the technique employed by the Monitoring and Discovery
System [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], provided by the Globus Toolkit, the
de-facto standard software tool for building interoperable
computational Grids. But even if MDS is widely used, it
suffers from peculiar problems, well known to MDS designers,
which seriously affect its performance: the most important
one is a possible lack of consistency of the stored information,
due to the unavoidable latency between the instant
in which a node changes its state (i.e. some resources become
available or unavailable) and the instant in which the MDS
database is updated; during this time period, the MDS database
contains information which does not reflect the real node state,
thus affecting the validity of the resource finding phase; as
the system grows, in terms of number of Grid nodes,
the probability of the occurrence of such an inconsistency
increases.
      </p>
      <p>
        A possible approach for solving the problem above, which
has been studied in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], is to directly query the
nodes about their resource availability, instead of contacting
“someone else” which may not have fresh information. Since
querying all nodes is not a viable solution, the cited
approaches consider employing peer-to-peer techniques,
which aim at organising the nodes in a proper overlay network
in order to make the query and match process easier.
      </p>
      <p>
        With these concepts in mind, this paper proposes a P2P
approach, called HYGRA (for HYperspace Grid Resource
Allocation), strongly based on concepts proper of spatial
computing [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The proposed solution, which is completely
decentralised (no central or aggregated repositories, nor
super-peers), models the entire Grid as an n-dimensional hyperspace
in which resource types, with their availability, represent
coordinates: each node thus virtually occupies a specific point
in the hyperspace. Following this abstraction, nodes with
similar resource availability occupy points of the
hyperspace that are near to each other; therefore, finding a
node able to offer a certain amount of resources for a job
amounts to locating a specific region of the hyperspace and
discovering (one of, or the best of) the nodes occupying that
region.
      </p>
      <p>To make this possible, HYGRA organises the Grid nodes
in an overlay network, where each node is virtually connected
to some other nodes and interactions among nodes can occur
only along such links; overlay construction is performed
by means of a decentralised algorithm in which each node/peer
aims at keeping itself linked with at most k of the “nearest”
nodes in the hyperspace.</p>
      <p>The overlay network built this way is exploited by the resource
finding algorithm, which is thus reduced to a “geometric
problem”: since the coordinates of the region holding candidate
nodes are known (they can be obtained from the job’s
requirements), the query can start from any node and follow
the links leading to the nodes with minimum distance from the
target region. Also in this case, the approach is completely
decentralised: each peer receives the query, checks it and,
if necessary, forwards it to a linked node until the target is
found.</p>
      <p>The paper is structured as follows. Section II introduces the
basic concepts and symbols used later on in the description
of HYGRA. Section III describes the system architecture.
Section IV details the working scheme of HYGRA, reporting
and explaining the overlay construction and resource finding
algorithms. Section V discusses the results of a simulation study
about the performance of HYGRA. Section VI concludes the
paper.</p>
    </sec>
    <sec id="sec-2">
      <title>II. BASIC HYGRA MODEL</title>
      <p>To better explain the technique proposed in this paper,
we first introduce a mathematical formulation, with a set
of symbols and relations that are used to model the Grid
environment and the hyperspace and that help the understanding
of the HYGRA algorithms.</p>
      <sec id="sec-2-1">
        <title>A. Model of Resource and Job Request</title>
        <p>We consider a Grid composed of n nodes; for each node i,
let si denote its state, meant to describe the resource availability
on i at a certain time instant; assuming that, in the overall
Grid, all nodes offer m different types of resources (say,
CPU, memory, disk space, network, OS type, etc.), the state
si of a node will be a vector (ri,1, ri,2, . . . , ri,m); here each
component ri,j represents the amount of resource j available
at node i.</p>
        <p>Let us define resource domains D1, D2, . . . , Dm,
meaning that ri,j ∈Dj for i=1 . . . n, j=1 . . . m. If a resource is
measured by a quantity, such as RAM, disk space, network
bandwidth or CPU time, Dj will be a numeric interval, ranging
from a minimum to the maximum admissible resource amount
over all nodes. If the resource is of a different kind, such as
the presence of a certain library or the availability of a certain
library version, Dj will be a set of values each describing a
potential resource instance.</p>
        <p>In this system model, a job submission request, which
carries the job’s requirements, is represented as the tuple q =
((f1, q1), (f2, q2), . . . , (fm, qm)), where qj is the amount of
the resource of type j requested by the job, and fj is a
predicate used to match the requirement with the availability:
we say that node n∗ can host a job carrying request q iff
fj (rn∗,j , qj ) = true, ∀j ∈ 1, . . . , m. We assume, without loss
of generality, that predicate fj can be either the = or the ≥
relation: the former means that the request and availability
must exactly match, while, in the latter case, the predicate
succeeds if the resource availability is greater than (or equal
to) the resource requested. (These two predicates cover most
cases of job allocation requests in a Grid.)</p>
        <p>Let us introduce an example to better explain the practical
usage of the symbols provided. We consider a Grid whose
nodes possess the following resources: amount of RAM,
number of CPU cores and glibc library version; let us
suppose that there cannot be more than 8 GBytes RAM and
8 cores per node, and that some nodes offer glibc version
2.10.2 while others offer version 2.9.1. For this Grid, resource
domains will be D1 = 0 . . . 8192 (assuming RAM is specified
in Megabytes), D2 = 0 . . . 8 and D3 = {2.9.1, 2.10.2}.
In this system, a request for a job that needs 3 cores, at
least 25 MBytes of RAM and glibc version 2.10.2 can be
represented as {(≥, 25), (=, 3), (=, 2.10.2)}.</p>
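        <p>The matching rule above can be sketched as follows; this is a minimal illustration, and the names can_host, state and request are ours, not part of HYGRA:</p>
        <preformat>
```python
# Sketch of the matching predicates f_j (illustrative names,
# not from the HYGRA implementation).
def can_host(state, request):
    """True iff f_j(r_j, q_j) = true for every resource type j."""
    for r_j, (f_j, q_j) in zip(state, request):
        if f_j == "eq" and r_j != q_j:   # exact match required
            return False
        if f_j == "ge" and q_j > r_j:    # availability below the request
            return False
    return True

# Node with 4096 MB RAM, 3 cores, glibc 2.10.2 vs. the example request
print(can_host((4096, 3, "2.10.2"),
               [("ge", 25), ("eq", 3), ("eq", "2.10.2")]))  # True
```
        </preformat>
        <p>Note that, with the = predicate on the number of cores, a node offering 4 cores would not match a request for exactly 3.</p>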
      </sec>
      <sec id="sec-2-2">
        <title>B. Model of the Hyperspace</title>
        <p>
          The hyperspace model used by HYGRA starts from the
formulation introduced above, provided that a transformation
is applied to domains not featuring numeric values: for each
domain Dj which is a finite set of known values, we associate
a (natural) number to each of such values; for instance, domain
D3 = {2.9.1, 2.10.2} of the example above can be remapped
to D3 = {1, 2} ⊂ N.
        </p>
        <p>After this transformation, each domain is indeed a subset
of N or R, therefore we can construct a metric space (S, d),
where S = D1 × D2 × . . . × Dm, elements are vectors v ∈ S,
and the metric d is the euclidean distance. Each Grid node i
features a state si which (provided again the proper domain
transformation) becomes a vector of the metric space si ∈ S:
in other words, we can say that a node i of the Grid, according
to its state, represents a point in the hyperspace S.</p>
        <p>Following the same abstraction, in a job request q =
((f1, q1), (f2, q2), . . . , (fm, qm)), vector q = (q1, q2, . . . , qm)
is also an element of S and, due to the presence of
the predicates, q determines a region of S (a semispace),
S(q) ⊂ S, made of all the elements of S that satisfy predicates
fj :</p>
        <p>S(q) = {v ∈ S : fj (vj , qj ) = true ∀j ∈ 1, . . . , m}
We call S(q) the admissible region for job q since any Grid
node i such that si ∈ S(q) is able to host the job. The aim of
HYGRA is to provide a decentralised approach to allow the
discovery of the nodes belonging to the admissible region of
a given job that needs to be allocated and executed.</p>
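        <p>Under the stated assumptions (every domain remapped to numbers), the metric d and the admissible-region test can be sketched as follows; the function names are ours, for illustration:</p>
        <preformat>
```python
import math

# Sketch of the metric space (S, d) and the admissible region S(q).
def d(v, w):
    """Euclidean distance between two points of S."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, w)))

def in_admissible_region(s, q):
    """True iff state s satisfies every predicate of request q."""
    return all((r == v) if f == "eq" else (r >= v)
               for r, (f, v) in zip(s, q))

# glibc versions remapped to naturals: 2.9.1 -> 1, 2.10.2 -> 2
s_i = (4096, 3, 2)
q = [("ge", 25), ("eq", 3), ("eq", 2)]
print(in_admissible_region(s_i, q))  # True
print(d((0, 0), (3, 4)))             # 5.0
```
        </preformat>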
      </sec>
    </sec>
    <sec id="sec-3">
      <title>III. HYGRA ARCHITECTURE</title>
      <p>From the software architecture point of view, HYGRA is
realised as a multi-agent system. Since the overall
structure is based on an overlay network, each Grid node runs
a NODEAGENT holding three kinds of information: (i) the
state si of the node; (ii) the set of references to the
NODEAGENTs of the directly linked nodes of the
overlay (such a reference may be an IP address, an IP/port
pair, a FIPA agent name, etc.; this is a matter of the
implementation and its choice does not affect the architecture
or the validity of the approach); and (iii) the state sj of each
node j belonging to the directly linked NODEAGENTs.</p>
      <p>Each NODEAGENT can communicate with other
NODEAGENTs by exchanging the following types of
messages:</p>
      <p>1) Job Allocation Request q. This message is sent from a
NODEAGENT which is not able to allocate the job to
a linked NODEAGENT selected according to a proper
forwarding policy detailed in Section IV.</p>
      <sec id="sec-3-1">
        <title>2) State Change Notification</title>
        <p>It is sent from the NODEAGENT of a node i to all the linked
NODEAGENTs when the node’s state si changes to s′i
(due to a new job arrival or to the termination of a
running job); the message carries information about the new
state s′i.</p>
        <p>3) 2-hop Status Query. This is a query message sent
from the NODEAGENT of a node i to all its linked
NODEAGENTs; it aims at obtaining the state sk of each
node linked to the nodes directly linked with i. A
proper 2-hop Status Reply message is expected following
the transmission of this query.</p>
      </sec>
      <sec id="sec-3-2">
        <title>4) Link Creation Notification</title>
        <p>It is sent from a NODEAGENT i to a NODEAGENT j to inform
the latter that i wants to be connected with j. Upon
reception of this message, NODEAGENT j updates its
set of directly linked agents by including i.</p>
        <p>5) Link Cut Notification. It is sent from a NODEAGENT i
to a (connected) NODEAGENT j to inform the latter that
the link i/j has to be destroyed. Upon reception of
this message, NODEAGENT j updates its set of directly
linked agents by removing i.</p>
        <p>These messages are exchanged during the two main
activities of the NODEAGENTs: (i) overlay construction, which
aims at (self-)organising the overlay network in order to
ensure certain neighbourhood properties; and (ii) job allocation,
in which a job request has to be fulfilled by checking the
availability of resources on a node and, if this is not the case,
by properly forwarding the request to a linked NODEAGENT.
The details and algorithms of these activities are described in
the following Section.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>IV. HYGRA WORKING SCHEME</title>
      <p>As reported above, the working scheme of HYGRA is based
on organising the Grid nodes in an overlay network featuring,
in the metric space (S, d), a neighbourhood property based on
the euclidean distance. In particular, for each node i, the
set L(i) of the nodes directly connected with i contains the
nodes which feature the minimal euclidean distance
d(si, sk), ∀k ∈ L(i). In other words, for each node i the
following property holds:
∀k ∈ L(i), ∄ h ≠ k, h ∉ L(i) : d(si, sh) &lt; d(si, sk) (1)</p>
      <p>If the property above holds, the resulting overlay network
is a topological graph that can be traversed by means of e.g.
a minimal path algorithm in order to find nodes belonging to
the admissible region.</p>
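      <p>Property (1) can be verified, for illustration only, with a centralised check; in HYGRA no node has this global view, and the names below (states, links) are hypothetical:</p>
      <preformat>
```python
import math

# Centralised check of property (1), for illustration only.
def property_one_holds(i, states, links):
    """True iff no node h outside L(i) is strictly closer to i
    than the farthest current neighbour k of L(i)."""
    dist = lambda a, b: math.dist(states[a], states[b])
    farthest = max(dist(i, k) for k in links[i])
    return all(not (farthest > dist(i, h))   # no closer outsider exists
               for h in states
               if h != i and h not in links[i])

states = {0: (0, 0), 1: (1, 0), 2: (5, 0)}
print(property_one_holds(0, states, {0: {1}}))  # True: node 2 is farther
print(property_one_holds(0, states, {0: {2}}))  # False: node 1 is closer
```
      </preformat>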
      <sec id="sec-4-1">
        <title>A. Construction of the Overlay Network</title>
        <p>The overlay construction technique is based on an
algorithm run by the NODEAGENTs which is regulated
by two parameters, degmin and degmax, respectively
the minimum and maximum degree of each node;
therefore degmin, degmax ∈ N, degmin &lt; degmax and
degmin ≤ |L(i)| ≤ degmax, ∀i. The basic algorithm run by
the NODEAGENT of a generic node i can be summarised in
the following steps:
Algorithm 1 Overlay Construction
1) by sending 2-hop Status Query messages to each node of L(i),
build the set L′(i) = (L(i) ∪ (∪∀j∈L(i) L(j))) − {i}, that
is, the set of directly linked and 2-hop linked nodes of i;
2) since the NODEAGENT now knows all the states of the
nodes in L′(i) (i.e. their coordinates in the hyperspace),
order the nodes in L′(i) according to their distance to i,
i.e. d(si, sk), ∀k ∈ L′(i), in ascending order;
3) build the set L′′(i) by taking at most the first degmax
nodes of L′(i); these will be the nodes of L′(i) which
are the nearest to i;
4) if L(i) = L′′(i), i is still connected to the nearest
possible nodes, thus property (1) holds and the algorithm
stops here;
5) otherwise, connect node i with all the nodes in L′′(i); to this aim,
disconnect from i the nodes in L(i) − L′′(i), by sending
them a Link Cut Notification message, and connect i with
the nodes in L′′(i) − L(i), by sending them a Link Creation
Notification message; then update L(i) = L′′(i);
6) restart from step 1.</p>
        <p>The basic principle of the overlay construction algorithm
should be quite clear: given any configuration of the overlay
network, at each step of the algorithm each node tends to be
connected to nodes that are nearer; this should be enough to
ensure that, sooner or later, property (1) will be met. In such
a case, the condition in step 4 of Algorithm 1 holds, meaning that
the node has reached stability; no more runs of the algorithm
are needed unless the state si of the node changes due to the
arrival of a new job or the termination of a running job: in
this case, since the node has changed its coordinates in the
hyperspace, property (1) may no longer hold and the right
links need to be re-created.</p>
        <p>Bootstrapping the overlay network is also a simple
operation: a new node x which wants to join must know only one
existing node of the network, k: it has to link itself with k
and with the nodes in L(k), thus L(x) = {k} ∪ L(k), and immediately
run Algorithm 1; at the first run, property (1) may not hold,
but as soon as some steps of the construction algorithm are
executed, a stable condition can be reached.</p>
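        <p>One cycle of Algorithm 1, seen from a single node, can be sketched as follows; this is a centralised simplification in which shared dictionaries stand in for the 2-hop Status and Link Creation/Cut messages:</p>
        <preformat>
```python
import math

def overlay_step(i, states, links, deg_max):
    """One cycle of the overlay construction for node i (sketch).
    links maps each node to the set of its current neighbours."""
    # 1) 2-hop neighbourhood L'(i): direct links plus links of links
    two_hop = set(links[i])
    for j in links[i]:
        two_hop.update(links[j])
    two_hop.discard(i)
    # 2-3) keep the (at most) deg_max nearest nodes: L''(i)
    ranked = sorted(two_hop, key=lambda k: math.dist(states[i], states[k]))
    new_links = set(ranked[:deg_max])
    # 4) stable: nothing changes, property (1) reached for i
    if new_links == links[i]:
        return False
    # 5) cut removed links and create new ones, symmetrically
    for k in links[i] - new_links:
        links[k].discard(i)        # Link Cut Notification
    for k in new_links - links[i]:
        links[k].add(i)            # Link Creation Notification
    links[i] = new_links
    return True                    # 6) keep iterating
```
        </preformat>
        <p>Repeatedly calling overlay_step on every node until it returns False everywhere mimics the convergence discussed above (the degmin bound and the probabilistic optimisation are omitted from this sketch).</p>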
        <p>In order to understand the behaviour of the overlay
construction technique, we built a software simulator, which is
described in Section V. Figure 1 shows some screenshots
taken during the construction of the overlay following the
algorithm described, using a Grid featuring two types
of resources (in order to allow the representation of the
network in two dimensions): (a) initial condition (random links);
(b) after one step; (c) stable condition (after five steps).
The degree coefficients degmin and degmax (according to the
literature on graphs, the degree of a node, or vertex, is the
number of its links, i.e. of its edges or connected nodes) are
set to 6 and 15, respectively. The initial condition of the
network, in which all links are randomly set, is shown in
Figure 1a, while Figures 1b and 1c show the network condition
after 1 and 5 iterations of Algorithm 1, respectively: the ability
of the system to self-organise and to meet property (1) is quite
evident.</p>
        <p>Step 4 of Algorithm 1 reports a condition which, if met,
ensures that the network, as far as the single node is concerned,
has reached stability. The question is: can we ensure that
sooner or later this condition will hold? In other words,
does the algorithm terminate? While, by looking at the individual
behaviour of a single node, at first sight the answer could be
positive, the situation is much more complex when the mutual
influence between linked nodes is considered. Indeed, since the
same link connects two nodes, say i1 and i2, operations made
by the algorithm in i1 affect the behaviour of i2 and vice
versa. If we take into account such a mutual influence, on the
basis of the topological configuration of the nodes, during step
5 of Algorithm 1 the following situations may happen:
1) i2 ∈ L′′(i1) ∧ deg(i2) = degmax; according to the first
condition, the algorithm in i1 should connect i1 to i2;
however, since the degree of i2 is already degmax, a new
connection would exceed the maximum degree limit;
2) i2 ∈ L(i1) − L′′(i1) ∧ deg(i2) = degmin; the link
between i1 and i2 should be removed, but this operation
would cause the degree of i2 to go below degmin;
3) i2 ∈ L(i1) − L′′(i1) ∧ i1 ∈ L′′(i2); in this case,
the same link is considered completely differently by each
node, since it needs to be removed for i1 but added for i2;
the result is a sort of “local oscillation” of the algorithm.
A simple check on the degree of each node in L′′(i) (resp.
L(i) − L′′(i)) is able to easily solve situations 1 and 2: if the
resulting degree would not lie between degmin and degmax, the node
is not connected (resp. not disconnected).</p>
        <p>The third situation is harder to tackle: indeed both nodes
act in accordance with the algorithm and there is no way to
choose whether the link must be removed or preserved. It should be
noted that such a local oscillation does not provoke a loss of
consistency of the overlay network: indeed property (1) is not
violated; the problem is only a lack of efficiency, due to a
waste of computational power, since Algorithm 1 runs
continuously without converging to a stable condition.</p>
        <p>The solution we employed to avoid this problem is based
on running a cycle of Algorithm 1 according to a certain
probability; exploiting a model similar to that of simulated
annealing, we associate to each node a virtual temperature ti,
which is initially set to a maximum value Tmax and decreases
at each cycle (till reaching 0), unless the node changes its
state: in this case, ti is reset to Tmax. Therefore, a
cycle of Algorithm 1 runs with probability Pi = ti/Tmax, thus
ensuring that, even if the algorithm starts locally oscillating,
sooner or later (if the node does not change its state) the
oscillation terminates and stability is reached. According to
this optimisation, the final behaviour of each NODEAGENT is
described by Algorithm 2 below:
Algorithm 2 Overlay Construction with Optimisation
1) set ti = Tmax;
2) compute Pi = ti/Tmax;
3) run one cycle of Algorithm 1 with probability Pi;
4) if node i has changed its state si (due to job arrival or
job termination), set ti = Tmax;
5) otherwise set ti = ti − 1, unless ti is already 0;
6) go to step 2.</p>
      </sec>
      <sec id="sec-4-2">
        <title>B. Resource Finding and Job Allocation Algorithm</title>
        <p>The overlay construction algorithm described so far aims at
organising the network in order to ease the resource finding
phase. Indeed, the resource finding algorithm is quite simple
and is based on a check-and-forward policy: roughly speaking,
once a node receives a job submission request, it checks whether it
can fulfill it (i.e. whether the node belongs to the admissible region);
otherwise, the node forwards the request to the linked node
which is the nearest, according to its state, to the admissible
region. The actual algorithm is based on this principle and
applies some peculiar strategies for the choice of the next node
(when more than one candidate exists) and for the recovery
when a path leading to a “dead end” (i.e. a wrong path which
cannot lead to the admissible region) is followed.</p>
        <p>A request, carried by a Job Allocation Request message,
is represented by the tuple (q, P), where P is the ordered
sequence of the nodes visited so far. The request can be submitted to
any NODEAGENT of the Grid, with P initially empty;
when a NODEAGENT receives such a message, the following
algorithm is executed:
Algorithm 3 Resource Finding
1) on the arrival of a job request (q, P), check if the node
can host the job, that is, if fj (ri,j , qj ) = true, ∀j ∈
1, . . . , m;
2) if the condition is met, allocate the job on node i and
terminate the algorithm with success;
3) if the previous condition is not met, build the set L(i) − P
and check whether the set N = {k ∈ L(i) − P : sk ∈ S(q)} is not empty,
i.e. determine the set of nodes in L(i) − P which belong
to the admissible region;
4) if such a set is empty, select a node i′ in L(i) − P which
minimises the distance d(si′ , q);
5) if set N is not empty, select a node i′ in N on the basis
of a heuristic H(N) which is detailed below;
6) if one of the previous two steps is successful, and thus
node i′ exists, we are approaching the admissible region;
therefore update P′ = P ⊕ {i′} by concatenating i′ to
sequence P, and forward the request (q, P′) to node i′;
7) if L(i) − P = ∅, there is no node that can allow the
request to approach the admissible region, since all the linked
nodes have already been visited and the algorithm would
end in an infinite circular loop; in this case there are two
possible causes: (i) the admissible region contains no
nodes, or (ii) the path followed led to a “local minimum”.
Indeed, with its current knowledge, the NODEAGENT
has no way to understand the real cause and it can only
suppose that the path is wrong: maybe a different
choice could help in finding the target; therefore we
find, in the sequence P, the position of i and select the
previous node i′. If such a node exists, the NODEAGENT
forwards the request to i′; otherwise (that is, i is the first
node in P) the algorithm terminates with a “node not
found” message, meaning that the job request cannot be
fulfilled.</p>
        <p>If step 2 succeeds, the job is allocated on node i and
the amount of resources needed by the job is granted
to it; in this case, the state of node i changes from si to s′i,
the node occupies another point in the hyperspace, it becomes
“hot” and the overlay construction algorithm restarts in order to let
the network re-organise itself. In a similar way, when a job
terminates its execution on a node, it releases all the resources
it needed: also in this case the state of the node changes and a
restart of the overlay construction algorithm is triggered.</p>
        <p>Step 5 of Algorithm 3 entails selecting the next node
according to a heuristic; it is employed when, in proximity
of the admissible region, the set L(i) − P contains more than
one node belonging to S(q), so the question is which of
such nodes has to be selected. To this aim, we have proposed
and tested two different strategies:
1) MaxFree, which selects the node that has the highest amount
of free resources; in other words, the farthest node, with
respect to q, is chosen;
2) BestFit, which selects the node that best fits the allocation,
leaving the amount of free resources nearest to zero; in other
words, the nearest node, with respect to q, is chosen.</p>
        <p>A final remark is needed to explain step 7. As detailed
in Algorithm 3, since there is no global knowledge
or view of the network, there is no way, for a NODEAGENT,
to understand whether the admissible region is empty; the only fact
which can be deduced is that the NODEAGENT is no longer
able to proceed further. However, as depicted in Figure 2
(which shows a request submission, a local minimum and the
admissible region for the request), experimental results proved
that such a condition occurs also in some extreme cases in
which nodes are placed in points such that a path, for a certain
job request, seems to lead to the admissible region but indeed
reaches a “dead end” or, in other words, what we call a local
minimum. To solve such conditions, a second chance is given
to the algorithm by backtracking to a previous node of the path
followed: a different branch of the graph is selected, thus
increasing the probability to exit from the local minimum and
reach the admissible region. Obviously, if the admissible region
is really empty, all the alternative branches selected would end
in nodes already visited and, sooner or later, the first node of
the path will be reached again: after this, no choices will be
left and therefore the algorithm will end with a failure
indication (with high probability).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>V. PERFORMANCE EVALUATION</title>
      <sec id="sec-5-1">
        <title>A. The Simulator</title>
        <p>In order to study the performance of HYGRA, we
implemented a software simulator. It is a time-driven simulation
tool which is able to represent the behaviour of Grid nodes,
modelling both the overlay construction and the resource
finding algorithms, and also simulating job allocation and
execution.</p>
        <p>The tool simulates the flow of time by means of
discrete “ticks”. At each tick: a) a step of the overlay
construction algorithm is executed on all nodes; b) a step of
the resource finding algorithm is executed for all the requests
circulating in the network; c) a bunch of new job requests is
generated, on the basis of a frequency parameter specified in
the configuration file (see below); d) a check on the execution
of the allocated jobs is performed, deallocating terminated jobs
from the nodes.</p>
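        <p>The check-and-forward decision modelled by the simulator can be sketched as follows; this is a simplification of steps 3-7 of Algorithm 3 in which the heuristic H(N) is reduced to BestFit and the hole-in-history case is ignored (function names are ours):</p>
        <preformat>
```python
import math

def satisfies(s, q):
    """True iff state s satisfies every predicate of request q."""
    return all((r == v) if f == "eq" else (r >= v)
               for r, (f, v) in zip(s, q))

def next_hop(i, q, q_point, states, links, path):
    """Return ('forward', k), ('back', k) or ('fail', None) for a
    request (q, P) that node i cannot host itself."""
    candidates = links[i] - set(path)          # L(i) - P
    if not candidates:                         # step 7: dead end
        pos = path.index(i) if i in path else 0
        if pos > 0:
            return ("back", path[pos - 1])     # backtrack one node
        return ("fail", None)                  # "node not found"
    region = [k for k in candidates if satisfies(states[k], q)]
    pool = region if region else candidates    # steps 3-5
    best = min(pool, key=lambda k: math.dist(states[k], q_point))
    return ("forward", best)                   # step 6

states = {1: (10.0,), 2: (5.0,), 3: (9.0,)}
print(next_hop(1, [("ge", 9)], (9.0,), states, {1: {2, 3}}, [1]))
# ('forward', 3)
```
        </preformat>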
        <p>The simulator takes various parameters as inputs and
produces some indexes, briefly explained below, in order to
evaluate the goodness of the proposed technique. Table I
summarises a subset of the input parameters (the most
important ones) that affect the average load of the system and
the behaviour of the algorithm.</p>
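        <p>As a purely hypothetical illustration, the input parameters (a)-(e) discussed below could be encoded in a configuration of this shape; the field names are ours, not the simulator's actual ones, and Table I remains the authoritative definition:</p>
        <preformat>
```python
# Hypothetical encoding of the simulator inputs (illustrative only).
config = {
    "request_rate_distribution": ("poisson", {"mean": 50}),            # (a)
    "resource_distribution": ("uniform", {"low": 100, "high": 200}),   # (b)
    "requested_amounts": ("poisson", {"mean": 50}),                    # (c)
    "on_local_minimum": "backtracking",                                # (d) or "stop"
    "visited_check": "by_coordinates",                                 # (e) or "by_id"
}
print(config["on_local_minimum"])  # backtracking
```
        </preformat>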
        <p>Parameter (a) specifies the probability distribution from which
the number of requests per tick is sampled, while
parameter (b) specifies the distribution of the amount of each resource
over the overlay network (here X has to be replaced by the id
of the resource). Parameter (c) is very similar to the
previous one, but specifies the distribution of the specific resource
amounts requested when a job submission request is generated. A
careful tuning of parameters (a)-(c) induces different load conditions
in the Grid and thus permits the evaluation of the approach
under different system configurations. Parameter (d)
is strictly bound to the behaviour of the resource finding
algorithm: if set to “Backtracking”, the simulator
adopts the backtracking strategy whenever a local minimum is
found; on the other hand, by specifying “Stop”, the simulator
is forced to conclude the journey of the request whenever no
choices are available (the “Stop” variant is useful to assess the
improvement provided by the introduction of the backtracking
strategy).</p>
        <p>It is worth observing that the backtracking strategy is
generally affected by the dynamics of the overlay network, i.e.,
by the unexpected changing of coordinates and the subsequent
reorganisation of links. Indeed, once the request has reached
a local minimum, one or more nodes previously saved in
its history (the traversed path) could have changed their own
coordinates due to a resource allocation or release. In this
case, our choice was to end the journey of the request and
signal a failure, exactly as if all the nodes in the path
had been visited back and no alternative paths had been found.
We call this case the hole-in-history exception. However, we
verified that it occurs with very low probability, thus it is
probably not worth measuring and discussing it further.</p>
        <p>Parameter (e) concerns discerning the set of nodes
already visited from those not yet traversed. When
“ByCoordinates” is specified, the comparison of the nodes
is based on their coordinates. In other words, the request
stores not only the list of the IDs of the visited nodes, but
also the set of vectors corresponding to their coordinates. If
one or more nodes are found in the set of neighbours such
that their coordinates do not match those stored in the
history, such nodes are eligible for the selection even if they
have already been visited. It is easy to understand that, when
“ByCoordinates” is set, there is no proof that the algorithm
will terminate, i.e., that the journey of the request will end.</p>
        <p>However, the termination of the algorithm can trivially be
guaranteed by imposing a maximum on the number of nodes
visited by each request. The motivation behind the “ByCoordinates”
variant is, again, the dynamics of the overlay network. In fact,
basing the comparison on the IDs of the nodes might be too
restrictive, because nodes that have changed their coordinates
will have reorganised their links; accordingly, they could take
part in another path which might be useful to find the way to
the admissible region.</p>
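        <p>A minimal sketch of this eligibility test, assuming the request's history is a mapping from node IDs to coordinate snapshots; the names eligible, record_visit and history are illustrative, not taken from HYGRA's implementation.</p>

```python
def eligible(node_id, current_coords, history):
    """ByCoordinates test: a neighbour may be selected if it was never
    visited, or if its coordinates no longer match the snapshot stored
    in the request's history (e.g. after a resource allocation or release)."""
    snapshot = history.get(node_id)
    if snapshot is None:
        return True                              # never traversed
    return tuple(current_coords) != tuple(snapshot)

def record_visit(node_id, coords_now, history):
    """Store both the ID and the coordinate vector of a visited node."""
    history[node_id] = tuple(coords_now)
```

        <p>With the “ById” variant, by contrast, only the set of IDs would be consulted, so a node whose coordinates changed after the visit would remain ineligible.</p>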
      </sec>
      <sec id="sec-5-2">
        <title>B. Experiments and Results</title>
        <p>We ran a series of experiments on a test-bed of 100 nodes
and two types of resources, whose values range in the interval
[100, 200] under a uniform random distribution (see parameter
(b) of Table I). Moreover, degm and degM for the overlay
construction are set to 8 and 15, respectively, and job requests
are generated using a Poisson distribution with an average value
ranging from 20 jobs/tick to 150 jobs/tick (parameter (a) of
Table I). The amounts of resources requested by each job
(parameter (c) of Table I) are sampled from a Poisson
distribution with an average of 50. Job execution duration is
also randomly generated with a Poisson distribution, with an
average value of 40 time ticks.</p>
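        <p>A workload with these characteristics can be generated with a standard Poisson sampler (Knuth's algorithm, used here because Python's random module does not provide one); the workload_tick function and its parameter names are illustrative, not part of the simulator described above.</p>

```python
import math
import random

def poisson(mean):
    """Sample from a Poisson distribution (Knuth's algorithm)."""
    threshold = math.exp(-mean)
    k, p = 0, 1.0
    while p > threshold:
        k += 1
        p *= random.random()
    return k - 1

def workload_tick(rate=50):
    """One simulator tick: number of job requests generated at this tick,
    and for each job the requested amounts of the two resource types and
    its execution duration (averages as in the experimental set-up)."""
    n_jobs = poisson(rate)
    return [{"request": (poisson(50), poisson(50)),   # parameter (c), avg 50
             "duration": poisson(40)}                 # avg 40 time ticks
            for _ in range(n_jobs)]
```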
        <p>The results of the simulations are reported in Tables II
and III. NAlloc and NFails represent, respectively, the number
of requests successfully allocated and the number of failures of
the algorithm, i.e. the cases in which a candidate node exists
but cannot be found. NRej, on the other hand, is the number of
requests that cannot be allocated since no node is able to
support them7. For shortness, we report only the ratio
NFails/NAlloc, which we call FailRatio. The simulator also
produces two other interesting indexes, NSteps and Optimality
(Opt. in Tables II and III). NSteps represents the number of
hops (nodes) performed, on average, by the allocation algorithm
before a suitable node is found. The Optimality index lets us
assess the “goodness” of the algorithm and is evaluated as
follows. Each time the algorithm performs an allocation for a
request q, with nalloc the node that has satisfied the request,
the Optimality is computed as 1 − d(nalloc, nbestSelection) /
max(d(ni, nj)), where nbestSelection is the best candidate node
computed by (i) totally ordering all nodes according to the
selected resource finding strategy (BestFit or MaxFree) and
(ii) selecting the best node from that ordering.</p>
        <p>7Clearly, NAlloc + NRej = NReq.</p>
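        <p>Under the formula above, the Optimality of a single allocation can be computed as in the following sketch; the coords mapping and the function names are illustrative, and the denominator is taken as the maximum pairwise distance among the nodes.</p>

```python
import math

def dist(a, b):
    """Euclidean distance in the hyperspace."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def optimality(coords, n_alloc, n_best):
    """Opt = 1 - d(n_alloc, n_best) / max over all node pairs of d(ni, nj).
    Equals 1 when the allocated node coincides with the best candidate."""
    nodes = list(coords)
    diameter = max(dist(coords[a], coords[b]) for a in nodes for b in nodes)
    return 1.0 - dist(coords[n_alloc], coords[n_best]) / diameter
```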
        <table-wrap id="table-2-3-data">
          <table>
            <thead>
              <tr><th>Jobs/Tick (avg)</th><th>NFails (Backtracking, Table II)</th><th>FailRatio (Table II)</th><th>NFails (Stop, Table III)</th></tr>
            </thead>
            <tbody>
              <tr><td>20</td><td>0</td><td>0.0</td><td>0</td></tr>
              <tr><td>30</td><td>7</td><td>0.0014</td><td>12</td></tr>
              <tr><td>40</td><td>8</td><td>0.0016</td><td>32</td></tr>
              <tr><td>50</td><td>6</td><td>0.0012</td><td>30</td></tr>
              <tr><td>60</td><td>8</td><td>0.0017</td><td>43</td></tr>
              <tr><td>70</td><td>8</td><td>0.0016</td><td>54</td></tr>
              <tr><td>80</td><td>19</td><td>0.0040</td><td>72</td></tr>
              <tr><td>100</td><td>24</td><td>0.0048</td><td>112</td></tr>
              <tr><td>120</td><td>32</td><td>0.0064</td><td>124</td></tr>
              <tr><td>150</td><td>34</td><td>0.0068</td><td>145</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Tables II and III show the results of the simulation study;
we report only the results of the MaxFree strategy because, for
all the evaluated indexes, it performs noticeably better than
BestFit.</p>
        <p>The first and most important remark is that the number of
failures in Table II is lower than that reported in Table III,
showing the advantage of applying the backtracking strategy.
Furthermore, even the overall trend of the fail ratio in Table III
can be considered low when compared to the total number of
requests (see “FailsRatio” in Table III).</p>
        <p>The second interesting parameter is the average number of
steps performed by the allocation algorithm, which reasonably
increases with the job generation rate. We also remark that the
performance of the “Backtracking” strategy is comparable to that
of the “Stop” strategy in terms of the average number of steps
necessary to allocate a request.</p>
        <p>The last performance parameter to take into account is the
Optimality, which in our experiments appears to be independent
of the job generation load.</p>
        <p>Finally, we report that, although simulations were also
performed for the “ByCoordinates” variant, we do not include the
related results here, because the performance improvement is not
significant.</p>
        <p>TABLE II
STRATEGY: BACKTRACKING / NODE-SELECTION: BYID</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>VI. CONCLUSIONS</title>
      <p>This paper has described a novel technique, based on a
self-organising approach, for solving the problem of job
allocation/resource finding in large-scale computational Grids. The
proposed system, called HYGRA, exploits spatial computing
concepts and maps the entire Grid system into a hyperspace
where each node, according to the availability of its resources,
virtually occupies a point. A completely decentralised algorithm,
which exploits the euclidean distance among nodes, self-organises
an overlay network in which each node is virtually linked with
other nodes satisfying a specific neighbourhood property. A
check-and-forward algorithm is then employed during job
submission to search, by surfing the overlay network, for the
node best able to host the job, given its requirements. The
proposed technique has been evaluated by means of a software
tool able to simulate not only the behaviour of the algorithms
but also the dynamics of job generation, submission and
termination. Simulation results, provided in terms of
computational cost, effectiveness and sensitivity with respect to
Grid load conditions, have shown the validity of the HYGRA
system.</p>
      <p>TABLE III</p>
      <p>STRATEGY: STOP / NODE-SELECTION: BYID</p>
      <table-wrap id="table-2-3-opt">
        <table>
          <thead>
            <tr><th>Jobs/Tick (avg)</th><th>Opt. (avg), Backtracking (Table II)</th><th>Opt. (avg), Stop (Table III)</th></tr>
          </thead>
          <tbody>
            <tr><td>20</td><td>0.61</td><td>0.68</td></tr>
            <tr><td>30</td><td>0.66</td><td>0.67</td></tr>
            <tr><td>40</td><td>0.66</td><td>0.62</td></tr>
            <tr><td>50</td><td>0.63</td><td>0.64</td></tr>
            <tr><td>60</td><td>0.63</td><td>0.63</td></tr>
            <tr><td>70</td><td>0.61</td><td>0.59</td></tr>
            <tr><td>80</td><td>0.59</td><td>0.61</td></tr>
            <tr><td>100</td><td>0.66</td><td>0.61</td></tr>
            <tr><td>120</td><td>0.61</td><td>0.58</td></tr>
            <tr><td>150</td><td>0.65</td><td>0.63</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Czajkowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fitzgerald</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Foster</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Kesselman</surname>
          </string-name>
          , “
          <article-title>Grid Information Services for Distributed Resource Sharing</article-title>
          ,” in
          <source>10th IEEE International Symposium on High-Performance Distributed Computing (HPDC 2001)</source>
          , San Francisco, August 6-9,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A. Di</given-names>
            <surname>Stefano</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Santoro</surname>
          </string-name>
          , “
          <article-title>A Peer-to-Peer Decentralized Strategy for Resource Management in Computational Grids,”</article-title>
          <source>Concurrency &amp; Computation: Practice &amp; Experience</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>I.</given-names>
            <surname>Foster</surname>
          </string-name>
          and C. Kesselman, Eds.,
          <article-title>The Grid (2nd Edition): Blueprint for a New Computing Infrastructure</article-title>
          . Morgan Kaufmann,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Iamnitchi</surname>
          </string-name>
          and
          <string-name>
            <surname>I. Foster</surname>
          </string-name>
          , “On Fully Decentralized Resource Discovery in Grid Environments,” in
          <source>International Workshop on Grid Computing</source>
          <year>2001</year>
          , Denver, CO,
          <year>Nov</year>
          .
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Minoli</surname>
          </string-name>
          , A Networking Approach to Grid Computing. Wiley InterScience,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Schopf</surname>
          </string-name>
          , “
          <article-title>Performance Analysis of the Globus Toolkit Monitoring and Discovery Service, MDS2</article-title>
          ,” in
          <source>International Workshop on Middleware Performance (MP 2004) at IPCCC</source>
          <year>2004</year>
          ,
          <year>April 2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Talia</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Trunfio</surname>
          </string-name>
          , “
          <article-title>Toward a Synergy Between P2P and Grids</article-title>
          ,”
          <source>IEEE Internet Computing</source>
          , vol.
          <volume>7</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>94</fpage>
          -
          <lpage>96</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F.</given-names>
            <surname>Zambonelli</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Mamei</surname>
          </string-name>
          , “
          <article-title>Spatial Computing: an Emerging Paradigm for Autonomic Computing and Communication</article-title>
          ,” in
          <source>1st International Workshop on Autonomic Communication</source>
          , Berlin (Germany),
          <year>October 2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , J. Freschl, and
          <string-name>
            <given-names>J.</given-names>
            <surname>Schopf</surname>
          </string-name>
          ,
          “
          <article-title>A Performance Study of Monitoring and Information Services for Distributed Systems</article-title>
          ,” in
          <source>12th IEEE International Symposium on High-Performance Distributed Computing (HPDC 2003)</source>
          , Seattle, Washington, June 22-24,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>