Semantic Service Discovery in MAS Using Social Networks E. del Val, M. Rebollo, and V. Botti Grupo de Tecnologı́a Informática - Inteligencia Artificial Departamento de Sistemas Informáticos y Computación Universidad Politécnica de Valencia Camino de Vera S/N 46022 Valencia (Spain) {edelval,mrebollo,vbotti}@dsic.upv.es Abstract. Service-Oriented Multi-Agent Systems are dynamic systems popu- lated by heterogeneous agents. These agents model their functionality as services in order to allow heterogeneous agents or other entities to interact in a standard- ized way. Furthermore, due to the large-scale and the adaptive needs of the sys- tem, the traditional directory facilitators or middle-agents are not suitable for the management of the agents services. In this paper we present a distributed system where there is no a ce. The proposal provides a fully decentralized structure and allows agents to locate services using only local information. The system is en- hanced using semantic information in the generation of the system structure and also in the search process. 1 Introduction Service-Oriented Multi-Agent Systems (SOMAS) can be described as open and dy- namic systems, where agents provide basic functionality through services and new agents can enter to the system and existing ones leave. An important issue that has raised great interest in the research community in the latest years is service discov- ery. In open systems where there is a large number of agents and the available agents change dynamically, finding the appropriate agent which offers the service required is not an easy task. Conventional approaches to locate agents with certain functionality in SOMAS, such as registries or middle-agents, are centralized approaches which are not always appropriated for large-scale and highly dynamic environments. These proposals present some weakness such as bottlenecks, complexity or the huge amount of mem- ory needed to keep all the information about the agent’s functionality when the system scales. Distributed approaches, such as agent coalitions or federations of registries, have been proposed to solve some of these problems but the required coordination effort to create the coalitions and to maintain data consistency between distributed registries makes these proposals not suitable for highly dynamic environments. An alternative for traditional proposals is the use of social networks[25][28]. Hu- man beings create social structures in a decentralized way which allows to locate other individual in a few steps considering only local information. This fact was observed by Milgram in the well known experiment of ’six degrees of separation’[24]. The results of this experiment arose two questions: how is the structure of these social networks 23 and how an effective search of individuals is carried on only with local information. In the wake of this experiment, several works started to pay attention on the analysis of the underlying structures in human societies and the properties of these structures. As a result of that, several models based on mathematical functions have been pro- posed to simulate the structure in real social networks. These models try to reflect how social links are established between individuals to form a network which can guide the search. These networks such as small-world or preferential attachment network, are called navigable social networks and it can be ensured that short paths between two random individuals can be found using only local information. How an effective search is carried on only with local information is the other important aspect. Which criteria should be follow by the individuals in order to guide the search towards the tar- get? This depends on the structure of the network. There are some strategies that have better results depending of the underlying structure where it is applied. For instance in small-world networks similarity or geographical distance can be considered a good parameters to consider while strategies guided by degree are not so suitable. In this work, we propose the use of a social network model as the underlying struc- ture of a service discovery system for agents. The structure relies on a social feature present in many social networks called homophily[15]. Homophily expresses the idea that similar people interact with higher frequency than dissimilar people. Therefore, in our system agents with similar roles and services have more probability to be linked. The system provides a fully decentralized structure and allows agents to locate services using only local information. The system is enhanced using semantic information in the generation of the system structure and also in the search process. 2 Related Work Open and dynamic environments where the scalability and the workload are low make use of middleagents to facilitate service discovery [14][22][23]. The matchmakers could provide an optimal matching due to they consider all the registered services in the sys- tem. Unfortunately, this kind of agents could be a bottleneck when the workload in- creases. Other drawbacks are their complexity, the huge amount of memory needed to keep service advertisements and the cost of service composition as the number of ser- vices grows significantly. Different approaches have been suggested to overcome the above mentioned problems. Peer-to-peer approaches [12][2][20][31] broadcast a query using local knowledge The drawback of this approach to service discovery is that the communication among agents is essential and the overall communication traffic over- head may be large. Another distributed way to locate distributed services is to form coalitions or clusters[19][18][16]. Nevertheless, the choice of what coalitions are going to be formed is a difficult task. This entails recursively to calculate the values of the coalitions and later selecting the coalition with the best result. A third way for agents to discover services in efficiently is the distribution of the middleagents or facilitators [21] [17] [13]. These proposals suggest to split the function of the service facilitator among a group of agents. The system designer assigns a local matchmaker to each host or segment of the system, which provides matchmaking services to agents in its vicinity (its segment). In systems with very large segments the problems of scalability are only 24 marginally relieved by this approach because the large segments become overloaded systems which have local bottlenecks. Another case in which this approach is not use- ful is in systems with many cross-links between segments. In this case the overhead of coordinating tasks among local matchmakers might be greater than the benefit obtained from their distribution. 3 Proposed System and Definitions The proposal that we present here tries to overcome drawbacks of current discovery approaches in open SOMAS through a completely distributed approach, considering semantic information about organizational roles and services. This approach is based on social networks as underlying structure. The advantages and contributions of this proposal compared to others are: – System which integrates services and agents. Agents have social and proactive ca- pabilities which provide more flexibility and adaptability to the system. Services facilitate the reusability and interoperability. Here, agents functionality is described in terms of services, therefore we obtain the advantages of both technologies. – System structure that guarantees, in general, that a service if it exits is going to be found in a bounded number of steps. – Service discovery strategy that only needs local information to navigate the network in order to reach the required service. – The use of semantic information to create the system structure and to lead the ser- vice search. – Inclusion of organizational information in the service discovery process. D EFINITION 1 (Agent-Service Discovery System). An SDS is defined as SDS = (A, L), where A is the set of agents that are part of the SDS (nodes): A= {a1 ,...,an }, and each edge `=(ai , aj )∈L indicates the existence of a knowledge or communication relationship between agent ai and aj in the system (undirected links). Agents are social entities which have local knowledge about its immediate neigh- bors, including their identity, degree, organizational information and the semantic de- scription of the services they offer, but it is unaware of the rest of the agents present in the system. D EFINITION 2 (Agent). An agent ai =(R,N ) |ai ∈ A is a social entity which can play several roles in different organizational units R={r1 ,. . . ,rn }:|R| > 0, has a neighbor- hood N ={ak ,...,an }| ak ∈ N , ∃ (ai ,ak )∈L, |N | > 0. The agent role determines the kind of services an agent offers. The role provides an abstract layer over the services that the agent offers and it is used to create the structure of the system. Roles are defined inside an organization unit ou. The organization unit establishes a set of policies responsible of the structure of the system. These policies are related to basic system operations (join, leave, discover. . . ). 25 D EFINITION 3 (Role). A role in our system is defined as r = (φ,S,ou) ∈ R where φ is a semantic concept for the role, ou is the organization where the agent plays the role r and S={s1 ,. . . sn } is the set of services offered by the agent. Each service si is a semantic service defined by the tuple: si = (I,O), where I denotes the set of inputs and O denotes the set of outputs. The I and O of the service are semantic concepts defined in a common ontology. To simplify the system, we are going to consider that each agent plays one role and offers one service. The agent-service discovery system that we present relies on a property present in many real social networks: homophily. This word expresses the idea that similar people tend to interact and establish links with higher probability than dissimilar people. There are two types of homophily[15]: – choice homophily, where patterns of interaction are driven by preferences for sim- ilarity. This kind of homophily has two forms: status homophily, where the indi- viduals are considered similiar if they share a cultural background, and value ho- mophily, where individuals are considered similar on the basis of shared values, attitudes, and believes. – induced homophily, emerges not from individual choice, but from influence dynam- ics that make individuals more similar over time[29]. In this work we focus on choice homophily and its two forms. In general, homophily has demonstrated that is one of the most pronounced features in social networks[3][27]. Due to the efficiency of the social networks with this feature, we consider important to consider this property in our system. D EFINITION 4 (Agent homophily). In our SDS the homophily between two agents is based on the status homophily and the value homophily: – value homophily (Hv (Si ,Sj )) is defined over the agent’s services and it is consid- ered as the semantic similarity between the services offered by the agents. – status homophily (Hs (ri ,rj )) is defined over the agent’s role and it is considered as the semantic similarity between the roles played by the agents Therefore, the homophily between two agents is defined as the linear combination of value and status homophily: H = α ∗ Hv + (1 − α) ∗ Hs (1) We are going to describe with more detail how are calculated each kind of ho- mophily. The homophily function Hs (ri ,rj ), means the degree of match dom (exact, subsumes, plug-in, fail) between the semantic concept of the roles played by the agents. Hs (ri , rj ) = role match(ri .φ, rj .φ) (2) where role match is the function which calculates the semantic similarity between ri .φi and rj .φj . The homophily function Hv (ri .S,rj .S), means the degree of match between the services offered by the agents. 26 Hv (ri .S, rj .S) = β ∗ match(Ii , Ij ) + (1 − β) ∗ match(Oi , Oj ) (3) [ [ I= s.I, O = s.O (4) ∀s∈r.S ∀s∈r.S match(Ii , Ij ) = max(WG0 =(Ii ∪Ij ,E 0 ) ) (5) where function match solves a bipartite matching problem between semantic ser- vices. In our case we have two bipartite graphs, one where the vertexes are the inputs of the services and the other where the vertexes are the outputs. In the case of matching between the inputs of the services (the process is the same for the outputs), the bipar- tite graph G=(Ii ∪Ij , E) has a set of vertexes with Ii inputs and the other set with Ij inputs. Given a bipartite graph for the service inputs, G=(Ii ∪Ij ,E), its matching G0 =(Ii ∪Ij ,E 0 ) E 0 ⊆ E is a graph where all the vertexes of one set are connected with the other set of vertexes only with one edge. In this graph, we allow that edges share a vertex to give more flexibility to the matching. The sum of weights (WG0 ) of the edges in the matching is maximized: X ωk ∀ek ∈E 0 WG0 =(Ii ∪Ij ,E 0 ) = (6) max(|Ii |, |Ij |) 4 System Operations 4.1 Join The process that an agent should follow to get into the SDS is as following (see Alg. 1): the agent ai tries to establish a set of connections with other agents already present in the system. The number of connections that the agent is going to establish is generated by a random function which follows an exponential distribution. The idea is to generate a system with an exponential degree distribution to achieve the structure of a preferen- tial attachment network[1]. A preferential attachment network it is characterized by a degree distribution which follows a power-law degree distribution, p(dg) ∝ dg λ , where p(dg) indicates the probability to be connected to a node with degree dg. This means that there are some nodes have a high degree and the majority has a low degree. This structure ensures that the diameter of the network is ln |A|, where |A| is the number of agents in the SDS and in some situations, when 2<λ<3, is lnln |A|[11]. This model is present in many ’online communities’ such as WWW, electronic mail or citation graphs [26]. These networks are the result of a growth process in which new nodes that join the system prefer to be connected to well connected nodes. Once the agent knows the number of connections, it should decide which agents are going to be its neighbors. The probability of an agent ai to establish a connection with agent aj is directly proportional to the homophily degree between the agents, if the agents are more similar, they have more probability to be connected. This condi- tion allows a new agent not only to establish ’short connections’ between agents with 27 similar roles and semantic services, but also between agents that are not similar (’long connections’). The idea of ’long connections’ is to create short paths between groups of agents that do not offer similar services and reduce the number of hops needed to discover services. The probability to establish a connection between two agents in the system is based on the homophily degree between them H(ai ,aj ). Algorithm 1 Join, where a is the new agent and SD the system function Join(a, SD) connections←ExpRandom(λ) connected←F alse dg ← 0 while ¬connected ∧ dg ≤ connections do ar ←random(SD) if H(a, ar ) ≥ U niRandom(0,1) then `(a,ar ) a.dg←a.dg+1 updateN eighbors(a, ar ) connected←T rue end if end while end function 4.2 Leave When an agent leave the system could be for three reasons: the agent decide voluntary to leave the system, failure or ’sabotage’. Periodically an agent sends a keep alive message to its neighbors. The agent will notice that one of its links is broken whether after sending a message, the time to receive an answer from the neighbor expires. In that case, the agent deletes the neighbor from its neighbor list and establishes a new link with other agent in the network to keep their degree (see Alg. 2 and Alg. 3). Algorithm 2 Leave, where a is the agent and SD the system function Leave(a, SD) local for ai ∈ a.N do removeLink(a, ai ) newLink(ai ,SD) end for end function 4.3 Search D EFINITION 5 (Service discovery problem) Given a set of agents A situated in a SDS= (A,L), the service discovery problem is defined as a probabilistic decision-making task in which an agent ai ∈ A is looking for an agent aj ∈ A, which offers the required service st . 28 Algorithm 3 newLink, where a is the agent and SD the system function Link(a, SD) connected←F alse while ¬connected do ar ←random(SD) if sim(a, ar ) ≥ U niRandom(0, 1) then `(a,ar ) ar .dg←ar .dg+1 updateN eighbors(a, ar ) connected←T rue end if end while end function The search process in the system is based only on the agent local knowledge. When the agent ai is looking for an agent aj which offers the required service st , ai selects which of its neighbors is the most appropriated to redirect the query instead of broadcast the query to all the neighborhood. In many networks which reflects power-law charac- teristics, the search is suggested to be based on degree. However, this makes that highly connected nodes could be overloaded with requests. Our selection criteria is based on previous proposals presented in [27][30], where the selection of the most promising neighbor is based on two criteria: degree and the similarity. In our sytem the selection criteria is based on the agent degree and the semantic similarity between agents services and roles. Until the target agent aj is found, all future agents involve in the discovery process will make their decision similarly (see Alg. 4). Algorithm 4 Search where as is the source agent, st is the required service and S the system and Θ is the similarity threshold function Search(as , st , S, Θ) s←getService(Ags ) a←as steps←0 while sim(s, st ) ≥ Θ ∧ steps≤T T L do pmax ←0 for ai ∈ a.N do dg ← a.dg s ← a.s p ← P (H(a, ai ),dg) if p > pmax then pmax ← p a ← ai end if end for end while return a end function 5 Simulation Results The test can be divided in two groups. The first group compares the performance of typ- ical distributed search strategies (degree, similarity, random) to the proposal presented 29 in this paper. The second test evaluates the fault tolerance of the SDS when relation- ships between agents in the system are broken randomly (an agent leaves the system) or following some patterns (which corresponds to deliberate failures ’sabotage’). 5.1 System Characterization The experiments have been done in a set of networks that simulate the SDS structure. These networks are preferential attachment networks and have generated as the result of the join operation of agents. We have implemented two kind of networks: A where the agents have not roles and Network B where each agent plays a role. We consider 10 types of different roles. Each network is composed of 1000 agents with one seman- tic service each one. The services and roles have been assigned to the agents using a uniform distribution. 5.2 Performance In this section we evaluate the search operation in our SDS. Due to the similarities of our system and p2p systems, we compared the search operation to other typical search strategies used in p2p systems: random, degree, similarity, similarity and degree. We have analysed the behavior of each strategy in 5000 searches in networks A and B. In figures 1a and 1b, the results obtained after the service search process are pre- sented. We see that in general the strategies in a network with organizational informa- tion have a better performance than the same strategies in a network without this infor- mation. That shows that organizational information in the system can guide the search process better than the systems that only provide information related to the degree and services. Between all the strategies, the search operation that we present in this paper has a better performance than the others. This is because it considers, apart form the degree and semantic service information, the roles that agents play. This information reduces the set of possible agents suitable to offer the service. An important parameter to consider in SDS is the number of steps to reach the tar- get agent. Figure 2a shows the mean path length obtained with each strategy in networks with role information (B). In general, all the strategies return paths with more steps as the number of agents in the network grows. When the size of the network is over 700 agents, the path length does not increase significantly. This shows that the structure of the SDS is suitable for large-scale systems. In Figure 2b the success rate of each search strategy in SDS is depicted. An obvious result is that as the system scale increases, the percentage successful searches decreases. The search operation presented here is the algorithm less influenced by the number of agents in the system. In general, the search operation in the 80% of searches finds a path between the source agent to the target agent. 5.3 Fault Tolerance The last and very important check is the behavior of the SDS under failures. The prob- lem appears when a broken link splits the system into two isolated parts, since some 30 250 250 Similarity&Degree EVN Similarity&Degree Search operation Search EEVN operation Similarity Similarity Similarity-Based Similarity Similarity -based Similarity-Based Degree Degree Degree Degree-Based Degree Degree-Based -based Random Random Random-Based Random Random Random-Based 200 200 Mean Frecuency Mean Frecuency 150 150 100 100 50 50 0 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 Path Length When Successful Path Length When Successful (a) (b) Fig. 1: Search performance without/with role information 45 100 Searchoperation Search Search operation EEVN EEVN Search operation Similarity&Degree EVN Similarity&Degree Similarity&Degree EVN EVN Similarity&Degree Degree Degree-Based Similarity Degree-Based Similarity Similarity Degree Similarity-Based Degree 90 Similarity-Based 40 Random-Based Random-Based 80 35 % Successful Searches 70 Mean Path Length 30 60 50 25 40 20 30 15 20 0 100 200 300 400 500 600 700 800 900 1000 0 100 200 300 400 500 600 700 800 900 1000 n agents n agents (a) (b) Fig. 2: Mean path length and success agents will no longer be reachable. To analyse it, agent failures have been modelled as a failure of all its connexions. When some links are broken, an alternative path has to be found. For random failures (see Fig.3a and Fig.3b), it can be observed that when the number of deleted agents is from 10% to 30%, the path length increases, due to there are alternative paths (with more steps) to find the agent with the required service. When the number of deleted agents ranges from 30% to 50%, the network is divided in several isolated parts. Only the searches inside the isle will success, so the number of 31 successful searches decreases and the path length decreases because the isle diameter are smaller. 42 70 Search operation EEVN EEVN Search operation Similarity&Degree EVN Similarity&Degree EVN Degree-Based Degree-Based 40 Similarity-Based Similarity-Based Random-Based 60 Random-Based 38 50 % successful searches 36 Mean Path Length 40 34 30 32 20 30 10 28 26 0 10 15 20 25 30 35 40 45 50 10 15 20 25 30 35 40 45 50 % deleted agents % deleted agents (a) (b) Fig. 3: Mean path length and success with random failures An interesting case is what happens when a deliberate failure is provoked. In the case of systems that follow a power-law, the worst case occurs when agents with high- est degree (hubs) are disconnected. Figure 4a and 4b shows how ’sabotage’ affects the performance of the search process. In this case, the path length increases due to only a few highly connected hubs have been deleted and an alternative path exists. The per- formance attending the number of successful searches decreases considerably as the number of deleted hub increases. 6 Conclusions The aim of this work is to provide an alternative to traditional approaches that deal with the service discovery task in large-scale open SOMAS. Our proposal tries to over- come drawbacks present in other centralized (bottlenecks, complexity, huge amount of memory needed, global knowledge) and distributed (network traffic, congestion, coor- dination effort, data consistency between distributed registries, update data) discovery approaches. We consider that structures used in social networks facilitate the task of locating agent services in a few steps using only local information. For that reason we investigate the use of social networks as underlying structure of a service discovery system. This structure is based on the concept of similarity between individuals, con- sidering organization role and services, and uses semantics to calculate this similarity. Furthermore, we provide several operations for the agents to be part of the system. An evaluation of the search functionality compared to other traditional p2p strategies is also 32 42 90 Search operation EEVN Search operation EEVN Similarity&Degree EVN Similarity&Degree EVN Degree-Based Degree-Based Similarity-Based 80 Similarity-Based 40 Random-Based Random-Based 70 38 % successful searches 60 Mean Path Length 36 50 40 34 30 32 20 30 10 2 4 6 8 10 12 14 2 4 6 8 10 12 14 n deleted hubs n deleted hubs (a) (b) Fig. 4: Mean path length and success under ’sabotage’ conditions provided. The behavior of the system under failure and ’sabotage’ circumstances have been also evaluated. The results of the experiments show that the system is robust under failure and that the search functionality performs well. Acknowledgment This work is supported by TIN2009-13839-C03-01 and TIN2008-04446 projects, CONSOLIDER-INGENIO 2010 under grant CSD2007-00022, FPU grant AP-2008- 00601 awarded to E. del Val. References 1. Barabasi, A. L. and Albert, R. Emergence of Scaling in Random Networks. Science, 286(5439):509–512, 1999. 2. E. Bircher and T. Braun. An Agent-Based Architecture for Service Discovery and Negotiation in Wireless Networks. 2004. 3. Basit Chaudhry, Chris Marton, and Hana Shepherd. Homophily and structure in multiplex networks. 4. Miller, Lynn, and James M. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 2001. 5. M. Moore and T. Suda. A decentralized and self-organizing discovery mechanism. In Proc. Of the First Annual Symposium on Autonomous Intelligent Networks and Systems, 2002. 6. Jeffrey Travers and Stanley Milgram. An experimental study of the small world problem. Sociometry, 1969. 7. Yamini Upadrashta, Julita Vassileva, and Winfried Grassmann. Social networks in peer-to- peer systems. paper presented at the. In 38th Hawaii International Conference on System Sciences, pages 3–6, 2005. 33 8. Alexei Vázquez. Growing network with local rules: Preferential attachment, clustering hier- archy, and degree correlations. Physical Review E, 67(5), May 2003. 9. Duncan J. Watts, Peter Sheridan Dodds, and M. E. J. Newman. Identity and search in social networks. Science, 296:1302–1305, 2002. 10. Hui Zhang, Ashish Goel, and Ramesh Govindan. Using the small-world model to improve freenet performance, 2002. 11. R. Cohen and S. Havlin. Scale-Free Networks Are Ultrasmall. Physical Review Letters, 90(5), February 2003. 12. J. Dang and M. Hungs. Concurrent Multiple-Issue Negotiation for Internet-Based Services. Number Vol.10 - 6. 2006. 13. S. Jha, P. Chalasani, O. Shehory, and K. Sycara. A formal treatment of distributed match- making. In Proc. of the 2nd Int. Conference on Autonomous Agents, number Vol.3, pages 457–458, 1998. 14. M. Klusch, B. Fries, and K. Sycara. Automated semantic web service discovery with owls- mx. In AAMAS, 2006. 15. Miller, Lynn, and James M. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 2001. 16. M. Moore and T. Suda. A decentralized and self-organizing discovery mechanism. In Proc. Of the First Annual Symposium on Autonomous Intelligent Networks and Systems, 2002. 17. S. Mullender and P. Vitanyi. Distributed Match-Making. Number Vol.3. 1988. 18. E. Ogston and S. Vassiliadis. Local distributed agent matchmaking. In Proceedings of the 9th International Conference on Cooperative Information Systems, 2001. 19. E. Ogston and S. Vassiliadis. Matchmaking among minimal agents without a facilitator. In Proceedings of the 5th International Conference on Autonomous Agents, pages 608–615, 2001. 20. A. Ouksel, Y. Babad, and T. Tesch. Matchmaking software agents in b2b markets. In Pro- ceedings of the 37th Annual Hawaii International Conference on System Sciences (HICSS’04), 2004. 21. K. Sigdel, K. Bertels, B. Pourebrahimi, S. Vassiliadis, and L.S. Shuai. A framework for adaptive matchmaking in distributed computing. In In proceeding of GRID Workshop, 2005. 22. K. Sycara and M. Klusch. Brokering and matchmaking for coordination of agent societies: A survey. Coordination of Internet Agents: Models, Technologies and Applications, 2001. 23. Katia Sycara, Matthias Klusch, Seth Widoff, and Jianguo Lu. Dynamic service matchmaking among agents in open information environments. SIGMOD Record, 28:47–53, 1999. 24. Jeffrey Travers and Stanley Milgram. An experimental study of the small world problem. Sociometry, 1969. 25. Yamini Upadrashta, Julita Vassileva, and Winfried Grassmann. Social networks in peer-to- peer systems. paper presented at the. In 38th Hawaii International Conference on System Sciences, pages 3–6, 2005. 26. Alexei Vázquez. Growing network with local rules: Preferential attachment, clustering hier- archy, and degree correlations. Physical Review E, 67(5), May 2003. 27. Duncan J. Watts, Peter Sheridan Dodds, and M. E. J. Newman. Identity and search in social networks. Science, 296:1302–1305, 2002. 28. Hui Zhang, Ashish Goel, and Ramesh Govindan. Using the small-world model to improve freenet performance, 2002. 29. M. McPherson, L. Smith-Lovin, and J. Cook. Birds of a feather: Homophily in social net- works. Annual Review of Sociology, 2001. 30. Şimşek and Jensen. Navigating networks by using homophily and degree. NAS, 2008. 31. Antnio L. Lopes and Lus M. Botelho. Improving Multi-Agent Based Resource Coordination in Peer-to-Peer Networks. Journal of Network, 2008. 34