MMGPS Workshop, London
The Chatty Web approach for global semantic agreements
Philippe Cudré-Mauroux, Karl Aberer
Distributed Information Systems Laboratory (LSIR)
Swiss Federal Institute of Technology, Lausanne (EPFL)
©2003, Philippe Cudré-Mauroux, EPFL-I&C-IIF, Distributed Information Systems Lab

The Problem (1)
[Figure: a peer network made of SwissProt peers (authors, titles, organism, …), EMBLChange peers (species, …) and other peers (authors, …). Labelled nodes: a lab at MIT, a lab in Trondheim, the EMBLChange site at Cambridge, the SwissProt site at Geneva. A query is posted at EPFL.]

The Problem (2)
• How to obtain semantic interoperability among heterogeneous data sources without relying on pre-existing, global semantic models?

Outline of the solution
• Local translations enabling global agreements
[Figure: the same peer network, now with local translations on the links between peers. Link labels include organism → authors, species → organism and organism → species.]

Query Forwarding
• To whom shall we send the queries?
  – To peers likely to send us a response…
• Simplistic solutions
  – Local Neighboring (same schema)
    • Low recall
  – Query Flooding (entire network)
    • Low precision, high network load
• Semantic Gossiping
  – Query forwarding by selecting the right peers
  – Query-dependent PHBs (Per-Hop Behaviors)
  – Analysis of the query and of the transformed queries
    • Intrinsic measures (syntactic distances)
    • Extrinsic measures (semantic distances)

On Translations

Similarity Measures
• Syntactic Similarity
  – Similarity measure between an original query and a transformed query.
  – Iterative computation of the information lost in selections / projections.
• Semantic Similarities
  – Probabilistic analysis (maximum likelihood) of the correctness of translations, based on the feedback received

Semantic Similarity
• Cycle Detection
  – Detection of query cycles:
    • $T_{1 \to n}(A_i) = A_i$ ✓
    • $T_{1 \to n}(A_i) = A_j$ ✗
    • $T_{1 \to n}(A_i) = \emptyset$
• Results Analysis
  – Content-retrieval techniques:
  – Classification rules to relate returned documents to queries (extensional vs. intensional expression of concepts)

Realizing semantic interoperability
• Evaluations based on Chatty Web simulations.
• Automatic correction of erroneous mappings based on the evidence gathered.
• Small-world graph => Self-repairing semantic networks

Some Results (1)
Sensitivity to TTL (cycle analysis only, 25 peers, 4 concepts)

Some Results (2)
Scalability (results analysis only, 4 concepts, TTL=3, misclassification rate=0.1, 2 documents/peer on avg.)
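
The cycle check listed on the Semantic Similarity slide can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the Chatty Web implementation: translations are modelled as plain dictionaries from attribute names to attribute names, an attribute missing from a dictionary is treated as lost, and the names `compose`, `classify` and the labels `"positive"`, `"negative"` and `"lost"` are hypothetical. The attribute names come from the example figures.

```python
from typing import Dict, List, Optional

# Assumption: a translation maps attribute names of one schema to attribute
# names of the next schema. An attribute absent from the mapping is lost.
Translation = Dict[str, str]


def compose(cycle: List[Translation], attribute: str) -> Optional[str]:
    """Apply every translation of the cycle in order (T_1 -> n).

    Returns the resulting attribute, or None if it was lost on the way.
    """
    current: Optional[str] = attribute
    for translation in cycle:
        if current is None:
            break
        current = translation.get(current)
    return current


def classify(cycle: List[Translation], attribute: str) -> str:
    """Classify the outcome of sending one attribute around one cycle."""
    result = compose(cycle, attribute)
    if result is None:
        return "lost"        # T(A_i) = empty set
    if result == attribute:
        return "positive"    # T(A_i) = A_i
    return "negative"        # T(A_i) = A_j with j != i


# Hypothetical cycles that start and end at the querying peer.
good_cycle = [{"organism": "species"}, {"species": "organism"}]
bad_cycle = good_cycle + [{"organism": "authors"}]

print(classify(good_cycle, "organism"))  # positive
print(classify(bad_cycle, "organism"))   # negative
print(classify(good_cycle, "titles"))    # lost
```

A peer could count such outcomes over the cycles it observes and use them as the feedback for the probabilistic analysis; how the counts are weighted is not specified on the slides.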

Some Results (3)
Combined results (25 peers, 4 concepts, TTL=6 | 3, misclassification rate=0.1, 2 documents/peer on avg.)

P-Grid Implementation (ongoing work)

References
• Karl Aberer, Philippe Cudré-Mauroux, Manfred Hauswirth. Start making sense: The Chatty Web approach for global semantic agreements. Journal of Web Semantics, 1st issue.
• Karl Aberer, Philippe Cudré-Mauroux, Manfred Hauswirth. The Chatty Web: Emergent Semantics Through Gossiping. Proceedings of the Twelfth International World Wide Web Conference (WWW2003), 20-24 May 2003, Budapest, Hungary.
• Karl Aberer, Philippe Cudré-Mauroux, Manfred Hauswirth. A Framework for Semantic Gossiping. SIGMOD Record, 31(4), December 2002.
• http://www.p-grid.org/