Representing Probabilistic Relations in RDF

Yoshio Fukushige
Network Development Center
Matsushita Electric Industrial Co., Ltd.
4-5-15 Higashi-shinagawa, Shinagawa-ku, Tokyo 140-8632, Japan
fukushige.yoshio@jp.panasonic.com

Abstract

Probabilistic inference is of special importance when one needs to know how much can be said, given new observations, on the basis of everything already known. A Bayesian Network is a graphical probabilistic model with which one can represent probabilistic relations intuitively, and for which several effective inference algorithms have been developed. This paper reports ongoing work, currently in its design stage, that provides a vocabulary for representing probabilistic knowledge in an RDF graph, which is then mapped to a Bayesian Network in order to do inference on it.

1 Introduction

In the real world, especially in scientific fields such as Life Science, or in applications such as content classification and recommendation, it is often the case that a relationship between resources holds only probabilistically, or that statements can be made only with uncertainty. Such relationships can be described well with a probabilistic model, and probabilistic inference is of special importance when one needs to know how much can be said on the basis of incomplete knowledge.

In this position paper, I report my ongoing work, which provides a vocabulary for representing probabilistic knowledge in an RDF graph. I introduce Bayesian Networks, list the requirements for the representation language, and present a candidate vocabulary.

2 Bayesian Network

A Bayesian Network (BN) (Pearl 1988) [1] is a graphical model for representing probabilistic relations. It is a directed acyclic graph (DAG) representing probabilistic dependencies among a set of propositions. A node represents an exhaustive and mutually exclusive set of propositions (called a 'variable' or 'partition'). A link represents a direct dependency between variables. Each node is accompanied by a conditional probability table (CPT) that represents the probabilistic relationship between the node's variable and the variables it depends on. The posterior probability distribution ("belief") for each variable can be calculated by propagating beliefs.

Figure 1 shows an example illustration of a BN (CPTs are not shown). It has 5 nodes and 5 links connecting them.

Figure 1: Example Bayesian Network

Advantages of employing BNs include, among others:

• the relations are expressed by a graph, which is a familiar notion in the Semantic Web, and are thus intuitive and easy to understand
• effective calculation algorithms (including simulation methods) have been developed

3 Requirements for the representation language

The aim of this work is not just to represent Bayesian Networks in the Semantic Web, but to obtain a language (or extension vocabulary) that can describe probabilistic relations in a way that is compatible with the Semantic Web and easy to map to a BN. The idea is to put together the information distributed across the Semantic Web and to do probabilistic inference in the BN.

The components of a BN are nodes, links, and the CPTs attached to the nodes. A node represents a set of exhaustive and mutually exclusive propositions (called a partition).

The representation language should be able to express:

• a partition, i.e. a set of exhaustive and mutually exclusive propositions
• propositions, in such a way that they are distinguished from ground facts
• the probability with which a proposition holds, with or without certain conditions

4 Vocabulary for RDF representation

RDF is a W3C standard and one of the fundamental building blocks of the Semantic Web. By representing probabilistic relations in RDF, one gains the advantage of reusing existing vocabularies and tools for RDF processing, and one can treat the probabilistic relations themselves as resources and incorporate them into RDF graphs.

To provide a vocabulary that satisfies the requirements above, I introduced the following classes and predicates:

classes: prob:Partition, prob:ProbabilisticStatement, prob:Clause, prob:Probability
predicates: prob:predicate, prob:hasProbability, prob:condition, prob:case, prob:about

Details are omitted for lack of space. Details are to be available at.

Points to note are:

• Conditions are linked to prob:Partition's, not to each individual case.
• Introduction of prob:Clause's. prob:Clause is a generalization of RDF reification. A prob:Clause has one prob:predicate and zero or more 'terms' (cf. pattern 2 of the N-ary relationship representations in [9]).

The following is an example graph, in a Turtle [2] serialization, which represents the probabilistic relation "if cond0, then case1 has probability prob1 and case2 has probability prob2":

    [ a prob:Partition ;
      prob:condition :cond0 ;
      prob:case
        [ a prob:ProbabilisticStatement ;
          prob:about :case1 ;
          prob:hasProbability :prob1 ] ,
        [ a prob:ProbabilisticStatement ;
          prob:about :case2 ;
          prob:hasProbability :prob2 ]
    ] .

5 Related works

(Ding & Peng 2004) [4] and (Gu, Pung & Zhang 2004) [5] are closely related to this work. They propose augmenting OWL to allow probabilistic markup, and provide a set of transformation rules to convert the probabilistically annotated OWL ontology into a BN.

(Holi & Hyvönen 2004) [6] is an attempt to express and calculate the overlap of concepts in RDF(S) and OWL using a BN.

Works on the combined use of BNs and Description Logics include, among others, (Koller, Levy & Pfeffer 1997) [7], which presents P-CLASSIC, a probabilistic version of the Description Logic CLASSIC, and (Yelland 2000) [8], which incorporates Description Logics into BNs.

While the others address T-Box knowledge, (Fukushige 2004) [3] proposes a method to encode probabilistic relations in the A-Box, which is the same direction as this work.

6 Conclusion and future works

This position paper reported ongoing work that provides a vocabulary for representing probabilistic knowledge in an RDF graph.

Open issues (other than implementation issues) include:

• Relationship with rule languages
• How to standardize query languages against a BN store
• How to learn BNs from data and/or partial descriptions in RDF
• How to deal with, or avoid, cyclic probabilistic descriptions in RDF
• How to deal with continuous probability distributions

References

[1] Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988
[2] Beckett, D.: Turtle - Terse RDF Triple Language, http://www.ilrt.bris.ac.uk/discovery/2004/01/turtle/
[3] Fukushige, Y.: Representing Probabilistic Knowledge in the Semantic Web, position paper for the W3C Workshop on Semantic Web for Life Sciences, http://www.w3.org/2004/09/13-Yoshio/PositionPaper.html, 2004
[4] Ding, Zh., Peng, Y.: A Probabilistic Extension to Ontology Language OWL, Proc. of the 37th Hawaii International Conference in System Sciences, 2004
[5] Gu, T., Pung, H.K., Zhang, D.Q.: A Bayesian Approach for Dealing with Uncertain Contexts, Proc. of the Second International Conference on Pervasive Computing (Pervasive 2004), "Advances in Pervasive Computing", Austrian Computer Society, vol. 176, 2004
[6] Holi, M., Hyvönen, E.: A Method for Modeling Uncertainty in Semantic Web Taxonomies, Proceedings of the 13th International World Wide Web Conference, pp. 296-297, 2004
[7] Koller, D., Levy, A., Pfeffer, A.: P-CLASSIC: A tractable probabilistic description logic, Proceedings of the 14th National Conference on Artificial Intelligence (AAAI-97), pp. 390-397, 1997
[8] Yelland, P.M.: An Alternative Combination of Bayesian Networks and Description Logics, Proceedings of the KR2000 Conference, pp. 225-234, 2000
[9] Noy, N., Rector, A.: Defining N-ary Relations on the Semantic Web: Use With Individuals, W3C Working Draft, http://www.w3.org/TR/2004/WD-swbp-n-aryRelations-20040721/, 2004
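
As a rough illustration of how a prob:Partition description of the kind given in Section 4 might be mapped to one row of a CPT, the following Python sketch may help. It is only a sketch under stated assumptions, not the mapping proposed in this paper: the class and function names (Partition, ProbabilisticStatement, to_cpt_row, build_cpt, marginal) are hypothetical, the numeric probabilities and the second condition "not_cond0" are invented for the example, and each partition is assumed to carry a single condition. It checks that the cases of a partition are exhaustive (their probabilities sum to 1) and then computes the belief for each case by summing over the conditions.

```python
from dataclasses import dataclass


@dataclass
class ProbabilisticStatement:
    about: str          # the case (proposition), cf. prob:about
    probability: float  # cf. prob:hasProbability


@dataclass
class Partition:
    condition: str      # condition shared by all cases, cf. prob:condition
    cases: list         # ProbabilisticStatement items, cf. prob:case


def to_cpt_row(partition, tol=1e-9):
    """Map one partition to a CPT row {case: P(case | condition)}."""
    row = {s.about: s.probability for s in partition.cases}
    if len(row) != len(partition.cases):
        raise ValueError("each case may appear only once in a partition")
    if abs(sum(row.values()) - 1.0) > tol:
        raise ValueError("case probabilities must sum to 1")
    return row


def build_cpt(partitions):
    """Collect one CPT row per condition."""
    return {p.condition: to_cpt_row(p) for p in partitions}


def marginal(cpt, prior):
    """P(case) = sum over conditions c of P(case | c) * P(c)."""
    belief = {}
    for condition, row in cpt.items():
        for case, p in row.items():
            belief[case] = belief.get(case, 0.0) + p * prior[condition]
    return belief


# Hypothetical numbers, chosen only for illustration.
cpt = build_cpt([
    Partition("cond0", [ProbabilisticStatement("case1", 0.75),
                        ProbabilisticStatement("case2", 0.25)]),
    Partition("not_cond0", [ProbabilisticStatement("case1", 0.25),
                            ProbabilisticStatement("case2", 0.75)]),
])
belief = marginal(cpt, {"cond0": 0.5, "not_cond0": 0.5})
print(belief)  # {'case1': 0.5, 'case2': 0.5}
```

Attaching the condition to the partition rather than to each statement mirrors the first point noted in Section 4: one condition then corresponds to exactly one CPT row, so the sum-to-1 check can be made per partition.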