=Paper=
{{Paper
|id=Vol-71/paper-10
|storemode=property
|title=Engaging Prolog with RDF
|pdfUrl=https://ceur-ws.org/Vol-71/Omelayenko.pdf
|volume=Vol-71
}}
==Engaging Prolog with RDF==
Borys Omelayenko
Department of Computer Science
Vrije Universiteit, De Boelelaan 1081, 1081hv,
Amsterdam, the Netherlands
borys@cs.vu.nl
Abstract
Prolog has often been used to represent axioms and to perform inference over RDF data models, often by converting all the data to plain-text Prolog facts and programs. In this paper we present the PRoDeF infrastructure for using Prolog for inferencing over RDF data on the Web by representing Prolog programs in RDF, allowing them to be distributed over the Web and even incomplete, and representing reasoning results in a form suitable for further automatic processing.
Figure 1: The infrastructure. (Diagram labels: Prolog engine; PRoDeF wrapper; Data in RDF; Program in RDF; Predicates in other locations in RDF; Predicates as Web Services; Inference results in RDF.)
1 Introduction
RDF and RDF Schema lack the means for representing axioms and rules, which are still necessary to build any kind of application, and different approaches originating from different motivations and requirements have been proposed. At the same time, Prolog [Bratko, 1990] has been extensively used for inference over data models represented in RDF.¹ However, RDF and Prolog have mostly been connected in an ad-hoc manner, primarily by converting everything to plain-text Prolog facts and programs.
We intend to build the infrastructure depicted in Figure 1, where a special wrapper connects the Prolog engine to the Web. It parses a Prolog program represented in RDF and downloads the RDF data modules to be processed with the program. The program itself is distributed over several locations, and the predicates used in one location may be defined in other places. The predicates, especially those that are resource-critical or perform some specific function, may even be implemented in other languages and be accessible as web services. Finally, the inference results are represented in RDF for further automatic processing.
We believe that such an infrastructure should possess the following basic properties:
1. The language should have clear semantics, sufficient and stable tool support, and existing expertise in terms of available literature, courses, and skills
2. The Prolog programs should be available on the Web in a distributed manner, possibly decomposed into pieces and represented in RDF according to a certain RDF Schema
3. RDF data should be interpreted in Prolog
4. Program execution should assume that some parts of the program may require some time to download from different locations or may not be available at the moment
5. A clear algorithm for converting plain-text Prolog programs to and from their RDF representation should be provided
6. The reasoning results should be represented in RDF and should allow updating the data
7. The rule language should allow representing the constraints at the RDF Schema level and smoothly link them to instance data.
We try to meet these requirements in the Prolog wrapper discussed in this chapter.
¹ http://www.google.com/search?q=using+prolog+rdf
1.1 Where are the Limits of Ontology Languages?
The ontology languages for the Semantic Web incorporate certain means for representing axioms that can then be used without any additional rule language.
RDF and RDF Schema contain the axioms needed to form the object-attribute language for representing the conceptual models: rdfs:subClassOf and rdfs:subPropertyOf are used to organize classes
and properties into hierarchies, while rdfs:domain and rdfs:range specify the attachment of properties to classes. This set of axioms is often difficult to use in practice, e.g. the conjunctive semantics of multiple occurrences of rdfs:domain or rdfs:range means that a property may be attached to an intersection of one or more classes, and not to a union (disjunctive semantics). This poses problems whenever a property has to be attached to several classes.
In OWL, RDF Schema is extended and several groups of axioms are introduced. These are:
Equality axioms sameClassAs, samePropertyAs, sameIndividualAs and differentIndividualFrom, which denote that two classes, properties or individuals are equivalent (or, for differentIndividualFrom, distinct);
Property characteristics inverseOf, TransitiveProperty and SymmetricProperty, which define inverse, transitive and symmetric properties, and allValuesFrom and someValuesFrom, which define property range restrictions;
Cardinality constraints of properties.
This set of axioms allows modelling numerous frequently needed constraints. For example, the bossOf relation is often used to represent organizational structures. To model its transitivity in RDF Schema one needs to create and interpret a special rule, while it can be directly modelled in OWL with the TransitiveProperty property. However, in many organizations the set of bosses who can actually give orders to an employee is limited to two levels: the immediate boss and his or her immediate boss, and not a single step further. This axiom cannot be modelled in OWL directly, and a rule is needed to model this two-step transitivity.
Another kind of example involves axioms with value constraints, e.g. classifying offers according to price, where cheap offers satisfy 0 EUR < price < 500 EUR.
1.2 Rule Languages for RDF
Many applications require rules and axioms that cannot be directly represented in RDF Schema or OWL. RDF and RDF Schema do not possess any rule language, which is caused by the numerous difficulties expected in standardizing such a language at the present time. However, this need has been widely understood in the Semantic Web community and several approaches for such a rule language have been proposed.
Triple [Sintek and Decker, 2001] is proposed as an RDF query and inference language, providing full support for resources and their namespaces, models represented with sets of RDF triples, reification, RDF data transformation, and an expressive rule language for RDF. The language is intended to be used with a Horn-based inference engine.
The RuleML² initiative aims at defining a shared rule markup language to specify forward (bottom-up) and backward (top-down) rules in XML. The language being developed within the initiative is essentially an XML serialization for the rules, and it specifies the rules in a generic form of a head and a body consisting of atomic predicates with parameters that can also be interpreted in Prolog. RuleML is probably the only rule language for RDF that defines an RDF syntax for the rules themselves.
However, there are several differences in the goals pursued in RuleML and PRoDeF. These are:
RDF facts. RuleML focuses on a universal representation of rules on the (Semantic) Web and thus makes no assumptions about the structure and arity of the facts, while any inference engine dealing with RDF naturally deals with binary RDF facts only;
Implementation. RuleML is not linked to a specific inference engine and thus needs to provide its own interpretation of the rules together with a linkage to inference engines. PRoDeF follows the opposite approach, tightly connecting the RDF serialization to standard Prolog semantics. On the engine implementation side, PRoDeF relies on the decades-long experience in building Prolog engines;
RDF Schema interpretation. The RDF serialization used in RuleML makes no commitment to RDF Schema and uses rdf:Bags and rdf:Sequences to represent the rules that are not representable in RDF Schema.
The RuleML initiative has been hosting a workshop on rule languages for the Semantic Web where a number of initiatives have been presented.³
Squish,⁴ also known as RDQL, is somewhat similar to SQL but is further elaborated to query RDF triples. This similarity to SQL allows seamless integration with database back-ends. However, querying in Squish is restricted to plain RDF, without any support for RDF Schema or high-level languages.
The Sesame RDF querying engine [Broekstra et al., 2002] uses the RDF Query Language RQL.⁵ Similar to Squish, RQL statements contain the select-from-where construct; however, an RQL interpreter is supposed to understand RDF Schema axioms: transitivity of the subclass-of relation, its connection to rdf:type, etc.
A comparison of different RDF query languages is published on the W3C web site⁶ together with query samples and may serve as an interesting information source.
The Object Constraint Language OCL⁷ is the expression language for the Unified Modeling Language (UML) that allows specifying constraints about the objects, links, and property values of UML models. OCL is a pure expression language and any OCL expression is guaranteed not to change anything in the model. Whenever an OCL expression is evaluated, it simply delivers a value. OCL is a modelling language rather than a programming language and it is not possible to write program logic or flow-control in OCL. As a side effect, not everything in OCL is promised to be directly executable.
The Protégé axiom language PAL⁸ is used together with the Protégé editor⁹ to specify knowledge base constraints. The syntax of PAL is a variant of the Knowledge Interchange Format (KIF) and it supports KIF connectives but not all KIF predicates and statements. PAL is not really targeted towards RDF and RDF Schema.
The RDF parser for SWI Prolog is a very relevant and popular initiative on using Prolog with RDF. The parser is capable of converting RDF documents into Prolog facts and then utilizing Prolog for reasoning, but it does not address the RDF representation of Prolog programs themselves.
Table 1 presents a summary of the features possessed by the languages with respect to the desired features listed in Section ??. As we can see, none of the languages fulfills all of them, with RQL being the closest.
It is also interesting to estimate the popularity of the rule and query languages. We queried the Web for relevant documents as presented in Table 2. The table illustrates that Prolog has been frequently used with RDF, however, with a relatively small number of publications on the subject. Obviously, these results are subject to various distortions and they do not indicate more than they do. However, it is obvi[...] support able to compete with the decades-long experience in Prolog tool development.
Table 1: The features possessed by each language with respect to the requirements from Section ?? (? = in PRoDeF)
Language   1:Exp.   2:Web   3:RDF   4:Exec.   6:Res.
Triple     –        –       +       –         –
RuleML     –        +/–     –       –         –
RQL        +        –       +       –         (planned)
Squish     –        –       –       –         –
PAL        +/–      –       –       –         –
OCL        +/–      –       –       –         –
Prolog     +        +?      +?      +?        +?
Table 2: An estimate of the popularity of each of the rule language proposals for RDF, as queried on 7 January 2003
Query string      Papers in CiteSeer   Google results
RDF and Triple    4                    282
RDF and OCL       2                    591
RDF and RuleML    11                   803
RDF and Prolog    31                   16,100
² http://www.dfki.uni-kl.de/ruleml/
³ http://www.soi.city.ac.uk/~msch/conf/ruleml/
⁴ http://swordfish.rdfweb.org/rdfquery/
⁵ http://sesame.aidministrator.nl/publications/rql-tutorial.html
⁶ http://www.w3.org/2001/11/13-RDF-Query-Rules/
⁷ www.omg.org/docs/ad/97-08-08.pdf
⁸ http://protege.stanford.edu/plugins/paltabs/pal-documentation/
⁹ http://protege.stanford.edu/
2 The Usage Scenario
Consider the prototypical scenario shown in Figure 2. It depicts an RDF document that has a certain constraint a attached to it with the goal property. The property refers to the definition of a, made in another file as a PRoDeF program. To verify the document against the constraint, a Prolog parser needs to access the definition of a, which is, in turn, defined over b and c, where c is again defined in another file. In this way the parser needs to follow the rdfs:isDefinedBy links attached to the predicates and extract their definitions from different locations on the Web. At a certain moment all the predicates will have been collected and defined in terms of l_triple and o_triple facts, and the constraint can be verified.
3 The Ontology for PRoDeF
The ontology for PRoDeF represents the syntactic structure of Prolog programs and is depicted in Figure 3. In this ontology we do not try to represent the execution semantics of the programs, but treat program text as data and encode it as data, leaving its interpretation to a Prolog engine.
The modules of Prolog code are modelled with the class PrologModule, which contains the module's logical name, represented with the rdf:ID attribute, and the physical location of the module, encoded with the rdfs:isDefinedBy attribute.¹⁰ A module may export several predicates linked with the export property, call several directives, e.g. consult, as mentioned in the calls property, and contain rule definitions. PrologModules are instantiated with RDF files with program code located somewhere on the Web.
The class PredicateName represents a predicate name that requires a certain numberOfParameters. The name tag is targeted at a human user, while the rdf:IDs of the PredicateName instances represent their identifiers used by the parser. The ontology includes several pre-defined instances of PredicateName reserved for o_triple and l_triple, which correspond to the fact names reserved in PRoDeF to represent RDF data, and for bagof, setof and forall, which correspond to the special Prolog constructs. The PredicateNames represent the predicate names without any connection to their possible use with different parameters.
These uses are represented with the ClauseWithParameters class, which connects a predicate name to a list of parameters. The parameters are organized as a list of instances of the Parameter class, where each instance corresponds to one parameter (a variable or a constant) and points to the next parameter in the list.
For example, an instance of PredicateName may look like the following:
[...]
and corresponds to myPredicate/3,¹¹ and an instance of ClauseWithParameters may look like this:
[...]
¹⁰ rdfs:isDefinedBy belongs to RDF and RDF Schema and is not presented in the figure.
¹¹ Some of the conventions on encoding predicate names in Prolog are lifted in PRoDeF, as described later.
Figure 2: The usage scenario for PRoDeF: a data module named Data.rdf contains two objects: object Product with properties price and supplier linking it to the second object Supplier with property name. This data module has to comply with the constraints represented by the goal predicate named a. In turn, a is defined in another file named a_in_Prolog.rdf, as represented by the RDF Schema property isDefinedBy. The definition of a consists of two predicates: b, defined in the same file as a, and c, defined in another file named Other_rules.rdf. (Diagram, titled "Linking Prolog Programs on the Web": Data.rdf holds Product with price 55€ and Supplier with name HP; a_in_Prolog.rdf holds a(X,X):-b(X),c(Y). and b(X):- …; Other_rules.rdf holds c(Z):- …; a missing isDefinedBy means 'here'.)
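The traversal of Figure 2 can be sketched in plain Prolog. This is only an illustration under stated assumptions: the predicate names defined_in/2, uses/2, reachable/2 and files_needed/2 are hypothetical and not part of PRoDeF, and the facts simply restate what the figure shows. The sketch collects every predicate the goal depends on and then the set of files a wrapper would have to download.

```prolog
:- use_module(library(lists)).

% Hypothetical facts restating Figure 2 (names are assumptions, not PRoDeF's).
% defined_in(Predicate, File): target of the isDefinedBy link.
defined_in(a, 'a_in_Prolog.rdf').
defined_in(b, 'a_in_Prolog.rdf').
defined_in(c, 'Other_rules.rdf').

% uses(Predicate, BodyPredicate): predicates called in a definition body.
uses(a, b).
uses(a, c).

% reachable(+Goal, -Preds): Goal plus every predicate it depends on.
% A visited list keeps the walk finite even for recursive definitions.
reachable(Goal, Preds) :-
    reach([Goal], [], Preds).

reach([], Seen, Seen).
reach([P|Rest], Seen, Preds) :-
    (   memberchk(P, Seen)
    ->  reach(Rest, Seen, Preds)
    ;   findall(Q, uses(P, Q), Qs),
        append(Qs, Rest, Next),
        reach(Next, [P|Seen], Preds)
    ).

% files_needed(+Goal, -Files): sorted, duplicate-free list of files to fetch.
files_needed(Goal, Files) :-
    reachable(Goal, Preds),
    findall(F, (member(P, Preds), defined_in(P, F)), Fs),
    sort(Fs, Files).
```

With these facts, the query `?- files_needed(a, Files).` yields `Files = ['Other_rules.rdf', 'a_in_Prolog.rdf']`. A real wrapper would additionally have to handle locations that cannot be reached, which this sketch ignores.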
and corresponds to myPredicate(A,'This is a string constant').
Similar to the predicates, the RDF data triples are modelled as instances of ClauseWithParameters. Figure 5 shows the relation between the standard RDF modelling of RDF triples and the one used in PRoDeF. A certain triple Product01, price, 55 EUR (e.g. triple #23) is modelled in RDF with an instance of rdf:Statement with the property rdf:Predicate pointing to the property name, and in PRoDeF with an instance of ClauseWithParameters with the property predicate. The rdf:Subject is modelled in PRoDeF with the first Parameter, linked to ClauseWithParameters with the property parameters. The rdf:Object is modelled with the second Parameter, linked to the previous one with the nextParameter property.
Figure 5: The way RDF statements are aligned to PRoDeF clauses. (Diagram labels: rdf:Predicate, rdf:Object, parameters; triple 23: Product01, price, 55 EUR.)
However, the modelling and interpretation of RDF triples is primarily done by the supporting tools and not by a human user. Accordingly, the different ways of modelling the triples in RDF and PRoDeF may not affect the utility of the ontology.
Figure 4 illustrates how a piece of a Prolog program may be encoded in PRoDeF. The figure contains the sample code defining the transitive subClassOf predicate and the tree illustrating its RDF representation in PRoDeF. The tree contains two branches, RULE000 and RULE001, which correspond to the two (disjunctive) definitions of subClassOf and point to the PiR:predicate name subClassOf. RULE000 comes with the list PAR000 of PiR:parameters, containing PiR:parameter X and PiR:nextParameter Y. The PiR:body of the rule consists of the clause CLS000 pointing to the name o_triple and its parameters X, Y, and constant '...#subClassOf'. In a similar way RULE001 has its PiR:body clause CLS001 with PiR:parameters X, Z, and constant '...#subClassOf', and the PiR:nextClause CLS002 pointing to PiR:predicate subClassOf
Figure 3: The ontology for representing Prolog programs in RDF. (Diagram labels: classes PrologModule, Rule, ClauseWithParameters, HeadParameter, PredicateName, Parameter, Constant, Variable, Bags; properties calls*, goal, definitions*, export*, body, predicate, parameters, parameter, nextParameter, nextClause, connector with values , and ;, operator with values , == \== > ..., value, name, numberOfParameters; predefined PredicateName instances bagof, setof, o_triple, l_triple, findall.)
Figure 4: An example of a Prolog program being encoded in RDF. (Diagram: the RDF tree for the two subClassOf rules, with nodes RULE000, RULE001, CLS000 to CLS002, PAR000 to PAR011, CONST000 and CONST001, linked by PiR:predicate, PiR:parameters, PiR:parameter, PiR:nextParameter, PiR:body and PiR:nextClause.)
subClassOf(X,Y) :- o_triple(X,'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#subClassOf',Y).
subClassOf(X,Y) :- o_triple(X,'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#subClassOf',Z),subClassOf(Z,Y).
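To see how the two subClassOf rules above behave on data, the following sketch runs them over a small, made-up class hierarchy. The o_triple facts and the class names Laptop, Computer and Product are assumptions introduced for illustration only; the two rules are copied from the sample code.

```prolog
% Hypothetical sample data: Laptop is a subclass of Computer,
% which is a subclass of Product (not taken from the paper).
o_triple('Laptop',
         'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#subClassOf',
         'Computer').
o_triple('Computer',
         'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#subClassOf',
         'Product').

% Direct subclass link stated as an RDF triple.
subClassOf(X, Y) :-
    o_triple(X, 'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#subClassOf', Y).
% Transitive step: follow one stated link, then recurse.
subClassOf(X, Y) :-
    o_triple(X, 'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#subClassOf', Z),
    subClassOf(Z, Y).
```

The query `?- findall(C, subClassOf('Laptop', C), Cs).` gives `Cs = ['Computer', 'Product']`. Note that these rules assume an acyclic hierarchy: if the data contained a subClassOf cycle, the recursive rule would not terminate under standard Prolog resolution.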
and parameters Z and Y.
3.1 RDF facts
The facts used by the rule language should not be abstract and disconnected but need to be grounded in RDF statements.
The facts in Prolog are represented with statements of an arbitrary arity indicating that one or more string-valued concepts are in a certain relation to each other.
For example, the statement price('Product','55 EUR') denotes that something called 'Product' is in relation 'price' to something called '55 EUR'.
In RDF the facts are represented with RDF triples. Each triple of the form (object,property,value) denotes that two objects, object and value, are in relation property to each other.
For example, the previous statement can be re-written as the RDF triple ('Product',price,'55 EUR'), which may be encoded in RDF/XML as the following:
[...]
In RDF only binary facts are allowed, and thus only binary facts need to be represented in Prolog. We interpret them in a uniform way: each RDF triple (object, property, value) where value takes rdfs:Literal strings is represented with the Prolog fact l_triple(object,property,value), and a triple with an rdf:Resource value is translated into the Prolog fact o_triple(object,property,value). No other facts are allowed.
This interpretation is similar to the one used in SWI-Prolog,¹² where the result of importing an RDF file is a list of rdf(Subject, Predicate, Object) triples, where Subject is either a plain resource (an atom), or one of the terms each(URI) or prefix(URI) with the obvious meaning. Predicate is either a plain atom for explicitly non-qualified names or a term NameSpace:Name. If NameSpace is the defined RDF name space it is returned as the atom rdf. Finally, Object is a URI, a Predicate, or a term of the format literal(Value) if it takes a literal value.
¹² http://www.swi-prolog.org/packages/rdf2pl.html
3.2 Names for Predicates and Variables
Historically, Prolog imposes certain constraints on the names for predicates and variables that originate from the plain-text encoding used in Prolog programs. In standard Prolog predicate names are represented with identifiers starting with a small letter, and variable names start with a capital letter. This way of name encoding looks a bit archaic from the XML and RDF perspective.
We encode both predicates and variables as RDF objects whose rdf:IDs correspond to their identifiers (or names). These objects are easily distinguished because they are defined as instances of a certain class: PredicateName, Variable or Constant. Accordingly, we lift the restriction on the case of the first letter, and allow the use of different namespaces as qualifiers to distinguish different predicates with the same names on the Web.
In contrast to the predicates, the variables are used only locally within a single rule definition and may not be accessed from the outside.
Predicate names make sense only if there is a link to the location where they are actually defined. We use the rdfs:isDefinedBy property of a resource to denote the file where the predicate is actually defined. [?] defines rdfs:isDefinedBy as 'an instance of rdf:Property that is used to indicate a resource defining the subject resource. This property may be used to indicate an RDF vocabulary in which a resource is described' and thus it is perfectly suitable for this purpose.
3.3 Namespaces
In XML and RDF namespaces are used as qualifiers for the names, allowing two equivalent names to be distinguished [...] namespaces as parts of predicate and variable names. Namespaces matter less for variables, which are used only within predicate definitions, than for predicates, which may be accessed globally.
4 The Execution of PRoDeF Modules on the Web
The Prolog programs on the Web are executed in a different way than in classical Prolog systems, and the inference results are produced for further automatic processing rather than direct human consumption.
4.1 Modules and Goals
Prolog programs are decomposed into modules that are imported by the engine by executing the directive consult. In the Web scenario the modules may well be distributed all over the Web, and consulting a module would require prior downloading of the corresponding RDF file. Accordingly, instead of a local file name, consult needs to receive the URL of the module.
In PRoDeF each RDF data module contains two parts: the RDF data itself and a possible annotation of the rdf:RDF tag with the special pir:goal property linking it to the goal description, a set of axioms that are applicable to the data module:
here goes RDF data
The goal descriptions are special objects that define the axioms, the applicable data modules, and the interpretation of the axioms. Each goal description consists of (Figure 6):
axioms pointing to the axiom with the name property and to its definition with the rdfs:isDefinedBy property. The axioms are subclassed into PositiveAxioms and NegativeAxioms to specify whether the predicate is a positive test, whose results represent correct data, or a negative test, whose results represent incorrect data.
datasets that are applicable to the axioms. Each dataset may consist of several locations pointed to by their urls, or a query made in a repository.
Figure 6: The structure of a goal. (Diagram labels: Goal, axioms*, Axiom, PositiveAxiom, NegativeAxiom, name, datasets*, Dataset, Location, Repository, url, literal, isa.)
Quite often it happens that a certain location with a piece of a program is not accessible at the moment. What should happen if the definition of a certain predicate cannot be found? A possible solution is to provide the Prolog engine with a parameter specifying the server's behavior: to wait, to ignore the predicate, or to fail. This failure is then included in the reasoning results.
4.2 Reasoning Results
The Prolog engine and the wrapper return the following types of information:
The list of failed locations that could not be accessed, so that the data or program modules could not be downloaded;
Prolog inference result: success, failure, yes, or no;
The list of solutions in the form of tuples (x1, ..., xn) that correspond to the goal predicate with arguments (X1, ..., Xn), returned in case of success.
We naturally represent the solutions as a bag of RDF objects, each of which contains n properties with the names X1, ..., Xn and the values x1, ..., xn.
If a goal is defined as a conjunction of several predicates
goal(X1, ..., Xn) :- P1(X1, ..., Xm), ..., Pk(X1, ..., Xm)
where each Pi receives some or all of the n arguments of goal, then we may represent each resulting object as a set of objects P1, ..., Pk, each of which corresponds to one of the predicates defining the goal. It may then make sense to explicitly represent these predicates in the reasoning results and process them further separately.
Accordingly, the Prolog interpreter receives a parameter 'detail level of the results' 1, 2, ..., ∞ that specifies the number of objects representing each result, where ∞ forces the results to be fully decomposed. However, further elaboration of this scheme is a subject of further research.
5 Summary
In the paper we propose a solution for the problem of representing Prolog programs on the (Semantic) Web and dealing with distributed data modules in RDF.
A number of questions remain open:
• How the language should be restricted (or, better to say, which extensions to Prolog should be prohibited). Primarily this refers to the problems of batch execution of programs, which may not use any graphical user interface or console output;
• The definition of an interface between PRoDeF and the predicates implemented in other languages and available as web services;
• A number of issues concerning distributed program execution remain open.
Extra information together with the ontologies, examples and occasional tool support is available at the PRoDeF homepage.¹³
¹³ http://www.cs.vu.nl/~borys/PiR/
References
[Bratko, 1990] Ivan Bratko. Prolog Programming for Artificial Intelligence. Addison-Wesley, 1990.
[Broekstra et al., 2002] Jeen Broekstra, Arjohn Kampman, and Frank van Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. In Ian Horrocks and James Hendler, editors, Proceedings of the First International Semantic Web Conference (ISWC-2002), number 2342 in LNCS, pages 54–68, Sardinia, Italy, June 9-12 2002. Springer-Verlag.
[Sintek and Decker, 2001] Michael Sintek and Stefan Decker. TRIPLE - An RDF Query, Inference, and Transformation Language. In Proceedings of the Workshop on Deductive Databases and Knowledge Management (DDLP-2001), October 20-22 2001.