<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Streaming OWL</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Mike</forename><surname>Dean</surname></persName>
							<email>mdean@bbn.com</email>
							<affiliation key="aff0">
								<orgName type="institution">BBN Technologies</orgName>
								<address>
									<settlement>Ann Arbor</settlement>
									<region>MI</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Streaming OWL</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">550501A192D177D5EBF71A45E3D4E639</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T04:02+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Stream processing can offer significant performance and scalability advantages for many Semantic Web applications. An important OWL profile for stream processing includes single OWL statements that allow inference and/or generation of new rules with single statement bodies. This position paper discusses our experiences and ideas in this area.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>A major next step for the Semantic Web is likely to be support for streaming content, rather than focusing on static web pages and knowledge bases. In our work we've seen performance improvements of 10x or more when using streaming vs. materializing and then navigating an in-memory model for suitable applications. This is analogous in the XML world to using SAX vs. DOM. Jeremy Carroll similarly found a threefold time and space improvement over abstract syntax tree approaches in applying stream processing to recognizing OWL dialects <ref type="bibr" target="#b0">[1]</ref>.</p><p>Semantic Web streaming involves processing one RDF statement at a time, while maintaining a minimal amount of state. A useful profile of OWL can be supported by streaming, as discussed in Section 2, particularly when statements are used to generate rules with single-statement bodies. In keeping with the 2-character OWL 2 profile <ref type="bibr" target="#b1">[2]</ref> naming convention, we might call such a streaming profile OWL SL (which also avoids confusion with OWL-S). Section 2 details OWL SL, while Section 3 describes previous work that led up to these ideas, Section 4 discusses a prototype implementation using DERI Pipes, and Section 5 offers a generalization. Section 6 discusses additional work we plan to pursue, and Section 7 concludes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">OWL SL</head><p>Table <ref type="table" target="#tab_0">1</ref> shows the constructs in OWL SL. This is another example of "RDFS plus a little bit of OWL" that many Semantic Web content developers have found useful. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Related Work</head><p>Early in the DARPA Agent Markup Language (DAML) program I developed dumpont<ref type="foot" target="#foot_0">1</ref>, a program that provides a view of OWL class and property hierarchies while depicting restrictions using a representation that's basically a combination of Java method signatures and Kleene regular expressions. Compared to ontology browsers that focus on a single class at a time, dumpont provides an effective means of "seeing the forest for the trees". We periodically found processes consuming excessive CPU time on the www.daml.org system hosting the dumpont web service. This was usually caused by people trying to run dumpont on a large ontology such as OpenCyc. Converting the program from internalizing a model to a streaming implementation using Jena's ARP parser alleviated the problem.</p><p>Around 2003, I added inference support to our DAML DB triple store<ref type="foot" target="#foot_1">2</ref> (which is now available in open source as Parliament<ref type="foot" target="#foot_2">3</ref>) by adding a simple rule engine limited to single-statement bodies (which avoided any need for unification or query optimization). Triggers were set on non-variable subjects, predicates (other than rdf:type, unless it was the only non-variable) and objects that appeared in rules. Rules were generated on the fly and maintained only in memory. The idea was to generate a large number of very specific rules rather than employ a small number of more general and complex rules <ref type="bibr" target="#b2">[3]</ref>. The application that motivated this work had a knowledge base that included a "reference load" data set of about 1 million statements plus regularly incoming triples from natural language extraction of web pages. 
The reference load happened to include a largely unused OWL version of the United Nations Standard Products and Services Code (UNSPSC), which included about 65,000 rdfs:subClassOf statements, each of which generated 2 in-memory rules with associated triggers. DAML DB still started up in a few seconds on a commodity server, so we never bothered to remove UNSPSC. It turns out that the types of rules and techniques we used here are exactly what's needed for stream processing.</p><p>Recently, in performing an analysis of the 2008 and 2009 Billion Triples Challenge corpora, I found a 5-10x increase in performance using stream processing <ref type="bibr" target="#b3">[4]</ref>.</p><p>Other people are also getting interested in streaming of Semantic Web and other content. DERI Pipes <ref type="bibr" target="#b4">[5]</ref> provides a research framework and graphical interface for stream processing of Semantic Web and other data. IBM System S provides a highly scalable but non-semantic streaming infrastructure. Streambase and other Complex Event Processing engines provide stream processing for tuples. Brad Allen proposed using Atom for distributing RDF content <ref type="bibr" target="#b5">[6]</ref> at SemTech 2007, while Nova Spivack has recently blogged<ref type="foot" target="#foot_3">4</ref> and twittered about the Stream replacing the Web.</p></div>
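<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the single-statement-body idea concrete, here is a minimal Python sketch of such a rule engine. It is not the DAML DB code: the class and function names are hypothetical, statements are plain 3-tuples of strings, and the particular pair of rules generated for each rdfs:subClassOf statement is our assumption about what a reasonable pair might be. Because each rule body is a single pattern, matching is a position-by-position comparison with no unification across patterns.</p>
<p><code lang="python">
```python
from collections import defaultdict, deque

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def is_var(term):
    """Terms that start with '?' are variables; all others are constants."""
    return term.startswith("?")


def match(pattern, statement):
    """Match a single pattern against a single statement (no unification)."""
    bindings = {}
    for want, got in zip(pattern, statement):
        if is_var(want):
            if bindings.setdefault(want, got) != got:
                return None
        elif want != got:
            return None
    return bindings


class RuleEngine:
    """Hypothetical engine whose rules all have single-statement bodies."""

    def __init__(self):
        self.triggers = defaultdict(set)  # (position, constant) maps to rules
        self.seen = set()                 # stands in for the triple store

    def add_rule(self, body, head):
        constants = [(i, t) for i, t in enumerate(body) if not is_var(t)]
        # Skip the rdf:type predicate unless it is the only constant.
        keys = [k for k in constants if k != (1, RDF_TYPE)] or constants
        for key in keys:
            self.triggers[key].add((body, head))

    def process(self, statement):
        """Assert one statement; return it plus everything inferred from it."""
        added, queue = [], deque([statement])
        while queue:
            stmt = queue.popleft()
            if stmt in self.seen:
                continue
            self.seen.add(stmt)
            added.append(stmt)
            sub, pred, obj = stmt
            if pred == SUBCLASS:
                # Assumed pair of rules generated per rdfs:subClassOf statement.
                self.add_rule(("?x", RDF_TYPE, sub), ("?x", RDF_TYPE, obj))
                self.add_rule(("?c", SUBCLASS, sub), ("?c", SUBCLASS, obj))
            candidates = set()
            for key in enumerate(stmt):
                candidates |= self.triggers.get(key, set())
            for body, head in candidates:
                bindings = match(body, stmt)
                if bindings is not None:
                    queue.append(tuple(bindings.get(t, t) for t in head))
        return added
```
</code></p>
<p>After asserting that ex:Dog is a subclass of ex:Mammal and ex:Mammal is a subclass of ex:Animal, processing ("ex:rex", "rdf:type", "ex:Dog") returns that statement followed by the inferred ex:Mammal and ex:Animal typings. Rules are not applied retroactively in this sketch, so a statement only benefits from the rules that exist when it arrives.</p></div>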
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Prototype Implementation</head><p>We're developing a prototype DERI Pipes<ref type="foot" target="#foot_4">5</ref> operator that embodies these ideas and will report on it at the workshop. The basic approach is to check each incoming statement for each of the OWL SL constructs and execute code that either adds to the internal state (e.g. for rdfs:subClassOf) or that infers additional statements (e.g. for rdf:type).</p></div>
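<div xmlns="http://www.tei-c.org/ns/1.0"><p>The basic approach might look roughly like the following Python sketch. This is not the DERI Pipes operator itself and uses no DERI Pipes API: OwlSlOperator, push, and the tuple-of-strings statement representation are all hypothetical names chosen for illustration. Schema-level statements only update internal state, while every other statement is checked against that state to infer additional statements, which are then fed back through the same checks.</p>
<p><code lang="python">
```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SCHEMA = ("rdfs:domain", "rdfs:range", "rdfs:subClassOf",
          "rdfs:subPropertyOf", "owl:inverseOf")


class OwlSlOperator:
    """Hypothetical operator: push one statement in, get statements out."""

    def __init__(self):
        # Internal state holds schema-level statements only.
        self.schema = {p: defaultdict(set) for p in SCHEMA}
        self.symmetric = set()

    def _lookup(self, kind, key):
        # Sorted so that the output order is reproducible.
        return sorted(self.schema[kind].get(key, ()))

    def _infer(self, stmt):
        sub, pred, obj = stmt
        if pred in self.schema:
            # Schema statement: add to internal state, infer nothing.
            self.schema[pred][sub].add(obj)
            if pred == "owl:inverseOf":
                self.schema[pred][obj].add(sub)
            return []
        out = []
        if pred == RDF_TYPE:
            if obj == "owl:SymmetricProperty":
                self.symmetric.add(sub)
            out += [(sub, RDF_TYPE, c)
                    for c in self._lookup("rdfs:subClassOf", obj)]
        out += [(sub, RDF_TYPE, c) for c in self._lookup("rdfs:domain", pred)]
        out += [(obj, RDF_TYPE, c) for c in self._lookup("rdfs:range", pred)]
        out += [(sub, q, obj)
                for q in self._lookup("rdfs:subPropertyOf", pred)]
        out += [(obj, q, sub) for q in self._lookup("owl:inverseOf", pred)]
        if pred in self.symmetric:
            out.append((obj, pred, sub))
        return out

    def push(self, statement):
        """Return the statement followed by everything inferred from it."""
        emitted, pending = [], [statement]
        while pending:
            stmt = pending.pop(0)
            if stmt not in emitted:
                emitted.append(stmt)
                pending.extend(self._infer(stmt))
        return emitted
```
</code></p>
<p>The sketch keeps no instance data between calls, so the only state that grows is the schema. The price is that schema statements must arrive before the instance statements they apply to, and duplicates are suppressed only within a single call to push. Every term is treated as an opaque string, so a real operator would also need to avoid typing literals via rdfs:range.</p></div>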
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">More General Streaming</head><p>In many streaming applications, statements are likely to come in batches (e.g. from updated web pages) rather than just one at a time. In this case, it's likely that certain constructs (e.g. OWL Restrictions) will be grouped together. Making this assumption allows us to also add owl:allValuesFrom and owl:hasValue to an extended version of OWL SL, which might be called OWL SL*.</p></div>
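<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of why the batching assumption helps, the Python sketch below handles only owl:hasValue. It assumes that the owl:onProperty and owl:hasValue statements describing a restriction arrive in the same batch, so nothing about a half-seen restriction has to be remembered across batches. The function name, the tuple representation, and the example vocabulary (ex:madeIn, ex:Italy) are hypothetical. Each complete restriction yields two rules with single-statement bodies, written as (body, head) pairs, following the usual reading of owl:hasValue in both directions.</p>
<p><code lang="python">
```python
def has_value_rules(batch):
    """Build single-statement rules from owl:hasValue restrictions in a batch."""
    on_property, has_value = {}, {}
    for sub, pred, obj in batch:
        if pred == "owl:onProperty":
            on_property[sub] = obj
        elif pred == "owl:hasValue":
            has_value[sub] = obj
    rules = []
    for node, prop in on_property.items():
        if node in has_value:  # both halves arrived in this batch
            value = has_value[node]
            # Having the value implies membership in the restriction class.
            rules.append((("?x", prop, value), ("?x", "rdf:type", node)))
            # Membership in the restriction class implies having the value.
            rules.append((("?x", "rdf:type", node), ("?x", prop, value)))
    return rules


batch = [
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("_:r1", "owl:onProperty", "ex:madeIn"),
    ("_:r1", "owl:hasValue", "ex:Italy"),
    ("ex:ItalianProduct", "rdfs:subClassOf", "_:r1"),
]
rules = has_value_rules(batch)
```
</code></p>
<p>A restriction whose two defining statements are split across batches is simply ignored here, which is exactly the case the grouping assumption is meant to rule out. owl:allValuesFrom is not covered by this sketch.</p></div>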
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Knowledge Streams</head><p>We've been developing a concept we call Knowledge Streams, which is depicted in Figure <ref type="figure" target="#fig_0">1</ref> (from <ref type="bibr" target="#b6">[7]</ref>). This shows stream networks for 2 overlapping Communities of Interest (each likely using their own ontologies), with nodes (operators) providing filtering, translation, augmentation (enrichment), aggregation, alerting, inference, and other services. OWL SL could well be used for the inference operator.</p><p>Knowledge Streams can also be viewed as a step toward Semantic Complex Event Processing based on triples rather than tuples.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Conclusions</head><p>We've identified a profile of OWL, which we call OWL SL, that's suitable for stream processing of RDF and OWL content. We hope we've also gotten other people excited about the prospects for stream processing of active Semantic Web content.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1.</head><label>1</label><figDesc>Knowledge Streams</figDesc><graphic coords="4,134.85,224.67,396.00,306.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1.</head><label>1</label><figDesc>OWL SL Constructs</figDesc><table><row><cell>rdf:type</cell></row><row><cell>rdfs:domain</cell></row><row><cell>rdfs:range</cell></row><row><cell>rdfs:subClassOf</cell></row><row><cell>rdfs:subPropertyOf</cell></row><row><cell>owl:inverseOf</cell></row><row><cell>owl:SymmetricProperty</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://www.daml.org/2001/03/dumpont/, http://www.daml.org/2003/09/dumpont/, and http://semwebcentral.org/projects/dumpont/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://www.daml.org/2001/09/damldb/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">http://parliament.semwebcentral.org</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">http://www.twine.com/item/128lryv9z-46/is-the-stream-the-next-new-metaphor</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">http://pipes.deri.org</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Streaming OWL DL</title>
		<author>
			<persName><forename type="first">J</forename><surname>Carroll</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. First European Semantic Web Symposium (ESWS 2004)</title>
				<meeting>First European Semantic Web Symposium (ESWS 2004)<address><addrLine>Heraklion, Crete</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2004-05">May 2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">OWL 2 Web Ontology Language Profiles</title>
		<author>
			<persName><forename type="first">B</forename><surname>Motik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Cuenca Grau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Horrocks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fokoue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Lutz</surname></persName>
		</author>
		<ptr target="http://www.w3.org/TR/2009/CR-owl2-profiles-20090611/" />
	</analytic>
	<monogr>
		<title level="m">W3C Candidate Recommendation</title>
				<imprint>
			<date type="published" when="2009-06-11">11 June 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Semantic Web Rules: Covering the Use Cases</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dean</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. 3rd Intl. Workshop on Rules and Rule Markup Languages for the Semantic Web (RuleML 2004)</title>
		<title level="s">Springer LNCS</title>
		<meeting>3rd Intl. Workshop on Rules and Rule Markup Languages for the Semantic Web (RuleML 2004)<address><addrLine>Hiroshima, Japan</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2004-10">October 2004</date>
			<biblScope unit="volume">3323</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">How is the Semantic Web Being Used?: An Analysis of the Billion Triples Challenge Corpus</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dean</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">5th Semantic Technology Conference</title>
				<meeting><address><addrLine>San Jose, California</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2009-05">May 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Rapid Prototyping of Semantic Mash-Ups through Semantic Web Pipes</title>
		<author>
			<persName><forename type="first">D</forename><surname>Le-Phuoc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Polleres</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Morbidoni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hauswirth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Tummarello</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. 18th World Wide Web Conference (WWW2009)</title>
				<meeting>18th World Wide Web Conference (WWW2009)<address><addrLine>Madrid, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2009-04">April 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A Semantic Web Without RDF/XML: Building RDF Applications in Atom</title>
		<author>
			<persName><forename type="first">B</forename><surname>Allen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">3rd Semantic Technology Conference</title>
				<meeting><address><addrLine>San Jose, California</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2007-05">May 2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Semantic Web @ BBN</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hebeler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">5th Semantic Technology Conference</title>
				<meeting><address><addrLine>San Jose, California</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2009-05">May 2009</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
