Ontology-Based Introspection in Support of Stream Reasoning

Daniel de Leng and Fredrik Heintz
Department of Computer and Information Science
Linköping University, 581 83 Linköping, Sweden
{daniel.de.leng, fredrik.heintz}@liu.se

Abstract

Building complex systems such as autonomous robots usually requires the integration of a wide variety of components, including high-level reasoning functionalities. One important challenge is integrating the information in a system by setting up the data flow between the components. This paper extends our earlier work on semantic matching with support for adaptive on-demand semantic information integration based on ontology-based introspection. We take two important standpoints. First, we consider streams of information, to handle the fact that information often becomes continually and incrementally available. Second, we explicitly represent the semantics of the components and the information that can be provided by them in an ontology. Based on the ontology, our custom-made stream configuration planner automatically sets up the stream processing needed to generate the streams of information requested. Furthermore, subscribers are notified when the properties of a stream change, which allows them to adapt accordingly. Since the ontology represents both the system's information about the world and its internal stream processing, many other powerful forms of introspection are also made possible. The proposed semantic matching functionality is part of the DyKnow stream reasoning framework and has been integrated in the Robot Operating System (ROS).

1 Introduction

Building complex systems such as autonomous robots usually requires the integration of a wide variety of components, including high-level reasoning functionalities. This integration is usually done ad hoc for each particular system. A large part of the integration effort is to make sure that each component has the information it needs, in the form it needs it, when it needs it, by setting up the data flow between components. Since most of this information becomes incrementally available at run-time, it is natural to model the flow of information as a set of streams. As the number of sensors and other sources of streams increases, there is a growing need for incremental reasoning over streams to draw relevant conclusions and react to new situations with minimal delays. We call such reasoning stream reasoning. Reasoning over incrementally available information is needed to support important functionalities such as situation awareness, execution monitoring, and planning.
When handling a large number of streams, it can be difficult to keep track of the semantics of individual streams and how they relate. The same information query further requires different configurations for different systems. Such a manual task leaves ample room for programmer errors, such as misspelled stream names, incorrect stream configurations, and misunderstood stream content semantics. Furthermore, if the indefinite continuation of a stream cannot be guaranteed, manual reconfiguration may be necessary at run-time, further increasing the risk of errors.

In this paper we extend earlier work on semantic matching (Heintz and de Leng 2013), where we introduced support for generating indirectly-available streams based on features. The extension focuses on ontology-based introspection for supporting adaptive on-demand semantic information integration. The basis for our approach is an ontology which represents the relevant concepts in the application domain, the stream processing capabilities of the system, and the information currently generated by the system in terms of the application-dependent concepts. Relevant concepts are for example the objects, sorts and features which the system wants to reason about. Semantic matching uses the ontology to compute a specification of the stream processing needed to generate the requested streams of information. It is for example possible to request the speed of a particular object, which requires generating a stream of GPS coordinates of that object which are then filtered in order to generate a stream containing the estimated speed of the object. Figure 1 shows an overview of the approach. The semantic matching is done by the Semantics Manager (Sec. 4) and the stream processing is done by the Stream Processing Engine (Sec. 3).

[Figure 1: High-level overview of our approach.]

Semantic matching allows for the automatic generation of indirectly-available streams, the handling of cases where there exist multiple applicable streams, support for coping with the loss of a stream, and introspection of the set of available and potential streams. We have for example used semantic matching to support metric temporal logic (MTL) reasoning (Koymans 1990) over streams for collaborative unmanned aircraft missions. Our work also extends the stream processing capabilities of our framework. In particular, this includes ontology-based introspection to support domain-specific reasoning at multiple levels of abstraction.

The proposed semantic matching functionality is integrated with the DyKnow stream reasoning framework (Heintz and Doherty 2004; Heintz 2009; Heintz, Kvarnström, and Doherty 2010; Heintz 2013), which provides functionality for processing streams of information and has been integrated in the Robot Operating System (ROS) (Quigley et al. 2009). DyKnow is related to both Data Stream Management Systems and Complex Event Processing (Cugola and Margara 2012).
The approach is general and can be used with other stream processing systems.

The remainder of this paper is organized as follows. Section 2 starts off by putting the presented ideas in the context of similar and related efforts. In Section 3, we give an introduction to the underlying stream processing framework. This is a prelude to Section 4, which describes the details of our approach and highlights functionality of interest made possible as a result of semantic matching. The paper concludes in Section 5 with a discussion of the introduced concepts and future work.

2 Related Work

Our approach is in line with recent work on semantic modeling of sensors (Goodwin and Russomanno 2009; Russomanno, Kothari, and Thomas 2005) and work on semantic annotation of observations for the Semantic Sensor Web (Bröring et al. 2011; Sheth, Henson, and Sahoo 2008; Botts et al. 2008). An interesting approach is a publish/subscribe model for a sensor network based on semantic matching (Bröring et al. 2011). The matching is done by creating an ontology for each sensor based on its characteristics and an ontology for the requested service. If the sensor and service ontologies align, then the sensor provides relevant data for the service. This is a complex approach which requires significant semantic modeling and reasoning to match sensors to services. Our approach is more direct and avoids most of the overhead. Our approach also bears some similarity to the work by Whitehouse, Zhao, and Liu (2006), as both use stream-based reasoning and are inspired by semantic web services. One major difference is that we represent the domain using an ontology while they use a logic-based markup language that supports 'is-a' statements.

In the robotic domain, the discussed problem is sometimes called self-configuration and is closely related to task allocation. The work by Tang and Parker (2005) on ASyMTRe is an example of a system geared towards the automatic self-configuration of robot resources in order to execute a certain task. Similar work was performed by Lundh, Karlsson and Saffiotti (2008) related to the Ecology of Physically Embedded Intelligent Systems (Saffiotti et al. 2008), also called the PEIS-ecology. Lundh et al. developed a formalisation of the configuration problem, where configurations can be regarded as graphs of functionalities (vertices) and channels (edges), and where configurations have a cost measure. This is similar to considering actors and streams, respectively. A functionality is described by its name, preconditions, postconditions, inputs, outputs and cost. Given a high-level goal described as a task, a configuration planner is used to configure a collection of robots towards the execution of the task. Some major differences between the work by Lundh et al. and the work on semantic information integration with DyKnow are that the descriptions of transformations are done semantically with the help of an ontology. Further, DyKnow makes use of streams of incrementally available information rather than the shared tuples used by channels. The configuration planner presented by Lundh et al. assumes full knowledge of the participating agents' capabilities and acts as an authority outside of the individual agents, whereas we assume full autonomy of agents and make no assumptions about the knowledge of agents' capabilities. Configuration planning further shares some similarities with efforts in the area of knowledge-based planning, where the focus is not on the actions to be performed but on the internal knowledge state. In a broader context, the presented ideas are in line with a trend that moves away from the how and towards the what. Content-centric networks (CCN) seek to allow users to simply specify what data resource they are interested in, and let the network handle the localisation and retrieval of that data resource. In the database community, the problem of self-configuration is somewhat similar to the handling of distributed data sources such as ontologies. The local-as-view and global-as-view approaches (Lenzerini 2002) both seek to provide a single interface that performs any necessary query rewriting and optimisation.

The approach presented here extends previous work (Heintz and Dragisic 2012; Heintz and de Leng 2013; Heintz 2013) where the annotation was done in a separate XML-based language. This is a significant improvement since now both the system's information about the world and its internal stream processing are represented in a single ontology. This allows many powerful forms of introspective reasoning, of which semantic matching is one.

3 Stream Processing with DyKnow

Stream processing is the basis for our approach to semantic information integration. It is used for generating streams by, for example, importing, synchronizing and transforming streams. A stream is a named sequence of incrementally available, time-stamped samples, each containing a set of named values. Streams are generated by stream processing engines based on declarative specifications.

3.1 Representing Information Flows

Streams are regarded as fundamental entities in DyKnow. For any given system, we call the set of active streams the stream space S ⊆ S*, where S* is the set of all possible streams, the stream universe. A sample is represented as a tuple ⟨t_a, t_v, ~v⟩, where t_a represents the time at which the sample became available, t_v represents the time for which the sample is valid, and ~v represents a vector of values. A special kind of stream is the constant stream, which only contains one sample. The execution of an information flow processing system is described by a series of stream space transitions S_t0 ⇒ S_t1 ⇒ ··· ⇒ S_tn. Here S_t represents a stream space at time t such that every sample in every stream in S_t has an available time t_a ≤ t.
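To make the notation concrete, consider a purely illustrative GPS position stream; the stream name, timestamps and values below are invented for illustration and are not taken from the original:

    s_gps = ⟨ (t_a = 1.2, t_v = 1.0, ~v = (58.40, 15.57)),
              (t_a = 2.3, t_v = 2.0, ~v = (58.41, 15.58)), ... ⟩

If s_gps is the only active stream, the execution is described by the transitions S_1.2 ⇒ S_2.3 ⇒ ···, where in S_1.2 the stream s_gps contains only its first sample, while in S_2.3 it also contains the second, since every sample in every stream of S_t must have an available time t_a ≤ t.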
Transformations in this context are stream-generating functions that take streams as arguments. They are associated with an identifying label and a specification determining their parameters, which decouples the implementation of transformations from the stream processing functionality. A transformation thus corresponds to the combination of an implementation and parameters for that implementation. This means that for a given implementation there might exist multiple transformations, each using different parameters for the implementation.

When a transformation is instantiated, the instance is called a computational unit. This instantiation is performed by the stream processing engine. A computational unit is associated with a number of input and output streams, and is able to replace its input and output streams at will. A computational unit with zero input streams is called a source. An example of a source is a sensor interface that takes raw sensor data and streams this data. Conversely, computational units with zero output streams are called sinks. An example of a sink is a storage unit, or a unit that is used to control the agent hosting the system, such as an unmanned aerial vehicle (UAV).

DyKnow's stream processing engine, as shown in Figure 1, is responsible for manipulating the stream space based on declarative specifications, and thereby plays a key role as the foundation for the stream reasoning framework.

3.2 Configurations in DyKnow

A configuration represents the state of the stream processing system in terms of computational units and the streams connecting them. The configuration can be changed through the use of declarative stream specifications. An example of a stream specification is shown in Listing 1; it describes a configuration for producing a stream of locations for detected humans.

Listing 1: Configuration specification format

[Only fragments of the 25-line listing survive in this extract: it is an XML document whose root element references the schema http://www.dyknow.eu/config.xsd and the namespace xmlns:spec="http://www.dyknow.eu/ontology#Specification", and whose body consists of nested cu elements.]
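Because the listing itself is only partially legible, the following is a minimal illustrative sketch of what such a specification tree might look like for the toy example described below. Apart from the cu tag, the label attribute, the output stream name result, and the schema and namespace references taken from the surviving fragments, the element and attribute names and the transformation labels here are invented for illustration and should not be read as the paper's actual listing.

    <?xml version="1.0"?>
    <specification xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:noNamespaceSchemaLocation="http://www.dyknow.eu/config.xsd"
                   xmlns:spec="http://www.dyknow.eu/ontology#Specification">
      <!-- Each cu element declares a computational unit; its label names the
           transformation to instantiate, and its child cu elements are its inputs. -->
      <cu label="human_location_estimator" name="result">
        <cu label="gps_to_coordinates">
          <cu label="gps_source"/>        <!-- source: takes no input streams -->
        </cu>
        <cu label="rgb_video_source"/>    <!-- source -->
        <cu label="ir_video_source"/>     <!-- source -->
      </cu>
    </specification>

In such a sketch, only the root computational unit names its output stream explicitly (result); the remaining streams would be given unique internal names by DyKnow, as explained below.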
The shown specification can be executed by the stream processing engine, which instantiates the declared computational units and connects them according to the specification. In the example shown here, we make use of an XML-based specification tree, where the children of every tree node represent the inputs for that computational unit. The cu tag is used to indicate a computational unit, which may be a source taking no input streams. A computational unit produces at most one stream, and this output stream can thus be used as an input stream for other computational units. Indeed, only one computational unit explicitly defines the output stream name as result. When no explicit name is given, DyKnow assigns a unique name for internal bookkeeping. Note that every cu tag has a label associated with it. This label represents the transformation used to instantiate the computational unit, which is then given a unique name by DyKnow as well. As long as a transformation label is associated with an implementation and parameter settings, the stream processing engine is able to use this information to perform the instantiation. In this toy example, the specification tree uses a GPS to infer coordinates, and combines this with RGB and infrared video data to provide the coordinates of some entities detected in the video data. Since DyKnow has been implemented in ROS, currently only Nodelet-based implementations are supported.

The result of the stream declaration is that the stream processing engine instantiates the necessary transformations and automatically assigns the necessary subscriptions for the result stream to be executed. Additionally, it uses its own /status stream to inform subscribers when it instantiates a