Implementing Customer Reception Service in Robot Cafe using Stream Reasoning and ROS based on PRINTEPS

Takeshi Morita, Yu Sugawara, Ryota Nishimura, and Takahira Yamaguchi

Faculty of Science and Technology, Keio University
3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522 Japan
{t morita, nishimura, yamaguti}@ae.keio.ac.jp
http://www.yamaguti.comp.ae.keio.ac.jp/

Abstract. We have developed PRactical INTElligent aPplicationS (PRINTEPS), a platform for developing comprehensive intelligence applications. This paper introduces an application of PRINTEPS to a customer reception service in a robot cafe. The application uses stream reasoning and the Robot Operating System (ROS) to integrate image sensing with knowledge processing. Based on this platform, we demonstrate that the behaviors of a robot in a robot cafe can be modified by changing the applicable rule sets.

Keywords: ROS, Stream Reasoning, PRINTEPS, SWRL

1 Introduction

Designing machine-human task collaboration often requires integrating image sensing technologies, which help recognize surrounding circumstances, with a rule set for the target operation. However, a major hurdle exists in connecting the two directly, because there is a huge difference in granularity between the information acquired through image sensing and that expressed by a rule set.

As a means to integrate rules and image sensing, we propose a novel method that integrates signals (dynamic sensing data) and symbols (RDF stream) via the Robot Operating System (ROS) [2] and the stream reasoning tool C-SPARQL [1], based on PRINTEPS¹ [4]. In existing approaches (e.g., KnowRob [3]), the integration of image sensing and knowledge processing is achieved simply by adding knowledge expressions such as conceptual information to the object models.
In contrast, this study attempts to integrate dynamic information (e.g., people involved in time-series changes) acquired through image sensing with static information (ontologies and business rules) by using C-SPARQL.

We conducted a case study of a robot cafe customer reception service using Pepper², an emotion-recognizing humanoid robot. As a result, a robot can use the rule sets that a person has, and we can construct a more efficient robot service system. A demo movie of the customer reception service is available on YouTube³.

¹ http://printeps.org/index en.html
² https://www.aldebaran.com/en/cool-robots/pepper

2 System Outline

[Figure 1: block diagram connecting Kinect v2 image sensing on a Windows machine, the ROS environment, stream reasoning, the ontology and business rules (IF … THEN …), event detection, speech dialog, and the speech and movement of the robot (Pepper).]
Fig. 1. System outline of customer reception service in robot cafe.

Figure 1 shows the system outline of the customer reception service in the robot cafe. A Windows machine with a Kinect v2 sends the results of sensing at the cafe entrance to the ROS environment via UDP or socket communication. Then, the information sent is analyzed based on stream reasoning, an entry event is detected, and the age and gender of the persons are estimated. The information on the event and its attributes acquired in this process is used for knowledge processing, and a robot motion program runs based on the result of the knowledge processing. The robot moves and speaks according to the instructions from the business rules. Through this series of processes, the robot responds to visitors on their arrival.
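As a rough illustration of the first step in this flow, the following Python sketch converts one sensing message into timestamped triples of the kind a stream reasoner could consume. The JSON payload layout, the field names (id, time, distance), the namespace URI, and the property names are all assumptions made for this sketch; the message format and vocabulary actually used are not specified here.

```python
import json
from typing import List, NamedTuple, Union

# Hypothetical namespace; the vocabulary actually used is not given in the text.
NS = "http://example.org/cafe#"


class TimedTriple(NamedTuple):
    """One RDF-like triple plus the time at which it was observed."""
    subject: str
    predicate: str
    obj: Union[str, float]
    timestamp: float


def datagram_to_triples(payload: bytes) -> List[TimedTriple]:
    """Decode one (assumed) JSON sensing datagram into timestamped triples."""
    msg = json.loads(payload.decode("utf-8"))
    person = NS + "person/" + str(msg["id"])
    ts = float(msg["time"])
    return [
        # The tracked face belongs to some person seen by the sensor.
        TimedTriple(person, NS + "type", NS + "Person", ts),
        # Distance between the face and the sensor (assumed to be in meters).
        TimedTriple(person, NS + "distance", float(msg["distance"]), ts),
    ]
```

In a deployment, the payload would arrive over the UDP or socket connection mentioned above; keeping the conversion a pure function makes it easy to check without any network setup.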
3 Customer Reception Service in Robot Cafe

The customer reception service mainly consists of the “customer detection” process, which detects an incoming customer by means of the Kinect sensor, and the “greeting to the customer” process, which orders the robot Pepper to greet the customer based on business rules described in the Semantic Web Rule Language (SWRL).

³ https://youtu.be/HbHHT2F2Cvo

3.1 Customer Detection Process

Figure 2 shows the C-SPARQL query for detecting a customer. At one-second intervals, this query measures the distance between Kinect and the face of every person detected within 3 s.

[Figure 2: listing of the C-SPARQL query.]
Fig. 2. C-SPARQL query for detecting customers.

The distances between Kinect and the face of a person detected within 3 s (ID: c1) are denoted chronologically as c1d1, c1d2, and c1d3. The differences c1d1-c1d2, c1d1-c1d3, and c1d2-c1d3 are computed, and the number of differences greater than 0.1 is counted. The value 0.1 was chosen based on the random error of Kinect’s depth sensor. This threshold is used to avoid erroneously detecting as a customer someone who is in front of the cafe but is not approaching it. A count exceeding 1 means that someone is approaching the cafe (Kinect); this condition is used to avoid erroneously reacting to a customer who is leaving the cafe.
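The counting step described above can be sketched in plain Python as follows. This is only an illustration of the arithmetic, not the C-SPARQL query itself; the function names are hypothetical, and the distances are assumed to be in meters and ordered from oldest to newest.

```python
from itertools import combinations
from typing import Sequence

# Threshold from the text: differences above this count as real movement.
DEPTH_NOISE = 0.1


def approach_count(distances: Sequence[float],
                   threshold: float = DEPTH_NOISE) -> int:
    """Count (earlier, later) pairs whose distance shrank by more than threshold.

    For three samples d1, d2, d3 this checks d1-d2, d1-d3, and d2-d3,
    because combinations() preserves the chronological input order.
    """
    return sum(
        1
        for earlier, later in combinations(distances, 2)
        if earlier - later > threshold
    )


def is_approaching(distances: Sequence[float]) -> bool:
    """A count exceeding 1 is taken to mean the person is approaching."""
    return approach_count(distances) > 1
```

For example, the samples 3.6, 3.3, 3.0 give three qualifying differences, whereas the reversed sequence (someone walking away) gives none, because every earlier-minus-later difference is negative.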
The average of the distances between Kinect and the faces of the people detected within 3 s is also calculated. If the average is less than ?distance_to_entrance (3.3 m), the customers are recognized as having entered the cafe, and their IDs, the aforementioned count, and the average values are returned. The value of ?distance_to_entrance is obtained with a SPARQL query over the cafe ontology, in which the distance between the location where Kinect is installed and the entrance of the cafe is predefined. Currently, the distance between a person’s face and Kinect is the only information used. However, if the sensor can obtain various attributes of a person in the future, more complex customer detection based on such information (e.g., discerning a customer from a cafe clerk based on clothing) will become possible.

3.2 Greeting to the Customer Process

Figure 3 shows the rule that the module applies when determining a greeting statement upon detecting two customers. In this study, we used the reasoning engine Pellet⁴ to apply the greeting properties and their values to the robot-class instance and to determine a greeting statement. The rule shown in Figure 3

⁴ http://clarkparsia.com/pellet

!"#$%"&'()'"#$%"&'*+,-$''#."/0#!"#$%"&'()1'$2.&'*+ -$345()/$345*+,6'55'$()5'55'$*+,7%1-$345()'"#$%"&'+,)/$345*+ $383#631.#.3"()5'55'$+,)'"#$%"&'*+,1'$2'9:;()1'$2.&'+,)5'55'$*+, 1'$2'9<3()1'$2.&'+,)/$345*+,"4=8'$>?@41#3='$1()/$345+,A*, BC,/$''#."/()5'55'$+,DE'F&3='G,