<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>MSM</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Making Sense of Location-based Micro-posts Using Stream Reasoning</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>CEFRIEL - ICT Institute, Politecnico di Milano</institution>
          ,
          <addr-line>Milano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dip. di Elettronica e dell'Informazione - Politecnico di Milano</institution>
          ,
          <addr-line>Milano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>SIEMENS AG, Corporate Technology</institution>
          ,
          <addr-line>Muenchen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Saltlux</institution>
          ,
          <addr-line>Seoul</addr-line>
          ,
          <country country="KR">Korea</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <volume>1</volume>
      <fpage>13</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>Consider an urban environment and think of its semi-public realms (e.g., shops, bars, visitor attractions, means of transportation). Who is the maven of a district? How fast and how broadly can such a maven influence the opinions of others? These are just a few of the questions BOTTARI (our Location-based Social Media Analysis mobile app) is getting ready to answer. In this position paper, we recap our investigation of deductive and inductive stream reasoning for social media analysis, and we show how the results of this research form the underpinning of BOTTARI.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Continuous processing of information flows (i.e. data streams) has been widely
investigated in the database community [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In contrast, continuous processing of
data streams together with rich background knowledge requires semantic
reasoners, but, so far, semantic technologies are still focusing on rather static data. We
strongly believe that there is a need to close this gap between existing solutions
for belief update and the actual need of supporting decision making based on
data streams and rich background knowledge. We named this little explored, yet
high-impact research area Stream Reasoning [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The foundation for Stream
Reasoning has been investigated by introducing technologies for wrapping and
querying streams in the RDF data format (e.g., using C-SPARQL [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]) and by
supporting simple forms of reasoning [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] or query rewriting [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        We are developing the Stream Reasoning vision on top of LarKC [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The
LarKC platform aims to reason over massive heterogeneous information such
as social media data. The platform consists of a framework to build workflows,
i.e. sequences of connected components (plug-ins) able to consume and process
data. Each plug-in exploits techniques and heuristics from diverse areas such as
databases, machine learning and the Semantic Web.
We built our Stream Reasoning system by embedding a deductive reasoner and
an inductive reasoner within the LarKC architecture (see Figure 1). First,
BOTTARI pre-processes the micro-posts by extracting information<sup>5</sup> on whether a
micro-post expresses a positive or a negative feeling of its author about a certain
point of interest (POI).
After the BOTTARI data arrive at the stream reasoner as a set of data streams,
a selection plug-in extracts the relevant data from each input stream in the form of
windows. A second plug-in abstracts the window content from fine-grained data
streams into aggregated events and produces RDF streams. Then, a deductive
reasoner plug-in registers C-SPARQL queries, whose results can be of
immediate use (cf. Section 4) or can be processed by two further sub-workflows.
Each sub-workflow consists of an abstracter and an inductive reasoner,
which uses an extended version of SPARQL that supports probabilities [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
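      <p>The window extraction performed by the selection plug-in can be pictured with a minimal Python sketch. It is not the LarKC plug-in itself: the Post record, the sliding_windows function and the sample data are hypothetical names introduced here for illustration, and the choice of a window that excludes its start instant and includes its end instant is an assumption.</p>
      <preformat>
```python
from datetime import datetime, timedelta
from typing import Iterator, List, NamedTuple, Tuple


class Post(NamedTuple):
    """Hypothetical micro-post record; the field names are illustrative only."""
    author: str
    poi: str
    timestamp: datetime


def sliding_windows(
    posts: List[Post],
    start: datetime,
    end: datetime,
    window_range: timedelta,
    step: timedelta,
) -> Iterator[Tuple[datetime, List[Post]]]:
    """Yield (window_end, window_content) once per step between start and end.

    Assumption: a window closing at time t holds the posts whose timestamp lies
    after t - window_range and up to and including t.
    """
    ordered = sorted(posts, key=lambda post: post.timestamp)
    window_end = start + step
    while end >= window_end:
        window_start = window_end - window_range
        content = [p for p in ordered if window_end >= p.timestamp > window_start]
        yield window_end, content
        window_end += step


# Illustrative data: three made-up posts, a 30-minute window sliding every 5 minutes.
t0 = datetime(2011, 5, 30, 12, 0)
sample = [
    Post("alice", "poi:cafe", t0 + timedelta(minutes=2)),
    Post("bob", "poi:cafe", t0 + timedelta(minutes=7)),
    Post("carol", "poi:park", t0 + timedelta(minutes=33)),
]
for closed_at, content in sliding_windows(
    sample, t0, t0 + timedelta(minutes=40), timedelta(minutes=30), timedelta(minutes=5)
):
    print(closed_at.strftime("%H:%M"), [p.author for p in content])
```
      </preformat>
      <p>Under these assumptions a post stays visible for six consecutive evaluations (30 minutes of range divided by 5 minutes of step) before it drops out of the window.</p>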
    </sec>
    <sec id="sec-2">
      <title>The BOTTARI mobile app</title>
      <p>The BOTTARI mobile app is a location-based service that exploits the social
context to provide relevant content to the user in a specific geographic location.
The purpose of the BOTTARI service is to provide recommendations on local
context information to users through an augmented reality interface. BOTTARI
gives detailed information on local POIs, including trust or reputation
information. In Figure 2, we provide some sample screenshots of how the BOTTARI
mobile application will look once completed.</p>
      <p><sup>5</sup> This technology is a Saltlux trade secret.</p>
      <p>The input data for the BOTTARI service come from public social networks
and location-based services (Twitter, local blogs and Korean news); they are converted
into RDF streams and then processed and analysed by the system described
in Section 2. The RDF-ized data are modelled with respect to the ontology
represented in Figure 3, which is an extension of the SIOC vocabulary [<xref ref-type="bibr" rid="ref10">10</xref>]. Our
model takes into account the specific relations of Twitter (followers/following,
reply/retweet); it adds the geographical perspective by modelling the POIs; and it
includes the “reputation” information by means of positive/negative reviews.</p>
    </sec>
    <sec id="sec-3">
      <title>Computing Answers to User Questions</title>
      <p>The hybrid Stream Reasoning solution we are developing is able to answer
questions like: Who are the opinion makers (i.e., the users who are likely to
influence the behaviour of their followers with regard to a certain POI)? How
fast and how wide are opinions spreading? Who shall I follow to be informed
about a given category of POIs in this neighbourhood?</p>
      <p>In the rest of the section we show how to issue the three queries above using
C-SPARQL and SPARQL with probabilities.</p>
      <p>Who are the opinion makers?</p>
      <p>Lines 1 and 3 of the following listing tell the C-SPARQL engine to register
the continuous query on the stream of micro-posts generated by BOTTARI,
considering a sliding window of 30 minutes that slides every 5 minutes. Line 2
tells the engine to generate an RDF stream as output, reporting who
the opinion makers for a certain POI are and whether they rate it positively or
negatively.</p>
      <preformat>
 1. REGISTER STREAM OpinionMakers COMPUTED EVERY 5m AS
 2. CONSTRUCT { ?opinionMaker a twd:opinionMaker ; twd:discuss [ ?opinion ?poi ] . }
 3. FROM STREAM &lt;http://bottari.saltlux.com/posts&gt; [RANGE 30m STEP 5m]
 4. WHERE {
 5.   ?opinionMaker a twd:TwitterUser ;
 6.                 twd:posts [ ?opinion ?poi ] .
 7.   ?follower sioc:follows ?opinionMaker ;
 8.             twd:posts [ ?opinion ?poi ] .
 9.   FILTER ( cs:timestamp(?follower) &gt; cs:timestamp(?opinionMaker)
10.            &amp;&amp; ?opinion != twd:talksAbout )
11. }
12. HAVING ( COUNT(DISTINCT ?follower) &gt; 10 )
</preformat>
      <p>The basic triple pattern (BTP) at lines 5 and 6 matches micro-posts of the
potential opinion makers with a POI. The variable ?opinion can match one of
the properties talksAbout, talksAboutPositively, or
talksAboutNegatively. The BTP at lines 7–8 looks up the followers of the opinion makers. The
FILTER clause at line 9 checks whether the micro-posts of the followers, which
talk about the same POI, occur after those of the opinion makers. At line
10 the query filters out actions of type twd:talksAbout and concentrates on
micro-posts clearly discussing a POI in a positive or negative way. Finally, at
line 12 the HAVING clause promotes the true opinion makers, i.e. those with
more than ten followers who expressed the same opinion about the POI after them.</p>
      <p>#MSM2011</p>
      <p>How fast and how wide are opinions spreading?</p>
      <p>
        Using the RDF stream computed by the previous query, the query in the
following listing reports how widely the micro-posts of an opinion maker
spread within half an hour. To do so, it considers the reply and re-tweet
relationships among tweets (i.e., tweets linked by the discuss property in
the BOTTARI data model). Since discuss is a transitive property, the C-SPARQL engine
uses the materialization technique presented in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] to incrementally compute the
transitive closure of discuss.
      </p>
      <preformat>
 1. REGISTER STREAM OpinionSpreading COMPUTED EVERY 30s AS
 2. SELECT ?user ?opinionMakerTweet count(?aPositiveTweet) count(?aNegativeTweet)
 3. FROM STREAM &lt;http://bottari.saltlux.com/posts&gt; [RANGE 30m STEP 30s]
 4. FROM STREAM &lt;http://bottari.saltlux.com/OpinionMakers&gt; [RANGE 30m STEP 30s]
 5. WHERE {
 6.   ?user a twd:opinionMaker ;
 7.         twd:post ?opinionMakerTweet .
 8.   { ?aPositiveTweet a twd:Tweet ;
 9.                     twd:discuss ?opinionMakerTweet ;
10.                     twd:talksAboutPositively ?poi .
11.   } UNION {
12.     ?aNegativeTweet a twd:Tweet ;
13.                     twd:discuss ?opinionMakerTweet ;
14.                     twd:talksAboutNegatively ?poi .
15.   }
16. }
</preformat>
      <p>Lines 1, 3 and 4 tell the C-SPARQL engine to register the continuous query on
the stream of micro-posts generated by BOTTARI and on the streaming results
of the opinion makers query. In both cases, a sliding window of 30 minutes, which
slides every 30 seconds, is considered. The BTP at lines 6–7 matches the
micro-posts of the opinion makers. The BTP at lines 8–10 and the BTP at lines 12–14
look up other micro-posts that, respectively, positively and negatively discussed
those of the opinion makers. Line 2 asks the engine to generate a variable binding
reporting how many positive and negative micro-posts are discussing the
micro-posts of the current opinion makers.
      </p>
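      <p>The closure over discuss and the two counts returned by the opinion spreading query can be mimicked outside the engine as follows. This is a minimal sketch that recomputes the closure from scratch with a plain graph traversal; it is not the incremental materialization technique of [<xref ref-type="bibr" rid="ref6">6</xref>]. The function names, the (tweet, discussed tweet) edge pairs and the sentiment dictionary are assumptions made for illustration.</p>
      <preformat>
```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple


def discussing_closure(discuss_edges: Iterable[Tuple[str, str]], root: str) -> Set[str]:
    """Return every tweet that discusses `root` directly or transitively.

    `discuss_edges` holds (tweet, discussed_tweet) pairs, i.e. replies and re-tweets.
    """
    # Index the edges by the tweet being discussed.
    repliers = defaultdict(set)
    for tweet, discussed in discuss_edges:
        repliers[discussed].add(tweet)

    # Depth-first traversal; `seen` also protects against cycles.
    seen: Set[str] = set()
    frontier = [root]
    while frontier:
        current = frontier.pop()
        for tweet in repliers[current]:
            if tweet not in seen and tweet != root:
                seen.add(tweet)
                frontier.append(tweet)
    return seen


def opinion_spread(
    discuss_edges: Iterable[Tuple[str, str]],
    sentiment: Dict[str, str],
    root: str,
) -> Tuple[int, int]:
    """Count the positive and negative tweets found in the closure of `root`."""
    closure = discussing_closure(discuss_edges, root)
    counts = Counter(sentiment.get(tweet, "neutral") for tweet in closure)
    return counts["positive"], counts["negative"]


# Illustrative data: t2 and t4 discuss t1, t3 discusses t2, t5 is unrelated.
edges = [("t2", "t1"), ("t3", "t2"), ("t4", "t1"), ("t5", "t9")]
labels = {"t2": "positive", "t3": "negative", "t4": "positive", "t5": "positive"}
print(opinion_spread(edges, labels, "t1"))
```
      </preformat>
      <p>Recomputing the closure at every evaluation is acceptable for a sketch, but with a window sliding every 30 seconds an incremental approach avoids repeating most of the work.</p>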
      <p>Who shall I follow?</p>
      <p>Let us now consider a specific BOTTARI user named Giulia. In the
following listing we show a query that asks for the mavens Giulia should follow to be
informed about attractions for kids, even among people she does not know. The
system uses the social network of Giulia and the last window in the stream
generated by the query in the first listing to estimate the probability that
she would follow each of them.</p>
      <preformat>
1. SELECT ?opinionMaker ?prob
2. FROM STREAM &lt;http://bottari.saltlux.com/OpinionMakers&gt; [RANGE 30m STEP 30s]
3. WHERE {
4.   ?opinionMaker a twd:opinionMaker ;
5.                 twd:discuss [ twd:talksAboutPositively ?poi ] .
6.   ?poi skos:subject twd:attractionsForKids .
7.   :Giulia twd:following ?opinionMaker . WITH PROB ?prob
8.   FILTER ( ?prob &gt; 0.8 &amp;&amp; ?prob &lt; 1 )
9. } ORDER BY DESC(?prob)
</preformat>
      <p>The BTP at lines 4–6 matches the opinion makers that have recently
expressed positive opinions about attractions for kids. The triple pattern at line
7 matches BOTTARI users that Giulia is following. Note that the following
relationship may not have been asserted yet: the WITH PROB construct extends
SPARQL by letting it query an induced model. The variable ?prob takes the
value 1 for the users she already follows and an estimated probability
between 0.8 and 1 for users she may be recommended to follow (cf. line 8). The
ORDER BY clause is used to return users sorted by decreasing probability. The
query answer includes pairs of users and predicted likelihoods (e.g. :Alice with
probability 0.99, :Bob with probability 0.87).</p>
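      <p>The effect of the probability filter and of the ordering can be illustrated with a short Python sketch. It assumes that the inductive reasoner has already produced a predicted probability for each candidate user; the recommend function, the dictionary of predictions and the users other than :Alice and :Bob are hypothetical and serve only as an example.</p>
      <preformat>
```python
from typing import Dict, List, Set, Tuple


def recommend(
    predicted: Dict[str, float],
    opinion_makers: Set[str],
    low: float = 0.8,
    high: float = 1.0,
) -> List[Tuple[str, float]]:
    """Return (user, probability) pairs for opinion makers worth recommending.

    A user is kept when the predicted probability lies strictly between `low`
    and `high`; a probability equal to `high` means the user is already followed.
    The result is sorted by decreasing probability.
    """
    candidates = [
        (user, prob)
        for user, prob in predicted.items()
        if user in opinion_makers and high > prob > low
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Made-up predictions standing in for the output of the inductive reasoner.
predictions = {":Bob": 0.87, ":Carol": 1.0, ":Alice": 0.99, ":Dave": 0.42, ":Eve": 0.95}
makers = {":Alice", ":Bob", ":Carol", ":Dave"}
print(recommend(predictions, makers))
```
      </preformat>
      <p>In this example :Carol is left out because she is already followed, :Dave because his predicted probability is too low, and :Eve because she is not among the current opinion makers.</p>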
    </sec>
    <sec id="sec-4">
      <title>Conclusions and Future Work</title>
      <p>In this paper we presented BOTTARI, a location-based mobile application that
supplies content and personalized suggestions to its users. We
explained how new recommendations are computed by processing the
data streams generated by microblogging platforms like Twitter and foursquare.
The computation is defined as a workflow combining Semantic Web and machine
learning techniques, and it is executed on top of the LarKC platform.</p>
      <p>Our future work will focus on the development of the first stable version of
the BOTTARI application and its release as an Android app. The initial release
will focus on Korea and will be evaluated by following a user-centered approach:
a set of users will try out the application and give us feedback via a survey
with questions about the system and its accuracy in providing suggestions.</p>
      <p>This work was partially supported by the EU project LarKC (FP7-215535).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Barbieri</surname>
            ,
            <given-names>D.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Braga</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Della Valle</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tresp</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rettinger</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wermser</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Deductive and Inductive Stream Reasoning for Semantic Social Media Analytics</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          <volume>25</volume>
          (
          <issue>6</issue>
          ) (
          <year>2010</year>
          )
          <fpage>32</fpage>
          -
          <lpage>41</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Cheptsov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.:
          <article-title>Large Knowledge Collider. A Service-oriented Platform for Large-scale Semantic Reasoning</article-title>
          .
          <source>In: Proceedings of WIMS 2011</source>
          . (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Garofalakis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gehrke</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rastogi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <source>Data Stream Management: Processing High-Speed Data Streams</source>
          . Springer-Verlag New York, Inc. (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Della Valle</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Harmelen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fensel</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>It's a Streaming World! Reasoning upon Rapidly Changing Information</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          <volume>24</volume>
          (
          <issue>6</issue>
          ) (
          <year>2009</year>
          )
          <fpage>83</fpage>
          -
          <lpage>89</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Barbieri</surname>
            ,
            <given-names>D.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Braga</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Della Valle</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grossniklaus</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>C-SPARQL: a Continuous Query Language for RDF Data Streams</article-title>
          .
          <source>Int. J. Semantic Computing</source>
          <volume>4</volume>
          (
          <issue>1</issue>
          ) (
          <year>2010</year>
          )
          <fpage>3</fpage>
          -
          <lpage>25</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Barbieri</surname>
            ,
            <given-names>D.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Braga</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Della Valle</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grossniklaus</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Incremental Reasoning on Streams and Rich Background Knowledge</article-title>
          .
          <source>In: Proc. of ESWC2010</source>
          . (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Ren</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>J.Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Towards Scalable Reasoning on Ontology Streams via Syntactic Approximation</article-title>
          .
          <source>In: Proc. of IWOD2010</source>
          . (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Fensel</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al.:
          <article-title>Towards LarKC: a Platform for Web-scale Reasoning</article-title>
          .
          <source>In: Proc. of ICSC 2008</source>
          . (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Tresp</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bundschus</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rettinger</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Materializing and querying learned knowledge</article-title>
          .
          <source>In: Proc. of IRMLeS 2009</source>
          . (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Berrueta</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al.:
          <source>SIOC Core Ontology Specification. W3C Member Submission, W3C</source>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>