<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Personalized Filtering of the Twitter Stream</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pavan Kapanipathi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabrizio Orlandi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amit Sheth</string-name>
          <email>amitg@knoesis.org</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexandre Passant</string-name>
          <email>alexandre.passantg@deri.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Digital Enterprise Research Institute</institution>
          ,
          <addr-line>Galway</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Kno.e.sis Center</institution>
          ,
          <addr-line>Dayton, OH</addr-line>
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <fpage>6</fpage>
      <lpage>13</lpage>
      <abstract>
        <p>With the rapid growth in the number of users on social networks, there is a corresponding increase in user-generated content, in turn resulting in information overload. On Twitter, for example, users tend to receive uninteresting information because their interests only partially overlap with those of the people they follow. In this paper we present a Semantic Web approach to filter public tweets matching interests from personalized user profiles. Our approach includes the automatic generation of multi-domain, personalized user profiles, filtering of the Twitter stream based on the generated profiles, and delivery of matching tweets in real-time. Given that users' interests and personalization needs change over time, we also discuss how our application can adapt to these changes.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic Web</kwd>
        <kwd>Social Network</kwd>
        <kwd>Twitter</kwd>
        <kwd>PubSubHubbub</kwd>
        <kwd>User Profiling</kwd>
        <kwd>Personalization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Online Social Networks have become a popular way to communicate and network in recent times; well-known ones include Facebook, MySpace, Twitter, and Google+. Twitter, in particular, has grown rapidly in recent years, reaching an average of 460,000 new users per day in March 2011. These numbers have in turn played a crucial role in increasing the number of tweets from 65 million to 200 million in the past year. Interested users are therefore facing the problem of information overload. Filtering uninteresting posts for users is a necessity and plays a crucial role [8] in handling the information overload problem on Twitter.</p>
      <p>On Twitter it is necessary to follow another user in order to receive his/her tweets. The user who receives the tweets is called a follower and the user who generates the tweets is called a followee. However, followers receive all the tweets of their followees, including many that are not of interest to them. Twitter itself provides features such as keyword/hashtag search as a naïve solution to the information overload problem, but these filters are not sufficient to provide complete personalized information for a user. Although Twarql [6] improved the filtering mechanism for Twitter by leveraging Semantic Web technologies, the user still needs to track information by manually selecting or formulating a SPARQL query using Twarql's interface. So far, applications such as TweetTopic [1] and "Post Post" (http://postpo.st/) focus on filtering the stream of tweets generated by the people whom the user follows. Instead of limiting the user experience to his/her personal stream, we propose a Semantic Web approach to deliver interesting tweets to the user from the entire public Twitter stream. This helps filter out tweets that the user is not interested in, which in turn reduces the information overload.</p>
      <p>
        Our contributions include: (1) automatic generation of user profiles (primarily interests) based on the user's activities on multiple social networks (Twitter, Facebook, LinkedIn); this is achieved by retrieving the user's interests, some implicit (by analyzing user-generated content) and some explicit (interests mentioned by the user in his/her social network profile); (2) collecting tweets from the Twitter stream and mapping (annotating) each tweet to its corresponding topics from Linked Open Data; and (3) delivering the annotated tweets to users with matching interests in (near) real-time.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 Architecture</title>
      <p>
        Our architecture can be separated into three modules: (1) Semantic Filter (SF), (2) Profile Generator (PG), and (3) Semantic Hub (SemHub), as illustrated in Figure 1.
      </p>
      <sec id="sec-2-1">
        <title>Overview</title>
        <p>In this section we first explain the interaction between the three modules; then each one is explained in detail.</p>
        <p>In the above architecture two processes run in parallel: (a) filtering of tweets and (b) subscription to the system. The sequence of each process is represented by different types of arrows in Figure 1. Subscription to the system is handled by the Semantic Distributor (SD), which comprises both SemHub and PG. Once the user requests a subscription (Seq. i in Figure 1), he/she is redirected to the PG (Seq. ii). PG generates the profiles based on the user's activities on multiple social networks (Seq. iii). These profiles are stored in SemHub's RDF store (Seq. iv) using the PuSH vocabulary (http://vocab.deri.ie/push). On the other hand, filtering of tweets is performed by annotating tweets from the Twitter stream in SF. The annotations are further transformed into a representation of groups (SPARQL queries) of users who have interests corresponding to the tweet (Seq. 1). These SPARQL queries are termed Semantic Groups (SG) in this paper. The tweet with its SG is published as an RSS feed update (Seq. 2) and notified to SemHub (Seq. 3). SemHub then fetches the updates (Seq. 4) and retrieves the list of subscribers whose interests match the group representation of the tweet (Seq. 5). Finally, the tweet is pushed to the filtered subscribers (Seq. 6).</p>
        <sec id="sec-2-1-1">
          <title>2.1 Semantic Filter</title>
          <p>
            The Semantic Filter (Figure 1) primarily performs two functions: (1) representing tweets as RDF, and (2) forming interested groups of users for the tweet.
          </p>
          <p>First, information about the tweet is collected to represent the tweet in RDF. Twitter provides metadata about the tweet such as author, location, time, "reply-to", etc. via its streaming API. In addition, extraction of entities from the tweet content (content-dependent metadata) is performed using the same technique used in Twarql. The extraction technique is dictionary-based, which provides the flexibility to use any dictionary for extraction. In our system the dictionary used to annotate the tweet is a set of concepts (we use "topic" and "concept" interchangeably) from the Linked Open Data [2] (LOD) cloud (http://richard.cyganiak.de/2007/10/lod/). The same set is also used to create profiles, as described in Section 2.2. After the extraction of entities, the tweets are represented in RDF using lightweight vocabularies such as FOAF, SIOC, OPO and MOAT. This transforms the unstructured tweet into a structured representation using popular ontologies. The triples (RDF) of the tweet are temporarily stored in an RDF store.</p>
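          <p>The mapping from tweet metadata and extracted entities to RDF-style triples can be sketched as follows. This is a minimal illustration using plain string triples rather than a full RDF library; the sample tweet and exact property choices are assumptions, not the system's actual schema.</p>
          <p>
```python
# Sketch: represent a tweet as (subject, predicate, object) triples
# using SIOC and MOAT terms, as described in the text above.
SIOC = "http://rdfs.org/sioc/ns#"
MOAT = "http://moat-project.org/ns#"

def tweet_to_triples(tweet):
    uri = "http://twitter.com/%s/statuses/%s" % (tweet["author"], tweet["id"])
    triples = [
        (uri, "rdf:type", SIOC + "Post"),
        (uri, SIOC + "has_creator", "http://twitter.com/" + tweet["author"]),
        (uri, SIOC + "content", tweet["text"]),
    ]
    # One moat:taggedWith triple per entity extracted from the tweet text.
    for entity_uri in tweet["entities"]:
        triples.append((uri, MOAT + "taggedWith", entity_uri))
    return triples

# Invented sample tweet for illustration only.
tweet = {"id": "123456789", "author": "rob",
         "text": "Watching a Semantic Web talk",
         "entities": ["http://dbpedia.org/resource/Semantic_Web"]}
triples = tweet_to_triples(tweet)
```
          </p>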
          <p>The annotated entities represent the topics of the tweet. These topics act as the key for filtering the subset of users who receive the tweet. Topics are queried from the RDF store to be included in SGs, which are created to act as the filter. The SG, once executed at the Semantic Hub, fetches all the users whose interests match the topics of the tweet. If there are multiple topics for the tweet, the SG is created to fetch the union of users who are interested in at least one topic of the tweet.</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>2.2 User Profile Generator</title>
          <p>
            The extraction and generation of user profiles from social networking websites is composed of two basic parts: (1) data extraction and (2) generation of application-dependent user profiles. After this phase, the other important steps for our work involve the representation of the user models using popular ontologies and, finally, the aggregation of the distributed profiles.
&lt;foaf:topic_interest rdf:resource="http://dbpedia.org/resource/Semantic_Web" /&gt;
&lt;wi:preference&gt;
  &lt;wi:WeightedInterest&gt;
    &lt;wi:topic rdf:resource="http://dbpedia.org/resource/Semantic_Web" /&gt;
    &lt;rdfs:label&gt;Semantic Web&lt;/rdfs:label&gt;
    &lt;wo:weight&gt;
      &lt;wo:Weight&gt;
        &lt;wo:weight_value rdf:datatype="http://www.w3.org/2001/XMLSchema#double"&gt;0.5&lt;/wo:weight_value&gt;
        &lt;wo:scale rdf:resource="http://example.org/01_Scale" /&gt;
      &lt;/wo:Weight&gt;
    &lt;/wo:weight&gt;
    &lt;opm:wasDerivedFrom rdf:resource="http://www.twitter.com/BadmotorF" /&gt;
    &lt;opm:wasDerivedFrom rdf:resource="http://www.linkedin.com/in/fabriziorlandi" /&gt;
  &lt;/wi:WeightedInterest&gt;
&lt;/wi:preference&gt;
[...]
&lt;wo:Scale rdf:about="http://example.org/01_Scale"&gt;
  &lt;wo:max_weight rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal"&gt;1.0&lt;/wo:max_weight&gt;
  &lt;wo:min_weight rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal"&gt;0.0&lt;/wo:min_weight&gt;
&lt;/wo:Scale&gt;
          </p>
          <p>First, in order to collect private data about users on social websites it is necessary to have access to the data granted by the users. Then, once the authentication step is accomplished, the two most common ways to fetch the profile data are by using an API provided by the system or by parsing the Web pages. Once the data is retrieved, the next step is data modeling using standard ontologies. In this case, a possible way to model profile data is to generate RDF-based profiles described using the FOAF vocabulary [4]. We then extend FOAF with the SIOC ontology [3] to represent more precisely the online accounts of the person on the Social Web. Additional personal information about users' affiliation, education, and job experience can be modeled using the DOAC vocabulary (http://ramonantonio.net/doac/0.1/). This allows us to represent the past working experiences of the users and their cultural background. Another important part of a user profile is the user's interests. In Figure 2 we display an example of an interest in "Semantic Web" with a weight of 0.5 on a specific scale (from 0 to 1) using the Weighted Interests Vocabulary (WI, http://purl.org/ontology/wi/core#) and the Weighting Ontology (WO, http://purl.org/ontology/wo/core#). Common approaches to computing the weights for the interests are based on the number of occurrences of the entities, their frequency, etc.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>Profile Aggregation</title>
        <p>Finally, the phase that follows the modeling of the FOAF-based user profiles and the computation of the weights for the interests is the aggregation of the distributed user profiles. When merging user profiles it is necessary to avoid duplicate statements (this is done automatically by a triplestore during the insertion of the statements). Furthermore, as in the case of the interests, if the same interest is present in two different profiles it is necessary to: represent the interest only once, recalculate its weight, and update the provenance of the interest, keeping track of the source the interest was derived from. As regards the provenance of the interest, as shown in Figure 2, we use the property wasDerivedFrom from the Open Provenance Model (OPM, http://openprovenance.org/) to state that the interest originated from a specific website.</p>
        <p>As regards the computation of the aggregated global weight for an interest derived from multiple sources, we propose a simple generic formula that can be adopted for merging the interest values of many different sources. The formula is as follows:</p>
        <p>Gi = Σs ws · wi,s&#8195;&#8195;(1)</p>
        <p>where Gi is the global weight for interest i, ws is the weight associated with source s, and wi,s is the weight of interest i in source s.</p>
        <sec id="sec-2-2-1">
          <title>2.3 Semantic Hub</title>
          <p>
            The Semantic Distributor module comprises the Semantic Hub [5] and the Profile Generator. The Semantic Hub (SemHub) is an extension of Google's PubSubHubbub (PuSH) using Semantic Web technologies to provide publisher-controlled real-time notifications. PuSH is a decentralized publish-subscribe protocol which extends Atom and RSS to enable real-time streams. It allows parties that understand it to get near-instant notifications of the content they are subscribed to: PuSH immediately pushes new data from the publisher to the subscriber(s), where traditional RSS readers periodically pull new data. The PuSH ecosystem consists of a few hubs, many publishers, and a large number of subscribers. Hubs enable (1) publishers to offload the task of broadcasting new data to subscribers; and (2) subscribers to avoid constantly polling for new data, as the hub pushes the data updates to the subscribers. In addition, the PuSH protocol is designed to handle all the complexity of the communication, easing the tasks of publishers and subscribers.
          </p>
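          <p>In PuSH, a subscription is an HTTP POST from the subscriber to the hub carrying hub.mode, hub.topic and hub.callback parameters. A minimal sketch of building such a request follows; the hub, feed and callback URLs are placeholders, not the system's actual endpoints.</p>
          <p>
```python
from urllib.parse import urlencode

# Build the form-encoded body of a PubSubHubbub subscription request.
# The URLs passed in below are illustrative placeholders.
def build_subscription_request(hub_url, topic_url, callback_url):
    body = urlencode({
        "hub.mode": "subscribe",       # or "unsubscribe"
        "hub.topic": topic_url,        # the feed the subscriber wants
        "hub.callback": callback_url,  # where the hub pushes updates
        "hub.verify": "async",         # hub verifies the callback out of band
    })
    return hub_url, body

hub, body = build_subscription_request(
    "https://semantichub.appspot.com/subscribe",
    "http://example.org/tweets.rss",
    "http://example.org/push-callback")
# `body` would be POSTed to `hub` with content type
# application/x-www-form-urlencoded.
```
          </p>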
          <p>The extension from the PuSH protocol to the Semantic Hub is described in [5]. In our work, SemHub performs the function of distributing the tweets to interested users according to the Semantic Groups generated by SF. SemHub has only one publisher, the SF, as shown in Figure 1, and there can be multiple subscribers. SemHub, as in our previous work, does not focus on creating a social graph of the publisher; the PG is responsible for storing the subscribers' FOAF profiles in the RDF store accessed by SemHub.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 Implementation</title>
      <p>In this section we provide the implementation details for each module in the architecture. First, to collect tweets we use the twitter4j streaming API (http://stream.twitter.com). Starting with SF, the entity extraction from tweets is dictionary-based, similar to the extraction technique used in Twarql [7]. This technique was chosen due to the performance requirements of real-time notifications. A set of 3.5 million entities from DBpedia (http://wiki.dbpedia.org/About, July 2011) is built as an in-memory representation for time-efficient, longest sub-string matching. The in-memory representation is a ternary interval search tree (trie), and the longest sub-string match using the trie is performed with time complexity O(LT), where L is the number of characters and T is the number of tokens in the tweet.
</p>
      <p>[Figure 3: RDF representation of an example tweet (http://twitter.com/rob/statuses/123456789), annotated using SIOC, FOAF, MOAT and OPO: the tweet is tagged with the extracted DBpedia concepts (e.g. Kim Kardashian / Kris Humphries), linked via opo:customMessage to an opo:OnlinePresence with its timestamp, and its author is related to dbpedia:Ohio via geonames:locatedIn.]</p>
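      <p>The dictionary-based longest-match extraction can be illustrated with a simplified greedy matcher. This is a sketch only: the real system uses a ternary interval search trie over 3.5 million DBpedia labels, whereas here a small dict of example labels stands in.</p>
      <p>
```python
# Greedy longest-match entity extraction over a tweet's tokens.
# Example labels only; the production dictionary is built from DBpedia.
DICTIONARY = {
    "kim kardashian": "http://dbpedia.org/resource/Kim_Kardashian",
    "kris humphries": "http://dbpedia.org/resource/Kris_Humphries",
    "hollywood": "http://dbpedia.org/resource/Hollywood",
}
MAX_SPAN = max(len(label.split()) for label in DICTIONARY)

def extract_entities(text):
    tokens = text.lower().split()
    found, i = [], 0
    while len(tokens) > i:
        # Try the longest candidate phrase starting at token i first.
        for span in range(min(MAX_SPAN, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + span])
            if phrase in DICTIONARY:
                found.append(DICTIONARY[phrase])
                i += span
                break
        else:
            i += 1  # no label starts here; move to the next token
    return found

entities = extract_entities("Kim Kardashian and Kris Humphries spotted in Hollywood")
```
      </p>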
      <p>As mentioned in Section 2.1, tweets are transformed into RDF using lightweight vocabularies; see Figure 3 for an example. The RDF is then stored in an RDF store using SPARQL Update over HTTP. For performance reasons it is preferable to have the RDF store on the same server; however, architecturally it can be located anywhere on the Web and accessed via HTTP and the SPARQL Protocol for RDF. Presently, the RDF generated for each tweet is stored in a temporary graph and the topics/concepts of the tweet are queried. These concepts are then used to formulate the SPARQL representation of the group (SG) of users who are interested in the tweet. The RSS feed is updated as per the format specified in [5] with the SG, and the Semantic Hub is notified. The SG for the tweet in Figure 3 will retrieve all the users who are interested in at least one of the extracted interests (dbpedia:Kim_Kardashian, dbpedia:Kris_Humphries, dbpedia:Hollywood).</p>
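      <p>The Semantic Group for such a tweet can be sketched as a SPARQL query over the subscribers' FOAF profiles. This is an assumed query shape, not the system's exact output: it matches the foaf:topic_interest pattern of the profiles in Section 2.2, and uses a SPARQL 1.1 VALUES clause for brevity where an equivalent UNION of graph patterns would also work.</p>
      <p>
```python
# Build a Semantic Group (SG): a SPARQL query selecting every
# subscriber interested in at least one topic extracted from a tweet.
def semantic_group(topic_uris):
    values = " ".join("<%s>" % t for t in topic_uris)
    query = (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT DISTINCT ?user WHERE {\n"
        "  VALUES ?topic { " + values + " }\n"
        "  ?user foaf:topic_interest ?topic .\n"
        "}"
    )
    return query

sg = semantic_group([
    "http://dbpedia.org/resource/Kim_Kardashian",
    "http://dbpedia.org/resource/Kris_Humphries",
    "http://dbpedia.org/resource/Hollywood",
])
```
      </p>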
      <p>The Semantic Hub used for our implementation is hosted at http://semantichub.appspot.com. The SemHub executes the SG on the graph that contains the FOAF profiles of subscribers generated by PG. The corresponding tweets are pushed to the resulting users.</p>
      <p>The Profile Generator considers three different social networking sites for generating user profiles: Twitter, LinkedIn and Facebook. In order to collect user data from each of those platforms, we developed three different types of applications. For Twitter and Facebook we implemented similar PHP scripts that make use of the respective query APIs publicly accessible on the Web. For LinkedIn we use an XSLT script that parses the LinkedIn user profile page and generates an XML file containing all the attributes found on the page. The user information collected from Twitter is the publicly available data posted by the user, i.e. his/her latest 500 microblog posts. The technique used for entity recognition in the tweets of the user is the same one used in SF for annotating the tweets. The extracted concepts are then ranked and weighted using their frequency of occurrence. A similar approach is described in [9].</p>
      <p>While on Twitter we create profiles with implicitly inferred interests, on LinkedIn and Facebook we collect not only interests that have been explicitly stated by the users, but also their personal details such as contacts, workplace and education. The user's personal data is fetched through the Facebook Graph API, as are the interests ("likes"), which are then mapped to the related Facebook pages representing the entities. We represent the entities/concepts in which the user is interested using both DBpedia and Facebook resources.</p>
      <p>The weights for the interests are calculated in two different ways depending on whether the interest has been implicitly inferred by the entity extraction algorithm (the Twitter case) or explicitly recorded by the user (the LinkedIn and Facebook cases). In the first case, the weight of the interest is calculated by dividing the number of occurrences of the entity in the latest 500 tweets by the total number of entities identified in the same 500 tweets. In the second case, since the interest has been manually set by the user, we assume that the weight for that source (or social networking site) is 1 (on a scale from 0 to 1). That is, we give the maximum possible value to an interest that has been explicitly set by the user.</p>
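      <p>The implicit weight computation (occurrences of an entity divided by the total number of entity occurrences in the user's latest tweets) can be sketched as follows; the mention list is invented for illustration.</p>
      <p>
```python
from collections import Counter

# entity_mentions: one entry per entity occurrence found across the
# user's latest tweets. Weights are occurrence counts normalized by
# the total number of entity occurrences.
def implicit_weights(entity_mentions):
    counts = Counter(entity_mentions)
    total = sum(counts.values())
    return {entity: n / total for entity, n in counts.items()}

mentions = ["dbpedia:Semantic_Web"] * 3 + ["dbpedia:Twitter"]
weights = implicit_weights(mentions)
# weights["dbpedia:Semantic_Web"] == 0.75, weights["dbpedia:Twitter"] == 0.25
```
      </p>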
      <p>
        Our approach to computing the new weights resulting from the aggregation of the profiles is straightforward. We consider every social website equally relevant, hence we multiply each of the three weights by a constant of 1/3 (approximately 0.33) and then sum the results. In terms of the previously described formula (1), in this case we use ws = 1/3 for every source s.
      </p>
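      <p>The aggregation step above, i.e. formula (1) with ws = 1/3 for each of the three sites, can be sketched as follows; the per-source profile data is illustrative.</p>
      <p>
```python
# Aggregate per-source interest weights into a global weight:
# Gi = sum over sources s of ws * wi,s, with every source weighted 1/3.
SOURCE_WEIGHT = 1 / 3

def global_weights(per_source):
    # per_source: {source: {interest: weight of that interest in the source}}
    merged = {}
    for interests in per_source.values():
        for interest, w in interests.items():
            merged[interest] = merged.get(interest, 0.0) + SOURCE_WEIGHT * w
    return merged

profiles = {
    "twitter":  {"dbpedia:Semantic_Web": 0.75},  # implicit (frequency-based)
    "linkedin": {"dbpedia:Semantic_Web": 1.0},   # explicit, so weight 1.0
    "facebook": {"dbpedia:Hollywood": 1.0},      # explicit, so weight 1.0
}
g = global_weights(profiles)
```
      </p>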
    </sec>
    <sec id="sec-4">
      <title>4 Conclusion and Future Work</title>
      <p>In this paper we described an architecture for filtering the public Twitter stream and delivering the interesting tweets directly to users according to their multi-domain user profiles of interests. We explained how we generate comprehensive user profiles of interests by fetching and aggregating user information from different sources (i.e. Twitter, Facebook and LinkedIn). Then we detailed how we extract entities and interests from tweets, how we model them using Semantic Web technologies, and how it is possible to automatically create dynamic groups of users related to the extracted interests. According to these user groups, the tweets are then "pushed" to the users using the Semantic Hub architecture.</p>
      <p>In the future, we want to extend our work to handle social streams in general (not only Twitter), and to leverage inferencing (category-subcategory relationships) on LOD rather than just filtering based on concepts. Our extension would also let users subscribe not only to concepts from LOD as interests but also to a SPARQL query, as in Twarql. We are also working on providing interesting information and ranking based on the user's social graph.</p>
    </sec>
    <sec id="sec-5">
      <title>5 Acknowledgements</title>
      <p>
        This work is funded by (1) Science Foundation Ireland under grant number SFI/08/CE/I1380 (Líon 2) and an IRCSET scholarship supported by Cisco Systems, and (2) "Social Media Enhanced Organizational Sensemaking in Emergency Response", National Science Foundation award IIS-1111182, 09/01/2011 - 08/31/2014.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>M.S.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          , Bongwon Suh, Lichan Hong, Jilin Chen, Sanjay Kairam, and
          <string-name>
            <given-names>E.H.</given-names>
            <surname>Chi</surname>
          </string-name>
          .
          <article-title>Eddi: interactive topic-based browsing of social status streams</article-title>
          .
          <source>In The 23rd annual ACM symposium on User interface software and technology</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Christian</given-names>
            <surname>Bizer</surname>
          </string-name>
          , Tom Heath, and
          <string-name>
            <given-names>Tim</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          .
          <article-title>Linked data - the story so far</article-title>
          .
          <source>Int. J. Semantic Web Inf. Syst.</source>
          ,
          <volume>5</volume>
          (
          <issue>3</issue>
          ):1-
          <fpage>22</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. John Breslin, Uldis Bojars, Alexandre Passant, Sergio Fernandez, and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Decker</surname>
          </string-name>
          . SIOC:
          <article-title>Content Exchange and Semantic Interoperability Between Social Networks</article-title>
          .
          <source>In W3C Workshop on the Future of Social Networking</source>
          ,
          <year>January 2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Dan</given-names>
            <surname>Brickley</surname>
          </string-name>
          and
          <string-name>
            <given-names>Libby</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <source>FOAF Vocabulary Specification 0.98. Namespace Document 9 August 2010 - Marco Polo Edition</source>
          . http://xmlns.com/foaf/spec/,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Pavan</given-names>
            <surname>Kapanipathi</surname>
          </string-name>
          , Julia Anaya, Amit Sheth, Brett Slatkin, and
          <string-name>
            <given-names>Alexandre</given-names>
            <surname>Passant</surname>
          </string-name>
          .
          <article-title>Privacy-Aware and Scalable Content Dissemination in Distributed Social Networks</article-title>
          .
          <source>In ISWC 2011 - Semantic Web In Use</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Pablo N.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , Alexandre Passant, and
          <string-name>
            <given-names>Pavan</given-names>
            <surname>Kapanipathi</surname>
          </string-name>
          .
          <article-title>Twarql: tapping into the wisdom of the crowd</article-title>
          .
          <source>I-SEMANTICS '10</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Pablo N.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , Alexandre Passant, Pavan Kapanipathi, and
          <string-name>
            <given-names>Amit P.</given-names>
            <surname>Sheth</surname>
          </string-name>
          .
          <article-title>Linked Open Social Signals</article-title>
          .
          <source>In IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>D.</given-names>
            <surname>Ramage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dumais</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Liebling</surname>
          </string-name>
          .
          <article-title>Characterizing microblogs with topic models</article-title>
          .
          <source>In ICWSM</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Ke</given-names>
            <surname>Tao</surname>
          </string-name>
          , Fabian Abel,
          <string-name>
            <given-names>Qi</given-names>
            <surname>Gao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.J.</given-names>
            <surname>Houben</surname>
          </string-name>
          . TUMS:
          <article-title>Twitter-based User Modeling Service</article-title>
          .
          <source>In Workshop on User Profile Data on the Social Semantic Web (UWeb)</source>
          ,
          <source>ESWC</source>
          <year>2011</year>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>