<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Microblog Processing: A Study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sandip Modha</string-name>
          <email>sjmodha@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dhirubhai Ambani Institute of Information and Communication Technology</institution>
          ,
          <addr-line>Gandhinagar, Gujarat</addr-line>
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <fpage>5</fpage>
      <lpage>8</lpage>
      <abstract>
        <p>Microblog retrieval and summarization have become challenging areas for the information retrieval community. Twitter is one of the most popular microblogging platforms. In this paper, Twitter posts, called tweets, are studied from retrieval and extractive-summarization perspectives. Given a set of topics, or interest profiles, describing an information requirement, a microblog summarization system is designed which processes the Twitter sample status stream and generates a day-wise, topic-wise tweet summary. Since the volume of the Twitter public status stream is very large, tweet filtering, i.e. relevant tweet retrieval, is the primary task for the summarization system. To measure the relevance between tweets and interest profiles, language models with Jelinek-Mercer smoothing and Dirichlet smoothing, and the Okapi BM25 model, are used. The behaviour of the smoothing parameters λ (JM smoothing) and µ (Dirichlet smoothing) is also studied. Summarization is cast as a clustering problem. The TREC MB 2015 and TREC RTS 2016 datasets are used for the experiments. The official TREC RTS metrics nDCG@10-1 and nDCG@10-0 are used to evaluate the outcome of the experiments. A detailed post hoc analysis of the experimental results is also performed.</p>
      </abstract>
      <kwd-group>
        <kwd>Microblog</kwd>
        <kwd>Summarization</kwd>
        <kwd>Ranking</kwd>
        <kwd>Language Model</kwd>
        <kwd>JM smoothing</kwd>
        <kwd>Dirichlet Smoothing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>MOTIVATION AND CHALLENGES</title>
      <p>Microblogs have become a popular social medium to disseminate or broadcast
real-world events, and opinions about events, of any nature. As
of 2016, Twitter had 319 million active users across the
world (https://en.wikipedia.org/wiki/Twitter).
With this large user base, Twitter is an interesting data source for
real-time information. On many occasions, Twitter has been observed to be
the first medium to break an event. Many times,
thousands of users across the world interact on the same
topic, or interest profile, with diverse views. Henceforth, topic
and interest profile will be used interchangeably in the rest of the paper.
The following are the major challenges for microblog summarization.
i) Since Twitter imposes a limit on the length of a tweet, it becomes
very difficult for a retrieval system to retrieve tweets without
proper context; tweet sparseness is therefore a critical issue for
the retrieval system.
ii) On many topics, the volume of tweets is very large, and most of
the tweets are redundant and noisy.
iii) Some topics are discussed for a long
period of time and diverge into many subtopics (e.g.
demonetization in India, refugees in Europe). It is very difficult
to track such topic drift for an event; to do so, one
has to update the query vector by expanding or shrinking the query
terms.
iv) Tweets often include abbreviations (e.g. LOL; India written as ind),
smileys, special characters, and misspellings (tomorrow written as
2moro). Tweet normalization is a major issue for microblog
processing.
v) On many occasions, it has been found that native-language
tweets appear transliterated in romanized English.</p>
      <p>
        There are two cases for Microblog summarization [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]: (I) Online
summarization, or push notification: novel tweets are sent to the user in
real time, where latency is important, i.e. how fast we can deliver
relevant and novel tweets to the interested user. (II) Offline
summarization (email digest): at the end of the day, the system generates a topic-wise
summary of novel and relevant tweets which essentially summarizes
what happened that day. In offline summarization, latency is not
important. In this paper, the latter case is considered for the experiments.
      </p>
      <p>A summarization system should include relevant and novel tweets
in the summary. If there are no relevant tweets for a particular interest
profile on a specific day, that day is called a silent day for that
interest profile, and the summarization system should not include any
tweet for that particular profile. If the system correctly identifies such a
silent day, it is awarded the highest score (i.e. 1). If the
system includes a tweet in the summary for an interest profile on a silent
day, it receives a score of 0.</p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORKS</title>
      <p>
        Jimmy Lin and Diaz [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] have organized the Microblog track since
2012 with the objective of exploring new IR methodologies for short text.
Mossaab et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] trained their word2vec model on a 4-year
tweet corpus. They used the Okapi BM25 relevance model to
calculate relevance scores; to refine the scores of the relevant tweets,
tweets were re-scored with the SVMrank package using the
relevance scores of the previous stage. Luchen et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] expanded the title
terms each day, using point-wise KL-divergence to extract 5 hashtags
and 10 other terms. For the relevance score, they used a unigram
matching formula with different weights for original title terms and
expanded terms. Our approach is similar to [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], but we have
empirically tuned the smoothing parameters for better results. In addition,
we have incorporated two levels of thresholds,
computed via grid search, which control which tweets become part of the
daily summary.
      </p>
    </sec>
    <sec id="sec-3">
      <title>DATA AND RESOURCES</title>
      <p>
        TREC has run the Microblog track since 2012. In 2016 the track was
merged with the temporal summarization track and renamed the Real-Time
Summarization track [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Experiments are performed on the TREC RTS
2016 dataset [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and the TREC MB 2015 dataset [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] to evaluate our system's
performance. Table 1 describes statistics of both datasets.
      </p>
      <sec id="sec-3-1">
        <title>Dataset Detail</title>
        <p>Table 1 summarizes both datasets: the total number of tweets, the
interest profiles used for evaluation, the size of the qrels, the number of
positive qrels, the number of interest profiles common to the two datasets,
and the tweet download duration.</p>
      </sec>
    </sec>
    <sec id="sec-3-2">
      <title>PROBLEM STATEMENT</title>
      <p>Given interest profiles IP = {IP1, IP2, ..., IPm} and tweets
T = {T1, T2, ..., Tn} from the dataset, we need to compute the relevance
score between tweets and interest profiles in order to create a profile-wise
offline summary S = {S1, ..., Sm}, where Si is the set of day-wise relevant
and novel tweets for the i-th profile. We can model a profile-specific
summary as Si = {t1, t2, ..., tk}, where ti, tj ∈ T. For a given interest
profile, the relevance score between a tweet and the profile must be greater
than the specified silent-day threshold Ts and relevance threshold Tr. In
addition, the selected tweets should be novel, i.e. the similarity between
any two tweets of the summary should be less than the novelty threshold Tn.
If a tweet ti is included in the summary for a particular profile on a given
day, it should satisfy the following constraints:</p>
      <p>• The length of the day-wise summary of an interest profile is up to
100 tweets</p>
      <p>• Sim(ti, tj) ≤ Tn ∀ tj ∈ Si (Tn = novelty threshold)</p>
      <p>There are two types of days, namely silent days and eventful days.
An eventful day is one on which there are some relevant tweets for the given
interest profile. In contrast, a silent day is one for which there is no
relevant tweet for the given interest profile; the system should not include
any tweet in the summary for that day for that profile. On a silent day, the
system receives a score of one (the highest score) if it does not include
any tweet in the summary for that interest profile, and zero otherwise.
Detecting a silent day for a profile is a critical task for the summarization
system. The ranking function is defined as F(IP, T) = P(IP | T, R = 1): given
that a tweet is relevant, it describes how likely the interest profile would
be IP. The term P(IP | T) is estimated by a language model.</p>
    </sec>
    <sec id="sec-4">
      <title>PROPOSED METHODOLOGY</title>
      <p>In this section, we describe our proposed approach to designing a
Microblog summarization system.</p>
    </sec>
    <sec id="sec-5">
      <title>Query formulation from interest profile</title>
      <p>
        Interest profiles consist of a 3-4 word title, a sentence-long
description, and a paragraph-length narrative explaining the detailed information
need [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. All terms from the title field, and named entities from the
description and narrative fields, are extracted to generate the query. A
dictionary is maintained to map named entities to their abbreviated
forms.
      </p>
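      <p>The query-formulation step above can be sketched as follows; the abbreviation dictionary shown is a hypothetical stand-in for the actual mapping resource, and the field values are illustrative:</p>

```python
# Sketch of query formulation from an interest profile.  The abbreviation
# dictionary below is a hypothetical example of the mapping resource.
ABBREVIATIONS = {"new york city": "nyc", "drug enforcement agency": "dea"}

def build_query(title, named_entities):
    """Collect title terms plus named entities extracted from the
    description and narrative fields, adding abbreviated forms where known."""
    terms = [t.lower() for t in title.split()]
    for ne in named_entities:
        ne = ne.lower()
        terms.append(ne)
        if ne in ABBREVIATIONS:
            terms.append(ABBREVIATIONS[ne])
    seen, query = set(), []
    for t in terms:                 # de-duplicate, preserving order
        if t not in seen:
            seen.add(t)
            query.append(t)
    return query
```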
    </sec>
    <sec id="sec-6">
      <title>Tweet Pre-processing</title>
      <p>Tweets and interest profiles are pre-processed before calculating
the relevance score. Non-English tweets are filtered using the language
attribute of the tweet object, and non-ASCII characters are removed. For tweets
with an external URL embedded in the text, the URL is expanded and the text of
the external page is merged with the tweet text. Tweets without an external
URL and with fewer than 5 tokens are filtered out.</p>
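      <p>The pre-processing steps can be sketched as follows (a simplified sketch: the external-URL expansion step is omitted, and the tweet is assumed to be a dict carrying the lang and text attributes of the Twitter status object):</p>

```python
import re

def preprocess(tweet):
    """Clean one tweet object: keep English tweets only, drop non-ASCII
    characters, and filter short tweets that carry no external URL.
    Returns the cleaned text, or None if the tweet is filtered out."""
    if tweet.get("lang") != "en":               # language attribute filter
        return None
    text = tweet.get("text", "")
    text = text.encode("ascii", "ignore").decode("ascii")   # drop non-ASCII
    has_url = bool(re.search(r"https?://\S+", text))
    text = re.sub(r"https?://\S+", "", text).strip()
    if not has_url and 5 > len(text.split()):   # short tweet without URL
        return None
    return text
```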
    </sec>
    <sec id="sec-7">
      <title>Relevance Score</title>
      <p>To retrieve relevant tweets for a given interest profile, we have
implemented language models with Jelinek-Mercer and Dirichlet
smoothing, with parameters λ and µ respectively. In addition to this, we have
also used the Okapi BM25 ranking model to rank tweets.</p>
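      <p>A minimal sketch of the two smoothed query-likelihood scores, following the standard Jelinek-Mercer and Dirichlet formulas (an illustration of the scoring functions, not the exact implementation used in the experiments):</p>

```python
import math
from collections import Counter

def jm_score(query, doc_tokens, coll_tf, coll_len, lam=0.1):
    """Query log-likelihood with Jelinek-Mercer smoothing:
    p(w|D) = (1 - lam) * p_ml(w|D) + lam * p(w|C)."""
    tf = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for w in query:
        p_coll = coll_tf.get(w, 0) / coll_len
        p_doc = tf[w] / doc_len if doc_len else 0.0
        p = (1 - lam) * p_doc + lam * p_coll
        if p > 0:
            score += math.log(p)
    return score

def dirichlet_score(query, doc_tokens, coll_tf, coll_len, mu=1000):
    """Query log-likelihood with Dirichlet smoothing:
    p(w|D) = (tf(w, D) + mu * p(w|C)) / (|D| + mu)."""
    tf = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for w in query:
        p_coll = coll_tf.get(w, 0) / coll_len
        p = (tf[w] + mu * p_coll) / (doc_len + mu)
        if p > 0:
            score += math.log(p)
    return score
```

      <p>Here coll_tf and coll_len are the term frequencies and token count of the background collection, used to smooth the short, sparse tweet "documents".</p>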
    </sec>
    <sec id="sec-8">
      <title>Summarization Method</title>
      <p>To select the top relevant and novel tweets, we have designed a two-level
threshold mechanism. At the first level, for any interest profile
on any day, if all the tweets ranked under the profile have scores
less than the silent-day threshold Ts, we consider the day a silent day and
do not include any tweet in that interest profile's summary;
Ts is set empirically using grid search. In the other
case, where some tweet scores are greater than Ts, we normalize the
tweet scores: we assign the value 1 to the tweet with the highest score and
assign relative values in the range of 0 to 1 to the other tweets. We
include in our candidate list all tweets whose normalized score exceeds Tr2
and whose actual score exceeds Tr1,
and extract the top k tweets. The second-level relevance thresholds
Tr1 and Tr2 are also selected empirically using grid search.</p>
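      <p>The two-level threshold mechanism can be sketched as follows; for brevity the two relevance thresholds are collapsed into a single normalized-score threshold, and the threshold values passed in are illustrative, not the tuned ones:</p>

```python
def daily_summary(scored, ts, tr, k=100):
    """Two-level threshold selection for one interest profile on one day
    (illustrative sketch).  `scored` is a list of (tweet_id, score) pairs."""
    if not scored:
        return []                       # nothing retrieved: silent day
    top = max(score for _, score in scored)
    if ts > top:
        return []                       # every score under Ts: silent day
    # normalize so the best tweet gets 1 and the rest fall in [0, 1]
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [t for t, score in ranked if score / top >= tr][:k]
```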
      <p>5.4.1 Novelty Detection using Tweet Clusters. In this study, the
Microblog or tweet summarization problem is cast as a tweet
clustering problem. Once all the relevant tweets are retrieved,
clusters are formed using the Jaccard similarity of the tweets' text. Tweets
with an external URL, or with a temporal feature in the text, are
given priority, because such tweets are more informative than
tweets with only text and no external URL. We use regular
expressions to extract temporal expressions from the tweet text.</p>
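      <p>The Jaccard-based clustering step can be sketched as a greedy single-pass procedure (an illustrative sketch: the similarity threshold shown is hypothetical, and the URL/temporal prioritization is omitted):</p>

```python
def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a.union(b)
    if not union:
        return 0.0
    return len(a.intersection(b)) / len(union)

def cluster_tweets(texts, threshold=0.4):
    """Greedy single-pass clustering: each tweet joins the first cluster
    whose representative (its first tweet) is similar enough, otherwise
    it starts a new cluster.  The threshold value is illustrative."""
    clusters = []                # list of (representative_tokens, members)
    for text in texts:
        tokens = set(text.lower().split())
        for rep_tokens, members in clusters:
            if jaccard(tokens, rep_tokens) >= threshold:
                members.append(text)
                break
        else:
            clusters.append((tokens, [text]))
    return [members for _, members in clusters]
```

      <p>One tweet per cluster can then be emitted into the summary, so near-duplicate reports of the same event are suppressed.</p>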
    </sec>
    <sec id="sec-9">
      <title>RESULTS</title>
      <p>
        To evaluate the performance of the system, the normalized discounted
cumulative gain, nDCG@10, is computed for each day and each
interest profile and averaged across them [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. There are two variants,
namely nDCG@10-1 and nDCG@10-0. In nDCG@10-1 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], on a silent
day, the system receives a score of 1 if it does not include any tweet in the
summary for the particular interest profile, and 0 otherwise.
In nDCG@10-0, however, the system receives zero gain on a silent day
irrespective of what is produced [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Our goal is to maximize the
values of nDCG@10-0 and nDCG@10-1 jointly, which gives a wider
picture, by tuning the parameters λ and Ts in the case of the language model
with JM smoothing, and µ and Ts in the case of Dirichlet smoothing.
      </p>
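      <p>The silent-day behaviour of the two metrics can be sketched as follows (a simplified illustration of the scoring rules described above, not the official evaluation script):</p>

```python
import math

def ndcg_at_10(gains, ideal_gains, silent_day, emitted, variant=1):
    """How the two official metrics treat a silent day (simplified sketch).
    nDCG@10-1 scores 1.0 for staying silent on a silent day and 0.0 for
    emitting anything; nDCG@10-0 scores 0.0 on a silent day regardless.
    On an eventful day both reduce to ordinary nDCG over the top 10."""
    if silent_day:
        if variant == 1 and not emitted:
            return 1.0
        return 0.0
    def dcg(gs):
        return sum(g / math.log2(rank + 2) for rank, g in enumerate(gs[:10]))
    ideal = dcg(sorted(ideal_gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0
```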
      <p>
        While analyzing the evaluation metrics nDCG@10-1 and
nDCG@10-0 on TREC RTS 2016 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], our system failed on some of the
interest profiles, such as RTS37 (Sea World), MB265 (cruise ship mishaps),
and MB365 (cellphone tracking), where we detected some of the silent
days and so obtained some score on the nDCG@10-1 metric but
did not score on the nDCG@10-0 metric. This is why we look at
both metrics while evaluating our system. The TREC RTS 2016
[
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] organizers considered nDCG@10-1, which adds gain on
silent as well as eventful days, as the primary metric for ranking the
teams. However, nDCG@10-0, which reflects how many relevant and
novel tweets are part of the daily summary and does not add gain
on silent days, is also very important. In our analysis of the
TREC RTS 2016 results [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], it was observed that an empty run, i.e. a blank file
with zero tweets, scored nDCG@10-1 = 0.2339, which is more than the
average score of all the teams, so nDCG@10-1 alone is not a very
accurate measure for judging systems. The COMP2016 team [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] received scores nDCG@10-1 = 0.2898
and nDCG@10-0 = 0.0684, which shows that 76 percent of the
nDCG@10-1 score obtained by that system came from remaining silent. In
this experiment, we have tried to tune parameters which maximize
nDCG@10-1 and nDCG@10-0 jointly. We believe that nDCG@10-0 is a very
important metric, indicating how many relevant and novel
tweets were included in the summary. We report our best result,
nDCG@10-1 = 0.3524 and nDCG@10-0 = 0.1131, which, without any
sort of query expansion, substantially outperforms the top team [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] in
TREC RTS 2016[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The improvement in nDCG@10-0 shows that we have
included more relevant tweets in the interest profile summaries, which is
a meaningful improvement.
      </p>
      <p>
        Table 2 shows the system results with all the standard ranking algorithms.
The results show that all the ranking functions perform in line with
each other, though the Okapi BM25 model marginally
outperforms the language models. Our results for the language model with Dirichlet
smoothing and with JM smoothing outperform the results reported by [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
The factors behind this outperformance are that we chose the parameters
λ = 0.1 and µ = 1000 and used a two-level threshold mechanism; Suwaileh
et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] set λ = 0.7 and µ = 2000. Table 3 shows a 25 percent
improvement over the results reported by the top team of TREC RTS 2016
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Table 4 shows the system results on the TREC MB 2015 dataset [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
Here the thresholds were decided empirically, not through grid search.
      </p>
    </sec>
    <sec id="sec-10">
      <title>POST HOC ANALYSIS</title>
      <p>In this section, we discuss a comprehensive performance analysis
of the summarization system from various perspectives. Since a
massive dataset is used in the experiment, tweet selection, or tweet
filtering, is the primary task of the summarization system. Since
Twitter restricts the length of a tweet, tweet sparseness is the biggest
challenge for relevant tweet retrieval.</p>
      <p>
        Interest profiles consist of a 3-4 word title, a sentence-long
description, and a paragraph-length narrative explaining the detailed information
need [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The crucial part is how we generate a query from the triplet,
as shown in Table 1. Luchen et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] reported that title keywords
play a critical role in retrieval. Our experiments also support these
findings.
      </p>
      <p>The objective of the summarization system is to identify all the
clusters formed across the given period for all the interest profiles,
and not to include any tweet if the given day is silent for an
interest profile. The performance of the summarization system depends
upon two tasks: (i) relevant tweet retrieval and (ii) novelty detection
across the relevant tweets.</p>
    </sec>
    <sec id="sec-11">
      <title>Interest Profile characteristics</title>
      <p>During post hoc analysis, it was observed that interest profiles
have different characteristics. Some interest profiles carry a
spatial restriction, for example "bus service to NYC", "gay marriage
laws in Europe", and "job training for high school graduates US". For
other interest profiles no such spatial restriction applies, and the user's
information need spans the globe, e.g. "emerging music styles",
"adult summer camp", and "hidden icons in movies and television".</p>
      <p>Generalized interest profiles have many silent days, while interest
profiles with a spatial named entity have more relevant tweets. Named
entities play a very crucial role in relevant tweet retrieval. Some
interest profile titles do not include a named entity, so we extracted
named entities from the narrative field and included them in the query.
Interest profiles, or queries, which have no named entity as a query term
perform very badly on the result metrics, e.g. "emerging music styles".</p>
    </sec>
    <sec id="sec-12">
      <title>Named Entity Linking Problem</title>
      <p>Interest profiles sometimes contain a very general named entity,
e.g. "legalizing medical marijuana US", while a matching tweet contains
the named entity "Florida" ("Florida Medical Association to oppose
medical marijuana ballot amendment in Florida"). Due to this named entity
linking problem, relevant tweets score lower against the interest profile.</p>
    </sec>
    <sec id="sec-13">
      <title>Named Entity Normalization</title>
      <p>Due to the limit on tweet length, microblog users often write
named entities in abbreviated form, e.g. DEA (Drug Enforcement
Agency). Though we have query terms like "drug enforcement agency", we
cannot retrieve tweets containing only the abbreviated form of the
named entity.</p>
    </sec>
    <sec id="sec-14">
      <title>Clustering Issues</title>
      <p>Since tweet summarization is a multiple-document summarization
problem, each tweet, along with its external URL, is considered one
document. Since Twitter is a crowdsourcing platform, many users
report the same event with different facts, so our novelty detection
algorithm fails to place all of the following tweets in the same cluster.
T1: Woman Is Eaten Alive By A Tiger At A Safari Park
T2: Woman attacked by a tiger when she gets out of her car in a
safari
T3: Horror at Beijing Safari World as tigers attack women who
exited car, killing one, injuring another</p>
    </sec>
    <sec id="sec-15">
      <title>Inclusion of Conditional Events in Interest Profiles</title>
      <p>For interest profiles like "cancer and depression", our system
performs very badly. Here the user is looking for patients suffering from
depression after being diagnosed with cancer. It is very difficult to judge
the co-occurrence of both events in a tweet.</p>
    </sec>
    <sec id="sec-17">
      <title>Inclusion of Sentiment in Interest profile</title>
      <p>Interest profiles like "Restaurant Week NYC" include sentiment,
opinion, or recommendation. Some tweets which match the topic
but do not include the sentiment perspective are marked as
non-relevant. In future work we have to incorporate hidden features like
sentiment to increase the scores of such low-scoring tweets.</p>
    </sec>
    <sec id="sec-18">
      <title>Hash-tag Identification</title>
      <p>Hashtags can be one of the features for relevant tweet identification.
Identifying a relevant hashtag will increase the score of a relevant
tweet, e.g. for the keyword "sea world" the hashtag is #seaworld, and for
"self driving car" the relevant hashtag is #selfdrivingcar.</p>
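      <p>A minimal sketch of deriving and matching such hashtag forms (a simple heuristic, not a full hashtag-identification method):</p>

```python
def keyword_hashtag(keyword):
    """Collapse a multi-word keyword into its hashtag form, e.g.
    'sea world' becomes '#seaworld'."""
    return "#" + "".join(keyword.lower().split())

def has_keyword_hashtag(tweet_text, keyword):
    """Check whether a tweet contains the keyword's hashtag form; a match
    could be used to boost the tweet's relevance score."""
    return keyword_hashtag(keyword) in tweet_text.lower()
```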
    </sec>
    <sec id="sec-19">
      <title>Effect of Query Expansion</title>
      <p>It has been observed that for interest profiles without a proper named
entity, our system performs very badly in terms of the evaluation metrics
nDCG@10-1 and nDCG@10-0 in the majority of cases. We hypothesize that
query expansion might work positively for these interest profiles.
Our results show that query expansion for such topics improves
the nDCG@10-1 and nDCG@10-0 results. Query expansion can be done per
interest profile, or on a case-by-case basis.</p>
    </sec>
    <sec id="sec-20">
      <title>CONCLUSION</title>
      <p>
        In this paper, we presented a summarization system using language
models with JM smoothing and Dirichlet smoothing, and the Okapi BM25
model. The results show that all the ranking functions perform in line
with each other, though the Okapi BM25 model marginally
outperforms the language models. We performed grid search to
determine the optimal silent-day threshold Ts and relevance threshold Tr. We
also identified the smoothing parameter λ = 0.1 for the language model
with JM smoothing, and µ = 1500 in the case of Dirichlet smoothing,
for better results. We showed that by effectively choosing the parameters
λ and µ, we can outperform the results obtained by [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
    </sec>
    <sec id="sec-21">
      <title>CURRENT WORK</title>
      <p>The TREC RTS metrics give more emphasis to precision than to
recall. Query expansion may include non-relevant tweets in the
summary; it improves recall, but precision decreases substantially,
producing an adverse effect on the results. The relevance thresholds
are very critical for the summarization system's selection of
tweets for the day-wise, topic-wise summary. After careful
analysis of the TREC MB 2015 and TREC RTS 2016 datasets, we found
that non-relevant tweets score higher than relevant tweets on
many occasions. This motivates designing a machine learning
technique or deep neural network to estimate the silent-day threshold
Ts and relevance threshold Tr. At present, we are working on the
following hypotheses.</p>
      <p>H1: We can predict the thresholds for a new dataset (TREC RTS 2016)
from an old dataset (TREC MB 2015).</p>
      <p>Some of the interest profiles are common to both datasets. Based
upon this fact, we have designed the following hypothesis.</p>
      <p>H2: Irrespective of same or different topics, statistical
features of the ranked list can be exploited to predict the silent-day
threshold Ts and relevance threshold Tr.</p>
      <p>At present, we are working on a machine learning model for the
estimation of these thresholds for any dataset downloaded from Twitter.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Mossaab</given-names>
            <surname>Bagdouri</surname>
          </string-name>
          and Douglas W Oard.
          <year>2015</year>
          . CLIP at TREC 2015:
          <article-title>Microblog and LiveQA</article-title>
          .
          <source>In TREC.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><given-names>Luchen</given-names> <surname>Tan</surname></string-name>,
          <string-name><given-names>Richard</given-names> <surname>McCreadie</surname></string-name>,
          <string-name><given-names>Ellen</given-names> <surname>Voorhees</surname></string-name>,
          <string-name><given-names>Jimmy</given-names> <surname>Lin</surname></string-name>,
          <string-name><given-names>Adam</given-names> <surname>Roegiest</surname></string-name>, and
          <string-name><given-names>Fernando</given-names> <surname>Diaz</surname></string-name>.
          <year>2016</year>.
          <article-title>TREC RTS 2016 Guidelines</article-title>. http://trecrts.github.io
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><given-names>Yulu</given-names> <surname>Wang</surname></string-name>,
          <string-name><given-names>Garrick</given-names> <surname>Sherman</surname></string-name>,
          <string-name><given-names>Ellen</given-names> <surname>Voorhees</surname></string-name>,
          <string-name><given-names>Jimmy</given-names> <surname>Lin</surname></string-name>, and
          <string-name><given-names>Miles</given-names> <surname>Efron</surname></string-name>. [n. d.].
          <article-title>TREC 2015 Microblog Track: Real-Time Filtering Task Guidelines</article-title>.
          https://github.com/lintool/twitter-tools/wiki/TREC-2015-Track-Guidelines
        </mixed-citation>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>Haihui</given-names> <surname>Tan</surname></string-name>,
          <string-name><given-names>Dajun</given-names> <surname>Luo</surname></string-name>, and
          <string-name><given-names>Wenjie</given-names> <surname>Li</surname></string-name>. [n. d.].
          <article-title>PolyU at TREC 2016 Real-Time Summarization</article-title>. ([n. d.]).
        </mixed-citation>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Jimmy</given-names>
            <surname>Lin</surname>
          </string-name>
          , Adam Roegiest, Luchen Tan,
          <string-name>
            <given-names>Richard</given-names>
            <surname>McCreadie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ellen</given-names>
            <surname>Voorhees</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Fernando</given-names>
            <surname>Diaz</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Overview of the TREC 2016 real-time summarization track</article-title>
          .
          <source>In Proceedings of the 25th Text REtrieval Conference</source>
          , TREC, Vol.
          <volume>16</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Reem</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          , Maram Hasanain, and
          <string-name>
            <given-names>Tamer</given-names>
            <surname>Elsayed</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Light-weight, Conservative, yet Efective: Scalable Real-time Tweet Summarization.</article-title>
          .
          <source>In TREC.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Luchen</given-names>
            <surname>Tan</surname>
          </string-name>
          , Adam Roegiest, Charles LA Clarke, and Jimmy Lin
          .
          <year>2016</year>
          .
          <article-title>Simple dynamic emission strategies for microblog filtering</article-title>
          .
          <source>In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. ACM</source>
          ,
          <volume>1009</volume>
          -
          <fpage>1012</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Luchen</given-names>
            <surname>Tan</surname>
          </string-name>
          , Adam Roegiest, Jimmy Lin, and Charles LA Clarke
          .
          <year>2016</year>
          .
          <article-title>An exploration of evaluation metrics for mobile push notifications</article-title>
          .
          <source>In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. ACM</source>
          ,
          <volume>741</volume>
          -
          <fpage>744</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>