<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Microblog Retrieval for Disaster Relief: How To Create Ground Truths?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <email>ribhav.soni.cse@iitbhu.ac.in</email>
        </contrib>
        <contrib contrib-type="author">
          <email>spal.cse@iitbhu.ac.in</email>
        </contrib>
        <aff>IIT (BHU) Varanasi</aff>
      </contrib-group>
      <abstract>
        <p>Microblogging services like Twitter are an important source of real-time information during disasters and can be utilized to aid rescue, relief and rehabilitation efforts. The focus of this work is on the creation of gold standard data for automatic retrieval of helpful tweets. Using various experiments on the gold standard data prepared in the FIRE 2016 Microblog Track [3], we show that the gold standard data prepared in [3] missed many relevant tweets. We also demonstrate that a machine learning model can help in retrieving the remaining relevant tweets, by training an SVM model on a subset of the data and using it to get the most useful tweets in the entire dataset. We obtain high precision and recall even with very little training data, which makes such a model suitable for use in a real-time disaster situation.</p>
      </abstract>
      <kwd-group>
        <kwd>Crisis Informatics</kwd>
        <kwd>Disaster</kwd>
        <kwd>Emergency</kwd>
        <kwd>Hazards</kwd>
        <kwd>Microblog Retrieval</kwd>
        <kwd>Social Media</kwd>
        <kwd>Text Categorization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Social media is a very useful resource for obtaining real-time information during
disasters. Traditional media like television, newspaper, etc. have limited use for
aiding in disaster relief due to their slow updates, and may even be unavailable
due to the disaster event. In such situations, social media presents valuable
information to aid in disaster relief and rehabilitation with very little time overhead
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Twitter in particular is especially suited for extracting details and first-hand
accounts within moments of an event, anywhere in the world [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and can thus be
exploited for help in relief work. However, it also involves the challenge of filtering
out information about the crisis situation that is not useful for relief efforts,
including tweets expressing shock, condolences, opinion, etc. Some tweets that
are not useful for disaster relief efforts are shown in Table 1.
      </p>
      <p>
        The FIRE 2016 Microblog Track [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] focused on comparing different IR
methodologies for retrieval in such scenarios, and led to the creation of a benchmark
collection of ground truth data for such tasks. However, based on our experiments,
we argue that the ground truth annotation exercise missed up to four times as
many tweets as were found. This represents a significant loss of information that
could potentially be very useful in a disaster situation. Also, since the accuracy of
gold standard data is crucial for evaluation and comparison of retrieval systems,
an incomplete gold standard may lead to weaker systems being ranked above better systems.
      </p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <caption><p>Examples of tweets that are not useful for disaster relief efforts.</p></caption>
        <table>
          <thead>
            <tr><th>Tweet Text</th></tr>
          </thead>
          <tbody>
            <tr><td>RT @tarsem insan:,@Gurmeetramrahim Guru ji #MSGHelpEarthquakeVictims I m also Shocked!!!,hearing #earthquake #MSGHelpEarthquakeVictims</td></tr>
            <tr><td>RT @vrinda 90:,really sad to hear about d earthquake. praying for all the ppl who suffered,&amp; lost their loved ones. hope they get all the h</td></tr>
            <tr><td>The Government is,so quick to help earthquake victims but why are they so reluctant to our own,farmers needs?</td></tr>
            <tr><td>Haven't studied anything coz of earthquake and have to go for exam.</td></tr>
            <tr><td>RT @guthali2:,Imagine Kejriwal were the PM in Nepal Earthquake situation, " Hum kuch,nai kar sakte hai jee, army president ke neeche hai".</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>
        First, we manually labeled a small, random subset of the data and found
that many relevant tweets were missing from the gold standard in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. We then
proceeded to train an SVM model on a subset of the data, and used it to retrieve
the 100 tweets with the highest confidence scores from the trained model. We found that,
averaged across all topics, fewer than half of the relevant tweets among those
had been identified in the gold standard in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        We also performed bootstrapping on the labeled random subset to estimate
the number of relevant tweets in the entire collection, and obtained an estimate of about 5
times the number of relevant tweets in the gold standard in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Also, we trained our
SVM model on small fractions of the training data, and obtained high precision
and recall even with very little training data, which shows that such a model
can be used effectively in disaster situations with very low time overhead.
      </p>
      <p>The rest of this paper is organized as follows. Section 2 describes the data used,
Section 3 presents our experiments and results, and Section 4 contains discussion and future
work.</p>
    </sec>
    <sec id="sec-2">
      <title>Data</title>
      <p>
        We used the dataset provided by the organizers of the FIRE 2016 Microblog
Track [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The data was a collection of 50,068 tweets posted during the
earthquake in Nepal in 2015.<sup>1</sup>
      </p>
      <p>
        Organizations involved in relief work during disasters need specific, actionable
information to help in the relief efforts. Thus, a set of seven specific information
needs was identified by the authors in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] after consulting members of such
organizations.
      </p>
      <p>
        The task in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] involved retrieving tweets relevant to each of these seven
information needs, expressed as topics in TREC format. The seven topics are
listed in Table 2.
      </p>
      <p><sup>1</sup> <ext-link ext-link-type="uri" xlink:href="https://en.wikipedia.org/wiki/April_2015_Nepal_earthquake">https://en.wikipedia.org/wiki/April_2015_Nepal_earthquake</ext-link></p>
      <table-wrap id="tab-2">
        <label>Table 2</label>
        <caption><p>The seven topics (information needs), in TREC format.</p></caption>
        <preformat>&lt;num&gt;Number: FMT1
&lt;title&gt;What resources were available
&lt;desc&gt;Identify the messages which describe the availability of some resources.
&lt;narr&gt;A relevant message must mention the availability of some resource like food,
drinking water, shelter, clothes, blankets, human resources like volunteers, resources
to build or support infrastructure, like tents, water filter, power supply and so on.
Messages informing the availability of transport vehicles for assisting the resource
distribution process would also be relevant. However, generalized statements without
reference to any resource or messages asking for donation of money would not be relevant.

&lt;num&gt;Number: FMT2
&lt;title&gt;What resources were required
&lt;desc&gt;Identify the messages which describe the requirement or need of some resources.
&lt;narr&gt;A relevant message must mention the requirement / need of some resource like
food, water, shelter, clothes, blankets, human resources like volunteers, resources
to build or support infrastructure like tents, water filter, power supply, and so on.
A message informing the requirement of transport vehicles assisting resource
distribution process would also be relevant. However, generalized statements without
reference to any particular resource, or messages asking for donation of money would not be relevant.

&lt;num&gt;Number: FMT3
&lt;title&gt;What medical resources were available
&lt;desc&gt;Identify the messages which give some information about availability of
medicines and other medical resources.
&lt;narr&gt;A relevant message must mention the availability of some medical resource like
medicines, medical equipments, blood, supplementary food items (e.g., milk for
infants), human resources like doctors/staff and resources to build or support
medical infrastructure like tents, water filter, power supply, ambulance, etc.
Generalized statements without reference to medical resources would not be relevant.

&lt;num&gt;Number: FMT4
&lt;title&gt;What medical resources were required
&lt;desc&gt;Identify the messages which describe the requirement of some medicine or other medical
resources.
&lt;narr&gt;A relevant message must mention the requirement of some medical resource like
medicines, medical equipments, supplementary food items, blood, human resources like
doctors/staff and resources to build or support medical infrastructure like tents,
water filter, power supply, ambulance, etc. Generalized statements without reference
to medical resources would not be relevant.

&lt;num&gt;Number: FMT5
&lt;title&gt;What were the requirements / availability of resources at specific locations
&lt;desc&gt;Identify the messages which describe the requirement or availability of
resources at some particular geographical location.
&lt;narr&gt;A relevant message must mention both the requirement or availability of some
resource, (e.g., human resources like volunteers/medical staff, food, water, shelter,
medical resources, tents, power supply) as well as a particular geographical location.
Messages containing only the requirement / availability of some resource, without
mentioning a geographical location would not be relevant.

&lt;num&gt;Number: FMT6
&lt;title&gt;What were the activities of various NGOs / Government organizations
&lt;desc&gt;Identify the messages which describe on-ground activities of different NGOs
and Government organizations.
&lt;narr&gt;A relevant message must contain information about relief-related activities
of different NGOs and Government organizations in rescue and relief operation.
Messages that contain information about the volunteers visiting different
geographical locations would also be relevant. However, messages that do not contain
the name of any NGO / Government organization would not be relevant.

&lt;num&gt;Number: FMT7
&lt;title&gt;What infrastructure damage and restoration were being reported
&lt;desc&gt;Identify the messages which contain information related to infrastructure damage or
restoration.
&lt;narr&gt;A relevant message must mention the damage or restoration of some specific
infrastructure resources, such as structures (e.g., dams, houses, mobile tower),
communication infrastructure (e.g., roads, runways, railway), electricity, mobile or
Internet connectivity, etc. Generalized statements without reference to
infrastructure resources would not be relevant.</preformat>
      </table-wrap>
      <p>
        The gold standard preparation in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] involved three phases, which can be
briefly summarized as follows.
      </p>
      <p>1. Three annotators independently tried to search for relevant tweets using
intuitive keywords, after all tweets were indexed using Indri.</p>
      <p>2. All tweets identified by at least one of the three annotators in Phase 1
were considered, and their relevance annotation was finalized by mutual discussion
among the annotators.</p>
      <p>3. Standard pooling was employed, taking the top 30 results from each run
and deciding on their relevance.</p>
      <p>
        The initial collection by the authors of [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] consisted of about 100,000 tweets,
and the final dataset of 50,068 tweets was obtained by removing duplicate tweets
(tweets with similarity greater than a threshold). The collection still included
many tweets that were not duplicates but expressed almost the same information.
All such instances were classified as relevant in the annotation exercise.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Experiments and Results</title>
      <p>
        <bold>Exhaustive labeling on a small, random subset.</bold>
A set of 700 tweets was randomly chosen, and relevance was judged for each
tweet in the set separately for each of the seven topics. Within the random
sample, the number of relevant tweets identified in the gold standard in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and
those identified by exhaustive labeling are given in Table 3.
      </p>
      <p>
        As we can see, within the random sample, the number of relevant tweets
identified by our exhaustive annotation was about 5 times that identified in
the gold standard in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        <bold>Bootstrapping to estimate the number of relevant documents in
the entire collection.</bold>
After exhaustively labeling the random sample of 700 tweets, we used
bootstrapping [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] to estimate the number of relevant tweets in the whole
collection. Bootstrapping is a resampling method that involves random sampling with
replacement, so we generated 1000 resamples, each of 700 tweets, by drawing from our
labeled sample of 700 tweets with replacement. The number of relevant tweets in each
resample was computed and then averaged across all 1000 resamples.
The resulting number, divided by the sample size, was taken as
an estimate of the fraction of relevant tweets in the entire collection. We thus
estimated the number of relevant tweets in the collection of 50,068 tweets to be
about 7,520 (i.e., 15.02% of the tweets).
      </p>
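      <p>The resampling procedure described above can be sketched roughly as follows. This is only an illustrative sketch, not the code used for the experiments: the function name, the random seed, and the toy label counts (105 relevant out of 700) are assumptions made for the example.</p>
      <preformat><![CDATA[
```python
import random


def bootstrap_relevant_fraction(labels, n_resamples=1000, seed=0):
    """Average fraction of relevant items over bootstrap resamples.

    labels: list of 0/1 relevance judgments for the labeled sample.
    """
    rng = random.Random(seed)
    n = len(labels)
    fractions = []
    for _ in range(n_resamples):
        # Draw n judgments with replacement from the labeled sample.
        resample = rng.choices(labels, k=n)
        fractions.append(sum(resample) / n)
    return sum(fractions) / n_resamples


# Hypothetical labels (not the paper's data): 105 relevant out of 700.
sample_labels = [1] * 105 + [0] * 595
fraction = bootstrap_relevant_fraction(sample_labels)
# Scale the estimated fraction to the size of the full collection.
estimated_total = fraction * 50068
print(round(fraction, 4), round(estimated_total))
```
]]></preformat>
      <p>The spread of the per-resample fractions could additionally be used to attach an interval to the estimate, although only the average is used in this paper.</p>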
      <p>
        In contrast, only 1,565 relevant tweets (3.13% of the tweets) were
identified in the gold standard in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This represents a loss of about 6,000 useful
tweets missed by the annotators in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        <bold>Machine learning for automatic filtering of tweets.</bold>
We trained machine learning models for automatic classification of tweets into
topics, with the aim of automatically retrieving the most useful tweets that may
have been missed in the annotation exercise in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. As one tweet can be relevant
to multiple topics, we applied supervised machine learning models separately for
each topic, thus training a total of seven binary classifiers.
      </p>
      <p>
        We used Support Vector Machines (SVM) for our classification task, as they
have been found to be among the best models for text classification [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. We
used the implementation of LinearSVC (SVM with linear kernel) in the
scikit-learn machine-learning library [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p><bold>Training data.</bold> As seen in Table 3, we could identify at most 53 relevant
tweets for any one topic out of a sample of 700 tweets. Thus, the classification task
is highly skewed, with non-relevant tweets forming a large majority.</p>
      <p>To overcome the problems associated with such skewed classification, we
used undersampling, i.e., we balanced the training data by taking only as many
non-relevant tweets as we had relevant tweets.</p>
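      <p>A minimal sketch of this kind of undersampling is given below. It assumes that the non-relevant tweets to keep are chosen uniformly at random, which is not stated above; the function name and the toy tweets are hypothetical.</p>
      <preformat><![CDATA[
```python
import random


def undersample(examples, seed=0):
    """Balance a binary dataset by randomly dropping majority-class examples.

    examples: list of (text, label) pairs, label 1 = relevant, 0 = non-relevant.
    """
    rng = random.Random(seed)
    positives = [ex for ex in examples if ex[1] == 1]
    negatives = [ex for ex in examples if ex[1] == 0]
    # Keep only as many non-relevant examples as there are relevant ones.
    kept_negatives = rng.sample(negatives, min(len(positives), len(negatives)))
    balanced = positives + kept_negatives
    rng.shuffle(balanced)
    return balanced


# Toy data (hypothetical): 2 relevant and 10 non-relevant tweets.
toy = [("need tents in Gorkha", 1), ("water available at camp", 1)]
toy += [("so sad about the earthquake %d" % i, 0) for i in range(10)]
balanced = undersample(toy)
```
]]></preformat>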
      <p>
        Besides the positively labeled tweets that we labeled from our sample of 700
tweets, we also had the set of relevant gold standard tweets from [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] to use
for our machine learning task. Table 4 lists the final number of labeled tweets
that we used for each of the topics. (Our number of gold standard tweets is
slightly lower than in the original gold standard because we could not download
about 500 tweets of the original collection from Twitter, as those tweets
had been deleted in the meantime. Also, the numbers of relevant tweets from the
two sources, our manual labeling of the sample of 700 tweets and the gold standard
in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], do not add up exactly, because some tweets are common to both.)
      </p>
      <p>We applied minimal preprocessing on the tweets. The only operation that we
applied was the removal of hashtag symbols (retaining the attached text).</p>
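      <p>This preprocessing step could be written, for instance, as the short sketch below. The regular expression and the function name are assumptions made for illustration; the example tweet text is adapted from Table 1.</p>
      <preformat><![CDATA[
```python
import re


def strip_hashtag_symbols(tweet):
    """Drop the '#' in front of hashtags while keeping the attached text."""
    return re.sub(r"#(\w+)", r"\1", tweet)


cleaned = strip_hashtag_symbols(
    "Shocked!!! hearing #earthquake #MSGHelpEarthquakeVictims"
)
print(cleaned)
```
]]></preformat>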
      <p>We randomly divided the available training data into 70% for training and
30% for testing, for each topic.</p>
      <p><bold>Feature extraction.</bold> Scikit-learn's CountVectorizer was used to extract token
counts with a bag-of-words model. We experimented using (1) unigram features
only, and (2) both unigram and bigram features, and got better results using
unigram features only. We thus used only unigram features for all our remaining
experiments. Also, no stemming or stopword removal was done, and tokenization
of tweets was done by extracting words of at least 2 letters.</p>
      <p>Then, TfidfTransformer was used to convert the raw counts to tf-idf weights.
Thus, a bag-of-words model with unigram features of tf-idf weights was used.</p>
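      <p>For illustration, a per-topic classifier of this kind might be assembled in scikit-learn roughly as follows. This is a sketch under assumptions: library default hyperparameters are used because the settings are not reported here, the helper name is hypothetical, and the toy training tweets are invented for the example. CountVectorizer's default tokenizer keeps tokens of two or more alphanumeric characters and applies no stopword removal, which matches the description above.</p>
      <preformat><![CDATA[
```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


def build_topic_classifier():
    """Unigram bag-of-words -> tf-idf weights -> linear SVM (default settings)."""
    return Pipeline([
        ("counts", CountVectorizer(ngram_range=(1, 1))),  # unigrams only
        ("tfidf", TfidfTransformer()),
        ("svm", LinearSVC()),
    ])


# Invented toy data for one topic (1 = relevant, 0 = non-relevant).
train_texts = [
    "food packets available at relief camp",
    "drinking water and tents available in Kathmandu",
    "blankets and tents available for victims",
    "praying for the victims of the earthquake",
    "so sad to hear about the earthquake",
    "shocked by the news today",
]
train_labels = [1, 1, 1, 0, 0, 0]

clf = build_topic_classifier()
clf.fit(train_texts, train_labels)
prediction = clf.predict(["tents and blankets available at relief camp"])
```
]]></preformat>
      <p>One such pipeline would be trained per topic, giving seven independent binary classifiers.</p>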
      <p>Each experiment was carried out for 100 iterations with random partitions of
the data in each iteration to training (70%) and test sets (30%), and the average
of all performance metrics for the 100 iterations was taken.</p>
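      <p>The repeated random-split protocol can be sketched as below. The evaluation callback is a hypothetical placeholder for training a classifier on the training part and scoring it on the test part; the function name and the seed are likewise assumptions for the example.</p>
      <preformat><![CDATA[
```python
import random


def average_over_random_splits(examples, evaluate, n_iterations=100,
                               train_fraction=0.7, seed=0):
    """Average evaluate(train, test) over repeated random train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_iterations):
        shuffled = examples[:]
        rng.shuffle(shuffled)
        cut = int(round(train_fraction * len(shuffled)))
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append(evaluate(train, test))
    return sum(scores) / len(scores)


# Toy check with a placeholder metric: the share of examples in the test part.
data = list(range(10))
mean_share = average_over_random_splits(
    data, lambda train, test: len(test) / len(data)
)
```
]]></preformat>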
      <p><bold>Results.</bold> The performance of the classifiers on various metrics is shown
in Table 5. The precision-recall curve of the classifier for topic FMT1 is also
shown.</p>
      <fig id="fig-1">
        <caption><p>Precision-Recall curve for the SVM classifier for topic FMT1 (x-axis: Recall (%); y-axis: Precision (%)).</p></caption>
      </fig>
      <p><bold>Classification performance with number of examples.</bold>
We tested the performance of our classifiers when using only a fraction of the
available data. For each classifier and each given fraction of data, we randomly
took a subset of the usable data for 100 iterations, and took the average of the
performance scores for the classifier on the 100 iterations. The F1 scores of the
classifiers with varying fractions of the data are shown in Table 6.</p>
      <p><bold>Retrieving most relevant tweets in the entire collection.</bold>
We used the trained classifiers to retrieve the 100 most relevant tweets for each
topic in the entire dataset by taking the 100 tweets with the maximum confidence
scores of each classifier.</p>
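      <p>Selecting the highest-scoring tweets can be sketched as follows. The scores here are made-up numbers; in practice they might come from something like LinearSVC's decision_function applied to the whole collection, which is an assumption about how the confidence scores are obtained. The function name and tweet ids are hypothetical.</p>
      <preformat><![CDATA[
```python
import heapq


def top_k_by_confidence(tweet_ids, scores, k=100):
    """Return the k tweet ids with the highest classifier confidence scores."""
    ranked = heapq.nlargest(k, zip(scores, tweet_ids))
    return [tweet_id for _, tweet_id in ranked]


# Hypothetical ids and scores for four tweets.
ids = ["t1", "t2", "t3", "t4"]
scores = [0.2, 1.7, -0.4, 0.9]
top_two = top_k_by_confidence(ids, scores, k=2)
print(top_two)
```
]]></preformat>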
      <p>
        We manually checked the sets of 100 tweets corresponding to the seven topics
to determine how many of them were actually relevant, and how many of the
relevant ones were identified by the gold standard in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The results of this
exercise are shown in Table 7.
      </p>
      <p>
        2. Pooling works only when the number of participating systems is large, and
the systems are diverse. Unlike tracks on TREC, the number of participants in
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] was not large, and so the standard pooling employed in Phase 3 also failed to
find all relevant tweets. ([
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] studies the reliability of pooling, and concludes that
it is reliable if the depth of the pool is deep enough, i.e., many of the top results
from all systems are taken into account, which is true for TREC with a depth
of top 100 documents from each participating system, but taking only top 30
documents as was done in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] may not have been enough.)
      </p>
      <p>Since exhaustive annotation is not possible for the complete collection, to
find relevant tweets in the remaining collection, a machine learning model as
presented in this paper can be trained and used on the remaining data to retrieve
the tweets with the highest confidence scores, and then manual confirmation of
the relevance can be carried out for as many tweets as annotator time permits.</p>
      <p>Another approach could be to exhaustively annotate a small random subset
of the data, and then use keywords of the relevant-marked tweets to query into
the entire collection, to retrieve relevant tweets in the remaining collection. This
is one future possibility for us to experiment with.</p>
      <p>
        Some of the relevant tweets that were missed in the creation of gold standard
in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] are listed in Table 8.
      </p>
      <p>We were able to achieve reasonably high F1 scores for our classifiers even with
a training size of a few hundred examples (Table 6). This shows that automatic
text classification is a viable approach to extract useful information from tweets
during times of disasters, since a few hundred examples can easily be annotated
in a short amount of time. It may also be fruitful to train supervised machine
learning models in advance for different types of disaster situations, and use
them in times of disaster until newly annotated data is obtained.</p>
      <p>
        To improve on the machine learning model, some avenues to explore are:
        <list list-type="bullet">
          <list-item><p>using more features, including word embeddings, spatio-temporal features,
linguistic features (as used in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]), etc.</p></list-item>
          <list-item><p>employing better preprocessing techniques, like using Twitter-specific spelling
correction, expanding common Twitter abbreviations, better data cleaning,
etc.</p></list-item>
        </list>
      </p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>We thank the anonymous reviewers for their thorough comments.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <article-title>Internet becomes a lifeline in Nepal after earthquake</article-title>
          .
          <ext-link ext-link-type="uri" xlink:href="http://www.computerworld.com/article/2914641/internet/internet-becomes-a-lifeline-in-nepal-after-earthquake.html">http://www.computerworld.com/article/2914641/internet/internet-becomes-a-lifeline-in-nepal-after-earthquake.html</ext-link>
          , accessed: 2017-03-16
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Efron</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tibshirani</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          :
          <article-title>An introduction to the bootstrap</article-title>
          . CRC Press (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Overview of the FIRE 2016 Microblog track: Information extraction from microblogs posted during disasters</article-title>
          . Working notes of FIRE pp.
          <fpage>7</fpage>
          –
          <lpage>10</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Joachims</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Text categorization with support vector machines: Learning with many relevant features</article-title>
          .
          <source>In: European conference on machine learning</source>
          . pp.
          <fpage>137</fpage>
          –
          <lpage>142</lpage>
          . Springer (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Khan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baharudin</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>A review of machine learning algorithms for text-documents classification</article-title>
          .
          <source>Journal of Advances in Information Technology</source>
          <volume>1</volume>
          (
          <issue>1</issue>
          ),
          <fpage>4</fpage>
          –
          <lpage>20</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Mills</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raghav Rao</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Web 2.0 emergency applications: How useful can Twitter be for emergency response?</article-title>
          <source>Journal of Information Privacy and Security</source>
          <volume>5</volume>
          (
          <issue>3</issue>
          ),
          <fpage>3</fpage>
          –
          <lpage>26</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Pedregosa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thirion</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blondel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prettenhofer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiss</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dubourg</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vanderplas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cournapeau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brucher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perrot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duchesnay</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          ,
          <fpage>2825</fpage>
          –
          <lpage>2830</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Rudra</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ganguly</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goyal</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Extracting situational information from microblogs during disaster events: a classification-summarization approach</article-title>
          .
          <source>In: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management</source>
          . pp.
          <fpage>583</fpage>
          –
          <lpage>592</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Zobel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>How reliable are the results of large-scale information retrieval experiments?</article-title>
          <source>In: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval</source>
          . pp.
          <fpage>307</fpage>
          –
          <lpage>314</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>