=Paper=
{{Paper
|id=Vol-2036/T2-1
|storemode=property
|title=Overview of the FIRE 2017 track: Information Retrieval from Microblogs during Disasters (IRMiDis)
|pdfUrl=https://ceur-ws.org/Vol-2036/T2-1.pdf
|volume=Vol-2036
|authors=Moumita Basu,Saptarshi Ghosh,Kripabandhu Ghosh,Monojit Choudhury
|dblpUrl=https://dblp.org/rec/conf/fire/BasuGGC17
}}
==Overview of the FIRE 2017 track: Information Retrieval from Microblogs during Disasters (IRMiDis)==
Moumita Basu (UEM Kolkata, India; IIEST Shibpur, India), Saptarshi Ghosh (IIT Kharagpur, India; IIEST Shibpur, India), Kripabandhu Ghosh (IIT Kanpur, India), Monojit Choudhury (Microsoft Research, India)

ABSTRACT

The FIRE 2017 Information Retrieval from Microblogs during Disasters (IRMiDis) track focused on the retrieval and matching of needs and availabilities of resources from microblogs posted on Twitter during disaster events. A dataset of around 67,000 microblogs (tweets) in English as well as in local languages such as Hindi and Nepali, posted during the Nepal earthquake in April 2015, was made available to the participants. There were two tasks. The first task (Task 1) was to retrieve tweets that inform about needs and availabilities of resources; these tweets are called need-tweets and availability-tweets. The second task (Task 2) was to match need-tweets with appropriate availability-tweets.

CCS CONCEPTS
• Information systems → Query reformulation

1 INTRODUCTION

A lot of important information is posted on online social media such as Twitter during disaster events like floods and earthquakes. However, this information is immersed in a large amount of conversational content, such as prayers and sympathy for the victims. Hence, automated methodologies are needed to extract the important information from the deluge of tweets posted during such an event [3]. In this track, we focused on two types of tweets that are very important for coordinating relief operations in a disaster situation:

(1) Need-tweets: tweets which inform about the need or requirement of some specific resource, such as food, water, medical aid, shelter, or mobile/Internet connectivity.
(2) Availability-tweets: tweets which inform about the availability of some specific resource. This class includes both tweets that inform about potential availability, such as resources being transported or dispatched to the disaster-struck area, and tweets that inform about actual availability in the disaster-struck area, such as food being distributed.

The track had two tasks, as described below.

Task 1: Identifying need-tweets and availability-tweets. Here the participants were asked to develop methodologies for identifying need-tweets and availability-tweets. Note that this task can be approached in different ways. It can be treated as a retrieval or search problem, where the two types of tweets are to be retrieved. Alternatively, it can be viewed as a classification problem, e.g., where tweets are classified into three classes: need-tweets, availability-tweets, and others.

Task 2: Matching need-tweets and availability-tweets. An availability-tweet is said to match a need-tweet if the availability-tweet informs about the availability of at least one resource whose need is indicated in the need-tweet. In this task, the participants were asked to develop methodologies for matching need-tweets with appropriate availability-tweets.

Table 1 shows some examples of need-tweets and availability-tweets from the dataset that was made available to the participants (described in the next section). Note that the dataset contains tweets not only in English but also in local languages such as Hindi and Nepali, as well as code-mixed tweets. The Hindi and Nepali examples below are given in English translation.

Examples of need-tweets:
• (Nepali) "Saddened by the news that no relief materials or rescue teams have reached Thansing VDC of Nuwakot district so far; the concerned [authorities] ..."
• (Hindi) "Shortage of medicines in Nepal, crowds of thousands at the airport - Aaj Tak #World [url]"
• "after 7days of earthquake! people are still crying, sleeping in rain, lack of food and water! hope it was dream but this all happens to us!"
• "Nepal earthquake: Homeless urgently need tents; Death toll above 5,200 Read More… [url]"

Examples of availability-tweets:
• "Nepal earthquake: Spiritual group sends relief materials to victims [url]"
• (Nepali) "In coordination with the Ministry of Health and WHO, about half a dozen film artists [are engaged] in medicine and food distribution and public-awareness programmes #earthquake #Nepalifilms"
• (Hindi) "RT @abpnewshindi: Food, water and blankets have been sent to Nepal by aircraft. S. Jaishankar #NepalEarthquake watch live [url]"
• "#grgadventure donating our tents and sleepig bags for victims of the #nepal #earthquake [url]"

Table 1: Examples of need-tweets and matching availability-tweets, posted during the 2015 Nepal earthquake
2 THE TEST COLLECTION

In this track, our objective was to develop a test collection containing code-mixed microblogs for evaluating:

• methodologies for extracting two specific types of actionable situational information – needs and availabilities of various types of resources (need-tweets and availability-tweets), and
• methodologies for matching need-tweets with availability-tweets.

In this section, we describe how the test collection for both tasks of the IRMiDis track was developed.

2.1 Tweet dataset

As part of the same track in FIRE 2016, we had released a collection of 50,018 English tweets related to the devastating earthquake that occurred in Nepal and parts of India on 25th April 2015 (https://en.wikipedia.org/wiki/April_2015_Nepal_earthquake) [2]. We also utilized this collection to evaluate several IR methodologies developed by ourselves and others [1, 2]. We re-use these tweets in the present track. Additionally, in the present track, we collected tweets in Hindi and Nepali (based on the language identification performed by Twitter itself) using the Twitter Search API [4] with the keyword 'नेपाल' (Nepal), posted during the same period as the English tweets. A total of about 90K tweets were collected, and after removing duplicates and near-duplicates as before [1, 2], we obtained a set of 16,903 tweets. Hence, a set of 66,921 tweets was obtained – containing 50,018 English tweets and 16,903 Hindi, Nepali or code-mixed tweets – which was used as the test collection for the track.

The data was ordered chronologically based on the timestamp assigned by Twitter, and released in two stages. At the start of the track, the chronologically earlier 20K tweets were released (training set), along with a sample of need-tweets and availability-tweets in these 20K tweets (development set). The participating teams were expected to use the training and development sets to formulate their methodologies. Next, about two weeks before the submission of results, the set of chronologically later 46K tweets was released (test set). The methodologies were evaluated based on their performance over the test set.
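The duplicate and near-duplicate removal above follows the procedure of [1, 2], which is not reproduced here. For illustration only, the following minimal Python sketch shows one common way to drop near-duplicate tweets: Jaccard similarity over word shingles with a fixed threshold. The shingle size and threshold value are assumptions for this sketch, not the values used for the track collection.

```python
import re

def shingles(text, n=3):
    """Lowercase, strip URLs/mentions, and return the set of word n-grams."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    words = re.findall(r"\w+", text)
    return {tuple(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def remove_near_duplicates(tweets, threshold=0.7):
    """Keep a tweet only if it is not too similar to any tweet kept so far."""
    kept, kept_shingles = [], []
    for tweet in tweets:
        sh = shingles(tweet)
        if all(jaccard(sh, prev) < threshold for prev in kept_shingles):
            kept.append(tweet)
            kept_shingles.append(sh)
    return kept

# Example: the retweet is dropped as a near-duplicate of the original tweet.
print(remove_near_duplicates([
    "Medical camp set up at Tundikhel, volunteers needed",
    "RT: Medical camp set up at Tundikhel, volunteers needed",
]))
```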
2.2 Developing the gold standard for retrieval

The gold standard for both tasks was generated through 'manual runs'. To develop the gold-standard set of need-tweets and availability-tweets, three human annotators with proficiency in English, Hindi and Nepali were involved. Additionally, each annotator was a regular user of Twitter and had previous experience of working with social media content posted during disasters. The gold-standard development involved the same three phases as described in [1, 2] – first, each annotator individually retrieved need-tweets and availability-tweets; then the annotators resolved conflicts through mutual discussion; and finally there was a pooling step over all the runs submitted to the track.

The number of need-tweets and availability-tweets present in the final gold standard for each of the three languages is reported in Table 2.

Topic for retrieval  | Hindi tweets | Nepali tweets | English tweets
Need-tweets          | 31           | 82            | 558
Availability-tweets  | 238          | 206           | 1326

Table 2: Summary of the gold standard used in IRMiDis

2.3 Developing the gold standard for matching

To develop the gold standard for matching, the same human annotators were involved. The annotators were asked to inspect the gold standard of need-tweets and availability-tweets, to manually find the set of need-tweets for which at least one matching availability-tweet exists, and to find the matching availability-tweets for each such need-tweet. Additionally, pooling was used over the participant runs to identify relevant matches which the annotators might not have found (a minimal sketch of this pooling step is given below).
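The pooling step mentioned above is the standard depth-k pooling used in IR evaluation; the pool depth used in the track is not stated here, so the following minimal Python sketch (with a hypothetical depth and hypothetical run and tweet identifiers) only illustrates the idea: the candidate need/availability pairs shown to the annotators are the union of the top-ranked pairs of every submitted run.

```python
def build_pool(runs, depth=50):
    """Pool the top-`depth` results from each submitted run.

    `runs` maps a run id to a ranked list of (need_tweet_id, availability_tweet_id)
    pairs; the returned set is the union of the top-`depth` pairs of every run,
    which is then judged by the annotators.
    """
    pool = set()
    for ranked_pairs in runs.values():
        pool.update(ranked_pairs[:depth])
    return pool

# Example with two tiny hypothetical runs.
runs = {
    "run_A": [("n1", "a3"), ("n1", "a7"), ("n2", "a5")],
    "run_B": [("n1", "a3"), ("n2", "a9")],
}
print(sorted(build_pool(runs, depth=2)))  # pairs to be judged by the annotators
```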
3 TASK 1: IDENTIFYING NEED-TWEETS AND AVAILABILITY-TWEETS

11 teams participated in Task 1 and 18 runs were submitted. A summary of the methodologies used by each team is given in the next sub-section.

3.1 Methodologies

We now summarize the methodologies adopted in the submitted runs.

• iitbhu_fmt17: This team participated from Indian Institute of Technology (BHU) Varanasi, India. It submitted the following two automatic (i.e., involving no manual step) runs. Both runs used the Google Translate API to convert the code-mixed tweets.
  – iitbhu_fmt17_task1_1: This run used Apache Lucene, an open-source Java-based text search engine library (https://lucene.apache.org/). The training data was indexed using the Standard Analyzer, and the frequency of each token in the training set was recorded. A query was generated as the disjunction of tokens whose frequency was greater than or equal to a threshold value. Tweets were categorized according to the score returned by the Lucene search engine.
  – iitbhu_fmt17_task1_2: This run treated the task as a classification task and used an SVM. Undersampling was employed, and a threshold of 0.2 on the score predicted by the SVM classifier was used to decide whether a tweet is relevant (an illustrative sketch of this kind of pipeline appears after this list).
• DataBros: This team participated from Indian Institute of Information Technology, Kalyani, India. It submitted one automatic run, described below:
  – iiests_IRMiDis_FIRE2017_1: A bag-of-words model was used with TfidfVectorizer to collect unigram and bigram features. The Recursive Feature Elimination (RFE) algorithm with a linear SVM was used to compute ranking weights for all features and sort the features according to the weight vectors. In addition, a Decision Tree classifier was applied to classify the data.
• Bits_Pilani_WiSoc: This team participated from Birla Institute of Technology and Science, Pilani, India. It submitted two automatic runs. Both runs first created word embeddings and then used the fastText classification algorithm to classify each tweet into its appropriate category. The fastText classifier was trained on the labelled data and the previously created word embeddings.
  – BITS_PILANI_RUN1: created word embeddings using the Skip-gram model.
  – BITS_PILANI_RUN2: created word embeddings using the CBOW model.
• Data Engineering Group: This team participated from Indraprastha Institute of Information Technology, Delhi, India. It submitted one automatic run, described as follows:
  – DataEngineeringGroup_1: This run used the Stanford CoreNLP library (https://nlp.stanford.edu/software/tagger.shtml) for POS tagging and lemma identification of all the words in the tweet set. Features were constructed from both the words present in the tweets and their POS tags. A Logistic Regression model was used for the classification task.
• DIA Lab - NITK: This team participated from National Institute of Technology Karnataka, India. It submitted one automatic run, described as follows:
  – daiict_irlab_1: This run used a Doc2vec model to transform tweets into embedding vectors of size 100. To handle the code-mixed tweets, ASCII transliterations of the Unicode text were used. The frequency of each token in a tweet was also used as a feature. These embeddings were the input to a multilayer perceptron (a feed-forward artificial neural network) for classification, and the w-Ranking Key algorithm was used to rank the tweets.
• FAST-NU: This team participated from FAST National University, Karachi Campus, Pakistan. It submitted one automatic run, described below:
  – NU_Team_run01: This run extracted textual features using tf-idf scores. All non-English tweets were translated into English using the Google Translate API. A logistic regression based classifier was used for classification.
• HLJIT2017-IRMIDIS: This team participated from Heilongjiang Institute of Technology, China. It submitted three automatic runs. The task was viewed as a classification task in all the runs, and feature selection was based on the logistic regression method.
  – HLJIT2017-IRMIDIS_task1_1: an SVM classifier with a linear kernel was used.
  – HLJIT2017-IRMIDIS_task1_2: an AdaBoost classifier was used.
  – HLJIT2017-IRMIDIS_task1_3: an SVM classifier with a nonlinear kernel was used.
• HLJIT2017-IRMIDIS_1: This team also participated from Heilongjiang Institute of Technology, China. It submitted three automatic runs. The task was viewed as a classification task in all the runs, using words as features.
  – HLJIT2017-IRMIDIS_1_task1_1: a LibSVM classifier was used.
  – HLJIT2017-IRMIDIS_1_task1_2: a LibSVM classifier was used.
  – HLJIT2017-IRMIDIS_1_task1_3: a Linear Regression model was used.
• Iwist-Group: This team participated from the University of Hildesheim, Germany. It submitted one automatic run, Iwist_task1_1, described as follows. A pole-based overlapping clustering algorithm was used to measure the degree of relevance of each tweet. For ranking the tweets, Euclidean distance was used as the similarity measure, and an object closer to a pole was ranked higher.
• Radboud_CLS Netherlands: This team participated from Radboud University, the Netherlands, and submitted the following two semi-automatic runs. Code-mixed tweets were preprocessed and translated to English using Google Translate.
  – Radboud_CLS_task1_1: A lexicon and a set of hand-crafted rules were used to tag the relevant n-grams, and class labels were then automatically assigned to the tagged output. The output was initially ranked using a combined score of the human-estimated confidence of the specific class label and the tag pattern. The final ranking was generated by ordering the tweets within these ranked sets according to their tweet IDs.
  – Radboud_CLS_task1_2: This run used the tool Relevancer for initial clustering of the tweets tagged as English or Hindi. The English clusters were annotated and used as training data for an SVM-based classifier.
• Amrita CEN 1: This team participated from Amrita School of Engineering, Coimbatore, India. It submitted one semi-automatic run, AU_NLP_1, described as follows. The training data was tokenized, and a classifier was trained using word counts as features. Cosine similarity was used for ranking the tweets.
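To make the best-performing family of approaches concrete, the following minimal Python sketch (using scikit-learn) illustrates the kind of pipeline described for iitbhu_fmt17_task1_2: bag-of-words features, random undersampling of the majority "other" class, a linear SVM, and a fixed decision threshold of 0.2 on the classifier score. The feature set, sampling scheme, toy data and library choices are our assumptions for illustration, not the team's actual code.

```python
import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

def undersample(texts, labels, seed=0):
    """Randomly drop majority-class examples so both classes are equally frequent."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    random.Random(seed).shuffle(neg)
    keep = pos + neg[:len(pos)]
    return [texts[i] for i in keep], [labels[i] for i in keep]

# Hypothetical toy training data: 1 = need/availability-tweet, 0 = other.
train_texts = ["need water and food in Sindhupalchok",
               "medical camp distributing medicines at Tundikhel",
               "praying for nepal", "so sad about the earthquake",
               "thoughts are with the victims", "terrible news today"]
train_labels = [1, 1, 0, 0, 0, 0]

texts, labels = undersample(train_texts, train_labels)
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(texts)
clf = LinearSVC().fit(X, labels)

# Classify unseen tweets: keep those whose decision score reaches the 0.2 threshold.
test_texts = ["urgent need of tents in Gorkha", "heartbreaking scenes everywhere"]
scores = clf.decision_function(vectorizer.transform(test_texts))
print([t for t, s in zip(test_texts, scores) if s >= 0.2])
```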
3.2 Evaluation Measures and Results

We now report the performance of the methodologies submitted to Task 1 of the FIRE 2017 IRMiDis track. We consider the following measures to evaluate performance: (i) Precision at 100 (Precision@100) – the fraction of the top-ranked 100 results that are actually relevant according to the gold standard, i.e., the fraction of the retrieved tweets that are actually need-tweets or availability-tweets; (ii) Recall at 1000 (Recall@1000) – the fraction of relevant tweets (according to the gold standard) that appear in the top 1000 retrieved tweets; and (iii) Mean Average Precision (MAP) over the full retrieved ranked list. A minimal code sketch of these measures is given at the end of this section.

Run Id                      | Type           | Precision@100 | Recall@1000 | MAP    | Method summary
iitbhu_fmt17_task1_2        | Automatic      | 0.7900        | 0.6160      | 0.4386 | SVM classifier, undersampling
iiests_IRMiDis_FIRE2017_1   | Automatic      | 0.7850        | 0.3542      | 0.2639 | TfidfVectorizer, LinearSVM, Decision Tree classifier
Bits_Pilani_1               | Automatic      | 0.6800        | 0.2983      | 0.2073 | POS tagging, word embeddings, Skip-gram model, fastText classifier
Bits_Pilani_2               | Automatic      | 0.7300        | 0.2634      | 0.1993 | POS tagging, word embeddings, CBOW model, fastText classifier
DataEngineeringGroup_1      | Automatic      | 0.5400        | 0.2896      | 0.1304 | POS tagging, lemma identification, Logistic Regression model
HLJIT2017-IRMIDIS_1_task1_3 | Automatic      | 0.6850        | 0.1662      | 0.1208 | Words as features, Linear Regression model
iitbhu_fmt17_task1_1        | Automatic      | 0.5600        | 0.1570      | 0.0906 | Query generated as disjunction of tokens above a frequency threshold, Apache Lucene
HLJIT2017-IRMIDIS_1_task1_2 | Automatic      | 0.3650        | 0.1176      | 0.0710 | Words as features, LibSVM classifier
HLJIT2017-IRMIDIS_task1_3   | Automatic      | 0.4450        | 0.1642      | 0.0687 | Logistic regression based feature selection, SVM with nonlinear kernel
DIA_Lab_NITK_task1_1        | Automatic      | 0.3850        | 0.1437      | 0.0681 | Doc2vec, multilayer perceptron, w-Ranking Key
HLJIT2017-IRMIDIS_task1_2   | Automatic      | 0.5500        | 0.1094      | 0.0633 | Logistic regression based feature selection, AdaBoost classifier
HLJIT2017-IRMIDIS_1_task1_1 | Automatic      | 0.3050        | 0.0636      | 0.0317 | Words as features, LibSVM classifier
Iwist_task1_1               | Automatic      | 0.0350        | 0.0916      | 0.0291 | POS tagging, cosine similarity, greedy search
HLJIT2017-IRMIDIS_task1_1   | Automatic      | 0.1250        | 0.1414      | 0.0286 | Logistic regression based feature selection, SVM with linear kernel
NU_Team_run01               | Automatic      | 0.0700        | 0.0478      | 0.0047 | tf-idf scores, logistic regression based classifier
Radboud_CLS_task1_1         | Semi-automatic | 0.7400        | 0.3731      | 0.2458 | Linguistic approach, tagged n-grams, automatically assigned class labels
Radboud_CLS_task1_2         | Semi-automatic | 0.5500        | 0.2189      | 0.1736 | Relevancer for initial clustering, SVM-based classifier, cosine similarity
AU_NLP_1                    | Semi-automatic | 0.0800        | 0.0645      | 0.0199 | Tokenization, word count as feature, classification, cosine similarity

Table 3: Comparison among all the submitted runs in Task 1 (identifying need-tweets and availability-tweets). Runs are ranked in decreasing order of MAP score within each type.

Table 3 reports the retrieval performance of all the submitted runs in Task 1. Each of the measures (Precision@100, Recall@1000, MAP) is reported as an average over the two topics, need-tweets and availability-tweets. As is evident from the scores in Table 3, classification-based approaches performed better than the other methodologies based on word embeddings or on search tools such as Apache Lucene.
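For completeness, here is a minimal, self-contained Python sketch of the three Task 1 measures as defined above (Precision@100, Recall@1000, and average precision over the full ranked list, averaged over the two topics to give MAP). This is an illustrative re-implementation from the definitions, with hypothetical toy identifiers, not the track's official evaluation script.

```python
def precision_at_k(ranked_ids, relevant_ids, k=100):
    """Fraction of the top-k retrieved tweets that are relevant."""
    return sum(1 for t in ranked_ids[:k] if t in relevant_ids) / k

def recall_at_k(ranked_ids, relevant_ids, k=1000):
    """Fraction of all relevant tweets that appear in the top-k retrieved tweets."""
    return len(set(ranked_ids[:k]) & relevant_ids) / len(relevant_ids)

def average_precision(ranked_ids, relevant_ids):
    """Average precision over the full retrieved ranked list."""
    hits, score = 0, 0.0
    for rank, tweet_id in enumerate(ranked_ids, start=1):
        if tweet_id in relevant_ids:
            hits += 1
            score += hits / rank
    return score / len(relevant_ids) if relevant_ids else 0.0

# MAP for a run = mean of average precision over the two topics
# (need-tweets and availability-tweets), using hypothetical toy data here.
run = {"need": ["t3", "t1", "t9"], "avail": ["t7", "t2"]}
gold = {"need": {"t1", "t9", "t4"}, "avail": {"t2"}}
map_score = sum(average_precision(run[t], gold[t]) for t in run) / len(run)
print(round(map_score, 4))
```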
4 TASK 2: MATCHING NEED-TWEETS AND AVAILABILITY-TWEETS

In Task 2, 5 teams participated and 10 runs were submitted. We first describe the runs, and then report the comparative evaluation.

4.1 Methodologies

We now describe the submitted runs.

• DataBros: This team participated from Indian Institute of Information Technology, Kalyani, India. It submitted one automatic run. This run used POS (part-of-speech) tagging, and the matching score was obtained from the number of overlapping common nouns between need-tweets and availability-tweets.
• Data Engineering Group: This team participated from Indraprastha Institute of Information Technology, Delhi, India. It submitted two automatic runs, described as follows. Both runs used the POS tags of nouns, and the similarity between need-tweets and availability-tweets was measured by cosine similarity, with a similarity threshold of 0.7 inferred on the basis of experimentation (an illustrative sketch of this similarity-based matching appears after this list).
  – In the first submitted run, a brute-force approach was followed in searching.
  – In the second submitted run, a greedy approach was followed: for each need-tweet, the search stopped as soon as it found the first five (or fewer) availability-tweets with a cosine similarity score greater than the 0.7 threshold.
• HLJIT2017-IRMIDIS: This team participated from Heilongjiang Institute of Technology, China. It submitted three automatic runs. The task was viewed as an IR task. All the runs used the open-source retrieval tool Indri with a Dirichlet-smoothed language model for retrieval and KL distance as the sorting model.
• HLJIT2017-IRMIDIS_1: This team also participated from Heilongjiang Institute of Technology, China. It submitted three automatic runs. The task was viewed as an IR task, with the need-tweets used as the query set and the availability-tweets used as the document collection. All the runs used the open-source retrieval tool Indri with a Dirichlet-smoothed language model to solve the matching problem; the three runs differ in their preprocessing steps.
• Radboud_CLS Netherlands: This team participated from Radboud University, the Netherlands, and submitted one semi-automatic run, Radboud_CLS_task1_1. This method used the tagged output obtained while processing the tweets for Task 1 using a linguistic approach. For every need-tweet, all the word n-grams tagged as identifying a resource were considered; the approach attempted to find an exact match in the availability-tweets and ranked the availability-tweets accordingly.
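To make the similarity-based matching approaches concrete, the following minimal Python sketch illustrates the general idea shared by the DataBros and Data Engineering Group runs: represent each tweet by its nouns, score a need/availability pair by cosine similarity of noun counts, and greedily keep up to five availability-tweets per need-tweet above a 0.7 threshold. For simplicity the nouns are hand-picked here rather than produced by a POS tagger; the data and vectorization choices are illustrative assumptions, not the teams' actual implementations.

```python
import math
from collections import Counter

def cosine(c1, c2):
    """Cosine similarity between two noun-count vectors (Counter objects)."""
    dot = sum(c1[w] * c2[w] for w in set(c1) & set(c2))
    norm = math.sqrt(sum(v * v for v in c1.values())) * math.sqrt(sum(v * v for v in c2.values()))
    return dot / norm if norm else 0.0

def greedy_match(need_nouns, avail_tweets, threshold=0.7, max_matches=5):
    """Return up to `max_matches` availability-tweet ids whose noun vectors
    have cosine similarity above `threshold` with the need-tweet."""
    need_vec = Counter(need_nouns)
    matches = []
    for tweet_id, nouns in avail_tweets:
        if cosine(need_vec, Counter(nouns)) > threshold:
            matches.append(tweet_id)
            if len(matches) == max_matches:
                break  # greedy: stop after the first five matches
    return matches

# Hypothetical nouns extracted from one need-tweet and three availability-tweets.
need = ["tents", "blankets", "Gorkha"]
avail = [("a1", ["tents", "blankets", "Kathmandu"]),
         ("a2", ["medicines", "doctors"]),
         ("a3", ["tents", "blankets", "Gorkha"])]
print(greedy_match(need, avail))
```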
4.2 Evaluation Measures and Results

The runs were evaluated against the gold standard generated by the manual runs. Additionally, the annotators (the same as those who developed the gold standard) checked many of the need-availability pairs matched by the methodologies (after pooling), and judged whether each match was correct.

We used the following IR measures to evaluate the runs; a minimal sketch of these measures follows the discussion below.

(i) Precision@5: Let n be the number of need-tweets correctly identified (i.e., present in the gold standard) by a particular matching methodology. For each need-tweet, we consider the top 5 matching availability-tweets returned by the method. The precision of the methodology is the fraction of pairs that are matched correctly (out of the 5 × n pairs).
(ii) Recall: The recall of matching is the fraction of all the need-tweets (present in the gold standard) which a methodology is able to match correctly.
(iii) F-Score: The F-score of a matching methodology is the harmonic mean of its precision and recall.

Team Id                 | Precision@5 | Recall | F-Score | Type           | Method summary
DataBros                | 0.2482      | 0.3888 | 0.3030  | Automatic      | POS tagging, common-noun overlap
Data Engineering Group  | 0.2081      | 0.2904 | 0.2424  | Automatic      | POS tagging, cosine similarity, brute-force search
Data Engineering Group  | 0.1758      | 0.3677 | 0.2379  | Automatic      | POS tagging, cosine similarity, greedy search
HLJIT2017-IRMIDIS       | 0.1819      | 0.1546 | 0.1671  | Automatic      | Indri, Dirichlet smoothing, KL distance sorting model
HLJIT2017-IRMIDIS       | 0.2033      | 0.1405 | 0.1662  | Automatic      | Indri, Dirichlet smoothing, KL distance sorting model
HLJIT2017-IRMIDIS       | 0.2051      | 0.0913 | 0.1264  | Automatic      | Indri, Dirichlet smoothing, KL distance sorting model
HLJIT2017-IRMIDIS_1     | 0.0882      | 0.2178 | 0.1256  | Automatic      | Indri, Dirichlet smoothing, correlation calculation
HLJIT2017-IRMIDIS_1     | 0.0825      | 0.1475 | 0.1058  | Automatic      | Indri, Dirichlet smoothing, correlation calculation
HLJIT2017-IRMIDIS_1     | 0.0889      | 0.0211 | 0.0341  | Automatic      | Indri, Dirichlet smoothing, correlation calculation
Radboud_CLS Netherlands | 0.3305      | 0.4450 | 0.3793  | Semi-automatic | n-grams, resource tagging

Table 4: Comparison among all the submitted runs in Task 2 (matching need-tweets and availability-tweets). Runs are ranked in decreasing order of F-score within each type.

Table 4 shows the evaluation results of each submitted run, along with a brief method summary. For each type, the runs are arranged in decreasing order of F-score. It is evident that the methods which computed the matching score from noun overlap or cosine similarity between need-tweets and availability-tweets (after POS tagging) outperformed the other methodologies.
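The following minimal Python sketch restates the three Task 2 measures from their definitions above; it operates on hypothetical identifiers and is an illustrative re-implementation, not the official evaluation script. In particular, "matched correctly" for recall is taken here to mean that at least one of the top 5 predicted availability-tweets for a need-tweet is correct, which is an interpretation of the definition above.

```python
def evaluate_matching(predicted, gold, k=5):
    """Precision@k, recall and F-score for need/availability matching.

    `predicted` maps each need-tweet id to its ranked list of availability-tweet ids;
    `gold` maps each gold-standard need-tweet id to the set of correct matches.
    """
    # Only need-tweets that are present in the gold standard are scored.
    scored = {need: pairs for need, pairs in predicted.items() if need in gold}
    n = len(scored)
    correct_pairs = sum(len(set(pairs[:k]) & gold[need]) for need, pairs in scored.items())
    precision = correct_pairs / (k * n) if n else 0.0
    # Assumption: a need-tweet counts as matched correctly if at least one predicted pair is correct.
    matched = sum(1 for need, pairs in scored.items() if set(pairs[:k]) & gold[need])
    recall = matched / len(gold) if gold else 0.0
    f_score = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f_score

# Hypothetical toy example.
gold = {"n1": {"a3", "a7"}, "n2": {"a5"}, "n3": {"a9"}}
predicted = {"n1": ["a3", "a2"], "n2": ["a8", "a5"]}
print(evaluate_matching(predicted, gold))
```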
5 CONCLUSION AND FUTURE DIRECTIONS

The FIRE 2017 IRMiDis track successfully created a benchmark collection of code-mixed microblogs posted during disaster events. The track also compared the performance of various methodologies in retrieving and matching two pertinent and actionable types of information, namely need-tweets and availability-tweets. We hope that the test collection developed in this track will help the research community in developing better models for retrieval and matching in the future.

In this year's track we considered a static collection of code-mixed microblogs. However, in reality, microblogs arrive in a continuous stream. The challenge can be extended to retrieving relevant microblogs dynamically from a live stream of microblogs. We plan to explore this direction in the coming years.

ACKNOWLEDGEMENTS

The track organizers thank all the participants for their interest in this track. We also thank the FIRE 2017 organizers for their support in organizing the track.

REFERENCES

[1] M. Basu, K. Ghosh, S. Das, R. Dey, S. Bandyopadhyay, and S. Ghosh. 2017. Identifying Post-Disaster Resource Needs and Availabilities from Microblogs. In Proc. ASONAM.
[2] M. Basu, A. Roy, K. Ghosh, S. Bandyopadhyay, and S. Ghosh. 2017. Microblog Retrieval in a Disaster Situation: A New Test Collection for Evaluation. In Proc. Workshop on Exploitation of Social Media for Emergency Relief and Preparedness (SMERP), co-located with the European Conference on Information Retrieval. 22–31. http://ceur-ws.org/Vol-1832/SMERP_2017_peer_review_paper_3.pdf
[3] M. Imran, C. Castillo, F. Diaz, and S. Vieweg. 2015. Processing Social Media Messages in Mass Emergency: A Survey. ACM Computing Surveys 47, 4 (June 2015), 67:1–67:38.
[4] Twitter Search API. 2017. https://dev.twitter.com/rest/public/search