HLJIT2017-IRMIDIS@IRMiDis-FIRE2017: Information Retrieval from Microblogs during Disasters

Zhao Zicheng, School of Computer Science and Technology, Harbin Engineering University, Harbin, China. zichengzhao888@gmail.com
Ning Hui, School of Computer Science and Technology, Harbin Engineering University, Harbin, China. ninghui@hrbeu.edu.cn
Zhuang Ziyao, Faculty of Science, Agriculture and Engineering, University of Newcastle upon Tyne, UK. zhuangziyao1@outlook.com
Zhao Jinmei, School of Continuing Education, Harbin University of Commerce, Harbin, China. zhaojinmei1@outlook.com
Li Jun, School of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin, China. lijun34667@outlook.com


ABSTRACT

This paper describes the work of HLJIT-IRMIDIS for the Information Retrieval from Microblogs during Disasters (IRMiDis) track. The track is divided into two sub-tasks. Task 1 is to identify need-tweets and availability-tweets posted during a disaster. Task 2 is to match need-tweets with availability-tweets. For Task 1, the identification of need-tweets and availability-tweets is formalized as a classification problem, and this paper presents a classification method for distinguishing the two. For Task 2, the matching of need-tweets and availability-tweets is formalized as a retrieval problem, and this paper proposes a matching method based on a language model. The evaluation shows the performance of our approach, which achieved 0.0687 MAP in Task 1 and 0.1671 F-Score in Task 2.

KEYWORDS

Information Retrieval, Microblogs during Disasters, tweets, classification

1   Introduction

Microblogging sites such as Twitter have become important sources of situational information during disaster events [2, 6]. However, identifying specific tweets and matching relevant tweets are challenging because microblog content is short, mixes different languages, and contains much interfering information. The FIRE 2017 Microblog track [1] is motivated by this scenario and aims to promote the development of information retrieval (IR) methods for identifying specific tweets from microblogs posted during disasters. The track is divided into two sub-tasks. Task 1 is called identifying need-tweets and availability-tweets: need-tweets inform about the need or requirement of some specific resource, while availability-tweets inform about the availability of some specific resources. Task 2 is called matching need-tweets and availability-tweets: for each need-tweet, the goal of the participants is to retrieve multiple matching availability-tweets.
    In this paper, Task 1 is treated as a classification problem. We selected three classifiers to solve it, AdaBoost [3], an SVM [4] with a linear kernel, and an SVM with a non-linear kernel, denoted AdaBoost (task1_2), SVM-L (task1_1), and SVM-NL (task1_3). For the classifier features, this paper presents a feature selection method based on logistic regression. Task 2 is treated as a retrieval problem: each need-tweet is used as a query, and a retrieval model retrieves the best-matching documents from the collection of availability-tweets. The evaluation scores of our best submissions, in terms of overall MAP and F-Score, were 0.0687 and 0.1671 respectively on the IRMiDis FIRE 2017 dataset.

2   Method of Task 1

Intuitively, Task 1 can be viewed as a binary classification problem. Formalizing Task 1 as tweet classification, our objectives focus on answering the following two questions: (1) which classification-based methods can effectively be applied to recognizing these tweets, and (2) which features should be used in the classifier.

2.1 Method Selection

A classification task is given by a dataset D = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)}, y_i in {0, 1}, where x_i is a feature vector and y_i is a class label. We submitted three runs to IRMiDis FIRE 2017, using the AdaBoost, SVM-L, and SVM-NL classifiers respectively to predict need-tweets and availability-tweets.
    For task1_1, we use the SVM-L classification model. The model classifies the data with a separating hyperplane; the distance from a positive sample point to the hyperplane is used as the ranking score.
    For task1_2, we use AdaBoost, a family of algorithms that boost weak learners into a strong learner. The working mechanism of the classifier is to first train a base learner on the initial training set and then adjust the training sample distribution according to the base learner's performance, so that samples misclassified by earlier learners receive more attention in later rounds; the next base learner is then trained on the adjusted sample distribution. A positive-class probability greater than 0.5 is used as the ranking score.
    For task1_3, we use SVM-NL. Its classification principle is to replace the inner product with a kernel function, implicitly mapping the samples into a high-dimensional space in which the positive and negative examples become linearly separable. At test time, the classifier produces a predicted probability for the positive class, which we use as the ranking score.
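    To make the three runs concrete, the following is a minimal sketch in scikit-learn using the parameter settings reported later in Table 2. The synthetic matrices X_train, y_train, and X_test stand in for the word-feature vectors of Section 2.2 and are illustrative assumptions, not the actual IRMiDis data.

    # Sketch of the three Task 1 runs (scikit-learn); parameters follow Table 2.
    import numpy as np
    from sklearn.svm import LinearSVC, SVC
    from sklearn.ensemble import AdaBoostClassifier

    rng = np.random.default_rng(0)
    X_train = rng.random((100, 20))          # stand-in for word-feature vectors
    y_train = rng.integers(0, 2, 100)        # 1 = positive class (e.g., need-tweet)
    X_test = rng.random((10, 20))

    # task1_1 (SVM-L): signed distance to the hyperplane is the ranking score.
    svm_l = LinearSVC(loss="squared_hinge", multi_class="ovr", penalty="l2", tol=1e-4)
    svm_l.fit(X_train, y_train)
    scores_svm_l = svm_l.decision_function(X_test)

    # task1_2 (AdaBoost): positive-class probability (> 0.5) is the ranking score.
    ada = AdaBoostClassifier(n_estimators=100, algorithm="SAMME.R", learning_rate=1.0)
    ada.fit(X_train, y_train)
    scores_ada = ada.predict_proba(X_test)[:, 1]

    # task1_3 (SVM-NL): RBF-kernel SVM; predicted positive probability ranks tweets.
    svm_nl = SVC(kernel="rbf", gamma="auto", probability=True)
    svm_nl.fit(X_train, y_train)
    scores_svm_nl = svm_nl.predict_proba(X_test)[:, 1]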
2.2 Feature Selection

In a content-based microblog filtering method, the factors that determine whether a microblog is a need-tweet or an availability-tweet are the features of the microblog. For content-based filtering methods, words are natural features. For the FIRE 2017 task, we applied a logistic regression model to select 1116 disaster-related words as microblog features. Feature words filter out noise words and also improve the classification efficiency of the classifier. In this paper, the weight of each feature in the feature library is updated by gradient descent, with an appropriately chosen learning rate (see Section 4.3). Table 1 shows the top-20 feature words.
                Table 1: Top-20 feature words

    No   Term         No   Term
    1    राहत         11   relief
    2    Anyone       12   planes
    3    Ambulance    13   meals
    4    NEA          14   Doctors
    5    supplying    15   Hospital
    6    medical      16   electricity
    7    send         17   packets
    8    Food         18   blood
    9    pitched      19   चीन
    10   emergency    20   भेजे
    By analyzing the selected keywords, we found that medical, Doctors, blood, Hospital, Ambulance, and similar words relate to medical information, while relief, electricity, Food, and meals relate to people's basic living supplies. The extracted words thus represent disaster-related information in the microblogs.
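    As an illustration of this feature-selection step, here is a minimal sketch that learns logistic-regression weights by gradient descent (learning rate 0.004, as in Section 4.3) and keeps the highest-weighted words. The two toy tweets and the top-20 cut-off are illustrative stand-ins for the IRMiDis training data and the paper's 1116-word vocabulary.

    # Sketch: select feature words by the magnitude of logistic-regression
    # weights learned with plain gradient descent.
    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["need food and medical help urgently",           # toy tweets
            "we are supplying relief packets and meals"]
    labels = np.array([1, 0])                                 # 1 = need, 0 = availability

    vec = CountVectorizer(binary=True)
    X = vec.fit_transform(docs).toarray()
    w = np.zeros(X.shape[1])

    for _ in range(1000):                                     # gradient descent
        p = 1.0 / (1.0 + np.exp(-(X @ w)))                    # sigmoid prediction
        w -= 0.004 * (X.T @ (p - labels))                     # learning rate = 0.004

    top = np.argsort(-np.abs(w))[:20]                         # paper keeps top 1116
    print(vec.get_feature_names_out()[top])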

3   Method of Task 2

According to the description of matching need-tweets and availability-tweets, we formalize the problem as a retrieval problem IR = (Q, D, F, R(q_i, d_i)), where Q is the set of need-tweets (queries), D is the set of availability-tweets (documents), F is the rule defining the relevance ranking model, and R(q_i, d_i) is the relevance of document d_i to query q_i. Here q_i and d_i are the need-tweets and availability-tweets predicted in Task 1. The open-source retrieval tool Indri (http://www.lemurproject.org/indri/) is used in Task 2. We use a language model with Dirichlet smoothing [5] and select the KL divergence as the ranking model. The KL-divergence ranking model and the Dirichlet-smoothed language model are defined as follows:

    KL(Q|D) = \sum_w P(w|Q) \log \frac{P(w|Q)}{P(w|D)}                    (1)

where Q is the query model, D is the document model, estimates of Q and D are computed from the corresponding tweets, and w ranges over all words in the vocabulary.

    P(w|D) = \frac{c(w,D) + \mu P_{ml}(w)}{|D| + \mu}                     (2)

where c(w,D) is the count of word w in D, P_ml(w) is the maximum-likelihood language model of the collection, and \mu is a smoothing parameter.
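    For intuition, the following is a minimal self-contained sketch of Eq. (1) and Eq. (2) over toy word counts. It mirrors the scoring that Indri performs internally rather than calling Indri itself, and the value mu = 2500 is an assumed default, not a tuned setting from the paper.

    # Score an availability-tweet D against a need-tweet query Q with
    # KL divergence (Eq. 1) under Dirichlet smoothing (Eq. 2).
    from collections import Counter
    import math

    def p_w_given_d(w, d_counts, p_ml, mu):
        # Eq. (2): (c(w,D) + mu * P_ml(w)) / (|D| + mu)
        return (d_counts[w] + mu * p_ml.get(w, 0.0)) / (sum(d_counts.values()) + mu)

    def kl_score(query, doc, collection, mu=2500.0):
        # Eq. (1): sum_w P(w|Q) log(P(w|Q) / P(w|D)); lower = better match.
        q_counts, d_counts = Counter(query), Counter(doc)
        total = sum(collection.values())
        p_ml = {w: c / total for w, c in collection.items()}  # collection model
        score = 0.0
        for w, c in q_counts.items():                         # query words assumed in collection
            p_wq = c / len(query)                             # ML estimate of P(w|Q)
            score += p_wq * math.log(p_wq / p_w_given_d(w, d_counts, p_ml, mu))
        return score

    need = "need blood donors at hospital".split()            # toy need-tweet
    avail = "blood donation camp open at hospital".split()    # toy availability-tweet
    print(kl_score(need, avail, Counter(need + avail)))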
4   Result

We begin this section by summarising the dataset, the performance measures, and the experimental settings, and then describe our experimental results.

4.1 Data

The dataset was provided to the shared-task participants by the organizers: 20,000 labelled training tweets and 50,000 test tweets, posted during the Nepal earthquake in April 2015.

4.2 Performance Measures

For Task 1, the evaluation measure is Mean Average Precision (MAP) over the retrieved ranked list. For Task 2, the evaluation measure is the F-Score, defined as F-Score = 2 * Precision@5 * Recall / (Precision@5 + Recall), where Precision@5 measures, for each correctly identified need-tweet, the precision of its top-5 matched availability-tweets, and Recall measures what fraction of all need-tweets are correctly matched by at least one availability-tweet.
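Under our reading of this definition, the metric can be computed as in the short sketch below; the dictionary-based input format and the toy run are illustrative assumptions, not the official evaluation script.

    # Task 2 F-Score sketch: Precision@5 averaged over submitted need-tweets;
    # Recall = fraction of need-tweets matched by >= 1 correct availability-tweet.
    def task2_fscore(top5, gold):
        # top5: need-tweet id -> list of up to 5 predicted availability-tweet ids
        # gold: need-tweet id -> set of truly matching availability-tweet ids
        prec = sum(len(set(p) & gold[q]) / 5.0 for q, p in top5.items()) / len(top5)
        rec = sum(bool(set(p) & gold[q]) for q, p in top5.items()) / len(gold)
        return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

    print(task2_fscore({"n1": ["a1", "a2"]}, {"n1": {"a2"}, "n2": {"a3"}}))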
4.3 Experimental Settings

Pre-processing: we remove punctuation, URLs, and mentions. Parameter of feature selection: learning rate = 0.004. Parameter settings of the classifiers: the parameters of each classifier are shown in Table 2.

                Table 2: Parameter Settings

    Method     Parameter
    SVM-L      kernel=linear, loss=squared_hinge, multi_class=ovr, penalty=l2, tol=0.0001
    AdaBoost   n_estimators=100, algorithm=SAMME.R, learning_rate=1.0
    SVM-NL     kernel=rbf, gamma=auto, probability=true, class_weight=12

4.4 Result of Task 1

Table 3 shows the experimental results of Task 1.

                Table 3: Results of Task 1

                                Availability-Tweets Evaluation         Need-Tweets Evaluation
    No  Run ID                  Precision@100  Recall@1000  MAP        Precision@100  Recall@1000  MAP        Overall MAP
    1   HLJIT-IRMIDIS_task1_3   0.5400         0.1878       0.0905     0.3500         0.1405       0.0468     0.0687
    2   HLJIT-IRMIDIS_task1_2   0.7100         0.1276       0.0798     0.3900         0.0913       0.0468     0.0633
    3   HLJIT-IRMIDIS_task1_1   0.2300         0.1633       0.0493     0.0200         0.1194       0.0079     0.0286

    From the experimental results, we can see that Run 2 achieves a higher Precision@100 than the others. For Run 2 we submitted only 73 need-tweets and 216 availability-tweets, so its Recall@1000 is the lowest. More generally, too many negative examples in the training data may explain why Recall@1000 is low for all three runs.
4.5 Result of Task 2

Table 4 shows the experimental results of Task 2.

                Table 4: Results of Task 2

    Run ID                  Precision@5  Recall   F-Score
    HLJIT-IRMIDIS_task2_1   0.1819       0.1546   0.1671
    HLJIT-IRMIDIS_task2_3   0.2033       0.1405   0.1662
    HLJIT-IRMIDIS_task2_2   0.2051       0.0913   0.1264

    From the experimental results, we can see that the results of Run 1 and Run 3 are similar in F-Score.
5   Conclusion and Further Work

We have described our approach to both tasks of the IRMiDis FIRE 2017 track. The evaluation shows the performance of our approach, which achieved 0.0687 MAP in Task 1 and 0.1671 F-Score in Task 2. As future work, we would like to explore deep learning for text matching and information retrieval over tweets, as well as new filtering techniques and parameter settings for handling informally written documents such as tweets.

Acknowledgments

This work is supported by the Social Science Fund of Heilongjiang Province of China (No. 16XWB02).

References

[1] M. Basu, S. Ghosh, K. Ghosh and M. Choudhury. Overview of the FIRE 2017 track: Information Retrieval from Microblogs during Disasters (IRMiDis). In Working Notes of FIRE 2017 - Forum for Information Retrieval Evaluation, Bangalore, India, December 8-10, 2017, CEUR Workshop Proceedings. CEUR-WS.org, 2017.
[2] M. Imran, C. Castillo, F. Diaz, et al. Processing social media messages in mass emergency: A survey. ACM Computing Surveys (CSUR), 2015, 47(4): 67.
[3] G. Rätsch, T. Onoda and K. R. Müller. Soft margins for AdaBoost. Machine Learning, 2001, 42(3): 287-320.
[4] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 1995, 20(3): 273-297.
[5] D. J. C. MacKay and L. C. B. Peto. A hierarchical Dirichlet language model. Natural Language Engineering, 1995, 1(3): 289-308.
[6] S. Vieweg, A. L. Hughes, K. Starbird, et al. Microblogging during two natural hazards events: what Twitter may contribute to situational awareness. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2010: 1079-1088.