         Part-of-Speech is (almost) enough: SAP Research &
         Innovation at the #Microposts2014 NEEL Challenge

Daniel Dahlmeier, Naveen Nandan, Wang Ting
SAP Research and Innovation
#14 CREATE, 1 Create Way, Singapore
d.dahlmeier@sap.com, naveen.nandan@sap.com, dean.wang@sap.com

ABSTRACT
This paper describes the submission of the SAP Research & Innovation team to the #Microposts2014 NEEL Challenge. We use a two-stage approach for named entity extraction and linking, based on conditional random fields and an ensemble of search APIs and rules, respectively. A surprising result of our work is that part-of-speech tags alone are almost sufficient for entity extraction. Our F1 scores for the combined extraction and linking task are 34.6% and 37.2% on a development and test split of the training set, respectively, and 37% on the test set.

Keywords
Conditional Random Field, Entity Extraction, DBpedia Linking

1. INTRODUCTION
The rise of social media platforms and microblogging services has led to an explosion in the amount of informal, user-generated content on the web. The task of the #Microposts2014 workshop NEEL challenge is named entity extraction and linking (NEEL) for microblogging texts [1]. Named entity extraction and linking is a challenging problem because tweets can contain almost any content, from serious news, to personal opinions, to sheer gibberish, and both extraction and linking have to deal with the inherent ambiguity of natural language.

In this paper, we describe the submission of the SAP Research & Innovation team. Our system breaks the task into two separate steps for extraction and linking. We use a conditional random field (CRF) model for entity extraction and an ensemble of search APIs and rules for entity linking. We describe our method and present experimental results based on the released training data. One surprising finding of our experiments is that part-of-speech tags alone perform almost as well as the best feature combinations for entity extraction.

2. METHOD
2.1 Extraction
We use a sequence tagging approach for entity extraction. In particular, we use a conditional random field (CRF), which is a discriminative, probabilistic model for sequence data with state-of-the-art performance [3]. A linear-chain CRF tries to estimate the conditional probability of a label sequence y given the observed features x, where each label y_t is conditioned on the previous label y_{t-1}. In our case, we use BIO CoNLL-style tags [5]. We do not differentiate between different entity classes for the BIO tags (e.g., 'B' instead of 'B-PERSON').
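To make the tag scheme concrete, the following minimal Python sketch shows how gold entity spans translate into these class-agnostic BIO tags; the example tweet, entity span, and helper name are ours, not from the paper.

# Minimal illustration of the class-agnostic BIO encoding described above
# (example tweet and tokenization are invented for illustration).
tokens = ["RT", "@user", ":", "Sean", "Hoare", "found", "dead", "#hackgate"]
entities = [(3, 5)]  # gold entity spans as (start, end) token offsets

def bio_tags(tokens, entities):
    """Label each token 'B' (begin), 'I' (inside), or 'O' (outside),
    without entity classes such as PERSON or LOCATION."""
    tags = ["O"] * len(tokens)
    for start, end in entities:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags

print(list(zip(tokens, bio_tags(tokens, entities))))
# [('RT', 'O'), ('@user', 'O'), (':', 'O'), ('Sean', 'B'), ('Hoare', 'I'), ...]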

The choice of appropriate features can have a significant impact on the model's performance. We have investigated a set of features that are commonly used for named entity extraction. Table 1 lists the features.

    Feature          Example
    words            Obamah
    words lower      obamah
    POS              ^
    title case       True
    upper case       False
    stripped words   obamah
    is number        False
    word cluster     -NONE-
    dbpedia          dbpedia.org/resource/Barack_Obama

Table 1: Examples of features for entity extraction.

The casing features, upper case and lower case, and the is number feature are implemented using simple regular expressions. The stripped words feature is the lowercased word with initial hashtags and @ characters removed. The DBpedia feature is annotated automatically using the DBpedia Spotlight web API (github.com/dbpedia-spotlight/dbpedia-spotlight) and acts as a type of gazetteer feature. For a label y_t at position t, we consider features x extracted at the current position t and the previous position t-1. We experimented with larger feature contexts, but they did not improve the result on the development set.
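The paper does not show its feature extraction code; the sketch below illustrates how such regular-expression features might look in Python, with function and key names of our choosing.

import re

# Sketch of the regular-expression style features from Table 1.
# POS, word cluster, and DBpedia features come from external tools
# (Tweet NLP tagger, Brown clusters, DBpedia Spotlight) and are omitted.
def features(word):
    return {
        "word": word,
        "word_lower": word.lower(),
        "title_case": word.istitle(),                       # "Obamah" -> True
        "upper_case": word.isupper(),                       # "NASA" -> True
        "is_number": bool(re.fullmatch(r"[0-9.,]+", word)),
        # lowercased word with leading '#'/'@' characters removed
        "stripped": re.sub(r"^[#@]+", "", word.lower()),
    }

print(features("#Obamah"))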




2.2 Linking
For the linking step, we explore different search APIs, such as Wikipedia search (github.com/goldsmith/Wikipedia), DBpedia Spotlight, and Google search, to retrieve the DBpedia resource for a mention. We begin by using the extracted entities individually as query terms for these search APIs. As ambiguity is a major concern for the linking task, for tweets where multiple entities are extracted, we use the combined entities as an additional query term. For example, for a tweet with the annotated entities Sean Hoare and phone hacking, Sean Hoare would map to a specific resource in DBpedia, but phone hacking could refer to more than one resource. By using the query term "phone hacking + Sean Hoare", we can help boost the rank of the resource "News International phone hacking scandal", so that the entity phone hacking maps to it instead of to a general article on "Phone Hacking". In our system, we make use of the web APIs for Wikipedia search and DBpedia Spotlight, together with some hand-written rules, to rank the returned resources. The result of the ranking step is then used to construct the DBpedia resource URL to which the entity is mapped.
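As an illustration of this combined-query idea, here is a minimal Python sketch built on the goldsmith/Wikipedia package cited above; the hand-written ranking rules are not specified in the paper, so the selection logic here (preferring combined-query results) is our simplification.

import wikipedia  # goldsmith/Wikipedia, the search package cited above

def link_entity(entity, context_entities=()):
    """Collect Wikipedia search candidates for the entity alone and,
    when the tweet has other entities, for a combined query; we sketch
    the ranking by simply preferring the combined-query results."""
    candidates = wikipedia.search(entity)
    for other in context_entities:
        # e.g. "phone hacking + Sean Hoare" boosts
        # "News International phone hacking scandal"
        candidates = wikipedia.search(f"{entity} + {other}") + candidates
    if not candidates:
        return None
    # Construct the DBpedia resource URL from the top-ranked page title.
    return "http://dbpedia.org/resource/" + candidates[0].replace(" ", "_")

print(link_entity("phone hacking", ["Sean Hoare"]))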
3. EXPERIMENTS AND RESULTS
In this section, we present experimental results of our method, based on the data released by the organizers.

3.1 Data sets
We split the provided data set into a training (first 60%), development (dev, next 20%), and test (dev-test, last 20%) set. We perform standard pre-processing steps: tokenization and POS tagging using the Tweet NLP toolkit [4], lookup of word cluster indicators for each token from the Brown clusters released by Turian et al. [6], and annotation of the tweets with the DBpedia Spotlight web API.
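A minimal sketch of this chronological 60/20/20 split, assuming the tweets are read in their released order (the file name is hypothetical):

# Sketch of the 60/20/20 split described above; `tweets` stands in for
# the provided training data in its original order.
with open("microposts2014_neel_train.tsv", encoding="utf-8") as f:
    tweets = f.readlines()

n = len(tweets)
train    = tweets[:int(0.6 * n)]               # first 60%
dev      = tweets[int(0.6 * n):int(0.8 * n)]   # next 20% (dev)
dev_test = tweets[int(0.8 * n):]               # last 20% (dev-test)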
3.2 Extraction
We train the CRF model on the training set, perform feature selection based on the dev set, and test the resulting model on the dev-test set. We evaluate the resulting models using precision, recall, and F1 score.
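For concreteness, entity-level precision, recall, and F1 can be computed over predicted and gold spans as in the following sketch; this is the standard formulation, not the challenge's official scorer.

# Entity-level precision/recall/F1 over (start, end) spans.
def prf1(gold_spans, pred_spans):
    gold, pred = set(gold_spans), set(pred_spans)
    tp = len(gold & pred)  # spans predicted exactly right
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

print(prf1({(3, 5), (7, 8)}, {(3, 5)}))  # (1.0, 0.5, 0.666...)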
In all experiments, we use the CRF++ implementation of conditional random fields (code.google.com/p/crfpp/) with default parameters. We found in initial experiments that the CRF parameters did not have a great effect on the final score.
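For illustration, a CRF++ feature template in this spirit might look as follows, assuming training data in CRF++'s column format with the POS tag in column 1 and the is number flag in column 2 (the column layout is our assumption, not from the paper):

# CRF++ feature template (sketch). %x[row,col] selects the feature in
# column `col` at row offset `row` from the current token.

# POS at the current position t and the previous position t-1
U00:%x[0,1]
U01:%x[-1,1]

# is number at positions t and t-1
U02:%x[0,2]
U03:%x[-1,2]

# bigram over adjacent output labels (the linear-chain y_{t-1} -> y_t edge)
B

Training and prediction then use the standard CRF++ command-line tools, e.g. crf_learn template train.txt model and crf_test -m model test.txt.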
We employ a greedy feature selection method [2] to find the best-performing subset of features.
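A sketch of such a greedy forward selection loop, where evaluate is a placeholder for training a CRF with the candidate feature set and scoring it on the dev set:

def greedy_select(all_features, evaluate):
    """Greedy forward selection: grow the feature set one feature at a
    time, keeping the addition that most improves the dev-set F1.
    `evaluate(features) -> F1` is a placeholder for training and
    scoring a model with exactly those features."""
    selected, best_score = [], 0.0
    while True:
        scored = [(evaluate(selected + [f]), f)
                  for f in all_features if f not in selected]
        if not scored:
            break
        score, feature = max(scored)
        if score <= best_score:
            break  # no remaining feature improves the dev-set F1
        selected.append(feature)
        best_score = score
    return selected, best_score

# Usage (illustrative): greedy_select(["POS", "is_number", "word"], evaluate)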
Table 2 shows the results of the feature selection experiments on the development set.

    Feature        F1 score
    POS            0.622
    + is number    0.629
    + upper case   0.623

Table 2: Results for extraction feature selection.

We can see that POS tags alone give an F1 score of 62.2%. Adding the binary is number feature increases the score to 62.9%. Additional features, such as lexical features, word clusters, or the DBpedia Spotlight annotations, do not help and even decrease the score. Surprisingly, the word token itself is not selected as one of the features. Thus, the CRF performs its task without even looking at the word itself! After feature selection, we re-train the CRF with the best performing feature set {POS, is number} and evaluate the model on the dev and dev-test set. The results are shown in Table 3.

    Data set    Precision    Recall    F1 score
    Dev         0.673        0.591     0.629
    Dev-test    0.671        0.579     0.622

Table 3: Results for entity extraction.

3.3 Linking
To test our linking system, we follow two approaches. First, we measure the accuracy of the linking system against the gold standard, where we observe an accuracy of 67.6%. As a second step, we combine the linking step with our entity extraction step and measure the F1 score. Table 4 shows the results on the dev and dev-test splits for the combined system.

    Data set    Precision    Recall    F1 score
    Dev         0.436        0.287     0.346
    Dev-test    0.477        0.304     0.372

Table 4: Results for entity extraction and linking.

4. CONCLUSION
We have described the submission of the SAP Research & Innovation team to the #Microposts2014 NEEL shared task. Our system is based on a CRF sequence tagging model for entity extraction and an ensemble of search APIs and rules for entity linking. Our experiments show that POS tags are a surprisingly effective feature for entity extraction in tweets.

5. ACKNOWLEDGEMENT
The research is partially funded by the Economic Development Board and the National Research Foundation of Singapore.

6. REFERENCES
[1] A. E. Cano Basave, G. Rizzo, A. Varga, M. Rowe, M. Stankovic, and A.-S. Dadzie. Making Sense of Microposts (#Microposts2014) Named Entity Extraction & Linking Challenge. In Proc. #Microposts2014, pages 54-60, 2014.
[2] A. L. Berger, V. J. Della Pietra, and S. A. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1), 1996.
[3] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML, 2001.
[4] O. Owoputi, B. O'Connor, C. Dyer, K. Gimpel, N. Schneider, and N. Smith. Improved part-of-speech tagging for online conversational text with word clusters. In Proceedings of NAACL-HLT, 2013.
[5] E. T. K. Sang and F. De Meulder. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Proceedings of HLT-NAACL, 2003.
[6] J. Turian, L. Ratinov, and Y. Bengio. Word representations: a simple and general method for semi-supervised learning. In Proceedings of ACL, 2010.

Copyright © 2014 held by author(s)/owner(s); copying permitted only for private and academic purposes.
Published as part of the #Microposts2014 Workshop proceedings, available online as CEUR Vol-1141 (http://ceur-ws.org/Vol-1141).
#Microposts2014, April 7th, 2014, Seoul, Korea.



