=Paper=
{{Paper
|id=Vol-1141/paper_20
|storemode=property
|title=Part-of-Speech is (almost) enough: SAP Research & Innovation at the #Microposts2014 NEEL Challenge
|pdfUrl=https://ceur-ws.org/Vol-1141/paper_20.pdf
|volume=Vol-1141
|dblpUrl=https://dblp.org/rec/conf/msm/DahlmeierNT14
}}
==Part-of-Speech is (almost) enough: SAP Research & Innovation at the #Microposts2014 NEEL Challenge==
Daniel Dahlmeier, Naveen Nandan, Wang Ting
SAP Research and Innovation, #14 CREATE, 1 Create Way, Singapore
d.dahlmeier@sap.com, naveen.nandan@sap.com, dean.wang@sap.com

ABSTRACT
This paper describes the submission of the SAP Research & Innovation team at the #Microposts2014 NEEL Challenge. We use a two-stage approach for named entity extraction and linking, based on conditional random fields and an ensemble of search APIs and rules, respectively. A surprising result of our work is that part-of-speech tags alone are almost sufficient for entity extraction. Our results for the combined extraction and linking task on a development and test split of the training set are 34.6% and 37.2% F1 score, respectively, and 37% for the test set.

Keywords
Conditional Random Field, Entity Extraction, DBpedia Linking

1. INTRODUCTION
The rise of social media platforms and microblogging services has led to an explosion in the amount of informal, user-generated content on the web. The task of the #Microposts2014 workshop NEEL challenge is named entity extraction and linking (NEEL) for microblogging texts [1]. Named entity extraction and linking is a challenging problem because tweets can contain almost any content, from serious news, to personal opinions, to sheer gibberish, and both extraction and linking have to deal with the inherent ambiguity of natural language.

In this paper, we describe the submission of the SAP Research & Innovation team. Our system breaks the task into two separate steps for extraction and linking. We use a conditional random field (CRF) model for entity extraction and an ensemble of search APIs and rules for entity linking. We describe our method and present experimental results based on the released training data. One surprising finding of our experiments is that part-of-speech tags alone perform almost as well as the best feature combinations for entity extraction.

2. METHOD

2.1 Extraction
We use a sequence tagging approach for entity extraction. In particular, we use a conditional random field (CRF), which is a discriminative, probabilistic model for sequence data with state-of-the-art performance [3]. A linear-chain CRF tries to estimate the conditional probability of a label sequence y given the observed features x, where each label y_t is conditioned on the previous label y_{t-1}. In our case, we use BIO CoNLL-style tags [5]. We do not differentiate between different entity classes for the BIO tags (e.g., 'B' instead of 'B-PERSON').

The choice of appropriate features can have a significant impact on the model's performance. We have investigated a set of features that are commonly used for named entity extraction. Table 1 lists the features. The casing features upper case and lower case and the is number feature are implemented using simple regular expressions. The stripped words feature is the lowercased word with initial hashtag and @ characters removed. The DBpedia feature is annotated automatically using the DBpedia Spotlight web API (github.com/dbpedia-spotlight/dbpedia-spotlight) and acts as a type of gazetteer feature.

Table 1: Examples of features for entity extraction.
  Feature          Example
  words            Obamah
  words lower      obamah
  POS              ^
  title case       True
  upper case       False
  stripped words   obamah
  is number        False
  word cluster     -NONE-
  dbpedia          dbpedia.org/resource/Barack_Obama
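As an illustration, the per-token features of Table 1 could be computed roughly as follows before being written out in the column format that CRF++ expects. This is a minimal sketch with our own (hypothetical) function and field names; it omits the word cluster and DBpedia gazetteer features, which come from external resources, and is not the paper's actual implementation.

<pre>
import re

def token_features(token, pos_tag):
    """Per-token features in the spirit of Table 1 (illustrative only;
    word cluster and DBpedia gazetteer features are omitted)."""
    stripped = re.sub(r'^[#@]+', '', token).lower()   # drop leading '#'/'@', lowercase
    return {
        'words':          token,
        'words lower':    token.lower(),
        'POS':            pos_tag,                     # e.g. Tweet NLP tag '^' for proper noun
        'title case':     token.istitle(),
        'upper case':     token.isupper(),
        'stripped words': stripped,
        'is number':      bool(re.fullmatch(r'\d+([.,]\d+)?', token)),
    }

# CRF++ reads one whitespace-separated line per token, with the BIO label
# in the last column; a hypothetical two-token tweet fragment:
for token, pos, label in [('#Obamah', '^', 'B'), ('wins', 'V', 'O')]:
    feats = token_features(token, pos)
    print('\t'.join(str(v) for v in feats.values()) + '\t' + label)
</pre>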
For a label y_t at position t, we consider features x extracted at the current position t and the previous position t-1. We experimented with larger feature contexts, but they did not improve the result on the development set.

2.2 Linking
For the linking step, we explore different search APIs, such as Wikipedia search (github.com/goldsmith/Wikipedia), DBpedia Spotlight, and Google search, to retrieve the DBpedia resource for a mention. We begin by using the extracted entities individually as query terms to these search APIs. As ambiguity is a major concern for the linking task, for tweets where multiple entities are extracted, we use the entities combined as an additional query term. For example, in a tweet with the annotated entities Sean Hoare and phone hacking, Sean Hoare would map to a specific resource in DBpedia, but phone hacking could refer to more than one resource. By using the query term "phone hacking + Sean Hoare", we can help boost the rank of the resource "News International phone hacking scandal", so that the entity phone hacking maps to it instead of a general article on "Phone Hacking". In our system, we make use of the web APIs for Wikipedia search and DBpedia Spotlight, together with some hand-written rules, to rank the resources returned. The result of the ranking step is then used to construct the DBpedia resource URL to which the entity is mapped.
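To make the combined-query idea more concrete, the sketch below shows how individual and combined mentions could be sent to the Wikipedia search package referenced above and turned into DBpedia resource URLs. The function names are our own, and the paper's actual ensemble additionally uses DBpedia Spotlight and hand-written ranking rules that are not reproduced here.

<pre>
import wikipedia  # the Wikipedia search package referenced in Section 2.2

def dbpedia_url(page_title):
    # DBpedia resource URLs mirror Wikipedia page titles, with spaces as underscores
    return 'http://dbpedia.org/resource/' + page_title.replace(' ', '_')

def link_mentions(mentions):
    """Sketch of the combined-query heuristic: query each mention on its own
    and, when a tweet has several mentions, add the other mentions to the
    query to boost the intended resource for ambiguous ones."""
    links = {}
    for mention in mentions:
        results = wikipedia.search(mention)
        if len(results) > 1 and len(mentions) > 1:
            context = ' '.join(m for m in mentions if m != mention)
            combined = wikipedia.search(mention + ' ' + context)  # e.g. "phone hacking Sean Hoare"
            if combined:
                results = combined
        links[mention] = dbpedia_url(results[0]) if results else None
    return links

# Entities from the example tweet in Section 2.2
print(link_mentions(['Sean Hoare', 'phone hacking']))
</pre>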
3. EXPERIMENTS AND RESULTS
In this section, we present experimental results of our method based on the data released by the organizers.

3.1 Data sets
We split the provided data set into a training (first 60%), development (dev, next 20%), and test (dev-test, last 20%) set. We perform standard pre-processing steps: we tokenize and POS tag the tweets using the Tweet NLP toolkit [4], look up word cluster indicators for each token from the Brown clusters released by Turian et al. [6], and annotate the tweets with the DBpedia Spotlight web API.

3.2 Extraction
We train the CRF model on the training set of the data, perform feature selection based on the dev set, and test the resulting model on the dev-test set. We evaluate the resulting models using precision, recall, and F1 score. In all experiments, we use the CRF++ implementation of conditional random fields (code.google.com/p/crfpp/) with default parameters; initial experiments showed that the CRF parameters did not have a great effect on the final score. We employ a greedy feature selection method [2] to find the best subset of features. Table 2 shows the results of the feature selection experiments on the development set. We can see that POS tags alone give an F1 score of 62.2%. Adding the binary is number feature increases the score to 62.9%. Additional features, such as lexical features, word clusters, or the DBpedia Spotlight annotations, do not help and even decrease the score. Surprisingly, the word token itself is not selected as one of the features. Thus, the CRF performs its task without even looking at the word itself! After feature selection, we retrain the CRF with the best performing feature set {POS, is number} and evaluate the model on the dev and dev-test sets. The results are shown in Table 3.

Table 2: Results for extraction feature selection.
  Feature        F1 score
  POS            0.622
  + is number    0.629
  + upper case   0.623

Table 3: Results for entity extraction.
  Data set   Precision   Recall   F1 score
  Dev        0.673       0.591    0.629
  Dev-test   0.671       0.579    0.622

3.3 Linking
To test our linking system, we follow two approaches. First, we measure the accuracy of the linking system against the gold standard, where we observe an accuracy of 67.6%. As a second step, we combine the linking step with our entity extraction step and measure the F1 score. Table 4 shows the results on the dev and dev-test splits for the combined system.

Table 4: Results for entity extraction and linking.
  Data set   Precision   Recall   F1 score
  Dev        0.436       0.287    0.346
  Dev-test   0.477       0.304    0.372

4. CONCLUSION
We have described the submission of the SAP Research & Innovation team to the #Microposts2014 NEEL shared task. Our system is based on a CRF sequence tagging model for entity extraction and an ensemble of search APIs and rules for entity linking. Our experiments show that POS tags are a surprisingly effective feature for entity extraction in tweets.

5. ACKNOWLEDGEMENT
The research is partially funded by the Economic Development Board and the National Research Foundation of Singapore.

6. REFERENCES
[1] A. E. Cano Basave, G. Rizzo, A. Varga, M. Rowe, M. Stankovic, and A.-S. Dadzie. Making Sense of Microposts (#Microposts2014) Named Entity Extraction & Linking Challenge. In Proc. #Microposts2014, pages 54-60, 2014.
[2] A. L. Berger, V. J. Della Pietra, and S. A. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1), 1996.
[3] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML, 2001.
[4] O. Owoputi, B. O'Connor, C. Dyer, K. Gimpel, N. Schneider, and N. Smith. Improved part-of-speech tagging for online conversational text with word clusters. In Proceedings of NAACL-HLT, 2013.
[5] E. T. K. Sang and F. De Meulder. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Proceedings of HLT-NAACL, 2003.
[6] J. Turian, L. Ratinov, and Y. Bengio. Word representations: A simple and general method for semi-supervised learning. In Proceedings of ACL, 2010.

Copyright © 2014 held by author(s)/owner(s); copying permitted only for private and academic purposes. Published as part of the #Microposts2014 Workshop proceedings, available online as CEUR Vol-1141 (http://ceur-ws.org/Vol-1141). #Microposts2014, April 7th, 2014, Seoul, Korea.