<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Named Entity Recognition for Telugu News Articles using Naïve Bayes Classifier</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>SaiKiranmai Gorla</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>N L Bhanu Murthy</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aruna Malapati</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Birla Institute of Technology and Science</institution>
          ,
          <addr-line>Pilani, Hyderabad</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>1</volume>
      <issue>3</issue>
      <fpage>3</fpage>
      <lpage>8</lpage>
      <abstract>
        <p>Named Entity Recognition (NER) is the task of identifying names of Persons, Locations, Organizations etc. in a given sentence or document. In this paper, we attempt to classify textual content from online Telugu newspapers using a well known generative model. We use generic features such as contextual words and their part-of-speech (POS) tags to build the learning model. Drawing on the syntax and grammar of the Telugu language, we propose morphological pre-processing of the data, and this step yields better accuracy. We also propose some interesting language dependent features, namely a post-position feature, a clue word feature and a gazetteer feature, to improve the performance of the model. The model achieved an overall average F1-score of 88.87% for Person, 87.32% for Location and 72.69% for Organization.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        News providers and publishing companies generate
enormous amounts of unstructured textual content on a
daily basis. This content is not of much use if there
are no tools and techniques for searching and indexing
the text. Named Entity Recognition (NER) is an
important task in Natural Language Processing (NLP)
that identifies the named entities in text documents.
Named Entities (NEs) are usually proper nouns such as
names of Persons, Organizations, Locations etc. in text
documents. NER can naturally be applied to news
articles to identify the named entities they contain.
Knowing the named entities in each article helps in
categorizing news articles and enables effective
information retrieval. The NER task was first presented at MUC-6
        <xref ref-type="bibr" rid="ref2">in 1995</xref>
        [GS96],
and since then the task has undergone several
transitions, from rule based approaches to the
machine learning techniques used today. The
performance of the NER task for a given language depends
on the properties of that language.
      </p>
      <p>In English, the capitalization feature plays an important
role, as NEs are generally capitalized in that language.
The capitalization feature is not available in Indian
Languages (IL), which makes the task more
challenging. In this paper, we attempt to get some insights and
results for NER in the Telugu language. The challenges
specific to Telugu NER are: a) no
capitalization; b) two words in English can map to one
word in Telugu, for example, “in Delhi” (English) maps to
DhillIlO, where lO is a post-position marker; c)
absence of a part-of-speech tagger; d) free word order.</p>
      <p>In this paper, a generative model is proposed for the
NER task using a Naïve Bayes classifier. The following
features are considered for training the model:
contextual words, part-of-speech tags, a gazetteer
binary feature, a post-position feature and a clue word
feature. We are mainly interested in classifying a given
word into one of the named entity classes, namely Person,
Location and Organization. The results obtained from
the proposed approach are comparable to other
competitive techniques for the Telugu language.</p>
      <p>The rest of the article is organized as follows. We
discuss related work in Section 2 and describe the dataset
in Section 3. The methodology and evaluation metrics
are presented in Section 4. In Sections 5 and
6 we propose features to build the Naïve Bayes classifier
and discuss the results. The conclusions of our study
are summarized in Section 7.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>The NER task can be approached in two ways: by
hand-crafted rules or by statistical machine learning
techniques [Sar08]. A rule-based approach to NER
requires patterns that describe the internal structure
of entities and contextual rules that give clues for
identification and classification. An example of such a
rule: a phrase is a street name if it ends with the word
‘X’ preceded by the preposition ‘Y’, where ‘X’ can
be “street” and ‘Y’ could be “in”, as in the sentence
‘The Apple store in jail street in hyderabad’.</p>
      <p>Some of the rule-based systems include FASTUS
[AHB+95], which uses regular expressions to extract
Named Entities (NEs), and LaSIE and LaSIE II [HGA+98],
which use lookup lists of NEs to identify them.
Rule-based systems are effective for specific domains, such
as the biological domain, where the terminology follows
certain formulations; examples of biological NER work
include [ARG08]. The limitation of the rule-based
approach is that it requires expert knowledge of the
language and domain. These knowledge resources take
time to build and are not transferable to other domains.
Hence, NER has also been addressed with machine
learning approaches.</p>
      <p>Machine learning approaches can be classified into
three categories: supervised learning,
semi-supervised learning and unsupervised learning. In
supervised learning (SL), labeled training data with
features is given as input to the model, which can then
classify new data. Some of the SL algorithms are
Support Vector Machines [TC02], Conditional
Random Fields [ANC08], Hidden Markov Models [SZS+04],
Neural Networks [KT07], Decision Trees [FM09], Naïve
Bayes [MH05] and Maximum Entropy Models [CN02].
In semi-supervised learning the model makes use
of both labeled and unlabeled data; popular
semi-supervised approaches in NER are bootstrapping
[Kno11] and co-training [CS99]. Most
unsupervised learning approaches in NER rely on clustering and
distributional statistics using similarity functions.</p>
      <p>For Indian languages, a considerable amount of work
has been done in Bengali and Hindi. Ekbal et al. [EB08]
developed an NER system for Bengali and Hindi
using SVMs. These systems use different contextual
information about words to predict four NE classes,
namely Person, Location, Organization and Miscellaneous.
The annotated corpora consist of 122,467 tokens for
Bengali and 502,974 tokens for Hindi. The systems
were tested with 35K and 60K tokens for
Bengali and Hindi, with F1-scores of 84.15% and 77.17%
respectively. Ekbal et al. [EB09] also developed NER
systems using CRFs for Bengali and Hindi with
contextual features, obtaining F1-scores of 83.89% for Bengali
and 80.93% for Hindi.</p>
      <p>Very little work has been done on Telugu
NER. Srikanth and Murthy [SM08] used part
of the LERC-UoH Telugu corpus, on which a CRF based
noun tagger was built using 13,425 words. This tagger
was used as one of the features in a rule-based NER
system for Telugu, mainly focused on identifying Person,
Location and Organization without using POS
tags or syntactic information; the work is limited to
single-word NEs. Praneeth et al. [SGPV08] built a
CRF based NER system with language independent
and language dependent features. They conducted
experiments on data released as part of the NER for South and
South-East Asian Languages (NERSSEAL) 1
competition with 12 classes and obtained an F1-score of 44.89%.</p>
    </sec>
    <sec id="sec-3">
      <title>Dataset and Pre-processing</title>
      <sec id="sec-3-0">
        <title>Corpus</title>
        <p>The Telugu newspaper corpus was generated by
crawling newspaper websites2 3. The corpus is
annotated with three NE classes, namely Person, Location
and Organization, and one not-a-named-entity class. The
annotation was verified by Telugu linguists. The
annotated data consists of 54,457 words, of which 16,829
are unique word forms. The corpus contains 2,658
Person, 2,291 Location and 1,617 Organization entities.</p>
      </sec>
      <sec id="sec-3-1">
        <title>Morphological Pre-processing</title>
        <p>Morphology is the study of word formation: how words
are formed from smaller morphemes. A morpheme
is the smallest part of a word that has grammatical
information or meaning.</p>
        <p>For example, the word trainings has 3 morphemes
in it: train_ing_s.</p>
        <p>As discussed in Section 6.1, Telugu is a highly
inflectional and agglutinating language, and hence it makes
sense to perform morphological pre-processing. In
this work, we apply morphological pre-processing
only to nouns in the dataset, because most NEs are
nouns.</p>
        <p>For example:
(haidarAbAdlO) = haidarAbAd + lO
(bijepiki) = bijepi + ki
(kavitaku) = kavita + ku</p>
        <p>We would like to explore the significance of this
morphological pre-processing step, and hence report
results with and without it. The results clearly
signify the importance of this step.</p>
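        <p>The pre-processing step can be sketched as follows; the transliterated forms and the three-marker suffix list are illustrative only, as the real system operates on Telugu script with a fuller marker inventory.</p>
        <p>
```python
# Illustrative sketch: strip an agglutinated postposition from a noun.
# POSTPOSITIONS is an assumed subset of the Telugu PSP markers.
POSTPOSITIONS = ["lO", "ki", "ku"]

def strip_postposition(word, pos_tag):
    """Split a noun into (stem, postposition); other words pass through."""
    if pos_tag != "NNP":  # pre-processing is applied to nouns only
        return word, None
    for psp in POSTPOSITIONS:
        if word.endswith(psp) and len(word) > len(psp):
            return word[: -len(psp)], psp
    return word, None

print(strip_postposition("haidarAbAdlO", "NNP"))  # ('haidarAbAd', 'lO')
print(strip_postposition("kavitaku", "NNP"))      # ('kavita', 'ku')
```
        </p>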
        <p>1http://ltrc.iiit.ac.in/ner-ssea-08/
2http://www.eenadu.net/
3http://www.andhrajyothy.com/</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Methodology &amp; Evaluation Metrics</title>
      <p>We consider 11 features for every word in a
sentence and classify each word into one of the three
named entity classes, namely Person, Organization and
Location, or the NNE class (Not a Named Entity). Thus
there are D = 11 features for every word, and each word
is to be classified into one of 4 classes, say c1, c2, c3
and c4. The Naïve Bayes classifier is a generative model
in which the posterior probability of a word belonging to
a particular class ci, where i = 1 to 4, given the feature
vector of the word (x1, x2, ..., xD), is computed using
Bayes' theorem. Assuming conditional independence of
the features given the class, the posterior probability is
calculated as follows:
p(ci | x1, x2, ..., xD)
= p(x1, x2, ..., xD | ci) p(ci) / Σj=1..4 p(x1, x2, ..., xD | cj) p(cj)
= p(x1 | ci) p(x2 | ci) ... p(xD | ci) p(ci) / Σj=1..4 p(x1 | cj) p(x2 | cj) ... p(xD | cj) p(cj)</p>
      <p>The prior probability p(ci) and the conditional
probabilities p(x1 | ci), p(x2 | ci), ..., p(xD | ci) are
estimated from the training data. The posterior
probability for each class is computed, and the word is
classified into the class with maximal posterior
probability.</p>
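      <p>The classifier described above can be sketched as follows; counts are estimated from the training data, and Laplace smoothing is an assumption on our part, since the smoothing scheme is not spelled out in the text.</p>
      <p>
```python
from collections import Counter, defaultdict

class NaiveBayes:
    """Generative classifier: argmax_c p(c) * prod_j p(x_j | c)."""

    def __init__(self):
        self.class_counts = Counter()
        self.feat_counts = defaultdict(Counter)  # (class, j) -> value counts
        self.values = defaultdict(set)           # j -> observed feature values

    def fit(self, X, y):
        for features, label in zip(X, y):
            self.class_counts[label] += 1
            for j, x in enumerate(features):
                self.feat_counts[(label, j)][x] += 1
                self.values[j].add(x)

    def posterior(self, features):
        total = sum(self.class_counts.values())
        scores = {}
        for c, n_c in self.class_counts.items():
            p = n_c / total  # prior p(c)
            for j, x in enumerate(features):
                # likelihood p(x_j | c) with add-one (Laplace) smoothing
                p *= (self.feat_counts[(c, j)][x] + 1) / (n_c + len(self.values[j]))
            scores[c] = p
        z = sum(scores.values())  # normalise so the posteriors sum to 1
        return {c: s / z for c, s in scores.items()}

    def predict(self, features):
        post = self.posterior(features)
        return max(post, key=post.get)
```
      </p>
      <p>In practice the product of many small probabilities is usually computed as a sum of log probabilities to avoid numerical underflow.</p>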
      <p>The algorithm is implemented in C++ and
applied 50 times to the data. In each round, 70% of the
sentences are randomly chosen for training and the
remaining 30% are used for testing. The results
reported in the tables in Sections 5 and 6 are
the averages over the 50 rounds.</p>
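      <p>The evaluation protocol amounts to the loop below; train_and_score is a hypothetical stand-in for training the classifier and computing its F1-score on the held-out sentences.</p>
      <p>
```python
import random

def averaged_score(sentences, train_and_score, rounds=50, train_frac=0.7, seed=0):
    """Average a score over repeated random 70/30 sentence-level splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(rounds):
        shuffled = sentences[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_frac)  # 70% of sentences for training
        scores.append(train_and_score(shuffled[:cut], shuffled[cut:]))
    return sum(scores) / len(scores)
```
      </p>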
      <sec id="sec-4-1">
        <title>Evaluation Metrics</title>
        <p>The standard evaluation measures Precision,
Recall and F1-score are used to assess the prediction
accuracy of the proposed model.</p>
        <p>Precision (P) = c / r</p>
        <p>Recall (R) = c / t</p>
        <p>F1-score = 2 P R / (P + R)</p>
        <p>where r is the number of NEs predicted by the model,
t is the total number of NEs present in the test set,
and c is the number of NEs correctly predicted by the
model.</p>
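        <p>With the counts defined above, the three measures can be written directly:</p>
        <p>
```python
def precision_recall_f1(c, r, t):
    """c = correctly predicted NEs, r = predicted NEs, t = true NEs."""
    p = c / r if r else 0.0
    rec = c / t if t else 0.0
    f1 = 2 * p * rec / (p + rec) if (p + rec) else 0.0
    return p, rec, f1
```
        </p>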
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Contextual Features and Naïve Bayes Classifier</title>
      <p>Orthographic features (capitalization or digits), suffix,
prefix, NE specific words, gazetteer features, POS etc.
are generally used for NER. In English, the capitalization
feature plays an important role, as NEs are generally
capitalized in that language. Unfortunately this
feature is not applicable to the Indian languages.</p>
      <p>The contextual word and POS features are used to
build the prediction model. For a window size of 3, the
contextual features are the current word (w0), the previous
word (w-1) and the next word (w+1). The corresponding
POS features are the part-of-speech tags of w0, w-1
and w+1, represented by pos0, pos-1 and pos+1
respectively. The experiments were also repeated with
window size 5; Precision, Recall and F1-score remained
more or less the same.</p>
      <p>Consider the following example: త (NNP)
తన (PRP) (JJ) ఇషప (VB) (Sita likes her
dress). For a window size of 3, the contextual words and
POS tags around the current word are తన (w-1),
ఇషప (w+1), PRP (pos-1) and VB (pos+1).</p>
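      <p>The window-3 feature extraction can be sketched as follows, using transliterated forms; the boundary padding symbol is an assumption, as the text does not say how sentence edges are handled.</p>
      <p>
```python
PAD = ("<s>", "<s>")  # assumed padding for sentence boundaries

def window_features(tagged_sentence, i):
    """Return (w-1, w0, w+1, pos-1, pos0, pos+1) for position i.

    tagged_sentence is a list of (word, pos) pairs.
    """
    prev = tagged_sentence[i - 1] if i > 0 else PAD
    cur = tagged_sentence[i]
    nxt = tagged_sentence[i + 1] if i + 1 < len(tagged_sentence) else PAD
    return (prev[0], cur[0], nxt[0], prev[1], cur[1], nxt[1])
```
      </p>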
      <p>We show that the six features are conditionally
independent given the class label, where TAG can be
Person, Location, Organization or NNE (not a named
entity).</p>
      <p>1. The features wi and posi are independent:
p(wi | posi, TAG) = p(wi | TAG).
Most of the time a word can be tagged with only
one POS tag; the chance of a word being tagged with
different POS tags depending on context is very small.
For example, the word “John” can only be tagged as a
proper noun. Conditioning a word on its POS tag
therefore does not change its probability.</p>
      <p>2. The features wi and wj (i ≠ j) are independent:
p(wi | wj, TAG) = p(wi | TAG).
Telugu being a free word order language, any word can
occur before or after a particular word, so the
probability of a word occurring before or after a given
word is roughly uniform.</p>
      <p>3. The features wi and posj (i ≠ j) are independent:
p(wi | posj, TAG) = p(wi | TAG).
Since this holds for the words themselves, it also holds
for their POS tags.</p>
      <p>4. The features posi and posj (i ≠ j) are independent:
p(posi | posj, TAG) = p(posi | TAG).</p>
      <p>The posterior probability can be represented as:
p(Person | w-1, w0, w+1, pos-1, pos0, pos+1) ∝
p(w-1, w0, w+1, pos-1, pos0, pos+1 | Person) p(Person)</p>
      <p>Applying the chain rule to the likelihood term:
p(w-1, w0, w+1, pos-1, pos0, pos+1 | Person) =
p(w-1 | w0, w+1, pos-1, pos0, pos+1, Person)
p(w0 | w+1, pos-1, pos0, pos+1, Person)
p(w+1 | pos-1, pos0, pos+1, Person)
p(pos-1 | pos0, pos+1, Person)
p(pos0 | pos+1, Person)
p(pos+1 | Person)
Applying conditional independence of the features, the
posterior probability becomes:
p(Person | w-1, w0, w+1, pos-1, pos0, pos+1) ∝
p(w-1 | Person) p(w0 | Person) p(w+1 | Person)
p(pos-1 | Person) p(pos0 | Person) p(pos+1 | Person)
p(Person)
As the conditional independence assumptions hold well
for all features, we train the model with 70% of the data
and test on the remaining 30%. The average prediction
accuracies over several runs are reported in Table 1.
As discussed in Section 3.2, morphological
pre-processing of each word in the dataset is applied,
and the results are presented in Table 2. It is
interesting to observe the improvement after
morphological pre-processing.
Though the overall results show decent performance,
the F1-score for Organization is not impressive. We
therefore introduce language dependent features to
improve the overall performance and the prediction
accuracy for Organization.</p>
    </sec>
    <sec id="sec-7">
      <title>Language Dependent Features and Building a Comprehensive Naïve Bayes Classifier</title>
      <p>Language dependent features are used to enhance the
performance of the classifier. We propose a couple of
language dependent features, illustrated in the
sub-sections below.</p>
      <sec id="sec-8-1">
        <title>Post-position (PSP) feature</title>
        <p>Telugu is a highly inflectional and agglutinating
language, and lexical forms are generated by attaching
productive derivational and inflectional suffixes to
roots or stems, as explained in Section 3.2. Some of
the PSP markers in Telugu are (lO), (ku), (ki) etc. We
propose a boolean feature whose value is 1 if a proper
noun (NNP) is followed by a postposition and 0
otherwise. The statistics of PSP markers following NEs
are shown in Table 3. We build a Naïve Bayes classifier
with the contextual word and POS features along with
the PSP feature; average accuracies over several runs
are shown in Table 4.</p>
      </sec>
      <sec id="sec-8-2">
        <title>Clue word feature</title>
        <p>Clue words play an important role in identifying
NEs. In this work we considered clue words for
recognizing Organization names, since they are usually
multi-word and tend to end with a few suffixes such as
మం డ (Council), సంఘం (Company), సంఘం (Community),
సమ (Federation), క (Club) etc. We build a Naïve Bayes
classifier with the contextual word and POS features
and the PSP feature along with the clue word feature;
average accuracies over several runs are shown in
Table 5.</p>
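        <p>The two boolean features can be sketched as follows; the transliterated marker and suffix lists are illustrative subsets, and the sketch assumes the postposition has been split off as its own token by the morphological pre-processing of Section 3.2.</p>
        <p>
```python
PSP_MARKERS = {"lO", "ku", "ki"}            # assumed subset of PSP markers
ORG_CLUES = {"manDali", "saMghaM", "klab"}  # Council, Community, Club (transliterated)

def psp_feature(words, pos_tags, i):
    """1 if the proper noun at position i is followed by a postposition."""
    nxt = words[i + 1] if i + 1 < len(words) else ""
    return 1 if pos_tags[i] == "NNP" and nxt in PSP_MARKERS else 0

def clue_word_feature(word):
    """1 if the word ends with an Organization clue suffix."""
    return 1 if any(word.endswith(clue) for clue in ORG_CLUES) else 0
```
        </p>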
        <p>Although the list of suffixes is as exhaustive as
possible for Telugu names, we would expect a marked
increase in accuracy for Organization. That is not the
case here, because words like ‘సంఘం (Community)’ are
tagged in the corpus as Organization and as not a
named entity an equal number of times. Hence, there is
not much improvement in the accuracies.</p>
      </sec>
      <sec id="sec-8-3">
        <title>Gazetteer feature</title>
        <p>In this section, we explain the process of building
a gazetteer for NEs from Wikipedia. Wikipedia
maintains a list of categories for each of its titles. For
example, the categories ‘Educational institutions
established in 1926’ and ‘Companies listed on the
Bombay Stock Exchange’ refer to Organization names,
‘Living people’ and ‘Player’ refer to Person, and
‘States and territories’ and ‘City-states’ refer to
Location.</p>
        <p>The following steps are used to construct the
gazetteers. Initially, we manually constructed a list of
seed entities for Person, Location and Organization.
We then searched for each seed in Wikipedia and
extracted its categories in order to construct a list of
categories (category_list) for each NE class.</p>
        <p>In order to resolve ambiguity, we remove the
categories that are present in more than one NE class's
category list and call the result Unique_category_lists.
For example, the category list may contain ‘actor’,
‘engineer’ and ‘famous’ for NE Person and ‘city’,
‘street’ and ‘famous’ for NE Location. The category
label ‘famous’ is removed because it is present in both
NE Person and NE Location. We then extract the list
of Wikipedia titles from the Telugu Wikipedia
dump 4.</p>
        <p>Then, for each title in the Wikipedia dump, we
search its category labels in the Unique_category_lists
of each NE class. The NE class whose Unique_
category_list has the maximum number of matches is
assigned to that NE.</p>
        <p>We generated a list of 7,593 Person names, 4,791
Location names, and 254 Organization names by
following the above procedure.</p>
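        <p>The construction steps above can be sketched as follows; the lookup of a title's categories in the Wikipedia dump is abstracted away, and the category sets shown in the test are hypothetical.</p>
        <p>
```python
def unique_category_lists(seed_categories):
    """Drop categories shared by more than one NE class.

    seed_categories: {ne_class: set of category labels from the seeds}
    """
    unique = {}
    for cls, cats in seed_categories.items():
        others = set().union(*(c for k, c in seed_categories.items() if k != cls))
        unique[cls] = cats - others
    return unique

def assign_class(title_categories, unique_lists):
    """Assign the NE class whose unique category list matches most."""
    scores = {cls: len(title_categories & cats) for cls, cats in unique_lists.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```
        </p>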
        <p>For example, the Person name ‘Mahendra Singh
Dhoni (మందం )’ belongs to the categories (వ in
Telugu) shown in Figure 1 5.</p>
        <p>4https://dumps.wikimedia.org/tewiki/
5https://te.wikipedia.org/wiki/మందం
_</p>
        <p>The gazetteer feature enhanced the prediction
accuracies; the results are shown in Table 6. The
F1-score of Organization increased by 19%, and there
are impressive improvements for the other NEs as
well.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Conclusion</title>
      <p>In this paper, we have attempted to classify named
entities in Telugu news articles using a Naïve Bayes
classifier. The prediction accuracies of the learning
models are enhanced significantly after the data is
morphologically pre-processed as proposed in this
work. The language dependent features proposed in
this paper improve the prediction accuracies, with a
notable increase of 26% in the F1-score of
Organization. The comprehensive learning model built with
contextual words and their parts of speech along with
the proposed language dependent features achieved an
overall average F1-score of 88.87% for Person, 87.32%
for Location and 72.69% for Organization.</p>
    </sec>
    <sec id="sec-10">
      <title>References</title>
      <p>[ANC08] Andrew Arnold, Ramesh Nallapati, and
William W. Cohen. Exploiting feature hierarchy for
transfer learning in named entity recognition. In
Proceedings of ACL-08: HLT, pages 245-253.
Association for Computational Linguistics, 2008.</p>
      <p>[Kno11] Johannes Knopp. Extending a multilingual
lexical resource by bootstrapping named entity
classification using wikipedia's category system. In
Proceedings of the Fifth International Workshop On
Cross Lingual Information Access, pages 35-43. Asian
Federation of Natural Language Processing, 2011.</p>
      <p>[KT07] Jun'ichi Kazama and Kentaro Torisawa. A
new perceptron algorithm for sequence labeling with
non-local features. In Proceedings of the 2007 Joint
Conference on Empirical Methods in Natural Language
Processing and Computational Natural Language
Learning (EMNLP-CoNLL), 2007.</p>
      <p>[MH05] Behrang Mohit and Rebecca Hwa.
Syntax-based semi-supervised named entity tagging. In
Proceedings of the ACL Interactive Poster and
Demonstration Sessions, pages 57-60. Association for
Computational Linguistics, 2005.</p>
      <p>[Sar08] Sunita Sarawagi. Information extraction.
Foundations and Trends® in Databases, 2008.</p>
      <p>[SGPV08] Praneeth M Shishtla, Karthik Gali, Prasad
Pingali, and Vasudeva Varma. Experiments in telugu
ner: A conditional random field approach. In
Proceedings of the IJCNLP-08 Workshop on Named Entity
Recognition for South and South East Asian
Languages, 2008.</p>
      <p>[SM08] P. Srikanth and Kavi Narayana Murthy.
Named entity recognition for telugu. In Proceedings
of the IJCNLP-08 Workshop on Named Entity
Recognition for South and South East Asian Languages,
2008.</p>
      <p>[SZS+04] Dan Shen, Jie Zhang, Jian Su, Guodong
Zhou, and Chew-Lim Tan. Multi-criteria-based active
learning for named entity recognition. In Proceedings
of the 42nd Annual Meeting of the Association for
Computational Linguistics (ACL-04), 2004.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [AHB+95]
          Douglas E. Appelt, Jerry R. Hobbs, John Bear, David Israel, Megumi Kameyama, Andy Kehler, David Martin, Karen Myers, and Mabry Tyson.
          <article-title>Sri international fastus system: muc-6 test results and analysis</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <source>In Sixth Message Understanding Conference (MUC-6): Proceedings of a Conference Held in Columbia, Maryland, November 6-8, 1995</source>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [ARG08]
          <article-title>Combining terminology resources and statistical methods for entity recognition: an evaluation</article-title>
          .
          <source>In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)</source>
          , Marrakech, Morocco, may
          <year>2008</year>
          .
          European Language Resources Association (ELRA).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [CN02]
          <source>In COLING 2002: The 19th International Conference on Computational Linguistics</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [CS99]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Collins</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yoram</given-names>
            <surname>Singer</surname>
          </string-name>
          .
          <article-title>Unsupervised models for named entity classification</article-title>
          .
          <source>In 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora</source>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [EB08]
          <article-title>Bengali named entity recognition using support vector machine</article-title>
          .
          <source>In Proceedings of the IJCNLP-08 Workshop on Named Entity Recognition for South and South East Asian Languages</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [EB09]
          <article-title>A conditional random field approach for named entity recognition in bengali and hindi</article-title>
          .
          <source>Linguistic Issues in Language Technology</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>44</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [FM09]
          <string-name>
            <given-names>Jenny Rose</given-names>
            <surname>Finkel</surname>
          </string-name>
          and
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <article-title>Nested named entity recognition</article-title>
          .
          <source>In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>141</fpage>
          -
          <lpage>150</lpage>
          . Association for Computational Linguistics,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [GS96]
          <string-name>
            <given-names>Ralph</given-names>
            <surname>Grishman</surname>
          </string-name>
          and
          <string-name>
            <given-names>Beth</given-names>
            <surname>Sundheim</surname>
          </string-name>
          .
          <article-title>Message understanding conference- 6: A brief history</article-title>
          .
          <source>In COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics</source>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [HGA+98]
          K. Humphreys, R. Gaizauskas, S. Azzam, C. Huyck, B. Mitchell, H. Cunningham, and Y. Wilks.
          <article-title>University of sheffield: Description of the lasie-ii system as used for muc-7</article-title>
          .
          <source>In Seventh Message Understanding Conference (MUC-7): Proceedings of a Conference Held in Fairfax, Virginia, April 29 - May 1, 1998</source>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [TC02]
          Koichi Takeuchi and Nigel Collier.
          <article-title>Use of support vector machines in extended named entity recognition</article-title>
          .
          <source>In COLING-02: The 6th Conference on Natural Language Learning 2002 (CoNLL-2002)</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>