=Paper= {{Paper |id=Vol-2006/paper048 |storemode=property |title=INFORMed PA: A NER for the Italian Public Administration Domain |pdfUrl=https://ceur-ws.org/Vol-2006/paper048.pdf |volume=Vol-2006 |authors=Lucia Passaro,Alessandro Lenci,Anna Gabbolini |dblpUrl=https://dblp.org/rec/conf/clic-it/PassaroLG17 }} ==INFORMed PA: A NER for the Italian Public Administration Domain== https://ceur-ws.org/Vol-2006/paper048.pdf
    INFORMed PA: A NER for the Italian Public Administration Domain
                    Lucia C. Passaro*, Alessandro Lenci*, Anna Gabbolini**
         * Dipartimento di Filologia, Letteratura e Linguistica, University of Pisa (Italy)
                       ** ETI3 | Evolution, Technology & Innovation
                               lucia.passaro@for.unipi.it
                                alessandro.lenci@unipi.it
                                  anna.gabbolini@eti3.it


Abstract

English. In this paper, we illustrate the creation of a NER for the Public Administration (PA) domain. We discuss the creation of an annotated corpus with documents from the Italian Albo Pretorio Nazionale and provide results of the system evaluation.

Italiano. In questo lavoro mostriamo la creazione di un NER per il dominio della Pubblica Amministrazione (PA). Presentiamo la creazione del corpus formato da documenti dell’Albo Pretorio Nazionale e mostriamo i risultati della valutazione del sistema.

1   Introduction

In the Public Administration (PA) domain, the rapid adoption of the new legislation on governance transparency has been forcing Italian municipalities to produce their acts in digital form and to make them available to both citizens and authorities. However, the acts delivered by PAs are typically in a free-text electronic format, which is not convenient for searching, decision support, and data analysis. Therefore, the development of NLP tools to extract high-quality structured information, including Named Entities (NEs) such as Persons and Organizations, represents a key factor in enabling access to the wealth of information produced by PAs, and a crucial step in turning the keyword of “transparency” into reality. The potential of NLP tools can be exploited to mine the large document repositories produced by PAs daily, with the aim of identifying trends in their activity, suggesting possible synergies to increase their efficiency, and raising “red flags” about suspicious behaviors, especially in their relationships with private companies.

In this paper, we focus on Named Entity Recognition (NER) for PA. Several approaches have been proposed in the literature, including Rule-based, Machine Learning-based and Hybrid methods.

Hand-made Rule-based NERs focus on extracting names using many hand-written rules. In general, these systems consist of a set of patterns based on grammatical (e.g., part of speech), syntactic (e.g., word precedence) and orthographic features (e.g., capitalization) in combination with dictionaries (Budi and Bressan, 2003; Appelt et al., 1993; Grishman, 1995). These approaches usually give good results, but require long development time by expert linguists. On the one hand, these systems have better results for restricted domains, being capable of detecting very complex entities; on the other, they lack portability and robustness and do not necessarily adapt well to new domains and languages.

Machine learning techniques, on the contrary, use a collection of annotated documents for training the classifiers. Therefore the development time moves from the definition of rules to the preparation of annotated corpora (Bikel et al., 1997; Borthwick et al., 1998; McCallum and Li, 2003). The systems identify and classify nouns using machine learning algorithms such as Maximum Entropy (Berger et al., 1996), Support Vector Machines (Cortes and Vapnik, 1995) and Conditional Random Fields (Lafferty et al., 2001). More recently, deep learning architectures have also been proposed for Named Entity Recognition (Chiu and Nichols, 2015; Strubell et al., 2017).

Finally, Hybrid NER systems combine rule-based and machine learning-based methods, building new methods from the strongest points of each (Srihari et al., 2000).

Existing general purpose Italian corpora annotated with NEs such as I-CAB (Magnini et al., 2006) are not optimal for training a NER for the domain of PA because of the gap between bu-
reaucratic language and standard Italian, and also because of the lack of important classes such as act and normative references, which are very useful in PA-oriented applications. To tackle these problems, we decided to create a new corpus from scratch starting from: (i) administrative documents belonging to the Italian Albo Pretorio; (ii) the CoLingLab NER, a general NER trained on I-CAB, from which we took the initial configuration of features. The corpus of PA documents, written in Italian “bureaucratese”, has the characteristics described in Brunato (2015):

  1. Pseudo-technicisms or collateral technicisms (e.g., balneazione, fattispecie);
  2. Abstract nouns with -zione/-mento suffixes (e.g., stipulazione, espletamento), deverbal nouns, usually with zero suffix (e.g., subentro, scorporo, utilizzo) and denominal verbs (e.g., relazionare, disdettare);
  3. Archaic terms (e.g., allorché, suddetto) and latinisms (e.g., una tantum, pro capite);
  4. Forestierisms (e.g., governance, front office);
  5. Uncommon and formal terms (e.g., diniego for rifiuto);
  6. Stereotyped phrases (e.g., entro e non oltre, in riferimento all’oggetto);
  7. Abbreviations and acronyms.

For the creation of a NER for PA, we decided to exploit the existing architecture employed for the project SEMPLICE[1] and in particular we adopted a statistical method based on the Stanford NER (Finkel et al., 2005), a system implemented in Java and available for download under the GNU General Public License. This choice allowed us to easily compare the gain obtained by enriching the training corpus with PA documents and to speed up the development process. Moreover, using a Conditional Random Field (CRF) (Lafferty et al., 2001) as learning algorithm made it possible for us to compare the PA model with other domain-adapted NERs (Passaro and Lenci, 2014).

This paper is structured as follows: in section 2, we present the CoLingLab NER and we show its performance on a sample of PA documents; in section 3 we describe the adaptation of the system to PA texts and its performance (section 4.1). In section 5, we report on the annotation of relations that we performed on a sample of the corpus and finally discuss the results and ongoing work.

   [1] The SEMantic instruments for PubLIc administrators and CitizEns (SEMPLICE; www.semplicepa.it) is a 2-year project funded by Regione Toscana in collaboration with IT companies to develop NLP-based tools for knowledge management, information extraction and opinion mining for local public administrations.

2     The CoLingLab NER

The standard Italian CoLingLab NER was trained on the Italian Content Annotation Bank (I-CAB; Magnini et al., 2006), a corpus of Italian news annotated with semantic information at different levels: Temporal Expressions, Named Entities, relations between entities. I-CAB is composed of 525 news documents taken from the local newspaper ‘L’Adige’ (time span: September-October 2004). The NEs annotated in the corpus are: Locations (LOC), Geo-Political Entities (GPE), Organizations (ORG) and Persons (PER).

As we said before, this model is unsatisfactory for the domain of Public Administration in two main respects. First, its classes are insufficient to deal with the type of information in the PA documents, which are full of references to other “linked” acts and legislative references; second, the language used in these documents is a peculiar and highly complex variant of standard Italian (cf. above). In addition, the performance of the model, attested at ∼0.66 F1-score on a portion of I-CAB, decreases dramatically on the PA documents, reaching an F1-score of ∼0.35. To measure this performance, in the test set we mapped ORG PA (cf. below) to ORG, and in the training set we mapped GPE to LOC.

3     A NER for PA Documents

The adaptation of the CoLingLab NER to the PA domain included the extension of the standard NE classes (Rau, 1991; Grishman and Sundheim, 1996; Tjong Kim Sang, 2002; Tjong Kim Sang and De Meulder, 2003) to other entity types particularly important in the context of municipalities. In particular, we added the class ACT, to mark other administrative documents (normally, PA texts refer to other documents related to the same procedure), the class LAW for the relevant legislation, and an additional class of organizations, ORG PA, for municipal departments.

3.1    The PA Corpus

For the creation of the corpus, we used documents taken from the Albo Pretorio Nazionale with the
aim of capturing the variability of the texts produced by PA. Overall, the corpus includes 460 documents, for a total of 724,623 tokens, annotated with the following NEs: (i) ACT: documents belonging to the Albo Pretorio Nazionale, with their type (optional), number and date: Determina n. 4 del 12/02/2011; (ii) LAW: legislative references: art. 183 comma 7 del D.Lgs. n. 267/2000; (iii) LOC: locations and geo-political entities: Comune di Pisa; (iv) ORG PA: organizations related to the Public Administration such as municipal Departments: Sezione Anagrafe; (v) ORG: organizations: Consip Spa; (vi) PER: physical persons. The corpus has been linguistically annotated by means of a pipeline of general purpose NLP tools: in particular, it has been POS-tagged with the Part-Of-Speech tagger described in Dell’Orletta (2009) and dependency parsed with the DeSR parser (Attardi et al., 2009). Finally, complex terms like forze dell’ordine (security force) have been identified using the EXTra term extraction tool (Passaro and Lenci, 2016).

3.2   Annotation

NE annotation has been performed by means of an incremental process: first, 100 documents have been annotated by 2 annotators (one of them was a domain expert). In a second phase we trained a CRF model on these documents and we used it to automatically annotate new documents. Finally, we identified the most common errors of the classifier and two new annotators manually revised the output. This process has been repeated for each group of 100 documents until the whole corpus of 460 distinct documents was covered. The average length of the documents is 1,575.26 tokens and the total number of tokens is 724,623. Figure 1 shows the distribution of the different NE classes in the corpus.

NEs have been annotated on the CoNLL (Nivre et al., 2007) texts using the standard IOB method. In order to deal with acts, we decided to tag them with different “labels” to distinguish their sub-components: the type (marked with ACT T), the number (marked with ACT N), the date (marked with ACT D), functional tokens (marked with ACT X) and unparsable tokens (marked with ACT U). For example, the act Delibera di giunta comunale numero 53 del 23/10/2016 is annotated as follows: Delibera di giunta comunale (ACT T) numero (ACT X) 53 (ACT N) del (ACT X) 23/10/2016 (ACT D), while the act DD/67/2012 is annotated as ACT U. This method allows for a simpler normalization of normative references, which is crucial for document retrieval because of the high variability of law mentions in the PA texts.

The inter-annotator agreement between two annotators (attested at ∼0.8) has been calculated using Cohen’s K index on a sample of 25 documents from 25 different municipalities, for a total of 26,190 tokens.

4     System Overview

To train the NER, no information from gazetteers was used. The model includes the following groups of features:

SEQUENCES: Next and previous words and a window of 6 words (3 preceding and 3 following the target word) and their classes;
N-GRAMS: Character-level features, i.e., substrings of the word with a maximum length of 6 letters;
ORTHOGRAPHY: “word shape” features such as spelling, capital letters, presence of non-alphabetical characters etc.;
LINGUISTIC FEATURES: The word position in the sentence (numeric attribute), the lemma, and the PoS tag (nominal attribute);
TERMS: We employed complex terms as features to train the model. Terms have been extracted with EXTra (Passaro and Lenci, 2016).
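
As an illustration of the feature groups listed above, the following minimal sketch computes SEQUENCES, N-GRAMS and ORTHOGRAPHY features for one token of the act example from section 3.2. It is not the Stanford NER configuration used in this work: the function names, the feature keys, the run-collapsing word-shape encoding and the `B-`/`I-` label spelling with underscores are all assumptions made for the example. The classes of neighbouring words, the lemma, the PoS tag and the EXTra terms are left out because they depend on the model state or on external tools.

```python
from typing import Dict, List


def word_shape(token: str) -> str:
    """Coarse orthographic shape: X=upper, x=lower, d=digit, other chars kept.

    Runs of the same code are collapsed (an assumption of this sketch).
    """
    shape: List[str] = []
    for ch in token:
        if ch.isupper():
            code = "X"
        elif ch.islower():
            code = "x"
        elif ch.isdigit():
            code = "d"
        else:
            code = ch
        if not shape or shape[-1] != code:
            shape.append(code)
    return "".join(shape)


def char_ngrams(token: str, max_len: int = 6) -> List[str]:
    """All substrings of the token whose length is between 1 and max_len."""
    return [
        token[i:i + n]
        for n in range(1, max_len + 1)
        for i in range(len(token) - n + 1)
    ]


def token_features(tokens: List[str], i: int, window: int = 3) -> Dict[str, str]:
    """Features for tokens[i]: word, shape, position, context window, n-grams."""
    feats = {
        "word": tokens[i],
        "shape": word_shape(tokens[i]),
        "position": str(i),
    }
    # SEQUENCES: 3 words before and 3 words after the target, padded at the edges.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        feats[f"word[{offset:+d}]"] = tokens[j] if 0 <= j < len(tokens) else "<PAD>"
    # N-GRAMS: character substrings up to 6 letters.
    for gram in char_ngrams(tokens[i]):
        feats[f"ngram={gram}"] = "1"
    return feats


# The act example from section 3.2 with hypothetical IOB label spellings.
TOKENS = ["Delibera", "di", "giunta", "comunale", "numero", "53", "del", "23/10/2016"]
LABELS = ["B-ACT_T", "I-ACT_T", "I-ACT_T", "I-ACT_T",
          "B-ACT_X", "B-ACT_N", "B-ACT_X", "B-ACT_D"]

if __name__ == "__main__":
    for idx, (tok, lab) in enumerate(zip(TOKENS, LABELS)):
        feats = token_features(TOKENS, idx)
        print(tok, lab, feats["shape"], feats["word[-1]"], feats["word[+1]"])
```

Each feature dictionary could then be paired with its label and passed to any CRF toolkit that accepts per-token feature maps; the choice of toolkit is outside the scope of this sketch.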

 Figure 1: Distribution of the NEs in the corpus.

4.1    System’s Performance

We trained the CRF model based on the CoLingLab NER on the annotated PA corpus, and we tested its performance first with cross-validation and then on a sample of 25 new documents from 25 different municipalities. This choice stems from the fact that very often different municipalities tend to use different templates and different ways to refer to particular entities. This is particularly common in some NE classes such as ACT and ORG PA, which vary a lot across municipalities. For example, some of the analyzed texts contain strings of the form YYYY/G/NNNNN to refer to the acts, where the number is actually a string encoding the date (year: YYYY), a code for the type (G) and the number of the act (NNNNN). Other municipalities instead adopt a less strictly codified pattern to indicate the act, such as Type of act, number N* of DD/MM/YYYY. Likewise, depending on the writing style (and conventions) of the municipalities, the various departments (i.e., ORG PA) can include both strings like Corpo dei Vigili Urbani and codes like Tec-01/ICT. To evaluate the system performance with respect to the variation of the naming conventions adopted by different municipalities, we randomly selected 25 municipalities and one document for each of them, balanced for length.

                 Precision    Recall     F1-Score
    ACT           0.9747      0.8477      0.9068
    LAW           0.9494      0.9615      0.9554
    LOC           0.7990      0.6913      0.7413
    ORG           0.8017      0.7686      0.7848
    ORG PA        0.8706      0.7957      0.8315
    PER           0.9142      0.8694      0.8912
    MicroAVG      0.9140      0.8355      0.8730
    MacroAVG      0.8849      0.8224      0.8518

Table 2: System results (on a sample of 25 texts)
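
To make the act-naming variability discussed in this section concrete, the sketch below normalizes the two conventions mentioned in the text (the compact YYYY/G/NNNNN code and the verbose "type, number, date" form) into one record. The regular expressions, the function name `normalize_act` and the compact instance `2016/G/00053` are assumptions invented for the example; the verbose mentions are taken from the paper. This is a rule-based illustration of the normalization problem, not part of the CRF system described here.

```python
import re
from typing import Dict, Optional

# Compact form, e.g. "2016/G/00053": year / one-letter type code / act number.
# The exact field widths are an assumption based on the YYYY/G/NNNNN template.
COMPACT = re.compile(r"^(?P<year>\d{4})/(?P<type>[A-Z])/(?P<number>\d{5})$")

# Verbose form, e.g. "Determina n. 4 del 12/02/2011" or
# "Delibera di giunta comunale numero 53 del 23/10/2016".
VERBOSE = re.compile(
    r"^(?P<type>.+?)\s+(?:n\.|numero)\s*(?P<number>\d+)\s+del\s+"
    r"(?P<day>\d{2})/(?P<month>\d{2})/(?P<year>\d{4})$",
    re.IGNORECASE,
)


def normalize_act(mention: str) -> Optional[Dict[str, str]]:
    """Return {'type', 'number', 'year'} for a recognized act mention, else None."""
    text = mention.strip()
    for pattern in (COMPACT, VERBOSE):
        match = pattern.match(text)
        if match:
            return {
                "type": match["type"],
                # Drop leading zeros so "00053" and "53" compare equal.
                "number": str(int(match["number"])),
                "year": match["year"],
            }
    # Mentions such as "DD/67/2012" fit neither template; in the corpus
    # such strings are the ones tagged as unparsable (ACT U).
    return None


if __name__ == "__main__":
    for mention in ["2016/G/00053",
                    "Determina n. 4 del 12/02/2011",
                    "Delibera di giunta comunale numero 53 del 23/10/2016",
                    "DD/67/2012"]:
        print(mention, "->", normalize_act(mention))
```

Hand-written patterns of this kind cover only the templates they were written for, which is one reason a statistical tagger evaluated on unseen municipalities is a useful complement.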
   Table 1 reports the results obtained in cross-validation and Table 2 shows the performance on the sample of 25 documents. Figure 2 also shows the confusion matrix for that sample.
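
As a reading aid for the two result tables, the short sketch below recomputes the MacroAVG rows from the per-class values reported in Table 1 and Table 2. The dictionary and function names are assumptions of the example; the numbers are copied from the tables. It also shows that the reported macro F1 corresponds to the unweighted mean of the per-class F1 scores rather than to the F1 of the macro-averaged precision and recall.

```python
from statistics import mean
from typing import Dict, Tuple

Row = Tuple[float, float, float]  # (precision, recall, F1-score)

# Per-class results, 10-fold cross-validation (Table 1).
TABLE_1: Dict[str, Row] = {
    "ACT": (0.7876, 0.8914, 0.8356),
    "LAW": (0.8270, 0.8423, 0.8343),
    "LOC": (0.7020, 0.7398, 0.7196),
    "ORG": (0.7085, 0.6890, 0.6977),
    "ORG PA": (0.6158, 0.7774, 0.6855),
    "PER": (0.8373, 0.8776, 0.8567),
}

# Per-class results on the sample of 25 texts (Table 2).
TABLE_2: Dict[str, Row] = {
    "ACT": (0.9747, 0.8477, 0.9068),
    "LAW": (0.9494, 0.9615, 0.9554),
    "LOC": (0.7990, 0.6913, 0.7413),
    "ORG": (0.8017, 0.7686, 0.7848),
    "ORG PA": (0.8706, 0.7957, 0.8315),
    "PER": (0.9142, 0.8694, 0.8912),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_average(rows: Dict[str, Row]) -> Row:
    """Unweighted mean of each column over the classes, rounded to 4 decimals."""
    precision, recall, f1_score = (round(mean(col), 4) for col in zip(*rows.values()))
    return precision, recall, f1_score


if __name__ == "__main__":
    print("Table 1 macro:", macro_average(TABLE_1))
    print("Table 2 macro:", macro_average(TABLE_2))
    p, r, _ = macro_average(TABLE_2)
    print("F1 of macro P/R (differs from mean of F1s):", round(f1(p, r), 4))
```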
   In order to investigate the contribution of non-linguistic features, we performed ablation experiments and we tested the results on the sample of 25 documents. The ∆F1-Score for such groups is as follows: SEQUENCES: 3%; N-GRAMS: 1%; ORTHOGRAPHY: 4%. In addition, we performed a further experiment by training the NER on a combination of I-CAB and the PA documents. In this case, we noticed a ∆F1-Score of 2% with respect to the original model.

                 Precision    Recall     F1-Score
    ACT           0.7876      0.8914      0.8356
    LAW           0.8270      0.8423      0.8343
    LOC           0.7020      0.7398      0.7196
    ORG           0.7085      0.6890      0.6977
    ORG PA        0.6158      0.7774      0.6855
    PER           0.8373      0.8776      0.8567
    MacroAVG      0.7464      0.8029      0.7716

Table 1: System results (10-fold cross validation)

Figure 2: Confusion Matrix (25 texts)

5    Towards a Relational Classifier for PA

For a subset of the corpus, we also annotated the semantic relations occurring between two entities in the domain of the PA, using the following scheme:

PART OF: the relation of hyponymy, which can occur between: (i) two locations (e.g. a Municipality in a Province); (ii) two organizations (e.g. a participated company in a holding company); (iii) a person and an organization (e.g. a member of an organization). The implicit attribute for this relation is “work in”.
LOCATION: an entity placed in a particular location, occurring between: (i) an organization and a location (e.g. an organization located in a certain region). Possible attributes for this relation are “work in” and “placed in”; (ii) a person and a location (e.g. a person living in a particular area). Possible attributes are “work in”, “born in” and “placed in”.
IS RELATED TO: an underspecified relation between any entity pair.

   Preliminary experiments have been performed to examine the characteristics of an automatic classifier for extracting relations from administrative acts, and the performance seems very promising, despite the size of the training set, which includes in total 100 documents so far. The
extension of the annotated corpus and the training of the relational classifier are currently ongoing.

6   Discussion

The results show that the NER reaches satisfactory results for most of the classes, although lagging behind in the recognition of PA Organizations, which, among others, tend to have a higher formal variability, including for example entities like both Corpo dei Vigili Urbani and Tec-01/ICT. Moreover, in the recognition of Location names in the domain of the PA, the system is expected to detect entities with a non-standard level of detail, going from the name of the municipalities (e.g. Comune di Pisa) to very detailed addresses (e.g., via S. Maria n. 36, 56126 Pisa (PI) interno 15). A similar problem occurs in the recognition of very small organizations, whose name contains the name of the founder (e.g., Mario Rossi snc). In these cases, especially when snc is omitted, the system predicts the class PER instead of the correct class ORG. We are confident that adding lexicons and gazetteers will improve the identification of entities of this kind, but it could be interesting to investigate automatic normalization, disambiguation and entity linking approaches (Hoffart et al., 2011; Han et al., 2011).

7   Conclusions and Ongoing Work

Named entities play an important role in administrative acts, especially in those (like the documents in the Albo Pretorio) describing the main actions taken by Municipalities. This kind of information is very useful to fulfil the obligations related to supervisory monitoring, disclosure, periodic self-assessment, and review of government decisions.

   In this paper, we presented a NER for PA that shows a significant ability to identify the relevant entities, and in particular legislative references and connected acts. It is important to stress that the lexical and syntactic complexity of bureaucratic language represents a big challenge for NLP tools and methods. Such complexity derives from the technical lexis of other domain-specific languages with which PA deals daily, such as education, environment, ICT technologies, public health and so on. In the near future we plan to explore the possibility of re-engineering our system to take advantage of new algorithms for entity extraction such as neural networks, and in particular of character level word embeddings. Moreover, we will focus on the development of classifiers for Relation Extraction and Entity Linking.

Acknowledgments

This research has been supported by the Project SEMantic instruments for PubLIc administrators and CitizEns (SEMPLICE), funded by Regione Toscana, and by the Company ETI3 | Evolution, Technology & Innovation. Special acknowledgements go to Roberto Battistelli and Francesco Sandrelli (ETI3) for support, and to the students Roswita Candusso, Carmela Cinquesanti, Federica Semplici and Ludovica Vasile for manual annotation.

References

Douglas E. Appelt, Jerry R. Hobbs, John Bear, David Israel, and Mabry Tyson. 1993. FASTUS: A finite-state processor for information extraction from real-world text. In IJCAI, volume 93, pages 1172–1178.

Giuseppe Attardi, Felice Dell’Orletta, Maria Simi, and Joseph Turian. 2009. Accurate dependency parsing with a stacked multilayer perceptron. In EVALITA 2009 - Evaluation of NLP and Speech Tools for Italian 2009, LNCS, Reggio Emilia (Italy). Springer.

Adam L. Berger, Vincent J. Della Pietra, and Stephen A. Della Pietra. 1996. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1):39–71.

Daniel M. Bikel, Scott Miller, Richard Schwartz, and Ralph Weischedel. 1997. Nymble: A high-performance learning name-finder. In Proceedings of the Fifth Conference on Applied Natural Language Processing, pages 194–201, Washington, DC. Association for Computational Linguistics.

Andrew Borthwick, John Sterling, Eugene Agichtein, and Ralph Grishman. 1998. NYU: Description of the MENE named entity system as used in MUC-7. In Proceedings of the Seventh Message Understanding Conference (MUC-7).

Dominique Brunato. 2015. A study on linguistic complexity from a computational linguistics perspective. A corpus-based investigation of Italian bureaucratic texts. Ph.D. Thesis, University of Siena.

Indra Budi and Stéphane Bressan. 2003. Association rules mining for name entity recognition. In Web Information Systems Engineering, 2003. WISE 2003. Proceedings of the Fourth International Conference on, pages 325–328. IEEE.

Jason P.C. Chiu and Eric Nichols. 2015. Named entity recognition with bidirectional LSTM-CNNs. arXiv preprint arXiv:1511.08308.
Corinna Cortes and Vladimir Vapnik. 1995. Support-vector networks. Mach. Learn., 20(3):273–297.

Felice Dell’Orletta. 2009. Ensemble system for part-of-speech tagging. In EVALITA 2009 - Evaluation of NLP and Speech Tools for Italian 2009, LNCS, Reggio Emilia (Italy). Springer.

Jenny Rose Finkel, Trond Grenager, and Christopher Manning. 2005. Incorporating non-local information into information extraction systems by Gibbs sampling. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, ACL ’05, pages 363–370, Ann Arbor, Michigan (USA). Association for Computational Linguistics.

Ralph Grishman and Beth Sundheim. 1996. Message understanding conference-6: A brief history. In Proceedings of the 16th conference on Computational linguistics-Volume 1, pages 466–471. Association for Computational Linguistics.

Ralph Grishman. 1995. The NYU system for MUC-6 or where’s the syntax? In Proceedings of the 6th Conference on Message Understanding, pages 167–175, Columbia, Maryland. Association for Computational Linguistics.

Xianpei Han, Le Sun, and Jun Zhao. 2011. Collective entity linking in web text: A graph-based method. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 765–774, Beijing (China).

Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011. Robust disambiguation of named entities in text. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 782–792, Edinburgh (United Kingdom). Association for Computational Linguistics.

John Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning, pages 282–289, San Francisco, CA (USA). Morgan Kaufmann Publishers Inc.

Bernardo Magnini, Emanuele Pianta, Manuela Speranza, Valentina Bartalesi Lenzi, and Rachele Sprugnoli. 2006. Italian content annotation bank (I-CAB): Named entities.

Andrew McCallum and Wei Li. 2003. Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons. In Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003-Volume 4, pages 188–191, Edmonton (Canada). Association for Computational Linguistics.

Joakim Nivre, Johan Hall, Sandra Kübler, Ryan McDonald, Jens Nilsson, Sebastian Riedel, and Deniz Yuret. 2007. The CoNLL 2007 shared task on dependency parsing. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007, pages 915–932, Prague (Czech Republic). Association for Computational Linguistics.

Lucia C. Passaro and Alessandro Lenci. 2014. “Il Piave mormorava...”: Recognizing locations and other named entities in Italian texts on the Great War. In Proceedings of the First Italian Conference on Computational Linguistics CLiC-it 2014 and of the Fourth International Workshop EVALITA 2014, pages 286–290, Pisa (Italy).

Lucia C. Passaro and Alessandro Lenci. 2016. Extracting terms with EXTra. In Proceedings of the EUROPHRAS 2015 – Computerised and Corpus-based Approaches to Phraseology: Monolingual and Multilingual Perspectives, pages 188–196, Malaga (Spain).

Lisa F. Rau. 1991. Extracting company names from text. In Artificial Intelligence Applications, 1991. Proceedings., Seventh IEEE Conference on, volume 1, pages 29–32. IEEE.

Rohini Srihari, Cheng Niu, and Wei Li. 2000. A hybrid approach for named entity and sub-type tagging. In Proceedings of the Sixth Conference on Applied Natural Language Processing, ANLC ’00, pages 247–254, Seattle, Washington (USA). Association for Computational Linguistics.

Emma Strubell, Patrick Verga, David Belanger, and Andrew McCallum. 2017. Fast and accurate entity recognition with iterated dilated convolutions. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2660–2670.

Erik F. Tjong Kim Sang and Fien De Meulder. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003-Volume 4, pages 142–147. Association for Computational Linguistics.

Erik F. Tjong Kim Sang. 2002. Introduction to the CoNLL-2002 shared task: Language-independent named entity recognition. In Proceedings of the 6th conference on Natural language learning, volume 31, pages 1–4.