<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Contextual Representations and Semi-Supervised Named Entity Recognition for Portuguese Language</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Pedro</forename><forename type="middle">Vitor</forename><surname>Quinta de Castro</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Universidade Federal de Goiás</orgName>
								<address>
									<postCode>GO, 74690-900</postCode>
									<settlement>Goiânia</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Nádia</forename><surname>Félix Felipe da Silva</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Universidade Federal de Goiás</orgName>
								<address>
									<postCode>GO, 74690-900</postCode>
									<settlement>Goiânia</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Anderson</forename><surname>da Silva Soares</surname></persName>
							<email>anderson@inf.ufg.br</email>
							<affiliation key="aff0">
								<orgName type="institution">Universidade Federal de Goiás</orgName>
								<address>
									<postCode>GO, 74690-900</postCode>
									<settlement>Goiânia</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Contextual Representations and Semi-Supervised Named Entity Recognition for Portuguese Language</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">B711FD890A284C3670442838F44A8A43</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T16:58+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Natural Language Processing</term>
					<term>Named Entity Recognition</term>
					<term>Deep Learning</term>
					<term>Neural Networks</term>
					<term>Portuguese Language</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Named Entity Recognition is a Natural Language Processing task which is difficult to adapt across different domains. In this work, we propose a Semi-Supervised approach using Deep Learning models in order to support three different domains for the Portuguese language: general, police and medical. We perform the self-training of a model with an architecture based on a Bidirectional Long Short-Term Memory network with a Conditional Random Fields sequential classifier, using five Portuguese corpora. The word representations of the proposed model are contextual and provided by ELMo's language model. The results achieve a competitive performance in the IberLEF evaluation forum.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Information Extraction (IE) is the process of obtaining structured data from sources which cannot be interpreted directly by machines, such as texts <ref type="bibr" target="#b22">[23]</ref>. This is particularly important considering the amount of textual information exchanged every minute on the internet <ref type="bibr" target="#b33">[34]</ref>. Named Entity Recognition (NER) is the Natural Language Processing (NLP) task which focuses on identifying and classifying named entities in this unstructured textual information, making them interpretable and accessible to different communication channels.</p><p>When dealing with multiple domains, a NER prediction model needs to handle not only the differences in lexicon between them, but also the differences in morphological features. This adds an additional layer of complexity to the task, requiring a more scalable model to perform well in this challenge.</p><p>This paper describes our participation in IberLEF (Iberian Languages Evaluation Forum), Task 1: Named Entity Recognition <ref type="bibr" target="#b30">[31]</ref>. We present a system based on a BiLSTM-CRF architecture fed with contextual ELMo word representations, trained in a semi-supervised manner to cover the general, police and clinical domains.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>The first Deep Learning architectures to be applied in NER models were based on CNNs <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b31">32]</ref>, and later on Recurrent Neural Networks (RNN) <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b16">17,</ref><ref type="bibr" target="#b21">22]</ref>. Deep Learning models perform well on NLP tasks because they learn latent features from words, as well as the interactions between them, during the training of specific tasks, such as NER.</p><p>Collobert et al. <ref type="bibr" target="#b4">[5]</ref> proposed a model based on a Multilayer Perceptron with a convolutional layer, and the following works for NER were mostly based on bidirectional LSTMs, with a few differences between them. Huang et al. <ref type="bibr" target="#b10">[11]</ref> used a biLSTM-CRF network with manually selected features, combined with features from SENNA <ref type="bibr" target="#b4">[5]</ref> word embeddings. Chiu and Nichols <ref type="bibr" target="#b3">[4]</ref> used a biLSTM model without the CRF layer for classification, and had their best results with character level features extracted from a CNN layer, concatenated with SENNA embeddings. Lample et al. <ref type="bibr" target="#b16">[17]</ref> and Ma and Hovy <ref type="bibr" target="#b21">[22]</ref> used similar approaches based on biLSTM-CRF models, with the difference that <ref type="bibr" target="#b16">[17]</ref> used a biLSTM to extract character level features, combined with Word2Vec <ref type="bibr" target="#b23">[24]</ref> representations, while <ref type="bibr" target="#b21">[22]</ref> used a CNN to extract the character level features, which were combined with GloVe <ref type="bibr" target="#b28">[29]</ref> embeddings. These works show that biLSTM-CRF networks became a standard architecture for NER models (as well as for other NLP sequential classification tasks). Subsequent works focused on the representation of the words rather than on the NER model itself, and language models have become the primary architecture for contextualized word representations.</p><p>Peters et al. <ref type="bibr" target="#b29">[30]</ref>, Devlin et al. <ref type="bibr" target="#b6">[7]</ref> and Akbik et al. <ref type="bibr" target="#b0">[1]</ref> developed different architectures for contextual word representations based on bidirectional language models and evaluated their performance on the NER task (as well as on other NLP tasks). Both <ref type="bibr" target="#b29">[30]</ref> and <ref type="bibr" target="#b0">[1]</ref> used a biLSTM-CRF baseline NER model for evaluating their representation models, while <ref type="bibr" target="#b6">[7]</ref> evaluated their model by adding a neural layer to the language model, performing the NER classification with it. The ELMo (Embeddings from Language Model) representations from <ref type="bibr" target="#b29">[30]</ref> are provided by the biLM language model, which is based on 2 biLSTM networks, with 2 layers each, and the model's input is a character level representation provided by a CNN network. In contrast, <ref type="bibr" target="#b6">[7]</ref> created BERT, a language model based on the Transformer <ref type="bibr" target="#b35">[36]</ref> architecture, which relies solely on the neural attention mechanism. 
The authors of <ref type="bibr" target="#b0">[1]</ref> created a character-level language model, whose objective is not to predict words, but characters. The architecture of their CharLM model is also based on a biLSTM network. Table <ref type="table">1</ref> lists the models presented in this section with their respective F-Score performance on the English benchmark from CoNLL-2003 <ref type="bibr" target="#b34">[35]</ref>.</p><p>For the Portuguese language, the first work that used a Deep Learning approach was from Dos Santos and Guimarães <ref type="bibr" target="#b31">[32]</ref>, who adapted the architecture from <ref type="bibr" target="#b4">[5]</ref> and proposed CharWNN. For this work, besides using character level features from a CNN, the authors also used word embeddings pre-trained with the Word2Vec tool <ref type="bibr" target="#b37">[38]</ref>. Da Costa and Paetzold <ref type="bibr" target="#b5">[6]</ref> and Quinta de Castro et al. <ref type="bibr" target="#b2">[3]</ref> used a BiLSTM-CRF architecture with minor differences between them. <ref type="bibr" target="#b5">[6]</ref> concatenated character level features from a BiLSTM network with FastText <ref type="bibr" target="#b12">[13]</ref> word embeddings, prior to passing this concatenation through another BiLSTM network. <ref type="bibr" target="#b2">[3]</ref> used an approach similar to <ref type="bibr" target="#b16">[17]</ref> and concatenated the character level features from a BiLSTM network with the representations of a second BiLSTM, which processed pre-trained Wang2Vec <ref type="bibr" target="#b19">[20]</ref> embeddings.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Proposed Model</head><p>In this work, we propose a system based on different deep learning architectures, similar to the one used by <ref type="bibr" target="#b29">[30]</ref>: a Bidirectional Long Short-Term Memory (BiLSTM) <ref type="bibr" target="#b9">[10]</ref> NER model with a Conditional Random Fields (CRF) <ref type="bibr" target="#b15">[16]</ref> sequential classifier, fed by the contextual word representations from an ELMo <ref type="bibr" target="#b29">[30]</ref> language model, combined with character level representations from a Convolutional Neural Network (CNN) <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b17">18]</ref>. Our system differs from <ref type="bibr" target="#b29">[30]</ref> in that we do not use pre-trained word embeddings, and we use two different ELMo models, one for the general domain of the Portuguese language and one for the police domain.</p><p>The ELMo embeddings are obtained using the biLM (bidirectional Language Model) <ref type="bibr" target="#b29">[30]</ref> architecture. This architecture is based on 2 BiLSTM networks, each of them responsible for one direction in the bidirectional language model: one keeps a representation while making predictions in the forward direction of the text, and the other does so for the reverse direction. The first layer of the biLM model produces character level features from the training words using two CNNs, one for each direction of the text, each of them with 2048 convolutional filters. They produce a representation with a total dimension of 4096, which is fed to the first BiLSTM layer of the biLM model. Each layer of the model (the CNN and the two BiLSTMs) projects the input it receives to a vector of dimension 1024. These three projections represent the ELMo embeddings produced by the biLM model. The size of the biLM training vocabulary determines the number of words that will be predicted in the Softmax layer of the model, as shown in figure <ref type="figure" target="#fig_0">1</ref>.</p><p>The BiLSTM-CRF architecture used in this work is the same as the one from the AllenNLP framework <ref type="bibr" target="#b1">[2]</ref>, following a parameterization similar to the one described in <ref type="bibr" target="#b29">[30]</ref> for the NER task. The CNN network used for producing character level features from words uses embeddings of dimension 16 and 128 convolutional filters of size 3, with the ReLU <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b25">26]</ref> activation function. The BiLSTM network used for encoding the words has 2 layers, with 200 hidden units each. Figure <ref type="figure" target="#fig_1">2</ref> shows the dimensionality of the word representations obtained from the CNN and the two ELMo embeddings used. The two ELMo models we use were trained on two separate domains: for the general Portuguese domain we used a Portuguese Wikipedia <ref type="bibr" target="#b36">[37]</ref> dump, and for the police domain we used a 1.6 billion word corpus created from public documents from Brazil's Labor Courts <ref type="bibr" target="#b14">[15]</ref>. The Portuguese ELMo model we trained is publicly available at https://allennlp.org/elmo. For the IberLEF evaluation, we performed the fine-tuning of this ELMo model on the combined dataset, following <ref type="bibr" target="#b29">[30]</ref>.</p></div>
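<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the composition of Figure 2 concrete, the following minimal PyTorch sketch shows how a word representation is assembled before entering the tagger: a 128-dimensional character-level CNN feature vector (character embeddings of size 16, 128 filters of width 3, ReLU) is concatenated with the 1024-dimensional outputs of the general-domain and police-domain ELMo models, and the result feeds the 2-layer BiLSTM encoder with 200 hidden units per direction. This is an illustrative sketch, not the AllenNLP configuration actually used; the class names are ours, and random tensors stand in for the pre-computed ELMo outputs.</p><p><code>
# Sketch of the word-representation assembly described in Section 3.
# Assumption: the two ELMo models are run elsewhere and arrive here as
# tensors of shape (batch, seq_len, 1024); random tensors stand in for
# them below. Dimensions follow the paper: char embeddings of size 16,
# 128 convolutional filters of width 3 with ReLU, and a 2-layer BiLSTM
# with 200 hidden units per direction.
import torch
import torch.nn as nn

class CharCnnEncoder(nn.Module):
    def __init__(self, num_chars=300, char_dim=16, num_filters=128, width=3):
        super().__init__()
        self.char_emb = nn.Embedding(num_chars, char_dim, padding_idx=0)
        self.conv = nn.Conv1d(char_dim, num_filters, kernel_size=width, padding=1)

    def forward(self, char_ids):
        # char_ids: (batch, seq_len, max_word_len)
        b, s, w = char_ids.shape
        x = self.char_emb(char_ids.view(b * s, w))      # (b*s, w, 16)
        x = torch.relu(self.conv(x.transpose(1, 2)))    # (b*s, 128, w)
        x = x.max(dim=2).values                         # max-pool over characters
        return x.view(b, s, -1)                         # (batch, seq_len, 128)

class WordRepresentation(nn.Module):
    """Concatenates char-CNN features with the two 1024-dim ELMo outputs."""
    def __init__(self):
        super().__init__()
        self.chars = CharCnnEncoder()
        self.encoder = nn.LSTM(128 + 1024 + 1024, 200, num_layers=2,
                               batch_first=True, bidirectional=True)

    def forward(self, char_ids, elmo_general, elmo_police):
        rep = torch.cat([self.chars(char_ids), elmo_general, elmo_police], dim=-1)
        encoded, _ = self.encoder(rep)   # (batch, seq_len, 400)
        return encoded                   # a linear projection and the CRF layer operate on top of this

# Toy usage with random stand-ins for the pre-computed ELMo embeddings.
batch, seq_len, word_len = 2, 12, 20
char_ids = torch.randint(1, 300, (batch, seq_len, word_len))
elmo_general = torch.randn(batch, seq_len, 1024)
elmo_police = torch.randn(batch, seq_len, 1024)
print(WordRepresentation()(char_ids, elmo_general, elmo_police).shape)
</code></p></div>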
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experimental Setup and Results</head><p>For the Portuguese NER task, IberLEF specified the evaluation of models in three different domains: general, police and clinical. For the specific domains only person names (PER category) are annotated, while the general domain dataset is annotated with 5 different categories: person (PER), place (PLC), organization (ORG), value (VAL) and time (TME). The following public corpora were used for the model proposed in this work: WikiNER <ref type="bibr" target="#b26">[27]</ref>, LeNER-Br <ref type="bibr" target="#b20">[21]</ref>, the HAREM I <ref type="bibr" target="#b32">[33]</ref> and MiniHAREM <ref type="bibr" target="#b27">[28]</ref> golden collections, and Paramopama <ref type="bibr" target="#b13">[14]</ref>. We also used a private legal corpus provided by the Datalawyer company, consisting of 76 annotated documents from the Brazilian Labor Court. The only dataset annotated with all five categories is HAREM. These corpora have the following categories annotated in them: - HAREM: Place, Organization, Person, Time, Value, Abstraction, Work, Event, Thing and Other; - LeNER-Br: Legal Case, Law, Place, Organization, Person and Time; - Paramopama: Place, Organization, Person and Time; - WikiNER: Place, Miscellaneous, Organization and Person; - Datalawyer: Function, Legal Basis, Place, Organization, Person, Court, Settlement Value, Pleaded Value, Conviction Value, Court Costs and District.</p><p>Since only the HAREM datasets contain all the categories needed for the IberLEF evaluation, we adopted a semi-supervised approach, training an initial NER model to perform the self-training of the final model. This training followed the procedure below: 1. For each one of the datasets, we ignored all the entities that were not annotated as one of the 5 relevant categories for this evaluation, and their annotation was removed; 2. We merged the datasets from HAREM, LeNER-Br and Paramopama, and randomly split them into training, validation and test sets; 3. The resulting datasets from the previous step were used to train a NER model for bootstrapping Time and Value annotations for the datasets that did not contain these categories; 4. The bootstrap model was used to annotate Time and Value entities in the WikiNER dataset, Value entities in the LeNER-Br dataset, Value entities in the Paramopama dataset, and Time and Value entities in the Datalawyer dataset; 5. The resulting bootstrapped corpora were merged and split into training, validation and test sets; 6. The resulting datasets from the previous step were used to train the final NER model that was submitted to the IberLEF evaluation.</p><p>None of the existing annotations was removed or overridden during the bootstrapping of the datasets. Only words that had no category associated with them prior to this process were classified as either Time or Value, according to the bootstrap model, as illustrated in the sketch below.</p></div>
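<div xmlns="http://www.tei-c.org/ns/1.0"><p>A minimal sketch of the merging rule applied in steps 3 and 4 is given below: the bootstrap model's Time and Value predictions are only adopted for tokens that carry no annotation, so gold labels are never overridden. The helper name and the BIO tag scheme are illustrative assumptions, not the exact implementation used in our pipeline.</p><p><code>
# Sketch of the bootstrapping rule from Section 4 (steps 3-4): keep a
# predicted TME/VAL tag only where the original corpus has no label ("O");
# every existing gold annotation is preserved unchanged.
def merge_bootstrap(gold_tags, predicted_tags, allowed=("TME", "VAL")):
    merged = []
    for gold, pred in zip(gold_tags, predicted_tags):
        if gold == "O" and pred.split("-")[-1] in allowed:
            merged.append(pred)   # adopt the bootstrapped Time/Value tag
        else:
            merged.append(gold)   # keep the original annotation
    return merged

sentence  = ["Audiência", "marcada", "para", "10", "de", "março", "."]
gold      = ["O", "O", "O", "O", "O", "O", "O"]
predicted = ["O", "O", "O", "B-TME", "I-TME", "I-TME", "O"]
print(list(zip(sentence, merge_bootstrap(gold, predicted))))
</code></p></div>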
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Models Evaluation</head><p>Prior to submitting the NER model with word representations from two ELMo models and a CNN (henceforth referred to as 2xELMo+CNN), we performed the training of two other models, with different types of word representation: (i) ELMo+CNN and (ii) ELMo+CNN+Wang2Vec <ref type="bibr" target="#b18">[19]</ref>. These two models use only the general domain ELMo. We trained these three models using the same configuration, and performed an additional evaluation of them on the following datasets: MiniHAREM, the test datasets from the Datalawyer company and LeNER-Br, and the full datasets from Paramopama and WikiNER. For all of them, except MiniHAREM, we evaluated both variants: with and without bootstrapped Time and Value entities. The model with the best F-Score was ELMo+CNN+Wang2Vec, followed by 2xELMo+CNN.</p><p>We also evaluated the three models on all nine datasets (MiniHAREM, Datalawyer, LeNER-Br, Paramopama and WikiNER, with the last four evaluated both on the original and on the bootstrapped versions). 2xELMo+CNN had the best results for the MiniHAREM dataset, as well as for the police domain datasets (Datalawyer and LeNER-Br). ELMo+CNN had the best results for Paramopama and WikiNER. After grouping these evaluation results by model, the best mean F-Score was from the 2xELMo+CNN variant. Since 2xELMo+CNN performed better in the police domain (which is relevant for the IberLEF evaluation), we chose this model for the task evaluation.</p><p>Table <ref type="table" target="#tab_2">2</ref> presents the results obtained from the IberLEF evaluation. We point out that the only HAREM corpus we did not use to train our models was the one from HAREM II <ref type="bibr" target="#b24">[25]</ref>, which is the one used in the general domain evaluation. We also did not have access to any clinical documents or embeddings, so our model contained no adaptation for this specific domain.</p></div>
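<div xmlns="http://www.tei-c.org/ns/1.0"><p>The F-Scores discussed above are entity-level scores; the short sketch below illustrates the span-matching principle used by CoNLL-style evaluation <ref type="bibr" target="#b34">[35]</ref>, where an entity only counts as correct when its type and both boundaries match the gold annotation exactly. It is a simplified stand-in, not the official script, and the function names and BIO handling are our own assumptions.</p><p><code>
# Simplified entity-level F-Score in the spirit of the CoNLL script:
# entities are compared as (type, start, end) spans.
def spans(tags):
    out, start = set(), None
    for i, tag in enumerate(tags + ["O"]):
        if start is not None and not tag.startswith("I-"):
            out.add((tags[start].split("-")[1], start, i))
            start = None
        if tag.startswith("B-"):
            start = i
    return out

def f_score(gold_tags, pred_tags):
    gold, pred = spans(gold_tags), spans(pred_tags)
    correct = len(gold.intersection(pred))
    precision = correct / max(len(pred), 1)
    recall = correct / max(len(gold), 1)
    return 2 * precision * recall / max(precision + recall, 1e-9)

gold = ["B-PER", "I-PER", "O", "B-TME", "O"]
pred = ["B-PER", "I-PER", "O", "B-VAL", "O"]
print(round(f_score(gold, pred), 2))   # 0.5: one of the two entities matched
</code></p></div>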
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Concluding Remarks</head><p>For the Portuguese NER task of the Iberian Languages Evaluation Forum, we experimented with different systems based on deep learning architectures, for both the NER model and the word representations. For the NER model we used the BiLSTM-CRF architecture, which has become a reference for sequential classification NLP tasks. For word representations we experimented with character level features from Convolutional Neural Networks, Wang2Vec pre-trained word embeddings, and the ELMo embeddings from a biLM language model. We evaluated different models with different types of word representations on five different corpora, and submitted a system based on two different ELMo models, combined with character level features. Our model was trained in a semi-supervised scenario, to account for the lack of certain categories in the corpora used.</p><p>Our main contribution is the use of ELMo embeddings for the Portuguese NER task, which have not been reported so far in the related literature. Our pre-trained ELMo model is publicly available at https://allennlp.org/elmo.</p><p>For future work, instead of training a single NER model with different ELMo representations for different domains, we will experiment with an ensemble of different models, each one trained separately on a different domain.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Layer representations of the biLM architecture and their connections between layers and projections. Note that the arrows → and ← in the LSTM layers indicate the direction of the objective function of the bidirectional language model, not the direction of the LSTM networks, which are also bidirectional. Each 2-layer BiLSTM network used in this scheme works as a unidirectional language model, and their composition provides bidirectionality to the whole language model.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Representation of words in the proposed architecture</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>NER models using Deep Learning architectures for English and Portuguese languages, both evaluated using the CoNLL script [35]. The English language results are reported on the CoNLL-2003 [35] benchmark, and the Portuguese ones are reported on the HAREM [33] benchmark.</figDesc><table><row><cell>Work</cell><cell>Benchmark</cell><cell>F-Score</cell><cell>Year</cell></row><row><cell>Akbik et al. [1]</cell><cell>CoNLL-2003</cell><cell>93.09%</cell><cell>2018</cell></row><row><cell>Devlin et al. (BERT Large) [7]</cell><cell>CoNLL-2003</cell><cell>92.80%</cell><cell>2018</cell></row><row><cell>Devlin et al. (BERT Base) [7]</cell><cell>CoNLL-2003</cell><cell>92.40%</cell><cell>2018</cell></row><row><cell>Peters et al. [30]</cell><cell>CoNLL-2003</cell><cell>92.22%</cell><cell>2018</cell></row><row><cell>Chiu and Nichols [4]</cell><cell>CoNLL-2003</cell><cell>91.62%</cell><cell>2016</cell></row><row><cell>Ma and Hovy [22]</cell><cell>CoNLL-2003</cell><cell>91.21%</cell><cell>2016</cell></row><row><cell>Lample et al. [17]</cell><cell>CoNLL-2003</cell><cell>90.94%</cell><cell>2016</cell></row><row><cell>Huang et al. [11]</cell><cell>CoNLL-2003</cell><cell>90.10%</cell><cell>2015</cell></row><row><cell>Collobert et al. [5]</cell><cell>CoNLL-2003</cell><cell>89.59%</cell><cell>2011</cell></row><row><cell>Quinta de Castro et al. [3]</cell><cell>HAREM-Sel / HAREM-Tot</cell><cell>76.27% / 70.33%</cell><cell>2018</cell></row><row><cell>Dos Santos and Guimarães [32]</cell><cell>HAREM-Sel / HAREM-Tot</cell><cell>71.23% / 65.41%</cell><cell>2018</cell></row><row><cell>Da Costa and Paetzold [6]</cell><cell>HAREM-Tot</cell><cell>69.14%</cell><cell>2018</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2 .</head><label>2</label><figDesc>Results from the IberLEF evaluation, for the 3 different domains.</figDesc><table><row><cell>Corpus</cell><cell>Category</cell><cell>Precision</cell><cell>Recall</cell><cell>F-Score</cell></row><row><cell>Police Dataset</cell><cell>Person</cell><cell>86.14%</cell><cell>92.82%</cell><cell>89.35%</cell></row><row><cell>Clinical Dataset</cell><cell>Person</cell><cell>32.47%</cell><cell>51.02%</cell><cell>39.68%</cell></row><row><cell>General Dataset (SIGARRA + HAREM II)</cell><cell>Overall</cell><cell>63.11%</cell><cell>51.69%</cell><cell>56.83%</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019)</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Acknowledgements</head><p>Thanks to Datalawyer (https://www.datalawyer.com.br/) for the financial support and for providing the legal dataset used for training the submitted model. This work was developed within the Deep Learning Brazil research group. Our research is sponsored by Copel Energy Distribution, Data-H Artificial Intelligence, CyberLabs Artificial Intelligence, Americas Health and iFood Food Delivery.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Contextual string embeddings for sequence labeling</title>
		<author>
			<persName><forename type="first">A</forename><surname>Akbik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Blythe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vollgraf</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">COLING 2018, 27th International Conference on Computational Linguistics</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1638" to="1649" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<ptr target="https://allennlp.org/" />
		<title level="m">AllenNLP: An open-source nlp research library, built on pytorch</title>
				<imprint>
			<date type="published" when="2018-06">2018. 06-July-2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Portuguese named entity recognition using lstm-crf</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">V</forename><surname>Quinta De Castro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Félix Felipe Da Silva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Da Silva Soares</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Computational Processing of the Portuguese Language</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Villavicencio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Moreira</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Abad</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Caseli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Gamallo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Ramisch</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Gonçalo Oliveira</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><forename type="middle">H</forename><surname>Paetzold</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="83" to="92" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Named entity recognition with bidirectional LSTM-CNNs</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>Chiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Nichols</surname></persName>
		</author>
		<ptr target="https://www.aclweb.org/anthology/Q16-1026" />
	</analytic>
	<monogr>
		<title level="j">Transactions of the Association for Computational Linguistics</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="357" to="370" />
			<date type="published" when="2016-12">Dec 2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Natural language processing (almost) from scratch</title>
		<author>
			<persName><forename type="first">R</forename><surname>Collobert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Weston</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Karlen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kavukcuoglu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kuksa</surname></persName>
		</author>
		<ptr target="http://dl.acm.org/citation.cfm?id=1953048.2078186" />
	</analytic>
	<monogr>
		<title level="j">J. Mach. Learn. Res</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="2493" to="2537" />
			<date type="published" when="2011-11">Nov 2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Effective sequence labeling with hybrid neural-crf models</title>
		<author>
			<persName><forename type="first">P</forename><surname>Da Costa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">H</forename><surname>Paetzold</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Computational Processing of the Portuguese Language</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Villavicencio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Moreira</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Abad</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Caseli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Gamallo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Ramisch</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Gonçalo Oliveira</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><forename type="middle">H</forename><surname>Paetzold</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="490" to="498" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position</title>
		<author>
			<persName><forename type="first">K</forename><surname>Fukushima</surname></persName>
		</author>
		<idno type="DOI">10.1007/BF00344251</idno>
		<ptr target="https://doi.org/10.1007/BF00344251" />
	</analytic>
	<monogr>
		<title level="j">Biological Cybernetics</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="193" to="202" />
			<date type="published" when="1980-04">Apr 1980</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Speech recognition with deep recurrent neural networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Graves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rahman Mohamed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Hinton</surname></persName>
		</author>
		<idno>CoRR abs/1303.5778</idno>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Long short-term memory</title>
		<author>
			<persName><forename type="first">S</forename><surname>Hochreiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schmidhuber</surname></persName>
		</author>
		<idno type="DOI">10.1162/neco.1997.9.8.1735</idno>
		<ptr target="http://dx.doi.org/10.1162/neco.1997.9.8.1735" />
	</analytic>
	<monogr>
		<title level="j">Neural Comput</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="1735" to="1780" />
			<date type="published" when="1997-11">Nov 1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Bidirectional lstm-crf models for sequence tagging</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Yu</surname></persName>
		</author>
		<idno>CoRR abs/1508.01991</idno>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">What is the best multi-stage architecture for object recognition?</title>
		<author>
			<persName><forename type="first">K</forename><surname>Jarrett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kavukcuoglu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ranzato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lecun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 12th International Conference on Computer Vision</title>
				<imprint>
			<date type="published" when="2009">2009. 2009</date>
			<biblScope unit="page" from="2146" to="2153" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Bag of tricks for efficient text classification</title>
		<author>
			<persName><forename type="first">A</forename><surname>Joulin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Grave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bojanowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics</title>
		<title level="s">Short Papers</title>
		<meeting>the 15th Conference of the European Chapter of the Association for Computational Linguistics<address><addrLine>Valencia, Spain</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2017-04">Apr 2017</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="427" to="431" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Paramopama: a Brazilian-Portuguese Corpus for Named Entity Recognition</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">M</forename><surname>Júnior</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Macedo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Bispo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Silva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Barbosa</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
		<respStmt>
			<orgName>Universidade Federal de Sergipe</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Tech. rep.</note>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<orgName type="institution">Conselho Nacional de Justiça</orgName>
		</author>
		<ptr target="http://www.cnj.jus.br/tecnologia-da-informacao/processo-judicial-eletronico-pje" />
		<title level="m">Processo judicial eletrônico (PJe)</title>
		<imprint>
			<date type="published" when="2019-06">2019. 06-July-2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Conditional random fields: Probabilistic models for segmenting and labeling sequence data</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Lafferty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mccallum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">C N</forename><surname>Pereira</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eighteenth International Conference on Machine Learning</title>
				<meeting>the Eighteenth International Conference on Machine Learning</meeting>
		<imprint>
			<publisher>Morgan Kaufmann Publishers Inc</publisher>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="282" to="289" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Neural architectures for named entity recognition</title>
		<author>
			<persName><forename type="first">G</forename><surname>Lample</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ballesteros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Subramanian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kawakami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Dyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="260" to="270" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Handwritten digit recognition with a back-propagation network</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Le Cun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Boser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">S</forename><surname>Denker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Henderson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">E</forename><surname>Howard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Hubbard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">D</forename><surname>Jackel</surname></persName>
		</author>
		<ptr target="http://dl.acm.org/citation.cfm?id=2969830.2969879" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2Nd International Conference on Neural Information Processing Systems</title>
				<meeting>the 2Nd International Conference on Neural Information Processing Systems<address><addrLine>Cambridge, MA, USA</addrLine></address></meeting>
		<imprint>
			<publisher>MIT Press</publisher>
			<date type="published" when="1989">1989</date>
			<biblScope unit="page" from="396" to="404" />
		</imprint>
	</monogr>
	<note>NIPS&apos;89</note>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">W</forename><surname>Ling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Dyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Black</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Trancoso</surname></persName>
		</author>
		<ptr target="https://github.com/wlin12/wang2vec" />
		<title level="m">Extension of the original word2vec using different architectures</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Two/too simple adaptations of Word2Vec for syntax problems</title>
		<author>
			<persName><forename type="first">W</forename><surname>Ling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Dyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">W</forename><surname>Black</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Trancoso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1299" to="1304" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Lener-br: a dataset for named entity recognition in brazilian legal text</title>
		<author>
			<persName><forename type="first">Luz</forename><surname>De Araujo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">H</forename><surname>De Campos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">E</forename><surname>De Oliveira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">R R</forename><surname>Stauffer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Couto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bermejo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on the Computational Processing of Portuguese (PROPOR)</title>
		<title level="s">Lecture Notes on Computer Science (LNCS</title>
		<meeting><address><addrLine>Canela, RS, Brazil</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2018">September 24-26 2018</date>
			<biblScope unit="page" from="313" to="323" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF</title>
		<author>
			<persName><forename type="first">X</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Hovy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<meeting>the 54th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1064" to="1074" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Natural language processing for the semantic web</title>
		<author>
			<persName><forename type="first">D</forename><surname>Maynard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bontcheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Augenstein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Synthesis Lectures on the Semantic Web: Theory and Technology</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="1" to="194" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Efficient estimation of word representations in vector space</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Corrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dean</surname></persName>
		</author>
		<ptr target="http://arxiv.org/abs/1301.3781" />
	</analytic>
	<monogr>
		<title level="j">CoRR</title>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<ptr target="iSBN:978-989-20-1656-6" />
		<title level="m">Desafios na avaliação conjunta do reconhecimento de entidades mencionadas: O Segundo HAREM</title>
				<editor>
			<persName><forename type="first">C</forename><surname>Mota</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Santos</surname></persName>
		</editor>
		<imprint>
			<publisher>Linguateca</publisher>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Rectified linear units improve restricted boltzmann machines</title>
		<author>
			<persName><forename type="first">V</forename><surname>Nair</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Hinton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 27th International Conference on International Conference on Machine Learning</title>
				<meeting>the 27th International Conference on International Conference on Machine Learning</meeting>
		<imprint>
			<publisher>Omnipress</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="807" to="814" />
		</imprint>
	</monogr>
	<note>ICML&apos;10</note>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Learning multilingual named entity recognition from wikipedia</title>
		<author>
			<persName><forename type="first">J</forename><surname>Nothman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ringland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Murphy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Curran</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">194</biblScope>
			<biblScope unit="page" from="151" to="175" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<author>
			<persName><forename type="first">Nuno</forename><surname>Cardoso</surname></persName>
		</author>
		<title level="m">Harem e miniharem: Uma análise comparativa</title>
				<imprint>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Glove: Global vectors for word representation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pennington</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Socher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</title>
				<meeting>the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1532" to="1543" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Deep contextualized word representations</title>
		<author>
			<persName><forename type="first">M</forename><surname>Peters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Neumann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Iyyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gardner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2018-06">Jun 2018</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="2227" to="2237" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<title level="m" type="main">Portuguese named entity recognition and relation extraction tasks at iberlef</title>
		<author>
			<persName><forename type="first">Sandra</forename><surname>Collovini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joaquim</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">C J T R V P Q M S D B C R G</forename><surname>Xavier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019. 2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Boosting named entity recognition with neural character embeddings</title>
		<author>
			<persName><forename type="first">C</forename><surname>Dos Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Guimarães</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fifth Named Entity Workshop</title>
				<meeting>the Fifth Named Entity Workshop<address><addrLine>Beijing, China</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2015-07">Jul 2015</date>
			<biblScope unit="page" from="25" to="33" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Cardoso</surname></persName>
		</author>
		<ptr target="iSBN:978-989-20-0731-1" />
		<title level="m">Reconhecimento de entidades mencionadas em português: Documentação e actas do HAREM, a primeira avaliação conjunta na área. Linguateca</title>
				<imprint>
			<date type="published" when="2007-11">November 2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Schultz</surname></persName>
		</author>
		<ptr target="https://blog.microfocus.com/how-much-data-is-created-on-the-internet-each-day/" />
		<title level="m">How much data is created on the internet each day</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Introduction to the conll-2003 shared task: Language-independent named entity recognition</title>
		<author>
			<persName><forename type="first">Tjong</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sang</forename></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">F</forename><surname>De Meulder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003</title>
				<meeting>the Seventh Conference on Natural Language Learning at HLT-NAACL 2003</meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="142" to="147" />
		</imprint>
	</monogr>
	<note>CONLL &apos;03</note>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">A</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">U</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Polosukhin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems 30</title>
				<editor>
			<persName><forename type="first">I</forename><surname>Guyon</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">U</forename><forename type="middle">V</forename><surname>Luxburg</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Bengio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Wallach</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Fergus</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Vishwanathan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Garnett</surname></persName>
		</editor>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="5998" to="6008" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<monogr>
		<ptr target="https://www.wikipedia.org/" />
		<title level="m">Wikipédia: Wikipédia -a free encyclopedia</title>
				<imprint>
			<date type="published" when="2019-06">2019. 06-July-2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<monogr>
		<ptr target="https://code.google.com/archive/p/word2vec/" />
		<title level="m">Word2vec: Tool for computing continuous distributed representations of words</title>
				<imprint>
			<date type="published" when="2013-06">2013. 06-July-2019</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
