<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Author Profiling with Word+Character Neural Attention Network Notebook for PAN at CLEF 2017</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Yasuhide</forename><surname>Miura</surname></persName>
							<email>yasuhide.miura@fujixerox.co.jp</email>
							<affiliation key="aff0">
								<orgName type="institution">Fuji Xerox Co., Ltd</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Tomoki</forename><surname>Taniguchi</surname></persName>
							<email>taniguchi.tomoki@fujixerox.co.jp</email>
							<affiliation key="aff0">
								<orgName type="institution">Fuji Xerox Co., Ltd</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Motoki</forename><surname>Taniguchi</surname></persName>
							<email>motoki.taniguchi@fujixerox.co.jp</email>
							<affiliation key="aff0">
								<orgName type="institution">Fuji Xerox Co., Ltd</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Tomoko</forename><surname>Ohkuma</surname></persName>
							<email>ohkuma.tomoko@fujixerox.co.jp</email>
							<affiliation key="aff0">
								<orgName type="institution">Fuji Xerox Co., Ltd</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Author Profiling with Word+Character Neural Attention Network Notebook for PAN at CLEF 2017</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">A1EC889937AEBAD5BC5B5494BEABF61C</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:30+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper describes neural network models that we prepared for the author profiling task of PAN@CLEF 2017. In previous PAN series, statistical models using a machine learning method with a variety of features have shown superior performances in author profiling tasks. We decided to tackle the author profiling task using neural networks. Neural networks have recently shown promising results in NLP tasks. Our models integrate word information and character information with multiple neural network layers. The proposed models have marked joint accuracies of 64-86% in the gender identification and the language variety identification of four languages.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Research on automatically extracting author profile traits from social media has been conducted to support activities such as advertisement, forensics, marketing, personalization, and security. Past PAN series have focused on traits like gender, age, and personality type. This year's author profiling task was to identify the gender and the language variety of a Twitter user <ref type="bibr" target="#b15">[15]</ref>. In the gender identification, a task participant is required to determine from tweets whether a user is male or female. Similar gender identifications have been done in past PAN series with different native languages and domains. In the language variety identification, a task participant has to decide a language variety within a given native language from tweets. Language varieties have been studied in the VarDial shared tasks <ref type="bibr" target="#b17">[17]</ref> targeting journalistic texts, but they are new to the PAN series, which targets Twitter texts.</p><p>Statistical models using machine learning methods like support vector machines have proven effective at identifying profile traits in past PAN series. Various features were introduced to these models, including word n-grams <ref type="bibr" target="#b6">[6,</ref><ref type="bibr" target="#b12">12,</ref><ref type="bibr" target="#b2">3]</ref>, character n-grams <ref type="bibr" target="#b6">[6,</ref><ref type="bibr" target="#b12">12,</ref><ref type="bibr" target="#b2">3]</ref>, part-of-speech tags <ref type="bibr" target="#b6">[6,</ref><ref type="bibr" target="#b2">3]</ref>, styles <ref type="bibr" target="#b6">[6,</ref><ref type="bibr" target="#b12">12,</ref><ref type="bibr" target="#b2">3]</ref>, and second order attributes <ref type="bibr" target="#b6">[6]</ref>. We decided to tackle the identifications of gender and language variety using neural networks. 
Neural networks have proven effective at capturing complex representations by combining simpler representations <ref type="bibr" target="#b9">[9]</ref>. Using neural networks, we aim to obtain complex representations of what past studies expressed as independent features. Neural networks such as the multilayer perceptron and the restricted Boltzmann machine were used in PAN 2016 <ref type="bibr" target="#b16">[16]</ref> to obtain word embeddings <ref type="bibr" target="#b1">[2]</ref> and as classifiers. Our models combine word information and character information with complex neural networks consisting of a recurrent neural network layer, a convolutional neural network layer, and an attention mechanism <ref type="bibr" target="#b0">[1]</ref> layer to classify a profile trait.</p><p>In the remainder of this paper, we first describe our neural network models in Section 2. The data used in the models are explained in Section 3, followed by the details of an experiment to confirm the performances of the models in Section 4. Finally, Section 5 concludes the paper with some future directions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Models</head><p>We propose two models that consist of multiple layers to classify a profile trait with neural networks. The architectures of the two models share most of their layers but differ in the fusion strategies of word information and character information. The first model, NeuralNet-FusionTweet (NN-FT), combines the two kinds of information with a tweet-level fusion. The second model, NeuralNet-FusionUser (NN-FU), performs the fusion process at the user level.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Model NN-FT</head><p>Figure <ref type="figure" target="#fig_0">1</ref> shows the architecture of NN-FT. For each user, the model accepts the words and the characters of the user's tweets. Note that the words and the characters are just different representations of the same tweet texts. The words and the characters are embedded with embedding layers and are processed with a recurrent neural network (RNN) layer, convolutional neural network (CNN) layers, attention mechanism <ref type="bibr" target="#b0">[1]</ref> layers, a max-pooling layer, and fully-connected (FC) layers. As an implementation of the RNN, we used the Gated Recurrent Unit (GRU) <ref type="bibr" target="#b7">[7]</ref> with a bi-directional setting.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Word processes Figure <ref type="figure" target="#fig_1">2</ref> illustrates the overview of the word processes by RNN W and Attention W . The input words are embedded into k w -dimensional word embeddings with embedding matrix E w to obtain x with x t ∈ R kw . x is then processed in RNN W with the following transition functions:</p><formula xml:id="formula_2">z t = σ (W z x t + U z h t−1 + b z )<label>(1)</label> r t = σ (W r x t + U r h t−1 + b r )<label>(2)</label> h̃ t = tanh (W h x t + U h (r t ⊙ h t−1 ) + b h )<label>(3)</label></formula><formula xml:id="formula_3">h t = (1 − z t ) ⊙ h t−1 + z t ⊙ h̃ t<label>(4)</label></formula><p>where z t is an update gate, r t is a reset gate, h̃ t is a candidate state, h t is a state, W z , W r , W h , U z , U r , U h are weight matrices, and b z , b r , b h are bias vectors. The forward and backward states of the bi-directional RNN W are concatenated to form the RNN features g t . Attention W computes a tweet representation m as a weighted sum of g t with weights α t :</p><formula xml:id="formula_5">m = ∑ t α t g t<label>(5)</label></formula><formula xml:id="formula_7">α t = exp (v T α u t ) / ∑ t′ exp (v T α u t′ )<label>(6)</label></formula><formula xml:id="formula_8">u t = tanh (W α g t + b α )<label>(7)</label></formula><p>where v α is a weight vector, W α is a weight matrix, and b α is a bias vector. u t is an attention context vector calculated from g t with a single FC layer (Eq. 7). u t is normalized with softmax to obtain α t as a probability (Eq. 6).</p><p>Character processes Figure <ref type="figure" target="#fig_3">3</ref> illustrates the overview of the character processes by CNN C and MaxPooling C . The input characters are embedded into k c -dimensional character embeddings with character embedding matrix E c to obtain s with s i ∈ R kc . 
s is then passed to CNN C to obtain c with:</p><formula xml:id="formula_9">c i = f (W c s i:i+h−1 + b c )<label>(8)</label></formula><p>where f (•) is a non-linear function, W c is a weight matrix, h is a convolution window size, and b c is a bias vector. We used the rectified linear unit for f (•). c is further processed with the max-over-time process <ref type="bibr" target="#b8">[8]</ref></p></div>
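The attention pooling of Eqs. 5-7 can be sketched in NumPy as follows. This is a minimal illustration, not the trained model: the dimensions, the random features, and the weight values are assumptions made up for the example.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the time axis.
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_pool(g, W_a, b_a, v_a):
    """Eqs. 5-7: pool T feature vectors g_t into one representation m.

    g   : (T, d) RNN features
    W_a : (d, d) weight matrix W_alpha
    b_a : (d,)   bias vector b_alpha
    v_a : (d,)   attention context weight vector v_alpha
    """
    u = np.tanh(g @ W_a + b_a)            # Eq. 7: context vectors u_t
    alpha = softmax(u @ v_a)              # Eq. 6: attention weights alpha_t
    m = (alpha[:, None] * g).sum(axis=0)  # Eq. 5: weighted sum m
    return m, alpha

# Toy example with random features (T=4 steps, d=3 dimensions).
rng = np.random.default_rng(0)
g = rng.standard_normal((4, 3))
m, alpha = attention_pool(g, rng.standard_normal((3, 3)),
                          np.zeros(3), rng.standard_normal(3))
assert np.isclose(alpha.sum(), 1.0)  # weights form a probability distribution
```

In the model, the same pooling pattern appears both at the tweet level (pooling word positions) and, in NN-FU, at the user level (pooling tweets).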
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Model NN-FU</head><p>Figure <ref type="figure" target="#fig_4">4</ref> shows the architecture of NN-FU. Most layers in NN-FU also exist in NN-FT. The layers that do not appear in NN-FT are Attention FUW , Attention FUC , FC FU1 , and FC FU2 . Attention FUW merges the tweet representations obtained from word information. Similarly, Attention FUC merges the tweet representations obtained from character information. The outputs of these attention processes are concatenated and further processed with FC FU1 and FC FU2 . The attention processes in NN-FU differ from those in NN-FT, where word information and character information are concatenated prior to Attention FT . In NN-FU, word information and character information are concatenated after the attention processes, as user-level representations. The remaining new layers, FC FU1 and FC FU2 , work similarly to FC FT1 and FC FT2 in NN-FT to process a word+character user representation.</p></div>
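The contrast between the two fusion strategies can be sketched as follows. The uniform mean pooling here is a simplified stand-in for the attention layers, and the dimensions are illustrative assumptions.

```python
import numpy as np

def pool(reps):
    # Stand-in for an attention layer: uniform weights for illustration.
    return np.mean(reps, axis=0)

# Toy tweet-level representations for one user: 5 tweets,
# word features (dim 4) and character features (dim 3).
rng = np.random.default_rng(1)
word_reps = rng.standard_normal((5, 4))
char_reps = rng.standard_normal((5, 3))

# NN-FT: fuse word and character information per tweet, then pool over tweets.
fused_tweets = np.concatenate([word_reps, char_reps], axis=1)  # (5, 7)
user_ft = pool(fused_tweets)                                   # (7,)

# NN-FU: pool each modality into a user representation, then fuse.
user_fu = np.concatenate([pool(word_reps), pool(char_reps)])   # (7,)

assert user_ft.shape == user_fu.shape == (7,)
```

With uniform pooling the two orders coincide; with learned attention weights they generally do not, which is exactly where NN-FT and NN-FU diverge.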
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Data</head><p>The weights of the proposed models require data for training. We used two datasets to train the proposed models, with two different objectives. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">PAN@CLEF 2017 Author Profiling Training Corpus</head><p>The first dataset we used to train the proposed models is the official PAN@CLEF 2017 Author Profiling Training Corpus. The dataset consists of 11,400 Twitter users in four languages with the gold labels of gender and language variety. The languages, the gender labels, and the language variety labels included in this dataset are summarized in Table <ref type="table" target="#tab_2">1</ref>. This dataset is used to train the models to minimize an empirical loss between predictions and gold labels.</p><p>We divided this dataset into train 8 , dev 1 , and test 1 with a stratified sampling at a ratio of 8:1:1. These subsets were made so that we could empirically decide some parameters of the models. We describe the details of the parameter selection in Section 4.2.</p></div>
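The stratified 8:1:1 split can be sketched in plain Python; the user ids and labels below are toy placeholders, not the corpus itself.

```python
import random
from collections import defaultdict

def stratified_split(items, labels, ratios=(8, 1, 1), seed=0):
    """Split items into len(ratios) subsets, preserving label proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)
    total = sum(ratios)
    splits = [[] for _ in ratios]
    # Cut each label group at the cumulative ratio boundaries.
    for group in by_label.values():
        rng.shuffle(group)
        bounds = [round(len(group) * sum(ratios[:i + 1]) / total)
                  for i in range(len(ratios))]
        start = 0
        for split, end in zip(splits, bounds):
            split.extend(group[start:end])
            start = end
    return splits

# Toy corpus: 100 users with binary gender labels, split 8:1:1.
users = [f"user{i}" for i in range(100)]
genders = ["male", "female"] * 50
train8, dev1, test1 = stratified_split(users, genders)
assert (len(train8), len(dev1), len(test1)) == (80, 10, 10)
```

Each subset keeps the 50/50 gender balance of the toy corpus, which is the point of stratifying rather than splitting at random.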
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Streaming Tweets</head><p>The second dataset we used to train the proposed models consists of tweets collected with the Twitter Streaming APIs<ref type="foot" target="#foot_0">1</ref>. We collected these tweets to pre-train the word embedding matrix E w of the models. Neural network models are known to perform better when word embeddings are pre-trained on a large-scale dataset <ref type="bibr" target="#b8">[8]</ref>. The following steps describe the details of the collection process:</p><p>1. Tweets whose lang metadata was en, es, pt, or ar were collected via the Twitter Streaming APIs during the period of March-May 2017. 2. Retweets were removed from the collected tweets.</p><p>3. Tweets posted by bots<ref type="foot" target="#foot_1">2</ref> were deleted from the collected tweets.</p><p>Table <ref type="table" target="#tab_3">2</ref> shows the number of resulting tweets. We describe the details of the word embedding pre-training in Section 4.1.</p></div>
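Steps 2-3 of the collection process amount to a simple filter over the streamed tweets. A minimal sketch follows; the tweet dicts and the bot client names are illustrative assumptions (the paper's actual bot list had 80 clients).

```python
# Hypothetical bot client list (the source value of a tweet).
BOT_CLIENTS = {"ExampleBot", "AutoPoster"}

def keep(tweet):
    if tweet.get("retweeted_status") is not None:  # step 2: drop retweets
        return False
    if tweet.get("source") in BOT_CLIENTS:         # step 3: drop bot clients
        return False
    return True

tweets = [
    {"text": "hello", "source": "Twitter for iPhone", "retweeted_status": None},
    {"text": "RT ...", "source": "Twitter Web", "retweeted_status": {"id": 1}},
    {"text": "spam", "source": "AutoPoster", "retweeted_status": None},
]
kept = [t for t in tweets if keep(t)]
assert len(kept) == 1
```

Using `dict.get` mirrors the fact that streamed tweet objects omit the `retweeted_status` field entirely when a tweet is not a retweet.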
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experiment</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Model Configurations</head><p>Text Processor We applied a unicode normalization, a Twitter user name normalization, and a URL normalization for text pre-processing. Pre-processed texts were tokenized with two kinds of tokenizers: Twokenizer <ref type="bibr" target="#b13">[13]</ref> is used for English, and the NLTK <ref type="bibr" target="#b4">[4]</ref> WordPunctTokenizer is used for the other languages. Words are converted to lower case to ignore capitalization. Note that the lower case conversion is not performed for character inputs.</p></div>
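The normalization steps above can be sketched with the standard library; the placeholder tokens and the regular expressions are illustrative assumptions, not the paper's exact patterns.

```python
import re
import unicodedata

def preprocess(text, lowercase=True):
    """Unicode, user-name, and URL normalization, then optional lowercasing.

    Lowercasing would be applied to word inputs only; character inputs
    keep their original case, as described above.
    """
    text = unicodedata.normalize("NFKC", text)     # unicode normalization
    text = re.sub(r"@\w+", "@USER", text)          # user name normalization
    text = re.sub(r"https?://\S+", "URL", text)    # URL normalization
    return text.lower() if lowercase else text

print(preprocess("@bob check https://t.co/abc NOW"))
# -> "@user check url now"
```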
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Initialization of Embeddings</head><p>We pre-trained word embeddings on the streaming tweets of Section 3.2 with fastText <ref type="bibr" target="#b5">[5]</ref> using the skip-gram algorithm. The pre-training parameters are dimension=100, learning rate=0.025, window size=5, negative sample size=5, and epoch=5. The character embeddings were randomly initialized with a uniform distribution.</p><p>Convolution Filter Sizes, Layer Unit Sizes, and Word Embedding Sizes Table <ref type="table" target="#tab_4">3</ref> summarizes the sizes of various parameters included in the proposed models. For CNN C , two filter sizes are listed since we used the multiple-filters approach <ref type="bibr" target="#b10">[10]</ref>.</p><p>Optimization Strategy We used the cross-entropy loss as the objective function of the models. l 2 regularization was applied to the RNN layers, the attention context vectors, the CNN layers, and the FC layers of the models to avoid overfitting. The objective function was minimized through stochastic gradient descent over shuffled mini-batches with Adam <ref type="bibr" target="#b11">[11]</ref>. The initial learning rate of Adam was set to 1e −3 .</p></div>
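A comparable pre-training run can be written with the fastText command-line tool; the input and output paths below are placeholders, while the flags correspond to the parameters listed above.

```shell
# Skip-gram pre-training of 100-dim word embeddings (paths are placeholders).
fasttext skipgram -input tweets_en.txt -output emb_en \
  -dim 100 -lr 0.025 -ws 5 -neg 5 -epoch 5
```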
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Parameter Selection The models have a regularization parameter α that is sensitive to the dataset. We selected an optimal value for α from</p><formula xml:id="formula_10">α ∈ { 1e −3 , 5e −4 , 1e −4 , 5e −5 , 1e −5 , 5e −6 , 1e −6 , 5e −7 , 1e −7 }</formula><p>in terms of accuracy with a grid search using dev 1 described in Section 3.1.</p></div>
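The grid search amounts to training once per candidate α and keeping the best dev 1 accuracy. A minimal sketch, where `train_and_eval` is a hypothetical stand-in for training a model and scoring it on dev 1 :

```python
ALPHAS = [1e-3, 5e-4, 1e-4, 5e-5, 1e-5, 5e-6, 1e-6, 5e-7, 1e-7]

def grid_search(train_and_eval, alphas=ALPHAS):
    # One training/evaluation run per candidate regularization weight.
    scores = {a: train_and_eval(a) for a in alphas}
    best = max(scores, key=scores.get)
    return best, scores

# Toy objective peaking at alpha = 1e-4, standing in for real dev accuracy.
best_alpha, scores = grid_search(lambda a: -abs(a - 1e-4))
assert best_alpha == 1e-4
```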
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">In-house Experiment</head><p>We evaluated the proposed models using train 8 , dev 1 , and test 1 . All models were trained on a single NVIDIA Titan X GPU. Table <ref type="table" target="#tab_5">4</ref> presents the results of the gender identifications. In the gender identifications, NN-FU performed better than NN-FT, with one exception in Spanish. Table <ref type="table" target="#tab_8">5</ref> shows the results of the language variety identifications, which showed a different characteristic: NN-FT performed better than NN-FU in all languages.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Submission Run</head><p>We chose the best performing models and αs in the in-house experiment as the models and parameters for the submission run. When multiple αs performed best, we chose the ones that showed the best performances in test 1 . The submission run was done in a TIRA virtual machine <ref type="bibr" target="#b14">[14]</ref> with CPUs. Table <ref type="table">6</ref> summarizes the performances of the models in the submission run. The models showed a similar trend as in the in-house experiment. They ranked 3rd in the gender ranking, 6th in the language variety ranking, and 4th in the global ranking.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>As described in this paper, we proposed two models, NN-FT and NN-FU, for author profiling. The two models differ in their fusion strategies of word information and character information. The models marked joint accuracies of 64-86% in the gender identification and the language variety identification of four languages. They performed better in gender identification than in language variety identification; compared to the top systems, our average accuracies were 1.26% lower for gender and 2.05% lower for language variety. This result is not so surprising, since neural network models have shown difficulties in language variety identification in past VarDial shared tasks <ref type="bibr" target="#b17">[17]</ref>.</p><p>As future work, we plan to analyze the differences of the internal states of NN-FT and NN-FU. The best performing models differed among profile traits and languages in the in-house experiment. We would like to unveil the causes of these differences to further improve our models.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 .</head><label>1</label><figDesc>Figure 1. The architecture of model NN-FT. The shaded layers are tweet-level processes.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 .</head><label>2</label><figDesc>Figure 2. Overview of word processes with RNNW and AttentionW.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 .</head><label>3</label><figDesc>Figure 3. Overview of character processes with CNNC and MaxPoolingC.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 .</head><label>4</label><figDesc>Figure 4. The architecture of model NN-FU. The shaded layers are tweet-level processes.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head></head><label></label><figDesc>The concatenated tweet representation is processed by CNN WC like in CNN C with window size h = 1 to get a word and character combined representation. The combined tweet representation is then passed to Attention FT to obtain a user representation from tweet representations. Finally, the user representation is passed to FC FT1 and FC FT2 , respectively.</figDesc><table /><note>in MaxPooling C to obtain a tweet representation o. word+character processes Two tweet representations m and o are concatenated to further apply word+character processes.</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 1 .</head><label>1</label><figDesc>The languages, the gender labels, and the language variety labels of PAN@CLEF 2017 Author Profiling Training Corpus.</figDesc><table><row><cell cols="2">Languages</cell><cell>English, Spanish, Portuguese, Arabic</cell></row><row><cell cols="2">Gender Labels</cell><cell>male, female</cell></row><row><cell></cell><cell>English</cell><cell>Australia, Canada, Great Britain, Ireland, New Zealand, United States</cell></row><row><cell>Language Variety Labels</cell><cell cols="2">Argentina, Chile, Colombia, Mexico, Peru, Spain, Venezuela Portuguese Brazil, Portugal Spanish</cell></row><row><cell></cell><cell>Arabic</cell><cell>Egypt, Gulf, Levantine, Maghrebi</cell></row><row><cell></cell><cell></cell><cell>Language #tweet</cell></row><row><cell></cell><cell></cell><cell>English 10.72M</cell></row><row><cell></cell><cell></cell><cell>Spanish 3.17M</cell></row><row><cell></cell><cell></cell><cell>Portuguese 2.75M</cell></row><row><cell></cell><cell></cell><cell>Arabic 2.46M</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 2 .</head><label>2</label><figDesc>The number of tweets collected for each language with Twitter Streaming APIs. M in the table represents the million unit.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 3 .</head><label>3</label><figDesc>The sizes of parameters in the proposed models.</figDesc><table><row><cell>Parameter</cell><cell>Size</cell></row><row><cell cols="2">word embedding dimension 100</cell></row><row><cell cols="2">character embedding dimension 25</cell></row><row><cell>RNNW units</cell><cell>100</cell></row><row><cell>CNNC units</cell><cell>50</cell></row><row><cell>CNNWC units</cell><cell>300</cell></row><row><cell>CNNC filter sizes</cell><cell>3, 6</cell></row><row><cell>CNNWC filter size</cell><cell>1</cell></row><row><cell>AttentionW units</cell><cell>200</cell></row><row><cell>AttentionFT units</cell><cell>300</cell></row><row><cell>AttentionFUW units</cell><cell>200</cell></row><row><cell>AttentionFUC units</cell><cell>100</cell></row><row><cell>FC FT1 units</cell><cell>150</cell></row><row><cell>FCFU1 units</cell><cell>150</cell></row><row><cell>FCFT2 units</cell><cell>#label</cell></row><row><cell>FCFU2 units</cell><cell>#label</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 4 .</head><label>4</label><figDesc>Gender identification results of the proposed models on test1. + values are averaged values.</figDesc><table><row><cell></cell><cell cols="2">NN-FT</cell><cell></cell><cell>NN-FU</cell></row><row><cell></cell><cell cols="2">Accuracy α</cell><cell cols="2">Accuracy α</cell></row><row><cell cols="2">English 80.00</cell><cell>1e -4</cell><cell>81.94</cell><cell>5e -4</cell></row><row><cell cols="2">Spanish 79.52</cell><cell>5e -5</cell><cell>77.62</cell><cell>5e -6</cell></row><row><cell cols="2">Portuguese 84.17</cell><cell>5e -5</cell><cell>90.83 +</cell><cell>5e -7, 1e -7</cell></row><row><cell>Arabic</cell><cell>76.25</cell><cell>1e -3</cell><cell>79.17</cell><cell>5e -4</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_8"><head>Table 5 .</head><label>5</label><figDesc>Language variety identification results of the proposed models on test1. + values are averaged values.</figDesc><table><row><cell>Language</cell><cell>Trait</cell><cell cols="2">Model Accuracy Joint Accuracy</cell></row><row><cell>English</cell><cell cols="2">gender language variety NN-FT 87.17 NN-FU 80.46</cell><cell>69.92</cell></row><row><cell>Spanish</cell><cell cols="2">gender language variety NN-FT 92.71 NN-FT 81.18</cell><cell>75.18</cell></row><row><cell>Portuguese</cell><cell cols="2">gender language variety NN-FT 98.13 NN-FU 87.00</cell><cell>85.75</cell></row><row><cell>Arabic</cell><cell cols="2">gender language variety NN-FT 81.25 NN-FU 76.44</cell><cell>64.19</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://dev.twitter.com/streaming/overview</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">We assembled a Twitter client list consisting of 80 clients that are used for manual postings.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Neural machine translation by jointly learning to align and translate</title>
		<author>
			<persName><forename type="first">D</forename><surname>Bahdanau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
		<idno>Repository abs/1409.0473</idno>
		<ptr target="http://arxiv.org/abs/1409.0473" />
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note type="report_type">Computing Research</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Author Profiling using SVMs and Word Embedding Averages-Notebook for PAN at CLEF</title>
		<author>
			<persName><forename type="first">R</forename><surname>Bayot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gonçalves</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF 2016 Evaluation Labs and Workshop -Working Notes Papers</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Balog</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Cappellato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</editor>
		<meeting><address><addrLine>Évora, Portugal</addrLine></address></meeting>
		<imprint>
<date type="published" when="2016-09">September 2016</date>
			<biblScope unit="page" from="5" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">CAPS: A Cross-genre Author Profiling System-Notebook for PAN at CLEF</title>
		<author>
			<persName><forename type="first">I</forename><surname>Bilan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhekova</surname></persName>
		</author>
		<editor>Balog, K., Cappellato, L., Ferro, N., Macdonald, C.</editor>
		<imprint>
<date type="published" when="2016">2016</date>
			<pubPlace>CLEF</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m">Evaluation Labs and Workshop -Working Notes Papers</title>
				<meeting><address><addrLine>Évora, Portugal</addrLine></address></meeting>
		<imprint>
<date type="published" when="2016-09">September 2016</date>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page">8</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Natural Language Processing with Python</title>
		<author>
			<persName><forename type="first">S</forename><surname>Bird</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Loper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Klein</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
			<publisher>O&apos;Reilly Media Inc</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Enriching word vectors with subword information</title>
		<author>
			<persName><forename type="first">P</forename><surname>Bojanowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Grave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joulin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1607.04606</idno>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">GronUP: Groningen User Profiling-Notebook for PAN at CLEF</title>
		<author>
			<persName><forename type="first">M</forename><surname>Busger Op Vollenbroek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Carlotto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Kreutz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Medvedeva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Pool</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bjerva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Haagsma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Nissim</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF 2016 Evaluation Labs and Workshop -Working Notes Papers</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Balog</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Cappellato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</editor>
		<meeting><address><addrLine>Évora, Portugal</addrLine></address></meeting>
		<imprint>
<date type="published" when="2016-09">September 2016</date>
			<biblScope unit="page" from="5" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Learning phrase representations using RNN encoder-decoder for statistical machine translation</title>
		<author>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Van Merrienboer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gulcehre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bahdanau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Bougares</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Schwenk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</title>
				<meeting>the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1724" to="1734" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Natural language processing (almost) from scratch</title>
		<author>
			<persName><forename type="first">R</forename><surname>Collobert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Weston</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Karlen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kavukcuoglu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kuksa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="2493" to="2537" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Deep Learning</title>
		<author>
			<persName><forename type="first">I</forename><surname>Goodfellow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Courville</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
			<publisher>MIT Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Convolutional neural networks for sentence classification</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Kim</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</title>
		<meeting>the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1746" to="1751" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Adam: A method for stochastic optimization</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">P</forename><surname>Kingma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ba</surname></persName>
		</author>
		<idno>CoRR abs/1412.6980</idno>
		<ptr target="http://arxiv.org/abs/1412.6980" />
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note type="report_type">Computing Research Repository (CoRR)</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Exploring the Effects of Cross-Genre Machine Learning for Author Profiling in PAN 2016 - Notebook for PAN at CLEF</title>
		<author>
			<persName><forename type="first">P</forename><surname>Modaresi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Liebeck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Conrad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF 2016 Evaluation Labs and Workshop - Working Notes Papers</title>
		<editor>
			<persName><forename type="first">K</forename><surname>Balog</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Cappellato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</editor>
		<meeting><address><addrLine>Évora, Portugal</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016-09">September 2016</date>
			<biblScope unit="page" from="5" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Improved part-of-speech tagging for online conversational text with word clusters</title>
		<author>
			<persName><forename type="first">O</forename><surname>Owoputi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>O'Connor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Dyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Gimpel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Schneider</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">A</forename><surname>Smith</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT)</title>
		<meeting>the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT)</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="380" to="390" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Improving the Reproducibility of PAN&apos;s Shared Tasks: Plagiarism Detection, Author Identification, and Author Profiling</title>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gollub</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Stamatatos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Information Access Evaluation meets Multilinguality, Multimodality, and Visualization. 5th International Conference of the CLEF Initiative (CLEF 14)</title>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="268" to="299" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter</title>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes Papers of the CLEF 2017 Evaluation Labs</title>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Overview of the 4th Author Profiling Task at PAN 2016: Cross-Genre Evaluations</title>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel Pardo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verhoeven</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Daelemans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes Papers of the CLEF 2016 Evaluation Labs</title>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Findings of the VarDial Evaluation Campaign 2017</title>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Malmasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ljubešić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tiedemann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Scherrer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Aepli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)</title>
		<meeting>the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1" to="15" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
