<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Author Profiling, instance-based Similarity Classification Notebook for PAN at CLEF 2017</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Yaritza</forename><surname>Adame-Arcia</surname></persName>
							<email>yaritza.adame@datys.cu</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Desarrollo de Aplicaciones</orgName>
								<orgName type="department" key="dep2">Tecnología y Sistemas DATYS</orgName>
								<address>
									<country key="CU">Cuba</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Daniel</forename><surname>Castro-Castro</surname></persName>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Desarrollo de Aplicaciones</orgName>
								<orgName type="department" key="dep2">Tecnología y Sistemas DATYS</orgName>
								<address>
									<country key="CU">Cuba</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Reynier</forename><forename type="middle">Ortega</forename><surname>Bueno</surname></persName>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Desarrollo de Aplicaciones</orgName>
								<orgName type="department" key="dep2">Tecnología y Sistemas DATYS</orgName>
								<address>
									<country key="CU">Cuba</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Departamento de Lenguajes y Sistemas Informáticos</orgName>
								<orgName type="institution">Universidad de Alicante</orgName>
								<address>
									<country key="ES">España</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rafael</forename><surname>Mu- Ñoz</surname></persName>
							<email>rafael@dlsi.ua.es</email>
						</author>
						<title level="a" type="main">Author Profiling, instance-based Similarity Classification Notebook for PAN at CLEF 2017</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">6A40BC15716263D729259FA055BB376E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:30+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Author profiling</term>
					<term>instance-based classification</term>
					<term>tweets gender classify</term>
					<term>tweets language variety classify</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In digital documents analysis for forensic applications, when anonymous documents are presented and it is not possible with the available tools to determine the true author of the document, there are of vital importance methods that identify the characteristics of the Author Profile (Gender, Age, Personality, etc.). We propose to use a simple method of classification based on the similarity between objects, considering different features for documents representation: (a document corresponds to a set of tweets of a user), the terms used in the tweets, as well as characteristics of opinion and subjectivity presented in them. Our goal will be to classify, based on the content of the tweets, the Gender and language variety of an author from an unknown set of tweets corresponding to him. In the experiments we observed good results in Gender classification, but low values in language variety classification. We processed only the English dataset.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The PAN Profiling task for this edition is as follows: "Gender and language variety identification in Twitter. Demographics traits such as gender and language have so far investigated separately. In this task we will have participants with a corpus annotated with authors' gender and their specific variation of their native language:</p><p> English (Australia, Canada, Great Britain, Ireland, New Zealand, United States)  Spanish (Argentina, Chile, Colombia, Mexico, Peru, Spain, Venezuela)  Portuguese (Brazil, Portugal)  Arabic (Egypt, Gulf, Levantine, Maghrebi)</p><p>Although we suggest to participate in both subtasks (gender and language identification) and in all languages, it is possible participating only in one of them and in some of the languages."</p><p>The proposal to identify these demographic traits in tweets implies that the natural language processing tools widely used for long documents analysis must be adapted to the features of the textual genre and the writing characteristics presented in tweets. We must emphasize that the complexity lies in the fact that for this genre there are no linguistic rules or writing standards. Language is informal, usually direct and full of emotions.</p><p>In past tasks of demographic traits identification on PAN evaluation framework <ref type="bibr">[5] [6]</ref>, tweet genre was used and many works presented used lexical content (words, informal text, jargon) and characteristic features of the genre (URL, hashtags, mentions, retweet, emoticons, etc.). The generality of the proposals uses the classic Bag of Words representation of documents, employing in addition to the mentioned features, n-grams of some of them, for example, words n-grams, lemmas n-grams, POS-Tagging (Part of Speech Grammatical Categories) n-grams, etc. The fundamental difference of the proposal of this year to previous proposals, lies in evaluating and classifying by variety of the language.</p><p>For the classification process, decision tree-based approximations have been used, as well as SVM by a large number of competitors and a few others have used distancebased approximations to predict the closest class <ref type="bibr" target="#b13">[14]</ref> <ref type="bibr" target="#b14">[15]</ref>.</p><p>We are interested in implementing a distance-based classification strategy and with this, use previous results presented in the Author Identification edition of 2015 <ref type="bibr" target="#b2">[3]</ref>. We will combine features of the lexical content of the tweets, their characteristic features, and polarity and emotion features of previous works of our group used in tasks of sentence polarity classification. We will experimentally evaluate the differences between an instance-based proposal and a prototype-based proposal, in the same distance-based strategy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Implemented methods</head><p>We used two classification strategies, considering two documents representation variants. An instance based representation of the documents, where the set of tweets of an author (for each author it is available her/his gender and language variety) represents a document and with this idea, for each class (female class, male class) we have a set of documents. The second variant is a prototype-based representation, where a single document is formed for each class, and this document is constructed with all the tweets of each of the sample authors per class.</p><p>Figure <ref type="figure" target="#fig_0">1</ref> shows graphically the architecture of our proposal with the instance-based strategy. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Features and tweets pre-processing stage:</head><p>The first step correspond to build the documents that will be used as objects for the similarity calculation in the classification method. For each author, we receive the set of tweets that she/he wrote, and with the concatenation of these tweets is formed a document for this author. Remember that, of each author, what we have is the gender and the language variety. We perform a pre-processing of the document in two stages. In a first stage, we segment the tweets with a tokenizer offered in FreeLing <ref type="bibr" target="#b12">[13]</ref> [http://nlp.lsi.upc.edu/freeling/], specialized for the processing of tweets. Subsequently we proceed to the expansion of short terms used and contractions, and characteristics traits that are used in tweets such as the Hashtags, URLs, mentions, are replaced by certain fixed patterns, those traits we consider the content does not contribute to differentiate between tweets of different profiles. After these transformations, we have nor-malized the tweets a bit and next proceed to perform a syntactic analysis with the traditional POS-Tagging tools for English and Spanish according to the language of the tweets.</p><p>For the representation we use the classic Bag of Words and in this we integrate:</p><p> The lexical terms, the lemmas of these and the grammatical category.  Characteristics features of the tweets.  Features of subjectivity and opinion mining analysis <ref type="bibr" target="#b6">[7]</ref>.</p><p>With the lexical terms and lemmas, we hope to differentiate the documents of each class, because some of this features are proper of their class. For example, for language variety, some terms are used by Colombians unlike the rest, and thus similarly for each variant of Spanish. Considering the frequency of use of grammatical categories, would allow us to differentiate between tweets written by the male gender and those written by the female gender. For example, in <ref type="bibr" target="#b16">[17]</ref> it is exposed several differences in the use of words and different Part of Speech analyzing women and men writing style.</p><p>The characteristic features of the tweets we extract correspond to hashtags, the mention of author, the mark of retweet, the use of URL, the use of intensifications (capital letters, deformation of words by repetition of characters, use of admiration signs), use of laughter expressions, use of emoticons and the use of informal language. For each of these traits we consider the position in which they are used, that is, the number of times used at the beginning of the tweet, at the end or elsewhere.</p><p>Additionally, we include the analysis of the frequency of features with subjective information, for example, the number of positive or negative emoticons; the words used were categorized as Positive (P), High Positive (HP), Negative (N) and High Negative (HN), using the frequencies of this categories. We used a word polarity resource in Spanish and English taken from <ref type="bibr" target="#b11">[12]</ref>, resources of emotion in Spanish <ref type="bibr" target="#b7">[8]</ref> [11] and for English <ref type="bibr" target="#b1">[2]</ref>, and finally the resources of appraisal for Spanish <ref type="bibr" target="#b8">[9]</ref> [10] and English <ref type="bibr" target="#b0">[1]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Classification stage:</head><p>For the classification of the set of tweets of an author in the Demographic traits of gender and language variety, we tried with two strategies. A strategy in which each document (set of author tweets) is used as an instance of the class to which it belongs and for the second strategy we construct a prototype of each class using the extracted features of the set of documents belonging to the class. Each of these strategies were evaluated with the tweets collections of the training set and was selected for the final evaluation, the one that showed more stable results in different executions.</p><p>In the instance-based strategy, it is calculated the similarity of the new document with each sample document of the class, and then is computed the average similarity obtained with the class. This analysis is done with each class of a Demographic Trait and the object is going to belong to the class with which it obtains greater average similarity. In the prototype-based strategy, the similarity of the new document is calculated with the class prototype. This analysis is done for each class and the object is going to belong to the class in which the similarity obtained was the highest (1-NN <ref type="bibr" target="#b3">[4]</ref> <ref type="bibr" target="#b15">[16]</ref>).</p><p>The classification is done independently for each Author Demographic Trait, Gender classes (2 classes) and language variety (for English 6 classes and for Spanish 7 classes). Finally, the result is the combination of these two classifications.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Experiments and results</head><p>The initial experiments were performed with the training collection released for this year's 2017 task. We evaluated the accuracy obtained by performing a 2-cross fold validation. In addition, we considered the training collection of the 2015 edition for the Gender and Age classes. The description of these collections can be reviewed in <ref type="bibr">[18] [6]</ref>. In Table <ref type="table" target="#tab_0">1</ref> we include the values obtained in the tests with the two representation strategies, instance-based and prototype centroid-based one, using the collection of 2017. In table <ref type="table" target="#tab_1">2</ref>, we present the results with the collection of 2015. Evaluating the results shown in the two tables, we consider that the classification is more stable with the instance-based strategy, so we decided to include this configuration in the evaluation of the task of this year. The results obtained can be observed in the summary published by the organizers and in the following table.</p><p>The results with the test dataset are shown in <ref type="bibr" target="#b17">[18]</ref> and presented on the PAN web site. We got the lowest values of all the participants, and only run successfully for the English dataset. Comparing the results obtained using the BOW-baseline that uses the 1000 most frequent terms, we conclude that one of our problems is that we need to analyze and reduce the features used. We processed the dataset using the instance-based strategy and perhaps the results could be better if we used de prototype-based strategy whit feature selection methods.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Architecture proposed for author profiling. Instance based classification strategy.</figDesc><graphic coords="3,124.68,147.36,345.72,347.28" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Accuracy in 2-cross fold validation train 2017</figDesc><table><row><cell>Spanish</cell><cell>English</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>Accuracy in 2-cross fold validation train 2015</figDesc><table><row><cell></cell><cell></cell><cell>Spanish</cell><cell>English</cell></row><row><cell></cell><cell>gender</cell><cell>0,68</cell><cell>0,56</cell></row><row><cell>Instance based</cell><cell>Age</cell><cell>0,46</cell><cell>0,45</cell></row><row><cell></cell><cell>join</cell><cell>0,29</cell><cell>0,21</cell></row><row><cell></cell><cell>gender</cell><cell>0,68</cell><cell>0,58</cell></row><row><cell>Prototype based</cell><cell>Age</cell><cell>0,21</cell><cell>0,1</cell></row><row><cell></cell><cell>join</cell><cell>0,17</cell><cell>0,1</cell></row></table></figure>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Conclusions and future work</head><p>A representation that considers the terms used in tweets, is able to differentiate to a large extent the sets of tweets written by authors of different genres. The proposed subjectivity and opinion features allow improvements in classification, but they are not substantial improvements. In the evaluation we made with the collections of 2015, we verified that each of the sets of features separately allows good identifications of the genre and that their combination increases the values obtained. The classification in language variety maintains low results and to a great extent this is due to the little difference that is observed between some of these classes and that many terms used by the authors are of universal character and are standardized in the community.</p><p>We achieved the lowest values of all the team and considering that a baseline method using the 1000 most frequent terms in a Bag of Word representation got better results, then we need to do an exhaustive evaluation of our method.</p><p>We must work on features selection strategies and the analysis of representative objects to each of the classes. We propose to evaluate a classification with rejection or abstention for those users whose tweets do not contain characteristic features with their class, for example for the idea of language and not penalize so much the possible bad classifications.</p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Extracting Appraisal Expressions</title>
		<author>
			<persName><forename type="first">K</forename><surname>Bloom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Garg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Argamon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Association for Computational Linguistics</title>
				<meeting><address><addrLine>Rochester, NY</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="308" to="315" />
		</imprint>
	</monogr>
	<note>Proceedings of NAACL HLT 2007</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">WordNet-Affect: an Affective Extension of WordNet</title>
		<author>
			<persName><forename type="first">Carlos</forename><surname>Strapparava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Valitutti</forename><surname>Ro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 4th International Conference on Language Resources and Evaluation</title>
				<meeting>the 4th International Conference on Language Resources and Evaluation</meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="1083" to="1086" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Authorship Verification, combining Linguistic Features and Different Similarity Functions</title>
		<author>
			<persName><forename type="first">Daniel</forename><surname>Castro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yaritza</forename><surname>Adame</surname></persName>
		</author>
		<author>
			<persName><forename type="first">María</forename><surname>Peláez Brioso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rafael</forename><surname>Muñoz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF (Working Notes</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A Survey of Modern Authorship Attribution Methods</title>
		<author>
			<persName><forename type="first">Efstathios</forename><surname>Stamatatos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Society for Information Science and Technology</title>
		<imprint>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="538" to="556" />
			<date type="published" when="2009-03">March 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Overview of the 4th Author Profiling Task at PAN 2016: Cross-Genre Evaluations</title>
		<author>
			<persName><forename type="first">Francisco</forename><surname>Manuel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rangel</forename><surname>Pardo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Paolo</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ben</forename><surname>Verhoeven</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Walter</forename><surname>Daelemans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Martin</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Benno</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CLEF (Working Notes</title>
		<imprint>
			<biblScope unit="volume">2016</biblScope>
			<biblScope unit="page" from="750" to="784" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Overview of the 3rd Author Profiling Task at PAN</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">Rangel</forename><surname>Francisco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fabio</forename><surname>Pardo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Paolo</forename><surname>Celli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Martin</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Benno</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Walter</forename><surname>Stein</surname></persName>
		</author>
		<author>
			<persName><surname>Daelemans</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2015">2015. 2015</date>
		</imprint>
		<respStmt>
			<orgName>CLEF</orgName>
		</respStmt>
	</monogr>
	<note>Working Notes</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">On the Impact of Emotions on Author Profiling</title>
		<author>
			<persName><forename type="first">Francisco</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Paolo</forename><surname>Rosso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing &amp; Management</title>
		<imprint>
			<biblScope unit="volume">52</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="73" to="92" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Empirical Study of Opinion Mining in Spanish Tweets</title>
		<author>
			<persName><forename type="first">Grigori</forename><surname>Sidorov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sabino</forename><surname>Miranda-Jiménez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Francisco</forename><surname>Viveros-Jiménez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><surname>Gelbukh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Noé</forename><surname>Castro-Sánchez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Francisco</forename><surname>Velásquez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ismael</forename><surname>Díaz-Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sergio</forename><surname>Suárez-Guerra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alejandro</forename><surname>Treviño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Juan</forename><surname>Gordon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">LNAI</title>
		<imprint>
			<biblScope unit="volume">7629</biblScope>
			<biblScope unit="page" from="1" to="14" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Recognizing Polarity and Attitude of Words in Text</title>
		<author>
			<persName><forename type="first">L</forename><surname>Hernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>López-Lopez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E</forename><surname>Medina-Pagola</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. F 14th Portuguese Conference on Artificial Intelligence</title>
				<meeting>F 14th Portuguese Conference on Artificial Intelligence<address><addrLine>Aveiro, Portugal</addrLine></address></meeting>
		<imprint>
			<publisher>EPIA</publisher>
			<date type="published" when="2009">2009. 2009</date>
			<biblScope unit="page" from="525" to="536" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Classification of Attitude Words for Opinions Mining</title>
		<author>
			<persName><forename type="first">L</forename><surname>Hernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>López-Lopez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E M</forename><surname>Pagola</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Computational Linguistics and Applications</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">1-2</biblScope>
			<biblScope unit="page" from="267" to="283" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Creación y evaluación de un diccionario marcado con emociones y ponderado para el español</title>
		<author>
			<persName><forename type="first">Ismael Díaz</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Grigori</forename><surname>Sidorov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sergio</forename><surname>Suárez-Guerra</surname></persName>
		</author>
		<idno type="DOI">10.7764/onomazein.29.5</idno>
	</analytic>
	<monogr>
		<title level="j">Onomazein</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="page">23</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Método no supervisado para la clasificación de polaridad en Twitter</title>
		<author>
			<persName><forename type="first">Jose</forename><surname>Manuel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yero</forename><surname>Moreno</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Reynier</forename><forename type="middle">Ortega</forename><surname>Bueno</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">VII Conferencia Internacional de Ingeniería Eléctrica</title>
		<imprint>
			<biblScope unit="page" from="1" to="4" />
			<date type="published" when="2014-06">Jun, 2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">FreeLing 3.0: Towards Wider Multilinguality</title>
		<author>
			<persName><forename type="first">Lluís</forename><surname>Padró</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Evgeny</forename><surname>Stanilovsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Language Resources and Evaluation Conference (LREC 2012) ELRA</title>
				<meeting>the Language Resources and Evaluation Conference (LREC 2012) ELRA<address><addrLine>Istanbul, Turkey</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2012-05">May, 2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">UniNE at CLEF 2016: Author Profiling</title>
		<author>
			<persName><forename type="first">Mirco</forename><surname>Kocher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jacques</forename><surname>Savoy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF (Working Notes</title>
				<imprint>
			<biblScope unit="volume">2016</biblScope>
			<biblScope unit="page" from="903" to="911" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Gabriela Ramírez-de-la-Rosa, Esaú Villatoro-Tello: Profile-based Approach for Age and Gender Identification</title>
		<author>
			<persName><forename type="first">Maria</forename><surname>José</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Garciarena</forename><surname>Ucelay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Maria</forename><forename type="middle">Paula</forename><surname>Villegas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dario</forename><forename type="middle">G</forename><surname>Funez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Leticia</forename><forename type="middle">C</forename><surname>Cagnina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marcelo</forename><surname>Luis Errecalde</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CLEF (Working Notes</title>
		<imprint>
			<biblScope unit="volume">2016</biblScope>
			<biblScope unit="page" from="864" to="873" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Authorship Attribution</title>
		<author>
			<persName><forename type="first">Patrick</forename><surname>Juola</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Foundations and Trends in Information Retrieval</title>
				<imprint>
			<date type="published" when="2008-03">March 2008</date>
			<biblScope unit="volume">1</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Linguistic inquiry and word count: LIWC</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">W</forename><surname>Pennebaker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Francis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">J</forename><surname>Booth</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2001">2001. 2001</date>
			<publisher>Lawrence Erlbaum Associates</publisher>
			<biblScope unit="volume">71</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter</title>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<ptr target=".org" />
	</analytic>
	<monogr>
		<title level="m">Working Notes Papers of the CLEF 2017 Evaluation Labs</title>
		<title level="s">CEUR Workshop Proceedings, CLEF and CEUR-WS</title>
		<imprint>
			<date type="published" when="2017-09">Sep 2017</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
