<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Language Variety and Gender Classification for Author Profiling in PAN 2017: Notebook for PAN at CLEF 2017</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Alexander</forename><surname>Ogaltsov</surname></persName>
							<email>ogaltsov@ap-team.ru</email>
							<affiliation key="aff0">
								<orgName type="department">Higher School of Economics</orgName>
								<orgName type="institution">Moscow Institute of Physics and Technology</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alexey</forename><surname>Romanov</surname></persName>
							<email>alexey.romanov@phystech.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Higher School of Economics</orgName>
								<orgName type="institution">Moscow Institute of Physics and Technology</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Antiplagiat</forename><surname>Cjsc</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Higher School of Economics</orgName>
								<orgName type="institution">Moscow Institute of Physics and Technology</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Language Variety and Gender Classification for Author Profiling in PAN 2017: Notebook for PAN at CLEF 2017</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">8D02F0A392C42A33F141490277E3F81E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We describe our method for the Author Profiling task. The task addresses profile aspects such as gender and language variety. We explore high-order char n-grams as features and logistic regression as the classifier for all subtasks. This approach proves simple and effective for the task. We also investigate feature importances and low-dimensional embeddings of the data.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The author profiling task considers different profile dimensions of the author of a text. This year's shared task <ref type="bibr" target="#b11">[12]</ref> <ref type="bibr" target="#b10">[11]</ref> focuses on gender and language variety. Previous competitions explored properties such as gender, age group <ref type="bibr" target="#b12">[13]</ref> and personality traits <ref type="bibr" target="#b6">[8]</ref>. The task is interesting from both industrial and scientific points of view. Applications such as accurate advertising targeting, security and forensics make it highly relevant in practice. The task can also serve as a tool for filling in missing information about a person in political or demographic research. The research community pays attention to the task as well: a dedicated track of the PAN <ref type="bibr" target="#b5">[7]</ref> shared task has been held since 2013. Each year contributed a new language or a new profile dimension to classify, and gender identification was common to all years. The first task used blog data in Spanish and English <ref type="bibr" target="#b8">[10]</ref>. The 2014 competition concentrated on different sources such as reviews and tweets <ref type="bibr" target="#b7">[9]</ref>. The 2015 task was extended with additional languages and real-valued personality traits <ref type="bibr" target="#b6">[8]</ref>. The main characteristic of the most recent shared task was its cross-genre setting: the goal was to develop a model that is robust to the domain of the data <ref type="bibr" target="#b12">[13]</ref>. Since gender identification was part of all previous competitions, many approaches have been tested. The main features were n-grams and various text statistics <ref type="bibr" target="#b2">[4]</ref>. 
The language variety task appears at PAN for the first time in 2017, but there have been language variety detection competitions such as Discriminating between Similar Languages and national language varieties (DSL) 2016 <ref type="bibr" target="#b0">[1]</ref>. The winning approach of that contest used char n-grams in a wide range (1-7) with a linear classifier <ref type="bibr" target="#b1">[3]</ref>. We used this method not only for the language variety task but also for gender classification. Each language in the current shared task has several varieties. For instance, there are two varieties of Portuguese: Brazilian and European. The task is to distinguish one from another. The languages and their varieties are listed in Table <ref type="table" target="#tab_0">1</ref>. Our approach tries to extract features automatically for each variety of Portuguese, English, Spanish and Arabic without any linguistic knowledge. We use char n-grams as features and logistic regression as a classifier. The evaluation metric is accuracy for both subtasks. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Methodology</head><p>This section describes our approach to the current PAN Author Profiling task. First, we briefly discuss preprocessing. Then, we describe how we construct the feature space. Finally, we explain our choice of logistic regression as the classifier.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Preprocessing</head><p>We did not perform any preprocessing, such as removing hashtags, HTML tags or URLs, because we considered these elements potentially informative features.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Classification</head><p>Our main assumption was to treat all short texts written by a single author as one object in the machine learning formulation. We formulated the problem as a classification task with two or more classes, depending on the language (Table <ref type="table" target="#tab_0">1</ref>). If a language has more than two varieties, we used a "one versus others" scheme. Let the dataset</p><formula xml:id="formula_0">D = \{(x_i, y_i)\}, \quad i = 1, \dots, m,</formula><p>consist of "object-class" pairs with x_i ∈ R^d. Each object x_i has one of Z class labels y_i ∈ Y = {1, ..., Z}. We have to find a mapping f ∈ F : R^d → Y that minimizes the empirical risk on the dataset D:</p><formula xml:id="formula_1">\hat{f} = \arg\min_{f \in F} \sum_{(x_i, y_i) \in D} [f(x_i) \neq y_i],</formula><p>where F is a family of models and [·] equals 1 when the enclosed condition holds and 0 otherwise. The feature space was constructed by counting, for each language corpus, character-level n-grams over a range of n. These counts were used as features. The number of authors and features for the different tasks can be found in Table <ref type="table">2</ref>. One can see that the data is quite sparse. The density distribution of non-zero n-grams for Portuguese is shown in Figure <ref type="figure" target="#fig_0">1</ref>. We did not use higher-order n-grams because of RAM restrictions, although <ref type="bibr" target="#b1">[3]</ref> reported that quality increases up to char 7-grams. We performed classification with a logistic regression model with regularization parameter C = 1. We chose logistic regression because it has high bias and low variance.</p><p>In this section we describe our results during cross-validation and on the test set. Next, we present an embedding of the data in a low-dimensional space. Finally, we discuss the feature importances of our classifier.</p></div>
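<div xmlns="http://www.tei-c.org/ns/1.0"><p>The feature construction described in the Classification section, where all short texts of one author form a single object and character-level n-grams are counted over a range of n, can be sketched as follows. This is a minimal illustration under our own assumptions, not the code used for the submission: the function names, the newline used to join texts, and the n-gram range in the example are hypothetical. In practice, a sparse implementation such as scikit-learn's CountVectorizer with analyzer="char" would serve the same purpose.</p>
```python
from collections import Counter


def char_ngram_counts(text, n_min=1, n_max=3):
    """Count every character n-gram of length n_min..n_max in text."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the raw, unpreprocessed text.
        for start in range(len(text) - n + 1):
            counts[text[start:start + n]] += 1
    return counts


def author_features(texts, n_min=1, n_max=3):
    """Treat all short texts of one author as a single object.

    Joining with a newline is an assumption made for this sketch.
    """
    document = "\n".join(texts)
    return char_ngram_counts(document, n_min, n_max)


# Hypothetical author with two short texts.
features = author_features(["bom dia", "boa noite"], n_min=1, n_max=2)
print(features["bo"])  # 2: the bigram "bo" occurs once in each text
print(len(features))   # number of distinct n-grams for this author
```
<p>Each distinct n-gram becomes one dimension of the feature space, so most entries of an author's vector are zero, which matches the sparsity noted above.</p></div>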
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Results and Data Visualization</head><p>The evaluation metric for this task is accuracy:</p><formula xml:id="formula_2">\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}</formula><p>where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives. We evaluated the quality of the gender and language variety subtasks separately using five-fold cross-validation. The results can be found in Table <ref type="table" target="#tab_1">3</ref>. It was interesting to see how the data is located in the feature space. To do so, we used modern dimensionality reduction and data visualization techniques. We chose t-SNE [2] since it is reported to be fast when the number of objects is small and tends to preserve the local structure of the data. Also, the Python scikit-learn <ref type="bibr" target="#b3">[5]</ref> implementation of the algorithm supports sparse matrices as input. An example for Portuguese authors is shown in Figure <ref type="figure" target="#fig_3">3</ref>. Unfortunately, the axes of the resulting embedding have no clear interpretation. </p></div>
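<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a small worked example of the accuracy metric and the five-fold evaluation scheme, consider the sketch below. The labels are hypothetical, and the fold assignment (every k-th author goes to the same fold, without shuffling or stratification) is an assumption of this sketch; the exact splitting procedure is not specified in the text.</p>
```python
def accuracy(y_true, y_pred):
    """Share of authors whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def accuracy_from_counts(tp, tn, fp, fn):
    """The same quantity written with confusion-matrix counts."""
    return (tp + tn) / (tp + fp + tn + fn)


def k_fold_indices(m, k=5):
    """Assign author indices 0..m-1 to k disjoint validation folds."""
    return [list(range(fold, m, k)) for fold in range(k)]


# Hypothetical labels for four Portuguese authors.
y_true = ["brazil", "portugal", "brazil", "portugal"]
y_pred = ["brazil", "portugal", "portugal", "portugal"]
print(accuracy(y_true, y_pred))                      # 0.75
# With "portugal" as the positive class: TP=2, TN=1, FP=1, FN=0.
print(accuracy_from_counts(tp=2, tn=1, fp=1, fn=0))  # 0.75
print(k_fold_indices(m=7, k=5))
```
<p>Both functions return the same value on this example, which illustrates that the count-based formula is the two-class special case of the fraction of correctly labelled authors.</p></div>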
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Feature importances</head><p>We investigated the absolute values of the coefficients of our model for Portuguese language variety. These values can be considered feature importances (Figure <ref type="figure" target="#fig_4">4</ref>). The x-axis shows the position in the array of logistic regression coefficients sorted in descending order. The y-axis shows the absolute value of the coefficient. One can see that most feature coefficients have low magnitude, but there is a group of features with relatively high importance. </p></div>
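<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ranking behind such a plot can be sketched as follows: take the absolute value of each coefficient and sort in descending order. The n-grams and coefficient values below are invented for illustration and are not taken from our model; the function name is likewise hypothetical.</p>
```python
def rank_by_abs_coefficient(ngrams, coefficients):
    """Pair each n-gram with |coefficient| and sort in descending order."""
    if len(ngrams) != len(coefficients):
        raise ValueError("one coefficient per n-gram is required")
    pairs = [(g, abs(c)) for g, c in zip(ngrams, coefficients)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hypothetical n-grams and coefficient values, for illustration only.
ngrams = ["sto", "oce", "que", " tu"]
coefficients = [0.05, -1.2, -0.01, 0.8]
ranked = rank_by_abs_coefficient(ngrams, coefficients)
for position, (ngram, importance) in enumerate(ranked):
    # position is the x-axis value, importance the y-axis value.
    print(position, repr(ngram), importance)
```
<p>Because the sign is discarded, the ranking says how strongly an n-gram influences the decision but not which variety it favours.</p></div>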
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusion and Future Work</head><p>We explored a simple and robust method for gender and language variety classification for the PAN17 Author Profiling task. It turned out that high-order char n-grams are good features that are easy to generate without handcrafting or expert linguistic knowledge. The main disadvantage of such features is that they make error analysis almost impossible. We trained a logistic regression classifier for both subtasks and evaluated it by accuracy. In future work, we will explore how adding even more n-grams affects the quality measure.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 .</head><label>1</label><figDesc>Figure 1. Density distribution for Portuguese.</figDesc><graphic coords="3,169.35,177.92,276.66,262.06" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>An example ROC curve for language variety classification of Portuguese is shown in Figure 2. FPR and TPR are the false positive rate and true positive rate, respectively, at various classification thresholds. We evaluated test scores via TIRA <ref type="bibr" target="#b4">[6]</ref>.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 .</head><label>2</label><figDesc>Figure 2. ROC-curve for Portuguese language variety.</figDesc><graphic coords="4,169.35,352.86,276.67,264.94" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 .</head><label>3</label><figDesc>Figure 3. t-SNE data visualization for Portuguese.</figDesc><graphic coords="5,152.06,347.73,311.25,292.33" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 .</head><label>4</label><figDesc>Figure 4. Feature coefficients for Portuguese language variety.</figDesc><graphic coords="6,169.35,231.09,276.67,259.64" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Languages and Varieties</figDesc><table><row><cell>Language</cell><cell>Variety</cell></row><row><cell>Portuguese</cell><cell>Portugal, Brazil</cell></row><row><cell>English</cell><cell>Australia, Canada, Great Britain, Ireland, New Zealand, United States</cell></row><row><cell>Spanish</cell><cell>Argentina, Chile, Colombia, Mexico, Peru, Spain, Venezuela</cell></row><row><cell>Arabic</cell><cell>Gulf, Levantine, Maghrebi, Egypt</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 3 .</head><label>3</label><figDesc>Evaluation</figDesc><table><row><cell>Language</cell><cell>CV gender acc.</cell><cell>CV variety acc.</cell><cell>Test gender acc.</cell><cell>Test variety acc.</cell></row><row><cell>Portuguese</cell><cell>0.8025</cell><cell>0.9850</cell><cell>0.7988</cell><cell>0.9725</cell></row><row><cell>English</cell><cell>0.7918</cell><cell>0.7913</cell><cell>0.7875</cell><cell>0.8092</cell></row><row><cell>Spanish</cell><cell>0.7456</cell><cell>0.8892</cell><cell>0.7600</cell><cell>0.8989</cell></row><row><cell>Arabic</cell><cell>0.7263</cell><cell>0.7739</cell><cell>0.7213</cell><cell>0.7556</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m">DSL Shared Task 2016</title>
		<ptr target="http://ttg.uni-saarland.de/vardial2016/dsl2016" />
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b0a">
	<analytic>
		<title level="a" type="main">Visualizing high-dimensional data using t-SNE</title>
		<author>
			<persName><forename type="first">L</forename><surname>Maaten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hinton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="2579" to="2605" />
			<date type="published" when="2008-11">Nov 2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Discriminating between similar languages and arabic dialect identification: A report on the third dsl shared task</title>
		<author>
			<persName><forename type="first">S</forename><surname>Malmasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ljubešić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tiedemann</surname></persName>
		</author>
		<ptr target="http://aclweb.org/anthology/W16-4801" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)</title>
				<meeting>the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)<address><addrLine>Osaka, Japan</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016-12">December 2016</date>
			<biblScope unit="page" from="1" to="14" />
		</imprint>
	</monogr>
	<note>The COLING 2016 Organizing Committee</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Exploring the Effects of Cross-Genre Machine Learning for Author Profiling in PAN 2016-Notebook for PAN at CLEF</title>
		<author>
			<persName><forename type="first">P</forename><surname>Modaresi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Liebeck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Conrad</surname></persName>
		</author>
		<ptr target="http://ceur-ws.org/Vol-1609/" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2016 Evaluation Labs and Workshop -Working Notes Papers</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Balog</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Cappellato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</editor>
		<meeting><address><addrLine>Évora, Portugal</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016-09">5-8 September 2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Scikit-learn: Machine learning in Python</title>
		<author>
			<persName><forename type="first">F</forename><surname>Pedregosa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Varoquaux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gramfort</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Michel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Thirion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Grisel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Blondel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Prettenhofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Weiss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Dubourg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Vanderplas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Passos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cournapeau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Brucher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Perrot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Duchesnay</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="2825" to="2830" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Improving the Reproducibility of PAN&apos;s Shared Tasks: Plagiarism Detection, Author Identification, and Author Profiling</title>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gollub</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Stamatatos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Information Access Evaluation meets Multilinguality, Multimodality, and Visualization. 5th International Conference of the CLEF Initiative (CLEF 14)</title>
				<editor>
			<persName><forename type="first">E</forename><surname>Kanoulas</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Lupu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Clough</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Sanderson</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Hall</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Hanbury</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Toms</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg New York</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2014-09">Sep 2014</date>
			<biblScope unit="page" from="268" to="299" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Overview of PAN&apos;17: Author Identification, Author Profiling, and Author Obfuscation</title>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tschuggnall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Stamatatos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction. 8th International Conference of the CLEF Initiative (CLEF 17)</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Jones</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Lawless</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Gonzalo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Kelly</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Goeuriot</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Cappellato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg New York</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2017-09">Sep 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Overview of the 3rd Author Profiling Task at PAN 2015</title>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Celli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Daelemans</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF 2015 Evaluation Labs and Workshop -Working Notes Papers</title>
				<editor>
			<persName><forename type="first">L</forename><surname>Cappellato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Jones</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>San Juan</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2015-09">8-11 September 2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Overview of the 2nd Author Profiling Task at PAN 2014</title>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Chugur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Trenkmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verhoeven</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Daelemans</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2014 Evaluation Labs and Workshop -Working Notes Papers</title>
				<editor>
			<persName><forename type="first">L</forename><surname>Cappellato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Halvey</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">W</forename><surname>Kraaij</surname></persName>
		</editor>
		<meeting><address><addrLine>Sheffield, UK</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014-09">15-18 September 2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Overview of the Author Profiling Task at PAN 2013</title>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Koppel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Stamatatos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Inches</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF 2013 Evaluation Labs and Workshop - Working Notes Papers</title>
		<editor>
			<persName><forename type="first">P</forename><surname>Forner</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Navigli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Tufis</surname></persName>
		</editor>
		<meeting><address><addrLine>Valencia, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-09">23-26 September 2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<title level="m">Working Notes Papers of the CLEF 2017 Evaluation Labs</title>
				<editor>
			<persName><forename type="first">L</forename><surname>Cappellato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Goeuriot</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</editor>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter</title>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF 2017 Labs and Workshops</title>
		<title level="s">Notebook Papers. CEUR Workshop Proceedings, CLEF and CEUR-WS</title>
		<imprint>
			<date type="published" when="2017-09">Sep 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Overview of the 4th Author Profiling Task at PAN 2016: Cross-Genre Evaluations</title>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel Pardo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Verhoeven</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Daelemans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<ptr target="http://ceur-ws.org/Vol-1609/" />
	</analytic>
	<monogr>
		<title level="m">Working Notes Papers of the CLEF 2016 Evaluation Labs. CEUR Workshop Proceedings, CLEF and CEUR-WS</title>
				<imprint>
			<date type="published" when="2016-09">Sep 2016</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
