<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">B4DS @ PRELEARN: Ensemble Method for Prerequisite Learning</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Giovanni</forename><surname>Puccetti</surname></persName>
							<email>giovanni.puccetti@sns.it</email>
						</author>
						<author>
							<persName><forename type="first">Luis</forename><surname>Bolanos</surname></persName>
							<email>luis.bolanos@texty.biz</email>
						</author>
						<author>
							<persName><forename type="first">Filippo</forename><surname>Chiarello</surname></persName>
							<email>filippo.chiarello@unipi.it</email>
						</author>
						<author>
							<persName><forename type="first">Gualtiero</forename><surname>Fantoni</surname></persName>
							<email>g.fantoni@ing.unipi.it</email>
						</author>
						<author>
							<affiliation key="aff0">
								<orgName type="institution">Scuola Normale Superiore</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
<orgName type="institution">Università di Pisa</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
<orgName type="institution">Università di Pisa</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">B4DS @ PRELEARN: Ensemble Method for Prerequisite Learning</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">4AD936E44110817E8F9E727C2D98AD77</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T01:04+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>English. In this paper we describe the methodologies we proposed to tackle the EVALITA 2020 shared task PRELEARN. We propose both a methodology based on gated recurrent units and one using more classical word embeddings together with ensemble methods. Our goal in choosing these approaches is twofold: on one side, we wish to see how much of the prerequisite information is present within the pages themselves; on the other, we would like to measure how much using information from the rest of Wikipedia can help in identifying this type of relation. This second approach is particularly useful for extending to new entities that are close to those in the corpus provided for the task but not actually present in it. With these methodologies we reached second position in the challenge 1.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The PRELEARN task consists in classifying pairs of concepts according to whether one is a prerequisite for the other. The concepts are presented as Wikipedia pages and are divided into four domains: physics, precalculus, data mining and geometry.</p><p>The task was organized in 4 subtasks: i) two of them concern the type of information that the submitted models may exploit, either solely textual or including metadata, e.g. Wikipedia hyperlinks; ii) the other two concern different classification scenarios: training and testing could happen either within the same domain, or three domains could be used as the training set and the fourth for testing. A more extensive description of the task, together with all the results, can be found in the task report <ref type="bibr" target="#b1">(Alzetta et al., 2020)</ref>, which is part of EVALITA 2020 <ref type="bibr">(Basile et al., 2020)</ref>. The notion of being a prerequisite is highly complex and can be misunderstood by humans as well. Indeed, this relation can be subtle and, depending on the domain, it may take a deep level of expertise to recognize. One of the reasons this challenge is interesting is that several applications can arise from this same setting. For instance, the systems we develop for this task could be applied to evaluate teaching modules: one could design a quality assessment for courses based on the level of agreement between subsequent chapters and sections and their prerequisite relations. A different application could be the definition of a new way to navigate Wikipedia itself, identifying which links move in the same direction as the prerequisite relation and which, on the contrary, move against it.</p><p>Let us now outline three main aspects common to different works tackling similar tasks. 
We will take these specifics into account while developing our own models. The first is that hand-crafted features are commonly used: in <ref type="bibr" target="#b10">(Miaschi et al., 2019)</ref> such features are developed mostly by analysing textual statistics, for example the occurrence of one concept in the page of another. In <ref type="bibr" target="#b9">(Liang et al., 2015)</ref> top-down features are also developed; however, the information they structure does not come from the body of the pages, since they instead use the structure of Wikipedia as a graph of hyperlinks. Following this line, the second aspect is the use of graph structures. Most works predicting prerequisites interpret pages as nodes and hyperlinks as edges. Both <ref type="bibr" target="#b12">(Talukdar and Cohen, 2012)</ref> and <ref type="bibr" target="#b9">(Liang et al., 2015)</ref> use this feature, in some cases joining it with textual information, in others as a stand-alone one. On the contrary, <ref type="bibr" target="#b0">(Adorni et al., 2019)</ref> uses a bottom-up graph structure created to help in the prediction. The third and last is the use of neural networks, as done in <ref type="bibr" target="#b10">(Miaschi et al., 2019)</ref>, where they are employed to create representations of text that can afterwards be fed as features to simpler classifiers. We remark that structuring information into a graph is a practice used also in other tasks involving several documents. One example is topic modeling <ref type="bibr" target="#b7">(Gerlach et al., 2018)</ref>; it is interesting to notice how this task shares some of the steps needed for prerequisite learning. Indeed, in both cases one needs to create a hierarchy of concepts which is then exploited in different ways. Since we wish to exploit textual knowledge, we can also employ word embeddings. 
For the Italian language, such embeddings were developed in <ref type="bibr" target="#b3">(Berardi et al., 2015)</ref>. On top of them we use ensemble methodologies, since they can proficiently exploit the information in these representations. Notice that, in principle, more modern techniques such as transformer models <ref type="bibr" target="#b6">(Devlin et al., 2019)</ref> could be used to improve performance on this task; however, as we will see, we preferred not to do so. The main reason supporting this choice is that the dataset provided for this task is not very large, and thus we avoided overly large models. The systems we developed try to combine all the aspects we reported: we exploit both knowledge strictly present within the Wikipedia pages provided for this task and information coming from the rest of the online encyclopedia.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Description of the System</head><p>In this report we describe the methodology we developed to tackle the PRELEARN task. We report the choices made and the steps that led us to them. In particular, we focused on the raw-text setting, for which we adopted two systems with the goal of prerequisite learning. Although both use the Wikipedia pages' texts, each does so in a different way.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Model 1</head><p>This model exploits a combination of pretrained word embeddings of GloVe type <ref type="bibr" target="#b11">(Pennington et al., 2014)</ref>, as trained for Italian in <ref type="bibr" target="#b3">(Berardi et al., 2015)</ref>, and handcrafted features, the latter inspired by <ref type="bibr" target="#b10">(Miaschi et al., 2019)</ref>. In particular, for each page title in a concept pair (A, B), we computed a 300-dimensional vector by averaging the word embeddings of the words in the A/B title. The two resulting vectors were concatenated together with the following 14 handcrafted features.</p><p>• Is B(A) in A(B)'s text?</p><p>• Number of occurrences of B(A) in A(B)'s text</p><p>• Is B(A) in the first sentence of A(B)?</p><p>• Is B in A's title?</p><p>• Length of A(B)</p><p>• Jaccard similarity between the texts</p><p>• Jaccard similarity between nouns in the texts</p><p>• Difference in length between first paragraphs</p><p>• Difference in number of nouns in first paragraphs</p><p>• Jaccard similarity between nouns in first paragraphs</p><p>Then, for each pair (A, B), the final 614-dimensional feature vector was fed to an XGBoost classifier <ref type="bibr" target="#b4">(Chen and Guestrin, 2016)</ref>, whose model selection was performed via nested cross-validation with grid search.</p></div>
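To make the feature construction concrete, here is a minimal Python sketch of a few of the occurrence- and overlap-based features above. Function names, the naive tokenization, and the example pair are our own illustrative assumptions, not the authors' code; the remaining features (and the 600 title-embedding dimensions) are built analogously.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard similarity between two token sets."""
    sa, sb = set(tokens_a), set(tokens_b)
    union = sa.union(sb)
    return len(sa.intersection(sb)) / len(union) if union else 0.0


def pair_features(title_a, text_a, title_b, text_b):
    """A subset of Model 1's handcrafted features for a concept pair (A, B).

    Hypothetical sketch: naive lowercasing and period/whitespace splitting
    stand in for proper sentence splitting and tokenization.
    """
    ta, tb = text_a.lower(), text_b.lower()
    a_title, b_title = title_a.lower(), title_b.lower()
    return {
        "b_in_a_text": b_title in ta,
        "a_in_b_text": a_title in tb,
        "count_b_in_a": ta.count(b_title),
        "count_a_in_b": tb.count(a_title),
        "b_in_a_first_sentence": b_title in ta.split(".")[0],
        "a_in_b_first_sentence": a_title in tb.split(".")[0],
        "b_in_a_title": b_title in a_title,
        "len_a": len(ta.split()),
        "len_b": len(tb.split()),
        "jaccard_texts": jaccard(ta.split(), tb.split()),
    }


feats = pair_features(
    "derivata", "La derivata di una funzione. Il limite è un prerequisito.",
    "limite", "Il limite di una funzione descrive il suo comportamento.",
)
```

The resulting values would then be appended to the two concatenated 300-dimensional title embeddings, giving the 614-dimensional vector passed to XGBoost.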
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Model 2</head><p>This model takes as input the first 400 words of each Wikipedia page and, for each pair (A, B), predicts whether concept B is a prerequisite for concept A. It is composed of a Gated Recurrent Unit <ref type="bibr" target="#b5">(Cho et al., 2014)</ref> with hidden size 8 and encoding size 32, and a linear layer taking as input the concatenation of the two vectors representing the two Wikipedia pages, from which it predicts the prerequisite relation. This model, similar to (though simpler than) model M1 in <ref type="bibr" target="#b10">(Miaschi et al., 2019)</ref>, performs well enough and is fast to train. The parameters were chosen via a grid search, selecting the best results achieved on a validation set. The aforementioned values are the best-performing choices in all settings, and we keep them for the cross-domain task as well. We tried different learning rates, though ultimately a constant rate of 0.01 for the whole training was the best choice.</p></div>
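A minimal PyTorch sketch of such a pair classifier follows. The text fixes only the hidden size (8) and the encoding size (32); the vocabulary size, the embedding dimension, the projection layer, and all names are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn


class GRUPairClassifier(nn.Module):
    """Sketch of Model 2: a GRU encodes the first 400 words of each page,
    and a linear layer classifies the concatenated pair of page encodings.
    Embedding dimension and layer names are assumptions."""

    def __init__(self, vocab_size, emb_dim=32, hidden_size=8, enc_size=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden_size, batch_first=True)
        self.project = nn.Linear(hidden_size, enc_size)  # page encoding
        self.classify = nn.Linear(2 * enc_size, 1)       # pair to logit

    def encode(self, token_ids):
        # h has shape (num_layers, batch, hidden); take the final state
        _, h = self.gru(self.embed(token_ids))
        return self.project(h.squeeze(0))                # (batch, enc_size)

    def forward(self, page_a, page_b):
        pair = torch.cat([self.encode(page_a), self.encode(page_b)], dim=-1)
        # one logit per pair: is B a prerequisite of A?
        return self.classify(pair).squeeze(-1)


model = GRUPairClassifier(vocab_size=1000)
page_a = torch.randint(0, 1000, (2, 400))  # batch of 2 pages, 400 tokens each
page_b = torch.randint(0, 1000, (2, 400))
logits = model(page_a, page_b)
```

Training would then minimize a binary cross-entropy loss on these logits against the gold prerequisite labels.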
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Discarded Models</head><p>We attempted the structured-data task as well, in particular adding the graph structure to see whether it would be useful. To exploit this knowledge we tried a Graph Convolutional Network (GCN) <ref type="bibr" target="#b8">(Kipf and Welling, 2017)</ref>. To do so, we added the GCN between the Gated Recurrent Unit and the linear layer of Model 2, so as to perform the prediction based on the concatenation of the embedding of each node (Wikipedia page) in each pair. However, this methodology resulted in lower scores on all datasets, so we ended up not submitting it. We believe this is not the appropriate way to leverage the information present in the Wikipedia structure, since we know from <ref type="bibr" target="#b10">(Miaschi et al., 2019)</ref> that the information itself is relevant.</p><p>For Model 1, a variation with a multi-layer perceptron was tested as well, but results were below those reported for the XGBoost ensemble.</p><p>An overall different approach we rejected is the use of transformer models. Indeed, to obtain a representation of the text composing each page we could employ a representation extracted from BERT. However, after seeing how much smaller models were already overfitting the training set, we concluded that the amount of available textual data is not sufficient to exploit such a model, and avoided it.</p></div>
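For illustration, the graph-convolution step inserted between the GRU encoder and the classifier can be sketched as a single layer with the symmetric normalization of Kipf and Welling; the dense adjacency over a handful of pages, the dimensions, and the names below are illustrative assumptions, not the configuration we actually ran.

```python
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """One graph-convolution step (Kipf and Welling, 2017):
    propagate node features over a hyperlink graph using the
    symmetrically normalized adjacency with self-loops."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h, adj):
        a_hat = adj + torch.eye(adj.size(0))       # add self-loops
        deg = a_hat.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.pow(-0.5))
        norm = d_inv_sqrt @ a_hat @ d_inv_sqrt     # symmetric normalization
        return torch.relu(norm @ self.weight(h))


# 4 Wikipedia pages as nodes, hyperlinks as symmetric edges (toy example)
h = torch.randn(4, 32)  # e.g. GRU page encodings
adj = torch.tensor([[0.0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]])
out = GCNLayer(32, 16)(h, adj)
```

In our discarded variant, the refined node embeddings of the two pages in a pair would then be concatenated and passed to the final linear classifier.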
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Results</head><p>In Table <ref type="table" target="#tab_0">1</ref> we report the accuracy achieved on the test set. As we can see, Model 1 outperformed Model 2. This is remarkable in the sense that the former is simpler than the model based on recurrent networks. The same can be said about the hand-crafted features, which are mostly statistics of each pair of pages based on occurrences. Indeed, as also shown in <ref type="bibr" target="#b10">(Miaschi et al., 2019)</ref>, this information does help the model. We believe Model 1 attained a higher score thanks to its pretrained word embeddings and the larger corpora they are trained upon. Indeed, the dataset used to create those vectors is composed of the whole Italian Wikipedia and a large amount of novels, which encodes within these representations a wider knowledge than the one provided for this task alone. Looking at the accuracy achieved with the GCN layer (values from our own validation-set split), we see that performances are systematically lower than the others, which is why we chose not to submit it.</p><p>After looking at the challenge results, we explored more generally how well our models performed: for each one, we estimated precision, recall, accuracy and F1 score (reported in Table <ref type="table" target="#tab_1">2</ref>).</p><p>When comparing Models 1 and 2, we noticed that the latter exhibited higher precision in 3 of the 4 domains, but also lower recall in 3 of them. As a result, there was a systematic difference in accuracy and F1 score favouring Model 1 over Model 2. If we look closely at Model 1's scores in Table <ref type="table" target="#tab_1">2</ref>, we see that physics and precalculus show a larger gap between precision and recall. 
This suggests that in these two domains there are concepts which, despite being involved in several prerequisite relations, are less represented in general knowledge. Moreover, the same behavior is observed for Model 2, indicating that the models started to miss some positive samples. The fact that it happens in this second setting as well makes us believe this phenomenon is also due to the information being more scattered within the Wikipedia pages of the concepts in these domains. As mentioned, the second model has higher precision in three domains, whereas the first has higher recall; in two domains the difference in recall strongly favours the first model, which is indeed the better-performing one.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Discussion</head><p>Regarding the first model, we see that the vectorization obtained from the Wikipedia corpus performs well, particularly considering that it represents exclusively the pages' titles. We also notice that the comparison between the two models is not straightforward, since the ensemble model we used was not tested on the vectors obtained from the recurrent neural network. We did not experiment with this mixed setting, since we believe it would not make sense to deploy a methodology with the power of XGBoost on embeddings based solely on the information present in the pages provided for this task. Indeed, such a complex model would most likely still perform worse than the one with pretrained embeddings, since, as we mentioned in Section 4, the knowledge available exclusively in the pages proposed for this task is limited.</p><p>The other remarkable aspect is that, to surpass the performance of the GRU, hand-crafted features were helpful, despite them being mostly word occurrence counts. This same information is available to the GRU model, which performs worse. This underlines how the recurrent architecture, though powerful and able to capture long-distance relations, cannot retain this type of substantial detail. Regarding the second model, we remark that the hidden size and the encoding size are very small. This is coherent with the fact that the dataset is not large enough to exploit the scaling potential of a larger recurrent neural network. However, even with this small model the results are better than the baseline and, as mentioned, training times are quite short. 
Thus, performing further ablation studies where bag-of-words methodologies are used together with recurrent ones could lead to improvements, while still supporting a more bottom-up solution than hand-crafted features.</p><p>Following the analysis of the models we used, we can conclude that being a prerequisite is a complex property, and thus the use of large amounts of data can be useful. On the other hand, the fact that the model based solely on the data at hand performs only marginally worse than the other underlines how this information is present in the pages themselves. Possibly, a dataset intermediate in size between the one at hand and the whole Italian Wikipedia could be a way to move prerequisite learning further.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1:</head><label>1</label><figDesc>Accuracies obtained on the task test set; GCN values are from our own validation-set split.</figDesc><table><row><cell>Wikipedia link</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>All scores obtained by Models 1 and 2.</figDesc><table><row><cell></cell><cell cols="3">Precision Recall Accuracy</cell><cell>F1</cell></row><row><cell></cell><cell cols="2">Model 1</cell><cell></cell></row><row><cell>data mining</cell><cell>0.80</cell><cell>0.80</cell><cell cols="2">0.80 0.80</cell></row><row><cell>geometry</cell><cell>0.92</cell><cell>0.92</cell><cell cols="2">0.92 0.92</cell></row><row><cell>physics</cell><cell>0.84</cell><cell>0.82</cell><cell cols="2">0.82 0.81</cell></row><row><cell>precalculus</cell><cell>0.93</cell><cell>0.93</cell><cell cols="2">0.93 0.93</cell></row><row><cell></cell><cell cols="2">Model 2</cell><cell></cell></row><row><cell>data mining</cell><cell>0.82</cell><cell>0.80</cell><cell cols="2">0.81 0.81</cell></row><row><cell>geometry</cell><cell>0.90</cell><cell>0.91</cell><cell cols="2">0.91 0.91</cell></row><row><cell>physics</cell><cell>0.87</cell><cell>0.73</cell><cell cols="2">0.81 0.79</cell></row><row><cell>precalculus</cell><cell>0.95</cell><cell>0.82</cell><cell cols="2">0.89 0.88</cell></row></table></figure>
<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Towards the identification of propaedeutic relations in textbooks</title>
		<author>
			<persName><forename type="first">Giovanni</forename><surname>Adorni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chiara</forename><surname>Alzetta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Frosina</forename><surname>Koceva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Samuele</forename><surname>Passalacqua</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ilaria</forename><surname>Torre</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Artificial Intelligence in Education -20th International Conference, AIED 2019</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">Seiji</forename><surname>Isotani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Eva</forename><surname>Millán</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Amy</forename><surname>Ogan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Peter</forename><forename type="middle">M</forename><surname>Hastings</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Bruce</forename><forename type="middle">M</forename><surname>Mclaren</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Rose</forename><surname>Luckin</surname></persName>
		</editor>
		<meeting><address><addrLine>Chicago, IL, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2019-06-25">2019. June 25-29, 2019</date>
			<biblScope unit="volume">11625</biblScope>
			<biblScope unit="page" from="1" to="13" />
		</imprint>
	</monogr>
	<note>Proceedings, Part I</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Prelearn @ evalita 2020: Overview of the prerequisite relation learning task for italian</title>
		<author>
			<persName><forename type="first">Chiara</forename><surname>Alzetta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alessio</forename><surname>Miaschi</surname></persName>
		</author>
<author>
			<persName><forename type="first">Felice</forename><surname>Dell'Orletta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Frosina</forename><surname>Koceva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ilaria</forename><surname>Torre</surname></persName>
		</author>
<editor>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Danilo</forename><surname>Croce</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Maria</forename><surname>Di Maro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Lucia</forename><forename type="middle">C</forename><surname>Passaro</surname></persName>
		</editor>
		<ptr target="CEUR.org" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop</title>
				<meeting>Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop<address><addrLine>EVALITA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020. 2020</date>
		</imprint>
	</monogr>
	<note>editors</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Evalita 2020: Overview of the 7th evaluation campaign of natural language processing and speech tools for italian</title>
<author>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Danilo</forename><surname>Croce</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Maria</forename><surname>Di Maro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lucia</forename><forename type="middle">C</forename><surname>Passaro</surname></persName>
		</author>
		<ptr target="CEUR.org" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop</title>
<editor>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Danilo</forename><surname>Croce</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Maria</forename><surname>Di Maro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Lucia</forename><forename type="middle">C</forename><surname>Passaro</surname></persName>
		</editor>
		<meeting>Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop<address><addrLine>EVALITA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020. 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Word embeddings go to italy: A comparison of models and training datasets</title>
		<author>
			<persName><forename type="first">Giacomo</forename><surname>Berardi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrea</forename><surname>Esuli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Diego</forename><surname>Marcheggiani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IIR</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Xgboost: A scalable tree boosting system</title>
		<author>
			<persName><forename type="first">Tianqi</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Carlos</forename><surname>Guestrin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD &apos;16</title>
				<meeting>the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD &apos;16<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="785" to="794" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Learning phrase representations using RNN encoder-decoder for statistical machine translation</title>
		<author>
			<persName><forename type="first">Kyunghyun</forename><surname>Cho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bart</forename><surname>Van Merriënboer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Caglar</forename><surname>Gulcehre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dzmitry</forename><surname>Bahdanau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fethi</forename><surname>Bougares</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Holger</forename><surname>Schwenk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yoshua</forename><surname>Bengio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</title>
				<meeting>the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)<address><addrLine>Doha, Qatar</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2014-10">2014. October</date>
			<biblScope unit="page" from="1724" to="1734" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">Jacob</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ming-Wei</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kenton</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kristina</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>Minneapolis, Minnesota</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2019-06">2019. June</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A network approach to topic models</title>
<author>
			<persName><forename type="first">Martin</forename><surname>Gerlach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tiago</forename><forename type="middle">P</forename><surname>Peixoto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Eduardo</forename><forename type="middle">G</forename><surname>Altmann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Science Advances</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="issue">7</biblScope>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
<title level="a" type="main">Semi-supervised classification with graph convolutional networks</title>
		<author>
			<persName><forename type="first">Thomas</forename><forename type="middle">N</forename><surname>Kipf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Max</forename><surname>Welling</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">5th International Conference on Learning Representations, ICLR 2017</title>
		<title level="s">Conference Track Proceedings</title>
		<meeting><address><addrLine>Toulon, France</addrLine></address></meeting>
		<imprint>
			<publisher>OpenReview</publisher>
			<date type="published" when="2017-04-24">2017. April 24-26, 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Measuring prerequisite relations among concepts</title>
		<author>
			<persName><forename type="first">Chen</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zhaohui</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wenyi</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">Lee</forename><surname>Giles</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</title>
				<meeting>the 2015 Conference on Empirical Methods in Natural Language Processing<address><addrLine>Lisbon, Portugal</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2015-09">2015. September</date>
			<biblScope unit="page" from="1668" to="1674" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Linguistically-driven strategy for concept prerequisites learning on italian</title>
<author>
			<persName><forename type="first">Alessio</forename><surname>Miaschi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chiara</forename><surname>Alzetta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Franco</forename><forename type="middle">Alberto</forename><surname>Cardillo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Felice</forename><surname>Dell'Orletta</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications</title>
				<meeting>the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="285" to="295" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Glove: Global vectors for word representation</title>
		<author>
			<persName><forename type="first">Jeffrey</forename><surname>Pennington</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Richard</forename><surname>Socher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">EMNLP</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="1532" to="1543" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Crowdsourced comprehension: Predicting prerequisite structure in Wikipedia</title>
		<author>
			<persName><forename type="first">Partha</forename><surname>Talukdar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">William</forename><surname>Cohen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Seventh Workshop on Building Educational Applications Using NLP</title>
				<meeting>the Seventh Workshop on Building Educational Applications Using NLP<address><addrLine>Montréal, Canada</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2012-06">2012. June</date>
			<biblScope unit="page" from="307" to="315" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
