<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Creativity Embedding: a vector to characterise and classify plausible triples in deep learning NLP models</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Isabeau</forename><surname>Oliveri</surname></persName>
							<email>isabeau.oliveri@polito.it</email>
						</author>
						<author>
							<persName><forename type="first">Luca</forename><surname>Ardito</surname></persName>
							<email>luca.ardito@polito.it</email>
						</author>
						<author>
							<persName><forename type="first">Giuseppe</forename><surname>Rizzo</surname></persName>
							<email>giuseppe.rizzo@linksfoundation.com</email>
						</author>
						<author>
							<persName><forename type="first">Maurizio</forename><surname>Morisio</surname></persName>
							<email>maurizio.morisio@polito.it</email>
						</author>
						<author>
							<affiliation key="aff0">
								<orgName type="institution">Politecnico di Torino</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="institution">Politecnico di Torino</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<orgName type="institution">Politecnico di Torino</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Creativity Embedding: a vector to characterise and classify plausible triples in deep learning NLP models</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">85A51600DFC965A8DBFDD134887271BF</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-19T15:38+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Creativity Embedding</term>
					<term>Creativity Metric</term>
					<term>NLP</term>
					<term>Creativity Evaluation</term>
					<term>Triple</term>
					<term>Knowledge Graph</term>
					<term>BERT</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper we define the creativity embedding of a text, based on four self-assessment creativity metrics, namely diversity, novelty, serendipity, and magnitude, together with knowledge graphs and neural networks. We use as basic unit the notion of a triple (head, relation, tail). We investigate whether additional information about creativity improves natural language processing tasks. In this work, we focus on the triple plausibility task, exploiting the BERT model and a sample of the WordNet11 dataset. Contrary to our hypothesis, we do not detect an increase in performance.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Current conversational agents have emerged as powerful instruments for assisting humans. Oftentimes, their cores are natural language processing (NLP) models and algorithms. However, these models are far from being exhaustive representations of reality and language dynamics: they are trained on biased data through deep learning algorithms, where the flow of information among the various layers can result in information loss <ref type="bibr" target="#b12">(Wang et al., 2015)</ref>. As a consequence, NLP techniques still find it challenging to manage conversations that they have never encountered before, reacting inefficiently to novel scenarios.</p><p>One way to mitigate these issues is the integration of structured information, of which knowledge graphs are one of the best-known representation systems. The most prominent example is the Semantic Web <ref type="bibr" target="#b0">(Berners-Lee et al., 2001)</ref>, where the information is represented through linked statements, each one composed of (head, relation, tail), forming a triple (Figure <ref type="figure" target="#fig_0">1</ref>). This semantic representation allows significant advantages such as reasoning over data and operating with heterogeneous data sources.</p><note>Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</note><p>Integration of structured information is not the only method that the literature provides to improve NLP techniques. 
Previous research has pointed out that the analysis of creativity features could improve self-assessment evaluation, with benefits for the solutions generated and for input understanding <ref type="bibr" target="#b4">(Lamb et al., 2018;</ref><ref type="bibr" target="#b2">Karampiperis et al., 2014;</ref><ref type="bibr" target="#b10">Surdeanu et al., 2008)</ref>. We specify that in this work creativity is intended as the capability to create, understand, and evaluate novel contents. The concepts of Creative AI have been discussed in their interconnections with the Semantic Web <ref type="bibr">(Ławrynowicz, 2020)</ref>, and are generalizable to knowledge graphs. <ref type="bibr" target="#b3">Kuznetsova et al. (Kuznetsova et al., 2013)</ref> define quantitative measures of creativity in lexical compositions, exploring different theories such as divergent thinking, compositional structure, and creative semantic subspaces. The crucial point is that not every novel combination is perceived as creative and useful: creativity is perceived in combinations that are unconventional, uncommon, or "expressive in an interesting, imaginative, or inspirational way".</p><p>Although the interest of the scientific community in exploring this direction is clear, little research has been conducted on creativity in the NLP field. The results and considerations of Kuznetsova and Ławrynowicz led us to investigate the possible correlations between improvements in NLP tasks and creativity, with a particular focus on self-assessment. In this paper we introduce a novel approach for supporting deep learning algorithms with a mathematical representation of the creativity features of a text. We name it creativity embedding and base it on metrics of self-assessment creativity over a knowledge graph.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Approach</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Self-assessment creativity metrics</head><p>When humans face a problem they have never encountered before, they usually perform a self-assessment procedure with respect to their previous knowledge and the context, generally voting for the best solution. Following the example reported in Figure <ref type="figure" target="#fig_2">2</ref>, we can imagine that a person has to describe the colour of a grey desk. He does not recall the name of the colour at that moment, and performs a creative process: he uses a metaphor to describe the grey colour of the desk, referring to the stereotypical colour of a "mouse". This metaphor is widely accepted, and the colour would ideally be understood by the interlocutor. If in place of "mouse" the random term "mask" were used, the meaning would probably not be received unless a particular context or knowledge were shared between the person and the interlocutor, resulting in an ineffective creative process. To emulate this self-assessment procedure, we propose metrics inspired by the literature on related concepts, such as recommender systems <ref type="bibr" target="#b7">(Monti et al., 2019)</ref> and machine learning <ref type="bibr" target="#b9">(Pimentel et al., 2014;</ref><ref type="bibr" target="#b9">Ruan et al., 2020)</ref>. The knowledge is represented by a graph of items interconnected by their relations (triples).</p><p>We define four metrics, namely diversity (1), novelty (2), serendipity (3), and magnitude (4). In these metrics we make use of a similarity function. In fact, to define the similarity (or, from another angle, the diversity) between two or more items, we need a method and a representation that allow us to define a distance between them. In the literature, there is no fixed notion of similarity. However, a common strategy for texts is to transform words and sentences into vectors, taking into account and preserving their distributional properties and connections. 
Subsequently, mathematical distance functions are applied. Under these conditions, the similarity function defines a semantic similarity between two items (words or sentences). For clarity, we anticipate that in our experiments we use the cosine similarity function and BERT vectors (embeddings) as word representations, as discussed in the following sections. Nevertheless, the metrics thus defined can be computed with different item vector representations and similarity functions, as long as the similarity function has output domain [0,1], with high values for high similarity.</p><p>Diversity (1) represents the semantic diversity between the head h T and the tail t T of the triple T . This information tells how far these two elements are from being semantically close. It can be considered the internal semantic diversity of T .</p><formula xml:id="formula_0">div(T ) = 1 − similarity(h T , t T ) (1)</formula><p>Novelty (2) of a triple T is its average semantic diversity with respect to the other triples in the context.</p><p>The context C is the sub-graph of triples obtained by traversing the paths of length p in the knowledge graph, starting from the head h T of the triple under examination and collecting the n nearest triples. Novelty can be considered the external semantic diversity of T with respect to the retrieved context C.</p><formula xml:id="formula_1">nov(T ) = 1 n n i=1 1 − similarity(T, C i ) (2)</formula><p>Serendipity (3) is here intended as the semantic novelty of the triple T , taking into account the s most novel triples of the knowledge graph (the refined context S). It can be considered the novelty relevance of T .</p><formula xml:id="formula_2">ser(T ) = 1 s s i=1 1 − similarity(T, S i )<label>(3)</label></formula><p>Magnitude (4) outlines the rarity of the triple, ranking (rk) each component of the triple by the number of its occurrences over the total number of items in the knowledge graph. 
The ranking function thus defined has an output domain [0,1].</p><formula xml:id="formula_3">mag(T ) = rk(h T ) + rk(rel T ) + rk(t T ) 3 (4)</formula></div>
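As an illustration, the four metrics above can be sketched in Python. This is a minimal sketch under our own assumptions: a cosine similarity rescaled to [0,1] (the paper only requires a similarity with output domain [0,1] and high values for similar items), and generic NumPy vectors standing in for the BERT embeddings of triples and their components.

```python
import numpy as np

def similarity(a, b):
    # Cosine similarity rescaled to [0, 1] (an illustrative choice; the paper
    # only requires output domain [0, 1] with high values for similar items).
    cos = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return (cos + 1.0) / 2.0

def diversity(h_vec, t_vec):
    # (1) internal semantic diversity between head and tail of a triple
    return 1.0 - similarity(h_vec, t_vec)

def novelty(t_vec, context_vecs):
    # (2) average semantic diversity of the triple w.r.t. the n context triples C
    return float(np.mean([1.0 - similarity(t_vec, c) for c in context_vecs]))

def serendipity(t_vec, context_vecs, s):
    # (3) novelty restricted to the s most novel triples (refined context S)
    dists = sorted((1.0 - similarity(t_vec, c) for c in context_vecs), reverse=True)
    return float(np.mean(dists[:s]))

def magnitude(head_count, rel_count, tail_count, total_items):
    # (4) average rarity rank; rk(x) = occurrences(x) / total items, in [0, 1]
    def rk(count):
        return count / total_items
    return (rk(head_count) + rk(rel_count) + rk(tail_count)) / 3.0
```

With orthogonal vectors the rescaled similarity is 0.5, so the diversity of an orthogonal head/tail pair is 0.5, while identical vectors give diversity 0.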
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Creativity Embedding</head><p>There were no annotated datasets on the creativity characteristics of interest. For this reason, a direct comparison with a ground truth was hampered.</p><p>To overcome this obstacle, we indirectly measured the effectiveness of this approach by applying it to an external model and judging the results on the triple plausibility task <ref type="bibr" target="#b14">(Yao et al., 2019;</ref><ref type="bibr" target="#b13">Wang et al., 2018;</ref><ref type="bibr" target="#b12">Wang et al., 2015;</ref><ref type="bibr" target="#b8">Padó et al., 2009)</ref>. The triple plausibility task consists of classifying a dataset's triples into plausible or not plausible classes, comparing the result with the ground truth. We chose this task to perform an indirect evaluation of our proposal, relying on the correlation between plausibility and creativity <ref type="bibr" target="#b4">(Lamb et al., 2018)</ref>, as plausibility can represent a positive outcome of an effective creative process. The current trend in machine learning and natural language processing models pushes the use of mathematical representations of meaningful information through vectors, commonly known in this field as embeddings. For these reasons, we outline and train a neural network on the computed ground truth to predict the creativity values, and define as creativity embedding the weights of its last hidden layer. This creativity embedding can be added to other models and adapted in its dimension. Given the above concepts, we define the following research question.</p><p>Research Question: Can a creativity embedding extracted from the creativity neural network improve triple plausibility classification in deep learning models?</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Model Architecture</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">BERT</head><p>We select Bidirectional Encoder Representations from Transformers (BERT) <ref type="bibr" target="#b1">(Devlin et al., 2019)</ref> as the model for investigating the effects of the creativity embedding, due to its flexibility and modularity, as well as its being state of the art for various NLP tasks. The BERT model can be divided into three main parts: preprocessing of the input, a stack of Transformer layers, and further layers on top to perform a particular task, typically a classifier. The stack of Transformers forms the BERT core. A Transformer exploits the attention mechanism to learn the contextual relationships among the input sentences and words. The input is not considered in a single direction but, figuratively, in all directions at once, defining the context of a word by considering all its surrounding words. The model is trained with a sort of game, where some words or entire sentences are masked and the model has to predict them. We do not modify the core of the model; we are more interested in the preprocessing part, where we inject the creativity embedding, as explained in the next section.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Creativity Neural Network and Creativity CLS Embedding</head><p>The outline of the architecture proposed for the task is shown in Figure <ref type="figure" target="#fig_3">3</ref>. In the lower part, the triple flows through the BERT model. We used a modified version of the tokenization technique of Knowledge Graph BERT (KG-BERT) <ref type="bibr" target="#b14">(Yao et al., 2019)</ref>. The corresponding token identifiers and embeddings are retrieved through two lookup tables provided by the BERT model. At the top of Figure <ref type="figure" target="#fig_3">3</ref>, we show our creativity neural network. A compact, fixed-size version of the embeddings is obtained from BERT by summing the embeddings of each component of the triple. This compact version feeds the proposed neural network, which is in charge of predicting the four creativity values and producing the creativity embedding. The neural network consists of an input layer (768 * 3 neurons), an output layer (4 neurons), and 4 fully connected hidden layers with dropout probability = 0.5. The activation function used is ReLU. This neural network structure is deliberately basic, since its main requirement is a flexible last hidden layer adaptable to the technology that leverages the creativity embedding. The CLS token is one of the most representative tokens for performing classification and other types of predictions. This led us to exploit the CLS token to add the creativity embedding of the triple, providing the model with a non-empty CLS: the Creativity CLS Embedding. In this case, the penultimate layer has 768 neurons, the same size as the BERT embeddings. On top of the architecture, a linear classifier is in charge of the predictions for the plausibility task, relying on the Creativity CLS Embedding.</p></div>
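A minimal PyTorch sketch of the network just described. The text fixes only the input size (768 * 3), the penultimate layer (768, matching the BERT embedding size), the output (4 creativity values), ReLU activations, and dropout 0.5; the widths of the intermediate hidden layers here are our own assumption.

```python
import torch
import torch.nn as nn

class CreativityNet(nn.Module):
    def __init__(self, bert_dim=768, hidden=1024):
        super().__init__()
        # 4 fully connected hidden layers with ReLU and dropout 0.5; the last
        # hidden layer has bert_dim units so its activations can be added to
        # BERT's CLS embedding. The `hidden` width is an illustrative assumption.
        self.backbone = nn.Sequential(
            nn.Linear(bert_dim * 3, hidden), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(hidden, bert_dim), nn.ReLU(), nn.Dropout(0.5),
        )
        # 4 outputs: diversity, novelty, serendipity, magnitude
        self.head = nn.Linear(bert_dim, 4)

    def forward(self, x):
        emb = self.backbone(x)   # creativity embedding (size bert_dim)
        return self.head(emb), emb

# Injection step: the creativity embedding is added to BERT's CLS embedding,
# yielding the Creativity CLS Embedding, e.g.:
#   creativity_cls = bert_cls_embedding + creativity_embedding
```

The forward pass returns both the four predicted metric values and the penultimate-layer activations, so the same network serves regression training and embedding extraction.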
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experiment</head><p>In this experiment we randomly sample triples from the WordNet11 <ref type="bibr" target="#b6">(Miller, 1995)</ref> dataset (50000 train, 5000 validation, 3000 test, with positive and negative labels balanced).</p><p>Creativity Neural Network. As stated in the previous sections, we compute the four metrics on each triple of the dataset to create the ground truth. As similarity function we use cosine similarity, which here returns a value between 0 and 1, with high values for high similarity. We applied the cosine similarity function after transforming words and sentences into embeddings provided by the BERT model. We encountered slowdowns only with the novelty metric: the number of nodes is not predictable a priori in our setting, and the mathematical nature of the formula is sensitive to a high number of nodes. Peaks of memory allocation could occur, as well as long computation times. We limited the failures due to out-of-memory errors or timeouts of the scheduled jobs by applying the divide et impera paradigm and other adjustments. The length of the path p, seen as the recursion depth, is fixed at 5. For each node involved in the recursion, the maximum number of neighbour nodes n considered is fixed at 20. Once we obtained all the metric values, we trained the Creativity Neural Network as a regression problem. We use mean squared error as loss criterion; AdamW as optimizer, with learning rate = 0.001, betas = (0.9, 0.999), epsilon = 1e −08 , weight decay = 0.01; StepLR as scheduler, with step size = 10 and gamma = 0.1; and we train the model for 10 epochs with a batch size of 512. To evaluate performance on the test set we compute variance score = −0.4493, mean absolute error = 0.1733, mean squared error = 0.0388, and R2 score = −6.7694. Despite the small mean squared and absolute errors, the R2 score tells us that the model does not approximate the distribution better than the "best-fit" line. 
This is probably due to the low entropy of the input metric values, which, upon inspection, turn out to cluster around the value 0.5.</p><p>Triple Plausibility Task. The tokenized triple is fed to the Creativity Neural Network, obtaining the creativity embedding. This is added to the CLS embedding token, and the triple flows through the Transformer stack. The BERT model is then used to make predictions and address the triple plausibility task, putting a linear classifier on top of the Transformer stack. We use binary cross-entropy as loss function. The literature suggests few epochs and samples for the fine-tuning process. We fine-tune BERT for 2 epochs; afterwards we freeze the weights of the model, training only the classifier layer for 3 epochs. We select BERT base uncased as the baseline model; AdamW as optimizer, with learning rate = 5e −05 ; a linear scheduler with warm-up proportion = 10%; and, for the classifier, dropout probability = 0.5. We fix the maximum sequence length at 100 tokens, as no triple exceeds this number of tokens after tokenization.</p></div>
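The regression setup for the Creativity Neural Network can be summarised in PyTorch as follows. The loss, optimizer, and scheduler hyperparameters are those reported above; the model here is a stand-in linear layer and the data is synthetic, purely to make the sketch self-contained.

```python
import torch
import torch.nn as nn

# Stand-in for the creativity network (768*3 inputs, 4 metric outputs).
model = nn.Linear(768 * 3, 4)
criterion = nn.MSELoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=0.001,
                              betas=(0.9, 0.999), eps=1e-08, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

# Synthetic batch (batch size 512, as in the experiment); the target
# creativity metrics lie in [0, 1].
x = torch.randn(512, 768 * 3)
y = torch.rand(512, 4)
for epoch in range(2):  # the paper trains for 10 epochs
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()
```

With step size 10 and gamma 0.1, the learning rate is multiplied by 0.1 every 10 epochs, so in a 10-epoch run it decays exactly once, at the end.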
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Result and Conclusion</head><p>In this paper we investigated whether the defined creativity embedding improves the triple plausibility task, exploiting the BERT model. We do not detect an increase in performance (Table <ref type="table" target="#tab_1">1</ref>) when comparing ourselves to the KG-BERT results. In this comparison we should point out that the sample used is one fifth of the complete WN11 dataset. This result is somewhat contrary to our expectations, as the creativity embeddings represent, in some way, a priori information. A possible explanation might be the learning methodology of the creativity embedding: we suppose that a significant loss of information occurred in the process. Further research might explore other types of embeddings (Grohe, 2020), such as graph2vec, and different integrations of the proposed metrics. Future experimental investigations may try different parameter configurations; for example, the number of nodes considered could intuitively change the values of metrics such as novelty. Moreover, a more in-depth analysis of the dataset used, the corresponding knowledge graph, and the data correlations could provide additional insights. In future work, we will consider different combinations of the defined metrics to train the creativity neural network. It is possible that some metrics are more relevant for the task than others. Selecting only the strictly relevant metrics would lighten the computational effort and give us information about the correlations between metrics and results. To conclude, we aim to bring the NLP community's attention to new research topics on creativity.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: The triple (Douglas Adams, educated at, St John's College), from the Wikidata knowledge base <ref type="bibr" target="#b11">(Vrandečić and Krötzsch, 2014)</ref>, is an example of a statement.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2:</head><label>2</label><figDesc>Figure 2: A person produces different solutions to answer a question. He then performs a self-assessment procedure, taking into account several parameters p based on his knowledge and the context. Finally, he chooses the possible best solution. Parameters are expressed as numbers, for simplicity.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3:</head><label>3</label><figDesc>Figure 3: For each triple, the Creativity Embedding computed by the Creativity Neural Network is added to the BERT CLS embedding, defining the Creativity CLS Embedding. A linear classifier on top performs the triple plausibility classification.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1:</head><label>1</label><figDesc>Table 1: Triple plausibility experiment results.</figDesc><table><row><cell>Model</cell><cell>Train</cell><cell>Val</cell><cell>Test</cell><cell>Accuracy</cell><cell>Recall</cell><cell>Precision</cell><cell>F1</cell></row><row><cell>CE+BERT</cell><cell>50000</cell><cell>3000</cell><cell>5000</cell><cell>0.5093</cell><cell>0.8510</cell><cell>0.5102</cell><cell>0.6379</cell></row><row><cell>KG-BERT</cell><cell>225162</cell><cell>5218</cell><cell>21088</cell><cell>0.9334</cell><cell>0.9345</cell><cell>0.9324</cell><cell>0.9334</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>Computational resources provided by HPC@POLITO, which is a project of Academic Computing within the Department of Control and Computer Engineering at the Politecnico di Torino 2 . We thank the reviewers from CLiC-it 2020 conference for the comments and advices.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">The semantic web</title>
		<author>
			<persName><forename type="first">Tim</forename><surname>Berners-Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">James</forename><surname>Hendler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ora</forename><surname>Lassila</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Scientific american</title>
		<imprint>
			<biblScope unit="volume">284</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="34" to="43" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><forename type="first">Jacob</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ming-Wei</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kenton</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kristina</forename><surname>Toutanova</surname></persName>
		</author>
		<title level="m">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
		<meeting>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Word2vec, node2vec, graph2vec, x2vec: Towards a theory of vector embeddings of structured data</title>
		<author>
			<persName><forename type="first">Martin</forename><surname>Grohe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS&apos;20</title>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1" to="16" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Towards machines for measuring creativity: The use of computational tools in storytelling activities</title>
		<author>
			<persName><forename type="first">P</forename><surname>Karampiperis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Koukourikos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Koliopoulou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 14th International Conference on Advanced Learning Technologies</title>
				<imprint>
			<date type="published" when="2014">2014. 2014</date>
			<biblScope unit="page" from="508" to="512" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Understanding and quantifying creativity in lexical composition</title>
		<author>
			<persName><forename type="first">Polina</forename><surname>Kuznetsova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jianfu</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yejin</forename><surname>Choi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing</title>
				<meeting>the 2013 Conference on Empirical Methods in Natural Language Processing<address><addrLine>Seattle, Washington, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2013-10">2013. October</date>
			<biblScope unit="page" from="1246" to="1258" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Evaluating computational creativity: An interdisciplinary tutorial</title>
		<author>
			<persName><forename type="first">Carolyn</forename><surname>Lamb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daniel</forename><forename type="middle">G</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Charles</forename><forename type="middle">L A</forename><surname>Clarke</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<biblScope unit="issue">2</biblScope>
			<date type="published" when="2018-02">2018. February</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Creative ai: A new avenue for the semantic web?</title>
		<author>
			<persName><forename type="first">Agnieszka</forename><surname>Ławrynowicz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Semantic Web</title>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="69" to="78" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Wordnet: a lexical database for english</title>
		<author>
			<persName><forename type="first">George</forename><forename type="middle">A</forename><surname>Miller</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Communications of the ACM</title>
		<imprint>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page" from="39" to="41" />
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Sequeval: An offline evaluation framework for sequence-based recommender systems</title>
		<author>
			<persName><forename type="first">Diego</forename><surname>Monti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Enrico</forename><surname>Palumbo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Giuseppe</forename><surname>Rizzo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Maurizio</forename><surname>Morisio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page">174</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A probabilistic model of semantic plausibility in sentence processing</title>
		<author>
			<persName><forename type="first">Ulrike</forename><surname>Padó</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Matthew</forename><forename type="middle">W</forename><surname>Crocker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Frank</forename><surname>Keller</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Cognitive Science</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="794" to="838" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A review of novelty detection</title>
		<author>
			<persName><forename type="first">Marco</forename><forename type="middle">A F</forename><surname>Pimentel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">A</forename><surname>Clifton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lei</forename><surname>Clifton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lionel</forename><surname>Tarassenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Signal Processing</title>
		<imprint>
			<biblScope unit="volume">99</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Generating diverse conversation responses by creating and ranking multiple candidates</title>
		<author>
			<persName><forename type="first">Yu-Ping</forename><surname>Ruan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zhen-Hua</forename><surname>Ling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiaodan</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Quan</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jia-Chen</forename><surname>Gu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Speech Language</title>
		<imprint>
			<biblScope unit="page">101071</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Learning to rank answers on large online qa collections</title>
		<author>
			<persName><forename type="first">Mihai</forename><surname>Surdeanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Massimiliano</forename><surname>Ciaramita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hugo</forename><surname>Zaragoza</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of ACL-08: HLT</title>
				<meeting>ACL-08: HLT</meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="719" to="727" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Wikidata: A free collaborative knowledgebase</title>
		<author>
			<persName><forename type="first">Denny</forename><surname>Vrandečić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Markus</forename><surname>Krötzsch</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Commun. ACM</title>
		<imprint>
			<biblScope unit="volume">57</biblScope>
			<biblScope unit="issue">10</biblScope>
			<biblScope unit="page" from="78" to="85" />
			<date type="published" when="2014-09">2014. September</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Knowledge base completion using embeddings and rules</title>
		<author>
			<persName><forename type="first">Quan</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bin</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Li</forename><surname>Guo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IJCAI&apos;15</title>
				<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1859" to="1865" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Modeling semantic plausibility by injecting world knowledge</title>
		<author>
			<persName><forename type="first">Su</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Greg</forename><surname>Durrett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Katrin</forename><surname>Erk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>New Orleans, Louisiana</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2018-06">2018. June</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="303" to="308" />
		</imprint>
	</monogr>
	<note>Short Papers</note>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">Liang</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chengsheng</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yuan</forename><surname>Luo</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1909.03193</idno>
		<title level="m">Kg-bert: Bert for knowledge graph completion</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
