<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Using Word Embeddings for Recommending Datasets based on Scientific Publications</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Narges</forename><surname>Tavakolpoursaleh</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">GESIS - Leibniz Institute for the Social Sciences</orgName>
								<address>
									<settlement>Cologne</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Johann</forename><surname>Schaible</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">GESIS - Leibniz Institute for the Social Sciences</orgName>
								<address>
									<settlement>Cologne</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Stefan</forename><surname>Dietze</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">GESIS - Leibniz Institute for the Social Sciences</orgName>
								<address>
									<settlement>Cologne</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Institute for Computer Science</orgName>
								<orgName type="institution">Heinrich-Heine University Duesseldorf</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Using Word Embeddings for Recommending Datasets based on Scientific Publications</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">3C9FC8F0E91FC27E31A7BC0BBB2C09BB</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T18:29+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Dataset Retrieval and Recommendations</term>
					<term>Cross-Domain Recommendations</term>
					<term>Word Embeddings</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In scholarly search systems, computing recommendations of the same type, for example, additional publications when reading a particular publication, is a well-studied problem. However, suggesting items of another type, e.g., research data when reading a publication, is rarely covered in scholarly recommendations. In this position paper, we employ word embeddings to approach the problem of such cross-domain recommendations in scientific search systems, more specifically, recommending research data based on publications. Besides various other metadata, publication and research dataset entries comprise textual metadata (e.g., title, abstract), which makes it possible to detect similar entries using word embeddings. We illustrate first results, major problems, and possible solutions when using word embeddings for recommending datasets based on publications.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>In digital libraries, such as arXiv, a scientific search system typically aids users in finding literature covering a topic of interest <ref type="bibr" target="#b16">[17]</ref>. To further help users find appropriate literature, scientific search systems may also comprise recommender systems, which suggest items (sometimes previously unknown items) that are most likely of interest to a user <ref type="bibr" target="#b14">[15]</ref>. One prominent use case in scholarly recommendations is suggesting literature that is similar to a publication the user is currently viewing. This amounts to recommending items of the same type (i.e., the domain of the item) and is a well-studied problem in scientific search systems. However, there is another important use case that exploits recommendations across different types, i.e., cross-domain recommendations. We focus on the following prominent and increasingly common example, which is rarely covered in scientific search systems: recommending research data when viewing a scientific publication.</p><p>Some information systems provide a search over various types of information in a given field of interest. For example, besides publications, the GESIS-wide Search (GWS) comprises research datasets, questions as well as variables, and further information in the field of Social Sciences. Such entries of different types enable scholarly recommendation systems to provide the desired cross-domain suggestions.</p><p>Why is retrieving research data important? Research data is an important facilitator of scientific progress. Making it publicly available is crucial for enabling open science, i.e., for replicating and/or reproducing research outcomes as well as validating newly developed methods and insights <ref type="bibr" target="#b2">[3]</ref>. 
When research data is archived in digital form, the problem of how to retrieve it is mainly addressed by dataset search and retrieval. Typically, retrieval systems return relevant datasets for explicitly formulated user queries <ref type="bibr" target="#b8">[9]</ref>. Recommendations can make it even easier to find suitable research data. However, most recommendation approaches target dataset interlinking, using semantic technologies to match datasets with other datasets that overlap in their content. Applying this approach to recommending datasets based on publications can be problematic, as publications might use general datasets, e.g., statistics on a country's demographics, for rather specific topics, e.g., mobility of youth towards large cities. Utilizing the content descriptions, such as the abstracts of publications and datasets, is likely to be more promising, as both might contain the information needed to detect similarities.</p><p>In this paper, we present our ongoing work on using word embeddings for research dataset recommendations in the GESIS-wide Search based on the scientific publication that a user is currently viewing. Word embeddings seem promising for detecting appropriate recommendations based on the textual metadata of both a publication and a dataset. We focus on the specific use case in which we define a recommended dataset as relevant if that dataset has been the subject of the publication, i.e., the publication cites that dataset. The main task of the recommender is thereby defined as follows: the recommended dataset should/could be used and/or cited if the user intends to build her research upon the currently viewed publication. We illustrate that our word embedding model, unfortunately, does not achieve promising results, discuss possible reasons, and give a first outlook on possible ways to improve the recommendations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>Whereas finding information in scientific search systems that satisfies a user's information need is a well-elaborated topic in classical information retrieval <ref type="bibr" target="#b16">[17,</ref><ref type="bibr" target="#b15">16]</ref>, specifically targeting the retrieval of research data is still a growing field <ref type="bibr" target="#b5">[6]</ref>. To this day, most research data repositories use the same approaches to retrieve research data as for publications, since only a few studies (including user behavior studies) describe approaches that seem to be more promising than the established document retrieval methods <ref type="bibr" target="#b5">[6]</ref>. For recommender systems in scientific search systems, according to <ref type="bibr" target="#b0">[1]</ref>, content-based filtering is the most common recommendation approach (55%), followed by collaborative filtering (18%) and graph-based recommendation approaches (16%), while the remaining recommender systems use hybrid approaches. A major reason for this is that collaborative filtering requires a large collection and investigation of user profiles, and graph-based approaches require a well-designed knowledge graph describing and linking the data in a repository <ref type="bibr" target="#b7">[8]</ref>. Content-based approaches merely use the entries' metadata, which is rather rich, especially in digital libraries.</p><p>Prior works on the general problem of dataset recommendation focus on particular scenarios, for instance, recommendation of datasets for interlinking (dataset-dataset recommendation). Ellefi et al. <ref type="bibr" target="#b4">[5]</ref> use clustering and established schema-matching metrics to recommend datasets with overlapping schemata, i.e., overlapping content. Lopes et al. 
<ref type="bibr" target="#b11">[12]</ref> consider the link graph among datasets to recommend datasets which link to the same or similar resources. Given the lack of reliable and exhaustive metadata for research datasets, prior work in the field of dataset retrieval and dataset recommendation relies on techniques for dataset profiling <ref type="bibr" target="#b1">[2]</ref>, for instance, in order to extract and represent dataset metadata capturing various dimensions of relevance. Thus, we restrict ourselves to first utilizing only the textual metadata of publications and research datasets.</p><p>Word embedding techniques like Latent Semantic Indexing or word2vec can be utilized to capture the contents' metadata and provide semantics to the content <ref type="bibr" target="#b12">[13]</ref>. Recent works on unsupervised representation learning aim to embed context in order to predict the words in a sentence <ref type="bibr" target="#b9">[10]</ref> or the nodes in a graph <ref type="bibr" target="#b13">[14]</ref>. Learning vector space representations of words has facilitated obtaining distributional semantics of words <ref type="bibr" target="#b9">[10]</ref> and has been shown to perform well in many natural language processing tasks that require understanding the word context <ref type="bibr" target="#b10">[11]</ref>. Determining the semantic similarity between items is also a related problem in the application of recommending datasets based on publications. Therefore, as a first experiment, we applied Le and Mikolov's Doc2Vec <ref type="bibr" target="#b9">[10]</ref>, which is an extension of Word2Vec for learning document-level embeddings.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Data and Approach</head><p>GESIS-wide Search: In this paper, we exploit the contents of the integrated search system GESIS-wide Search <ref type="bibr" target="#b6">[7]</ref> for recommending datasets based on publications that a user is currently viewing. The GESIS-wide Search comprises publications (ca. 95k), research data (ca. 84k), questions and variables (ca. 12.7k), as well as instruments and tools (370) in the field of Social Sciences, and thus allows for such cross-domain recommendations between these four types of data. The publications are mostly in English and German and are annotated with further textual metadata such as title, abstract, topic, persons, and others. Metadata on research data comprises (among others) a title, topics, datatype, abstract, collection method, universe, primary investigator, as well as contributor in English and/or German.</p><p>Recommendation Task: When recommending items to a user, the following question arises: what is the general task of the recommendation? That is, is the recommended item supposed to complement, be as similar as possible to, or even contradict the item viewed or downloaded by the user? Additionally, other parameters of a recommendation, such as novelty and the impact on the domain, can be quite important for satisfying the users' information needs. In scientific search systems, all these dimensions might play a role when defining a relevant recommendation. However, this also makes it quite difficult to design and evaluate (cross-domain) recommendations, as with all these parameters, there are different definitions of the relevance and/or the usefulness of a recommended item. In the task of recommending datasets based on a publication, it might be desirable to recommend datasets which support the publication, complement the publication's findings, are cited in the paper, or are related in some other way, e.g., by topic, domain, or temporal and geographical coverage. For our prototype, we focus on a first simple use case. Datasets that are cited by a publication the user is currently viewing are considered relevant, i.e., they form the ground truth. This corresponds to the following scenario: which datasets should/could be used and/or cited if the user intends to build her research upon the currently viewed publication.</p><figure><head>Fig. 1:</head><label>1</label><figDesc>Number of English (en) and German (de) words in title, topic and abstract in metadata of publications and datasets</figDesc></figure><p>Word Embeddings: Le and Mikolov <ref type="bibr" target="#b9">[10]</ref> introduced an unsupervised algorithm that learns vector representations from texts of different lengths in order to predict the surrounding words in a sample from a paragraph. This Paragraph Vector framework (Doc2Vec) maps every paragraph to a unique vector and concatenates it with word vectors in order to predict the next word in a context. We used this framework for representing the context of research datasets and publications in a vector space. Subsequently, we computed the distances between the dataset representations and the representations of publications. Finally, we measured the semantic similarities and provided a list of recommendations ranked from most to least similar.</p><p>In more detail, datasets and publications in the GESIS-wide Search are described with several general textual metadata fields like title, abstract, author or investigator, topic, and other type-specific metadata. We decided to utilize only titles and abstracts because, first, both types have these fields and, second, they focus on the main topic of the content. We concatenated the title and abstract of all items, i.e., we did not distinguish between dataset title/abstract and publication title/abstract but rather put them together, and trained the Doc2Vec model. Fig. 
<ref type="figure">1</ref> shows the number of words in titles, topics, and abstracts in publications and datasets. We set up a 300-dimensional vector space with a window size of five for German and English words and built the vocabulary of the entire corpus (177k items). For computing the similarity between datasets and publications, we compared the paragraph vectors in the vector space of items. We trained two models, one for English and one for German words. Subsequently, we measured the similarities between 98k items (62k datasets and 36k publications with English metadata) using the English language model, and 78k items (20k datasets and 57,884 publications with German metadata) using the German language model. </p></div>
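<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ranking step described above can be illustrated with a minimal sketch. It is not the implementation used in our experiments: it assumes that paragraph vectors have already been learned, it assumes cosine similarity as the similarity measure (the text above does not name a specific measure), and the function names, identifiers, and three-dimensional toy vectors are hypothetical.</p>
<p><code lang="python"><![CDATA[
```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_datasets(publication_vector, dataset_vectors):
    """Return dataset ids ordered from most to least similar to the publication."""
    scored = [
        (cosine_similarity(publication_vector, vector), dataset_id)
        for dataset_id, vector in dataset_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [dataset_id for _, dataset_id in scored]


# Hypothetical 3-dimensional vectors; the setup in the text uses 300 dimensions.
publication = [1.0, 0.0, 1.0]
datasets = {
    "dataset_a": [0.9, 0.1, 1.1],
    "dataset_b": [0.0, 1.0, 0.0],
    "dataset_c": [1.0, 1.0, 0.0],
}
ranking = rank_datasets(publication, datasets)
print(ranking)
```
]]></code></p>
<p>In this toy example, the dataset whose vector points in nearly the same direction as the publication vector is ranked first, and the orthogonal one last. In a real setting, the vectors would come from the trained English or German model, and each publication would only be compared with datasets in the same language.</p></div>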
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Preliminary Results and Discussion</head><p>As mentioned, our recommendation task considers a dataset cited in a publication as relevant for the user who is currently viewing this publication. Thus, we computed the "related dataset" links of publications in GWS and considered them as relevant for dataset recommendation. As an example of those links, the dataset with the title "Role of Government - ISSP 1985"<ref type="foot" target="#foot_0">5</ref> is cited by the publication entitled "Police powers"<ref type="foot" target="#foot_1">6</ref>. Table <ref type="table" target="#tab_0">1</ref> summarizes our corpus as well as the number of correctly retrieved datasets. When retrieving the recommendable datasets for each publication, we observed the rank of used/cited datasets in the retrieval results. The outcome was not as we expected, since we could retrieve only 5.82% (i.e., only 1,294 out of 22,201) of all used/cited datasets in the first 1,000 results (and only 327 in the top-10).</p><p>Using only the abstract and the title of publications and datasets, we found that it is difficult to retrieve datasets which are utilized in publications. This can have various reasons, such as the insufficient number of words in the title and abstract and the lack of consideration of other, potentially useful information, such as publication dates or the dataset citation context as a representation of a dataset. In general, the amount of metadata per record in the GWS corpus varies considerably. Some records have rich metadata whereas others are poorly described, e.g., restricted to a short title and author names. Additionally, a considerable number of datasets did not even have an abstract describing the dataset, but rather some keywords and bullet points in place of an abstract. Also, training over the mix of publications and datasets might cause a problem. 
One possible solution would be to train embeddings for datasets and documentations separately. Another possibility to improve the results is to include a publication's abstract in the descriptions of the datasets that are cited by this publication. Moreover, as mentioned before, the actual relevance of a recommendation is difficult to assess, which indicates that offline evaluations might be inappropriate in recommendation scenarios, as they are limited in representing the users' interests. This means that a dataset retrieved at a high rank could still be semantically relevant to the currently viewed publication although it is not used/cited in the publication.</p><p>In the next steps, we intend to improve our model by using a pre-trained vector space in which the representations of known words are already determined, or to refine the model by assigning a weight to each word (e.g., a simple TF-IDF or attention layer). Additionally, one can represent the GWS datasets, publications, and their relationships within a graph. This can serve many applications such as node recommendation and link prediction <ref type="bibr" target="#b17">[18]</ref>. Considering more metadata for each item, such as authors or publication years, can also improve the results. 
Finally, we intend to perform an online evaluation of our approaches using a Living Lab <ref type="bibr" target="#b3">[4]</ref> and compare them to the default "more-like-this" baseline that SOLR offers out of the box by analyzing click-through rates and similar measures.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Publication-Dataset connections statistics in GWS corpus</figDesc><table><row><cell>Corpus</cell><cell>Statistic</cell></row><row><cell>9,373</cell><cell># publications citing research data</cell></row><row><cell>2,823</cell><cell># unique datasets</cell></row><row><cell>22,201</cell><cell>total # of dataset citations</cell></row><row><cell>2.368</cell><cell>avg. citations per publication</cell></row><row><cell cols="2">Retrieved top-1000 similar items (publication + dataset)</cell></row><row><cell>1,294 (5.82%)</cell><cell>total # of relevant datasets retrieved</cell></row><row><cell>327</cell><cell># of relevant datasets retrieved @10</cell></row></table></figure>
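<div xmlns="http://www.tei-c.org/ns/1.0"><p>The retrieval statistics in Table 1 count how many cited datasets appear among the top-k results of the publications citing them. The following minimal sketch shows one way such a count could be computed. It is an illustration under assumptions rather than our evaluation code: the function names and the toy rankings and citation links are hypothetical.</p>
<p><code lang="python"><![CDATA[
```python
def count_hits_at_k(rankings, cited, k):
    """Count cited datasets that appear within the top-k results of their publication."""
    hits = 0
    for publication_id, ranked_ids in rankings.items():
        top_k = set(ranked_ids[:k])
        hits += len(top_k.intersection(cited.get(publication_id, set())))
    return hits


def share_retrieved(hits, total_citations):
    """Percentage of all dataset citations that were retrieved."""
    return 100.0 * hits / total_citations


# Hypothetical toy data: two publications with three dataset citations in total.
rankings = {
    "pub_1": ["ds_3", "ds_1", "ds_7", "ds_2"],
    "pub_2": ["ds_5", "ds_4", "ds_9", "ds_8"],
}
cited = {"pub_1": {"ds_1", "ds_2"}, "pub_2": {"ds_8"}}
total = sum(len(ids) for ids in cited.values())

hits_at_2 = count_hits_at_k(rankings, cited, 2)
hits_at_4 = count_hits_at_k(rankings, cited, 4)
print(hits_at_2, hits_at_4, share_retrieved(hits_at_4, total))
```
]]></code></p>
<p>With k set to 10 and 1,000 and the ranked lists produced by the similarity computation, the same counting scheme would yield figures comparable to those reported in Table 1.</p></div>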
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_0">https://search.gesis.org/research_data/ZA1490</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_1">https://search.gesis.org/publication/gesis-bib-24288</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Paper recommender systems: a literature survey</title>
		<author>
			<persName><forename type="first">J</forename><surname>Beel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Gipp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Langer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Breitinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal on Digital Libraries</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="305" to="338" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">RDF Dataset Profiling -a Survey of Features, Methods, Vocabularies and Applications</title>
		<author>
			<persName><forename type="first">M</forename><surname>Ben Ellefi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Bellahsene</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>John</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Demidova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dietze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Szymanski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Todorov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Semantic Web Journal</title>
		<imprint>
			<date type="published" when="2017-08">August 2017</date>
		</imprint>
	</monogr>
	<note>Accepted in August 2017, to appear</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The conundrum of sharing research data</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Borgman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Society for Information Science and Technology</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="1059" to="1078" />
			<date>jun</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">STELLA: Towards a Framework for the Reproducibility of Online Search Experiments</title>
		<author>
			<persName><forename type="first">T</forename><surname>Breuer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Schaer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tavakolpoursaleh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schaible</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Wolff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mueller</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Open-Source IR Replicability Challenge (OSIRRC) (accepted)</title>
				<meeting>the Open-Source IR Replicability Challenge (OSIRRC) (accepted)</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Dataset recommendation for data linking: An intensional approach</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">B</forename><surname>Ellefi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Bellahsene</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dietze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Todorov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ESWC. Springer</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Searching data: A review of observational data retrieval practices in selected disciplines</title>
		<author>
			<persName><forename type="first">K</forename><surname>Gregory</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Groth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Cousijn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Scharnhorst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wyatt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the Association for Information Science and Technology</title>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A digital library for research data and related information in the social sciences</title>
		<author>
			<persName><forename type="first">D</forename><surname>Hienert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kern</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Boland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zapilko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mutschke</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL) (forthcoming)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Koren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Bell</surname></persName>
		</author>
		<title level="m">Advances in Collaborative Filtering</title>
				<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Dataset Retrieval</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>Kunze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Auer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Seventh International Conference on Semantic Computing</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2013-09">September 2013</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Distributed representations of sentences and documents</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICML&apos;14</title>
		<imprint>
			<biblScope unit="page" from="II-1188" to="II-1196" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Neural word embedding as implicit matrix factorization</title>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Goldberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in neural information processing systems</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="2177" to="2185" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Two approaches to the dataset interlinking recommendation problem</title>
		<author>
			<persName><forename type="first">G</forename><surname>Lopes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Paes Leme</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Nunes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Casanova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dietze</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Word embedding techniques for content-based recommender systems: An empirical evaluation</title>
		<author>
			<persName><forename type="first">C</forename><surname>Musto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Semeraro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>De Gemmis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lops</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">RecSys Posters</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Graph2Vec: Learning Distributed Representations of Graphs</title>
		<author>
			<persName><forename type="first">A</forename><surname>Narayanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chandramohan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Venkatesan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Jaiswal</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
			<publisher>CoRR</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Introduction to Recommender Systems Handbook</title>
		<author>
			<persName><forename type="first">F</forename><surname>Ricci</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Rokach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Shapira</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2011">2011</date>
			<publisher>Springer US</publisher>
			<biblScope unit="page" from="1" to="35" />
			<pubPlace>Boston, MA</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>White</surname></persName>
		</author>
		<title level="m">Interactions with search systems</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">How to build a digital library</title>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">H</forename><surname>Witten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bainbridge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Nichols</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2010">2010</date>
			<publisher>Morgan Kaufmann Publishers</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Scalable graph embedding for asymmetric proximity</title>
		<author>
			<persName><forename type="first">C</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Thirty-First AAAI Conference on Artificial Intelligence</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
