<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">New Datasets and a Benchmark of Document Network Embedding Methods for Scientific Expert Finding</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Robin</forename><surname>Brochier</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Antoine</forename><surname>Gourru</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Adrien</forename><surname>Guille</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Julien</forename><surname>Velcin</surname></persName>
						</author>
						<author>
							<affiliation key="aff0">
								<orgName type="institution">Université de Lyon</orgName>
								<address>
									<settlement>Lyon</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="laboratory">ERIC EA3083</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">New Datasets and a Benchmark of Document Network Embedding Methods for Scientific Expert Finding</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">DB6E6E774873B82256DA0CA2301DCBD3</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T21:41+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The scientific literature is growing faster than ever. Finding an expert in a particular scientific domain has never been harder than it is today, because of the increasing number of publications and the ever-growing diversity of expertise fields. To tackle this challenge, automatic expert finding algorithms rely on the vast, heterogeneous scientific network to match textual queries with potential expert candidates. In this direction, document network embedding methods seem to be an ideal choice for building representations of the scientific literature. Citation and authorship links contain major complementary information to the textual content of the publications. In this paper, we propose a benchmark for expert finding in document networks by leveraging data extracted from a scientific citation network and three scientific question &amp; answer websites. We compare the performance of several algorithms on these different sources of data and further study the applicability of embedding methods to an expert finding task.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Many tools offer to search and filter the vast data sources available on the Web. In particular, there is a multitude of platforms aimed at the scientific community. From simple publication search engines to social networks for researchers, all of them consume and produce valuable data for searching scientific content of interest. Expert finding is one of the most challenging of these problems and finds applications in both academia and industry. To tackle this challenge, recent advances in document network embedding (DNE) have the potential to inspire new unsupervised models that can deal with the heterogeneous network of documents of the scientific literature. However, the design of such efficient algorithms heavily depends on the development of strong evaluation frameworks.</p><p>In this paper, we propose a methodology and provide 4 datasets that extend the limited scope of existing expertise retrieval evaluation frameworks. Furthermore, we provide experimental results computed with unsupervised methods and we extend document network embedding algorithms to this specific task.</p><p>Our contributions are the following:</p><p>we provide 4 datasets for expert finding extracted from a scientific publication network and three question &amp; answer (Q&amp;A) websites and make them publicly available <ref type="foot" target="#foot_0">1</ref> ; we describe an evaluation methodology based on the ranking of expert candidates given a set of labeled document queries; we report experimental results that give some insights into this expert finding task; we explore and analyze the use of state-of-the-art document network embedding algorithms for expert finding and we show that further research is needed to bridge the gap between DNE methods and expert finding.</p><p>The rest of the paper is organized as follows. In Section 2, we survey related works. 
We detail in Section 3 our evaluation methodology, the datasets we extracted, the evaluation measures and the algorithms we use. In Section 4 we show and analyze the results of our experiments. Finally, in Section 5, we discuss our findings and provide future directions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Works</head><p>In this section, we first present a formal definition of expert finding. Then we present algorithms from the literature that address expert finding. Finally, we describe recent methods for document network embedding that have the potential to deal with this particular task.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Formal definition of expert finding</head><p>The concept of expert finding covers a large range of tasks. The main principle behind expertise retrieval is the search for candidates given a query. To match the two, an algorithm is provided with data that links the output space, a ranking of candidates, with the input space, which is often textual content. However, many different types of data can be considered to address this challenge. To fairly compare algorithms, we choose a fixed structure for the data which reflects common use cases. Furthermore, although supervised methods benefit from labeled fields of expertise associated with the candidates, they are beyond the scope of this paper, which focuses on unsupervised methods only. Our goal is to compare methods that do not require annotations, which can be costly to obtain.</p><p>Early works in expert search <ref type="bibr" target="#b7">[8]</ref> usually consider a small set of topical queries. The names of these topics are used directly to retrieve a list of candidates by leveraging a collection of documents they published (e.g., emails, scientific papers). This type of evaluation is used across several public datasets <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b15">16,</ref><ref type="bibr" target="#b25">26]</ref>.</p><p>More recently, expert finding has been merged into the wider concept of entity retrieval <ref type="bibr" target="#b0">[1]</ref>. As more and more complex data are produced on the Web, expert finding becomes a particular application of entity search. At the same time, Q&amp;A websites such as Stack Overflow 2 generate and make publicly accessible a large number of questions with expert answers, collaboratively curated by their users. Several works address the search for experts on such websites <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b27">28]</ref>. 
Often, the task consists in either finding the exact list of users who answered a specific question or ranking the answers according to the user votes. In the first case, the task involves considering the evolution of the users across time and, in the second case, understanding the intrinsic quality of a written answer. Notably, <ref type="bibr" target="#b24">[25]</ref> reviews several models for expert finding in Q&amp;A websites. Their experiments show that matrix factorization-based methods perform better than tree-based and ranking-based methods.</p><p>In this paper, we adopt the document-query methodology recently proposed in <ref type="bibr" target="#b2">[3]</ref>. The expert search is performed given a set of queries that are particular textual instances of some expert topics (or fields of expertise). Given a query, an algorithm should rank first the candidates that are associated with the same fields of expertise. We provide 4 datasets for which we annotated experts and document queries. Each dataset consists of candidates and documents linked by candidate-document relations (e.g., authorship) and by document-document relations (e.g., citation or answer). A query is therefore one of the documents (e.g., a scientific paper or a question) for which we aim to retrieve experts on the topics it covers. This configuration reflects many real-world scenarios such as (1) the automatic search for scientific reviewers, (2) the recommendation of expert users on Q&amp;A websites or even (3) the retrieval of interesting profiles for job offers.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Algorithms for expert finding</head><p>Numerous works have addressed automatic expertise retrieval. We describe here the main approaches and some interesting recent methods. P@noptic Expert <ref type="bibr" target="#b6">[7]</ref> creates a meta-document for each candidate by concatenating the contents of all documents she produced. In this manner, ranking the candidates given a query becomes a similarity search between the query representation and the meta-document representations. A voting model <ref type="bibr" target="#b13">[14]</ref> computes the similarities between the query and the documents. The algorithm then aggregates these scores at the candidate level by using a fusion technique such as the reciprocal rank <ref type="bibr" target="#b26">[27]</ref>. A propagation model <ref type="bibr" target="#b20">[21]</ref> takes advantage of the links between candidates and documents to propagate the similarities between the query and the documents. Using random walks with restart <ref type="bibr" target="#b16">[17]</ref>, the iterative propagation of the scores converges in a few steps to a stationary distribution over the candidates. WISER <ref type="bibr" target="#b5">[6]</ref> models each candidate as a small, weighted sub-graph of the Wikipedia Knowledge Graph. Information derived from these graphs and traditional document retrieval techniques are combined to identify experts w.r.t. a query. Note that methods leveraging external data are out of the scope of our benchmark. LT Expertfinder is an evaluation framework for expert finding <ref type="bibr" target="#b10">[11]</ref> based on an interactive tool. It integrates various existing algorithms (such as <ref type="bibr" target="#b0">[1]</ref>) in a user-friendly way. The underlying corpus used by this tool is the ACL Anthology Network. However, it does not include a well-established ground truth to assess who the experts are. 
Indeed, the evaluation is done purely online, since the user has to assess the degree of expertise based on several features such as an author's citations, h-index, keywords, etc. Recent works <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b22">23]</ref> propose ad hoc embedding techniques, whereas, in this work, we are interested in measuring the performance of conventional network embedding techniques.</p></div>
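As an illustration of the voting model described above, the following sketch (function name ours, not from the original paper) uses reciprocal-rank fusion, one of the fusion rules supported by the model, to aggregate query-document similarities into candidate scores:

```python
import numpy as np

def reciprocal_rank_voting(doc_scores, authorship):
    """Aggregate query-document similarities into candidate scores.

    doc_scores: (n_d,) similarities between the query and each document.
    authorship: (n_d, n_c) binary matrix; authorship[d, c] = 1 if
                candidate c authored document d.
    Each candidate's score is the sum of the reciprocal ranks of her
    documents in the query's document ranking.
    """
    ranks = np.empty(len(doc_scores))
    order = np.argsort(-doc_scores)              # best document first
    ranks[order] = np.arange(1, len(doc_scores) + 1)
    # (n_c,) sum of 1/rank over each candidate's documents
    return authorship.T @ (1.0 / ranks)
```

A candidate with one highly ranked document thus outscores a candidate with several poorly ranked ones, which is the intended voting behavior.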
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Document network embedding</head><p>Network embedding <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b18">19]</ref> provides an efficient approach to represent nodes in a low dimensional vector space, suitable for solving various machine learning tasks. Recent techniques extend NE for document networks. Text-Associated DeepWalk (TADW) <ref type="bibr" target="#b23">[24]</ref> extends DeepWalk to deal with textual attributes. Yang et al. prove, following the work in <ref type="bibr" target="#b12">[13]</ref>, that Skip-Gram with hierarchical softmax can be equivalently formulated as a matrix factorization problem. TADW then consists in constraining the factorization problem with a pre-computed representation of the documents by using Latent Semantic Analysis (LSA) <ref type="bibr" target="#b9">[10]</ref>. Graph2Gauss (G2G) <ref type="bibr" target="#b1">[2]</ref> is an approach that embeds each node as a Gaussian distribution instead of a vector. The algorithm is trained by passing node attributes through a non-linear transformation via a deep neural network (encoder). GVNR-t <ref type="bibr" target="#b3">[4]</ref> is a matrix factorization approach for document network embedding, inspired by GloVe <ref type="bibr" target="#b17">[18]</ref>, that simultaneously learns word, node and document representations by optimizing a least-square objective over a co-occurrence matrix of the nodes constructed by truncated random walks. IDNE <ref type="bibr" target="#b4">[5]</ref> introduces a Topic-Word Attention mechanism, trained from the connections of a document network, to represent documents as mixtures of topics. DNE algorithms do not directly apply to expert finding data since they are not designed to handle multiple types of nodes, in particular candidate nodes. 
In this paper, we show (1) two methods that extend their applicability to the task of expert finding and (2) the impact of their representations when they are used as document representations for traditional expert finding algorithms.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Evaluation Methodology</head><p>We present in this section the evaluation methodology that we follow to assess the performance of several algorithms for expert finding. We first describe the task we seek to solve, then we describe the datasets that we extracted and explain how we annotated them in order to assess the quality of the algorithms' outputs. Finally, we detail the models used in our experiments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Ranking expert candidates from document queries</head><p>Expert finding is a complex task that can be formalized in multiple ways. Early works define this task as a ranking problem given several topic-queries, where the names of these topics are directly used as queries to retrieve the expert candidates. However, in many real-world applications, a user is asked to provide a specific and detailed query. On a Q&amp;A website for instance, a user usually exposes the problem she faces in full detail and does not necessarily know the exact names of the fields of expertise needed to solve her problem. Furthermore, querying an algorithm with a small set of topic-queries can lead to poor evaluation measures due to the usually small number of fields of expertise associated with the dataset. For these reasons, we follow the document-query evaluation methodology proposed in <ref type="bibr" target="#b2">[3]</ref> by processing 4 datasets for which a set of document-queries is manually annotated.</p><p>The expert finding task in this paper is a ranking problem. Given a document labeled with a ground truth set of fields of expertise, an algorithm is queried to rank a set of candidates, among which a subset of experts are associated with the same set of labels. The data provided to the algorithms consists of a corpus of n_d documents D, n_c candidates C, an authorship network with adjacency matrix A_dc ∈ ℕ^(n_d × n_c) and a document network with adjacency matrix A_dd ∈ ℕ^(n_d × n_d). Figure <ref type="figure">1</ref> shows a hypothetical dataset used in this paper. The ranking is performed in an unsupervised setting, that is, no ground truth labels of expertise are given to the algorithms. 
The set of labeled documents (the queries) can be smaller than n_d and the set of labeled candidates (experts) can be smaller than n_c (i.e., not all documents and candidates are labeled).</p><p>Fig. <ref type="figure">1</ref>: Hypothetical example of an expert finding dataset used in this paper. 5 candidates are authors of 6 documents. The 6 documents are connected to each other by citation in a scientific corpus, or by answering the same post on a Q&amp;A website. Among the candidates, 3 are known to be experts in stars and/or in circles. 4 documents are associated with these 2 fields of expertise as well. In our evaluation methodology, we query an algorithm with these 4 documents and expect a ranking of candidates that matches each document's fields of expertise. As an example, a perfect algorithm might generate the rankings</p><formula xml:id="formula_1">D_1 → C_3, C_4, C_5, C_1, C_2 and D_6 → C_4, C_5, C_3, C_2, C_1.</formula><p>To evaluate the candidate scores provided by the algorithms, we compare the resulting rankings with the ground truth fields of expertise. If a document is associated with three different labels, we expect the algorithm to rank first all experts associated with at least one of these labels. We report the area under the ROC curve (AUC), the precision at 10 (P@10) and the average precision (AP), and we compute their standard deviations across the queries. That is, we evaluate the robustness of the algorithms against the variety of document-queries.</p></div>
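The three measures above can be computed per query as in the following sketch (a simplified illustration of the metrics, not the paper's evaluation code); the reported numbers are then the means and standard deviations of these per-query values:

```python
import numpy as np

def ranking_metrics(candidate_scores, is_expert, k=10):
    """AUC, P@k and AP for a single document-query.

    candidate_scores: (n_c,) scores returned by an algorithm.
    is_expert: (n_c,) boolean ground truth; True if the candidate shares
               at least one field of expertise with the query.
    """
    order = np.argsort(-candidate_scores)        # best candidate first
    rel = np.asarray(is_expert, dtype=float)[order]
    n_pos = rel.sum()
    n_neg = len(rel) - n_pos
    # AUC: fraction of (expert, non-expert) pairs ranked in the right order.
    neg_after = np.cumsum((1 - rel)[::-1])[::-1] - (1 - rel)
    auc = (rel * neg_after).sum() / (n_pos * n_neg)
    # Precision among the first k ranked candidates.
    p_at_k = rel[:k].mean()
    # Average precision over the positions of the true experts.
    precisions = np.cumsum(rel) / np.arange(1, len(rel) + 1)
    ap = (precisions * rel).sum() / n_pos
    return auc, p_at_k, ap
```

A perfect ranking (all experts first) yields AUC = 1 and AP = 1, while a random ranking yields AUC ≈ 0.5, matching the reference scores of the random model in the result tables.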
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Datasets</head><p>We consider 4 datasets. The first one is an extract of DBLP <ref type="bibr" target="#b21">[22]</ref> in which a list of 199 experts in 7 fields is annotated <ref type="bibr" target="#b25">[26]</ref> by human judgment <ref type="foot" target="#foot_2">3</ref> . Our dataset only considers the annotated experts and the other candidates that are close in the co-authorship network, which explains the relatively small size of our network compared to the original one. In addition to the expert annotations, our evaluation framework requires document annotations, since we adopt the document-query methodology for expertise retrieval. We asked two PhD students in computer science to independently label 20 randomly drawn documents per field of expertise (140 in total). Then, only the labels on which the two annotators agreed were kept, leaving 114 annotated papers. The mean Cohen's kappa coefficient across the labels is 0.718. An advantage of our methodology is that we can evaluate the algorithms on more queries (114 documents) than the traditional method (7 labels). This allows us to assess the robustness of the algorithms by computing the standard deviations of the ranking metrics across all queries. However, one might argue that these 7 labels do not reflect a representative set of expertise fields, as they are too broad. For this reason, we seek a finer granularity of expertise by using well-known question &amp; answer websites.</p><p>While scientific publication networks are easy to find on the Web, scientific expertise annotations are rarely available for both authors and publications. We use data downloaded in June 2019 from Stack Exchange<ref type="foot" target="#foot_3">4</ref> to create datasets for expert finding collected from three communities closely related to research. Academia<ref type="foot" target="#foot_4">5</ref> is dedicated to academics and higher education. 
Mathoverflow<ref type="foot" target="#foot_5">6</ref> gathers professional mathematicians and is widely used by researchers. Stats<ref type="foot" target="#foot_6">7</ref> (also known as Cross Validated) addresses statistics, machine learning and data mining issues. For each dataset, we first keep questions with at least 10 user votes that have at least one answer with 10 user votes or more. We build the networks by linking questions with their answers and by linking answers with the users who published them. The fields of expertise are the tags associated with the questions. Only the tags that occur at least 50 times are kept. We annotate a user as an expert in the tags of a question if her answer to that question received at least 10 votes. Note that the tags are first provided by the users who ask the questions, but they are thereafter verified by experienced users.</p><p>The general properties of our 4 datasets are presented in Table <ref type="table" target="#tab_0">1</ref>. The annotations and the preprocessed datasets are made publicly available. </p></div>
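The filtering and annotation rules described above can be summarized as follows (an illustrative sketch with hypothetical field names, not the actual preprocessing code):

```python
from collections import Counter

def build_expert_dataset(questions, answers,
                         min_votes=10, min_tag_count=50):
    """Sketch of the filtering rules (field names are hypothetical).

    questions: list of dicts {"id", "score", "tags"}
    answers:   list of dicts {"id", "question_id", "score", "user"}
    """
    # Keep questions with >= min_votes votes that have at least one
    # answer with >= min_votes votes.
    good_answers = [a for a in answers if a["score"] >= min_votes]
    answered = {a["question_id"] for a in good_answers}
    kept_q = {q["id"]: q for q in questions
              if q["score"] >= min_votes and q["id"] in answered}
    # Keep only tags occurring at least min_tag_count times.
    counts = Counter(t for q in kept_q.values() for t in q["tags"])
    kept_tags = {t for t, c in counts.items() if c >= min_tag_count}
    # Annotate a user as an expert in a question's tags when her answer
    # to that question received >= min_votes votes.
    experts = {}
    for a in good_answers:
        q = kept_q.get(a["question_id"])
        if q:
            experts.setdefault(a["user"], set()).update(
                set(q["tags"]) & kept_tags)
    return kept_q, kept_tags, experts
```

The question-answer links and answer-user links of the kept posts then define the A_dd and A_dc adjacency matrices of each Q&A dataset.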
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Algorithms</head><p>We run the experiments with 4 baseline algorithms and 4 document network embedding algorithms. The latter are adapted with two aggregation schemes in order to deal with the candidates, since they are primarily designed for document networks only. These aggregations are arbitrary and are deliberately the most straightforward way to run DNE algorithms on bipartite author-document networks. We further discuss these choices in Section 4.</p><p>Baselines We run the experiments with the same models as in <ref type="bibr" target="#b2">[3]</ref>, using the tf-idf representations and the cosine similarity measure. We also add a random model to provide reference metrics:</p><p>-Random model: we randomly draw scores between 0 and 1 for each candidate; -P@noptic model <ref type="bibr" target="#b6">[7]</ref>: we concatenate the textual content of the documents associated with each candidate, use their tf-idf representations and compute the cosine similarity to produce the scores; -Voting model <ref type="bibr" target="#b13">[14]</ref>: we use the reciprocal rank to aggregate the scores at the candidate level; -Propagation model <ref type="bibr" target="#b20">[21]</ref>: we combine the two adjacency matrices A_dc and A_dd to construct a transition matrix between candidates and documents such that</p><formula xml:id="formula_2">A = [ A_dd A_dc ; A_dc^T 0 ].</formula><p>The initial scores are the cosine similarities between the tf-idf representations of the query and the documents. The scores are propagated iteratively until convergence with a restart probability of 0.5.</p><p>We also run the voting and propagation models using document representations produced by IDNE in place of the tf-idf vectors. 
The document network provided to IDNE has adjacency matrix</p><formula xml:id="formula_3">A_d = A_dc A_dc^T + A_dd.</formula><p>Extending DNE algorithms for expert finding DNE methods usually operate on networks of documents, with no candidate nodes. To apply them in the context of expert finding, we propose two straightforward approaches:</p><p>-pre-aggregation: as in the P@noptic model, meta-documents are generated by aggregating the documents produced by each candidate. Furthermore, an adjacency matrix of a meta-network between candidates and documents is constructed. We compute a candidate network as A_c = A_dc^T A_dc and a document network as</p><formula xml:id="formula_4">A_d = A_dc A_dc^T + A_dd. The meta-network is then A = [ A_d A_dc ; A_dc^T A_c ].</formula><p>The candidate and document representations are then generated by treating this meta-network and the concatenation of the documents and meta-documents as an ordinary instance of a document network. From this meta-network, we generate representations with the DNE algorithms. The scores of the candidates are computed by cosine similarity between the representation of the document-query and the representations of the candidates; -post-aggregation: in this setting, we first train the DNE algorithm on the network of documents defined by</p><formula xml:id="formula_5">A_d = A_dc A_dc^T + A_dd.</formula><p>Once the representations are generated for all documents, a representation for a candidate is computed by averaging the vectors of all documents associated with her. The scores are then computed by cosine similarity.</p><p>We run the experiments with 4 document network embedding algorithms, using the authors' implementations. 
For all methods, the dimension of the representations is set to 256:</p><p>-TADW <ref type="bibr" target="#b23">[24]</ref>: we follow the original paper by using 20 iterations and a penalty term λ = 0.2; -GVNR-t <ref type="bibr" target="#b3">[4]</ref>: we use γ = 10 random walks of length t = 40, a sliding window of size l = 5 and a threshold x_min = 2 with 4 iterations; -Graph2gauss (G2G) <ref type="bibr" target="#b1">[2]</ref>: we make sure the loss function converges before the maximum number of iterations; -IDNE <ref type="bibr" target="#b4">[5]</ref>: we run all experiments with n_t = 32 topic vectors and 5000 balanced mini-batches of 16 positive samples and 16 negative samples.</p><p>Tables <ref type="table" target="#tab_4">2 to 5</ref> show the experimental results. In the following, we analyze the performance of the aggregation schemes against the baseline algorithms, we highlight the interesting results obtained when using the baselines with document representations pre-computed by a DNE algorithm, and finally we make some observations on the differences between the datasets. Note that the implementation of TADW, provided by the authors, could not scale to Mathoverflow.</p></div>
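As an illustration, the post-aggregation scheme and the propagation step with restart described in Section 3.3 can be sketched as follows (our simplified reading of the setup; function names and normalization details are assumptions, not the authors' implementation):

```python
import numpy as np

def post_aggregate(doc_vectors, A_dc):
    """Average the embeddings of the documents authored by each candidate."""
    counts = np.maximum(A_dc.sum(axis=0), 1)       # documents per candidate
    return (A_dc.T @ doc_vectors) / counts[:, None]

def propagate(query_doc_scores, A_dc, A_dd, restart=0.5, n_iter=50):
    """Propagate query-document similarities to candidates with restart.

    Builds the block matrix over documents and candidates,
    row-normalizes it into a transition matrix, and iterates
    towards a stationary distribution.
    """
    n_d, n_c = A_dc.shape
    A = np.block([[A_dd, A_dc],
                  [A_dc.T, np.zeros((n_c, n_c))]]).astype(float)
    row = A.sum(axis=1, keepdims=True)
    P = np.divide(A, row, out=np.zeros_like(A), where=row > 0)
    s0 = np.concatenate([query_doc_scores, np.zeros(n_c)])
    s0 = s0 / s0.sum()
    s = s0.copy()
    for _ in range(n_iter):
        s = restart * s0 + (1 - restart) * (P.T @ s)
    return s[n_d:]                                 # candidate scores
```

With a restart probability of 0.5 the iteration is a contraction, so it converges quickly, consistent with the few propagation steps mentioned in Section 2.2.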
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Baselines versus DNE aggregation schemes</head><p>For all datasets, the propagation model generally performs better than the other algorithms, particularly in terms of precision. Both aggregation schemes yield poor results, and neither appears better than the other. GVNR-t is the best algorithm among the document network embedding models. We believe that, although DNE algorithms are well suited for document network representation learning, the gap between simple tasks such as node classification or link prediction and the task of expert finding is too big for our naive aggregation schemes to perform well. In particular, the network structure changes significantly between a homogeneous network and a heterogeneous one. Moreover, expert finding algorithms often benefit from information about the centrality of the candidates and documents. DNE algorithms do not particularly preserve this information, and neither do our aggregation schemes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Using DNE as document representations for the baselines</head><p>Since the baseline algorithms perform well, we study the possibility of applying them with document representations produced by a DNE algorithm. We only report the results with the representations computed with IDNE, but we observe the same behaviors with the other DNE models. First, these representations consistently improve the voting model, which achieves the best results in terms of AUC on Stats and Mathoverflow. Then, the most surprising effect is the significant decrease in performance of the propagation model. While the precision for the first ranked candidates is not affected, the AUC score drops significantly for the three Q&amp;A datasets. We believe that document network embeddings capture overly long-range dependencies between the documents in the network, which are then exaggerated by the propagation procedure. Figure <ref type="figure" target="#fig_1">2</ref> shows the effect of the representations used with the propagation model on the ROC curve.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Differences between the datasets</head><p>The results achieved by the algorithms on the three Stack Exchange datasets are consistent. However, the algorithms do not behave the same on DBLP. First, DNE methods achieve scores closer to the baselines on DBLP. In the Q&amp;A datasets, the interactions are more isolated, i.e., there are more users, each with fewer interactions. This difference in network properties might disadvantage DNE methods, which are usually trained on scale-free networks whose degree distribution follows a power law. Moreover, on DBLP, the propagation method does not suffer from the decrease in performance induced by the IDNE representations. We hypothesize that the low number of expertise fields associated with this dataset largely reduces the effect described in the previous section.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Discussion and Future Work</head><p>In this paper, we provide experimental materials for expert finding in the form of four annotated datasets, and we report results for several baseline algorithms. Moreover, we study the ability of document network embedding methods to tackle the expert finding challenge. We show that DNE algorithms cannot be trivially adapted to achieve state-of-the-art scores. However, we reveal that document network embeddings can improve the voting model but degrade the propagation model.</p><p>In future work, we would like to find an efficient way to bridge the gap between DNE algorithms and expert finding. To do so, taking the heterogeneity of the network into account should help to better capture the real similarity between a document and a candidate. Furthermore, a deeper analysis of the interplay between the candidates and the textual content of the documents appears necessary to better understand the task of expert finding.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>(a) Propagation model with tf-idf representations: the curve has a good shape, meaning the rankings of candidates are good even for the last-ranked experts. (b) Propagation model with IDNE representations: the first-ranked candidates are good, but the algorithm tends to wrongly rank many true experts last.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 :</head><label>2</label><figDesc>Fig. 2: Effect of IDNE representations on the propagation model. Using document network embeddings significantly damages the rankings.</figDesc><graphic coords="9,134.77,115.83,166.46,161.57" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>General properties of the datasets.</figDesc><table><row><cell></cell><cell cols="5"># candidates # documents # labels # experts # queries</cell><cell>label example</cell></row><row><cell>DBLP</cell><cell>707</cell><cell>1641</cell><cell>7</cell><cell>199</cell><cell cols="2">114 'information extraction'</cell></row><row><cell>Stats</cell><cell>5765</cell><cell>14834</cell><cell>59</cell><cell>5765</cell><cell cols="2">3966 'maximum-likelihood'</cell></row><row><cell>Academia</cell><cell>6030</cell><cell>20799</cell><cell>55</cell><cell>6030</cell><cell cols="2">4214 'recommendation-letter'</cell></row><row><cell>Mathoverflow</cell><cell>7382</cell><cell>38532</cell><cell>98</cell><cell>7382</cell><cell cols="2">10614 'galois-representations'</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Mean scores with their standard deviations on DBLP</figDesc><table><row><cell></cell><cell>AUC</cell><cell>P@10</cell><cell>AP</cell></row><row><cell>random</cell><cell>49.47 (09.80)</cell><cell>05.00 (06.66)</cell><cell>07.09 (03.81)</cell></row><row><cell>panoptic (tf-idf)</cell><cell>74.06 (12.94)</cell><cell>22.37 (16.35)</cell><cell>23.24 (12.55)</cell></row><row><cell>voting (tf-idf)</cell><cell>78.60 (11.97)</cell><cell>26.05 (15.76)</cell><cell>28.24 (13.92)</cell></row><row><cell>propagation (tf-idf)</cell><cell>79.26 (13.09)</cell><cell>33.07 (19.61)</cell><cell>34.66 (18.21)</cell></row><row><cell>pre-agg TADW</cell><cell>65.84 (12.94)</cell><cell>15.61 (11.63)</cell><cell>17.26 (08.78)</cell></row><row><cell>pre-agg GVNR-t</cell><cell>76.90 (11.46)</cell><cell>19.04 (11.70)</cell><cell>21.39 (09.61)</cell></row><row><cell>pre-agg G2G</cell><cell>72.87 (12.75)</cell><cell>15.70 (11.62)</cell><cell>18.53 (09.37)</cell></row><row><cell>pre-agg IDNE</cell><cell>78.08 (11.27)</cell><cell>20.18 (11.85)</cell><cell>22.00 (09.87)</cell></row><row><cell>post-agg TADW</cell><cell>68.01 (13.37)</cell><cell>16.32 (11.57)</cell><cell>18.01 (08.97)</cell></row><row><cell>post-agg GVNR-t</cell><cell>73.91 (13.93)</cell><cell>18.86 (12.19)</cell><cell>20.57 (10.33)</cell></row><row><cell>post-agg G2G</cell><cell>68.94 (15.23)</cell><cell>16.23 (12.02)</cell><cell>18.21 (09.76)</cell></row><row><cell>post-agg IDNE</cell><cell>76.87 (13.36)</cell><cell>19.04 (14.57)</cell><cell>21.57 (10.96)</cell></row><row><cell>voting (IDNE)</cell><cell>82.23 (11.08)</cell><cell>34.82 (18.46)</cell><cell>37.27 (16.16)</cell></row><row><cell>propagation (IDNE)</cell><cell>82.44 (16.14)</cell><cell>44.47 (22.91)</cell><cell>47.01 (22.06)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3:</head><label>3</label><figDesc>Mean scores with standard deviations on Stats</figDesc><table><row><cell></cell><cell>AUC</cell><cell>P@10</cell><cell>AP</cell></row><row><cell>random</cell><cell>50.01 (02.24)</cell><cell>04.52 (07.02)</cell><cell>04.96 (02.81)</cell></row><row><cell>panoptic (tf-idf)</cell><cell>79.47 (06.22)</cell><cell>13.45 (13.39)</cell><cell>15.22 (05.62)</cell></row><row><cell>voting (tf-idf)</cell><cell>84.96 (05.22)</cell><cell>52.53 (16.13)</cell><cell>31.01 (06.58)</cell></row><row><cell>propagation (tf-idf)</cell><cell>86.33 (05.64)</cell><cell>91.53 (13.44)</cell><cell>44.09 (07.70)</cell></row><row><cell>pre-agg TADW</cell><cell>63.07 (07.70)</cell><cell>11.42 (12.34)</cell><cell>08.45 (03.87)</cell></row><row><cell>pre-agg GVNR-t</cell><cell>70.67 (09.49)</cell><cell>21.12 (20.99)</cell><cell>12.43 (07.30)</cell></row><row><cell>pre-agg G2G</cell><cell>63.63 (07.62)</cell><cell>12.93 (12.06)</cell><cell>07.81 (04.15)</cell></row><row><cell>pre-agg IDNE</cell><cell>65.07 (09.05)</cell><cell>13.37 (13.48)</cell><cell>09.40 (05.19)</cell></row><row><cell>post-agg TADW</cell><cell>68.74 (07.02)</cell><cell>13.67 (12.59)</cell><cell>09.99 (04.37)</cell></row><row><cell>post-agg GVNR-t</cell><cell>66.56 (08.61)</cell><cell>22.47 (15.92)</cell><cell>10.75 (05.42)</cell></row><row><cell>post-agg G2G</cell><cell>62.53 (07.44)</cell><cell>11.95 (11.86)</cell><cell>07.48 (04.13)</cell></row><row><cell>post-agg IDNE</cell><cell>65.63 (08.57)</cell><cell>13.34 (13.13)</cell><cell>09.38 (04.94)</cell></row><row><cell>voting (IDNE)</cell><cell>86.94 (04.91)</cell><cell>53.91 (18.06)</cell><cell>32.18 (08.33)</cell></row><row><cell>propagation (IDNE)</cell><cell>67.62 (10.11)</cell><cell>90.43 (15.20)</cell><cell>33.07 (08.93)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4:</head><label>4</label><figDesc>Mean scores with standard deviations on Academia</figDesc><table><row><cell></cell><cell>AUC</cell><cell>P@10</cell><cell>AP</cell></row><row><cell>random</cell><cell>50.02 (01.78)</cell><cell>05.93 (08.07)</cell><cell>06.09 (02.72)</cell></row><row><cell>panoptic (tf-idf)</cell><cell>81.54 (04.36)</cell><cell>18.35 (18.76)</cell><cell>22.93 (07.14)</cell></row><row><cell>voting (tf-idf)</cell><cell>85.88 (03.47)</cell><cell>57.99 (15.87)</cell><cell>37.66 (05.83)</cell></row><row><cell>propagation (tf-idf)</cell><cell>88.02 (03.32)</cell><cell>99.01 (03.57)</cell><cell>54.04 (05.44)</cell></row><row><cell>pre-agg TADW</cell><cell>61.47 (06.16)</cell><cell>11.09 (12.04)</cell><cell>09.29 (03.53)</cell></row><row><cell>pre-agg GVNR-t</cell><cell>64.22 (09.69)</cell><cell>25.67 (23.27)</cell><cell>13.07 (07.54)</cell></row><row><cell>pre-agg G2G</cell><cell>61.54 (05.38)</cell><cell>14.30 (12.91)</cell><cell>08.74 (03.69)</cell></row><row><cell>pre-agg IDNE</cell><cell>58.74 (07.49)</cell><cell>10.21 (11.58)</cell><cell>08.41 (03.99)</cell></row><row><cell>post-agg TADW</cell><cell>71.94 (04.63)</cell><cell>14.44 (12.87)</cell><cell>12.68 (04.37)</cell></row><row><cell>post-agg GVNR-t</cell><cell>61.22 (06.24)</cell><cell>20.70 (14.59)</cell><cell>10.19 (04.21)</cell></row><row><cell>post-agg G2G</cell><cell>58.87 (05.79)</cell><cell>12.80 (12.06)</cell><cell>08.12 (03.67)</cell></row><row><cell>post-agg IDNE</cell><cell>59.97 (07.40)</cell><cell>10.61 (11.19)</cell><cell>08.76 (04.17)</cell></row><row><cell>voting (IDNE)</cell><cell>86.79 (03.90)</cell><cell>55.81 (17.35)</cell><cell>37.13 (07.58)</cell></row><row><cell>propagation (IDNE)</cell><cell>61.35 (08.56)</cell><cell>95.02 (10.15)</cell><cell>31.27 (08.21)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5:</head><label>5</label><figDesc>Mean scores with standard deviations on Mathoverflow</figDesc><table><row><cell></cell><cell>AUC</cell><cell>P@10</cell><cell>AP</cell></row><row><cell>random</cell><cell>49.98 (01.62)</cell><cell>06.44 (08.28)</cell><cell>06.53 (03.06)</cell></row><row><cell>panoptic (tf-idf)</cell><cell>81.87 (04.46)</cell><cell>21.95 (19.15)</cell><cell>22.95 (07.54)</cell></row><row><cell>voting (tf-idf)</cell><cell>86.80 (03.23)</cell><cell>61.11 (18.68)</cell><cell>40.10 (08.27)</cell></row><row><cell>propagation (tf-idf)</cell><cell>88.08 (03.38)</cell><cell>93.68 (12.16)</cell><cell>49.58 (08.90)</cell></row><row><cell>pre-agg TADW</cell><cell>NA</cell><cell>NA</cell><cell>NA</cell></row><row><cell>pre-agg GVNR-t</cell><cell>65.34 (09.22)</cell><cell>44.02 (28.31)</cell><cell>16.88 (08.55)</cell></row><row><cell>pre-agg G2G</cell><cell>66.84 (08.99)</cell><cell>22.95 (17.81)</cell><cell>12.49 (05.70)</cell></row><row><cell>pre-agg IDNE</cell><cell>67.01 (09.26)</cell><cell>22.96 (17.84)</cell><cell>13.40 (06.02)</cell></row><row><cell>post-agg TADW</cell><cell>NA</cell><cell>NA</cell><cell>NA</cell></row><row><cell>post-agg GVNR-t</cell><cell>63.84 (07.59)</cell><cell>41.81 (22.68)</cell><cell>14.96 (06.25)</cell></row><row><cell>post-agg G2G</cell><cell>65.06 (09.09)</cell><cell>22.43 (16.94)</cell><cell>11.78 (05.51)</cell></row><row><cell>post-agg IDNE</cell><cell>66.74 (09.10)</cell><cell>21.92 (17.21)</cell><cell>13.11 (05.87)</cell></row><row><cell>voting (IDNE)</cell><cell>88.71 (03.76)</cell><cell>68.46 (18.53)</cell><cell>43.53 (09.90)</cell></row><row><cell>propagation (IDNE)</cell><cell>69.38 (09.65)</cell><cell>92.35 (13.88)</cell><cell>39.62 (09.89)</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://github.com/brochier/expert_finding</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://stackoverflow.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://lfs.aminer.cn/lab-datasets/expertfinding/#expert-list</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://archive.org/details/stackexchange</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://academia.stackexchange.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">https://mathoverflow.net/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6">https://stats.stackexchange.com/</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Overview of the TREC 2010 entity track</title>
		<author>
			<persName><forename type="first">K</forename><surname>Balog</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Serdyukov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P</forename><surname>De Vries</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
		<respStmt>
			<orgName>NORWEGIAN UNIV OF SCIENCE AND TECHNOLOGY TRONDHEIM</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Tech. rep.</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Deep Gaussian embedding of graphs: Unsupervised inductive learning via ranking</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bojchevski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Günnemann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1" to="13" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Impact of the query set on the evaluation of expert finding systems</title>
		<author>
			<persName><forename type="first">R</forename><surname>Brochier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Guille</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Rothan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Velcin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 3rd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2018) co-located with the 41st International ACM SIGIR Conference</title>
				<meeting>the 3rd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2018) co-located with the 41st International ACM SIGIR Conference</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Global vectors for node representations</title>
		<author>
			<persName><forename type="first">R</forename><surname>Brochier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Guille</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Velcin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The World Wide Web Conference</title>
				<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="2587" to="2593" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Inductive document network embedding with topic-word attention</title>
		<author>
			<persName><forename type="first">R</forename><surname>Brochier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Guille</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Velcin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 42nd European Conference on Information Retrieval Research</title>
				<meeting>the 42nd European Conference on Information Retrieval Research</meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">WISER: A semantic approach for expert finding in academia based on entity linking</title>
		<author>
			<persName><forename type="first">P</forename><surname>Cifariello</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Ferragina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ponza</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">82</biblScope>
			<biblScope unit="page" from="1" to="16" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">P@noptic expert: Searching for experts not just for documents</title>
		<author>
			<persName><forename type="first">N</forename><surname>Craswell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hawking</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Vercoustre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wilkins</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Ausweb Poster Proceedings</title>
				<meeting><address><addrLine>Queensland, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2001">2001</date>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page">17</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Overview of the TREC 2005 enterprise track</title>
		<author>
			<persName><forename type="first">N</forename><surname>Craswell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P</forename><surname>De Vries</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Soboroff</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">TREC</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="1" to="7" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Skill translation models in expert finding</title>
		<author>
			<persName><forename type="first">A</forename><surname>Dargahi Nobari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sotudeh Gharebagh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Neshati</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval</title>
				<meeting>the 40th international ACM SIGIR conference on research and development in information retrieval</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1057" to="1060" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Indexing by latent semantic analysis</title>
		<author>
			<persName><forename type="first">S</forename><surname>Deerwester</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">T</forename><surname>Dumais</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">W</forename><surname>Furnas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">K</forename><surname>Landauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Harshman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Society for Information Science</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="391" to="407" />
			<date type="published" when="1990">1990</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">LT Expertfinder: An evaluation framework for expert finding methods</title>
		<author>
			<persName><forename type="first">T</forename><surname>Fischer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Remus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Biemann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)</title>
				<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="98" to="104" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">node2vec: Scalable feature learning for networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Grover</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Leskovec</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining</title>
				<meeting>the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="855" to="864" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Neural word embedding as implicit matrix factorization</title>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Goldberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in neural information processing systems</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="2177" to="2185" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Voting for candidates: adapting data fusion techniques for an expert search task</title>
		<author>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Ounis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th ACM international conference on Information and knowledge management</title>
				<meeting>the 15th ACM international conference on Information and knowledge management</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="387" to="396" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Overview of the TREC 2007 blog track</title>
		<author>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Ounis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Soboroff</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">TREC</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="31" to="43" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Evidence-centered assessment design</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">J</forename><surname>Mislevy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">M</forename><surname>Riconscente</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Handbook of test development</title>
				<imprint>
			<publisher>Routledge</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="75" to="104" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">The pagerank citation ranking: Bringing order to the web</title>
		<author>
			<persName><forename type="first">L</forename><surname>Page</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Brin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Motwani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Winograd</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999">1999</date>
		</imprint>
		<respStmt>
			<orgName>Stanford InfoLab</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Tech. rep</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">GloVe: Global vectors for word representation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pennington</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Socher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)</title>
				<meeting>the 2014 conference on empirical methods in natural language processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1532" to="1543" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">DeepWalk: Online learning of social representations</title>
		<author>
			<persName><forename type="first">B</forename><surname>Perozzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Al-Rfou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Skiena</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining</title>
				<meeting>the 20th ACM SIGKDD international conference on Knowledge discovery and data mining</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="701" to="710" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Finding expert users in community question answering</title>
		<author>
			<persName><forename type="first">F</forename><surname>Riahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zolaktaf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Shafiei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Milios</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 21st International Conference on World Wide Web</title>
				<meeting>the 21st International Conference on World Wide Web</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="791" to="798" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Modeling multi-step relevance propagation for expert finding</title>
		<author>
			<persName><forename type="first">P</forename><surname>Serdyukov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Rode</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hiemstra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th ACM conference on Information and knowledge management</title>
				<meeting>the 17th ACM conference on Information and knowledge management</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="1133" to="1142" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">ArnetMiner: extraction and mining of academic social networks</title>
		<author>
			<persName><forename type="first">J</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Su</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining</title>
				<meeting>the 14th ACM SIGKDD international conference on Knowledge discovery and data mining</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="990" to="998" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Unsupervised, efficient and semantic expertise retrieval</title>
		<author>
			<persName><forename type="first">C</forename><surname>Van Gysel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>De Rijke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Worring</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th International Conference on World Wide Web (WWW 2016)</title>
				<imprint>
			<publisher>International World Wide Web Conferences Steering Committee</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="1069" to="1079" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Network representation learning with rich text information</title>
		<author>
			<persName><forename type="first">C</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Chang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Twenty-Fourth International Joint Conference on Artificial Intelligence</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Expert finding in community question answering: a review</title>
		<author>
			<persName><forename type="first">S</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Hall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">B</forename><surname>Cabotà</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence Review</title>
		<imprint>
			<biblScope unit="volume">53</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="843" to="874" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Expert finding in a social network</title>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Database Systems for Advanced Applications</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="1066" to="1069" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Expansion-based technologies in finding relevant and new information: THU TREC 2002 novelty track experiments</title>
		<author>
			<persName><forename type="first">M</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ma</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NIST SPECIAL PUBLICATION SP</title>
		<imprint>
			<biblScope unit="volume">251</biblScope>
			<biblScope unit="page" from="586" to="590" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Expert finding for question answering via graph regularized matrix completion</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Ng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="993" to="1004" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
