-

Graph Analytical Re-ranking for Entity Search

0 Takahiro Komamizu Nagoya University Nagoya , Japan

Entity search is a fundamental task in Linked Data (LD). The task is, given a keyword search query, to retrieve a set of entities in LD which are relevant to the query. The state-ofthe-art approaches for entity search are based on information retrieval technologies such as TF-IDF vectorization and ranking models. This paper examines the approaches by applying a traditional evaluation metrics, recall@k, and shows ranking qualities still room left for improvements. In order to improve the ranking qualities, this paper explores possibilities of graph analytical methods. LD is regarded as a large graph, graph analytical approaches are therefore appropriate for this purpose. Since query-based graph analytical approaches fit to entity search tasks, this paper proposes a personalized PageRankbased re-ranking method, PPRSD (Personalized PageRank based Score Distribution), for retrieved results by the state-of-the-art. The experimental evaluation recognizes improvements but its results are not satisfactory, yet. For further improvements, this paper reports investigations about relationship between queries and entities in terms of path lengths on the graph, and discusses future directions for graph analytical approaches.

1http://tiny.cc/dbpedia-entity vant answers are absent in top-k results. The analysis results shown the Total column in Table 1, recall@10, recall@100, and recall@1000 are maximally 0.2872, 0.6912, and 0.8708, respectively.

The investigation results indicate that there are still room left for improving rankings. The low recall for top-10 results and the high recall for top-1000 results imply that large amount of relevant results are within 1000 results but most of them are below top-10. Therefore, this paper attempts to improve the ranking by reranking, which arrange the ranking by applying different ranking criteria. It is reasonable to take graph topological features into account due to the nature of LD. Therefore, this paper applies graph analytical methods for re-ranking. The result for the re-ranking method is expected to be an answer to the question.

In consequence, this paper arranges the aforementioned question to the following question. Do graph analysis-based re-ranking methods improve the ranking quality? This paper attempts to take graph analytical methods into account and proposes a re-ranking method PPRSD (Personalized PageRank based Score Distribution) which distributes calculated relevance scores by the state-of-the-art in a personalized PageRank manner. Test of PPRSD gives the following answer, the graph analysis-based re-ranking method can improve the ranking quality but the improvement is not very significant.

In order to find future directions based on graph analytical methods for improving entity search, this paper performs investigations and provides insights. This paper poses a question for results of the preliminary evaluation by recall@k, that is, why recall@1000 is not perfect yet? To answer the question, this paper investigates relationship between query terms and relevant entities for the query, and the investigation reveals that some terms only exist on distant literals from relevant entities. Additionally, this paper obtains a clue for selection of predicates connecting to literals w.r.t. different distances from the entities. Based on these investigations, this paper puts discussion on future directions based on graph analytical approaches for entity search.

The following sections discuss the detail for getting the answers to the questions. Section 2 introduces briefly the state-of-the-art shown in [HNX+17] and showcases the preliminary evaluation in terms of recall@k metrics, and Section 3 explains the idea and detail of PPRSD, and Section 4 evaluates the state-ofthe-art and PPRSD using the test collection and shows the answers to the aforementioned questions. Section 5 displays additional investigations and insights for the future directions, and Section 6 concludes this paper.

State of Current Entity Search

This work explores the future directions of entity search, to this end, this paper investigates the current state of entity search, especially this paper sticks to a leading benchmark, DBpedia-Entity v2 [HNX+17].

As shown in the benchmark, there are various approaches which are mainly based on information retrieval and natural language processing techniques. The list of approaches include fundamental approaches: BM25 [RZ09], BM25-CA [RZ09], LM (Language Modeling) [PC98], SDM (Sequential Dependency Model) [MC05], PRMS (Probabilistic Model for Semistructured Data) [KXC09], and MLM-all (Mixture of Language Models) [OC03]; fielded extension approaches: MLM-CA [OC03], FSDM (Fielded Sequential Dependence Model) [ZKN15], and BM25FCA [RZ09]; extended approaches by entity linking technique [HBB16] for query: LM-ELR [HBB16], SDM-ELR [HBB16], and FSDM-ELR [HBB16].

These works are based on a fielded document construction method in [Has18]. As an overall structure, each entity has 1000 fields together with three additional fields. The 1000 fields are corresponding with top 1000 frequent predicates in DBpedia, and the additional fields are heuristically constructed such that one is “name” field which is constitution of predicates rdfs:label and foaf:name; another is “types” field which contains rdf:type predicate and predicates ending in “subject”, and the other is “contents” field which holds the contents of all fields of connected entities except those connected by owl:sameAs to remove same entities in different languages. Aforementioned approaches use parts of the fielded documents as follows: BM25, LM and SDM use the contents field; and MLMall, PRMS and FSDM use top-10 fields. The field extension approaches are differentiated by settings of field weights (e.g., MLM-all uses equal weights for all fields, while PRMS learns weights for fields).

To investigate the qualities of these approaches, this paper tests more intuitive metrics recall@k in addition to NDCG which is shown in [HNX+17]. The NDCG results are copied to Table 4 (rows of not *ed method names correspond to the original results shown in [HNX+17]). The NDCG result shows comparative ranking qualities among these approaches. While, NDCG is not a clear indicator for distances from goals. Therefore, this paper investigates more clear indicator, recall@k (Eqn. 1) which reveals ratio of relevant results in top-k. (1)

Table 1 displays recall@k (k 2 f10; 100; 1000g) and it indicates that more than 80% of relevant results are the number of relevant items in top-k the total number of relevant items @10 .1344 .1397

Total @100 .4198 .4355 included in top-1000 but only 20% to 45% of them are included in top-10, which indicates there are room left for improving rankings. The recalls are calculated on the top-1000 results presented in the benchmark data2. The boldface and underlined cells in the table show maximum recall scores for tasks and k. All methods have low recall@10 as well as recall@100, but still high recall@1000, meaning that ranking performance should be improved. The gap row in the table emphasizes that top-10 results have large room left for improvements. 3

PageRank-based Re-ranking

This work attempts to improve the ranking qualities by graph analytical re-ranking methods. LD is modeled as a labeled graph, it is therefore reasonable to apply graph analytical approaches to evaluate values of entities. In particular, this paper explores feasibility of PageRank [PBMW99], which is popular graph analytical methods to originally evaluate Web pages and has been applied for many other domains.

This paper models LD data as data graph (Def. 1) Definition 1 (Data Graph) Given LD data, data graph G = (V; E) is a graph, where set V = R [ L [ B of vertices are union of set R of entities, set L of literals and set B of blank nodes, and set E V P V of edges between vertices with predicates P as labels.

2https://github.com/iai-group/DBpedia-Entity/tree/ master/runs/v2

The subsequent sections introduce naïve baseline approaches and the proposed re-ranking method, PPRSD. Section 3.1 introduces re-ranking methods via PageRank [PBMW99] and personalized PageRank [Hav02], and introduces a preliminary evaluation of these methods. Then, Section 3.2 explains PPRSD which utilizes both results of the state-of-the-art and advantages of personalized PageRank. 3.1

Naïve Graph Analytical Re-ranking

As discussed above, graph analytical approaches are reasonable for re-ranking criteria, however, with a little consideration, global evaluation methods like PageRank do not make sense for ranking entities with respect to input keyword queries. Roughly speaking, PageRank evaluates vertices having lots of incoming links as important. Therefore, when PageRank is applied to the data graph G, PageRank gives an order of vertices which is independent from input queries. Examinations for the global rankings show bad results (this paper does not include this because it is obvious).

In order to test PageRank and personalized PageRank in a re-ranking manner, this work utilizes an insight from the recall@k results in Table 1. The insight is that the top-1000 results by existing methods include more than 80% of relevant results. Thus, the idea of re-ranking with PageRank and personalized PageRank is to filter top-1000 result entities by the existing methods and to apply the graph analytical approaches. To do so, an induced subgraph (Definition 2) for the top-1000 result entities are extracted.

On the induced subgraph G0 extracted from top1000 results, PageRank and personalized PageRank values are calculated as Eqn. 2 and Eqn. 3. In Eqn. 2, pr is a PageRank vector with 1000 length, A is a 1000 1000 adjacency matrix of G0, e is 1000-length vector which elements are all 1, and d is a damping factor which is the probability of random jumps. Similarly, in Eqn 3, pprq is a 1000-length PageRank vector for query q, A is an adjacency matrix as PageRank, s is 1000-length personalized vector for q, which elements corresponding with matching entities for q are 1 and other elements are 0, and d is a damping factor. pr = (1

d) prA + d e pprq = (1 d) pprqA + d s (2) ( 3 )

A preliminary experiment over these naïve reranking methods shows worse results than the stateof-the-art. The preliminary experiment tests the feasibility of aforementioned methods (PageRank and personalized PageRank-based re-ranking methods) on the DBpedia-Entity v2 benchmark [HNX+17]. The reranking approaches are applied for all the state-ofthe-art methods listed in Table 1. Table 2 displays maximum recall@k values among the applied methods of PageRank and personalized PageRank, separately. Amongst PageRank and personalized PageRank, personalized PageRank has achieved better performance than PageRank, therefore, taking relevance to queries into account results better ranking qualities. Comparing recall@k of the state-of-the-art shown in Table 1, the re-ranking methods are mostly worse then them. Consequently, re-ranking methods should more rely on the state-of-the-art. 3.2

Re-ranking by Score Distribution

The preliminary evaluation on the naïve re-ranking methods reveal two facts: one is personalized PageRank-based re-ranking is superior to PageRankbased re-ranking, and the other is the state-of-the-art are still more powerful than simple graph analytical approaches. Therefore, the facts suggest that personalized PageRank-based method with utilizing results of the state-of-the-art can be a better choice. The rest of this section introduces how to realize it.

The main idea of the proposed approach is that utilizing relevance scores for re-ranking algorithm via personalized PageRank. The state-of-the-art rank entities by their own relevance scores, the scores indicate relative relevance degrees among the resulting entities. That is, there are more or less gaps on relevance scores than those on ranks. Additionally, the relevance scores are more sophisticated than just counting matching entities as naïve personalized PageRankbased re-ranking approach (s in Eqn. 3).

To realize this idea, this work arranges the personalized PageRank formulation shown in Eqn. 3 to include the relevance scores calculated by the state-of-the-art as Eqn. 4 called PPRSD (stands for Personalized PageRank based Score Distribution).

pprsdq = (1 d) pprsdqA + d t (4) where pprsdq is a 1000-length relevance score vector of PPRSD. The personalized vector s is redefined as t, where each element ti of entity vi 2 V 0 stores a relevance score of q to vi calculated by one of the state-ofthe-art method. Log likelihood-based relevance scores (i.e., LM, MLM, SDM, FSDM, PRMS, and their variations) are negative values in nature, therefore, these scores are converted to positive numbers by applying exponential function. In addition, the converted scores are quite small (e.g., 10 34) because the values in the log function are products of probabilities, therefore, the converted scores are multiplied by positive number so as to make the scores comparable with those of the other methods. As PageRank-based methods compute the relevance score vectors (i.e., pr in Eqn. 2 and ppr in Eqn. 3), PPRSD also computes the relevance score vector, pprsd, by the power method. Result entities ranked by PPRSD are of ordering in the relevance scores. 4

Experimental Evaluation

The experiment in this paper attempts to confirm the re-ranking method, PPRSD, improves the ranking qualities in terms of both recall@k and NDCG@k. Since PPRSD relies on the results of the state-ofthe-art, this experiment uses the standard benchmark dataset [HNX+17]3 of entity search on DBpedia. Relevance scores for entities in the state-of-the-arts are obtained from the website of the benchmark4. This experimental evaluation attempts to answer the following question: Does the re-ranking method improve the state-of-the-are? And, how large or small the improvements are? In order to answer the question, PPRSD and the state-of-the-art are compared by recall@k (Eqn. 1) and NDCG@k (Eqn. 5) which is a ratio of DCG@k (Eqn. 6) over the ideal value of DCG@k referred as to IDCG@k. 3http://tiny.cc/dbpedia-entity 4https://github.com/iai-group/DBpedia-Entity/tree/ master/runs/v2 0.30 0.25 010.20 lla@0.15 c re0.10 0.05

The subsequent sections discuss the comparison of the ranking qualities between the original methods and the re-ranked methods by PPRSD. More specifically, Section 4.1 introduces an empirical study for determining damping factor d in Eqn. 4, and, based on the choice of the damping factor, Section 4.2 discusses the comparison between the PPRSD-based methods over the original methods. 4.1

Effect of Damping Factor

Figure 1 showcases effects of damping factor in various state-of-the-art which PPRSD is applied, and it reveals that smaller damping factor (i.e., around 0.1) achieves the best performances. In the figure, horizontal axis expresses damping factor which ranges 0 to 1.0 in Figure 1(a) and 0 to 0.2 in Figure 1(b), vertical axis represents recall@10, and lines in the figure represent the state-of-the-art. FSDM-ELR and BM25F-CA perform the best among the state-of-the-art in the figure around damping factor 0.1. In consequence, keeping 10% of relevance scores is the best choice for PPRSD, so d = 0:1 is used for the later experiments. 4.2

Overall Evaluation

Table 3 and Table 4 display the comparisons of values of recall@k for the former and NDCG@k for the latter among PPRSD and the state-of-the-art. The Model row of the table represents tasks of entity search and values of k, and the left-most column shows lists of the state-of-the-art and re-ranked versions (which are represented by *) of them by PPRSD. In addition, each group of rows corresponding with a method includes imp. row which emphasizes the improvement ratio by PPRSD. Cells contain recall@k values, and the best value in a row is emphasized by boldface and underline.

Table 3 shows PPRSD successfully improves ranking qualities of the state-of-the-art. The Total column represents the ranking qualities among all tasks. The column indicates that 7 over 12 methods have been improved by PPRSD in recall@10 and 8 methods have also been improved by PPRSD but 2 methods have been degraded the ranking qualities in recall@100. This indicates that PPRSD successfully improves the ranking qualities. Note that PPRSD improves not only elementary approaches (e.g., BM25) but also more sophisticated approaches (e.g., BM25FCA and FSDM-ELR).

Table 4 also shows PPRSD successfully improves ranking qualities of the state-of-the-art. Recall@k and NDCG@k obviously have correlation, therefore, the improvements should be confirmed as the success shown in recall@k. As expected, the best results are all of PPRSD-based methods. It is worthy to note that the improvement ratios shown in imp. are larger than those of recall@k, indicating that PPRSD improves the rankings not only by just more relevant entities in the rankings but also better positions of relevant entities in the rankings. Since NDCG@k is good at relative comparison between rankings, the results confirm the improvements of rankings. For example, BM25CA* is the best ranking method in terms of recall@10 for QALD-2 task, however, BM25F-CA* is superior to BM25-CA* and the best in terms of NDCG@10 for the same task. This means that BM25F-CA* have less relevant entities in top-10 rankings but more relevant entities are in the earlier positions in the top-10 rankings. Consequently, evaluation based on NDCG@k confirms the ranking improvements by PPRSD. 5

Investigation for Improvements

This section explores possibilities of further improvements for the state-of-the-art and PPRSD-based reranking methods in the following aspect: Why recall@1000 is not 100% yet? Since PPRSD is based on the state-of-the-art, the upper bound of ranking qualities is limited by them and improving the stateof-the-art is also important for further improvement of PPRSD. Therefore, this work investigates the reason why top-1000 results have not been perfect yet. To answer the question, this paper investigates path lengths from relevant entities to entities which literals contain an input keyword query term (detail in Section 5.1). The investigation reveals that there are still space left for including literals within larger distances (i.e., 3, 4, and 5 hops). Obviously, taking longer paths (or sequences of predicates) into account entails explosion of the number literals included into documents of entities. As a result, each entity gets noisy documents, and it is easy to imagine that the noisy documents degrade ranking qualities. To obtain hints for preferable paths for literals, this paper investigates the commonalities of tail predicates in the paths (detail in Section 5.2). The investigation reveals that tail predicates should be different for different lengths of the paths. 5.1

Distance from Query Term

The state-of-the-art rely on terms occurring within two hops at most, modeled as fielded documents. Section 2 introduces the fields of entities taken into account for the state-of-the-art, and the contents field includes contents of one-hop away entities. This implies that no method considers terms occurring within longer hops away.

The fielded document construction limits the possibilities to reach to the relevant results due to the absence of query terms in the documents. This fact is estimated from the preliminary evaluation on recall@k in Table 1, that is, recall@1000 values are less than 86% (except SemSearch ES task which is designed for direct matching with terms). In other words, 14% are below top-1000 results.

In order to answer question why recall@1000 is not perfect?, this paper attempts to realize the relation between relevant answers and the numbers of hops from query terms. To this end, this work investigates the minimum distances from relevant entities to query terms by performing SPARQL queries in terms of the distances. SPARQL queries are generated with a graph pattern of a sequential path from given entity r 2 R to literal ` 2 L which contains query term t, and predicates and resources between r and ` are fulfilled by free variables. Figure 3 illustrates a n-length graph pattern for entity r and query term t. Based on the pattern, ASK query (which is an indicator function query in SPARQL) is generated to examine such pattern exists. Following SPARQL query displays examples of generated ASK queries for distance 2.

ASK{ hri ?p0 ?v0. ?v0 ?p1 ?v1.

?v1 ?p2 ?l. ?l bif:contains ’t’.

FILTER isLiteral(?l).}

This investigation measures the minimum distance which satisfies the ASK query corresponding with the distance. The procedure of this investigation is that: (1) given query q, relevant entity list Aq for q is obtained from the benchmark dataset; (2) parse q into set Tq of terms; ( 3 ) examine ASK queries from length 0 to maximum length (5 for this investigation) for each pair of relevant entity r 2 Aq and term t 2 Tq; (4) as soon as the ASK query is satisfied, the distance is recorded; and (5) the obtained distances for each relevant entities are analyzed. Obtained distances for a relevant entity of a query may be different term by term. Therefore, this investigation analyses minimum distance, average distance and maximum distance for each relevant entity of a query. Consequently, these distances are individually gathered and calculate their averages to observe how long distances required to touch query terms from relevant entities.

Figure 2 showcases the analyzed distances with respect to tasks as well as with regardless of tasks (i.e., Total ). In the figure, bars represent ratios of relevant entities having the number of hops (distances) to reach from query terms, and dashed lines express cumulative ratios of relevance entities. Three kinds of bars (lightgray and oblique stripe bars, gray and horizontal stripe bars, and black and crossing stripe bars) correspond 1.0 lts0.8 u s re0.6 f ioo0.4 t aR0.2 0.0 0 The1numb2er of3hops 4 (a) Total with minimum, average (rounded), and maximum distances. Similarly, three kinds of lines (lines with triangles, circles, and squares) correspond with minimum, average (rounded), and maximum.

Figure 2 indicates that at least one term is included in literals directly connected with relevant entities, and Figure 2(a) indicates that most of the relevant entities are reachable from query terms within two hops on average, however, in terms of maximum distances, still more than 10% relevant entities are not reachable within three hops. This fact answers the question why recall@1000 is not perfect? as some relevant entities are still not found by the query terms due to the smaller distances to construct entity documents. This phenomenon is also marked on individual tasks except SemSeach ES task, which is a simple tasks so that queries in the task are more directly explaining requiring entities than others. 5.2

Commonality of Tail Predicates of Paths

A simple solution for improving ranking qualities in terms of the previous investigation is top include literals within more hops (i.e., 3 or more), however, it is obvious that the solution incurs noisy entity documents by including unnecessary literals within larger hops. The number of reachable entities in G increases very quickly as the distance increases. Therefore, irrelevant entities contribute to entity documents.

An intuition to avoid this situation is to select “good” paths from an entity which include meaningful literals for the entity. A naïve extension is to find paths from an entity to “good” entities and to include their documents (suppose the same approach to the state-ofthe-art) into the document of the entity. This paper wants to clarify there is any difference between selfdescriptive literals and supportive literals for other entities. Self-descriptive literals explain well about target entities, while supportive literals explain supplemental facts about the target entities. Self-descriptive literals tend to be close to the targets, while supportive literals tend to relatively distant from the targets. Therefore, this investigation attempts to understand the differences of predicates with ending literals (called tail predicates) between shorter and longer paths. The investigation is done in the following procedure: gathers surveyed paths for each relevant entities using intermediate results of the previous investigation (Section 5.1), and analyzes the paths in terms of commonalities of the tail predicates. The commonalities are measured for different lengths (i.e., 1 to 4) of the tail predicate sequences by Jaccard index, Jaccard (Yir; Yjr) = jjYiir[\YYjjrrjj , Y r where Yir is a set of i-length tail predicates of entity r and j j is cardinality.

Table 5 display commonalities of tail predicates among different lengths of paths for different tail lengths. The results reveal that commonalities of tail predicates decrease as differences of path lengths increase. This fact indicates that literals reachable in different path lengths should select different tail predicates (e.g., rdfs:label is not always a good choice.). Due to the tremendous number of tail predicates, the detailed analysis on what kind tail predicates are preferable in particular path lengths is left for future work. Examples from rough analysis include rdfs:label and rdfs:comment for 1-length paths and dbo:wikiPageWikiLinkText for 5-length paths. of-the-art by formulating as a re-ranking problem. For further improvements, this paper reports two investigations about relationship between paths to literals containing query terms and relevant entities to queries. Results of the investigations support the improvement possibilities of graph analytical approaches, and developing them is still left for future works. Consequently, this paper answers to the first question as Yes, they are appropriate, but there is still an issue on matching entities in a graph.

Acknowledgments.

This work was partly supported by JSPS KAKENHI Grant Number JP18K18056.

This section investigates relationship between paths and result entities in order to answer the question Why recall@1000 is not 100% yet? The answers of this investigation are 2-fold: (1) literals in distant paths are absent from documents; and (2) setting of tail predicates is universal for all lengths of paths. Although, the first problem is obvious, still the number of reachable entities in more than two hops is extraordinary large, therefore, constructing documents from longer paths is computationally expensive. Additionally, in the naïve approach, the generated documents may include large amount of not quite relevant facts to entities.

Future Directions.

To overcome the aforementioned problems, solutions lies on graph analytical approaches as Section 3 showcases their possibilities. A basic idea is to select appropriate reachable predicates and entities within two or more hops. To this end, graph analytical approaches (e.g., PageRank, Random Walks) can be good choices. As discussed in this paper, nonpersonalized PageRank and its families are not appropriate, meaning that global centralities do not help. Therefore, customizable graph analytical approaches such as ObjectRank [BHP04] and random walk with restart (RWR) [TFP08] are preferable. There are some preliminary works based on this idea, namely FORK [KOAK17], and RWRDoc [Kom18]. FORK has applies ObjectRank for entity search and it achieves better precision@k. While, RWRDoc has applies RWR for determining importances of reachable entities in terms of RWR scores and it slight improves in terms of NDCG@k. These indicates that graph analytical approaches still leave space for improvements. 6

Conclusion

This paper deals with entity search over Linked Data, analyzes the state-of-the-art in terms of recall@k, and reveals the possibilities of improvements of the stateof-the-art. Also, this paper indicates the feasibility of graph analytical approaches for improving the state[BHB09] [BHP04] [Has18] [Hav02] [HBB16]

Christian Bizer, Tom Heath, and Tim Berners-Lee. Linked Data - The Story

So Far. Int. J. Semantic Web Inf. Syst., 5( 3 ):1–22, 2009.

Andrey Balmin, Vagelis Hristidis, and

Yannis Papakonstantinou. ObjectRank: Authority-Based Keyword Search in Databases. In VLDB 2004, pages 564–575, 2004.

Faegheh Hasibi. Semantic Search with Knowledge Bases. PhD thesis, Norwegian University of Science and Technology, Trondheim, Norway, 2018.

Taher H. Haveliwala. Topic-sensitive PageRank. In WWW 2002, pages 517– 526, 2002. Faegheh Hasibi, Krisztian Balog, and

Svein Erik Bratsberg. Exploiting Entity Linking in Queries for Entity Retrieval.

In ICTIR 2016, pages 209–218, 2016. [HNX+17] Faegheh Hasibi, Fedor Nikolaev, Chenyan Xiong, Krisztian Balog, Svein Erik Bratsberg, Alexander Kotov, and Jamie Callan.

DBpedia-Entity v2: A Test Collection for

Entity

Search . In SIGIR 2017 , pages 1265 - 1268 , 2017 .

[KOAK17]

Takahiro

Komamizu , Sayami Okumura, Toshiyuki Amagasa, and

Hiroyuki

Kitagawa . FORK: Feedback-Aware ObjectRank-Based Keyword Search over Linked Data . In AIRS 2017 , pages 58 - 70 , 2017 .

[PBMW99]

Lawrence

Page , Sergey Brin, Rajeev Motwani, and

Terry

Winograd . The PageRank Citation Ranking: Bringing Order to the Web . Technical Report 1999-66 , November 1999 .

Napolitano.

7th Open Challenge on Question Answering over Linked Data (QALD7) . In ESWC 2017 , pages 59 - 69 , 2017 .

[WKC+12] Qiuyue

Wang

, Jaap Kamps, Georgina Ramírez Camps, Maarten Marx, Anne Schuth, Martin Theobald, Sairam Gurajada, and

Arunav

Mishra . Overview of the INEX 2012 Linked Data Track . In CLEF 2012 Evaluation Labs and Workshop , 2012 .

[ZKN15]

Nikita

Zhiltsov , Alexander Kotov, and

Fedor

Nikolaev . Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of Data . In SIGIR 2015 , pages 253 - 262 , 2015 .

[Kom18] [KXC09] [MC05] [OC03] [PC98] [PMZ10] [RZ09] [TFP08] [UNH +17] Ricardo

Usbeck

, Axel-Cyrille Ngonga

Ngomo

, Bastian Haarmann, Anastasia Krithara,

Michael

Röder , and Giulio Takahiro Komamizu. Learning Interpretable Entity Representation in Linked Data . In DEXA 2018 , 2018 . (to appear).

Jinyoung

Kim , Xiaobing Xue, and

W. Bruce

Croft . A Probabilistic Retrieval Model for Semistructured Data . In ECIR 2009 , pages 228 - 239 , 2009 .

Donald

Metzler and

W. Bruce

Croft . A Markov Random Field Model for Term Dependencies . In SIGIR 2005 , pages 472 - 479 , 2005 .

Paul

Ogilvie and

James P.

Callan . Combining Document Representations for Known-item Search . In SIGIR 2003 , pages 143 - 150 , 2003 .

Jay M. Ponte and W. Bruce

Croft . A Language Modeling Approach to Information Retrieval . In SIGIR 1998 , pages 275 - 281 , 1998 .

Jeffrey

Pound ,

Peter

Mika , and

Hugo

Zaragoza . Ad-hoc Object Retrieval in the Web of Data . In WWW 2010 , pages 771 - 780 , 2010 .

Stephen E.

Robertson and

Hugo

Zaragoza . The Probabilistic Relevance Framework: BM25 and Beyond .

FTIR , 3 ( 4 ): 333 - 389 , 2009 .

Inf. Syst. , 14 ( 3 ): 327 - 346 , 2008 .