<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>PageRank on Wikipedia: Towards General Importance Scores for Entities</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreas Thalhammer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Achim Rettinger</string-name>
          <email>achim.rettinger@kit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>AIFB, Karlsruhe Institute of Technology</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Link analysis methods are used to estimate importance in graph-structured data. In that realm, the PageRank algorithm has been used to analyze directed graphs, in particular the link structure of the Web. Recent developments in information retrieval focus on entities and their relations (i.e., knowledge graph panels). Many entities are documented in the popular knowledge base Wikipedia. The cross-references within Wikipedia exhibit a directed graph structure that is suitable for computing PageRank scores as importance indicators for entities. In this work, we present different PageRank-based analyses on the link graph of Wikipedia and corresponding experiments. We focus on the question whether some links, based on their position in the article text, can be deemed more important than others. In our variants, we change the probabilistic impact of links in accordance with their position on the page and measure the effects on the output of the PageRank algorithm. We compare the resulting rankings and those of existing systems with page-view-based rankings and provide statistics on the pairwise computed Spearman and Kendall rank correlations.</p>
      </abstract>
      <kwd-group>
        <kwd>Wikipedia</kwd>
        <kwd>DBpedia</kwd>
        <kwd>PageRank</kwd>
        <kwd>link analysis</kwd>
        <kwd>page views</kwd>
        <kwd>rank correlation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Entities are omnipresent in the landscape of modern information extraction and
retrieval. Application areas range from natural language processing and
recommender systems to question answering. For many of these application areas it
is essential to build on objective importance scores of entities. One of the most
successful among the different methods is the PageRank algorithm [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. It has been
shown to provide objective relevance scores for hyperlinked documents, e.g., in
Wikipedia [
        <xref ref-type="bibr" rid="ref5 ref6 ref9">5,6,9</xref>
        ]. Wikipedia serves as a rich source for entities and their
descriptions. Its content is currently used by major Web search engine providers
as a source for short textual summaries that are presented in knowledge graph
panels. In addition, the link structure of Wikipedia has been shown to exhibit
the potential to compute meaningful PageRank scores: connected with
semantic background information (such as DBpedia [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]) the PageRank scores over the
Wikipedia link graph enable rankings of entities of specific types, for example for
scientists (see Listing 1.1).
        <preformat># Listing 1.1. Example: SPARQL query on DBpedia for retrieving top-10 scientists
# ordered by PageRank (can be executed at http://dbpedia.org/sparql).
# The rdf: and dbo: prefixes are predefined on that endpoint; they are
# declared here so that the query is self-contained.
PREFIX rdf: &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt;
PREFIX dbo: &lt;http://dbpedia.org/ontology/&gt;
PREFIX v: &lt;http://purl.org/voc/vrank#&gt;
SELECT ?e ?r
FROM &lt;http://dbpedia.org&gt;
FROM &lt;http://people.aifb.kit.edu/ath/#DBpedia_PageRank&gt;
WHERE {
  ?e rdf:type dbo:Scientist ;
     v:hasRank/v:rankValue ?r .
} ORDER BY DESC(?r) LIMIT 10</preformat>
        Although the provided PageRank scores [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] exhibit
reasonable output in many cases, they are not always easily explicable. For
example, as of DBpedia version 2015-04, "Carl Linnaeus" (512) has a much higher
PageRank score than "Charles Darwin" (206) and "Albert Einstein" (184)
together in the result of the query in Listing 1.1. The reason is easily identified
by examining the articles that link to the article of "Carl Linnaeus":<sup>1</sup> Most
articles use the template Taxobox<sup>2</sup> that defines the field binomial authority.
It becomes evident that the page of "Carl Linnaeus" is linked very often
because Linnaeus classified species and gave them a binomial name (cf. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]). In
general, entities from the geographic and biological domains have distinctively
higher PageRank scores than most entities from other domains. While, given
the high inter-linkage of these domains, this is expected to some degree, articles
such as "Bakhsh" (1913), "Provinces of Iran" (1810), "Lepidoptera" (1778), or
"Powiat" (1408) occur in the top-50 list of all things in Wikipedia,
according to DBpedia PageRank 2015-04 [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] (see Table 5). These points lead
us to the question whether these rankings can be improved. Unfortunately, this
is not a straightforward task as a gold standard is missing and rankings are
often subjective.
      </p>
      <p>In this work we investigate different link extraction<sup>3</sup> methods that address
the root causes of the effects stated above. We focus on the question whether
some links, based on their position in the article text, can be deemed more
important than others. In our variants, we change the probabilistic impact of
links in accordance with their position on the page and measure the effects on the
output of the PageRank algorithm. We compare these variants and the rankings
of existing systems with page-view-based rankings and provide statistics on the
pairwise computed Spearman and Kendall rank correlations.</p>
      <p><sup>1</sup> Articles that link to "Carl Linnaeus": https://en.wikipedia.org/wiki/Special:WhatLinksHere/Carl_Linnaeus</p>
      <p><sup>2</sup> Template:Taxobox: https://en.wikipedia.org/wiki/Template:Taxobox</p>
      <p><sup>3</sup> With "link extraction" we refer to the process of parsing the wikitext of a Wikipedia
article and correctly identifying and filtering hyperlinks to other Wikipedia articles.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>In this section we provide additional background on the PageRank variants used,
link extraction from Wikipedia, and redirects in Wikipedia.</p>
      <sec id="sec-2-1">
        <title>PageRank Variants</title>
        <p>
          The PageRank algorithm follows the idea of a user that browses Web sites by
following links in a random fashion (random surfer). For computing PageRank,
we use the original PageRank formula [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and a weighted version [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] that accounts
for the position of a link within an article.
        </p>
        <p>
          Original PageRank [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]: On the set of Wikipedia articles W, we use
individual directed links link(w<sub>1</sub>, w<sub>2</sub>) with w<sub>1</sub>, w<sub>2</sub> ∈ W, in particular the set of
pages that link to a page, l(w) = {w<sub>1</sub> | link(w<sub>1</sub>, w)}, and the count of
outgoing links, c(w) = |{w<sub>1</sub> | link(w, w<sub>1</sub>)}|. The PageRank of a page w<sub>0</sub> ∈ W is
computed as follows:
          <disp-formula><label>(1)</label><tex-math>pr(w_0) = (1 - d) + d \sum_{w_n \in l(w_0)} \frac{pr(w_n)}{c(w_n)}</tex-math></disp-formula>
        </p>
        <p>
          Weighted Links Rank (WLRank) [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]: In order to account for the relative
position of a link within an article, we adapt Formula (1) and introduce link
weights. The idea is that the random surfer is likely not to follow every link
on the page with the same probability but may prefer those that are at the
top of a page. The WLRank of a page w<sub>0</sub> ∈ W is computed as follows:
          <disp-formula><label>(2)</label><tex-math>pr(w_0) = (1 - d) + d \sum_{w_n \in l(w_0)} pr(w_n) \frac{lw(link(w_n, w_0))}{\sum_{w_m} lw(link(w_n, w_m))}</tex-math></disp-formula>
        </p>
        <p>The link weight function lw is defined as follows:
          <disp-formula><label>(3)</label><tex-math>lw(link(w_1, w_2)) = 1 - \frac{first\_occurrence(link(w_1, w_2), w_1)}{|tokens(w_1)|}</tex-math></disp-formula>
For tokenization we split the article text at white
spaces but do not split up links (e.g., [[brown bear|bears]] is treated
as one token). The token numbering starts from 1, i.e., the first word/link
of an article. The method first_occurrence returns the token number of the
first occurrence of a link within an article.</p>
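        <p>As an illustration, the link weight of Formula (3) could be computed along the following lines. This is a minimal Python sketch and not the implementation used in this work: the names TOKEN and link_weights are hypothetical, and, as a simplifying assumption, text that is directly attached to a link (such as a trailing comma) is counted as a separate token.</p>
        <preformat>
```python
import re

# Assumption: a token is either a whole wikilink or a run of
# non-whitespace characters that does not start a wikilink.
TOKEN = re.compile(r"\[\[.*?\]\]|(?:(?!\[\[)\S)+")


def link_weights(wikitext):
    """Map each link target to lw = 1 - first_occurrence / |tokens|."""
    tokens = TOKEN.findall(wikitext)
    weights = {}
    # Token numbering starts at 1, i.e. the first word/link of the article.
    for number, token in enumerate(tokens, start=1):
        if token.startswith("[[") and token.endswith("]]"):
            # [[target|anchor text]] links to "target".
            target = token[2:-2].split("|", 1)[0].strip()
            # setdefault keeps the weight of the first occurrence only.
            weights.setdefault(target, 1 - number / len(tokens))
    return weights


example = "The [[brown bear|bears]] live in [[Europe]] and [[Asia]] today"
weights = link_weights(example)
```
</preformat>
        <p>In this eight-token example the link to "brown bear" is the second token and therefore receives a higher weight than the links that follow it.</p>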
        <p>Both formulas (1) and (2) are iteratively applied until the scores converge. The
variable d marks the damping factor: in the random surfer model, it accounts
for the possibility of accessing a page via the browser's address bar instead of
accessing it via a link from another page.</p>
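        <p>The iteration described above can be sketched in a few lines of Python. The function name pagerank and the dictionary-based graph layout are assumptions made for illustration, and the guard against pages whose link weights sum to zero is likewise an assumption that the formulas do not spell out. With all link weights set to 1.0 the update corresponds to Formula (1); with position-based weights it corresponds to Formula (2).</p>
        <preformat>
```python
from collections import defaultdict


def pagerank(weights, d=0.85, iterations=40, start=0.1):
    """Non-normalized PageRank over weighted links.

    weights maps (source, target) to a link weight lw. If every weight
    is 1.0, the update equals Formula (1); otherwise it is Formula (2).
    """
    pages = set()
    out_sum = defaultdict(float)  # sum of lw over all out-links of a page
    incoming = defaultdict(list)  # target -> list of (source, lw)
    for (source, target), lw in weights.items():
        pages.update((source, target))
        out_sum[source] += lw
        incoming[target].append((source, lw))

    pr = {page: start for page in pages}
    for _ in range(iterations):
        # Every score is recomputed from the scores of the previous round.
        pr = {
            page: (1 - d) + d * sum(
                pr[source] * lw / out_sum[source]
                for source, lw in incoming[page]
                if out_sum[source] > 0  # assumption: skip zero-weight sources
            )
            for page in pages
        }
    return pr


# Toy graph: A and B link to C, and C links back to A.
links = {("A", "C"): 1.0, ("B", "C"): 1.0, ("C", "A"): 1.0}
scores = pagerank(links)
```
</preformat>
        <p>In this toy graph, page B has no incoming links and therefore receives the minimum score 1 - d = 0.15, while the sum of all scores approaches the number of pages.</p>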
        <p>For reasons of presentation, we use the non-normalized version of PageRank
in both cases. In contrast to the normalized version, the sum of all computed
PageRank scores is the number of articles (instead of 1) and, as such, does not
reflect a statistical probability distribution. However, normalization does not
influence the final ranking and the resulting relations of the scores.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Wikipedia Link Extraction</title>
        <p>
          In order to create a Wikipedia link graph we need to clarify which types of
links are considered. The input for the rankings of [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] is a link graph that is
constructed by the DBpedia Extraction Framework<sup>4</sup> (DEF). The DBpedia
extraction is based on Wikipedia database backup dumps<sup>5</sup> that contain the
non-rendered wikitexts of the Wikipedia articles and templates. From these sources,
DEF builds a link graph by extracting links of the form [[article|anchor
text]]. We distinguish two types of links with respect to templates:<sup>6</sup>
          <list list-type="order">
            <list-item><p>Links that are defined in the Wikipedia text but do not occur within a
template, for example "[[brown bear|bears]]" outside {{ and }}.</p></list-item>
            <list-item><p>Links that are provided as (a part of) a parameter to the template, for
example "[[brown bear|bears]]" inside {{ and }}.</p></list-item>
          </list>
        </p>
        <p>DEF considers only these two types of links and not any additional ones that
result from the rendering of an article. It also has to be noted that DEF does not
consider links from category pages. This mostly affects links to parent categories
as the other links that are presented on a rendered category page (i.e., all articles
of that category) do not occur in the wikitext. If these links were considered, the accumulated
PageRank of a category page would be transferred almost 1:1 to its parent
category. This would lead to a top-100 ranking of things that consists mostly of category
pages. In addition, DEF does not consider links in references (denoted via
&lt;ref&gt; tags).</p>
        <p>In this work, we describe how we performed more general link extraction
from Wikipedia. Unfortunately, in this respect, DEF exhibited certain
inflexibilities as it processes Wikipedia articles line by line. This made it difficult to
regard links in the context of an article as a whole (e.g., in order to determine
the relative position of a link). In consequence, we reverse-engineered the link
extraction parts of DEF and created the SiteLinkExtractor<sup>7</sup> tool. The tool
makes it possible to execute multiple extraction methods in a single pass over all articles
and can also be extended by additional extraction approaches.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Redirected vs. Unredirected Wikipedia Links</title>
        <p>DBpedia offers two types of page link datasets:<sup>8</sup> one in which the redirects are
resolved and one in which they are contained. In principle, redirect chains
of more than one hop are also possible but, in Wikipedia, the MediaWiki software is
configured not to follow such redirect chains (called "double redirects"
in Wikipedia)<sup>9</sup> automatically, and various bots are in place to remove them. As
such, we can assume that only single-hop redirects are in place. However, as
performed by DBpedia, single-hop redirects can also be resolved (see Figure 1).
Alternatively, for various applications (especially in NLP) it can make sense to
keep redirect pages, as redirect pages also have a high number of inlinks in various
cases (e.g., "Countries of the world").<sup>10</sup> However, with reference to Figure 1 and
assuming that redirect pages only link to the redirect target, B passes most of
its own PageRank score on to C (note that the damping factor is in place).</p>
        <p>[Figure 1; panel labels: PL, PLR]</p>
        <p><sup>4</sup> DBpedia Extraction Framework: https://github.com/dbpedia/extraction-framework/wiki</p>
        <p><sup>5</sup> Wikipedia dumps: http://dumps.wikimedia.org/</p>
        <p><sup>6</sup> Template inclusions are marked by double curly brackets, i.e., {{ and }}.</p>
        <p><sup>7</sup> SiteLinkExtractor: https://github.com/TBritsch/SiteLinkExtractor</p>
        <p><sup>8</sup> DBpedia PageLinks: http://wiki.dbpedia.org/Downloads2015-04</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Link Graphs</title>
      <p>We implemented five Wikipedia link extraction methods that make it possible to create
different input graphs for the PageRank algorithm. In general we follow the
example of DEF and consider type 1 and 2 links for extraction (which form a
subset of those that occur in a rendered version of an article). The following
extraction methods were implemented:</p>
      <p>All Links (ALL) This extractor produces all type 1 and 2 links. This is the
reverse-engineered DEF method. It serves as a reference.</p>
      <p>Article Text Links (ATL) This measure omits links that occur in text that
is provided to Wikipedia templates (i.e., it includes type 1 links and omits type 2
links). The relation to ALL is as follows: ATL ⊆ ALL.</p>
      <sec id="sec-3-1">
        <title>Article Text Links with Relative Position (ATL-RP)</title>
        <p>This measure extracts all links from the Wikipedia text (type 1 links) and produces a score
for the relative position of each link (see Formula 3). In effect, the link graph
ATL-RP is the same as ATL but uses edge weights based on each link's
position.</p>
        <p>Abstract Links (ABL) This measure extracts only the links from Wikipedia
abstracts. We chose the definition of DBpedia which defines an abstract as</p>
        <p><sup>9</sup> Wikipedia: Double redirects: https://en.wikipedia.org/wiki/Wikipedia:Double_redirects</p>
        <p><sup>10</sup> Inlinks of "Countries of the world": https://en.wikipedia.org/wiki/Special:WhatLinksHere/Countries_of_the_world</p>
        <p>Redirects are not resolved in any of the above methods. We execute the
introduced extraction mechanisms on dumps of the English (2015-02-05) and
German (2015-02-11) Wikipedia. The respective dates are aligned with the input
of DEF with respect to DBpedia version 2015-04.<sup>12</sup> Table 1 provides an overview
of the number of extracted links per link graph.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments</title>
      <p>In our experiments, we first computed PageRank on the introduced link graphs.
We then measured the pairwise rank correlations (Spearman's ρ and Kendall's
τ)<sup>13</sup> between these rankings and the reference datasets (of which three are also
based on PageRank and two are based on page-view data of Wikipedia). With
the resulting correlation scores, we investigated the following hypotheses:</p>
      <p>H1 Links in templates are created in a "please fill out" manner and rather
negatively influence the general salience that PageRank scores should
represent.</p>
      <p>H2 Links that are mentioned at the beginning of articles are more often clicked
and correlate with the number of page views that the target page receives.</p>
      <p>H3 The practice of resolving redirects does not strongly impact the final
ranking in accordance with PageRank scores.</p>
      <sec id="sec-4-1">
        <title>PageRank Configuration</title>
        <p>We computed PageRank with the following parameters on the introduced link
graphs ALL, ATL, ATL-RP, ABL, and TEL: non-normalized, 40 iterations,
damping factor 0.85, start value 0.1.</p>
        <p><sup>11</sup> DBpedia abstract extraction: http://git.io/vGZ4J</p>
        <p><sup>12</sup> DBpedia 2015-04 dump dates: http://wiki.dbpedia.org/servicesresources/datasets/dataset-2015-04/dump-dates-dbpedia-2015-04</p>
        <p><sup>13</sup> Both measures have a value range from -1 to 1 and are specifically designed for
measuring rank correlation.</p>
        <p>
          DBpedia PageRank (DBP) The scores of DBpedia PageRank [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] are based
on the "DBpedia PageLinks" dataset (i.e., Wikipedia PageLinks as extracted
by DEF, redirected). The computation was performed with the same
configuration as described in Section 4.1. The scores are regularly published as
TSV and Turtle files. The Turtle version uses the vRank vocabulary [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Since
DBpedia version 2015-04, the DBP scores are included in the official
DBpedia SPARQL endpoint (cf. Listing 1.1 for an example query). In this work,
we use the following versions of DBP scores based on English Wikipedia:
2014, 2015-04.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>DBpedia PageRank Unredirected (DBP-U)</title>
        <p>This dataset is computed in the same way as DBP but uses the "DBpedia PageLinks Unredirected"
dataset.<sup>14</sup> As the name suggests, Wikipedia redirects are not resolved in this
dataset (see Section 2.3 for more background on redirects in Wikipedia). We
use the 2015-04 version of DBP-U.</p>
        <p>SubjectiveEye3D (SUB) Paul Houle aggregated the Wikipedia page views
of the years 2008 to 2013 with different normalization factors (particularly
considering the dimensions articles, language, and time).<sup>15</sup> As such,
SubjectiveEye3D reflects the aggregated chance for a page view of a specific
article in the interval from 2008 to 2013. However, similar to unnormalized
PageRank, the scores need to be interpreted in relation to each other (i.e.,
the scores do not reflect a proper probability distribution as they do not add
up to one).</p>
      </sec>
      <sec id="sec-4-3">
        <title>The Open Wikipedia Ranking - Page Views (TOWR-PV)</title>
        <p>"The Open Wikipedia Ranking"<sup>16</sup> provides scores on page views. The data is described
as "the number of page views in the last year" on the project's Web site.</p>
        <p>The two page-view-based rankings serve as a reference in order to evaluate
the different PageRank rankings. We show the number of entities covered by the
PageRank datasets and the entity overlap with the page-view-based rankings in
Table 2.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Results</title>
        <p>We used MATLAB for computing the pairwise Spearman's ρ and Kendall's τ
correlation scores. The Kendall's τ rank correlation measure has O(n<sup>2</sup>)
complexity and takes a significant amount of time for large matrices. In order to
speed this up, we sampled the data matrix by a random selection of 1M rows for
Kendall's τ. The pairwise correlation scores of ρ and τ are reported in Tables 3
and 4 respectively. The results are generally as expected: for example, the
page-view-based rankings correlate strongest with each other. Also, DBP-U 2015-04
and ALL have a very strong correlation (these rankings should be equal).</p>
        <p><sup>14</sup> DBpedia PageLinks Unredirected: http://downloads.dbpedia.org/201504/core-i18n/en/page-links-unredirected_en.nt.bz2</p>
        <p><sup>15</sup> SubjectiveEye3D: https://github.com/paulhoule/telepath/wiki/SubjectiveEye3D</p>
        <p><sup>16</sup> The Open Wikipedia Ranking: http://wikirank.di.unimi.it/</p>
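        <p>For readers without MATLAB, the two correlation measures and the row sampling can be approximated with the following self-contained Python sketch. All function names are hypothetical. As a simplification, kendall computes the tau-a variant, which ignores ties, so its values may differ from a tie-corrected implementation; the seed parameter is an assumption added for reproducibility.</p>
        <preformat>
```python
import random


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while len(order) > i:
        j = i
        # Extend j over the run of values that are tied with position i.
        while len(order) > j + 1 and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall's tau-a: compares all n * (n - 1) / 2 pairs, hence O(n^2)."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif 0 > sign:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def sampled_kendall(x, y, sample_size, seed=0):
    """Kendall's tau-a on a random selection of rows to limit the runtime."""
    rows = random.Random(seed).sample(range(len(x)), min(sample_size, len(x)))
    return kendall([x[i] for i in rows], [y[i] for i in rows])
```
</preformat>
        <p>Both functions expect two score lists in which position i refers to the same entity, which corresponds to restricting the comparison to the entity overlap of two rankings.</p>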
        <p>
          H1 seems to be supported by the data as the TEL PageRank scores correlate
worst with any other ranking. However, ATL does not correlate better with SUB
and TOWR-PV than ALL. This indicates that the reason for the bad correlation
might not be the "bad semantics of links in the infobox". With random
samples on ATL, which produced similar results, we found that the computed
PageRank values of TEL are mostly affected by the low total link count (see
Table 1). With respect to the initial example, the PageRank score of "Carl
Linnaeus" is reduced to 217 in ATL. However, a generally better performance of
ATL is not noticeable with respect to the comparison to SUB and TOWR-PV.
We assume that PageRank on DBpedia's RDF data results in similar scores
as TEL as DBpedia [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] extracts its semantic relations mostly from Wikipedia's
infoboxes.
        </p>
        <p>
          Indicators for H2 are the scores of ABL and ATL-RP. However, similar to
TEL, ABL does not produce enough links for a strong ranking. ATL-RP, in
contrast, produces the strongest correlation with SUB. This is an indication
that articles that are linked at the beginning of a page are indeed more often
clicked. This is supported by related findings where actual HTTP referrer data
was analyzed [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>With respect to H3, we expected DBP-U 2015-04 and DBP 2015-04 to
correlate much more strongly, but DEF does not implement the full workflow of Figure
1: although it introduces a link A → C and removes the link A → B, it does not
remove the link B → C. As such, the article B occurs in the final entity set with
the lowest PageRank score of 0.15 (as it has no incoming links). In contrast, these
pages often accumulate PageRank scores of 1000 and above in the unredirected
datasets. If B did not occur in the final ranking of DBP 2015-04, it would not
be considered by the rank correlation measures. This explains the comparatively
weak correlation between the redirected and unredirected datasets.</p>
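        <p>A complete resolution step, as opposed to the partial one described above, might look like the following Python sketch. It is an illustration under stated assumptions and not DEF's implementation: the function name resolve_redirects and the data layout (a set of (source, target) pairs plus a dictionary from redirect page to redirect target) are hypothetical, and only single-hop redirects are assumed.</p>
        <preformat>
```python
def resolve_redirects(links, redirects):
    """Rewrite links so that redirect pages disappear from the graph.

    links is a set of (source, target) pairs; redirects maps a redirect
    page B to its target C. A link A -> B becomes A -> C, and the link
    B -> C that originates from the redirect page itself is dropped.
    """
    resolved = set()
    for source, target in links:
        if source in redirects:
            continue  # drop B -> C so that B does not remain as an entity
        resolved.add((source, redirects.get(target, target)))
    return resolved


page_links = {("A", "B"), ("B", "C"), ("D", "C")}
resolved_links = resolve_redirects(page_links, {"B": "C"})
```
</preformat>
        <p>With this variant the redirect page B no longer appears in the resolved graph, so it would not enter the final ranking with an artificially low score.</p>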
      </sec>
      <sec id="sec-4-5">
        <title>Conclusions</title>
        <p>Whether links from templates are excluded or included in the input link graph
does not strongly impact the quality of rankings produced by PageRank.
WLRank on articles produces the best results with respect to the correlation to
page-view-based rankings. In general, although there is a correlation, we assume
that link-based and page-view-based rankings are complementary. This is supported by
Table 5, which contains the top-50 scores of SUB, DBP 2015-04, and ATL-RP:
The PageRank-based measures are strongly influenced by articles that relate
to locations (e.g., countries, languages, etc.) as they are highly interlinked and
referenced by a very high fraction of Wikipedia articles. In contrast, the
page-view-based ranking of SubjectiveEye3D covers topics that are frequently accessed
and mostly relate to pop culture and important historical figures or events. We
assume that a strong and more objective ranking of entities is probably achieved
by combining link-structure and page-view-based rankings on Wikipedia. In
general, and especially for applications that deal with NLP, we recommend using
the unredirected version of DBpedia PageRank.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Related Work</title>
      <p>
        This work is influenced and motivated by an initial experiment that was
performed by Paul Houle: In the GitHub project documentation of SubjectiveEye3D
he reports on Spearman and Kendall rank correlations between
SubjectiveEye3D and DBpedia PageRank [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The results are similar to our computations.
The normalization that has been carried out on the SUB scores mitigates the
effect of single peaks and makes an important contribution towards providing
objective relevance scores. The work of Eom et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] investigates the
differences between 24 language editions of Wikipedia with PageRank, 2DRank,
and CheiRank rankings. The analysis focuses on the rankings of the top-100
persons in each language edition. We consider this analysis as seminal work for
investigations on mining cultural differences with Wikipedia rankings. This is an
interesting topic as different cultures use the same Wikipedia language edition
(e.g., United Kingdom and the United States). Similarly, the work of Lages et al.
provides rankings of universities of the world in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Again, 24 language editions
were analyzed with PageRank, 2DRank, and CheiRank. PageRank is shown to
be efficient in producing rankings similar to the "Academic Ranking of World
Universities (ARWU)" (that is provided yearly by the Shanghai Jiao Tong
University). In a recent work, Dimitrov et al. introduce a study on the link traversal
behavior of users within Wikipedia with respect to the positions of the followed
links. Similar to our finding, the authors conclude that a great fraction of clicked
links can be found in the top part of the articles.
      </p>
      <p>Comparing ranks on Wikipedia is an important topic and with our
contribution we want to emphasize the need for considering the signals "link graph"
and "page views" in combination.</p>
    </sec>
    <sec id="sec-6">
      <title>Summary &amp; Future Work</title>
      <p>In this work, we compared different input graphs for the PageRank algorithm,
the impact on the scores, and the correlation to page-view-based rankings. The
main findings can be summarized as follows:
        <list list-type="order">
          <list-item><p>Removing template links has no general influence on the PageRank scores.</p></list-item>
          <list-item><p>The results of WLRank with respect to the relative position of a link
indicate a better correlation to page-view-based rankings than other PageRank
methods.</p></list-item>
          <list-item><p>If redirects are resolved, it should be done in a complete manner as
otherwise entities get assigned artificially low scores. We recommend using an
unredirected dataset for applications in the NLP context.</p></list-item>
        </list>
      </p>
      <p>
        Currently, we use the link datasets and the PageRank scores in our work on entity
summarization [
        <xref ref-type="bibr" rid="ref10 ref11">10,11</xref>
        ]. However, there are many applications that can make
use of objective rankings of entities. As such, we plan to investigate further
the combination of page-view-based rankings and link-based ones. In effect, for
humans, rankings of entities are subjective and it is a hard task to approximate
"a general notion of importance".
      </p>
      <p>Acknowledgement. The authors would like to thank Thimo Britsch for his
contributions to the first versions of the SiteLinkExtractor tool. The research
leading to these results has received funding from the European Union Seventh
Framework Programme (FP7/2007-2013) under grant agreement no. 611346 and
by the German Federal Ministry of Education and Research (BMBF) within the
Software Campus project "SumOn" (grant no. 01IS12051).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          , G. Kobilarov,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          , and
          <string-name>
            <surname>Z. Ives.</surname>
          </string-name>
          <article-title>DBpedia: A Nucleus for a Web of Open Data</article-title>
          .
          <source>In The Semantic Web: 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference</source>
          , Busan, Korea,
          <source>November 11-15</source>
          ,
          <year>2007</year>
          . Springer Berlin Heidelberg,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>R.</given-names>
            <surname>Baeza-Yates</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Davis</surname>
          </string-name>
          .
          <article-title>Web Page Ranking Using Link Attributes</article-title>
          .
          <source>In Proceedings of the 13th International World Wide Web Conference on Alternate Track Papers &amp; Posters, WWW Alt. '04</source>
          , pages
          <fpage>328</fpage>
          –
          <fpage>329</fpage>
          , New York, NY, USA,
          <year>2004</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>S.</given-names>
            <surname>Brin</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Page</surname>
          </string-name>
          .
          <article-title>The Anatomy of a Large-scale Hypertextual Web Search Engine</article-title>
          .
          <source>In Proceedings of the Seventh International Conference on World Wide Web</source>
          <volume>7</volume>
          ,
          <issue>WWW7</issue>
          , pages
          <fpage>107</fpage>
          –
          <fpage>117</fpage>
          . Elsevier Science Publishers B. V., Amsterdam, The Netherlands,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>D.</given-names>
            <surname>Dimitrov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Singer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lemmerich</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Strohmaier</surname>
          </string-name>
          .
          <article-title>Visual Positions of Links and Clicks on Wikipedia</article-title>
          .
          <source>In Proceedings of the 25th International Conference Companion on World Wide Web, WWW '16 Companion</source>
          , pages
          <fpage>27</fpage>
          –
          <lpage>28</lpage>
          . International World Wide Web Conferences Steering Committee,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Y.-H.</given-names>
            <surname>Eom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Aragón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Laniado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kaltenbrunner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vigna</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. L.</given-names>
            <surname>Shepelyansky</surname>
          </string-name>
          .
          <article-title>Interactions of Cultures and Top People of Wikipedia from Ranking of 24 Language Editions</article-title>
          .
          <source>PLoS ONE</source>
          ,
          <volume>10</volume>
          (
          <issue>3</issue>
          ):
          <fpage>1</fpage>
          –
          <lpage>27</lpage>
          ,
          <month>Mar</month>
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
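          6.
          <string-name>
            <given-names>José</given-names>
            <surname>Lages</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Antoine</given-names>
            <surname>Patt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Dima L.</given-names>
            <surname>Shepelyansky</surname>
          </string-name>
          .
          <article-title>Wikipedia Ranking of World Universities</article-title>
          .
          <source>Eur. Phys. J. B</source>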
          ,
          <volume>89</volume>
          (
          <issue>3</issue>
          ):
          <fpage>69</fpage>
          ,
          <month>Mar</month>
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Carl von</given-names>
            <surname>Linné</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lars</given-names>
            <surname>Salvius</surname>
          </string-name>
          .
          <source>Caroli Linnaei... Systema naturae per regna tria naturae: secundum classes, ordines, genera, species, cum characteribus, differentiis, synonymis, locis</source>
          , volume
          <volume>1</volume>
          .
          <publisher-loc>Holmiae</publisher-loc>
          :
          <publisher-name>Impensis Direct. Laurentii Salvii</publisher-name>
          ,
          <year>1758</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>A.</given-names>
            <surname>Roa-Valverde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Thalhammer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Toma</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.-A.</given-names>
            <surname>Sicilia</surname>
          </string-name>
          .
          <article-title>Towards a formal model for sharing and reusing ranking computations</article-title>
          .
          <source>In Proceedings of the 6th International Workshop on Ranking in Databases in conjunction with VLDB 2012</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>A.</given-names>
            <surname>Thalhammer</surname>
          </string-name>
          .
          <article-title>DBpedia PageRank dataset</article-title>
          . Downloaded from http://people.aifb.kit.edu/ath#DBpedia_PageRank,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>A.</given-names>
            <surname>Thalhammer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lasierra</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rettinger</surname>
          </string-name>
          .
          <article-title>LinkSUM: Using Link Analysis to Summarize Entity Data</article-title>
          .
          <source>In Proceedings of the 16th International Conference on Web Engineering (ICWE 2016)</source>
          . To appear,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>A.</given-names>
            <surname>Thalhammer</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rettinger</surname>
          </string-name>
          .
          <article-title>Browsing DBpedia Entities with Summaries</article-title>
          .
          <source>In The Semantic Web: ESWC 2014 Satellite Events</source>
          , pages
          <fpage>511</fpage>
          –
          <lpage>515</lpage>
          . Springer,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>