<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Pinpointing Influence in Pinterest</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Panagiotis Liakos</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Katia Papakonstantinopoulou</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Sioutis</string-name>
          <email>sioutis@cril.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Konstantinos Tsakalozos</string-name>
          <email>konstantinos.tsakalozos@canonical.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alex Delis</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Canonical Group Ltd.</institution>
          ,
          <addr-line>London</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Université d'Artois, CRIL UMR 8188</institution>
          ,
          <addr-line>Lens</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Athens</institution>
          ,
          <addr-line>GR15703, Athens</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <fpage>26</fpage>
      <lpage>37</lpage>
      <abstract>
        <p>The success of most applications that run on top of a social network infrastructure is due to the social ties among their users; users are informed about the activity of their friends and acquaintances, and, hence, new ideas, habits, and products have the opportunity to gain popularity. Therefore, understanding the influence dynamics on social networks provides us with insights that are useful in designing efficient social network applications. In this work we focus on Pinterest, a social network that is often used to promote commercial products, and investigate its influence mechanisms. We examine user indegree and PageRank as potential estimators of the number of repins and likes a user may receive. We observe that, although both measures are weakly associated with user influence in Pinterest, PageRank is much more powerful than indegree in revealing how influential a user is.</p>
      </abstract>
      <kwd-group>
        <kwd>Social Influence</kwd>
        <kwd>PageRank</kwd>
        <kwd>Pinterest</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Over the past two decades the rise of social networking sites has shaped new
forms of interaction. Social media such as Facebook, Twitter, and Pinterest are
drawing millions of individuals, who now depend on them to keep up with friends,
follow breaking news, share their interests, and discover products or events. The
popularity of online social networks and the diversity of the activities of their
users make them ideal candidates for studying influence patterns. In particular,
we want to find out whether certain individuals in a social network have the
power to affect their social contacts and are in the position to convince them to
buy a product or adopt a political idea.</p>
      <p>
        The diffusion of influence has received significant attention from the fields of
sociology, advertising, and political science. Early studies argue for the existence
of a small minority of opinion leaders who are able to persuade the majority
of society to mimic their behavior [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. More recent studies, however, do not
outright accept this hypothesis. In [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] the authors consider the probability of a
customer buying some product as a function of the influence of other customers,
as well as her intrinsic desire for the product. In [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] it is suggested that although
an influential minority may exist, it is seldom responsible for spreading ideas; it
is a critical mass of easily influenced individuals who trigger chain reactions of
influence. Identifying influential individuals allows for cost-effective viral
marketing techniques to increase brand awareness or even sway public opinion.
      </p>
      <p>Copyright © 2016 for the individual papers by the papers' authors. Copying
permitted for private and academic purposes. This volume is published and copyrighted
by its editors.</p>
      <p>
        The massive activity of online social network users enables us to collect rich
large-scale data and assess the presence and potential of influential users. A
related recent effort [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] investigates influence in the Twitter social network.
Topological measures, such as the number of one's followers (indegree), are reported
to fail in exposing user influence. This is attributed to the users' tendency to
follow others out of courtesy and is often referred to as the million follower fallacy.<sup>4</sup>
However, user activity varies across different social networking sites. Thus, it is
important to examine whether indegree reveals user influence in other social
media. Moreover, considering more refined measures of importance implied by the
network topology may lead to more effective identification of influential users.
      </p>
      <p>
        We focus on Pinterest, an image-based online social network in which users
can post (or pin) content they find interesting and browse the content of others
in their feed, where they can re-post (or repin), comment on, or endorse (like) other
pins. Consequently, users focus on the curation and discovery of existing content
and many businesses opt to join Pinterest in an effort to promote their products.
We perform an in-depth empirical analysis to measure the extent to which the
indegree of Pinterest users is associated with the number of repins and likes
they receive. In addition, we propose the use of PageRank [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] to better
estimate one's potential to influence others. Indegree considers the expressed
opinions of all users as equally important. In contrast, PageRank distinguishes
important users based on the network topology. Our intuition is that the more
authoritative a user is, the more influential she is expected to be. Therefore, the
use of PageRank to identify influential users appears promising. In summary, we
make the following contributions:
      </p>
      <list list-type="bullet">
        <list-item>
          <p>We examine the association between indegree and user influence in a very
popular social network. We find that, similarly to Twitter [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ], the indegree
of Pinterest users reveals very little regarding the number of repins and likes
a user may receive.</p>
        </list-item>
        <list-item>
          <p>We propose the use of PageRank for identifying influential individuals and
investigate its effectiveness. Our findings suggest that, even though the
association between PageRank and user influence appears limited, PageRank
is much more powerful than indegree in revealing the influence of a user.</p>
        </list-item>
      </list>
      <p>The rest of this paper is organized as follows. In Section 2 we review the
related work. Section 3 provides an insight into Pinterest's fundamental elements
and outlines our approach. In Section 4 we present an empirical analysis of
Pinterest activity to study the extent of influence in the network. Finally, Section 5
concludes this work.</p>
      <p><sup>4</sup> http://bit.ly/1WaTAeK</p>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>
        The following fundamental algorithmic problem for social network processes
dealing with influence was posed by Domingos and Richardson in [
        <xref ref-type="bibr" rid="ref15 ref4">15,4</xref>
        ]: if we
can try to convince a subset of individuals to adopt a new product or
innovation, and the goal is to trigger a large cascade of further adoptions, which set
of individuals should we target? This problem is studied in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] in the context of
the most popular models in social network analysis. The authors show that the
aforementioned problem is essentially the optimization problem of selecting the
most influential nodes, which is NP-hard, and provide the first provable
approximation guarantees for efficient algorithms. The results of this study gave rise to
a number of works that try to identify influential users by employing heuristics
that are based on real network data.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], Cha et al. study user influence on Twitter; based on a Twitter crawl,
they compare three measures of influence, namely, indegree, retweets, and
mentions, and observe that users who have high indegree, although popular, are not
necessarily influential in terms of causing retweets or mentions. Their findings
suggest that topological measures alone probably do not reveal much about how
influential a user is. However, they leave room for examining more refined
measures of influence, which do not consider all links in a network to be transferring
the same authority. Weng et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] focus on the problem of identifying influential
users of micro-blogging services, with Twitter being their service of choice for
studying such influence. In a dataset that they prepared for their study, they
observe that 72.4% of the users in Twitter follow more than 80% of their followers,
and 80.5% of the users have 80% of users they are following follow them back.
Their study reveals that the presence of such reciprocity can be explained by
the phenomenon of homophily [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], i.e., the tendency of individuals to associate
and bond with similar others. Based on this finding, they propose TwitterRank,
an extension of the PageRank algorithm, to measure the influence of users in
Twitter. Identifying patterns of influence also serves the purpose of evaluating
sociological models. In another line of research, in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] the authors study real
user activity in the Digg social network, based on sociological studies and the
game theory framework, and identify a model that closely describes the observed
influence. In particular, they model user activity as an opinion formation game
in which each user is influenced by her social contacts and find that the Nash
equilibria of the game are nice illustrations of how users really behave.
      </p>
      <p>
        Gilbert et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] focus on identifying certain properties of Pinterest. In
particular, they try to answer the questions of what drives activity on Pinterest, what
role gender plays in the site's social connections, and, finally, what distinguishes
Pinterest from other networks, such as Twitter. With respect to those questions,
they conclude that being female implies more repins but fewer followers, and
that four verbs set Pinterest apart from Twitter, namely, use, look, want, and
need. Similarly to [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], properties of Pinterest are also highlighted in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Chang
et al. study a fundamental issue for social curation sites where people collect,
organize, and share pictures of items, with a focus on Pinterest, as it is the most
prominent example of such a site. In particular, they study the issue of what
patterns of activity attract attention. They organize their study around two key
factors, namely, the extent to which users specialize in particular topics, and
homophily among users. Further, they also consider the existence of differences
between female and male users. Their study reveals that women and men
differ in the types of content they collect and the degree to which they specialize.
These findings suggest strategies both for users (e.g., to attract an audience)
and for maintainers (e.g., to explore content recommendation methods) of
social curation sites. Mittal et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] characterize Pinterest on the basis of large-scale
crawls of 3.3 million user profiles and 58.8 million pins. In particular, they
explore various attributes of users, pins, boards, pin sources, and user locations
in detail, and perform topical analysis of user-generated textual content. This
characterization revealed the most prominent topics among users and pins, such
as design, fashion, photography, food, travel, music, and art. Moreover, the top
image sources and geographical distribution of users on Pinterest were obtained.
      </p>
      <p>
        Ottoni et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] make a first attempt towards a more complete understanding
of user behavior across multiple online social networks, such as Twitter and
Pinterest. They collect a sample of over 30,000 users that have accounts on
both Twitter and Pinterest, by crawling their profile information and activity
on a daily basis for a period of almost three months. Then, they develop a
novel methodology for comparing activity across these two sites that builds on
the Labeled Latent Dirichlet Allocation model (L-LDA) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], a supervised topic
model for credit attribution in multi-labeled corpora. The authors find that the
global patterns of use across the two sites differ significantly, and that users tend
to post items to Pinterest before posting them on Twitter. These findings can
assist in the understanding of user behavior on individual sites, as well as the
dynamics of sharing across the social web.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Tracing Influence in Pinterest</title>
      <p>In this section we discuss the Pinterest social network; its immense
popularity, the actions allowed to its users, and its most prominent categories indicate
that Pinterest is extremely interesting business-wise. Moreover, we detail the
measures of influence that we will employ in our empirical analysis.</p>
      <sec id="sec-3-1">
        <title>Understanding Pinterest</title>
        <p>Pinterest is an image sharing social network that was founded in 2010 and was
the fastest site to surpass the milestone of 10,000,000 monthly active users.<sup>5</sup> During
the year 2015, Pinterest was also reported to have passed the mark of 100,000,000
monthly active users.<sup>6</sup> Notably, the vast majority of these users are female,<sup>7</sup>
which makes Pinterest particularly interesting to study.</p>
        <p>Pinterest users are able to create a profile along with one or more boards
where they may pin images or other media content, for example videos. They are
also able to follow other users or specific boards in order to receive personalized
updates in their feed. Pins can be liked or repinned (shared) by other users, as
is also the case with the content of other social networks such as Facebook or
Twitter.</p>
        <p>Pinterest boards are organized into a broad range of categories, typical ones
being Food &amp; Drink, Weddings, and Home Decor. These categories are indicative
of the users' habit of creating digital shopping lists of products they are interested
in buying. This tendency has attracted significant commercial attention, as many
businesses invest in creating compelling boards to increase their revenue.<sup>8</sup></p>
        <p><sup>5</sup> http://techcrunch.com/2012/02/07/pinterest-monthly-uniques/</p>
      </sec>
      <sec id="sec-3-2">
        <title>Defining influence in Pinterest</title>
        <p>A user's behavior in a social network is often affected by others. However, the
task of singling out those individuals who actually cause social influence in an
online social network is not trivial. In this paper, we focus on three fundamental
actions performed by Pinterest users, namely, follow, repin, and like. We consider
that the number of followers a user has, as well as the number of repins and likes
she receives, are indicative of her potential to affect others in the network and
can be used as a measure of influence.</p>
        <p>
          The use of indegree as a quality measure may be flawed due to its local
nature. To alleviate this issue we additionally consider the use of PageRank values.
In information networks, with the World Wide Web being the most popular
example, the authority of each node is estimated using the PageRank algorithm [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ],
a quality metric introduced by the creators of the Google search engine that is
based on the network's link structure. According to the definition of PageRank,
a node's authority is distributed uniformly to the nodes it has a link to. As a
consequence, a node is important if many important nodes link to it. The use of
PageRank, however, is not restricted to search engine optimization. PageRank
is used in recommendation systems, in ranking tweets in Twitter, and even in
suggesting friends in online social networks.
        </p>
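        <p>The uniform distribution of authority described above can be sketched as a short power iteration. The block below is a minimal illustration under stated assumptions, not the implementation used in our experiments: the damping factor of 0.85, the uniform redistribution of the rank held by users who follow nobody, the function name, and the toy follow graph are all assumptions made for the example. An edge from u to v means that u follows v, so authority flows from followers to the users they follow.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def pagerank(follows, damping=0.85, iterations=80):
    """Power-iteration PageRank on a follow graph.

    follows maps each user to the list of users she follows.
    damping=0.85 is an assumed, commonly used default.
    """
    nodes = set(follows)
    for targets in follows.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {user: 1.0 / n for user in nodes}
    for _ in range(iterations):
        # Assumption: rank held by users who follow nobody is spread uniformly.
        dangling = sum(rank[user] for user in nodes if not follows.get(user))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {user: base for user in nodes}
        for user, targets in follows.items():
            if targets:
                # A user's authority is split evenly among the users she follows.
                share = damping * rank[user] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: "a" and "e" both have exactly one follower,
# but "a" is followed by the well-followed user "c".
toy_follows = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c", "e"], "e": []}
toy_rank = pagerank(toy_follows)
for user in sorted(toy_rank, key=toy_rank.get, reverse=True):
    print(user, round(toy_rank[user], 4))
```
]]></preformat>
        <p>In this toy graph the users a and e have the same indegree, yet a obtains a higher PageRank because her single follower is herself well followed; this is the distinction that indegree alone cannot express.</p>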
        <p>The measures of influence that can be drawn from the Pinterest network and
are examined in this work are the following:</p>
        <list list-type="bullet">
          <list-item>
            <p>Indegree influence: the number of followers of a user directly indicates the
size of the audience of that user.</p>
          </list-item>
          <list-item>
            <p>PageRank influence: the PageRank of a user indicates the strength of her
influence on her followers.</p>
          </list-item>
          <list-item>
            <p>Like influence: the number of likes one's pins receive indicates the
ability of that user to generate popular content.</p>
          </list-item>
          <list-item>
            <p>Repin influence: the number of repins one's pins receive indicates the
ability of that user to generate content with pass-along value.</p>
          </list-item>
        </list>
        <p>Notice that the first two measures depend solely on the topology of the social
network, while the other two take into account the user activity on the network
as well.</p>
        <p><sup>6</sup> http://mobile.nytimes.com/blogs/bits/2015/09/17/pinterest-crosses-user-milestone-of-100-million/</p>
        <p><sup>7</sup> http://www.omnicoreagency.com/pinterest-statistics/</p>
        <p><sup>8</sup> https://business.pinterest.com/en/success-stories</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Empirical Analysis</title>
      <p>This section details the analysis we performed on the Pinterest network and
our empirical observations concerning the influence patterns exhibited. We first
introduce the dataset we used for our analysis and the experimental setup needed
for the PageRank algorithm. Then, we proceed with answering the following
questions:</p>
      <list list-type="bullet">
        <list-item>
          <p>Do the indegree of Pinterest users and the repins and likes they receive follow
long-tailed distributions?</p>
        </list-item>
        <list-item>
          <p>Is there significant overlap among the top users based on indegree or
PageRank and the top users based on the number of repins or likes they received?</p>
        </list-item>
        <list-item>
          <p>How strong is the pairwise rank correlation between indegree, repins, and
likes in Pinterest?</p>
        </list-item>
        <list-item>
          <p>Is the rank correlation improved when we consider the use of PageRank
instead of indegree?</p>
        </list-item>
      </list>
      <sec id="sec-4-1">
        <title>Experimental Setting</title>
        <p>
          Our empirical analysis is based on the Pinterest dataset described in [
          <xref ref-type="bibr" rid="ref18 ref19">19,18</xref>
          ]. The
activity of the users was collected from January 3rd 2013 to January 21st 2013,
while the social graph was crawled during April 2013. We focused on the subset
of this dataset that contains only links that are common to both Pinterest and
Facebook. Our experimental setup comprises a social graph of 36,198,633 users
and 983,520,986 social ties. Regarding the activity of these users, we analyzed a
total of 18,957,340 repins, and 9,066,973 likes on pins. These pins were created
by a total of 1,253,189 users.
        </p>
        <p>Calculating the PageRank values of large-scale graphs calls for the use of
Pregel-like graph processing systems such as Apache Giraph.<sup>9</sup> Using Juju,<sup>10</sup> we
deployed an Apache Hadoop 2.7.1 cluster on a Dell PowerEdge R630 server with
an Intel® Xeon® E5-2630 v3 2.40 GHz processor and 256 GB of RAM. Then,
we built Apache Giraph on the client node of our cluster and submitted a
Giraph job that calculated the PageRank values of the users in our social graph by
performing 80 iterations. This allowed us to fully utilize our infrastructure and
obtain the desired results in less than 3 hours.</p>
        <p><sup>9</sup> http://giraph.apache.org/</p>
        <p><sup>10</sup> https://jujucharms.com/big-data</p>
        <p>[Figure 1: Indegree, Repin &amp; Like Distributions. Horizontal axis: Indegree &amp; Number of Repins/Likes (ticks at 1, 10, 100, 1000, 10000); vertical axis: Number of Users; series: Indegree, Repins, Likes.]</p>
      </sec>
      <sec id="sec-4-2">
        <title>Distribution of indegree and received repins/likes</title>
        <p>In Figure 1 we illustrate the indegree distribution of the social graph of Pinterest
as well as the distribution of influence based on repins and likes. We use a
logarithmic scale to highlight the fact that the users' indegree, as well as the
users' number of received repins or likes, varies by several orders of magnitude. In
particular, we observe that there are a few users with more than 1,000 followers,
repins or likes, whereas the majority of users have no more than one follower and
have received no more than one repin or like. Therefore, most of the activity on
the network is centered around a small minority of users. Moreover, we observe
that users tend to receive slightly more repins than likes.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Overlap of top-ranked users</title>
        <p>We continue our analysis of the Pinterest dataset by examining the overlap of
the top-ranked users for each measure. In particular, the Venn diagram of
Figure 2(a) depicts the overlap of the top-10,000 users according to their indegree,
the number of received repins, and the number of received likes. We observe that
the overlap of indegree with both of the other measures is marginal. In contrast,
there is significant overlap between the users whose pins received many repins
and those whose pins received many likes.</p>
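        <p>The region counts of such a Venn diagram can be obtained with plain set operations on the top-ranked users of each measure. The following sketch is illustrative only: the function names, the tie-breaking by user identifier, and the toy scores are assumptions and do not come from our dataset.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def top_k(scores, k):
    """Return the set of the k users with the highest score.

    Ties are broken by user identifier (an assumption for determinism).
    """
    ranked = sorted(scores, key=lambda user: (-scores[user], user))
    return set(ranked[:k])


def overlap_regions(indegree, repins, likes, k):
    """Count the users falling in each region of a three-set Venn diagram."""
    a, b, c = top_k(indegree, k), top_k(repins, k), top_k(likes, k)
    return {
        "indegree_only": len(a - b - c),
        "repins_only": len(b - a - c),
        "likes_only": len(c - a - b),
        "indegree_and_repins_only": len((a & b) - c),
        "indegree_and_likes_only": len((a & c) - b),
        "repins_and_likes_only": len((b & c) - a),
        "all_three": len(a & b & c),
    }


# Hypothetical toy scores for five users.
toy_indegree = {"u1": 900, "u2": 500, "u3": 20, "u4": 5, "u5": 1}
toy_repins = {"u1": 0, "u2": 3, "u3": 80, "u4": 60, "u5": 2}
toy_likes = {"u1": 1, "u2": 0, "u3": 40, "u4": 70, "u5": 3}

regions = overlap_regions(toy_indegree, toy_repins, toy_likes, k=2)
for name, count in regions.items():
    print(name, count)
```
]]></preformat>
        <p>Every user in a top-k set falls in exactly one region, so the four regions that involve a given measure always sum to k; this is a convenient sanity check on counts read off a diagram.</p>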
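        <p>[Figure 2: Venn diagrams of the top-10,000 users. Panel (a): Indegree, Repins, Likes; region counts as extracted: 9,955 (Indegree), 9, 11, 25, 6,215, 3,751 (Repins), 3,749 (Likes). Panel (b): PageRank, Repins, Likes; region counts as extracted: 9,990 (PageRank), 3.]</p>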
        <p>Moreover, we present the corresponding Venn diagram for the top-10,000
users according to their PageRank in Figure 2(b). We see that the overlap of
PageRank with repins and likes is even smaller in this case.</p>
        <p>These observations hint that there is very weak correlation of the indegree or
PageRank of users with the frequency with which they receive repins and likes. In addition,
we showed that users who receive many repins tend to receive many likes
as well.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Comparing Influence Measures</title>
        <p>
          For each of the 36,198,633 users of our dataset we calculated the value of
indegree and PageRank and the number of repins and likes, and adopted the
methodology of [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] to quantify the association between them. In particular, we
based our comparison on the relative order of users' ranks as a measure
of difference. We assigned the rank of 1 to the most influential user, and
increased the rank as we proceeded to less influential users. Identical values were
each assigned fractional ranks equal to the average of their positions in the
ascending order of the values [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Then, we used Spearman's rank correlation
coefficient [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], a non-parametric measure that is used to assess the degree of
association between two variables, to examine whether two ranked variables covary.
This measure is especially useful as it does not make any assumptions regarding
the distribution of the data.
        </p>
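        <p>Spearman's rank correlation coefficient is defined by the following
formula:<sup>11</sup>
          <disp-formula><tex-math>\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}</tex-math></disp-formula>
where <inline-formula><tex-math>d_i = \mathrm{rg}(X_i) - \mathrm{rg}(Y_i)</tex-math></inline-formula> is the difference between the two ranks of user <italic>i</italic>, and
<italic>n</italic> is the total number of users.</p>
        <p><sup>11</sup> In our case all ranks are distinct integers, hence, we present the simplified formula.</p>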
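        <p>The ranking scheme and the simplified formula can be illustrated with a few lines of code. This is a sketch under stated assumptions rather than the code used for our analysis: the function names and the toy values are hypothetical, and the simplified formula is exact only when no ranks are tied, so with tied fractional ranks it should be read as an approximation.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def fractional_ranks(values):
    """Rank 1 goes to the largest value; tied values share the average
    of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = average
        start = end + 1
    return ranks


def spearman_simplified(x, y):
    """Simplified Spearman formula: 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    rank_x, rank_y = fractional_ranks(x), fractional_ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical toy values: follower counts and received repins of five users.
toy_followers = [120, 15, 3, 980, 44]
toy_received_repins = [7, 30, 1, 12, 25]
print(fractional_ranks(toy_followers))
print(spearman_simplified(toy_followers, toy_received_repins))
```
]]></preformat>
        <p>Identical orderings yield a coefficient of 1 and exactly reversed orderings yield -1, which gives a quick way to validate an implementation before applying it to real rankings.</p>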
        <p>
          In [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], a preprocessing step is applied to the dataset that removes all users
with a limited number of tweets since the creation of their account. We
investigated the impact of an equivalent preprocessing step on our dataset by examining
only those users who have created at least 15 pins, i.e., a total of 20,011,513
users. This resulted in a weaker correlation of indegree with both repins (0.1068)
and likes (0.0997). Therefore, we attribute the deviation of our results from the
findings of [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] to the different patterns of use across Pinterest and Twitter.
        </p>
        <p>
          Another important finding, illustrated in Figure 3, is the extremely strong
correlation between repins and likes. This is indicative of the different focus of
Pinterest compared with Twitter; in Twitter the action of a retweet focuses
on the content of a tweet and the action of a mention on the user, whereas in
Pinterest both the action of a repin and the action of a like focus on the content
of a pin. As such, the correlation between the two corresponding measures in
Twitter is reported to be moderate (0.58) [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], whereas the two measures in
Pinterest are very strongly associated.
        </p>
        <p>We have already discussed that most users have very few followers, as clearly
shown in Figure 1. Consequently, most users are tied when ranked according to
their indegree. This may lead to a misleading result regarding the correlation of
measures, as users with low indegree most likely also have limited repins and
likes. To alleviate this issue, we considered users ranked in the top 10th and 1st
percentiles, respectively, according to their indegree, as suggested in [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
        <p>We observe in Figures 4 and 5 that the correlation of indegree with repins
or likes is indeed even weaker for the top 10th and 1st percentiles of users
respectively. However, what stands out is that the correlation of PageRank with
repins and likes is stronger than that of indegree for these two cases. In
particular, in Figure 4 we observe that for the top 10th percentile of users according to
their indegree, the association of PageRank with user influence is about twice as
strong as that of indegree. The results for the 1st percentile, depicted in Figure 5,
are even more impressive, as the statistical dependence with PageRank instead
of indegree is more than twice as strong.</p>
        <p>This verifies our intuition that using PageRank instead of indegree allows for
capturing the importance of users more accurately. Given that the focus of both
repins and likes is on the content of the pin and not on the user who posted
it, we expect that PageRank will perform even better against actions that are
centered toward individuals, e.g., Twitter's mentions.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and Future Work</title>
      <p>The study of influence patterns on social networks is essential for the design
of a successful advertising strategy. The recent unprecedented growth in the
number of online social networking services allows us to empirically validate
relevant theories and uncover opportunities for viral marketing techniques. We
performed an analysis of a large amount of activity on the Pinterest network. The
case of Pinterest is particularly interesting as it is embraced by an ever-increasing
number of businesses eager to promote their products to a wide audience through
captivating pin-boards.</p>
      <p>[Figure legend: Indegree/Repins, PageRank/Repins, Indegree/Likes, PageRank/Likes, Repins/Likes.]</p>
      <p>We examined the association of the indegree of Pinterest users with the
number of times their pins are repinned or liked. Our results reveal that there
is very little correlation between the ranking of users based on their indegree
and their ranking based on the number of either repins or likes they receive.
We attribute this result to the million follower fallacy that is evident in this
network, and the fact that the focus of repins and likes is more on the content
of pins rather than on the user who posted them. Furthermore, we proposed the
use of PageRank instead of indegree for the identification of influential users. As
PageRank is a measure that captures how authoritative a user is in a network,
we expect that a ranking of users based on their PageRank value will provide us
with a more accurate view of their potential to influence their social contacts.
Indeed, even though correlation with the ranking of users based on repins or
likes received is still limited, PageRank performed much better than indegree.</p>
      <p>We will further investigate the influence patterns of the Pinterest network
with regard to the variance exhibited across different topics. In particular, we
will consider the categories of pins when ranking users based on the repins and
likes received. The overlap of influential users across different Pinterest
categories will indicate how many are able to spread information over a variety of
topics. Moreover, we will examine whether PageRank proves to be even more
successful when quantifying its association with an action targeted towards the
user instead of the content (both repins and likes are related to the content of
a pin). We expect that for online social networks that focus more on original
content, like Twitter or Instagram, and with actions targeted towards the user,
such as mentions, PageRank's performance will be significantly superior to that
of the indegree.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>W.</given-names>
            <surname>Buck</surname>
          </string-name>
          .
          <article-title>Tests of significance for point-biserial rank correlation coefficients in the presence of ties</article-title>
          .
          <source>Biometrical Journal</source>
          ,
          <volume>22</volume>
          (
          <issue>2</issue>
          ):
          <fpage>153</fpage>
          –
          <lpage>158</lpage>
          ,
          <year>1980</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>M.</given-names>
            <surname>Cha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Haddadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Benevenuto</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Gummadi</surname>
          </string-name>
          .
          <article-title>Measuring User Influence in Twitter: The Million Follower Fallacy</article-title>
          .
          <source>In ICWSM</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>S.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gilbert</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Terveen</surname>
          </string-name>
          .
          <article-title>Specialization, Homophily, and Gender in a Social Curation Site: Findings from Pinterest</article-title>
          .
          <source>In CSCW</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Domingos</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Richardson</surname>
          </string-name>
          .
          <article-title>Mining the network value of customers</article-title>
          .
          <source>In KDD</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>E.</given-names>
            <surname>Gilbert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bakhshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Terveen</surname>
          </string-name>
          .
          <article-title>"I Need to Try This!": A Statistical Overview of Pinterest</article-title>
          .
          <source>In CHI</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>E.</given-names>
            <surname>Katz</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. F.</given-names>
            <surname>Lazarsfeld</surname>
          </string-name>
          .
          <article-title>Personal Influence, The part played by people in the flow of mass communications</article-title>
          .
          <source>Transaction Publishers</source>
          ,
          <year>1955</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>D.</given-names>
            <surname>Kempe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Kleinberg</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Tardos</surname>
          </string-name>
          .
          <article-title>Maximizing the spread of influence through a social network</article-title>
          .
          <source>In KDD</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>L.</given-names>
            <surname>Page</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Motwani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Winograd</surname>
          </string-name>
          .
          <article-title>The PageRank Citation Ranking: Bringing Order to the Web</article-title>
          .
          <source>Technical report</source>
          , Stanford University,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>H. J.</given-names>
            <surname>D'Abrera</surname>
          </string-name>
          .
          <article-title>Nonparametrics: statistical methods based on ranks</article-title>
          . Springer-Verlag New York,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>P.</given-names>
            <surname>Liakos</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Papakonstantinopoulou</surname>
          </string-name>
          .
          <article-title>On the Impact of Social Cost in Opinion Dynamics</article-title>
          .
          <source>In ICWSM</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>M.</given-names>
            <surname>McPherson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Smith-Lovin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Cook</surname>
          </string-name>
          .
          <article-title>Birds of a Feather: Homophily in Social Networks</article-title>
          .
          <source>Annual Review of Sociology</source>
          ,
          <volume>27</volume>
          (
          <issue>1</issue>
          ):
          <fpage>415</fpage>
          –
          <lpage>444</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>S.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dewan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Kumaraguru</surname>
          </string-name>
          .
          <article-title>Pinned it! A Large Scale Study of the Pinterest Network</article-title>
          .
          <source>In IKDD</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>R.</given-names>
            <surname>Ottoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. B. L.</given-names>
            <surname>Casas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Pesce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Meira Jr.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wilson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mislove</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Almeida</surname>
          </string-name>
          .
          <article-title>Of Pins and Tweets: Investigating How Users Behave Across Image- and Text-Based Social Networks</article-title>
          .
          <source>In ICWSM</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>D.</given-names>
            <surname>Ramage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nallapati</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <article-title>Labeled LDA: A Supervised Topic Model for Credit Attribution in Multi-labeled Corpora</article-title>
          .
          <source>In EMNLP</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>M.</given-names>
            <surname>Richardson</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Domingos</surname>
          </string-name>
          .
          <article-title>Mining Knowledge-sharing Sites for Viral Marketing</article-title>
          .
          <source>In KDD</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Watts</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Dodds</surname>
          </string-name>
          .
          <article-title>Influentials, networks, and public opinion formation</article-title>
          .
          <source>Journal of consumer research</source>
          ,
          <volume>34</volume>
          (
          <issue>4</issue>
          ):
          <fpage>441</fpage>
          –
          <lpage>458</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>J.</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Q.</given-names>
            <surname>He</surname>
          </string-name>
          .
          <article-title>Twitterrank: Finding Topic-Sensitive Influential Twitterers</article-title>
          .
          <source>In WSDM</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Salehi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cobzarenco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sastry</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Cha</surname>
          </string-name>
          .
          <article-title>Social Bootstrapping: How Pinterest and Last.fm Social Communities Benefit by Borrowing Links from Facebook</article-title>
          .
          <source>In WWW</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sundaravadivelan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Sastry</surname>
          </string-name>
          .
          <article-title>Sharing the Loves: Understanding the How and Why of Online Content Curation</article-title>
          .
          <source>In ICWSM</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>