<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Brief Comparison of Community Detection Algorithms over Semantic Web Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jose L. Martinez-Rodriguez</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ivan Lopez-Arevalo</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ana B. Rios-Alvarado</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiaoou Li</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Autonomous University of Tamaulipas Victoria</institution>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Cinvestav-IPN Mexico City</institution>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Cinvestav-Tamaulipas Victoria</institution>
          ,
          <country country="MX">Mexico</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Community detection is the task of grouping the nodes of a graph so that the members of each group share similar features or properties (e.g., topological structure or node attributes). It is an important task in fields such as social network analysis and pattern recognition, which handle large and varied amounts of information whose relations hide knowledge. An initiative that seeks to extract knowledge from data is the Semantic Web, whose primary goal is to represent Web data as a graph in order to discover facts and relations. In this paper, we develop a strategy to apply community detection algorithms over Semantic Web data graphs. For this purpose, five algorithms were tested to identify groups in a dataset retrieved from the DBpedia knowledge base, containing more than 45 thousand nodes and almost 500 thousand edges in the domain of movies. Clustering quality was evaluated with the modularity measure, and the features of the best communities were analyzed.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Information about real-world things is nowadays mostly represented with
elements or entities such as people, locations, and dates. The connections
among these elements are commonly organized into a graph.
A graph or network is composed of a set of objects called vertices (nodes)
joined by links (also known as arcs or edges) that represent binary
relations among the elements of a dataset.</p>
      <p>Nodes often tend to be connected with many other nodes that share
common features. For example, people who like the same music genre or enjoy
the same kind of movie would benefit from being organized into a group,
for instance to share experiences and reviews or to make decisions. The
community detection task aims to facilitate the labeling and allocation of nodes into
such groups. However, the amount of available information currently grows exponentially, so
generating a graph and then assigning its nodes to communities that
share preferences is a difficult task. In particular, one of the largest data sources
whose relations hide knowledge about the world is the Web;
unfortunately, a large portion of the Web lacks a formal structure that can be
processed automatically by computers. As a consequence, the Semantic Web was
presented as an initiative whose purpose is to create a formal representation of
the information expressed on the Web. The Semantic Web follows a basic model
used to represent information formally and semantically, called the RDF triple<sup>4</sup>,
which is composed of three basic elements: Subject, Predicate, and Object.</p>
      <p>
        The RDF triple structure and an example RDF statement saying that a place
is located in a country are presented in Figure 1, where the nodes
represent Villa Nellcôte as Subject and France as Object, joined by locatedIn as
Predicate (edge). This linking structure allows data to be organized as a directed, labeled
graph<sup>5</sup> [<xref ref-type="bibr" rid="ref5">5</xref>].
      </p>
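      <p>As a minimal sketch of this structure, the statement from Figure 1 can be held as a (subject, predicate, object) tuple, and a list of such tuples can be read as a directed, labeled graph. The second triple and the function name build_labeled_digraph are hypothetical and serve only as illustration; they are not taken from DBpedia or from any RDF library.</p>
      <preformat>
```python
from collections import defaultdict

# Each RDF statement is a (subject, predicate, object) triple.
# The first triple is the example from Figure 1; the second one is a
# hypothetical addition used only to show a two-edge graph.
triples = [
    ("Villa Nellcôte", "locatedIn", "France"),
    ("France", "partOf", "Europe"),
]


def build_labeled_digraph(statements):
    """Map every subject to its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in statements:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_labeled_digraph(triples)
for subject, out_edges in graph.items():
    for predicate, obj in out_edges:
        print(subject, "--" + predicate + "->", obj)
```
      </preformat>
      <p>Subjects and objects become nodes, while the predicate is kept as the edge label, which is the information a purely topological community detection algorithm later ignores.</p>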
      <p>One of the most important projects in the Semantic Web is the Linked Open
Data (LOD) cloud<sup>6</sup>, which provides more than 70 billion triples modelled as a
graph with information about government institutions, locations, music
descriptions, and so on. In this regard, DBpedia<sup>7</sup> is an important LOD dataset that
offers content extracted from Wikipedia; its information
can therefore be separated into domain-specific subsets suitable for processing by algorithms.</p>
      <p>In a nutshell, in this paper we apply traditional community detection
algorithms over information from the Semantic Web, specifically a subset of
the DBpedia knowledge base covering the domain of movies and actors. We provide a
strategy to process Semantic Web information through community
detection algorithms, databases, and a visualization tool. With this
strategy, the detected groups of nodes sharing similar features provide a way to
enrich Semantic Web information.</p>
      <p><sup>4</sup> RDF (Resource Description Framework) http://www.w3.org/RDF/, [last visit June 10, 2016]</p>
      <p><sup>5</sup> https://www.w3.org/DesignIssues/LinkedData.html, [last visit June 10, 2016]</p>
      <p><sup>6</sup> LOD cloud state http://lod-cloud.net, [last update August 30, 2014]</p>
      <p><sup>7</sup> http://wiki.dbpedia.org/, [last visit June 10, 2016]</p>
      <p>The rest of the paper is organized as follows. Section 2 presents related work
on community detection algorithms and grouping strategies in the
Semantic Web. Our proposed strategy is outlined in Section 3. Section 4 presents
an analysis of the results. Finally, conclusions are presented in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>Community detection is not a new idea; several algorithms from the
state of the art are taken into account as the conceptual foundation
of this work.</p>
      <p>
        Several works have reviewed community detection algorithms.
Schaeffer [<xref ref-type="bibr" rid="ref20">20</xref>] presented a study of graph clustering definitions and
terminology, together with methods and measures to identify communities and
an evaluation of the surveyed techniques. Harenberg et al. [<xref ref-type="bibr" rid="ref10">10</xref>] carried out
an empirical review of community detection algorithms separated into two
categories: overlapping algorithms and disjoint algorithms. Overlapping
(non-exclusive) techniques allow a node of a dataset to belong to
more than one community; for example, a singer who is also a movie
actor. Disjoint algorithms, in contrast, assign exactly one community
to every node. Focusing on
machine learning algorithms, Giannini [<xref ref-type="bibr" rid="ref9">9</xref>] made one of the first attempts
to apply community detection strategies to Semantic Web data. She
noted that the purpose of clustering RDF graphs is to detect groups
of vertices that share common properties or play a similar role, using only
information about the topology of the graph. Along these lines, Khosravi-Farsani et
al. [<xref ref-type="bibr" rid="ref7">7</xref>] proposed a similarity score model for RDF data clustering that
considers the number of shortest paths between resources. The authors tested their
model with information about person resources obtained from DBpedia. The
similarity among resources is thoroughly addressed in the Ontology Matching area
[<xref ref-type="bibr" rid="ref6">6</xref>], and some approaches [<xref ref-type="bibr" rid="ref11">11</xref>], [<xref ref-type="bibr" rid="ref13">13</xref>] have taken advantage of the measures proposed
there. In this paper, however, we only consider topological characteristics of a
subset of the DBpedia graph in order to detect communities and, unlike
the approaches above, we provide a strategy to apply traditional community
detection algorithms.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>In order to test community detection algorithms over Semantic
Web data, we propose the following strategy, composed of four stages:</p>
      <p>Semantic Web data acquisition: At this stage, we only consider a Semantic
Web subset for testing.</p>
      <p>Preprocessing: Information needs to be adapted to apply traditional
algorithms.</p>
      <p>Community detection labeling: The five selected algorithms are tested in
this stage.</p>
      <p>Analysis of results: Features of the obtained groups are analyzed.</p>
      <p>The following subsections describe the process involved in each stage.</p>
      <p><bold>Semantic Web data acquisition</bold></p>
      <p>
        Due to the huge amount of information on the Semantic Web (more than 70
billion triples), using the whole graph is not feasible for a quick test. Only parts of
this graph have been used when applying previously proposed algorithms [<xref ref-type="bibr" rid="ref7">7</xref>], [<xref ref-type="bibr" rid="ref1">1</xref>]. Thus, in
order to demonstrate the applicability of community detection algorithms, we
focused on a subset of the Semantic Web graph that has a rich organization and
a diversity of contents easily understandable by people: we selected the
part of the DBpedia dataset covering the domain of films, that is, actors and movies. Although the Linked
Movie Data Base (LinkedMDB)<sup>8</sup> also provides information about movies, we
focus on applying the algorithms to LOD cloud datasets. However,
we plan to include more diverse and larger datasets in a future implementation.
      </p>
      <p>The most straightforward way to obtain information from DBpedia<sup>9</sup> is through
its official SPARQL<sup>10</sup> endpoint. We performed queries such as the one shown in
Listing 1.1.</p>
      <p>Listing 1.1. SPARQL query for movie domain resources retrieval</p>
      <preformat>SELECT ?movie ?genre ?actor ?duration
WHERE { ?movie a dbpedia-owl:Film ; dbpedia-owl:starring ?actor .
  OPTIONAL { ?movie dbpprop:duration ?duration . }
  OPTIONAL { ?movie dbpprop:genre ?genre . } }</preformat>
      <p><bold>Preprocessing</bold></p>
      <p>
        Once the information was retrieved, some preprocessing tasks were applied
to clean and prepare the data. The previous stage
produced 254,251 rows in CSV format with two classes, actors and movies, which yields
a bipartite graph of actors performing in movies. However, as stated by
Fortunato [<xref ref-type="bibr" rid="ref8">8</xref>], multipartite networks are usually projected into unipartite graphs in
order to apply standard community detection algorithms. Although this
transformation loses information, different configurations may
be obtained to produce communities. Thus, given the output format, we
imported the data into a MySQL table for easy and fast manipulation.
Then, a second graph was generated in which nodes represent
movies and edges represent actors shared between movies. This configuration was
chosen only for testing purposes, since the data may equally be expressed as actors
sharing movies in common. In this way, we obtained a graph with the properties
presented in Table 1.
      </p>
      <p><sup>8</sup> http://linkedmdb.org, [last visit June 10, 2016]</p>
      <p><sup>9</sup> DBpedia SPARQL endpoint http://dbpedia.org/sparql, [last visit June 10, 2016]</p>
      <p><sup>10</sup> A SQL-like language used to query RDF information</p>
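      <p>The projection from the bipartite actor-movie data to a movie-movie graph can be sketched as follows. This is only an illustration under stated assumptions: it works on in-memory (movie, actor) pairs instead of the MySQL table used in this work, the function name project_movies and the sample rows are hypothetical, and the edge weight (number of shared actors) is an added convenience that the text does not prescribe.</p>
      <preformat>
```python
from collections import defaultdict
from itertools import combinations


def project_movies(rows):
    """Project (movie, actor) pairs onto a movie-movie graph.

    Returns a dict that maps an alphabetically ordered pair of movies
    to the number of actors the two movies have in common.
    """
    movies_by_actor = defaultdict(set)
    for movie, actor in rows:
        movies_by_actor[actor].add(movie)

    shared_actors = defaultdict(int)
    for movies in movies_by_actor.values():
        # Every pair of movies with this actor gains one shared actor.
        for first, second in combinations(sorted(movies), 2):
            shared_actors[(first, second)] += 1
    return dict(shared_actors)


# Hypothetical rows, not taken from the DBpedia result set.
rows = [
    ("Movie A", "Actor 1"), ("Movie B", "Actor 1"),
    ("Movie B", "Actor 2"), ("Movie C", "Actor 2"),
    ("Movie A", "Actor 3"), ("Movie B", "Actor 3"),
]
edges = project_movies(rows)
```
      </preformat>
      <p>Swapping the two positions of every pair before calling the function would give the alternative configuration mentioned above, with actors as nodes connected through shared movies.</p>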
      <p>
        The influence of individual nodes can be measured with
centrality [<xref ref-type="bibr" rid="ref21">21</xref>], which defines the importance of a node within a graph and is based on
graph paths. A path is a sequence of edges connecting a sequence of
distinct vertices, that is, a route through a graph from vertex to vertex along
edges. The shortest path between two vertices is the path
with the fewest edges required to connect them. Two indicators
explored in this study were Betweenness Centrality (BC) and Closeness
Centrality (CC). The first is given by the number of shortest paths between all pairs of vertices
that pass through a given vertex. The second measures the number of
steps required to reach the other nodes from a given node. Both BC and CC were computed
with the Gephi tool<sup>11</sup>. Details about these implementations and algorithms
are provided by Brandes [<xref ref-type="bibr" rid="ref3">3</xref>].
      </p>
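      <p>For readers who want to see the two indicators computed by hand, the following sketch follows the accumulation scheme described by Brandes for unweighted, undirected graphs. It is not the Gephi implementation used in this work; the adjacency-dict input, the function names, the four-node path graph, and the choice of closeness as the reciprocal of the average distance are all assumptions made for illustration.</p>
      <preformat>
```python
from collections import deque


def bfs_distances(adj, source):
    """Number of edges on a shortest path from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(adj, v):
    """Reciprocal of the average distance from v to the nodes it can reach."""
    dist = bfs_distances(adj, v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                        # nodes in order of discovery
        preds = {v: [] for v in adj}      # predecessors on shortest paths
        sigma = {v: 0 for v in adj}       # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while order:                      # back-propagate dependencies
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both endpoints.
    return {v: value / 2 for v, value in score.items()}


# Hypothetical path graph a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
bc = betweenness(path)
cc = {v: closeness(path, v) for v in path}
```
      </preformat>
      <p>In the path graph, the two inner nodes lie on every shortest path that crosses the middle, so they receive the highest betweenness, while the end nodes receive zero.</p>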
      <p>The BC distribution of the nodes in the graph is depicted in Figure 2. More
than 12 thousand nodes obtained a value close to 0, which means that these nodes
lie on few or no shortest paths between other nodes and therefore belong to only one of the
communities. In contrast, only a few nodes hold values in the range of
millions; these are the nodes that join communities. With respect to the CC distribution, the
values obtained for the graph are depicted in Figure 3. A total of 478 nodes
obtained a high centrality, that is, they are central nodes with a small average shortest
path length to other nodes, which gives an idea of the number of groups in the
dataset, as discussed in Section 3.3.</p>
      <p>In this graph, movies were taken as nodes, but the information can also be
arranged so that actors are the nodes and movies are the connections between them,
which shows how actors collaborate with each other
through movies.</p>
      <p><bold>Community detection labeling</bold></p>
      <p>
        There are many community detection algorithms, of both exclusive and non-exclusive
nature. In this paper, only exclusive algorithms were considered, that is, those in which
individuals belong to only one community. The algorithms tested in this
experiment are Fastgreedy [<xref ref-type="bibr" rid="ref4">4</xref>], Multilevel [<xref ref-type="bibr" rid="ref2">2</xref>], Walktrap [<xref ref-type="bibr" rid="ref18">18</xref>], Infomap [<xref ref-type="bibr" rid="ref19">19</xref>], and Leading
Eigenvector [<xref ref-type="bibr" rid="ref16">16</xref>]. A common feature among these algorithms is that they apply
the modularity measure to evaluate the strength of the division of a network into
communities. Modularity is defined by Newman and Girvan [<xref ref-type="bibr" rid="ref15">15</xref>] as follows:
      </p>
      <disp-formula><tex-math><![CDATA[Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - P_{ij} \right) \delta(C_i, C_j)]]></tex-math></disp-formula>
      <p>where <italic>m</italic> is the number of edges, <italic>A</italic> is the adjacency matrix, <italic>P<sub>ij</sub></italic> is the probability
of an edge existing between vertices <italic>i</italic> and <italic>j</italic>, and δ(<italic>C<sub>i</sub></italic>, <italic>C<sub>j</sub></italic>) = 1 iff <italic>i</italic> and <italic>j</italic> belong
to the same community (cluster), and 0 otherwise. <italic>P</italic> is calculated as follows:</p>
      <disp-formula><tex-math><![CDATA[P_{ij} = \frac{k_i k_j}{2m}]]></tex-math></disp-formula>
      <p>where <italic>k<sub>i</sub></italic> is the degree of vertex <italic>i</italic> and <italic>m</italic> is the number of edges. In order to apply
the aforementioned algorithms over the selected dataset, the iGraph<sup>12</sup> Python
package was used. Finally, the result of the grouping was visually plotted with
the Gephi tool.</p>
      <p><sup>11</sup> Gephi graph visualization https://gephi.org/, [last visit June 10, 2016]</p>
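      <p>To make the modularity definition concrete, the following sketch evaluates Q directly from the formula, with P<sub>ij</sub> = k<sub>i</sub>k<sub>j</sub>/2m. It is a literal, quadratic-time illustration and not the routine of the iGraph package; the function name modularity, the edge-list input, and the two-triangle example graph are assumptions, and the graph is assumed to be simple and undirected (no self-loops, no repeated edges).</p>
      <preformat>
```python
def modularity(edges, community):
    """Evaluate Q = 1/(2m) * sum_ij (A_ij - k_i*k_j/(2m)) * delta(C_i, C_j).

    edges: list of undirected (u, v) pairs of a simple graph.
    community: dict mapping every node to its community label.
    """
    nodes = sorted(community)
    m = len(edges)
    degree = {v: 0 for v in nodes}
    adjacent = set()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        adjacent.add((u, v))
        adjacent.add((v, u))

    total = 0.0
    for i in nodes:
        for j in nodes:
            if community[i] == community[j]:      # delta(C_i, C_j) = 1
                a_ij = 1 if (i, j) in adjacent else 0
                total += a_ij - degree[i] * degree[j] / (2 * m)
    return total / (2 * m)


# Hypothetical graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
one_group = {node: "all" for node in two_groups}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
```
      </preformat>
      <p>Splitting the example graph into its two triangles gives a positive Q, whereas placing every node in a single community gives Q = 0, which is why a higher modularity is read as a stronger division into communities.</p>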
    </sec>
    <sec id="sec-4">
      <title>Analysis of results</title>
      <p>This section analyzes the results obtained after executing the five community
detection algorithms, together with a visualization of the grouping produced by
each algorithm. All of the experiments were
run on a computer with an Intel Core i5 processor, 8 GB of RAM, and OS X 10.10.
The number of detected communities and the modularity score obtained by each
algorithm are given in Table 2, where the first column indicates the evaluated
algorithm, the second the number of communities, the third the
modularity score, the fourth the execution time, and the last
the computational complexity of the algorithm.</p>
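      <table-wrap id="tab-2">
        <label>Table 2</label>
        <caption><p>Modularity by algorithm</p></caption>
        <table>
          <thead>
            <tr><th>Algorithm</th><th>Communities</th><th>Modularity</th><th>Time</th><th>Order</th></tr>
          </thead>
          <tbody>
            <tr><td>Walktrap</td><td>1338</td><td>0.7119</td><td>7m17.269s</td><td>O(n<sup>3</sup>)</td></tr>
            <tr><td>Multilevel</td><td>376</td><td>0.7363</td><td>4.009s</td><td>O(m)</td></tr>
            <tr><td>Infomap</td><td>1757</td><td>0.5923</td><td>10m30.748s</td><td>O(m)</td></tr>
            <tr><td>Fastgreedy</td><td>463</td><td>0.6587</td><td>3m1.411s</td><td>O(n log<sup>2</sup> n)</td></tr>
            <tr><td>L.Eigenvector</td><td>324</td><td>0.5613</td><td>1m47.124s</td><td>O((m + n)n)</td></tr>
          </tbody>
        </table>
      </table-wrap>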
      <p>Regarding the identified communities, the Infomap algorithm obtained
the highest number of communities. However, it also obtained a low modularity and the
worst execution time of the test; this is due to the principle of random walks,
which tend to fall into isolated groups of nodes.</p>
      <p>With respect to modularity, the Multilevel algorithm
obtained the best score, owing to the way it iteratively moves nodes among
communities according to the change in modularity. An interesting result
is that of the Walktrap algorithm, which ranked second; although
this algorithm is based on random walks like the Infomap
algorithm, it takes into account information about the community structure. The
distribution of the ten most populated communities obtained by each of
the five tested algorithms can be seen in Figure 4.</p>
      <p>Since the Multilevel algorithm obtained the best modularity, some
features of the movie genres in the nine most populated communities found
by this algorithm are analyzed below.</p>
      <p><sup>12</sup> http://cran.r-project.org/web/packages/igraph/index.html, [last visit June 10, 2016]</p>
      <p>The organization of nodes produced by the Multilevel algorithm (highest
modularity) is visualized in Figure 5, where colors indicate the communities and
give an idea of the dispersion produced by this algorithm.</p>
      <p>
        The most common movie genres in the dataset are Drama, Suspense, Action,
Comedy, and Horror. Taking this into account, the corresponding percentages
for each of the five movie categories in the nine considered communities are
presented in Table 3, where the abbreviations indicate Category (C), Group (G), Drama
(Dra), Suspense (Susp), Action (Act), Comedy (Com), and Horror (Horr). Although the Comedy and
Drama categories seem emotionally opposed, and although only exclusive
algorithms are considered in this work, both categories have more elements
than the other categories in the first two communities. This overlap among
genre categories arises because actors participate in movies of different genres.
The predominance of categories in every community is considered
because it is helpful for different applications, such as recommender systems [<xref ref-type="bibr" rid="ref17">17</xref>],
marketing [<xref ref-type="bibr" rid="ref14">14</xref>], or social network analysis [<xref ref-type="bibr" rid="ref12">12</xref>], when seeking associations among
users. In this study only a few common features were observed; however, this
encourages further studies of Semantic Web data in which
other domains are processed to discover information and to analyze the
distribution of the data and the features of the groups.
      </p>
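      <p>The per-community genre percentages discussed above can be derived from a community assignment in a few lines. The sketch below is only an assumption-laden illustration: the function name genre_percentages, the sample movies, and the simplification that every movie carries exactly one genre are hypothetical, and the numbers are not those of Table 3.</p>
      <preformat>
```python
from collections import Counter, defaultdict


def genre_percentages(membership, genre_of):
    """Percentage of each genre inside every community.

    membership: dict mapping a movie to its community label.
    genre_of: dict mapping a movie to a single genre (a simplification).
    """
    counts = defaultdict(Counter)
    for movie, community in membership.items():
        counts[community][genre_of[movie]] += 1

    percentages = {}
    for community, genre_counts in counts.items():
        size = sum(genre_counts.values())
        percentages[community] = {
            genre: 100.0 * n / size for genre, n in genre_counts.items()
        }
    return percentages


# Hypothetical assignment of five movies to two communities.
membership = {"m1": 0, "m2": 0, "m3": 0, "m4": 0, "m5": 1}
genre_of = {"m1": "Drama", "m2": "Drama", "m3": "Comedy",
            "m4": "Drama", "m5": "Horror"}
shares = genre_percentages(membership, genre_of)
```
      </preformat>
      <p>The genre with the largest share in a community can then be read as its predominant category.</p>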
      <p>Table 3 (column headers: G/C, Dra, Susp, Act, Com, Horr, # elements)</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>In this paper, some community detection algorithms were tested over Semantic Web
data. These algorithms rely on topological features and on the fluctuation of an
evaluation measure to determine groups or communities of acceptable quality.
The information provided by community detection algorithms is helpful to produce
observations or predictions from data. In this sense, we aimed to develop a
strategy for analyzing and organizing information provided by a structured data
source such as the Semantic Web. The Semantic Web offers many benefits, such
as a graph-based model, facts about real-world objects, and the integration of
heterogeneous data sources, but it is a huge repository that is impractical to
process on a single computer. For this reason, only a relatively small subset
of the Semantic Web was chosen to test the selected algorithms.
Because some data adaptations were required, we provided a strategy composed of
stages for data acquisition, preprocessing, community detection, and analysis.
As a result of testing the algorithms and generating communities, we analyzed
information about the actors and movies domain as a proof of concept.</p>
      <p>Therefore, we consider that our strategy can be used in other domains such
as Biology, Medicine, or Publishing, to mention a few.</p>
      <p>In addition, it is important to note that community detection algorithms
provide new facts that can be used for inference tasks, which are very important
in the Semantic Web area. Therefore, the strategy to apply community
detection algorithms over Semantic Web data may be used in further tasks such
as:</p>
      <list list-type="bullet">
        <list-item><p>Testing with diverse knowledge bases and domains</p></list-item>
        <list-item><p>Leveraging the output of the best community detection algorithm in tasks
related to OL&amp;P (e.g., ontology axiom generation)</p></list-item>
        <list-item><p>Enriching the LOD cloud with newly discovered insights
(context, group descriptions, inference)</p></list-item>
      </list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bikakis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skourla</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Papastefanatos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>rdf: Synopsviz-a framework for hierarchical linked data visual exploration and analysis</article-title>
          .
          <source>In: European Semantic Web Conference</source>
          . pp.
          <fpage>292</fpage>
          -
          <lpage>297</lpage>
          . Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Blondel</surname>
            ,
            <given-names>V.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guillaume</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lambiotte</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lefebvre</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Fast unfolding of communities in large networks</article-title>
          .
          <source>Journal of Statistical Mechanics: Theory and Experiment</source>
          (
          <volume>10</volume>
          ) (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Brandes</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>A faster algorithm for betweenness centrality*</article-title>
          .
          <source>Journal of mathematical sociology</source>
          <volume>25</volume>
          (
          <issue>2</issue>
          ),
          <fpage>163</fpage>
          -
          <lpage>177</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Clauset</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Newman</surname>
            ,
            <given-names>M.E.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moore</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Finding community structure in very large networks</article-title>
          .
          <source>Phys. Rev. E</source>
          <volume>70</volume>
          ,
          <issue>066111</issue>
          (Dec
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Colomo-Palacios</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sánchez-Cervantes</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alor-Hernández</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodríguez-González</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Linked data: Perspectives for it professionals</article-title>
          .
          <source>International Journal of Human Capital and Information Technology Professionals</source>
          <volume>3</volume>
          (
          <issue>3</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Euzenat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shvaiko</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <source>Ontology Matching (Second Edition)</source>
          , vol. 2. Springer Berlin Heidelberg (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Farsani</surname>
            ,
            <given-names>H.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nematbakhsh</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lausen</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Srank: Shortest paths as distance between nodes of a graph with application to RDF clustering</article-title>
          .
          <source>J. Information Science</source>
          <volume>39</volume>
          (
          <issue>2</issue>
          ),
          <fpage>198</fpage>
          -
          <lpage>210</lpage>
          (
          <year>2013</year>
          ), http://dx.doi.org/10.1177/0165551512463994
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Fortunato</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Community detection in graphs</article-title>
          .
          <source>Physics Reports</source>
          <volume>486</volume>
          (
          <issue>3</issue>
          ),
          <volume>103</volume>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Giannini</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Rdf data clustering</article-title>
          .
          <source>In: BI Systems Workshops</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Harenberg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bello</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gjeltema</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ranshous</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harlalka</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seay</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Padmanabhan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Samatova</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Community detection in large-scale networks: a survey and empirical evaluation</article-title>
          . Wiley Interdisciplinary Reviews (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Hotho</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maedche</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Staab</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Ontology-based text document clustering</article-title>
          .
          <source>KI</source>
          <volume>16</volume>
          (
          <issue>4</issue>
          ),
          <fpage>48</fpage>
          -
          <lpage>54</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Jung</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Euzenat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Towards semantic social networks</article-title>
          .
          <source>In: European Semantic Web Conference</source>
          . pp.
          <fpage>267</fpage>
          -
          <lpage>280</lpage>
          . Springer (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Maedche</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zacharias</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Clustering ontology-based metadata in the semantic web</article-title>
          .
          <source>In: Principles of Data Mining and Knowledge Discovery</source>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Nadori</surname>
            ,
            <given-names>Y.L.E.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Erramdani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moussaoui</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Semantic web-marketing 3.0: Advertisement transformation by modeling</article-title>
          .
          <source>In: Multimedia Computing and Systems (ICMCS)</source>
          , 2014 International Conference on. pp.
          <fpage>569</fpage>
          -
          <lpage>574</lpage>
          . IEEE (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Newman</surname>
            ,
            <given-names>M.E.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Girvan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Finding and evaluating community structure in networks</article-title>
          .
          <source>Phys. Rev. E</source>
          <volume>69</volume>
          ,
          <elocation-id>026113</elocation-id>
          (Feb
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Newman</surname>
            ,
            <given-names>M.E.J.</given-names>
          </string-name>
          :
          <article-title>Finding community structure in networks using the eigenvectors of matrices</article-title>
          .
          <source>Physical Review E</source>
          <volume>74</volume>
          (
          <issue>3</issue>
          ),
          <elocation-id>036104</elocation-id>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Passant</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>dbrec - Music recommendations using DBpedia</article-title>
          .
          <source>In: The Semantic Web - ISWC 2010</source>
          ,
          <series>LNCS</series>
          , vol.
          <volume>6497</volume>
          , pp.
          <fpage>209</fpage>
          -
          <lpage>224</lpage>
          . Springer Berlin Heidelberg (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Pons</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Latapy</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Computing communities in large networks using random walks</article-title>
          .
          <source>In: Computer and Information Sciences</source>
          , vol.
          <volume>3733</volume>
          , pp.
          <fpage>284</fpage>
          -
          <lpage>293</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Rosvall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Axelsson</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergstrom</surname>
            ,
            <given-names>C.T.</given-names>
          </string-name>
          :
          <article-title>The map equation</article-title>
          .
          <source>The European Physical Journal Special Topics</source>
          <volume>178</volume>
          (
          <issue>1</issue>
          ),
          <fpage>13</fpage>
          -
          <lpage>23</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Schaeffer</surname>
            ,
            <given-names>S.E.</given-names>
          </string-name>
          : Survey:
          <article-title>Graph clustering</article-title>
          .
          <source>Comput. Sci. Rev</source>
          . (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Zafarani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abbasi</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Social media mining: an introduction</article-title>
          . Cambridge University Press (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>