<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Modularization of Graph-Structured Ontology with Semantic Similarity</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Soudabeh Ghafourian</string-name>
          <email>so.ghafourian@stu-mail.um.ac.ir</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amin Rezaeian</string-name>
          <email>amin.rezaeian@stu-mail.um.ac.ir</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mahmoud Naghibzadeh</string-name>
          <email>naghibzadeh@um.ac.ir</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Engineering, Ferdowsi University of Mashhad</institution>
          ,
          <addr-line>Mashhad</addr-line>
          ,
          <country country="IR">Iran</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Modularization is a key approach to managing the size and complexity of large ontologies by replacing each one with a set of smaller ontologies. Two reasons for this requirement are that current ontology languages such as OWL do not allow partial reuse of ontologies, and that ontologies keep growing to cover more knowledge of a specific domain. Many existing modularization methods focus on either the semantic or the structural aspects of ontologies, while both of them are important. In this paper, we consider both semantics and structure by combining these aspects using random walk algorithms to achieve a balance between them. We also define weights for different relations to take semantics into account. The proposed method is designed using two algorithms: a greedy algorithm and a heuristic one that reduces run-time and time-complexity. Our goal is to produce reusable modules of high quality and to support large ontologies. The results of the experiments show that our algorithms perform well in comparison with the existing gold standard.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology Modularization</kwd>
        <kwd>Partitioning</kwd>
        <kwd>Ontology Reuse</kwd>
        <kwd>Semantic Similarity</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>
        Today, we use ontologies in the context of information processing, semantic web, and
many other contexts [
        <xref ref-type="bibr" rid="ref1">1</xref>
]. Large ontologies usually contain many terms and are confronted with challenges
throughout their life cycle, such as processing and
maintenance [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Modularization is an approach to tackle challenges of large ontologies. An
ontology module is a small ontology that can have inter-module links with other small
ontologies [
        <xref ref-type="bibr" rid="ref1">1</xref>
] and the union of the produced modules is semantically equal to the original
ontology [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Ontology modularization is used to achieve different goals such as scalability for
querying data and reasoning on ontologies, scalability for evolution and maintenance,
complexity management, understandability, context-awareness and personalization,
and reusability. The goals of ontology modularization can affect the
understandability, the advantages, and the disadvantages of the resulting modules [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
Reusability is one of the important goals of ontology modularization, and it applies
when designing ontologies or when developing new applications based on ontologies.
Current ontology languages such as OWL do not allow importing only parts of an
ontology. This is a problem because if an ontology developer needs to reuse only
parts of an ontology, they must import the whole ontology, which takes more space
[
        <xref ref-type="bibr" rid="ref4">4</xref>
]. For example, consider a home appliances ontology and an ontology developer who
is interested in small home appliances; the developer would like to import the part of
the ontology that concerns only small home appliances. Therefore, they are not
interested in loading the whole ontology but rather an extracted module of the ontology
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>In this paper, we present modularization algorithms that assign weights to the
different relations that are used in the formalization of the ontology. The goal of our
modularization is reusability. The proposed modularization algorithms are performed
after the weighting phase of the different relations is completed. Until now, some of these
relations, such as inverseOf, unionOf, intersectionOf, and disjointWith, have not been
considered. Since different kinds of relations represent different semantic aspects
of the ontology, the weighting is determined according to the importance of these
relations. The proposed algorithms consider both structural and semantic criteria.</p>
      <p>Our goal is to produce modules in which concepts are both structurally and
semantically closely related; hence, the resulting subontologies can be reused in developing
new applications.</p>
<p>Modularization is done based on an agglomerative hierarchical algorithm on which
we perform some optimizations to achieve a better result, and we use a new scoring function.
We also propose another algorithm to reduce time-complexity.</p>
      <p>The remainder of the paper is organized as follows: in the next section, we briefly
present some of the related work. In section 3, we define weights for different
relations; describe our modularization algorithms and criteria for modularization. In
Section 4, we evaluate the proposed method and report our results. Finally, Section 5
concludes this paper with future work suggestions.
</p>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>
In this section, we review related work on ontology modularization. Recently, many
different approaches have been proposed for this purpose. There are several
categories for modularization. We categorize modularization based on types of
representation of ontology into two sets: graph-based approaches and logic-based approaches
[
        <xref ref-type="bibr" rid="ref2 ref3">2-3</xref>
        ]. Our work is based on graph approaches. In logic-based approaches, the
modules are produced based on logical representation of ontologies. The work in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
extracts modules from OWL-DL ontology based on user's semantic query. Graph-based
approaches use graph-theoretic algorithms to traverse the hierarchy of ontologies and
some heuristics to get relevant modules [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In general, the works [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7-10</xref>
        ] use
agglomerative algorithms to modularize ontologies. An agglomerative approach is an
iterative bottom up one in which, in each iteration, two modules with the highest similarity
are merged to produce a new module. The work in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] defines structural and linguistic
similarities. The first criterion is based on hierarchy of classes and the other one is
based on similarities between the local descriptions of the classes. They present
partition algorithms that combine criteria as a scoring function. Their goal is ontology
matching. In ontology matching, one tries to find, among a set of ontologies, the most
similar ontology to the one for which a match is requested. The approach performs
modularization on each pair of ontologies whose similarity is needed. In the
modularization, the hierarchical subclassOf relation and a linguistic similarity measure are used.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref8">8</xref>
], a weighted graph is constructed from rdfs:subClassOf and
rdfs:subPropertyOf relationships. owl:equivalentClass and owl:equivalentProperty relations are
identified in a preprocessing stage. They present a structure-oriented partitioning
algorithm and add RDF sentences to construct blocks from the modules. The goal of
this work is ontology matching. This work only considers limited relationships.
Relations such as inverseOf, unionOf, intersectionOf, disjointWith, are not considered.
      </p>
      <p>
        Paper [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] describes a structure-based ontology partitioning. They construct a
directed weighted graph based on structure similarities. Five types of criteria between
concepts, i.e., subclass, domain/range, definition, substring, and distance relation, are
defined. The weighted matrix is constructed according to the number of connections
between nodes; then they use Line Island Method [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] to partition and finally perform
optimization to improve the partitioning.
      </p>
      <p>
        The work in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] proposes an ontology partitioning method, which produces
overlapped modules, i.e., the final modules may have common concepts. They use semantic
similarity between concepts, so their generated graph is conceptual. This work
considers both structure and semantics, but it does not distinguish between different
relationships.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], different graph representations for ontologies are developed. There are three
basic representations and two different extensions of these representations. They use
several methods to convert ontologies into graphs, and apply community detection
algorithms to partition the graphs. Their experiments show that the algorithms work
much better when subject, object and predicate are represented as different nodes.
Paper [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] further develops the findings of paper [
        <xref ref-type="bibr" rid="ref9">9</xref>
], but the main difference is that
they define a weight function for the different relations of the ontology, as shown in Table
1; this is a first step toward a semantic approach. They apply three community detection
algorithms (a type of agglomerative algorithm) on different graph representations.
      </p>
      <p>
        The general topic of papers [
        <xref ref-type="bibr" rid="ref7 ref8">7-8</xref>
        ] is ontology matching based on structural aspects
of ontologies. However, we are interested in ontology modularization with the goal of
reusability of resulting modules. The works described in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and [
        <xref ref-type="bibr" rid="ref13">13</xref>
          ] mostly use
structure to build a graph representation of an ontology and use classic graph
clustering methods to modularize ontologies, while our method uses a hierarchical
clustering method. We use different graph clustering methods than [
          <xref ref-type="bibr" rid="ref10 ref9">9-10</xref>
          ]; in addition,
we consider more relations. These relations are discussed in Section 3.2.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Proposed Approach</title>
      <p>
          In this section, we first describe how we represent ontologies. Then we present how
we construct the weighted matrix. Next, we normalize the weights of the edges, and then
the Neighborhood Random Walk Distance [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] is introduced. Finally, the modularization
algorithms are described.
      </p>
      <p>
          Our goal is to bring together into one module the most related concepts, i.e., those with the
highest semantic similarities, which presumably describe one subdomain. This agrees
with the concept of domain-specific ontologies [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. If a good modularization
algorithm, such as an agglomerative algorithm, and a suitable scoring function are used, this
goal is reachable.
      </p>
      <p>
        Agglomerative algorithms [
        <xref ref-type="bibr" rid="ref16">16</xref>
] are algorithms in which, in each iteration, the two modules with the highest
similarity are merged. They are bottom-up: initially, every node
is considered an independent module and, in the end, only one module remains.
      </p>
      <sec id="sec-3-1">
        <title>Different Graph Representations of Ontology</title>
        <p>
The Ontology Web Language (OWL, http://www.w3.org/OWL) is a semantic web language. It is based on the
Resource Description Framework (RDF, http://www.w3.org/RDF). RDF represents information as triples of the
form (Subject, Predicate, Object). RDF triples can be mapped to a graph where the
subject and object are nodes and each predicate is a directed edge from the subject to the
object. This provides a simple mental model for RDF that is frequently used [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>
          We also use other graph models: since the predicate of one triple may be a subject or an
object in some other triple, we represent every subject, object, and predicate as a
separate node [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
        <p>
          There are various representations of ontologies [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. We represent each of the
subjects, objects, and predicates as a separate node, and we have two types of edges:
one type of edge is from subject to predicate and the other is from predicate to object.
A predicate node contains an object or datatype property. We also consider every
individual as a node.
        </p>
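        <p>A minimal sketch of this representation in Python (the triples and helper below are illustrative, not the authors' implementation): every subject, predicate, and object becomes a separate node, with one edge type from subject to predicate and another from predicate to object.</p>
        <preformat>
```python
# Sketch: map RDF-style triples to a directed graph in which every
# subject, predicate, and object is a separate node (illustrative data).
def triples_to_graph(triples):
    nodes = set()
    edges = []  # (source, target, edge_type)
    for subj, pred, obj in triples:
        nodes.update([subj, pred, obj])
        edges.append((subj, pred, "subject-to-predicate"))
        edges.append((pred, obj, "predicate-to-object"))
    return nodes, edges

# Note that the predicate "knows" also appears as a subject, which is
# why predicates are modeled as nodes of their own.
triples = [
    ("Person", "subClassOf", "Agent"),
    ("knows", "domain", "Person"),
]
nodes, edges = triples_to_graph(triples)
```
        </preformat>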
      </sec>
      <sec id="sec-3-2">
        <title>Constructing Weighted Matrix</title>
<p>We define a weight function to assign weights to the different relationships of the ontology.
This function assigns an integer to each existing relationship, as shown in Table 1.
The main reason for weighting relationships is that different relationships have
different semantics and reflect different aspects of the ontology. We would like to
distinguish between these aspects and their importance through their weights. This does not
mean that a relation with a higher weight is more valuable than a relation with a lower
weight; rather, it sometimes means that a relation with a higher weight has existential
precedence over a relation with a lower weight. Therefore, the weights can be
changed for different applications and/or in different contexts. For example, if the
subClassOf relation exists because it is part of the ontology language, then the domain and range
relations are meaningful. Therefore, we assign weight 10 to subClassOf and weight 5
(i.e., one half the weight) to the domain/range relation. In some other situations, the
weights are assigned based on the wideness or narrowness of their meanings.
Examples are given below.</p>
      </sec>
      <sec id="sec-3-3">
        <title>List of Different Relationships and Weights</title>
        <p>
          The weights represented in Table 1 are based on previous research and on our own assessments of relations that were not
studied by previous research. The base of the weights is what is mentioned in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. For new
relations, the weight assignment logic is explained in the previous subsection.
        </p>
        <p>
          The equivalent relation denotes that two classes have the same meaning, so
classes that have an equivalent relation are put into one module. Considering the
meaning of the equivalent relation, it is given the highest weight. Furthermore,
subclass relations give important information about classes; hence, these
relationships have a high semantic contribution to ontology modularization. As
mentioned before, the weight of this relationship is higher than that of the domain/range
relationship but not as high as that of the equivalent relationship [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
        <p>The union and intersection relations are treated the same as subclass relations because if,
for example, class A is the union of B, C, and D, then each of B, C, and D is a
subClassOf A.</p>
        <p>If a disjoint relation exists between highest-level concepts, the weight is considered
to be zero because they are truly disjoint; however, if it occurs between lowest-level
concepts, the weight is considered 10. If it occurs somewhere in between, the weight is
assigned accordingly.</p>
        <p>When two concepts have a complement relationship, it means they are strongly
connected. We consider that at first, they have a subclass relationship (their super
class is the universal set) and then they have a complement relationship.</p>
        <p>We put an inverseOf relation in the module of the property it relates to, so
its weight should be high.</p>
        <p>For object properties, when a property has an inverse functional attribute, it
means the property implies a unique value; on the other hand, domain and range
edges represent subdomains of ontologies. We add restrictions such as cardinality
after modularization because they contain literal values.</p>
        <p>Unifying the Weights of Edges. If there is more than one relationship between two
nodes, we add up the weights of all the existing relationships between the nodes.
Thus, all weights are taken into consideration in our modularization algorithms.</p>
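        <p>The unification step can be sketched as follows (a Python sketch; the weight values are illustrative assumptions in the spirit of Table 1, which is not reproduced here): when several relationships connect the same pair of nodes, their weights are summed into a single edge weight.</p>
        <preformat>
```python
# Illustrative weight table (values are assumptions, not the paper's Table 1).
WEIGHTS = {"equivalentClass": 20, "subClassOf": 10, "unionOf": 10,
           "intersectionOf": 10, "domain": 5, "range": 5, "inverseOf": 10}

def unify_edge_weights(relations):
    """relations: list of (source, target, relation) tuples.
    Returns a dict mapping (source, target) to the summed weight."""
    weights = {}
    for src, dst, rel in relations:
        # Sum the weights of all relationships between the same node pair.
        weights[(src, dst)] = weights.get((src, dst), 0) + WEIGHTS.get(rel, 1)
    return weights

# Two relationships between A and B are unified into one weighted edge.
w = unify_edge_weights([("A", "B", "subClassOf"), ("A", "B", "domain")])
```
        </preformat>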
      </sec>
      <sec id="sec-3-4">
        <title>Normalization</title>
        <p>In this phase, the weights of the edges in the graph are normalized to be between zero
and one. The weight of each edge outgoing from a node v is divided by the sum of
the weights of all outgoing edges of node v. This is needed because every element of the input
matrix of the random walk must be between zero and one.</p>
        <p>p(v, u) = w(v, u) / ∑<sub>x ∈ N(v)</sub> w(v, x)   (1)</p>
        <p>where w(v, u) is the weight of the edge outgoing from node v to node u, which is being normalized,
N(v) is the set of out-neighbors of v, and the denominator is the sum of the weights of the outgoing edges of node v.</p>
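        <p>This normalization can be sketched in Python (assuming the unified edge weights are stored per node as a dictionary of out-neighbors):</p>
        <preformat>
```python
def normalize_out_weights(out_weights):
    """out_weights: dict node -> dict of (neighbor -> weight).
    Returns transition probabilities p(v, u) = w(v, u) / sum of w(v, x),
    so every row of the resulting transition matrix sums to one."""
    probs = {}
    for v, nbrs in out_weights.items():
        total = sum(nbrs.values())
        probs[v] = {u: w / total for u, w in nbrs.items()}
    return probs

# Toy example: node A has outgoing weights 10, 5, 5 (sum 20).
p = normalize_out_weights({"A": {"B": 10, "C": 5, "D": 5}})
```
        </preformat>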
      </sec>
      <sec id="sec-3-5">
        <title>Neighborhood Random Walk Distance</title>
        <p>
          We use the neighborhood random walk method [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] to measure vertex closeness. A
random walk is a mathematical representation of the path one may navigate through
multiple random steps.
        </p>
        <p>d(u, v) = ∑<sub>τ: u→v, length(τ) ≤ l</sub> p(τ) c (1 − c)<sup>length(τ)</sup>   (2)</p>
        <p>where P is the transition probability matrix, τ is a path from u to v, length(τ) is the length of the random
walk with length(τ) ≤ l, l is the maximum length a random walk can go, c is the restart
probability with c ∈ (0,1), p(τ) is the transition probability of path τ, and d(u, v) is the
neighborhood random walk distance from u to v, which measures the closeness of the vertices u and v.</p>
        <p>
          We perform matrix multiplication on a transition probability matrix of the ontology
graph to use the neighborhood random walk model [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
(1)
(2)
(3)
= ∑
        </p>
        <p>(1 − )</p>
        <p>In this equation, R is the neighborhood random walk distance matrix. Intuitively,
every element at row i, column j in R captures the probability of navigating from
node i to node j in at most l steps in the graph. l is the maximum length a random walk can go,
carried over from the previous formula, and γ is the random walk step. The neighborhood
random walk distance matrix is constructed in the following steps:</p>
        <p>1. Assigning weights to the different relationships. 2. Normalizing the weights of step 1. 3. Constructing the transition probability and neighborhood random walk distance matrices.</p>
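        <p>The construction of R by repeated matrix multiplication can be sketched in plain Python (toy two-node transition matrix; l and c are the walk length and restart probability defined above):</p>
        <preformat>
```python
def matmul(A, B):
    # Multiply two square matrices given as lists of lists.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def random_walk_matrix(P, l, c):
    """R = sum over gamma = 1..l of c * (1 - c)**gamma * P**gamma."""
    n = len(P)
    R = [[0.0] * n for _ in range(n)]
    Pg = [row[:] for row in P]        # current power of P (starts at P**1)
    for gamma in range(1, l + 1):
        coef = c * (1 - c) ** gamma
        for i in range(n):
            for j in range(n):
                R[i][j] += coef * Pg[i][j]
        Pg = matmul(Pg, P)            # advance to P**(gamma + 1)
    return R

# Toy 2-node transition matrix (rows sum to 1).
P = [[0.0, 1.0], [1.0, 0.0]]
R = random_walk_matrix(P, l=2, c=0.15)
```
        </preformat>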
      </sec>
      <sec id="sec-3-6">
        <title>Modularization Algorithm</title>
        <p>
          The proposed modularization algorithm in this section is an agglomerative algorithm.
The main difference between our algorithm and other similar algorithms in [
          <xref ref-type="bibr" rid="ref7 ref8">7-8</xref>
          ] and
[
          <xref ref-type="bibr" rid="ref10 ref9">9-10</xref>
          ] is that we use a new scoring function that calculates both intra- and
inter-connectivity. We apply the scoring function to every concept node in our algorithm in
order to improve the efficiency; that is, the distance between a node and every
other node is measured. Another difference is that we consider more relations than
other approaches do. We use an agglomerative algorithm such that, in each iteration, we
select the two modules whose merging has the highest positive impact on the score of the
modularization; those modules are then merged. From the scoring function perspective, the
scores of the other modules may not change, so the computation required for
the modularization score is not high. The process is shown in Algorithm 1.
        </p>
        <p>
          The advantages of agglomerative algorithms are that we do not need to know the
size and the number of modules in advance, and that the result of these algorithms depends on the
chosen similarity criterion [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. The input to our algorithm is the neighborhood random
walk distance matrix.
        </p>
        <p>
          Our criterion function is the silhouette coefficient s(i) [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], where −1 ≤ s(i) ≤ 1. If
s(i) is close to one, it means that node i is appropriately grouped. The average
s(i) over a module is a measure that shows how appropriately its nodes have been
grouped.
        </p>
        <p>s(i) = (a(i) − b(i)) / max{a(i), b(i)}   (4)</p>
        <p>where i is a node, a(i) is the average similarity of i to all other nodes within the
same module, b(i) is the highest average similarity of i to the nodes of another module,
and s(i) is computed for each node i. The score s(c) of a module c, which is the
average of s(i) over all nodes i in module c, is defined as follows:</p>
        <p>s(c) = (∑<sub>i ∈ c</sub> s(i)) / |c|   (5)</p>
        <p>In this equation, |c| is the number of nodes in module c. To score the
modularization C, we define the scoring function as follows:</p>
        <p>Score(C) = (∑<sub>c ∈ C</sub> s(c)) / |C|   (6)</p>
        <p>Equation (6) is used to compute the efficiency of the modularization. It is the
average of the scores of the modules, and each module’s score is the average of the
scores of its nodes. As the score of every node is calculated using both intra- and
inter-module connections, the resulting efficiency of the modularization is affected by both the
intra- and inter-module connections of all nodes in the graph.</p>
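        <p>A minimal Python sketch of this similarity-based silhouette scoring (the module layout and the similarity matrix are toy assumptions; the real input is the neighborhood random walk distance matrix):</p>
        <preformat>
```python
def node_silhouette(i, modules, sim):
    """s(i) based on similarities: a(i) is the average similarity of i to its
    own module, b(i) the highest average similarity of i to another module."""
    own = [m for m in modules if i in m][0]
    others = [m for m in modules if i not in m]
    a = (sum(sim[i][j] for j in own if j != i) / (len(own) - 1)) if len(own) > 1 else 0.0
    b = max(sum(sim[i][j] for j in m) / len(m) for m in others)
    return (a - b) / max(a, b)

def score(modules, sim):
    # Average module score; each module score averages its nodes' s(i).
    per_module = [sum(node_silhouette(i, modules, sim) for i in m) / len(m)
                  for m in modules]
    return sum(per_module) / len(modules)

# Toy 4-node similarity matrix: nodes 0,1 are similar; nodes 2,3 are similar.
sim = [[1.0, 0.9, 0.1, 0.1],
       [0.9, 1.0, 0.1, 0.1],
       [0.1, 0.1, 1.0, 0.8],
       [0.1, 0.1, 0.8, 1.0]]
good = score([[0, 1], [2, 3]], sim)   # natural grouping scores high
bad = score([[0, 2], [1, 3]], sim)    # mixed grouping scores low
```
        </preformat>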
        <p>Algorithm 1. Modularization (adjacencyMatrix)
// C represents the whole modularization
C = put every node in a separate module
for K = N down to 2 // K is the current number of modules
    maxS = -2
    // try all pairs of modules that could be merged
    for i = 1 to K-1
        for j = i+1 to K
            c = Union(i, j); // merge modules i and j into module c
            C' = C ⋃ {c} \ {i, j}; // candidate modularization
            S = Score(C'); // according to equation (6)
            // if this new score is better, save it
            if S &gt; maxS
                maxS = S;
                modularization = C';
            end if
        end for
    end for
    C = modularization; // commit the best merge of this level
    bestModularization = C;
end for</p>
        <p>Heuristic Algorithm. We propose a heuristic algorithm to support large ontologies
and reduce the time-complexity. It is shown in Algorithm 2. The input of this algorithm is
an incidence matrix constructed from the matrix R calculated using
equation (3). Each row of the incidence matrix represents an edge, consisting of its first
node, its second node, and its weight. This representation is useful for large data sets. The matrix is
sorted in descending order by the weight column.</p>
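        <p>The greedy procedure of Algorithm 1 can be sketched in Python as follows (the scoring function here is a simplified stand-in for the silhouette-based score, and the similarity matrix is a toy example):</p>
        <preformat>
```python
def simple_score(modules, sim):
    # Simplified stand-in for the paper's scoring function: average
    # intra-module similarity minus average inter-module similarity.
    intra, inter = [], []
    for a in range(len(modules)):
        for i in modules[a]:
            for j in modules[a]:
                if j > i:
                    intra.append(sim[i][j])
        for b in range(a + 1, len(modules)):
            for i in modules[a]:
                for j in modules[b]:
                    inter.append(sim[i][j])
    ia = sum(intra) / len(intra) if intra else 0.0
    ib = sum(inter) / len(inter) if inter else 0.0
    return ia - ib

def greedy_modularize(sim, target_modules=2):
    modules = [[i] for i in range(len(sim))]
    while len(modules) > target_modules:
        best, best_modules = -2.0, None
        # trial-merge every pair of modules and keep the best-scoring merge
        for a in range(len(modules) - 1):
            for b in range(a + 1, len(modules)):
                cand = [m for k, m in enumerate(modules) if k not in (a, b)]
                cand.append(modules[a] + modules[b])
                s = simple_score(cand, sim)
                if s > best:
                    best, best_modules = s, cand
        modules = best_modules    # commit the best merge of this level
    return modules

sim = [[1.0, 0.9, 0.1, 0.1],
       [0.9, 1.0, 0.1, 0.1],
       [0.1, 0.1, 1.0, 0.8],
       [0.1, 0.1, 0.8, 1.0]]
result = greedy_modularize(sim)
```
        </preformat>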
        <p>This method reduces the time-complexity: the complexity of Algorithm 1 is
O(n³ · complexity of scoring), where n is the number of concepts in the ontology.
Algorithm 2. Modularization (adjacencyMatrix)
// C represents the whole modularization
C = put every node in a separate module
// build the incidence matrix from the adjacency matrix R
incidenceMatrix = constructIncidenceMatrix(adjacencyMatrix);
incidenceMatrix = Sort(incidenceMatrix); // descending by weight
for i = 1 to K // K is the number of edges of the incidence matrix
    if NumberOfModules == 1
        break;
    end if
    // module(i,1) is the module of the start node of edge(i)
    // module(i,2) is the module of the end node of edge(i)
    c = Union(module(i,1), module(i,2));
    // if the size of the merged module is more than the threshold,
    // which is number of concepts / 3, do not combine the modules
    if |c| &gt; n/3
        continue;
    else
        C = C ⋃ {c} \ {module(i,1), module(i,2)};
    end if
    // if the score of the modularization does not change,
    // do not merge further and terminate the algorithm
    sc = Score(C);
    if sc_old == sc
        break;
    end if
    sc_old = sc;
end for</p>
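        <p>The heuristic of Algorithm 2 can be sketched in Python as follows (toy edge list; the size threshold plays the role of number of concepts divided by 3):</p>
        <preformat>
```python
def heuristic_modularize(n, edges, max_size):
    """edges: (weight, u, v) tuples; modules are merged along heaviest
    edges first, skipping merges that would exceed max_size nodes."""
    module_of = list(range(n))            # module id of each node
    members = {i: [i] for i in range(n)}  # module id -> node list
    for w, u, v in sorted(edges, reverse=True):
        a, b = module_of[u], module_of[v]
        if a == b:
            continue                      # already in the same module
        if len(members[a]) + len(members[b]) > max_size:
            continue                      # merged module would be too large
        for node in members.pop(b):       # merge module b into module a
            module_of[node] = a
            members[a].append(node)
    return [sorted(m) for m in members.values()]

# Heaviest edges first: 0-1 and 2-3 merge; the 1-2 merge exceeds the limit.
edges = [(0.9, 0, 1), (0.8, 2, 3), (0.1, 1, 2)]
modules = heuristic_modularize(4, edges, max_size=2)
```
        </preformat>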
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Results</title>
      <p>
        We have implemented the proposed modularization algorithms in Matlab and used
Java to process the ontologies. We use the F-measure [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] to evaluate our methods.
The F-measure is a value between zero and one, and higher values indicate better performance.
This metric compares produced modules to reference modules that are already
available for the tested ontologies. The F-measure is computed for pairs of modules in which one
module is selected from the reference modularization and the other is one of the
generated modules. Finally, the average of the F-measures of all modules is computed as the
F-measure of the whole modularization.
      </p>
      <p>F-measure = (2 ∗ precision ∗ recall) / (precision + recall)   (7)</p>
      <p>where precision is the number of common concepts between the two modules divided
by the number of concepts in the generated module. On the other hand, recall is calculated by
dividing the number of common concepts by the number of concepts in the reference module.</p>
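      <p>This computation can be sketched in Python (the module contents are illustrative):</p>
      <preformat>
```python
def f_measure(generated, reference):
    """F-measure between one generated and one reference module,
    each given as a set of concept names (equation (7))."""
    common = len(generated.intersection(reference))
    if common == 0:
        return 0.0
    precision = common / len(generated)
    recall = common / len(reference)
    return 2 * precision * recall / (precision + recall)

# precision = 3/3 = 1.0, recall = 3/4 = 0.75
f = f_measure({"Reference", "Book", "Monograph"},
              {"Reference", "Inproceedings", "Book", "Monograph"})
```
      </preformat>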
      <sec id="sec-4-1">
        <title>Dataset</title>
        <p>
          We use the FOAF, AAIR, BIO and SWCO ontologies as introduced in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and
compare our results to the concept groupings of these ontologies that are provided on their
websites and in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Table 2 shows the details of these ontologies. The following
provides a brief description of each:
 Friend of a Friend (FOAF) Ontology (http://xmlns.com/foaf/spec/20100101.html): FOAF describes persons,
their activities, and other personal information.
 Biographical Information Ontology (BIO) (http://vocab.org/bio/0.1/.html): BIO represents
biographical information about people, both living and dead.
 Semantic Web Conference Ontology (SWCO) (http://data.semanticweb.org/ns/swc/ontology): SWCO defines concepts about
academic conferences.
 Atom Activity Streams Vocabulary ontology (AAIR) (http://xmlns.notu.be/aair): AAIR is a
vocabulary for describing social networking site activities.
        </p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], they apply three algorithms: Fast Greedy Community (FGC), Walk
Community (WTC) and Spin Glass Community (SGC). Because FGC and WTC are
agglomerative algorithms, we compare our algorithms with them. As there are several
ways introduced in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] to represent an ontology as a graph, we choose the
representation in which every subject, object, and predicate is represented as
a separate node.
        </p>
        <p>
          Our results are shown in Table 3. Columns 1 and 2 show the F-measure of our
algorithms, and columns 4 and 5 show the F-measure of the algorithms of [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. The F-measure results
are multiplied by 100, as is done in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
Analysis of Results. The distribution of classes and properties within the reference concept
groupings affects the F-measure. For example, for the FOAF ontology, one group
contains only properties, whereas we consider both classes and properties when modularizing. In the concept
groupings of AAIR and SWCO, the distribution between classes and properties is
balanced and the subclass relation is the main structuring concept. The groups of the BIO
ontology comprise one group for classes and four groups for properties, so our score is
low. When the concept grouping contains groups that mix classes and properties,
our score is good, because the main objective of ontology modularization is that the
modules describe subdomains.
        </p>
        <p>
          A Simple Case Study. We also use a small ontology given in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], which consists of
six classes and one property. In [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], three modules are produced, i.e., the modules
{Reference, Inproceedings, Book, Monograph}, {Author, Person}, and {hasAuthor}.
However, in our work, we produce two modules: {Reference, Inproceedings, Book,
Monograph} and {hasAuthor, Author, Person}. The reason for these modularizations is
that while the work in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] only considers subClassOf relations, we have considered
both domain/range and subClassOf relations. The modularization presented by [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] is
assumed as the reference, and the calculation of the F-measure is shown in Table 4. The F-measures
comparing modules 1 and 2 are respectively 1.0 and 0.5, computed using
equation (7). As we have only two modules, the F-measure for the third module is
zero. The average of these three gives 0.5 as the F-measure of the whole modularization.
        </p>
        <p>We have proposed modularization algorithms based on the semantics and structure of an
ontology. Semantics is considered by assigning weights to different relationships.
Furthermore, we have considered more relationships than other approaches, which
consider only the hierarchical relation of classes. Considering more relationships of an
ontology leads to more edges in the graph representation of that ontology; thus,
one can make a better decision on whether two nodes are similar.</p>
        <p>We used the neighborhood random walk distance matrix to combine the semantic and
structural aspects of an ontology. Each element of this matrix is calculated
considering the weights of almost all elements of the transition probability matrix; thus, the weights
used in the proposed method are more precise than those of methods that only use the weight
matrix.</p>
        <p>We have introduced a new scoring function for merging modules. The objectives of
this function are to maximize the intra-module similarity and to minimize the
inter-module similarity. The scoring function shows how appropriately nodes have been
grouped in their modules according to these objectives.</p>
        <p>As a result, we have produced meaningful modules, as we consider more relations
than similar methods and process these relations such that each edge weight has an
impact on every module selection.</p>
        <p>In future work, we plan to evaluate our approach with additional evaluation methods
and datasets to determine the efficiency of our algorithms. Furthermore, we would like to
investigate the edge weights further to improve our approach.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Parent</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spaccapietra</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>An Overview of Modularity</article-title>
          . In:
          <string-name>
            <surname>Stuckenschmidt</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parent</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spaccapietra</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (eds.)
          <article-title>Modular Ontologies</article-title>
          .
          <source>LNCS</source>
          , vol.
          <volume>5445</volume>
          , pp.
          <fpage>5</fpage>
          -
          <lpage>23</lpage>
          . Springer, Heidelberg (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Özacar</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Öztürk</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ünalır</surname>
            ,
            <given-names>M.O.</given-names>
          </string-name>
          :
          <article-title>ANEMONE: An Environment for Modular Ontology Development</article-title>
          .
          <source>Data &amp; Knowledge Engineering</source>
          .
          <volume>70</volume>
          ,
          <fpage>504</fpage>
          -
          <lpage>526</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Doran</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tamma</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Payne</surname>
            ,
            <given-names>T.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palmisano</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>An Entropy Inspired Measure for Evaluating Ontology Modularization</article-title>
          .
          <source>In: 5th International Conference on Knowledge Capture (KCAP'09)</source>
          , pp.
          <fpage>73</fpage>
          -
          <lpage>80</lpage>
          . ACM, New York (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Pathak</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chute</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Survey of Modular Ontology Techniques and their Applications in the Biomedical Domain</article-title>
          .
          <source>Integrated Computer-Aided Engineering</source>
          .
          <volume>16</volume>
          ,
          <fpage>225</fpage>
          -
          <lpage>242</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qin</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>S.H.</given-names>
          </string-name>
          :
          <article-title>Extracting Module from OWL-DL Ontology</article-title>
          .
          <source>In: 2011 International Conference on System Science</source>
          , pp.
          <fpage>176</fpage>
          -
          <lpage>179</lpage>
          . IEEE Press (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Abadi</surname>
            ,
            <given-names>M.J.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zamanifar</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Producing Complete Modules in Ontology Partitioning</article-title>
          . In:
          <source>2011 International Conference on Semantic Technology and Information Retrieval</source>
          , pp.
          <fpage>137</fpage>
          -
          <lpage>143</lpage>
          . IEEE Press (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Partition-Based Block Matching of Large Class Hierarchies</article-title>
          . In:
          <string-name>
            <surname>Mizoguchi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shi</surname>
            ,
            <given-names>Zh.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giunchiglia</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (eds.) The Semantic Web - ASWC
          <year>2006</year>
          .
          <source>LNCS</source>
          , vol.
          <volume>4185</volume>
          , pp.
          <fpage>72</fpage>
          -
          <lpage>83</lpage>
          . Springer, Heidelberg (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Matching Large Ontologies: A divide-and-conquer approach</article-title>
          .
          <source>Data &amp; Knowledge Engineering</source>
          ,
          <volume>67</volume>
          , pp.
          <fpage>140</fpage>
          -
          <lpage>160</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Coskun</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rothe</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teymourian</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paschke</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Applying Community Detection Algorithms on Ontologies for Identifying Concept Groups</article-title>
          .
          <source>In: Proceedings of the 5th International Workshop on Modular Ontologies (WoMO</source>
          <year>2011</year>
          ), pp.
          <fpage>12</fpage>
          -
          <lpage>24</lpage>
          . IOS Press (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Coskun</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rothe</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paschke</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Ontology Content "At a Glance"</article-title>
          .
          <source>In: 7th International Conference on Formal Ontology in Information Systems (FOIS</source>
          <year>2012</year>
          ), pp.
          <fpage>147</fpage>
          -
          <lpage>159</lpage>
          . IOS Press (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Stuckenschmidt</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schlicht</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Structure-Based Partitioning of Large Ontologies</article-title>
          . In:
          <string-name>
            <surname>Stuckenschmidt</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parent</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spaccapietra</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (eds.)
          <article-title>Modular Ontologies</article-title>
          .
          <source>LNCS</source>
          , vol.
          <volume>5445</volume>
          , pp.
          <fpage>187</fpage>
          -
          <lpage>210</lpage>
          . Springer, Heidelberg (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Batagelj</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Analysis of large networks - islands</article-title>
          .
          <source>In: Dagstuhl seminar 03361: Algorithmic Aspects of Large and Complex Networks</source>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Etminani</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rezaeian-Delui</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Naghibzadeh</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Overlapped ontology partitioning based on semantic similarity measures</article-title>
          .
          <source>In: 2010 5th International Symposium on Telecommunications (IST)</source>
          , pp.
          <fpage>1013</fpage>
          -
          <lpage>1018</lpage>
          . IEEE (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <source>Resource Description Framework (RDF)</source>
          , http://www.w3.org/RDF/
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>J.X.</given-names>
          </string-name>
          :
          <article-title>Clustering Large Attributed Graphs: A Balance between Structural and Attribute Similarities</article-title>
          .
          <source>ACM Transactions on Knowledge Discovery from Data</source>
          .
          <volume>5</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>33</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Fortunato</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Community detection in graphs</article-title>
          .
          <source>Physics Reports</source>
          .
          <volume>486</volume>
          ,
          <fpage>75</fpage>
          -
          <lpage>174</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Rousseeuw</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Silhouettes: a Graphical Aid to the Interpretation and Validation of Cluster Analysis</article-title>
          .
          <source>Journal of Computational and Applied Mathematics</source>
          .
          <volume>20</volume>
          ,
          <fpage>53</fpage>
          -
          <lpage>65</lpage>
          (
          <year>1987</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>