<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>The European Physical Journal B 66 (2008) 409-418. doi:10.1140/epjb/
e2008</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1016/j</article-id>
      <title-group>
        <article-title>Constraint Community Detection: modelling approaches with applications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oksana Pichugina</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lyudmyla Kirichenko</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yurii Skob</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olha Matsiy</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv National University of Radio Electronics</institution>
          ,
          <addr-line>14 Nauki Avenue, Kharkiv, 61166</addr-line>
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Aerospace University "Kharkiv Aviation Institute"</institution>
          ,
          <addr-line>17 Chkalova Street, Kharkiv, 61070</addr-line>
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>V. N. Karazin Kharkiv National University</institution>
          ,
          <addr-line>4 Svobody Sq., Kharkiv, 61022</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>2353</volume>
      <fpage>20</fpage>
      <lpage>22</lpage>
      <abstract>
        <p>Community Detection (CD) is a fundamental issue in Network Analysis, focusing on identifying densely connected node groups within a network. Its broader interpretation, Constrained Community Detection (CCD), emerges when supplementary constraints are applied, expanding the scope of the problem. CD has been extensively explored in Network Analysis, boasting numerous developed exact and approximate methods. Conversely, CCD encompasses a more extensive array of real-world issues and applications within Network Analysis. There is a significant need to broaden the spectrum of CD problem variants by establishing rigorous mathematical models. These models would serve as the foundation for developing new exact and heuristic algorithms to solve these problems. This paper investigates various approaches to CCD problem (CCDP) modeling. Specifically, we introduce a novel method for problem modeling that encompasses a broader range of constraints and establish its correlation with the conventional CCDP modeling approach. Additionally, we demonstrate its distinct advantages. The integration of these approaches presents opportunities for extending the class of formalized CCDP as polynomial optimization problems. Consequently, these problems can be eficiently addressed using contemporary nonlinear solvers and can also be transformed into solvable QUBO models applicable to both quantum and digital annealing.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Community Detection</kwd>
        <kwd>Modularity</kwd>
        <kwd>Constraint Binary Optimization</kwd>
        <kwd>Integer Programming</kwd>
        <kwd>Polynomial Optimization</kwd>
        <kwd>Network</kwd>
        <kwd>Node partition</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Community Detection and Edge Clustering are two main techniques of network analysis (NA)
aiming to uncover structures and patterns within networks [3, 4, 5, 6, 7, 8, 9, 10, 11, 12].</p>
      <p>Community detection (CD, graph partitioning or network clustering) is the process of identifying
groups of nodes within a network called clusters that are more densely connected than nodes in other
clusters.</p>
      <p>CD aims to find natural subdivisions into dense clusters within the network called communities.
Community detection is a fundamental aspect of network analysis with wide-ranging implications
across various domains. Identifying communities within networks provides valuable insights into
complex systems’ structure, behavior, and function. Here are some reasons demonstrating the
importance of community detection: it provides an understanding of complex systems, allows performing
Social Network Analysis in particular, to explore collaboration networks and detect criminal
networks, performs Cultural and Societal Analysis; enables solving many research problems in
Biology and Bioinformatics, e.g., in Epidemiology and Disease Spread; helps to improve management of
Recommendation Systems, Urban Planning and Transportation Systems, made proper Market
Segmentation and study Customer Behavior; boost financial and economic systems by Fraud Detection
and enhancement Cybersecurity issues; rise performance of real networks by improving their
Robustness and Resilience; solve numerous problems of Natural Language processing by exploring Semantic
Web and Content Organization. Community detection has a wide range of applications across various
ifelds, where the goal is to uncover hidden structures, patterns, and relationships within networks.
To understand the numerous applications in the listed research domain, we outline some applications
[1, 2, 8, 11, 13, 14, 15, 16]:
1. Identifying groups of friends or communities in social networks, analyzing information flow
and influence propagation, detecting online communities in forums and social media platforms.
2. Discovering research communities and collaborations in academic citation networks,
identifying influential researchers or papers within specific fields.
3. Investigating criminal networks and identifying key actors and their associations, analyzing
patterns of criminal activity and connections.
4. Identifying protein complexes and functional modules in protein-protein interaction networks,
analyzing genetic regulatory networks and identifying co-regulated gene groups.
5. Tracking the spread of diseases through contact networks and identifying potential hotspots,
analyzing transmission patterns and identifying groups at higher risk.
6. Enhancing recommendation algorithms by considering communities of users with similar
interests, grouping web pages with similar content for better search results and content
organization.
7. Optimizing routing and resource allocation in communication networks, Identifying clusters of
devices in network trafic analysis.
8. Analyzing transportation networks to identify hubs, sub-communities, and trafic patterns,
designing eficient public transportation routes based on community structure.
9. Segmenting customers based on purchasing behavior and preferences, analyzing social
interactions to understand consumer trends.
10. Identifying groups of users engaging in coordinated fraudulent activities, detecting anomalies
and security threats by analyzing network behavior.
11. Organizing and categorizing web content based on thematic communities, enhancing search
results by considering community relevance.</p>
      <p>Real-world CD problems are often accompanied by additional constraints on nodes, edges and
communities, complicating their modeling and significantly afecting solution methods. This paper
studies the issue of modeling CD problems with additional constraints (Constraint Community
Detection). In particular, we propose a new approach to modeling the problems as Boolean constraint
optimization problems.</p>
    </sec>
    <sec id="sec-2">
      <title>1. Prerequisites</title>
      <p>The application domain of CD is far from limited to the above list. Conducting CD is valuable
whenever understanding network structure and relationships is important for making informed decisions
or gaining insights into complex systems.</p>
      <p>CD is conducted on networks, and it is necessary to distinguish networks and graphs. The terms
"network" and "graph" are closely related concepts in the NA and Graph Theory field, but they are
used in slightly diferent ways and contexts.</p>
      <p>A graph is a mathematical object that consists of a set of nodes (vertices) and a set of edges that
connect pairs of nodes. It is denoted as  = ( ,  ), where  = {1, ...,  } is a node-set (vertex set), 
is an edge set. Graphs represent relationships or connections between diferent entities of various
types. Graphs can be directed or undirected, weighted or unweighted.</p>
      <p>A network is a collection of interconnected elements. Networks can represent a wide range of
real-world systems with relationships or interactions between entities. A network can be represented
by nodes representing entities and edges representing interactions of the entities, i.e. any network is
representable by a certain graph.</p>
      <p>The community detection problem (CDP) in a network  is formulated as an optimization problem,
where the goal is to find a partition of nodes into communities that maximizes a certain objective
function. Diferent objective functions capture diferent aspects of community structure. Modularity
[11] and conductance [12] are the two most common.</p>
      <sec id="sec-2-1">
        <title>1.1. Modularity optimization CDP</title>
        <p>Modularity  quantifies the diference between the observed number of edges within communities
and the expected number of edges in a random graph. In this paper, modularity is chosen as a criterion
of optimization.</p>
        <p>The popular formalization (CDP statement) is
• Input: a network  by its weighted adjacency matrix  = [  ], ∈ ; the number  of desired
communities (or an upper bound on the number of communities).
• Output: a partition  of the nodes  into communities.</p>
        <p>• Objective function: maximization of modularity  .</p>
        <p>
          The modularity function  ( | ) assesses the extent to which a partition of network nodes
corresponds to the densely-connected node subsets in the network  and is defined as
 ( | ) =
1 ∑ (  −     ) {  =   },
2 , ∈ 2
(
          <xref ref-type="bibr" rid="ref1">1</xref>
          )
where
•   = ∑ ∈   is the weighted degree of the node  , 2 = ∑, ∈   is the total weight of the
network (in particular,  is the number of edges in  if the graph is unweighted),
the modularity function (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ):
        </p>
        <p>[1,  ] is the community assignment for the node  ∈  .</p>
        <p>•  {  =   } is an indicator function equal to one if   =   , otherwise, it is annulled;   =  ( ) ∈
The problem (further referred to as Problem 1) is to find the community assignment 
∗ maximizing
(Problem 1):  ( ∗| ) = m∈ax  ( | ),
where  is a set of community node assignment over  .
satisfying the following constraints.</p>
        <p>The vector of integer variables  = [  ] ∈ ℤ&gt;0</p>
        <p>1. Labels of communities are in the range [1,  ]:</p>
        <p>is the network-wide node community assignment
} ⊆    =   and   =   ⇒   =   .</p>
        <p>} ⊆    =   and   ≠   ⇒   ≠   .
formalization: ∀ ∈ [1,  ] ∃ ∈  ∶   =  or</p>
        <p>Also, if  is an exact number of communities, the following constraints are used for mathematical
|  | ≥ 1,  ∈ [1,  ].</p>
        <p>1 ≤   ≤ , 
∈  .
2. Partition constraints. Each node must belong to exactly one community. In other words, 
induces a partition of  .
partition of the node set  if
To formalize this condition, let  1, ...,   be a set of communities induced by  . They form a
| 1| + ...|  | = ,
  ∩   ′ = ∅,  &lt;  ′,  ,  ′
∈ [1,  ].</p>
        <p>
          In terms of the introduced integer variables, now, the communities can be represented as
(
          <xref ref-type="bibr" rid="ref2">2</xref>
          )
(
          <xref ref-type="bibr" rid="ref3">3</xref>
          )
(
          <xref ref-type="bibr" rid="ref4">4</xref>
          )
(
          <xref ref-type="bibr" rid="ref5">5</xref>
          )
(
          <xref ref-type="bibr" rid="ref6">6</xref>
          )
(
          <xref ref-type="bibr" rid="ref7">7</xref>
          )
(
          <xref ref-type="bibr" rid="ref8">8</xref>
          )
3. Symmetry constraints.
4. Transitivity constraints:
• For every triple {, ,
        </p>
        <p>community, then , 
• For every triple {, , 
communities, then , 
are also in diferent communities, i.e.
} ⊆  , if , 
are in the same community and ,</p>
        <p>are in diferent
  = { ∈  ∶   =  },  ∈ [1,  ].</p>
        <p>∀{,  } ⊆    =   ⇔   =   .
are in the same community, i.e.</p>
        <p>} ⊆  , if , 
are in the same community and , 
are in the same</p>
      </sec>
      <sec id="sec-2-2">
        <title>1.2. Approaches to CD</title>
        <p>Since CDP is NP-hard, various heuristic and approximation community detection algorithms (CDAs)
are used to find approximate solutions efectively [3, 4, 5, 6, 7, 8, 10, 11, 12, 17, 18, 19, 20, 21]. The most
common algorithms are:
• Louvain Method [17], which is a greedy optimization algorithm iteratively improving
modularity by moving nodes between communities;
• Label Propagation [12], where nodes iteratively adopt the labels of their neighbors until stable
labeling is achieved;
• Spectral Clustering [11] involving computing the eigenvectors of specific matrices derived from
the graph to find clusters;
• The Girvan and Newman algorithm [22] is a hierarchical community detection method that
divides communities by eliminating edges with higher betweenness;
• The Clauset community detection algorithm [13] identifies communities by optimizing the
modularity of network partitions;
• The Brandes et al. Community Detection Algorithm is a greedy agglomerative method that
utilizes Linear Integer Programming to optimize modularity for community detection;
• The Spin Glass Algorithm [23] is a hierarchical agglomerative approach that minimizes the</p>
        <p>Hamiltonian of the Potts-like spin model, with spin states symbolizing communities;
• The Walk Trap Algorithm developed by Pons and Latapy [24] is a hierarchical agglomerative
method rooted in random walks, initiating from individual clusters.</p>
        <p>
          Most of these methods are applied directly to Problem 1, where the constraints (
          <xref ref-type="bibr" rid="ref3">3</xref>
          )-(
          <xref ref-type="bibr" rid="ref7">7</xref>
          ) are present
in-explicitly. In addition, CDP can also be easily reformulated as a binary optimization problem (see
below) and, respectively, be solved by Integer Programming methods [25] and other Nonlinear
Programming techniques [26].
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>1.3. Modularity optimization CCDP</title>
        <p>
          Constrained Community Detection (Constrained CD, CCD) [9, 14] is a generalization of the standard
community detection problem where additional constraints, such as (
          <xref ref-type="bibr" rid="ref8">8</xref>
          ), are introduced to guide the
process of identifying communities within a network. These constraints can be in the form of prior
knowledge, user preferences, or specific requirements related to the network’s properties. The goal
is to incorporate these constraints into the CD process while optimizing an objective function that
captures the desired community structure, such as modularity and conductance.
        </p>
        <p>
          Among known constraints in CCD are:
• Community Size Constraints: for preventing the formation of very small or very large
communities, enforcing constraints on the sizes of communities are imposed;
Let   ,   be a lower and an upper bound on the size of the community   ,  ∈ [1,  ]. In these
notations, the constraint is
  ≤ |  | ≤   ,  ∈ [1,  ].
(
          <xref ref-type="bibr" rid="ref9">9</xref>
          )
• Balance Constraints: balance requirement for certain community characteristics such as size
of communities, distribution of node degrees and node/edge attributes.
        </p>
        <p>Suppose we are given an upper bound Δ on the diference of sizes of two communities in the
community assignment  . Mathematically, it can be expressed as</p>
        <p>(|  | − |  ′ |)2 ≤ Δ2,  &lt;  ′, ,  ′ ∈ [1,  ].
• Seed Nodes or Labels: the goal is to ensure that certain nodes are assigned to the specified
communities.</p>
        <p>Let  ⊆  and   ⊂ [1,  ] be a set of communities, where the node  can be assigned for  ∈  .</p>
        <p>This can be written as
• Conflicting constraints : these constraints express the condition that some nodes must be
assigned to diferent communities. For their formalization, we introduce a set   , whose elements
are collections of sets of conflicting nodes/ They are assigned to diferent communities if:
  ∈   ⊆ [1,  ],  ∈  .
  = { ⊆</p>
        <p>∶ ∀{,  } ⊆    ≠   }.
  = { ⊆</p>
        <p>∶ ∀{,  } ⊆    =   }
• Forcing constraints: these constraints require certain nodes to be assigned to the same
community. Similar to conflict ones, we formalize them as follows. First, we introduce a set
consisting of the sets of forcing nodes, i.e. the ones that need to be assigned to the same
community.
• Hierarchical Constraints: require hierarchical structuring of the detected communities, where
the communities are nested within larger communities;
• Similarity Constraints: the goal is to ensure a certain level of similarity between nodes and
edges, including their specific attributes.</p>
        <p>
          To Problem 1 complemented by the constraints (
          <xref ref-type="bibr" rid="ref9">9</xref>
          )-(
          <xref ref-type="bibr" rid="ref13">13</xref>
          ), we will refer to as Problem 1.G. It would
be very desirable to reformulate Problem 1 and Problem 1.G as integer optimization problems with
variables forming the vector  , and for this, the approaches described in [27, 28] can be applied.
However, this has not yet been achieved. That is why other modeling approaches are needed.
        </p>
        <p>Additional constraints in CDP require implementing more complex optimization methods than the
standard CD approaches because it is complicated to incorporate additional constraints in the
standard community detection heuristics. These methods utilize mathematical models of CCD problems
(CCDP) that are not uniquely determined. Accordingly, the efectiveness of using the CCD methods
highly depends on these underlying mathematical models.</p>
        <p>
          In this paper, we study approaches to CCDP modelling. In particular, we present a new approach
to problem modeling that covers a larger number of constraints and establish its connection with
the standard approach to CCDP modeling. Combining these two approaches opens up prospects for
expanding the class of formalized CCD problems in the form of polynomial optimization problems.
Respectively, these CCDPs can be solved efectively using contemporary nonlinear solvers [29] and
also reduced to QUBO models [30] solvable by quantum and digital annealers.
(
          <xref ref-type="bibr" rid="ref10">10</xref>
          )
(
          <xref ref-type="bibr" rid="ref11">11</xref>
          )
(
          <xref ref-type="bibr" rid="ref12">12</xref>
          )
(
          <xref ref-type="bibr" rid="ref13">13</xref>
          )
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2. Modelling CCD problems as binary programs</title>
      <p>
        We will formalize additional constraints using binary variables. In order to accomplish this, first, we
reformulate the integer programming problem (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ), whose dimension is  , as a binary optimization
problem of higher dimension.
      </p>
      <sec id="sec-3-1">
        <title>2.1. Standard CCD modelling approach</title>
        <p>First, we recall the standard approach [9, 14, 31, 32] that allows formalizing some of the
abovementioned constraints.</p>
        <p>Suppose variables form a square matrix of binary variables of the size  :</p>
        <p>= [  ], ∈ ∈   × ,
where  = {0, 1},</p>
        <sec id="sec-3-1-1">
          <title>1 if the nodes  and  are in the same community, 0, otherwise.</title>
          <p>
            Then the expressions (
            <xref ref-type="bibr" rid="ref1">1</xref>
            ) and (
            <xref ref-type="bibr" rid="ref2">2</xref>
            ) for the objective can be rewritten as
d) The constraint (
            <xref ref-type="bibr" rid="ref6">6</xref>
            ) takes the form of
e) The constraint (
            <xref ref-type="bibr" rid="ref7">7</xref>
            ) becomes
          </p>
          <p>∀{,  } ⊆    =   .</p>
          <p>
            f) Likewise (
            <xref ref-type="bibr" rid="ref3">3</xref>
            ), the constraint (
            <xref ref-type="bibr" rid="ref8">8</xref>
            ) cannot be written in terms of  -entries.
2. CCDP constraints are only partially representable in terms of the  -elements.
a) (
            <xref ref-type="bibr" rid="ref9">9</xref>
            ), (
            <xref ref-type="bibr" rid="ref10">10</xref>
            ) and (
            <xref ref-type="bibr" rid="ref11">11</xref>
            ) are not formalized in terms of   .
          </p>
          <p>(Problem 2):  ( ∗) =  m∈ a x×  ( ),
 ( ) =
1 ∑ (  −     )  .</p>
          <p>2 , ∈ 2</p>
          <p>When the matrix  ∗ is found, then the community assignment  ∗ is formed as follows: an arbitrary
node  ∈  is selected and is assigned to the community  1 along with all other nodes belonging to
the same community as  . The process of community assignment continues iteratively for unassigned
nodes and the communities  2, ...,   .</p>
          <p>
            Certain constraints must be added to (
            <xref ref-type="bibr" rid="ref15">15</xref>
            ).
1. CDP constraints
a) The constraint (
            <xref ref-type="bibr" rid="ref3">3</xref>
            ) cannot be written in terms of the variables (
            <xref ref-type="bibr" rid="ref14">14</xref>
            ). However, there are CD
algorithms where the number of communities can be specified in advance [12].
b) The constraint (
            <xref ref-type="bibr" rid="ref4">4</xref>
            ) holds due to the above way of the node-to-community assignment.
c) The symmetry constraint (
            <xref ref-type="bibr" rid="ref5">5</xref>
            ) is formalized as
(
            <xref ref-type="bibr" rid="ref14">14</xref>
            )
(
            <xref ref-type="bibr" rid="ref15">15</xref>
            )
(
            <xref ref-type="bibr" rid="ref16">16</xref>
            )
(
            <xref ref-type="bibr" rid="ref17">17</xref>
            )
(
            <xref ref-type="bibr" rid="ref18">18</xref>
            )
(19)
b) The condition (
            <xref ref-type="bibr" rid="ref12">12</xref>
            ) can be written as follows:
c) Similar to (20), the constraints (
            <xref ref-type="bibr" rid="ref13">13</xref>
            ) are rewritten as:
∀ ∈   ∶ ∀{,  } ⊆    = 0.
          </p>
          <p>∀ ∈   ∶ ∀{,  } ⊆    = 1.
 = [  ] ∈ , ∈[1, ] ∈</p>
          <p>× ,
  =
{
1 if   =  ,
0, otherwise.</p>
          <p>
            Thus, among the above CD and CCD constraints, five groups are not formalized in terms of   and
require other approaches for formalization.
reduced to  2− if the symmetry constraint is utilized for eliminating the variables  
2
The dimension of this CCDP given by (
            <xref ref-type="bibr" rid="ref15">15</xref>
            )-(21) (further referred to as Problem 2.G) is  2. It can be
satisfying the
relation
          </p>
          <p>≥  .</p>
          <p>The advantages of the formalized constraints in Problems 2, 2.G and objective functions are their
linearity, while the disadvantage is the unknown, at the moment, the approach to formalizing the rest
of the constraints in terms of entries of  .</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>2.2. New CCD modelling approach</title>
        <p>In this section, we present our approach to CCD modelling, also utilizing binary variables.</p>
        <p>Let us introduce another matrix of binary variables</p>
        <p>
          In terms of these variables, the indicator function in (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) is representable as
that allows rewriting the modularity function (
          <xref ref-type="bibr" rid="ref2">2</xref>
          ) in terms of the introduced binary variables:
(20)
(21)
(22)
(23)
where
where
        </p>
        <p>
          We came to the binary formulation of problem (
          <xref ref-type="bibr" rid="ref2">2</xref>
          ) (further referred to as Problem 3): find the
binary matrix  ∗, where the minimum of  ′( ) is attained, i.e.,
 {  =   } =  1( , , 
) = 1 −
1 
2  =1
], ∈ ∶   = − 2 ( 
1
−  2
        </p>
        <p>), ,  ∈  .
(Problem 3):  ′( ∗) = min  ′( ),
 ∈ℝ ×
where  ′( ) is given by (22), and the one-hot constraints hold:

 =1
∑   = 1,  ∈  .</p>
        <p>CCDP constraints.</p>
        <p>1. CDP constraints.</p>
        <p>
          Problem 3 is the constrained binary optimization problem with the quadratic objective (22) and
linear equality constraints (24). Let us formalize in terms of  -entries the rest of the above CDP and
nection between elements of the matrices  and  :
a) The condition (
          <xref ref-type="bibr" rid="ref3">3</xref>
          ) holds automatically since the matrix  has  columns.
b) Fulfillment of the condition (
          <xref ref-type="bibr" rid="ref4">4</xref>
          ) is ensured by the one-hot constraint (24).
c) In order to write out the symmetry constraint (
          <xref ref-type="bibr" rid="ref5">5</xref>
          ) in terms of  , first, we establish
con
        </p>
        <p>
          Making the substitution (25) into (
          <xref ref-type="bibr" rid="ref17">17</xref>
          ), we come to the quadratic equality constraint:
d) The constraints (
          <xref ref-type="bibr" rid="ref6">6</xref>
          ) and (
          <xref ref-type="bibr" rid="ref7">7</xref>
          ) also hold automatically.
e) The condition (
          <xref ref-type="bibr" rid="ref8">8</xref>
          ) is easily written in terms of the new variables, taking into account that
which is, clearly, redundant.
|  | = ∑ ∈   ,  ∈ [1,  ]:
∀{,  } ⊆ 
∑     =
        </p>
        <p>
          ∑     ,
 ∈[1, ]
a) In terms of   , the constraint (
          <xref ref-type="bibr" rid="ref9">9</xref>
          ) looks like:
b) In terms of  -entries, the constraint (
          <xref ref-type="bibr" rid="ref10">10</xref>
          ) is:
  ≤ ∑   ≤   ,  ∈ [1,  ].
        </p>
        <p>∈
(∑   − ∑   ′)2 ≤ Δ2,  &lt;  ′, ,  ′ ∈ [1,  ].
 ∈</p>
        <p>
          ∈
c) In terms of   , the condition (
          <xref ref-type="bibr" rid="ref11">11</xref>
          ) is represented as
d) The binary representation of (
          <xref ref-type="bibr" rid="ref12">12</xref>
          ) is
∑   = 1,  ∈  .
        </p>
        <p>∈ 
∀ ⊆   ∶</p>
        <p>∑
{, }⊆  =1
−   )2 =  |2| ⋅ 2 = | |(| | − 1).</p>
        <p>(24)
(25)
(26)
(27)
(28)
(29)
e) The constraint takes the form of
∀ ⊆   ∶</p>
        <p>∑ (2 − ∑( 
{, }⊆
 =1</p>
        <p>−   )2) = | |(| | − 1)
that can be simplified to</p>
        <p>∀ ⊆   ∶ ∑{, }⊆ ∑ =1( 
−   )2 = 0.</p>
        <p>(30)</p>
        <sec id="sec-3-2-1">
          <title>The dimension of Problem 3 is  ×  .</title>
          <p>It is seen that Problem 3 is a binary problem with a convex quadratic objective function and linear
constraints, i.e. it is quadratic binary problem. Also, we came to its CCD generalization having the
form of Problem 3 with the additional constraints (26)-(30), further referred to as Problem 3.G. These
constraints can be present in any combination.</p>
          <p>Due to the presence of the quadratic constraints (27)- (30), it belongs to the class of quadratically
constrained binary optimization problems with the quadratic objective and constraints. Moreover,
the constraints (29) and (30) are quadratic equality constraints, i.e. they are non-convex in contrast
to the convex objective (22) and the convex inequality constraints (27). Thus, attacking Problem 3.G
we deal with a non-convex binary optimization problem.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Discussion</title>
      <p>Let us compare the models Problems 2 and 3 and their generalization, Problems 2.G and 3.G.</p>
      <p>Comparing the dimensions of the models, the advantage remains with Problems 3 and 3.G since the
upper bound  on the number of communities is normally much smaller than the number of nodes  ,
respectively,</p>
      <p>2 ≫  ⋅  .</p>
      <p>In contrast to Problem 3 and Problem 3.G, Problem 2 and its CCD generalization, Problem 2.G, are
linear binary problems but do not completely formalized.</p>
      <p>Accordingly, only the models Problem 3 and Problem 3.G coped with formalizing the CDP and
CCDP as a binary optimization problem. The obtained models have the quadratic objective
representing modularity and linear or quadratic constraints. Therefore, the models can be directly solved
by general nonlinear solvers, disregarding if the binary variables are supported.
thus covering a much wider class of CCDPs.</p>
      <sec id="sec-4-1">
        <title>The variables of the matrices</title>
        <p>and  have diferent meanings. Namely, the elements of X reflect
the relationship between two nodes, while the elements of Y reflect the relationship between a node
and a community. Supposedly, other constraints can be formalized by a combination of  , 
-entries,</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusion</title>
      <p>This paper attacks a critical task in Network Analysis called Constrained Community Detection
(CCD). Binary optimization was chosen as a modeling tool. A new approach to modeling these
problems is presented, significantly expanding the set of formalized constraints. A comparison with
the standard modeling approach is made, demonstrating the advantages of our approach. Both
approaches can be combined, forming new CCD models in the form of polynomial binary optimization
problems. Nonlinear, including polynomial, solvers can be used to solve them. The polynomial and
binary nature of the models also allows their reduction to popular QUBO models, which are solved
very efectively on quantum and digital annealers.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Scott</surname>
          </string-name>
          , Social Network Analysis, 4th ed.,
          <source>Sage Publications</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. E. V.</given-names>
            <surname>Valkenburg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Network</given-names>
            <surname>Analysis</surname>
          </string-name>
          , 3rd ed.,
          <source>Pearson College Div</source>
          ,
          <year>1974</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sewell</surname>
          </string-name>
          ,
          <article-title>Model-based edge clustering</article-title>
          ,
          <source>Journal of Computational and Graphical Statistics</source>
          <volume>30</volume>
          (
          <year>2020</year>
          )
          <fpage>390</fpage>
          -
          <lpage>405</lpage>
          . doi:
          <volume>10</volume>
          .1080/10618600.
          <year>2020</year>
          .
          <volume>1811104</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Farzad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Pichugina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Koliechkina</surname>
          </string-name>
          , Multi - layer community detection,
          <source>in: 2018 International Conference on Control, Artificial Intelligence, Robotics &amp; Optimization (ICCAIRO)</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>133</fpage>
          -
          <lpage>140</lpage>
          . doi:
          <volume>10</volume>
          .1109/ICCAIRO.
          <year>2018</year>
          .
          <volume>00030</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Pichugina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Farzad</surname>
          </string-name>
          ,
          <article-title>A human communication network model</article-title>
          ,
          <source>in: Proceedings of the 12th International Conference on ICT in Education, Research and Industrial Applications. Integration, Harmonization and Knowledge Transfer</source>
          , volume
          <volume>1614</volume>
          ,
          <string-name>
            <surname>CEUR</surname>
          </string-name>
          ,
          <year>2016</year>
          , pp.
          <fpage>33</fpage>
          -
          <lpage>40</lpage>
          . URL: https:// ceur-ws.
          <source>org/</source>
          Vol-
          <volume>3403</volume>
          /paper21.pdf, issn
          <volume>1613</volume>
          -
          <fpage>0073</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V. D.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Guillaume</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lambiotte</surname>
          </string-name>
          , E. Lefebvre,
          <article-title>Fast unfolding of communities in large networks</article-title>
          ,
          <source>Journal of Statistical Mechanics: Theory and Experiment</source>
          <year>2008</year>
          (
          <year>2008</year>
          )
          <article-title>P10008</article-title>
          . doi:
          <volume>10</volume>
          . 1088/
          <fpage>1742</fpage>
          -
          <lpage>5468</lpage>
          /
          <year>2008</year>
          /10/P10008.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Duch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Arenas</surname>
          </string-name>
          ,
          <article-title>Community detection in complex networks using extremal optimization</article-title>
          ,
          <source>Physical Review E</source>
          <volume>72</volume>
          (
          <year>2005</year>
          )
          <article-title>027104</article-title>
          . doi:
          <volume>10</volume>
          .1103/PhysRevE.72.027104.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Eaton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mansbach</surname>
          </string-name>
          ,
          <article-title>A spin-glass model for semi-supervised community detection</article-title>
          ,
          <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>
          <volume>26</volume>
          (
          <year>2012</year>
          )
          <fpage>900</fpage>
          -
          <lpage>906</lpage>
          . doi:
          <volume>10</volume>
          .1609/aaai. v26i1.8320,
          <issue>number</issue>
          :
          <fpage>1</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Eguchi</surname>
          </string-name>
          , T. Murata,
          <article-title>Constrained community detection in multiplex networks</article-title>
          , in: G. L.
          <string-name>
            <surname>Ciampaglia</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Mashhadi</surname>
          </string-name>
          , T. Yasseri (Eds.),
          <source>Social Informatics, Lecture Notes in Computer Science</source>
          , Springer International Publishing,
          <year>2017</year>
          , pp.
          <fpage>75</fpage>
          -
          <lpage>87</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>319</fpage>
          -67217-
          <issue>5</issue>
          _
          <fpage>6</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C. F. A.</given-names>
            <surname>Negre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ushijima-Mwesigwa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Mniszewski</surname>
          </string-name>
          ,
          <article-title>Detecting multiple communities using quantum annealing on the d-wave system</article-title>
          ,
          <source>PLOS ONE 15</source>
          (
          <year>2020</year>
          )
          <article-title>e0227538</article-title>
          . doi:
          <volume>10</volume>
          .1371/ journal.pone.
          <volume>0227538</volume>
          , publisher: Public Library of Science.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M. E. J.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <article-title>Modularity and community structure in networks</article-title>
          ,
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>103</volume>
          (
          <year>2006</year>
          )
          <fpage>8577</fpage>
          -
          <lpage>8582</lpage>
          . doi:
          <volume>10</volume>
          .1073/pnas.0601602103.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Wagenseller</surname>
          </string-name>
          <string-name>
            <surname>III</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Size matters: A comparative analysis of community detection algorithms</article-title>
          ,
          <year>2017</year>
          . URL: http://arxiv.org/abs/1712.01690.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Clauset</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E. J.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <article-title>Finding community structure in very large networks</article-title>
          ,
          <source>Physical Review E</source>
          <volume>70</volume>
          (
          <year>2004</year>
          )
          <article-title>066111</article-title>
          . doi:
          <volume>10</volume>
          .1103/PhysRevE.70.066111.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>W. D.</given-names>
            <surname>Viles</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. J. O'Malley</surname>
          </string-name>
          ,
          <article-title>Constrained community detection in social networks</article-title>
          ,
          <year>2017</year>
          . URL: http://arxiv.org/abs/1708.04354.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lancichinetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fortunato</surname>
          </string-name>
          ,
          <article-title>Community detection algorithms: A comparative analysis</article-title>
          ,
          <source>Physical Review E</source>
          <volume>80</volume>
          (
          <year>2009</year>
          )
          <article-title>056117</article-title>
          . doi:
          <volume>10</volume>
          .1103/PhysRevE.80.056117, publisher: American Physical Society.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M. E. J.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <article-title>Communities, modules and large-scale structure in networks</article-title>
          ,
          <source>Nature Physics 8</source>
          (
          <year>2012</year>
          )
          <fpage>25</fpage>
          -
          <lpage>31</lpage>
          . doi:
          <volume>10</volume>
          .1038/nphys2162, number: 1 Publisher: Nature Publishing Group.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>V. A.</given-names>
            <surname>Traag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Waltman</surname>
          </string-name>
          ,
          <string-name>
            <surname>N. J. van Eck</surname>
          </string-name>
          ,
          <article-title>From louvain to leiden: guaranteeing well-connected communities</article-title>
          ,
          <source>Scientific Reports</source>
          <volume>9</volume>
          (
          <year>2019</year>
          )
          <article-title>5233</article-title>
          . doi:
          <volume>10</volume>
          .1038/s41598-019-41695-z, number: 1 Publisher: Nature Publishing Group.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Girvan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E. J.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <article-title>Community structure in social and biological networks</article-title>
          ,
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>99</volume>
          (
          <year>2002</year>
          )
          <fpage>7821</fpage>
          -
          <lpage>7826</lpage>
          . doi:
          <volume>10</volume>
          .1073/pnas.122653799, publisher:
          <source>Proceedings of the National Academy of Sciences.</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>