=Paper=
{{Paper
|id=Vol-2098/paper12
|storemode=property
|title=Heuristic Algorithm for Finding the Maximum
Independent Set with Absolute Estimate of the
Accuracy
|pdfUrl=https://ceur-ws.org/Vol-2098/paper12.pdf
|volume=Vol-2098
|authors=Damir N. Gainanov,Nenad Mladenović,Varvara Rasskazova,Dragan Urosević
}}
==Heuristic Algorithm for Finding the Maximum
Independent Set with Absolute Estimate of the
Accuracy==
Heuristic Algorithm for Finding the Maximum Independent Set with Absolute Estimate of the Accuracy Damir N. Gainanov1 , Nenad Mladenović2 , Varvara Rasskazova3 , and Dragan Urosević4 1 Ural Federal University, Ekaterinburg, Russia damir.gainanov@gmail.com, 2 Serbian Academy of Sciences and Arts, Belgrade, Serbia nenadmladenovic12@gmail.com, 3 Moscow Aviation Institute, Moscow, Russia varvara.rasskazova@mail.ru, 4 Serbian Academy of Sciences and Arts, Belgrade, Serbia draganu@turing.mi.sanu.ac.rs Abstract. The paper presents an algorithm for finding the maximum independent set in an undirected graph with absolute estimate of the accuracy. Special notions are indroduced and theoretical results in the area of deviation of approximate solution from the exact one are pre- sended. Also the paper presents results of computational experiments on the complementary graphs DIMACS. Keywords: Algorithm · Maximum independent set · Absolute estimate of the accuracy 1 Introduction The problem of the maximum independent set of vertices in an undirected graph (MIS Problem) has applications in many areas. So, for example, the vertices of the social graph can be interpreted as users accounts. In this case, the inde- pendent set will correspond to different accounts, and the clique in the comple- mentary garph will correspond to fake ones. Another example is the location of enterprises. In this context, the independent set will correspond to the optimal location. Thus, the relevance of the research does not cause doubts in terms of practical significance. MIS Problem is a classical N P–hard. In complexity theory 3-satisfiability problem (3-SAT) reduces to it, which in turn reduces to the standard hard sat- isfiability problem (SAT). This effect indicates that the problem under consid- eration have special scientific interest. As it is known, for today the problem of Copyright c by the paper’s authors. Copying permitted for private and academic purposes. In: S. Belim et al. (eds.): OPTA-SCL 2018, Omsk, Russia, published at http://ceur-ws.org 142 D. Gainanov et al. equality of classes is not solved and there is no polynomial algorithm for any hard problem. In this connection, approximate and heuristic algorithms are develop- ing actively. The disadvantage of approximate algorithms is that estimates of the accuracy obtained are usually greatly overstated. So, for example, in [1] there was developed 2-approximate algorithm for vertex cover problem which guaran- tees the deviation of not more than 2 times. However for problems with known exact solution this estimate is measured by units of vertices. Another approxi- mate algorithm from [2] implements a greedy approach for the same problem and allows to obtain the estimate O (log n), which often leads to significant quality losses. Heuristic algorithms are very effective but any estimates are absent for them. For example, in [3]–[5] algorithms for maximum clique problem (MCP) were developed. Note that MCP mentioned above is the dual N P–hard for MIS Problem. In [6] an inference algorithm for monotone Boolean functions with absolute estimate of the accuracy was developed. The additive criterion has the advantage for solving large-dimensional problems. If the solution include many non-zero components then the deviation measured by units of components does not lead to significant losses for practical purposes. In this paper, this idea was continued to solve MIS Problem. 2 The Maximum Independent Set in the Graph Let us consider an undirected graph G = (V, E), in which V = {vi , i = [n]} is the set of vertices, and E = {e = (vi , vj ) , i, j ∈ [n]} is the set of edges. Definition 1. For an integer k ∈ [n − 1] we call a vertex v ∈ V of the graph G = (V, E) a k–vertex if |N (v)| = k and the neighborhood of this vertex is the complete induced subgraph. Definition 2. For integers k ∈ [n − 1] and m we call a vertex v ∈ V of the graph G = (V, E) a (k, m)–vertex if k = |N (v)| and in the neighborhood of this vertex there are absent m edges to be the complete induced subgraph. Proposition 1. If the vertex v is a k–vertex in the undirected graph G = (V, E) then there exists the maximum independent set of vertices S ⊆ V such that v ∈ S. Proof. Let us consider the arbitrary maximum independent set S of the graph and the k–vertex v ∈ V . It is easy to see that among elements of the set {v} ∪ N (v) Heuristic Algorithm for Maximum Independent Set 143 there is vertex u such that u ∈ S. Indead otherwise one could find an indepen- dent set S 0 ⊆ V such that S 0 = {v} ∪ S and |S 0 | > |S|.This contradicts the maximality of the independent set S. Now let us consider two possible cases. 1) If v ∈ S then we are done. 2) If v 6∈ S and u ∈ S for some vertex u ∈ N (v) then for S one can find the set S 0 = S − {u} ∪ {v} which is also the independent set by definition of k-vertex. By construction we have |S 0 | = |S|. Thus we obtained the maximum independent set S 0 such that v ∈ S 0 as was to be proved. Note that Proposition 1 underlies the idea of developing the heuristic algo- rithm with the absolute estimate of the accuracy. 3 The Algorithm for MIS Problem with Absolute Estimate of the Accuracy We denote the set of all independent sets of the graph G as S (G) and the set of all maximum independent sets of the graph G as Smax (G). Let max S (G) = |S|: S ∈ Smax (G) be the cardinality of the maximum indepen- dent set of the graph G. Proposition 2. Let two graphs G1 = (V1 , E1 ) and G2 = (V2 , E2 ) are such that V1 = V2 , and E1 ⊆ E2 . Then Smax (G2 ) ⊆ S (G1 ) . Proof. Consider an arbitrary independent set S ∈ S (G2 ) of the graph G2 . By the definition of the independent set we have: ∀ u, v ∈ V2 : u, v ∈ S (u, v) 6∈ E2 . By the hypothesis of the proposition, we have E1 ⊆ E2 and V1 = V2 . As a consequence for any S ∈ S (G2 ) we have ∀ u, v ∈ V2 : u, v ∈ S (u, v) 6∈ E1 . and thus S ∈ S (G1 ) . (1) Then for any set of vertices of the graph G such that S ∈ S (G2 ), the inclu- sion (1) holds, that is, S (G2 ) ⊆ S (G1 ) . At the same time by definition of the maximum independent set of the graph, we have that any maximum independent set of the graph is clearly the independent set of the graph, so Smax (G2 ) ⊆ S (G2 ) ⊆ S (G1 ) , as was to be proved. 144 D. Gainanov et al. Corollary 1. Let G1 = (V, E1 ) and G2 = (V, E2 ) be graphs such that E1 ⊆ E2 . Then max S (G1 ) ≥ max S (G2 ) . Proof. Let us consider the maximum independent set S ∈ Smax (G2 ). According to Proposition 2 we have S ∈ S (G1 ). By definition for any S ∈ S (G1 ) there exists S 0 ∈ Smax (G1 ) such that |S 0 | ≥ |S|. Then max S (G1 ) = |S 0 | ≥ |S| = max S (G2 ) , as was to be proved. The following proposition underlies the absolute estimate of the accuracy of the approximate solution. Proposition 3. Let G = V, E be a graph in which vertices v and u are not adjacent. And let G0 = (V 0 , E 0 ) be a graph in which V 0 = V and E 0 = E ∪{(v, u)}. Then max S (G) ≥ max S (G0 ) ≥ max S (G) − 1 . (2) Proof. The inequality max S (G) ≥ max S (G0 ) follows from Corollary 1 since E ⊆ E 0 . Let us prove the inequality max S (G0 ) ≥ max S (G) − 1. Let S be the maximum independent set of the graph G. 1) Suppose that v 6∈ S and u 6∈ S. Then S ∈ Smax (G0 ). Indead otherwise we would obtain that there exists the maximum independent set S 0 ∈ Smax (G0 ) such that |S 0 | > |S|. According to Proposition 1, we obtain that S 0 ∈ S (G), but this contradicts the maximality of the set S under consideration. Thus, in this case, we have: max S (G) = |S| = max S (G0 ) ≥ max S (G) − 1 . 2) Suppose that v ∈ S and u 6∈ S. If the edge (v, u) is added, then S ∈ Smax (G0 ) as was shown above. 3) Suppose that v ∈ S and u ∈ S. If the edge (v, u) is added, then we obtain S 6∈ Smax (G0 ). In this case we can find a set S 0 such that S 0 = S − {v}. Then S 0 ∈ Smax (G0 ) and |S 0 | = |S| − 1 . by construction. By the definition of the maximum independent set of the graph, we have: max S (G0 ) ≥ |S 0 | = |S| − 1 = max S (G) − 1 , as was to be proved. Corollary 2. Let {e1 , . . . , ek } be a subfamily of k vertex pairs that are not edges of the graph G = (V, E). And let the graph G0 = (V 0 , E 0 ) is such that V 0 = V and E 0 = E ∪ {e1 , . . . , ek }. Then max S (G) ≥ max S (G0 ) ≥ max S (G) − k . Heuristic Algorithm for Maximum Independent Set 145 Proof. For proof it suffices to apply Proposition 3 k times. Proposition 4. If the vertex v is a (k, m)–vertex in the graph G = (V, E) then there exists a maximal independent set S ⊆ V such that v ∈ S and |S| ≥ max S (G) − m . Proof. According to definition there are missed m edges in the neighborhood of the vertex v to be the complete induced subgraph. Suppose these edges are {e1 , . . . , em }. Then the vertex v is a k–vertex in the graph G0 , which is obtained from the graph G by the addition of these m edges {e1 , . . . , em }. According to Proposition 1 there exists S 0 : S 0 ∈ Smax (G0 ) such that v ∈ S 0 . According to Corollary 2 we have: |S 0 | = max S (G0 ) ≥ max S (G) − m . It follows from Proposition 2 that S 0 ∈ S (G). By definition there exists a max- imal independent set S such that |S| ≥ |S 0 | and, as a consequence |S| ≥ |S 0 | ≥ max S (G) − m , as was to be proved. Based on Proposition 4 one can propose an efficient recursive algorithm for solving MIS Problem. Let us define some features of this algorithm. On the input of the algorithm B (G) the graph G = (V, E) is given. For example, it could be given by the adjacency matrix. It is clear that each iteration of the algorithm leads to the update of the adjacency matrix, when a certain vertex is included in S and removed from V0 together with neighborhood. In arrays k[v], m[v] (of length n) the values of parameters k and m are stored for corresponding vertices. These values are also updated at each iteration of the algorithm. The function M inM axP aram (v) returns consequentially the vertex v with the minimum value of the parameter m and with the maximum value of the parameter k. In other words, at first vertices are ordered by increasing m. Then among selected vertices with the minimum m the vertex with the maximum k is selected. For example, if among vertices of the initial graph there is a unique ver- tex with the value 0 of the parameter m, then the function M inM axP aram (v) will return this vertex, regardless of the value of the parameter k. So the algorithm allows to find the independent set S and to calculate the absolute estimate (max S (G) − |S|). Algorithm B (G): for solving MIS Problem with an absolute estimate of the accuracy Require: undirected graph G = (V, E) Ensure: the independent set S (S ⊆ V ) and the estimate Est for deviation from the exact solution 146 D. Gainanov et al. 1: V0 ←− V {the current set of vertices consists of applicants for inclusion in the independent set} 2: S ←− {} 3: Est ←− 0 4: while V0 6= {} do 5: for all v ∈ V0 do 6: calculate k[v] 7: calculate m[v] 8: end for 9: v0 ←− M inM axP aram (v) 10: S ←− S ∪ {v0 } 11: Est ←− Est + m[v0 ] 12: V0 ←− V0 − {v0 } − N (v0 , V0 ){remove from the set of applicants the selected vertex together with neighborhood} 13: end while This algorithm was implemented and used to find the maximum indepen- dent set in complementary graphs DIMACS. It is clear that in the initial graph the solution obtained will correspond to the clique. Then it became possible to compare results with other effective algorithms. 3.1 Properties of the Algorithm The following corollary shows that in some cases the Algorithm B (G) allows to find the exact solution of the MIS Problem. Corollary 3. If Est = 0 then the independent set S found by Algorithm B (G) is the maximum. Proof. If Est = 0 then m[v] = 0 for all v ∈ S. This means that all these vertices are k–vertices. Thus, according to Proposition 1 the independent set obtained is the maximum. It is clear that Est = 0 is not always satisfied. This motivates to study particular classes of graphs for which the condition of the Corollary 3 hold. Proposition 5. For any connected graph G = (V, E), where |V | > 2, there exists the maximum independent set that contains all hanging vertices of this graph. Proof. Consider the connected graph G = (V, E), where |V | > 2. Let v ∈ V be hanging vertex. Implementation of the algorithm B (G) allows different states of V0 . Let’s consider each of them in detail. 1) The vertex v is a hanging and unique k–vertex. Then S ←− S ∪ {v} as required. 2) The vertex v is a hanging and not unique k–vertex. Consider the vertex u adjacent to the hanging vertex v. The vertex u can not be included in the set S at the first step, since the neighborhood of the vertex u is not the complete Heuristic Algorithm for Maximum Independent Set 147 induced subgraph. Indead the vertex v is hanging and adjacent to u, that is the vertex v is in the neighborhood of the vertex u. The graph G = (V, E) is connected, then in the neighborhood of the vertex u (in the first step, when V0 coincides with V ) there are also other vertices . But the vertex v can not be adjacent to any of them, since deg (v, V0 ) = 1 and the vertex adjacent to v is the vertex u. Thus, the vertex u is not a k–vertex and will not be included in S on the first step of the algorithm. A similar situation occurs at any step if at least one vertex remains in the neighborhood of the vertex u except for the hanging vertex v. Vertices in the neighborhood of the vertex u will still not be adjacent, and therefore the vertex u will not be a k–vertex. Let at some step all vertices adjacent to u be removed except for the hanging vertex v. This can happen when vertices adjacent to vertices from the neighbor- hood of the vertex u will be included in S. In this situation the vertex u becomes a k–vertex and can be included in S. However, the vertex v can also be included in S, which does not reduce the cardinality of S. Indeed, let s: |S| = s be the cardinality of S at some step. And let the vertex u be included in S and the vertex v be not included, then S ←− S ∪ {u} =⇒ |S| = s + 1 . Now suppose that the vertex v be included in S and the vertex u be not included, then S ←− S ∪ {v} =⇒ |S| = s + 1 , that is, one can include the hanging vertex without decreasing the cardinality of the independent set. Note that if the above situation occurs, then all vertices which are included in S are k–vertices. Thus, according to Corollary 3 there exists the maximum independent set of vertices which contains the hanging vertex v. Arguing similarly for each hanging vertex we get what was required to prove. Thus, we can formulate a more general proposition about the Algorithm B (G). Proposition 6. The Algorithm B (G) implemented in the class of trees allows to find the exact solution of the MIS Problem. Proof. According to Corollary 3 the independent set found by the Algorithm B (G) is maximum if Est = 0. In the case when initial graph is a tree at each step its hanging (orP isolated) vertices could be included in S. In this way Est = m[v] = 0 and the independent set S is the maximum as required. v∈S Note that Proposition 6 is the property of the Algorithm B (G). In addition to this algorithm, the method of dynamic programming from [7] can be effectively used to solve the MIS Problem in the class of trees. However, there are still structural distinctions between these two approaches. 148 D. Gainanov et al. 3.2 Complexity of the Algorithm Let us discuss the algorithmic complexity of the Algorithm B (G). For each vertex v it is necessary to find the number of vertices in the neigh- borhood and the number of edges that should be added. We remove the vertices v ∪ N (v, V0 ) and the edges e ∈ Gh{v} ∪ N (v, V0 )i until V0 becomes empty. Let n and m be the number of vertices and the number of edges of the initial graph respectivle. Then we obtain the following estimate. The common number of iterations is less than or equal to n. Each iteration needs no more than O (n · m) actions to calculate parameters and no more than O (m) actions to remove selected vertex together with neighborhood. Thus, the complexity of the Algorithm B (G) is O (n · n · m + n · m) = O n2 · m . 4 Computational Results The table below shows the results of computational experiments with comple- mentary graphs DIMACS. The table gives the results of the comparison with effective algorithms from [3]–[5]. In the column ω (G) there is maximal clique of the coresponding graph, and the exact solution (maximum clique) is highlighted in bold. In the column B (G) the results for proposed heuristic algorithm are given. Table 1. Computational results with complementary graphs DIMACS It can be seen that the best–known solutuion was obtained in some cases only but the approximate solution is rather close to the best–known or exact ones. Furthermore, the algorithm shows high efficiency in terms of time complexity of calculations. Heuristic Algorithm for Maximum Independent Set 149 5 Conclusion The paper presents the algorithm for MIS Problem with the absolute estimate of the accuracy of the approximate solution. Special notions were indroduced and theoretical results of research were presended. The properties of the algorithm are investigated and, in particular, its convergence in the class of trees is shown. The results of computational experiments on complementary graphs DIMACS are presented, which demonstrate rather good results in terms of the quality of the approximate solution and high efficiency in terms of the time complexity of the calculations. Acknowledgement. The results of the work were obtained within the frame- work of the state task of the Ministry of Education and Science of RF no. 2.2461.2017/4.6. References 1. Skiena, S.: The Algorithm Design Manual. Springer (2008) 2. Korte, B., Vygen, J.: Combinatorial Optimization. Springer (2008) 3. Li, C., Fang, Z., Xe, K.: Combining MaxSAT reasoning and incremental upper bound for the maximum clique problem. In: Proceedings of the 25th International Conference on Tools with Artificial Intelligence. pp. 939–946 (2013) 4. Batsyn, M., Nikolaev, A., San Segundo, P.: Infra–chromatic bound for exact maxi- mum clique search. Computers & Operations Research 64, 293–303 (2015) 5. Batsyn, M., Nikolaev, A., San Segundo, P.: Reusing the same coloring in the child nodes of the search tree for the maximum clique problem. In: LNCS. vol. 8994, pp. 275–280 (2015) 6. Gainanov, D.N., Rasskazova, V.A.: An inference algorithm for monotone Boolean functions associated with undirected graphs. Bulletin of the SUSU 9(3), 17–30 (2016) 7. Dasgupta, S., Papadimitriou, C., Vazirani, U.: Algorithms. McGraw–Hill (2006)