=Paper=
{{Paper
|id=None
|storemode=property
|title=Genetic Algorithm Applied to the Graph Coloring Problem
|pdfUrl=https://ceur-ws.org/Vol-841/submission_10.pdf
|volume=Vol-841
|dblpUrl=https://dblp.org/rec/conf/maics/HindiY12
}}
==Genetic Algorithm Applied to the Graph Coloring Problem==
Genetic Algorithm Applied to the Graph Coloring Problem
Musa M. Hindi and Roman V. Yampolskiy
Computer Engineering and Computer Science
J.B. Speed School of Engineering
Louisville, Kentucky
complete problems that also belong to Constraint
Abstract Satisfaction Problems (CSPs). The practical applications of
Graph Coloring Problems include but are not limited to:
In this paper we present a hybrid technique that applies a
genetic algorithm followed by wisdom of artificial crowds
approach to solving the graph-coloring problem. The • Map coloring (B. H. Gwee, M. H. Lim and J. S.
genetic algorithm described here utilizes more than one Ho 1993)
parent selection and mutation methods depending on the • Scheduling (Daniel Marx and D Aniel Marx 2004)
state of fitness of its best solution. This results in shifting • Radio Frequency Assignment (W. K. Hale 1980;
the solution to the global optimum more quickly than using S. Singha, T. Bhattacharya and S. R. B. Chaudhuri
a single parent selection or mutation method. The algorithm
2008)
is tested against the standard DIMACS benchmark tests
while limiting the number of usable colors to the known • Register allocation (Wu Shengning and Li Sikun
chromatic numbers. The proposed algorithm succeeded at 2007)
solving the sample data set and even outperformed a recent • Pattern Matching
approach in terms of the minimum number of colors • Sudoku
needed to color some of the graphs.
In this paper we demonstrate the use of genetic algorithms
The Graph Coloring Problem (GCP) is a well-known NP- in solving the graph-coloring problem while strictly
complete problem. Graph coloring includes both vertex adhering to the usage of no more than the number of colors
coloring and edge coloring. However, the term graph equal to the chromatic index to color the various test
coloring usually refers to vertex coloring rather than edge graphs.
coloring.
Given a number of vertices, which form a connected
Prior Work
graph, the objective is to color each vertex such that if two A great deal of research has been done to tackle the
vertices are connected in the graph (i.e. adjacent) they will theoretical aspects of the Graph Coloring Problem in terms
be colored with different colors. Moreover, the number of of its generalization as a Constraint Satisfaction Problem
different colors that can be used to color the vertices is (Isabel Méndez Díaz and Paula Zabala 1999). The
limited and a secondary objective is to find the minimum problem’s various applications and solutions have been
number of different colors needed to color a certain graph discussed in detail in Porumbel’s paper (Daniel Cosmin
without violating the adjacency constraint. That number Porumbel 2009). Evolutionary computation and parameter
for a given graph (G) is known as the Chromatic Number control has been detailed in a number of papers including
(χ(G)) (Isabel Méndez Díaz and Paula Zabala 1999). ones by Back, Hammel, and Schwefel (T. Back, U.
Hammel and H. P. Schwefel 1997) as well as work by
If k = {1, 2, 3...} and P(G, k) is the number of possible Eiben, Hinterding and Michalewicz (A. E. Eiben, R.
solutions for coloring the graph G with k colors, then Hinterding and Z. Michalewicz 1999). Srinivas and Patnaik
examined crossover and mutation probabilities for
optimizing genetic algorithm performance (M. Srinivas and
χ(G) = min(k: P(G, k) > 0) (1) L. M. Patnaik 1994). Genetic algorithms and evolutionary
approaches have been used extensively in solutions for the
Graph Coloring Problem and its applications (F. F. Ali, et
al. 1999; K. Tagawa, et al. 1999; Cornelius Croitoru, et al.
2002; C. A. Glass and A. Prugel-Bennett 2003; Justine W.
Graph coloring problems are very interesting from the
Shen 2003; Greg Durrett, Muriel Médard and Una-May
theoretical standpoint since they are a class of NP-
O’Reilly 2010; Lixia Han and Zhanli Han 2010). Most
recent work utilized a parallel genetic algorithm on a
similar dataset to the one used in this paper (Reza Figure 1 displays the chromosome representation of a
Abbasian and Malek Mouhoub 2011). graph of 7 vertices using 4 colors:
The concept of utilizing a crowd of individuals for solving The ultimate goal when solving GCPs is to reach a solution
NP complete problems has also been the topic of various where no two adjacent vertices have the same color.
papers. Most notably the Wisdom of Crowds concept has Therefore, the GA process continues until it either finds a
been used in solving the Traveling Salesman Problem solution (i.e. 0 conflicts) or the algorithm has been run for
(Sheng Kung Michael Yi, et al. 2010b) as well as the the predefined number of generations. In addition to the
Minimum Spanning Tree Problem (Sheng Kung Michael GA, if a run fails to reach a solution a Wisdom of Crowds
Yi, et al. 2010a). In this paper we attempt to supplement approach will be applied to the top performers in an
the solution produced by the genetic algorithm utilizing an attempt to produce a better solution.
artificial crowd (Leif H. Ashby and Roman V. Yampolskiy
2011; Roman V. Yampolskiy and Ahmed El-Barkouky
2011).
Proposed Approach
Genetic algorithms share an overall structure and
workflow yet they vary in the specific details according to
the particular problem. The algorithm consists of a parent
selection method, a crossover method and a mutation
method.
The general algorithm is:
Algorithm1: General Genetic Algorithm
define: population, parents, child
population = randomly generated chromosomes;
while (terminating condition is not reached) {
gaRun();
}
// a single run of a genetic algorithm Figure 1: Chromosome representation of a colored connected
function gaRun() { graph
parents = getParents();
child = crossover(parents); The overall genetic algorithm in this approach is a
child = mutate(child); generational genetic algorithm. The population size is kept
population.add(child); constant at all times and with each generation the bottom
} performing half of the population is removed and new
randomly generated chromosomes are added. The
The goal of the previous algorithm is to improve the population size is set to 50 chromosomes. The value was
fitness of the population by mating its fittest individuals to chosen after testing a number of different population sizes.
produce superior offspring that offer a better solution to The value 50 was the least value that produced the desired
the problem. This process continues until a terminating results.
condition is reached which could be simply that the total
number of generations has been run or any other parameter The GA uses two different parent selection methods, a
like non-improvement of fitness over a certain number of single crossover method and two different mutation
generations or that a solution for the problem has been methods. Which of the parent selection and mutation
found. methods ends up selected depends on the state of the
population and how close it is to finding a solution. The
The chromosome representation of the GCP is simply an parent selection, crossover and mutation methods are
array with the length set to the number of vertices in the outlined as follows:
graph. Each cell in the array is assigned a value from 0 to
the number of colors – 1. The adjacencies between the
vertices are represented by an adjacency matrix of
dimensions n×n where n: the number of vertices.
Algorithm2: parentSelection1: A bad edge is defined as an edge connecting two vertices
define: parent1, parent2, tempParents; that have the same color. The number of bad edges is the
fitness score for any chromosome. As mentioned above, the
tempParents = two randomly selected chromosomes from the alteration between the two different parent selection and
population; mutation methods depends on the best fitness. If the best
parent1 = the fitter of tempParents; fitness is greater than 4 then parentSelection1 and
mutation1 are used. If the best fitness is 4 or less then
tempParents = two randomly selected chromosomes from the parentSelection2 and mutation2 are used. This alteration is
population; the result of experimenting with the different data sets. It
parent2 = the fitter of tempParents; was observed that when the best fitness score is low (i.e.
approaching an optimum) the usage of parent selection 2
return parent1, parent2; (which copies the best chromosome as the new child) along
with mutation2 (which randomly selects a color for the
Algorithm3: parentSelection2: violating vertex) results in a solution more often and more
define: parent1, parent2 quickly than using the other two respective methods.
parent1 = the top performing chromosome; Finally, the algorithm is run for 20,000 generations or until
parent2 = the top performing chromosome; a solution with 0 bad edges is found. If a solution is not
found after 20,000 generations the
return parent1, parent2; wisdomOfArtificialCrowds algorithm is run. The algorithm
used is a localized wisdom of crowds algorithm that only
Algorithm4: crossover builds a consensus out of the violating edges in the best
define: crosspoint, parent1, parent2, child solution. Moreover, it uses the best half of the final
population to produce an aggregate solution. Only a
crosspoint = random point along a chromosome; localized consensus is generated so as not to produce a
child = colors up to and including crosspoint from parent 1 + result that alters the correctly colored vertices. Also, it
colors after crosspoint to the end of the chromosome from takes the best half because they share the most similarity
parent2; and thus will most likely be different at the level of the bad
edges rather than the good ones.
return child;
Algorithm7: wisdomOfArtificialCrowds
Algorithm5: mutation1: define: aggregateChromosome, newColor, expertChromosomes;
define: chromosome, allColors, adjacentColors, validColors,
newColor; expertChromosomes = best half of the final population;
aggregateChromosome = best performing chromosome;
for each(vertex in chromosome) {
if (vertex has the same color as an adjacent vertex) { for each (vertex in graph) {
adjacentColors = all adjacent colors; if (vertex is part of a bad edge) {
validColors = allColors – adjacentColors; newColor = most used color for vertex among
expertChromosomes;
newColor = random color from validColors; aggregateChromosome.setColor(vertex, newColor)
chromosome.setColor(vertex, newColor) }
} }
}
return chromosome; Data
Algorithm6: mutation2: Data used to test our approach are derived from the
define: chromosome, allColors DIMACS benchmarking graph collection. DIMACS is the
Center for Discrete Mathematics and Theoretical Computer
for each(vertex in chromosome) { Science. It is part of Rutgers University (The State
if (vertex has the same color as an adjacent vertex) { University of New Jersey Rutgers, et al. 2011). The data
newColor = random color from allColors; are frequently used in challenges involving constraint
chromosome.setColor(vertex, newColor) satisfaction problems. The files used have a .col extension.
} Each file contains a header with the number of vertices (p)
} and the number of edges (e):
return chromosome;
p edge 496 11654
A number of lines follow the header with each line File |V| |E| Expected kmin HPGAGCP Time
denoting the connected edges and their vertex indices: χ(G) Result (s)
myciel3.col 11 20 4 4 4 0.003
myciel4.col 23 71 5 5 5 0.006
e 1 100 queen5_5.col 23 160 5 5 5 0.031
queen6_6.col 25 290 7 7 8 6.100
myciel5.col 36 236 6 6 6 0.014
16 files were chosen from the DIMACS collection. The queen7_7.col 49 476 7 7 8 6.270
graphs the files represent vary in vertex count, edge count queen8_8.col 64 728 9 9 10 47.482
and overall complexity. The vertex count ranges between huck.col 74 301 11 11 11 0.015
jean.col 80 254 10 10 10 0.015
11 and 561 and the edge count ranges from 20 to 11654. david.col 87 406 11 11 11 0.019
The rationale behind the selection of these graphs other games120.col 120 638 9 9 9 0.027
miles250.col 128 387 8 8 8 0.076
than the wide range of variation is that there is a known miles1000.col 128 3216 42 42 42 48.559
chromatic number for each of them, or at least a good anna.col 138 493 11 11 11 0.058
approximation. fpsol2.i.1.col 496 11654 65 65 65 22.656
homer.col 561 1629 13 13 13 0.760
Table 1: Results of running the proposed algorithm on 16 .col
The following files were used in this approach:
files from the DIMACS collection
(myciel3.col, myciel4.col, myciel5.col, queen5_5.col,
queen6_6.col, queen7_ 7.col, queen8_8.col, huck.col,
jean.col, david.col. games120.col, miles250.col, The following graphs plot the relationship between the
miles1000.col, anna.col, fpsol2.i.1.col, homer.col). fitness and the generation for a sample set of the files used:
Results
Table 1 displays the following statistics for each file:
• The number of vertices |V|
• The number of edges |E|
• The expected chromatic number χ(G)
• The minimum number of colors used by this
algorithm kmin
• The minimum number of colors used by a
comparative publication using a Hybrid Parallel
Genetic Algorithm (HPGAGCP)
• Average time it took to find a solution
The genetic algorithm was developed in Java utilizing JDK Figure 2: Fitness score over the number of generations for
1.6 and JUNG (Java Universal Network/Graph) framework myciel3.col
for graph visualization (Joshua O'Madadhain, Danyel
Fisher and Tom Nelson 2011). Performance plots were
generated using MATLAB R2010a.
The tests were run on a desktop PC with the following
specifications:
CPU: Intel Core i7 860 @2.8Ghz
RAM: 8 GB DDR3 @1333MHz
Figure 3: Fitness score over the number of generations for
queen5_5.col
Figures 2 and 3 are not very interesting since the solutions
are found after a few generations. The next plot, however,
is of particular interest since it clearly represents the erratic
behavior of the fitness score between the initial rapid drop
until a solution is ultimately found. In standard genetic
algorithms the fitness score continues to increase or
decrease (depending of the definition of better fitness)
until the end of the run. This is not the case here. This
factor plays a huge role in obtaining the global optimum
with a higher probability than without it as will be
discussed later.
Figure 6: GCP solution for games120.col
Figure 4: Fitness score over the number of generations for
queen6.col
Sample solutions: (graphs were visualized using the JUNG
framework (Joshua O'Madadhain, Danyel Fisher and Tom
Nelson 2011)):
Figure 7: GCP solution for fpsol2.i.col
Discussion
During the design of this approach, the issue of designing
good operators was a constant concern. The goal of any
good operator is to bring the chromosomes of a population
closer to the desired solution. However, during the process,
a chromosome’s fitness often improves but eventually ends
up in a local optimum. The overall design of this approach
Figure 5: GCP solution for myciel5.col
aimed to improve fitness towards the global optimum while
avoiding the pitfalls of local optima.
To achieve that, a number of factors need to be considered. postprocessor was applied the best solution had 5
Initially, the crossover function is applied to parents that conflicting edges. After application of the Wisdom of
result from the first parent selection method. This method Artificial Crowds postprocessor the graph was colored
selects parents by conducting a small tournament between slightly differently but still had 5 conflicting edges. It is
random pairs of chromosomes. Two pairs are chosen worth noting that the postprocessor added an average of
randomly and the fitter of each pair becomes a parent. 250 ms to the overall process.
These fit parents are then used as input to this crossover
method. The crossover conducts a simple one-point The test cases used were very useful in both testing and
crossover with the cross point being chosen at random. tuning the algorithm. The algorithm was able to scale
The result of this crossover is then subjected to the first across the different graphs and produce optimum solutions
mutation method. The mutation is carried out at a high rate in each case. A recent 2011 publication presented a parallel
of 0.7. This mutation examines each vertex and if a vertex genetic algorithm for the GCP (Reza Abbasian and Malek
violates the coloring constraint a valid color is chosen at Mouhoub 2011). This paper used most of the same test
random. files that were used in Abbasian’s approach. That
approach’s proposed algorithm failed to solve three of the
This process is very effective in reducing the number of graphs and only produced solutions when the color count
conflicts rapidly which can be seen in all the plots through was increased by 1. The files representing these graphs are
an almost perpendicular drop in fitness. However, in spite queen6_6.col, queen7_7.col and queen8_8.col. The
of this method’s effectiveness at increasing fitness rapidly, algorithm used in our approach successfully solved these
it has the side effect of keeping the solution at a local three files using the specified known chromatic number as
optimum. the number of colors used. Our approach is also generally
faster at finding a solution. It is faster in all cases except
To fix that, another parent selection and mutation method four. Three of those cases are the three aforementioned
is introduced. The two methods are applied when the files, which the comparative method did not succeed at
overall fitness of the best solution drops below 5 conflicts. finding a solution using the known chromatic index. The
After that point crossover is no longer applied. The top fourth case is miles1000.col.
performer is selected and is subjected to the second
mutation method. This method finds the conflicting Conclusion
vertices and replaces their conflicting colors with random
colors; which could be invalid as well. This has the The overarching algorithm used in this approach is a
potential to either find a globally optimum solution (i.e. 0 genetic algorithm with a subsequent Wisdom of Crowds
conflicts) or produce a solution that is worse! This can be post-processor. Within the genetic algorithm itself is a set
observed by the erratic pattern in some of the graphs after of operators that utilize methods from the genetic algorithm
the sharp descent and before the actual resolution of the domain as well as applying various heuristics in a
problem. stochastic manner. The end result is an quick progressive
climb to the peak of the globally optimum solution.
This seemingly worsening fitness is not bad however. In
fact, it is partly due to this worsening of the fitness that The algorithms described here can also be applied to the
some solutions are found at all! When the solution various subsets of the general GCP. In particular, Sudoku
becomes worse the fitness score increases. This will force can benefit from these algorithms where it can be
the algorithm back to using the first parent selection and represented as a graph with 81 vertices that must be colored
mutation methods. But, the population now contains a using no more than 9 different colors (i.e. different
solution that hadn’t been there before, which increases the numbers).
possibility of reaching the global optimum. The continuous
back and forth between parent selection and mutation References
methods plays a crucial role in shifting the solution from a
local optimum to a global optimum. Abbasian, Reza, and Mouhoub, Malek. 2011. An efficient
hierarchical parallel genetic algorithm for graph coloring
Finally, if a solution is not found after 20,000 generations, problem. In Proceedings of The 13th annual conference on
a Wisdom of Artificial Crowds algorithm is applied to the Genetic and evolutionary computation, 521-528. Dublin, Ireland:
resultant population to produce a better solution. The ACM
genetic algorithm had been producing optimal results and
thus, per the algorithmic workflow, the Wisdom of
Artificial Crowds postprocessor wouldn’t be applied. Ali, F. F., Nakao, Z., Tan, R. B., and Yen-Wei, Chen. 1999. An
However, in order to test its effectiveness, a test was evolutionary approach for graph coloring. In Proceedings of The
conducted by decreasing the generation count to 10,000 to International Conference on Systems, Man, and Cybernetics, 527-
intentionally produce a suboptimal solution. The test was 532. IEEE
carried out on the miles1000.col file. Before the
Ashby, Leif H., and Yampolskiy, Roman V. 2011, Genetic Porumbel, Daniel Cosmin. 2009. Heuristic Algorithms and
Algorithm and Wisdom of Artificial Crowds Algorithm Applied Learning Techniques: Applications to the Graph Coloring
to Light Up, The 16th International Conference on Computer Problem, Thesis Draft, Département Informatique, Université
Games: AI, Animation, Mobile, Interactive Multimedia, d'Angers
Educational & Serious Games, Louisville, KY, USA: July 27 –
30, 2011 Rutgers, The State University of New Jersey, Research, AT&T
Labs, Jersey, Cancer Institute of New, University, Princeton,
Back, T., Hammel, U., and Schwefel, H. P. 1997. Evolutionary Labs, Alcatel-Lucent Bell, America, NEC Laboratories, and
computation: comments on the history and current state. IEEE Sciences, Applied Communication. 2011. Center for Discrete
Transactions on Evolutionary Computation 1: 3-17 Mathematics and Theoretical Computer Science (DIMACS). Nov.
1, 2011. http://dimacs.rutgers.edu/
Croitoru, Cornelius, Luchian, Henri, Gheorghies, Ovidiu, and
Apetrei, Adriana. 2002. A New Genetic Graph Coloring Shen, Justine W. 2003. Solving the Graph Coloring Problem
Heuristic. In Proceedings of The Computational Symposium on using Genetic Programming, in Genetic Algorithms and Genetic
Graph Coloring and its Generalizations, 63-74. Ithaca, New Programming at Stanford 2003: 187-196, Stanford Bookstore
York, USA
Shengning, Wu, and Sikun, Li. 2007. Extending Traditional
Díaz, Isabel Méndez, and Zabala, Paula. 1999, A Generalization Graph-Coloring Register Allocation Exploiting Meta-heuristics
of the Graph Coloring Problem, Departamento de Computacion, for Embedded Systems. In Proceedings of The Third International
Universidad de Buenes Aires. Conference on Natural Computation. ICNC, 324-329. Haikou,
Hainan, China
Durrett, Greg, Médard, Muriel, and O’Reilly, Una-May. 2010. A
Genetic Algorithm to Minimize Chromatic Entropy: 59-70, eds. Singha, S., Bhattacharya, T., and Chaudhuri, S. R. B. 2008. An
P. Cowling and P. Merz, Springer Berlin / Heidelberg Approach for Reducing Crosstalk in Restricted Channel Routing
Using Graph Coloring Problem and Genetic Algorithm. In
Eiben, A. E., Hinterding, R., and Michalewicz, Z. 1999. Proceedings of The International Conference on Computer and
Parameter control in evolutionary algorithms. IEEE Transactions Electrical Engineering, 807-811. Phuket Island, Thailand
on Evolutionary Computation 2: 124-141
Srinivas, M., and Patnaik, L. M. 1994. Adaptive probabilities of
Glass, C. A., and Prugel-Bennett, A. 2003. Genetic algorithm for crossover and mutation in genetic algorithms. IEEE Transactions
graph coloring: Exploration of Galinier and Hao's algorithm. on Systems, Man and Cybernetics 4: 656-667
Journal of Combinatorial Optimization 3: 229-236
Tagawa, K., Kanesige, K., Inoue, K., and Haneda, H. 1999.
Gwee, B. H., Lim, M. H., and Ho, J. S. 1993. Solving four- Distance based hybrid genetic algorithm: an application for the
colouring map problem using genetic algorithm. In Proceedings graph coloring problem. In Proceedings of Congress on
of First New Zealand International Two-Stream Conference on Evolutionary Computation, 2332. Washington DC
Artificial Neural Networks and Expert Systems, 332-333. New
Zealand Yampolskiy, Roman V., and El-Barkouky, Ahmed. 2011.
Wisdom of Artificial Crowds Algorithm for Solving NP-Hard
Hale, W. K. 1980. Frequency assignment: Theory and Problems. International Journal of Bio-Inspired Computation
applications. Proceedings of the IEEE 12: 1497-1514 (IJBIC) 6: 358-369
Han, Lixia, and Han, Zhanli. 2010. A Novel Bi-objective Genetic Yi, Sheng Kung Michael, Steyvers, Mark, Lee, Michael D., and
Algorithm for the Graph Coloring Problem. In Proceedings of Dry, Matthew. 2010a. Wisdom of the Crowds in Minimum
The 2010 Second International Conference on Computer Spanning Tree Problems. In Proceedings of The 32nd Annual
Modeling and Simulation, 3-6. Sanya, China: IEEE Computer Conference of the Cognitive Science Society. Portland, Oregon
Society
Yi, Sheng Kung Michael, Steyvers, Mark, Lee, Michael D., and
Marx, Daniel, and Marx, D Aniel. 2004, Graph Coloring Dry, Matthew J. 2010b. Wisdom of the Crowds in Traveling
Problems and Their Applications in Scheduling, John von Salesman Problems.
Neumann PhD Students Conference, Budapest, Hungary http://www.socsci.uci.edu/~mdlee/YiEtAl2010.pdf
O'Madadhain, Joshua, Fisher, Danyel, and Nelson, Tom. 2011,
JUNG: Java Universal Network/Graph framework, Oct. 1 2011,
http://jung.sourceforge.net/.