<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Graph framework, Oct.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Genetic Algorithm Applied to the Graph Coloring Problem</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Musa M. Hindi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roman V. Yampolskiy</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Engineering and Computer Science J.B. Speed School of Engineering Louisville</institution>
          ,
          <addr-line>Kentucky</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Map coloring (B. H. Gwee, M. H. Lim and J. S. Ho 1993) Scheduling (Daniel Marx and D Aniel Marx 2004) Radio Frequency Assignment (W. K. Hale 1980; S. Singha, T. Bhattacharya and S. R. B. Chaudhuri 2008) Register allocation (Wu Shengning and Li Sikun 2007) Pattern Matching Sudoku</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Rutgers, The State University of New Jersey, Research, AT&amp;T Labs, Jersey, Cancer Institute of New, University</institution>
          ,
          <addr-line>Princeton, Labs, Alcatel-Lucent Bell, America, NEC Laboratories</addr-line>
          ,
          <institution>and Sciences, Applied Communication. 2011. Center for Discrete Mathematics and Theoretical Computer Science (DIMACS). Nov.</institution>
          <addr-line>1, 2011</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2004</year>
      </pub-date>
      <volume>1</volume>
      <issue>2011</issue>
      <abstract>
        <p>In this paper we present a hybrid technique that applies a genetic algorithm followed by wisdom of artificial crowds approach to solving the graph-coloring problem. The genetic algorithm described here utilizes more than one parent selection and mutation methods depending on the state of fitness of its best solution. This results in shifting the solution to the global optimum more quickly than using a single parent selection or mutation method. The algorithm is tested against the standard DIMACS benchmark tests while limiting the number of usable colors to the known chromatic numbers. The proposed algorithm succeeded at solving the sample data set and even outperformed a recent approach in terms of the minimum number of colors needed to color some of the graphs.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>χ(G) = min(k: P(G, k) &gt; 0)
(1)
Graph coloring problems are very interesting from the
theoretical standpoint since they are a class of
NPcomplete problems that also belong to Constraint
Satisfaction Problems (CSPs). The practical applications of
Graph Coloring Problems include but are not limited to:
•
•
•
•
•
•
In this paper we demonstrate the use of genetic algorithms
in solving the graph-coloring problem while strictly
adhering to the usage of no more than the number of colors
equal to the chromatic index to color the various test
graphs.</p>
    </sec>
    <sec id="sec-2">
      <title>Prior Work</title>
      <p>
        A great deal of research has been done to tackle the
theoretical aspects of the Graph Coloring Problem in terms
of its generalization as a Constraint Satisfaction Problem
(Isabel Méndez Díaz and Paula Zabala 1999). The
problem’s various applications and solutions have been
discussed in detail in Porumbel’s paper (Daniel Cosmin
Porumbel 2009). Evolutionary computation and parameter
control has been detailed in a number of papers including
ones by Back, Hammel, and Schwefel (T. Back, U.
Hammel and H. P. Schwefel 1997) as well as work by
Eiben, Hinterding and Michalewicz (A. E. Eiben, R.
Hinterding and Z. Michalewicz 1999). Srinivas and Patnaik
examined crossover and mutation probabilities for
optimizing genetic algorithm performance (M. Srinivas and
L. M. Patnaik 1994). Genetic algorithms and evolutionary
approaches have been used extensively in solutions for the
Graph Coloring Problem and its applications (F. F. Ali, et
al. 1999; K. Tagawa, et al. 1999; Cornelius Croitoru, et al.
2002; C. A. Glass and A. Prugel-Bennett 2003; Justine W.
Shen 2003; Greg Durrett, Muriel Médard and Una-May
O’Reilly 2010; Lixia Han and Zhanli Han 2010). Most
recent work utilized a parallel genetic algorithm on a
similar dataset to the one used in this paper
        <xref ref-type="bibr" rid="ref1">(Reza
Abbasian and Malek Mouhoub 2011)</xref>
        .
The concept of utilizing a crowd of individuals for solving
NP complete problems has also been the topic of various
papers. Most notably the Wisdom of Crowds concept has
been used in solving the Traveling Salesman Problem
(Sheng Kung Michael Yi, et al. 2010b) as well as the
Minimum Spanning Tree Problem (Sheng Kung Michael
Yi, et al. 2010a). In this paper we attempt to supplement
the solution produced by the genetic algorithm utilizing an
artificial crowd
        <xref ref-type="bibr" rid="ref1 ref1">(Leif H. Ashby and Roman V. Yampolskiy
2011; Roman V. Yampolskiy and Ahmed El-Barkouky
2011)</xref>
        .
      </p>
    </sec>
    <sec id="sec-3">
      <title>Proposed Approach</title>
      <p>Genetic algorithms share an overall structure and
workflow yet they vary in the specific details according to
the particular problem. The algorithm consists of a parent
selection method, a crossover method and a mutation
method.</p>
      <p>The general algorithm is:
Algorithm1: General Genetic Algorithm
define: population, parents, child
population = randomly generated chromosomes;
while (terminating condition is not reached) {</p>
      <p>gaRun();
}
// a single run of a genetic algorithm
function gaRun() {
parents = getParents();
child = crossover(parents);
child = mutate(child);
population.add(child);
}
The goal of the previous algorithm is to improve the
fitness of the population by mating its fittest individuals to
produce superior offspring that offer a better solution to
the problem. This process continues until a terminating
condition is reached which could be simply that the total
number of generations has been run or any other parameter
like non-improvement of fitness over a certain number of
generations or that a solution for the problem has been
found.</p>
      <p>The chromosome representation of the GCP is simply an
array with the length set to the number of vertices in the
graph. Each cell in the array is assigned a value from 0 to
the number of colors – 1. The adjacencies between the
vertices are represented by an adjacency matrix of
dimensions n×n where n: the number of vertices.
The ultimate goal when solving GCPs is to reach a solution
where no two adjacent vertices have the same color.
Therefore, the GA process continues until it either finds a
solution (i.e. 0 conflicts) or the algorithm has been run for
the predefined number of generations. In addition to the
GA, if a run fails to reach a solution a Wisdom of Crowds
approach will be applied to the top performers in an
attempt to produce a better solution.
The overall genetic algorithm in this approach is a
generational genetic algorithm. The population size is kept
constant at all times and with each generation the bottom
performing half of the population is removed and new
randomly generated chromosomes are added. The
population size is set to 50 chromosomes. The value was
chosen after testing a number of different population sizes.
The value 50 was the least value that produced the desired
results.</p>
      <p>The GA uses two different parent selection methods, a
single crossover method and two different mutation
methods. Which of the parent selection and mutation
methods ends up selected depends on the state of the
population and how close it is to finding a solution. The
parent selection, crossover and mutation methods are
outlined as follows:
Algorithm2: parentSelection1:
define: parent1, parent2, tempParents;
tempParents = two randomly selected chromosomes from the
population;
parent1 = the fitter of tempParents;
tempParents = two randomly selected chromosomes from the
population;
parent2 = the fitter of tempParents;
return parent1, parent2;
Algorithm3: parentSelection2:
define: parent1, parent2
parent1 = the top performing chromosome;
parent2 = the top performing chromosome;
return parent1, parent2;
Algorithm4: crossover
define: crosspoint, parent1, parent2, child
}
}
crosspoint = random point along a chromosome;
child = colors up to and including crosspoint from parent 1 +
colors after crosspoint to the end of the chromosome from
parent2;
return child;
Algorithm5: mutation1:
define: chromosome, allColors, adjacentColors, validColors,
newColor;
for each(vertex in chromosome) {
if (vertex has the same color as an adjacent vertex) {
adjacentColors = all adjacent colors;
validColors = allColors – adjacentColors;
newColor = random color from validColors;
chromosome.setColor(vertex, newColor)
}
return chromosome;
Algorithm6: mutation2:
define: chromosome, allColors
for each(vertex in chromosome) {
if (vertex has the same color as an adjacent vertex) {
newColor = random color from allColors;
chromosome.setColor(vertex, newColor)
}
return chromosome;
A bad edge is defined as an edge connecting two vertices
that have the same color. The number of bad edges is the
fitness score for any chromosome. As mentioned above, the
alteration between the two different parent selection and
mutation methods depends on the best fitness. If the best
fitness is greater than 4 then parentSelection1 and
mutation1 are used. If the best fitness is 4 or less then
parentSelection2 and mutation2 are used. This alteration is
the result of experimenting with the different data sets. It
was observed that when the best fitness score is low (i.e.
approaching an optimum) the usage of parent selection 2
(which copies the best chromosome as the new child) along
with mutation2 (which randomly selects a color for the
violating vertex) results in a solution more often and more
quickly than using the other two respective methods.
Finally, the algorithm is run for 20,000 generations or until
a solution with 0 bad edges is found. If a solution is not
found after 20,000 generations the
wisdomOfArtificialCrowds algorithm is run. The algorithm
used is a localized wisdom of crowds algorithm that only
builds a consensus out of the violating edges in the best
solution. Moreover, it uses the best half of the final
population to produce an aggregate solution. Only a
localized consensus is generated so as not to produce a
result that alters the correctly colored vertices. Also, it
takes the best half because they share the most similarity
and thus will most likely be different at the level of the bad
edges rather than the good ones.</p>
      <p>Algorithm7: wisdomOfArtificialCrowds
define: aggregateChromosome, newColor, expertChromosomes;
expertChromosomes = best half of the final population;
aggregateChromosome = best performing chromosome;
for</p>
      <p>each (vertex in graph) {
if (vertex is part of a bad edge) {
newColor = most used color for vertex
expertChromosomes;
aggregateChromosome.setColor(vertex, newColor)
}
}
among</p>
    </sec>
    <sec id="sec-4">
      <title>Data</title>
      <p>Data used to test our approach are derived from the
DIMACS benchmarking graph collection. DIMACS is the
Center for Discrete Mathematics and Theoretical Computer
Science. It is part of Rutgers University (The State
University of New Jersey Rutgers, et al. 2011). The data
are frequently used in challenges involving constraint
satisfaction problems. The files used have a .col extension.
Each file contains a header with the number of vertices (p)
and the number of edges (e):
A number of lines follow the header with each line
denoting the connected edges and their vertex indices:
16 files were chosen from the DIMACS collection. The
graphs the files represent vary in vertex count, edge count
and overall complexity. The vertex count ranges between
11 and 561 and the edge count ranges from 20 to 11654.
The rationale behind the selection of these graphs other
than the wide range of variation is that there is a known
chromatic number for each of them, or at least a good
approximation.</p>
      <p>The following files were used in this approach:
(myciel3.col, myciel4.col, myciel5.col, queen5_5.col,
queen6_6.col, queen7_ 7.col, queen8_8.col, huck.col,
jean.col, david.col. games120.col, miles250.col,
miles1000.col, anna.col, fpsol2.i.1.col, homer.col).</p>
    </sec>
    <sec id="sec-5">
      <title>Results</title>
      <p>
        The tests were run on a desktop PC with the following
specifications:
The following graphs plot the relationship between the
fitness and the generation for a sample set of the files used:
Figures 2 and 3 are not very interesting since the solutions
are found after a few generations. The next plot, however,
is of particular interest since it clearly represents the erratic
behavior of the fitness score between the initial rapid drop
until a solution is ultimately found. In standard genetic
algorithms the fitness score continues to increase or
decrease (depending of the definition of better fitness)
until the end of the run. This is not the case here. This
factor plays a huge role in obtaining the global optimum
with a higher probability than without it as will be
discussed later.
Sample solutions: (graphs were visualized using the JUNG
framework
        <xref ref-type="bibr" rid="ref1">(Joshua O'Madadhain, Danyel Fisher and Tom
Nelson 2011)</xref>
        ):
      </p>
    </sec>
    <sec id="sec-6">
      <title>Discussion</title>
      <p>During the design of this approach, the issue of designing
good operators was a constant concern. The goal of any
good operator is to bring the chromosomes of a population
closer to the desired solution. However, during the process,
a chromosome’s fitness often improves but eventually ends
up in a local optimum. The overall design of this approach
aimed to improve fitness towards the global optimum while
avoiding the pitfalls of local optima.</p>
      <p>To achieve that, a number of factors need to be considered.
Initially, the crossover function is applied to parents that
result from the first parent selection method. This method
selects parents by conducting a small tournament between
random pairs of chromosomes. Two pairs are chosen
randomly and the fitter of each pair becomes a parent.
These fit parents are then used as input to this crossover
method. The crossover conducts a simple one-point
crossover with the cross point being chosen at random.
The result of this crossover is then subjected to the first
mutation method. The mutation is carried out at a high rate
of 0.7. This mutation examines each vertex and if a vertex
violates the coloring constraint a valid color is chosen at
random.</p>
      <p>This process is very effective in reducing the number of
conflicts rapidly which can be seen in all the plots through
an almost perpendicular drop in fitness. However, in spite
of this method’s effectiveness at increasing fitness rapidly,
it has the side effect of keeping the solution at a local
optimum.</p>
      <p>To fix that, another parent selection and mutation method
is introduced. The two methods are applied when the
overall fitness of the best solution drops below 5 conflicts.
After that point crossover is no longer applied. The top
performer is selected and is subjected to the second
mutation method. This method finds the conflicting
vertices and replaces their conflicting colors with random
colors; which could be invalid as well. This has the
potential to either find a globally optimum solution (i.e. 0
conflicts) or produce a solution that is worse! This can be
observed by the erratic pattern in some of the graphs after
the sharp descent and before the actual resolution of the
problem.</p>
      <p>This seemingly worsening fitness is not bad however. In
fact, it is partly due to this worsening of the fitness that
some solutions are found at all! When the solution
becomes worse the fitness score increases. This will force
the algorithm back to using the first parent selection and
mutation methods. But, the population now contains a
solution that hadn’t been there before, which increases the
possibility of reaching the global optimum. The continuous
back and forth between parent selection and mutation
methods plays a crucial role in shifting the solution from a
local optimum to a global optimum.</p>
      <p>Finally, if a solution is not found after 20,000 generations,
a Wisdom of Artificial Crowds algorithm is applied to the
resultant population to produce a better solution. The
genetic algorithm had been producing optimal results and
thus, per the algorithmic workflow, the Wisdom of
Artificial Crowds postprocessor wouldn’t be applied.
However, in order to test its effectiveness, a test was
conducted by decreasing the generation count to 10,000 to
intentionally produce a suboptimal solution. The test was
carried out on the miles1000.col file. Before the
postprocessor was applied the best solution had 5
conflicting edges. After application of the Wisdom of
Artificial Crowds postprocessor the graph was colored
slightly differently but still had 5 conflicting edges. It is
worth noting that the postprocessor added an average of
250 ms to the overall process.</p>
      <p>
        The test cases used were very useful in both testing and
tuning the algorithm. The algorithm was able to scale
across the different graphs and produce optimum solutions
in each case. A recent 2011 publication presented a parallel
genetic algorithm for the GCP
        <xref ref-type="bibr" rid="ref1">(Reza Abbasian and Malek
Mouhoub 2011)</xref>
        . This paper used most of the same test
files that were used in Abbasian’s approach. That
approach’s proposed algorithm failed to solve three of the
graphs and only produced solutions when the color count
was increased by 1. The files representing these graphs are
queen6_6.col, queen7_7.col and queen8_8.col. The
algorithm used in our approach successfully solved these
three files using the specified known chromatic number as
the number of colors used. Our approach is also generally
faster at finding a solution. It is faster in all cases except
four. Three of those cases are the three aforementioned
files, which the comparative method did not succeed at
finding a solution using the known chromatic index. The
fourth case is miles1000.col.
      </p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>The overarching algorithm used in this approach is a
genetic algorithm with a subsequent Wisdom of Crowds
post-processor. Within the genetic algorithm itself is a set
of operators that utilize methods from the genetic algorithm
domain as well as applying various heuristics in a
stochastic manner. The end result is an quick progressive
climb to the peak of the globally optimum solution.
The algorithms described here can also be applied to the
various subsets of the general GCP. In particular, Sudoku
can benefit from these algorithms where it can be
represented as a graph with 81 vertices that must be colored
using no more than 9 different colors (i.e. different
numbers).
for graph
coloring
problem. In Proceedings of The 13th annual conference on
Genetic and evolutionary computation, 521-528. Dublin, Ireland:
ACM
Ali, F. F., Nakao, Z., Tan, R. B., and Yen-Wei, Chen. 1999. An
evolutionary approach for graph coloring. In Proceedings of The
International Conference on Systems, Man, and Cybernetics,
527532. IEEE
Ashby, Leif H., and Yampolskiy, Roman V. 2011, Genetic
Algorithm and Wisdom of Artificial Crowds Algorithm Applied
to Light Up, The 16th International Conference on Computer
Games: AI, Animation, Mobile, Interactive Multimedia,
Educational &amp; Serious Games, Louisville, KY, USA: July 27 –
30, 2011
Back, T., Hammel, U., and Schwefel, H. P. 1997. Evolutionary
computation: comments on the history and current state. IEEE
Transactions on Evolutionary Computation 1: 3-17
Croitoru, Cornelius, Luchian, Henri, Gheorghies, Ovidiu, and
Apetrei, Adriana. 2002. A New Genetic Graph Coloring
Heuristic. In Proceedings of The Computational Symposium on
Graph Coloring and its Generalizations, 63-74. Ithaca, New
York, USA
Díaz, Isabel Méndez, and Zabala, Paula. 1999, A Generalization
of the Graph Coloring Problem, Departamento de Computacion,
Universidad de Buenes Aires.</p>
      <p>Durrett, Greg, Médard, Muriel, and O’Reilly, Una-May. 2010. A
Genetic Algorithm to Minimize Chromatic Entropy: 59-70, eds.
P. Cowling and P. Merz, Springer Berlin / Heidelberg
Eiben, A. E., Hinterding, R., and Michalewicz, Z. 1999.
Parameter control in evolutionary algorithms. IEEE Transactions
on Evolutionary Computation 2: 124-141
Glass, C. A., and Prugel-Bennett, A. 2003. Genetic algorithm for
graph coloring: Exploration of Galinier and Hao's algorithm.
Journal of Combinatorial Optimization 3: 229-236
Gwee, B. H., Lim, M. H., and Ho, J. S. 1993. Solving
fourcolouring map problem using genetic algorithm. In Proceedings
of First New Zealand International Two-Stream Conference on
Artificial Neural Networks and Expert Systems, 332-333. New
Zealand
Hale, W. K. 1980. Frequency assignment: Theory and
applications. Proceedings of the IEEE 12: 1497-1514
Han, Lixia, and Han, Zhanli. 2010. A Novel Bi-objective Genetic
Algorithm for the Graph Coloring Problem. In Proceedings of
The 2010 Second International Conference on Computer
Modeling and Simulation, 3-6. Sanya, China: IEEE Computer
Society</p>
      <p>Porumbel, Daniel Cosmin. 2009. Heuristic Algorithms and
Learning Techniques: Applications to the Graph Coloring
Problem, Thesis Draft, Département Informatique, Université
d'Angers
Shen, Justine W. 2003. Solving the Graph Coloring Problem
using Genetic Programming, in Genetic Algorithms and Genetic
Programming at Stanford 2003: 187-196, Stanford Bookstore
Shengning, Wu, and Sikun, Li. 2007. Extending Traditional
Graph-Coloring Register Allocation Exploiting Meta-heuristics
for Embedded Systems. In Proceedings of The Third International
Conference on Natural Computation. ICNC, 324-329. Haikou,
Hainan, China
Singha, S., Bhattacharya, T., and Chaudhuri, S. R. B. 2008. An
Approach for Reducing Crosstalk in Restricted Channel Routing
Using Graph Coloring Problem and Genetic Algorithm. In
Proceedings of The International Conference on Computer and
Electrical Engineering, 807-811. Phuket Island, Thailand
Srinivas, M., and Patnaik, L. M. 1994. Adaptive probabilities of
crossover and mutation in genetic algorithms. IEEE Transactions
on Systems, Man and Cybernetics 4: 656-667
Tagawa, K., Kanesige, K., Inoue, K., and Haneda, H. 1999.
Distance based hybrid genetic algorithm: an application for the
graph coloring problem. In Proceedings of Congress on
Evolutionary Computation, 2332. Washington DC
Yampolskiy, Roman V., and El-Barkouky, Ahmed. 2011.
Wisdom of Artificial Crowds Algorithm for Solving NP-Hard
Problems. International Journal of Bio-Inspired Computation
(IJBIC) 6: 358-369</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Abbasian</surname>
          </string-name>
          , Reza, and
          <string-name>
            <surname>Mouhoub</surname>
          </string-name>
          , Malek.
          <year>2011</year>
          .
          <article-title>An efficient hierarchical parallel genetic algorithm</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>