An Artificial Coevolutionary Framework for Adversarial AI

                                   Una-May O’Reilly and Erik Hemberg
                                         Massachusetts Institute of Technology


                       Abstract                                Coevolutionary Component     Attack controller   Engagement Component

                                                                Strategy adaptation of     Defense controller     Strategy evaluator
  Cyber adversaries are engaged in a perpetual arms                     Attack
  race. They are continuously maneuvering to outwit                       &
                                                                       Defense            Engagement measures
  the opposing posture. Replicating and studying the dy-
  namics of these engagements provides a route to proac-
  tive, adversarially-hardened cyber defenses. The con-       Figure 1: Component overview of our coevolutionary
  stant struggle can be computationally formulated as         adversarial AI framework. The coevolutionary com-
  a competitive coevolutionary system which generates         ponent performs search over the adversary controllers.
  many arms races that can be harvested for robust so-        The engagement component evaluates the strategies of
  lutions. We present a paradigm, techniques and tools        the adversaries and returns the measurements of the
  that recreate the coevolutionary process in the context     engagement.
  of network cyber security scenarios. We describe its
  current use cases and how we harvest defensive solu-
  tions from it.
                                                              ing to a variety of different objectives so a decision
                                                              maker can choose among them. In particular, the field
                    Introduction                              of coevolutionary algorithms (Popovici et al. 2012) pro-
The greatest concern a prepared cyber defender might          vides search heuristics that specifically direct competi-
raise is: “What if my assumptions are wrong?” It is           tive engagements. The engagements are between mem-
common knowledge that the only certainty is that an           bers of adversarial populations with opposing objectives
intelligent adversary will always keep trying to gain an      that each undergo selection on the basis of performance
advantage. Moreover, once forced to react, a defender         and variation to adapt. Coevolutionary logic results in
is too late. So, how can a defender use Artificial Intel-     population-wide adversarial dynamics. Such dynamics
ligence (AI) to gain an edge in an environment that is        can expose possible adversarial behaviors that a defense
stacked to the attacker’s advantage, where the defender       would like to anticipate. A competitive coevolutionary
seems doomed to always be one step behind?                    algorithm can be a component of a larger system, see
   One approach, adversarial AI, is to deploy defensive       for example Figure 1, in which a complementary compo-
configurations, that consider multiple possible antici-       nent sets up the environment where pairs of adversaries
pated adversarial behaviors and already take into ac-         engage and measures the outcome for each adversary.
count their expected impact, goal, strategies or tactics.     These measures can be used by the coevolutionary al-
Note that the precise metrics in this accounting can          gorithm to judge an adversary’s fitness.
vary. For example, impact can be any combination of              Herein we summarize a framework that we
financial cost, disruption level or outcome risk. Or, a       have used to generate robust defensive configura-
defender could prioritize a worst case, average case or       tions (Prado Sanchez 2018; Pertierra 2018). It is com-
a trade-off configuration.                                    posed of different coevolutionary algorithms to help it
   One way that such defensive configurations can be          generate diverse behavior. The algorithms, for further
found is by using stochastic search methods that first        diversity, use different “solution concepts”, i.e. mea-
explore the simulated competitive behavior of adver-          sures of adversarial success. Because engagements are
saries and then generate ranked configurations accord-        frequently computationally expensive and have to be
                                                              pairwise sampled from two populations each generation,
Copyright c by the papers authors. Copying permitted
                                                              the framework has a number of enhancements that en-
for private and academic purposes. In: Joseph Collins,
Prithviraj Dasgupta, Ranjeev Mittu (eds.): Proceedings of     able efficient use of a fixed budget of computation or
the AAAI Fall 2018 Symposium on Adversary-Aware Learn-        time.
ing Techniques and Trends in Cybersecurity, Arlington, VA,       The framework supports a number of use-cases using
USA, 18-19 October, 2018, published at http://ceur-ws.org     simulation and emulation of varying model granular-
ity. These include: A) Defending a peer-2-peer net-         lead to an arms race between test and solution, both
work against Distributed Denial of Service (DDOS) at-       evolving while pursuing opposite objectives (Popovici
tacks (Garcia et al. 2017) B) Defenses against spread-      et al. 2012). An example of learning in a coevolution-
ing device compromise in a segmented enterprise net-        ary algorithm is shown in Algorithm 1. A basic coevo-
work (Hemberg et al. 2018), and C) Deceptive de-            lutionary algorithm evolves two populations with e.g.
fense against the internal reconnaissance of an adver-      tournament selection and for variation uses crossover
sary within a software defined network (Pertierra 2018)     and mutation. One population comprises attacks and
   The framework is linked up to a decision sup-            the other defenses. In each generation, competitions
port module named ESTABLO (Sanchez et al. 2018;             are formed by pairing attack and defense. The popula-
Prado Sanchez 2018). The engagements of every run           tions are evolved in alternating steps: first the attacker
of any of the coevolutionary algorithms are cached and,     population is selected, varied, updated and evaluated
later, ESTABLO gathers adversaries resulting from dif-      against the defenders, and then the defender population
ferent algorithms for its compendium. It then competes      is selected, varied, updated and evaluated against the
the adversaries of each side against those of the other     defenders. Each attacker–defender pair is dispatched to
side and ranks each side’s members according to multi-      the engagement component to compete and the result is
ple criteria. It also provides visualizations and compar-   used as a component of fitness for each of them. Fitness
isons of adversarial behaviors. This information informs    is calculated over all an adversary’s engagements.
the decision process of a defensive manager.                   The representation of tests (and solutions) is cus-
   The adversarial AI framework’s specific contributions    tomizable in any coevolutionary algorithm (Rothlauf
are:                                                        2011) under the design constraint that it be amenable
   • The use of coevolutionary algorithms to adaptively     to stochastic variation, e.g. “genetic crossover” or mu-
generate adversarial dynamics supporting preemptively       tation. It may directly express the test or it may do so
investigating adversarial arms races that could occur.      indirectly, e.g. with a grammar. In the latter case, an
   • A suite of different coevolutionary algorithms that    intermediate interpreter works with a rule-based gram-
diversify the behavior of the adversaries to broaden the    mar to map from a “genome” that undergoes variation
potential dynamics.                                         to a “phenome” that expresses an executable behavior.
   • Use cases that model a variety of adversarial threat   Grammars (and GA representations, in general) offer
and defensive models.                                       design flexibility: changing out a grammar and the en-
   • A decision support module that supports selection      vironment of behavioral execution does not require any
of a superior anticipatory defensive configuration.         changes to the rest of the algorithm.
   Background provides context on modeling and sim-            Coevolutionary algorithms can encounter problem-
ulation and coevolutionary search algorithm. Frame-         atic dynamics where tests are unable improve solu-
work describes our coevolutionary method, engage-           tions, or drive toward a solution that is the a priori
ment component and decision support module. Use             intended goal. There are accepted remedies to specific
Cases provides examples applying to cyber security          coevolutionary pathologies (Bongard and Lipson 2005;
and network attacks. Conclusions summarizes and             Ficici 2004; Popovici et al. 2012). They generally in-
addresses future work.                                      clude maintaining population diversity so that a search
                                                            gradient is always present and using more explicit mem-
                    Background                              ory, e.g. a Hall of Fame or an archive, to prevent
The strategy of testing the security of a system by         regress (Miconi 2009). The pathologies of coevolution-
trying to successfully attack it is somewhat analogous      ary algorithms are similar to those encountered by gen-
to software fuzzing (Miller, Fredriksen, and So 1990).      erative adversarial networks (GANs) (Goodfellow et al.
Fuzzing tests software adaptively to search for bugs        2014; Arora et al. 2017)
while adaptive attacks test defenses. In contrast to
software where a bugs is fixed by humans, our ap-           Modeling and Simulation
proach automatically adapts a defense. This forms a         A coevolutionary algorithm includes an environment
novel counter attack. Fuzzing is driven by genetic al-      that supports executing the tests and solutions to com-
gorithms (GA) whereas, to drive cyber arms races in         pete against each other in each engagement. We use
which both adversaries adapt, our approach uses cou-        modeling and simulation for this purpose. Mod-sim sys-
pled GAs called competitive coevolutionary algorithms.      tems range in complexity, level of abstraction and res-
                                                            olution. Modeling and simulation comprise a powerful
Coevolutionary Search Algorithms                            approach, “mod-sim”, for investigating general security
Coevolutionary algorithms, related to evolutionary al-      scenarios (Tambe 2012), computer security (Thomp-
gorithms (Bäck 1996), explore domains in which the          son, Morris-King, and Cam 2016; Lange et al. 2017;
quality of a candidate solution is determined by its        Winterrose and Carter 2014) and network dynamics
ability to successfully pass some set of tests. Recip-      in particular, e.g., in CANDLES – the Coevolutionary,
rocally, a test’s quality is determined by its ability to   Agent-based, Network Defense Lightweight Event Sys-
force errors from some set of solutions. In competi-        tem of (Rush, Tauritz, and Kent 2015), attacker and
tive coevolution, similar to game theory, the search can    defender strategies are coevolved in the context of a sin-
                                                                                     Parameters                     BNF Grammar
Algorithm 1 Example Coevolutionary Algorithm
Input:
                                                                                  Search Algorithm                    CFG Parser
T : number of iterations L: Fitness function
µ: mutation probability, λ : population size                                      Grammar rewriting
 1: A0 ← [a1,0 , . . . , aλ,0 ] ∼ U (A) . Initialize minimizer population           Integer input sequence       Context Free Grammar        Output Sentence (Strategy)
 2: D0 ← [d1,0 , . . . , dλ,0 ] ∼ U (D) . Initialize maximizer population
 3: t ← 0                                      . Initialize iteration counter
 4: repeat
                                                                                   Search                                Engagement
 5:    t←t+1                                              . Increase counter
                                                                                                                                        Interpreter      Fitness Evaluator
 6:    At ← select(At−1 ))                                         . Selection
 7:    At ← perturb(At , µ))                                      . Mutation
                                                                                      Coevolutionary Algorithm                                              Fitness Value
 8:                                                         . Best minimizer
 9:    a0∗ , d0∗ ← arg mina∈At arg maxd∈Dt−1 L(a, d)
10:                                              . Replace worst minimizer
11:      if L(a0∗ , d0∗ ) < L(aλ,t−1 , dλ,t−1 ) then                             Figure 2: A BNF grammar and search parameters are
12:            aλ,t−1 ← a0∗                            . Update population       used as input. The grammar rewrites the integer input
13:      end if                                                                  to a sentence. Fitness is calculated by interpreting the
14:      At ← At−1                                        . Copy population
                                                                                 sentence and then evaluate it. The search component,
15:      t ← t + 1 . Increase counter before alternating to maximizer
16:      Dt ← select(Dt−1 ))                                       . Selection
                                                                                 a coevolutionary algorithm, modifies the solutions us-
17:      Dt ← perturb(Dt , µ))                                    . Mutation
                                                                                 ing two central mechanisms: fitness based selection and
18:                                                        . Best maximizer      random variation.
19:      a00 , d00 ← arg mina∈At arg maxd∈Dt L(a, d)
20:                                             . Replace worst maximizer
21:      if L(a00 , d00 ) > L(aλ,t , dλ,t−1 ) then                               the actual engagement environment. Mod-sim is ap-
22:            dλ,t−1 ← d00                            . Update population
                                                                                 propriate when testbeds incur long experimental cycle
23:      end if
                                                                                 times or do not abstract away irrelevant detail.
24:      Dt ← Dt−1                                        . Copy population
25: until t ≥ T
26: a∗ , d∗ ← arg mina∈AT arg maxd∈DT L(a, d) . Best minimizer                   Adversary Representation
27: return a∗ , d∗
                                                                                 The framework uses grammars to express open ended
                                                                                 behavioral action sequences for attack and defense
                                                                                 strategies (a.k.a controllers). See Figure 2 and (O’Neill
gle, custom, abstract computer network defense simu-                             and Ryan 2003) for more details. While the frame-
lation.                                                                          work’s grammars currently are strategic in nature, we
                                                                                 foresee incorporating higher level behavior related to
               Framework Components                                              plans and goals.
Coevolutionary Algorithms                                                           A grammar is introduced in Backus Naur Form
                                                                                 (BNF) and describes a language in the problem do-
The framework supports diverse behavior by executing                             main. The BNF description is parsed to a context
algorithms that vary in synchronization of the two pop-                          free grammar representation. Its (rewrite) rules express
ulations and solution concepts. (Prado Sanchez 2018;                             how a sentence, i.e. test or solution, can be composed
Pertierra 2018). Working within a fixed time or fitness                          by rewriting a start symbol. The adversaries are fixed
evaluation budget, the framework also                                            length integer vectors that are use to control the rewrit-
1. Caches engagements to avoid repeating them;                                   ing. To interpret them, in sequence each of the vector’s
                                                                                 integers is referenced. This resulting sentence is the
2. Uses Gaussian process estimation to identify and
                                                                                 strategy that is executed. For solving different prob-
  evaluate the most uncertain engagement (Pertierra
                                                                                 lems, it is only necessary to change the BNF gram-
  2018);
                                                                                 mar, engagement environment and fitness function of
3. Uses a recommender technique to approximate some                              the adversaries. This modularity, and reusability of the
  adversary’s fitnesses (Pertierra 2018); and                                    parser and rewriter are efficient software engineering
4. Uses a spatial grid to reduce complete pair-                                  and problem solving advantages. The grammar addi-
  wise engagements to a Moore neighborhood quan-                                 tionally helps communicate the framework’s function-
  tity (Mitchell 2006; Williams and Mitchell 2005).                              ality to stakeholders by enabling conversations and val-
                                                                                 idation at the domain level. This contributes to stake-
Engagement Environment                                                           holder confidence in solutions and the framework.
The engagement component is flexible and can support                             Decision Support
a problem-specific network testbed, simulator or model.
The abstraction level of the use case determines the                             Competitive coevolution has the following chal-
choice of a simple to more detailed mod-sim or even                              lenges (Sanchez et al. 2018; Prado Sanchez 2018):
1. Solutions and tests are not on comparable on a “level     placement defense challenge is to optimize the strategic
  playing field” because fitness is based solely on the      placement of assets in the network. While under the
  context of engagements.                                    threat of node-level DDOS attack, the defense must en-
2. Blind spots, unvisited by the algorithms may exist.       able a set of tasks. It does this by fielding feasible paths
                                                             between the nodes that host the assets which support
3. From multiple runs, with one or more algorithms, it       the tasks. A mobile asset is, for example, mobile per-
  is unclear how to automatically select a “best” solu-      sonnel or a software application that can be served by
  tion.                                                      any number of nodes. A task is, for example, the con-
  The framework’s decision support module, ESTABLO,          nection that allows personnel to use a software applica-
see Figure 3, addresses these challenges. ESTABLO:           tion.
A) runs competitive coevolutionary search algorithms
with different solution concepts; B) combines the best       Availability Attacks on Segmented
solutions and tests at the end of each run into a com-       Networks
pendium; C) competes each solution against different         Attackers also often introduce malware into networks.
test sets, including the compendium and a set of unseen      Once an attacker has compromised a device on a net-
tests, to measure its performance according to different     work, they can move to connected devices, akin to
solution concepts; D) selects the “best” solutions from      contagion. This use case considers network segmen-
the compendium using a ranking and filtering process;        tation, a widely recommended defensive strategy, de-
and E) visualizes the best solutions to support a trans-     ployed against the threat of serial network security at-
parent and auditable decision.                               tacks that delay the mission of the network’s opera-
                                                             tor (Hemberg et al. 2018) in the context of malware
        Use Cases of the Framework                           spread.
                                                                Network segmentation divides the network topologi-
In this section we demonstrate use cases of the Ad-          cally into enclaves that serve as isolation units to deter
versarial AI framework. Broadly their goal is to iden-       inter-enclave contagion. How much network segmenta-
tify defensive configurations that are effective against a   tion is helpful is a tradeoff. On the one hand, a more
range of potential adversaries.                              segmented network provides less mission efficiency be-
                                                             cause of increased overhead in inter-enclave communica-
DOS Attacks on Peer-to-Peer Networks                         tion. On the other hand, smaller enclaves contain com-
A peer-to-peer (P2P) network is a robust and resilient       promise by limiting the spread rate, and their cleansing
means of securing mission reliability in the face of ex-     incurs fewer mission delays. Adding complexity, given
treme distributed denial of service (DDOS) attacks.          some segmentation, a network operator can further use
The project named RIVALS (Garcia et al. 2017) assists        threat monitoring and network cleansing policies to de-
in developing P2P network defense strategies against         tect and dislodge attackers but they come with a trade-
DDOS attacks. It models adversarial DDOS attack and          off of cost versus efficacy.
defense dynamics to help identify robust network de-            The use case assumes a network supports an enter-
sign and deployment configurations that support mis-         prise in carrying out its business or mission, and that an
sion completion despite an ongoing attack.                   adversary employs availability attacks against the net-
   RIVALS models DDOS attack strategies using a va-          work to disrupt this mission. Specifically, the attacker
riety of behavioral languages ranging from simple to         starts by using an exploit to compromise a vulnerable
complex. A simple language e.g. allows a strategy to         device on the network. This inflicts a mission delay
select one or more network servers to disable for some       when a mission critical device is infected. Then, the
duration. Defenders can choose one of three different        attacker moves laterally to compromise additional de-
network routing protocols: shortest path, flooding and       vices and maximally delay the mission. The network
a peer-to-peer ring overlay to try to maintain their per-    and its segments are pre-determined but the placement
formance. A more complex one allows a varying number         of critical devices within an enclave and the deployment
of steps over which the attack is modulated in duration,     of defensive threat monitoring device are open to opti-
strength and targets and can even include online adap-       mization.
tation based on observed impact. Defenders can adapt            The use case employs a simulation model as its en-
based on local or global network conditions. Attack          gagement environment. Malware contagion of a specific
completion and resource cost minimization serve as at-       spread rate is assumed. The defender decides placement
tacker objectives. Mission completion and resource cost      of mission devices and tap sensitivities in the enclaves.
minimization are the reciprocal defender objectives. RI-     The attacker decides the strength, duration and num-
VALS has a suite of coevolutionary algorithms that use       ber of attacks in an attack plan targeting all enclaves.
archiving to maintain progressive exploration and that       For a network with a set of four enclave topologies, the
support different solution concepts as fitness metrics.      framework is able to generate strong availability attack
   An example of attackers from ESTABLO on a mobile          patterns that were not identified a priori. It also identi-
resource allocation defense used in RIVALS (Sanchez          fies effective configurations that minimize mission delay
et al. 2018) is shown in Figure 4. The mobile asset          when facing these attacks.
 Attacker 2
                                                                                                                                                                                                                 2
                                                          Coev
                                                         (MEU)
                                                                                                     all possibletests. � en it removes all that arealready in thecom-
                                                                                                     pendium. It then calculates thesmallest symmetric di�erencebe-                                    1             3
                                                      MinMaxCoev

                                       Experiment
                                                      (Best-Worst)
                                                                                                     tween each remainingtest andall testsin thecompendiumanduses                                                4
                                                                                                                                                                                                   0        6
                                         Config
                                                          IPCA
                                                                         Attackers                   thisinformation to construct afrequency distribution samplebased                                                 7
                                                        (Pareto)                                                                                                                                                9
                                                                         Defenders
                                                                                                     on di�erence. It then randomly draws fromthis distribution with a                                 8                       5

GECCO’18, July 2018, Kyoto, Japan                                                                      ***************, ***************, ***************, and ***************
                                                                        Solutions                                                                                                                                    10
                                                         rIPCA
                                                        (Pareto)       Compendium                    bias that favors tests of small and largedi�erences, i.e. very or not                        15       11
                                                                                                                                                                                                                          12
                                       Configure     Run multiple
                                                                     Pool solutions and
                                                                                                     similar to thecompendium. � eresults of thesedraws becomes its                                    14       13
                                      Experiment    coevolutionary
                   Attacker 1                         algorithms
                                                                     re-evaluate fitness
                                                                                                     unseen test set. ESTABLOthen calculates the�tness measurements
                                                                                                     of solutions over theset.
                                Figure 2: Compendium population in ESTABLO.                Figure6: Top solutionsfor attackers(Figure6aand 6b) and defenders(Figure6c) alongwith therankingschemethat produced
                                                                                                                                 Solution           Selection                  Visualization
    Coevolutionary Search            Compendium Creation                        Compendium them. Figure 6d is the worst case attack
                                                                                          Step    4. Evaluation
                                                                                                     Solution
                                                                                                     nodes 2 and 4, Selection
                                                                                                                                    (red) for the top defender (green), note that even though there still exists a physical
                                                                                           path from                             - Rank
                                                                                                                    the Chord overlay   network has been compromised and was   - Selected            solutions
                                                                                                                                                                                   not able to �nd a path.                  Deploy
    - Solution concepts              - Algorithm combination                    - All vs All
                                                                                          ESTABLOnext anticipates that thedecision             maker   is working    with
                        generality, in this paper ESTABLOonly stores the�nal population                                          - Unseen data                                 - Diversity
                                                                                                      a�ackersis5.16withavarianceof
                                                                                                     only  a priori information to  3.29whilethissolution
                                                                                                                                       guideit in selectingdisplayed    between solutions
                                                                                                                                                                 atop solution.   For     produced by thedi�erent algorithms weused.
                        and archive of each run in its compendium. ESTABLOconducts                    an averagephenotypedistanceof 14.83. � eother top solution has    For this experiment, wewereableto pick up on afew trends. For
                                                                                                     thi            it           d      k ith 4di� t l ti                       ki

Figure 3: Overview of the ESTABLO framework for decision support through selection and visualization by using a
compendium of solutions from coevolutionary algorithms.


                                                                                                                    gagement environment. This allows us to explore the
                                                                                                                    dynamics between attacker and defender on a network
                                                                                                                    where the deception and reconnaissance strategies can
                                                                                                                    be adapted in response to each other (Pertierra 2018).
                                                                                                                    A deception strategy is executed through a modified
                                                                                                                    POX SDN controller. A reconnaissance strategy is exe-
                                                                                                                    cuted by a NMAP scan(Lyon 2018). The attacker strat-
                                                                                                                    egy includes choices of: which IP addresses to scan, how
Figure 4: The x axis shows a sorted subsample of at-                                                                many IP addresses to scan, which subnets to scan, the
tackers (note, the top 10 are shown and then every                                                                  percent of the subnets to scan, the scanning speed, and
tenth) and the y axis shows the ranking score. The                                                                  the type of scan. The defender strategy includes choices
ranking is done on the scores from the compendium.                                                                  of: the number of subnets to setup, the number of hon-
The values for the same run and unseen test sets are                                                                eypots, the distribution of the real hosts throughout
shown on separate lines. The algorithm used to evolve                                                               the subnets, and the number of real hosts that exist
the attacker is shown by the marker and the color. The                                                              on the network. Fitness is comprised of four compo-
attacker in the box with the solid line is the top ranked                                                           nents: how fast the defender detects that there is a
solution from the Combined Score ranking schemes.                                                                   scan taking place, the total time it takes to run the
The solution in the dashed box is the top ranked so-                                                                scan, the number of times that the defender detects the
lution from the Minimum Fitness ranking scheme.                                                                     scanner, and the number of real hosts that the scan-
                                                                                                                    ner discovers. Through experimentation and analysis,
                                                                                                                    the framework is able to discover certain configurations
Internal Reconnaissance in Software                                                                                 that the defender can use to significantly increase its
Defined Networks                                                                                                    ability to detect scans. Similarly, there are specific re-
                                                                                                                    connaissance configurations that have a better chance
Once an adversary has compromised a network end-                                                                    of being undetected.
point, they can perform network reconnaissance (Sood
and Enbody 2013). After reconnaissance provides a
view of the network and an understanding of where                                                                                                                   Conclusion
vulnerable nodes are located, they are able to execute
a plan of attack. One way to protect against recon-                                                                 We have described an AI framework that recreates, in
naissance is by obfuscating the network to delay the at-                                                            an abstract way, the adversarial, competitive coevolu-
tacker. This approach is well suited to software defined                                                            tionary process that occurs in security scenarios. We
networks (SDN) such as those being used in many cloud                                                               presented its current use cases and how we harvest de-
server settings because it requires programmability that                                                            fensive solutions from it. Future work includes extend-
they support (Kirkpatrick 2013). The SDN controller                                                                 ing it to support more cyber security applications, con-
knows which machines are actually on the network and                                                                sidering other use cases and developing more efficient
can superficially alter (without function loss) the net-                                                            or true to reality algorithms.
work view of each node, as well as place decoys (hon-
eypots) on the network to mislead, trap and slow down                                                                                                    Acknowledgments
reconnaissance.
  One such multi-component deceptive defense system                                                                 This material is based upon work supported by
(Achleitner, Laporta, and McDaniel 2016) foils scan-                                                                DARPA. The views and conclusions contained herein
ning by generating “camouflaged” versions of the actual                                                             are those of the authors and should not be interpreted
network and providing them to hosts when they renew                                                                 as necessarily representing the official policies or en-
their DHCP leases. We use this deception system and                                                                 dorsements. Either expressed or implied of Applied
mininet (Team 2018) within the framework as an en-                                                                  Communication Services, or the US Government.
                     References                              Pertierra, M. 2018. Investigating coevolutionary algo-
Achleitner, S.; Laporta, T.; and McDaniel, P. 2016.          rithms for expensive fitness evaluations in cybersecu-
Cyber deception: Virtual networks to defend insider re-      rity. Master’s thesis, Massachusetts Institute of Tech-
connaissance. In Proceedings of the 2016 International       nology.
Workshop on Managing Insider Security Threats 57–68.         Popovici, E.; Bucci, A.; Wiegand, R. P.; and De Jong,
Arora, S.; Ge, R.; Liang, Y.; Ma, T.; and Zhang,             E. D. 2012. Coevolutionary principles. In Handbook of
Y. 2017. Generalization and Equilibrium in Gen-              natural computing. Springer. 987–1033.
erative Adversarial Nets (GANs).          arXiv preprint     Prado Sanchez, D. 2018. Visualizing adversaries - trans-
arXiv:1703.00573.                                            parent pooling approaches for decision support in cy-
Bäck, T. 1996. Evolutionary Algorithms in Theory and         bersecurity. Master’s thesis, Massachusetts Institute of
Practice: Evolution Strategies, Evolutionary Program-        Technology.
ming, Genetic Algorithms. Oxford University Press.           Rothlauf, F. 2011. Design of modern heuristics: princi-
Bongard, J. C., and Lipson, H. 2005. Nonlinear               ples and application. Springer Science & Business Me-
system identification using coevolution of models and        dia.
tests. IEEE Transactions on Evolutionary Computa-            Rush, G.; Tauritz, D. R.; and Kent, A. D. 2015. Co-
tion 9(4):361–384.                                           evolutionary agent-based network defense lightweight
Ficici, S. G. 2004. Solution concepts in coevolutionary      event system (candles). In Proceedings of the Compan-
algorithms. Ph.D. Dissertation, Citeseer.                    ion Publication of the 2015 on Genetic and Evolution-
Garcia, D.; Lugo, A. E.; Hemberg, E.; and O’Reilly,          ary Computation Conference, 859–866. ACM.
U.-M. 2017. Investigating coevolutionary archive based       Sanchez, D. P.; Pertierra, M. A.; Hemberg, E.; and
genetic algorithms on cyber defense networks. In Pro-        O’Reilly, U.-M. 2018. Competitive coevolutionary al-
ceedings of the Genetic and Evolutionary Computation         gorithm decision support. In Proceedings of the Genetic
Conference Companion, GECCO ’17, 1455–1462. New              and Evolutionary Computation Conference Companion,
York, NY, USA: ACM.                                          300–301. ACM.
Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.;        Sood, A., and Enbody, R. 2013. Targeted cyberattacks:
Warde-Farley, D.; Ozair, S.; Courville, A.; and Bengio,      a superset of advanced persistent threats. IEEE security
Y. 2014. Generative adversarial nets. In Advances in         & privacy 11(1):54–61.
Neural Information Processing Systems, 2672–2680.            Tambe, M., ed. 2012. Security and Game Theory: Algo-
Hemberg, E.; Zipkin, J. R.; Skowyra, R. W.; Wagner,          rithms, Deployed Systems, Lessons Learned. Cambridge
N.; and O’Reilly, U.-M. 2018. Adversarial co-evolution       University Press.
of attack and defense in a segmented computer network        Team, M. 2018. Mininet - realistic virtual sdn network
environment. In Proceedings of the Genetic and Evo-          emulator. http://mininet.org/. [Online; accessed 6-
lutionary Computation Conference Companion, 1648–            July-2018].
1655. ACM.
                                                             Thompson, B.; Morris-King, J.; and Cam, H. 2016.
Kirkpatrick, K. 2013. Software-defined networking.           Controlling risk of data exfiltration in cyber networks
Communications of the ACM 56(9).                             due to stealthy propagating malware. In Military Com-
Lange, M.; Kott, A.; Ben-Asher, N.; Mees, W.; Baykal,        munications Conference, MILCOM 2016-2016 IEEE,
N.; Vidu, C.-M.; Merialdo, M.; Malowidzki, M.; and           479–484. IEEE.
Madahar, B. 2017. Recommendations for model-driven           Williams, N., and Mitchell, M. 2005. Investigating
paradigms for integrated approaches to cyber defense.        the success of spatial coevolution. In Proceedings of
arXiv preprint arXiv:1703.03306.                             the 7th annual conference on Genetic and evolutionary
Lyon, G. 2018. Nmap network scanner. https://nmap.           computation, 523–530. ACM.
org/. [Online; accessed 6-July-2018].                        Winterrose, M. L., and Carter, K. M. 2014. Strategic
Miconi, T. 2009. Why coevolution doesn’t "work":             evolution of adversaries against temporal platform di-
superiority and progress in coevolution. In European         versity active cyber defenses. In Proceedings of the 2014
Conference on Genetic Programming, 49–60. Springer           Symposium on Agent Directed Simulation, 9. Society
Berlin Heidelberg.                                           for Computer Simulation International.
Miller, B. P.; Fredriksen, L.; and So, B. 1990. An
empirical study of the reliability of unix utilities. Com-
munications of the ACM 33(12):32–44.
Mitchell, M. 2006. Coevolutionary learning with spa-
tially distributed populations. Computational intelli-
gence: principles and practice.
O’Neill, M., and Ryan, C. 2003. Grammatical evolu-
tion: evolutionary automatic programming in an arbi-
trary language, volume 4. Springer.