An Artificial Coevolutionary Framework for Adversarial AI

Una-May O’Reilly and Erik Hemberg
Massachusetts Institute of Technology

Abstract

Cyber adversaries are engaged in a perpetual arms race. They are continuously maneuvering to outwit the opposing posture. Replicating and studying the dynamics of these engagements provides a route to proactive, adversarially-hardened cyber defenses. The constant struggle can be computationally formulated as a competitive coevolutionary system which generates many arms races that can be harvested for robust solutions. We present a paradigm, techniques and tools that recreate the coevolutionary process in the context of network cyber security scenarios. We describe its current use cases and how we harvest defensive solutions from it.

Copyright © by the paper's authors. Copying permitted for private and academic purposes. In: Joseph Collins, Prithviraj Dasgupta, Ranjeev Mittu (eds.): Proceedings of the AAAI Fall 2018 Symposium on Adversary-Aware Learning Techniques and Trends in Cybersecurity, Arlington, VA, USA, 18-19 October, 2018, published at http://ceur-ws.org

Introduction

The greatest concern a prepared cyber defender might raise is: “What if my assumptions are wrong?” It is common knowledge that the only certainty is that an intelligent adversary will always keep trying to gain an advantage. Moreover, once forced to react, a defender is too late. So, how can a defender use Artificial Intelligence (AI) to gain an edge in an environment that is stacked to the attacker’s advantage, where the defender seems doomed to always be one step behind?

One approach, adversarial AI, is to deploy defensive configurations that consider multiple possible anticipated adversarial behaviors and already take into account their expected impact, goal, strategies or tactics. Note that the precise metrics in this accounting can vary. For example, impact can be any combination of financial cost, disruption level or outcome risk. Or, a defender could prioritize a worst case, average case or a trade-off configuration.

One way that such defensive configurations can be found is by using stochastic search methods that first explore the simulated competitive behavior of adversaries and then generate ranked configurations according to a variety of different objectives so a decision maker can choose among them. In particular, the field of coevolutionary algorithms (Popovici et al. 2012) provides search heuristics that specifically direct competitive engagements. The engagements are between members of adversarial populations with opposing objectives that each undergo selection on the basis of performance and variation to adapt. Coevolutionary logic results in population-wide adversarial dynamics. Such dynamics can expose possible adversarial behaviors that a defense would like to anticipate. A competitive coevolutionary algorithm can be a component of a larger system, see for example Figure 1, in which a complementary component sets up the environment where pairs of adversaries engage and measures the outcome for each adversary. These measures can be used by the coevolutionary algorithm to judge an adversary’s fitness.

Figure 1: Component overview of our coevolutionary adversarial AI framework. The coevolutionary component performs search over the adversary controllers. The engagement component evaluates the strategies of the adversaries and returns the measurements of the engagement.

Herein we summarize a framework that we have used to generate robust defensive configurations (Prado Sanchez 2018; Pertierra 2018). It is composed of different coevolutionary algorithms to help it generate diverse behavior. The algorithms, for further diversity, use different “solution concepts”, i.e. measures of adversarial success. Because engagements are frequently computationally expensive and have to be pairwise sampled from two populations each generation, the framework has a number of enhancements that enable efficient use of a fixed budget of computation or time.

The framework supports a number of use cases using simulation and emulation of varying model granularity. These include: A) defending a peer-to-peer network against Distributed Denial of Service (DDOS) attacks (Garcia et al. 2017); B) defenses against spreading device compromise in a segmented enterprise network (Hemberg et al. 2018); and C) deceptive defense against the internal reconnaissance of an adversary within a software defined network (Pertierra 2018).

The framework is linked up to a decision support module named ESTABLO (Sanchez et al. 2018; Prado Sanchez 2018). The engagements of every run of any of the coevolutionary algorithms are cached and, later, ESTABLO gathers adversaries resulting from different algorithms for its compendium. It then competes the adversaries of each side against those of the other side and ranks each side’s members according to multiple criteria. It also provides visualizations and comparisons of adversarial behaviors. This information informs the decision process of a defensive manager.

The adversarial AI framework’s specific contributions are:
• The use of coevolutionary algorithms to adaptively generate adversarial dynamics, supporting preemptive investigation of adversarial arms races that could occur.
• A suite of different coevolutionary algorithms that diversify the behavior of the adversaries to broaden the potential dynamics.
• Use cases that model a variety of adversarial threat and defensive models.
• A decision support module that supports selection of a superior anticipatory defensive configuration.

The Background section provides context on modeling and simulation and on coevolutionary search algorithms. The Framework Components section describes our coevolutionary method, engagement component and decision support module. The Use Cases section provides examples applying to cyber security and network attacks. The Conclusion summarizes and addresses future work.

Background

The strategy of testing the security of a system by trying to successfully attack it is somewhat analogous to software fuzzing (Miller, Fredriksen, and So 1990). Fuzzing tests software adaptively to search for bugs while adaptive attacks test defenses. In contrast to software, where a bug is fixed by humans, our approach automatically adapts a defense. This forms a novel counter attack. Fuzzing is driven by genetic algorithms (GA) whereas, to drive cyber arms races in which both adversaries adapt, our approach uses coupled GAs called competitive coevolutionary algorithms.

Coevolutionary Search Algorithms

Coevolutionary algorithms, related to evolutionary algorithms (Bäck 1996), explore domains in which the quality of a candidate solution is determined by its ability to successfully pass some set of tests. Reciprocally, a test’s quality is determined by its ability to force errors from some set of solutions. In competitive coevolution, similar to game theory, the search can lead to an arms race between test and solution, both evolving while pursuing opposite objectives (Popovici et al. 2012). An example of learning in a coevolutionary algorithm is shown in Algorithm 1. A basic coevolutionary algorithm evolves two populations with, e.g., tournament selection and uses crossover and mutation for variation. One population comprises attacks and the other defenses. In each generation, competitions are formed by pairing attack and defense. The populations are evolved in alternating steps: first the attacker population is selected, varied, updated and evaluated against the defenders, and then the defender population is selected, varied, updated and evaluated against the attackers. Each attacker-defender pair is dispatched to the engagement component to compete and the result is used as a component of fitness for each of them. Fitness is calculated over all of an adversary’s engagements.

Algorithm 1 Example Coevolutionary Algorithm
Input: T: number of iterations, L: fitness function, µ: mutation probability, λ: population size
  A_0 ← [a_{1,0}, …, a_{λ,0}] ∼ U(A)    ▷ Initialize minimizer population
  D_0 ← [d_{1,0}, …, d_{λ,0}] ∼ U(D)    ▷ Initialize maximizer population
  t ← 0    ▷ Initialize iteration counter
  repeat
    t ← t + 1    ▷ Increase counter
    A_t ← select(A_{t−1})    ▷ Selection
    A_t ← perturb(A_t, µ)    ▷ Mutation
    ▷ Best minimizer
    a'*, d'* ← arg min_{a ∈ A_t} arg max_{d ∈ D_{t−1}} L(a, d)
    ▷ Replace worst minimizer
    if L(a'*, d'*) < L(a_{λ,t−1}, d_{λ,t−1}) then
      a_{λ,t−1} ← a'*    ▷ Update population
    end if
    A_t ← A_{t−1}    ▷ Copy population
    t ← t + 1    ▷ Increase counter before alternating to maximizer
    D_t ← select(D_{t−1})    ▷ Selection
    D_t ← perturb(D_t, µ)    ▷ Mutation
    ▷ Best maximizer
    a'', d'' ← arg min_{a ∈ A_t} arg max_{d ∈ D_t} L(a, d)
    ▷ Replace worst maximizer
    if L(a'', d'') > L(a_{λ,t}, d_{λ,t−1}) then
      d_{λ,t−1} ← d''    ▷ Update population
    end if
    D_t ← D_{t−1}    ▷ Copy population
  until t ≥ T
  a*, d* ← arg min_{a ∈ A_T} arg max_{d ∈ D_T} L(a, d)    ▷ Best minimizer
  return a*, d*

The representation of tests (and solutions) is customizable in any coevolutionary algorithm (Rothlauf 2011) under the design constraint that it be amenable to stochastic variation, e.g. “genetic crossover” or mutation. It may directly express the test or it may do so indirectly, e.g. with a grammar. In the latter case, an intermediate interpreter works with a rule-based grammar to map from a “genome” that undergoes variation to a “phenome” that expresses an executable behavior. Grammars (and GA representations, in general) offer design flexibility: changing out a grammar and the environment of behavioral execution does not require any changes to the rest of the algorithm.

Coevolutionary algorithms can encounter problematic dynamics where tests are unable to improve solutions or to drive toward a solution that is the a priori intended goal. There are accepted remedies to specific coevolutionary pathologies (Bongard and Lipson 2005; Ficici 2004; Popovici et al. 2012). They generally include maintaining population diversity so that a search gradient is always present and using more explicit memory, e.g. a Hall of Fame or an archive, to prevent regress (Miconi 2009). The pathologies of coevolutionary algorithms are similar to those encountered by generative adversarial networks (GANs) (Goodfellow et al. 2014; Arora et al. 2017).

Modeling and Simulation

A coevolutionary algorithm includes an environment that supports executing the tests and solutions to compete against each other in each engagement. We use modeling and simulation for this purpose. Mod-sim systems range in complexity, level of abstraction and resolution. Modeling and simulation comprise a powerful approach, “mod-sim”, for investigating general security scenarios (Tambe 2012), computer security (Thompson, Morris-King, and Cam 2016; Lange et al. 2017; Winterrose and Carter 2014) and network dynamics in particular. For example, in CANDLES, the Coevolutionary, Agent-based, Network Defense Lightweight Event System (Rush, Tauritz, and Kent 2015), attacker and defender strategies are coevolved in the context of a single, custom, abstract computer network defense simulation.

Framework Components

Coevolutionary Algorithms

The framework supports diverse behavior by executing algorithms that vary in synchronization of the two populations and solution concepts (Prado Sanchez 2018; Pertierra 2018). Working within a fixed time or fitness evaluation budget, the framework also
1. Caches engagements to avoid repeating them;
2. Uses Gaussian process estimation to identify and evaluate the most uncertain engagement (Pertierra 2018);
3. Uses a recommender technique to approximate some adversary’s fitnesses (Pertierra 2018); and
4. Uses a spatial grid to reduce complete pairwise engagements to a Moore neighborhood quantity (Mitchell 2006; Williams and Mitchell 2005).

Engagement Environment

The engagement component is flexible and can support a problem-specific network testbed, simulator or model. The abstraction level of the use case determines the choice of a simple to more detailed mod-sim or even the actual engagement environment. Mod-sim is appropriate when testbeds incur long experimental cycle times or do not abstract away irrelevant detail.

Adversary Representation

The framework uses grammars to express open ended behavioral action sequences for attack and defense strategies (a.k.a. controllers). See Figure 2 and (O’Neill and Ryan 2003) for more details. While the framework’s grammars currently are strategic in nature, we foresee incorporating higher level behavior related to plans and goals.

Figure 2: A BNF grammar and search parameters are used as input. The grammar rewrites the integer input to a sentence. Fitness is calculated by interpreting the sentence and then evaluating it. The search component, a coevolutionary algorithm, modifies the solutions using two central mechanisms: fitness based selection and random variation.

A grammar is introduced in Backus Naur Form (BNF) and describes a language in the problem domain. The BNF description is parsed to a context free grammar representation. Its (rewrite) rules express how a sentence, i.e. test or solution, can be composed by rewriting a start symbol. The adversaries are fixed length integer vectors that are used to control the rewriting. To interpret a vector, each of its integers is referenced in sequence. The resulting sentence is the strategy that is executed. For solving different problems, it is only necessary to change the BNF grammar, engagement environment and fitness function of the adversaries. This modularity and the reusability of the parser and rewriter are software engineering and problem solving advantages. The grammar additionally helps communicate the framework’s functionality to stakeholders by enabling conversations and validation at the domain level. This contributes to stakeholder confidence in solutions and the framework.

Decision Support

Competitive coevolution has the following challenges (Sanchez et al. 2018; Prado Sanchez 2018):
1. Solutions and tests are not comparable on a “level playing field” because fitness is based solely on the context of engagements.
2. Blind spots, unvisited by the algorithms, may exist.
3. From multiple runs, with one or more algorithms, it is unclear how to automatically select a “best” solution.

The framework’s decision support module, ESTABLO, see Figure 3, addresses these challenges. ESTABLO: A) runs competitive coevolutionary search algorithms with different solution concepts; B) combines the best solutions and tests at the end of each run into a compendium; C) competes each solution against different test sets, including the compendium and a set of unseen tests, to measure its performance according to different solution concepts; D) selects the “best” solutions from the compendium using a ranking and filtering process; and E) visualizes the best solutions to support a transparent and auditable decision.

Figure 3: Overview of the ESTABLO framework for decision support through selection and visualization by using a compendium of solutions from coevolutionary algorithms.

Use Cases of the Framework

In this section we demonstrate use cases of the Adversarial AI framework. Broadly their goal is to identify defensive configurations that are effective against a range of potential adversaries.

DOS Attacks on Peer-to-Peer Networks

A peer-to-peer (P2P) network is a robust and resilient means of securing mission reliability in the face of extreme distributed denial of service (DDOS) attacks. The project named RIVALS (Garcia et al. 2017) assists in developing P2P network defense strategies against DDOS attacks. It models adversarial DDOS attack and defense dynamics to help identify robust network design and deployment configurations that support mission completion despite an ongoing attack.

RIVALS models DDOS attack strategies using a variety of behavioral languages ranging from simple to complex. A simple language allows, for example, a strategy to select one or more network servers to disable for some duration. Defenders can choose one of three different network routing protocols (shortest path, flooding and a peer-to-peer ring overlay) to try to maintain their performance. A more complex language allows a varying number of steps over which the attack is modulated in duration, strength and targets and can even include online adaptation based on observed impact. Defenders can adapt based on local or global network conditions. Attack completion and resource cost minimization serve as attacker objectives. Mission completion and resource cost minimization are the reciprocal defender objectives. RIVALS has a suite of coevolutionary algorithms that use archiving to maintain progressive exploration and that support different solution concepts as fitness metrics.

An example of attackers from ESTABLO on a mobile resource allocation defense used in RIVALS (Sanchez et al. 2018) is shown in Figure 4. The mobile asset placement defense challenge is to optimize the strategic placement of assets in the network. While under the threat of node-level DDOS attack, the defense must enable a set of tasks. It does this by fielding feasible paths between the nodes that host the assets which support the tasks. A mobile asset is, for example, mobile personnel or a software application that can be served by any number of nodes. A task is, for example, the connection that allows personnel to use a software application.

Figure 4: The x axis shows a sorted subsample of attackers (note, the top 10 are shown and then every tenth) and the y axis shows the ranking score. The ranking is done on the scores from the compendium. The values for the same run and unseen test sets are shown on separate lines. The algorithm used to evolve the attacker is shown by the marker and the color. The attacker in the box with the solid line is the top ranked solution from the Combined Score ranking scheme. The solution in the dashed box is the top ranked solution from the Minimum Fitness ranking scheme.

Availability Attacks on Segmented Networks

Attackers also often introduce malware into networks. Once an attacker has compromised a device on a network, they can move to connected devices, akin to contagion. This use case considers network segmentation, a widely recommended defensive strategy, deployed against the threat of serial network security attacks that delay the mission of the network’s operator (Hemberg et al. 2018) in the context of malware spread.

Network segmentation divides the network topologically into enclaves that serve as isolation units to deter inter-enclave contagion. How much network segmentation is helpful is a tradeoff. On the one hand, a more segmented network provides less mission efficiency because of increased overhead in inter-enclave communication. On the other hand, smaller enclaves contain compromise by limiting the spread rate, and their cleansing incurs fewer mission delays. Adding complexity, given some segmentation, a network operator can further use threat monitoring and network cleansing policies to detect and dislodge attackers, but they come with a tradeoff of cost versus efficacy.

The use case assumes a network supports an enterprise in carrying out its business or mission, and that an adversary employs availability attacks against the network to disrupt this mission. Specifically, the attacker starts by using an exploit to compromise a vulnerable device on the network. This inflicts a mission delay when a mission critical device is infected. Then, the attacker moves laterally to compromise additional devices and maximally delay the mission. The network and its segments are pre-determined but the placement of critical devices within an enclave and the deployment of defensive threat monitoring devices are open to optimization.

The use case employs a simulation model as its engagement environment. Malware contagion of a specific spread rate is assumed. The defender decides placement of mission devices and tap sensitivities in the enclaves. The attacker decides the strength, duration and number of attacks in an attack plan targeting all enclaves. For a network with a set of four enclave topologies, the framework is able to generate strong availability attack patterns that were not identified a priori. It also identifies effective configurations that minimize mission delay when facing these attacks.

Internal Reconnaissance in Software Defined Networks

Once an adversary has compromised a network endpoint, they can perform network reconnaissance (Sood and Enbody 2013). After reconnaissance provides a view of the network and an understanding of where vulnerable nodes are located, they are able to execute a plan of attack. One way to protect against reconnaissance is by obfuscating the network to delay the attacker. This approach is well suited to software defined networks (SDN) such as those being used in many cloud server settings because it requires programmability that they support (Kirkpatrick 2013). The SDN controller knows which machines are actually on the network and can superficially alter (without function loss) the network view of each node, as well as place decoys (honeypots) on the network to mislead, trap and slow down reconnaissance.

One such multi-component deceptive defense system (Achleitner, Laporta, and McDaniel 2016) foils scanning by generating “camouflaged” versions of the actual network and providing them to hosts when they renew their DHCP leases. We use this deception system and mininet (Team 2018) within the framework as an engagement environment. This allows us to explore the dynamics between attacker and defender on a network where the deception and reconnaissance strategies can be adapted in response to each other (Pertierra 2018).

A deception strategy is executed through a modified POX SDN controller. A reconnaissance strategy is executed by an Nmap scan (Lyon 2018). The attacker strategy includes choices of: which IP addresses to scan, how many IP addresses to scan, which subnets to scan, the percent of the subnets to scan, the scanning speed, and the type of scan. The defender strategy includes choices of: the number of subnets to set up, the number of honeypots, the distribution of the real hosts throughout the subnets, and the number of real hosts that exist on the network. Fitness comprises four components: how fast the defender detects that there is a scan taking place, the total time it takes to run the scan, the number of times that the defender detects the scanner, and the number of real hosts that the scanner discovers. Through experimentation and analysis, the framework is able to discover certain configurations that the defender can use to significantly increase its ability to detect scans. Similarly, there are specific reconnaissance configurations that have a better chance of being undetected.

Conclusion

We have described an AI framework that recreates, in an abstract way, the adversarial, competitive coevolutionary process that occurs in security scenarios. We presented its current use cases and how we harvest defensive solutions from it. Future work includes extending it to support more cyber security applications, considering other use cases and developing more efficient or true to reality algorithms.

Acknowledgments

This material is based upon work supported by DARPA. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Applied Communication Services or the US Government.

References

Achleitner, S.; Laporta, T.; and McDaniel, P. 2016. Cyber deception: Virtual networks to defend insider reconnaissance. In Proceedings of the 2016 International Workshop on Managing Insider Security Threats, 57–68.

Arora, S.; Ge, R.; Liang, Y.; Ma, T.; and Zhang, Y. 2017. Generalization and Equilibrium in Generative Adversarial Nets (GANs). arXiv preprint arXiv:1703.00573.

Bäck, T. 1996. Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. Oxford University Press.

Bongard, J. C., and Lipson, H. 2005. Nonlinear system identification using coevolution of models and tests. IEEE Transactions on Evolutionary Computation 9(4):361–384.

Ficici, S. G. 2004. Solution concepts in coevolutionary algorithms. Ph.D. Dissertation, Citeseer.

Garcia, D.; Lugo, A. E.; Hemberg, E.; and O’Reilly, U.-M. 2017. Investigating coevolutionary archive based genetic algorithms on cyber defense networks. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, GECCO ’17, 1455–1462. New York, NY, USA: ACM.

Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; and Bengio, Y. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems, 2672–2680.

Hemberg, E.; Zipkin, J. R.; Skowyra, R. W.; Wagner, N.; and O’Reilly, U.-M. 2018. Adversarial co-evolution of attack and defense in a segmented computer network environment. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, 1648–1655. ACM.

Kirkpatrick, K. 2013. Software-defined networking. Communications of the ACM 56(9).

Lange, M.; Kott, A.; Ben-Asher, N.; Mees, W.; Baykal, N.; Vidu, C.-M.; Merialdo, M.; Malowidzki, M.; and Madahar, B. 2017. Recommendations for model-driven paradigms for integrated approaches to cyber defense. arXiv preprint arXiv:1703.03306.

Lyon, G. 2018. Nmap network scanner. https://nmap.org/. [Online; accessed 6-July-2018].

Miconi, T. 2009. Why coevolution doesn’t "work": superiority and progress in coevolution. In European Conference on Genetic Programming, 49–60. Springer Berlin Heidelberg.

Miller, B. P.; Fredriksen, L.; and So, B. 1990. An empirical study of the reliability of unix utilities. Communications of the ACM 33(12):32–44.

Mitchell, M. 2006. Coevolutionary learning with spatially distributed populations. Computational intelligence: principles and practice.

O’Neill, M., and Ryan, C. 2003. Grammatical evolution: evolutionary automatic programming in an arbitrary language, volume 4. Springer.

Pertierra, M. 2018. Investigating coevolutionary algorithms for expensive fitness evaluations in cybersecurity. Master’s thesis, Massachusetts Institute of Technology.

Popovici, E.; Bucci, A.; Wiegand, R. P.; and De Jong, E. D. 2012. Coevolutionary principles. In Handbook of natural computing. Springer. 987–1033.

Prado Sanchez, D. 2018. Visualizing adversaries - transparent pooling approaches for decision support in cybersecurity. Master’s thesis, Massachusetts Institute of Technology.

Rothlauf, F. 2011. Design of modern heuristics: principles and application. Springer Science & Business Media.

Rush, G.; Tauritz, D. R.; and Kent, A. D. 2015. Coevolutionary agent-based network defense lightweight event system (candles). In Proceedings of the Companion Publication of the 2015 on Genetic and Evolutionary Computation Conference, 859–866. ACM.

Sanchez, D. P.; Pertierra, M. A.; Hemberg, E.; and O’Reilly, U.-M. 2018. Competitive coevolutionary algorithm decision support. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, 300–301. ACM.

Sood, A., and Enbody, R. 2013. Targeted cyberattacks: a superset of advanced persistent threats. IEEE security & privacy 11(1):54–61.

Tambe, M., ed. 2012. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press.

Team, M. 2018. Mininet - realistic virtual sdn network emulator. http://mininet.org/. [Online; accessed 6-July-2018].

Thompson, B.; Morris-King, J.; and Cam, H. 2016. Controlling risk of data exfiltration in cyber networks due to stealthy propagating malware. In Military Communications Conference, MILCOM 2016-2016 IEEE, 479–484. IEEE.

Williams, N., and Mitchell, M. 2005. Investigating the success of spatial coevolution. In Proceedings of the 7th annual conference on Genetic and evolutionary computation, 523–530. ACM.

Winterrose, M. L., and Carter, K. M. 2014. Strategic evolution of adversaries against temporal platform diversity active cyber defenses. In Proceedings of the 2014 Symposium on Agent Directed Simulation, 9. Society for Computer Simulation International.
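
Illustrative sketch: alternating coevolution

The alternating attacker and defender steps summarized in Algorithm 1 can be made concrete with a small Python sketch. It is not the framework's implementation. The toy fitness function `engagement`, the encoding of a strategy as a single number in [0, 1], Gaussian mutation, and every function name are assumptions made for illustration. Attackers minimize L and defenders maximize it. Each side is scored by its worst case against the current opposing population, which is one possible reading of the arg min / arg max step. Here T counts full attacker-plus-defender rounds, and engagements are cached so that no pair is evaluated twice.

```python
import random


def engagement(a, d):
    """Toy fitness L(a, d); a hypothetical stand-in for a real engagement."""
    return abs(a - d)


def tournament(pop, score, rng, pick, k=2):
    """Return len(pop) winners of size-k tournaments (pick is min or max)."""
    return [pick(rng.sample(pop, k), key=score) for _ in pop]


def perturb(pop, mu, rng):
    """Gaussian mutation applied with probability mu, clipped to [0, 1]."""
    return [min(1.0, max(0.0, x + rng.gauss(0.0, 0.1))) if rng.random() < mu else x
            for x in pop]


def coevolve(T=20, lam=10, mu=0.5, seed=0):
    rng = random.Random(seed)
    attackers = [rng.random() for _ in range(lam)]  # minimizer population
    defenders = [rng.random() for _ in range(lam)]  # maximizer population
    cache = {}  # engagement cache, so no pair is evaluated twice

    def L(a, d):
        if (a, d) not in cache:
            cache[(a, d)] = engagement(a, d)
        return cache[(a, d)]

    def a_score(a):  # attacker's worst case over current defenders (lower is better)
        return max(L(a, d) for d in defenders)

    def d_score(d):  # defender's worst case over current attackers (higher is better)
        return min(L(a, d) for a in attackers)

    for _ in range(T):
        # Attacker step: select, vary, then replace the worst attacker if improved.
        cand = perturb(tournament(attackers, a_score, rng, min), mu, rng)
        best = min(cand, key=a_score)
        worst = max(range(lam), key=lambda i: a_score(attackers[i]))
        if a_score(best) < a_score(attackers[worst]):
            attackers[worst] = best
        # Defender step: the mirror image, maximizing instead of minimizing.
        cand = perturb(tournament(defenders, d_score, rng, max), mu, rng)
        best = max(cand, key=d_score)
        worst = min(range(lam), key=lambda i: d_score(defenders[i]))
        if d_score(best) > d_score(defenders[worst]):
            defenders[worst] = best

    best_a = min(attackers, key=a_score)
    best_d = max(defenders, key=d_score)
    return best_a, best_d, L(best_a, best_d)
```

Calling `coevolve(T=20, lam=10)` returns one attacker, one defender and the fitness of their engagement. Swapping `engagement` for a call into a simulator or testbed is the only change the loop itself would need, which mirrors the separation between the coevolutionary and engagement components.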
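
Illustrative sketch: mapping an integer vector to a strategy

The Adversary Representation section describes fixed length integer vectors that control the rewriting of a grammar's start symbol. The sketch below shows one common way to do this, the grammatical evolution mapping of (O’Neill and Ryan 2003): the leftmost non-terminal is rewritten repeatedly, and each integer, taken modulo the number of available productions, selects the rule. The toy attack grammar, its symbols, the wrapping limit and the decision not to consume an integer when a non-terminal has only one production are assumptions of this sketch, not details taken from the framework.

```python
# Hypothetical toy grammar: an attack is one or more "disable" steps.
GRAMMAR = {
    "<attack>": [["<step>"], ["<step>", ";", "<attack>"]],
    "<step>": [["disable(", "<node>", ",", "<duration>", ")"]],
    "<node>": [["n0"], ["n1"], ["n2"]],
    "<duration>": [["5"], ["10"], ["20"]],
}


def map_genome(genome, grammar=GRAMMAR, start="<attack>", max_wraps=2):
    """Rewrite the leftmost non-terminal until only terminals remain.

    Returns the sentence (strategy) as a string, or None if the genome,
    reused up to max_wraps extra times, is exhausted before rewriting ends.
    """
    symbols = [start]  # pending symbols, leftmost first
    output = []        # terminals emitted so far
    used = 0           # number of integers consumed
    limit = len(genome) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in grammar:       # terminal: emit it
            output.append(sym)
            continue
        choices = grammar[sym]
        if len(choices) == 1:        # no choice to make, no integer consumed
            production = choices[0]
        else:
            if used >= limit:        # genome exhausted: invalid individual
                return None
            codon = genome[used % len(genome)]
            used += 1
            production = choices[codon % len(choices)]
        symbols = list(production) + symbols
    return "".join(output)
```

With this grammar, `map_genome([1, 0, 2, 0, 1, 1])` yields `disable(n0,20);disable(n1,10)`. Changing the problem domain means replacing only `GRAMMAR` and whatever interprets the resulting sentence.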
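
Illustrative sketch: Moore neighborhood engagements

Among the budget-saving enhancements listed under Coevolutionary Algorithms is a spatial grid that reduces complete pairwise engagements to a Moore neighborhood quantity. The sketch below counts the engagements that result when one attacker and one defender occupy each cell of a grid and every attacker engages only the defenders in its Moore neighborhood. The toroidal wrap-around, the inclusion of the attacker's own cell and the function names are assumptions of this sketch.

```python
def moore_neighbors(row, col, rows, cols, radius=1):
    """Cells in the Moore neighborhood of (row, col) on a toroidal grid,
    including the cell itself."""
    return [((row + dr) % rows, (col + dc) % cols)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)]


def engagement_pairs(rows, cols, radius=1):
    """All (attacker_cell, defender_cell) pairs that would be evaluated."""
    pairs = set()  # a set removes duplicates caused by wrap-around on tiny grids
    for r in range(rows):
        for c in range(cols):
            for cell in moore_neighbors(r, c, rows, cols, radius):
                pairs.add(((r, c), cell))
    return pairs
```

On a 5 x 5 grid this gives 25 x 9 = 225 engagements per generation instead of the 25 x 25 = 625 that a complete pairwise evaluation of the same two populations would require.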