Explainability via Responsibility

Faraz Khadivpour and Matthew Guzdial
Department of Computing Science, Alberta Machine Intelligence Institute (Amii)
University of Alberta, Canada
{khadivpour, guzdial}@ualberta.ca

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

Procedural Content Generation via Machine Learning (PCGML) refers to a group of methods for creating game content (e.g. platformer levels, game maps, etc.) using machine learning models. PCGML approaches rely on black box models, which can be difficult to understand and debug by human designers who do not have expert knowledge about machine learning. This can be even more challenging in co-creative systems, where human designers must interact with AI agents to generate game content. In this paper we present an approach to explainable artificial intelligence in which particular training instances are offered to human users as an explanation for an AI agent's actions during a co-creation process. We evaluate this approach by approximating its ability to provide human users with explanations of the AI agent's actions and to help them cooperate more efficiently with the AI agent.

Introduction

In science and engineering, a black box is a component whose internal logic or design cannot be directly examined. In artificial intelligence (AI), "the black box problem" refers to certain kinds of AI agents for which it is difficult or impossible to naively determine how they came to a particular decision (Zednik 2019). Explainable artificial intelligence (XAI) is an assembly of methods and techniques for dealing with the black box problem (Biran and Cotton 2017). Machine Learning (ML) is a subset of artificial intelligence that focuses on computer algorithms that automatically learn and improve through experience (Goodfellow, Bengio, and Courville 2016). The current state-of-the-art models in ML, deep neural networks, are black box models. Intuitively, it is difficult to cooperate with an individual when you cannot understand them. This is critical in co-creative systems (also called mixed-initiative systems), in which a human and an AI agent work together to produce the final output (Yannakakis, Liapis, and Alexopoulos 2014).

There is a wealth of existing methods in the field of XAI (Adadi and Berrada 2018). For example, some draw comparisons between the input and the output of a model (Cortez and Embrechts 2011; 2013; Simonyan, Vedaldi, and Zisserman 2013; Bach et al. 2016; Dabkowski and Gal 2017; Selvaraju et al. 2017), while others analyze the output in terms of the model's parameters (Boz and Hillman 2000; García, Fernández, and Herrera 2009; Letham et al. 2015; Hara and Hayashi 2018). Alternatively, there is the strategy of attempting to simplify the model (Che et al. 2015; Tan et al. 2017; Xu et al. 2018). The major difference between our approach and these previous ones is that we present a method which makes it possible to explain an AI agent's action through a detailed inspection of what it learned during the training phase.

Questions we might want to ask an AI agent include "How did you learn to do that action?" or "What did you learn that led you to make that decision?" (Cook et al. 2019). We sought to develop an approach that could answer these questions. Thus, our approach needed to find explanations for the AI agent's decisions based on its training data.

In this paper, we make use of the problem domain of a co-creative Super Mario Bros. level design agent. We use this domain since XAI is critical in co-creative systems. We introduce an approach to detect the training instance that is most responsible for an AI agent's action. We can then present the most responsible training instance to the human user as an answer to how the AI agent learned to make a particular decision. To evaluate this approach we compare the quality of these responsible training instances to random instances as explanations in two experiments on existing data.
Related Work

Our problem domain is generating explanations for a PCGML co-creative agent. Therefore we separate the prior related work into three main areas: Procedural Content Generation via Machine Learning (PCGML), co-creative systems, and Explainable Artificial Intelligence (XAI).

Procedural Content Generation via Machine Learning (PCGML)

Procedural Content Generation via Machine Learning (PCGML) is a field of research focused on the creation of game content by machine learning models that have been trained on existing game content (Summerville et al. 2018). Super Mario Bros. level design represents the most consistent area of research into PCGML. Researchers have applied many machine learning methods, such as Markov chains (Snodgrass and Ontañón 2016), Monte-Carlo Tree Search (MCTS) (Summerville, Philip, and Mateas 2015), Long Short-Term Memory recurrent neural networks (LSTMs) (Summerville and Mateas 2016), autoencoders (Jain et al. 2016), Generative Adversarial Networks (GANs) (Volz et al. 2018), and genetic algorithms with learned evaluation functions (Dahlskog and Togelius 2014), to generate these levels. In recent work, Khalifa et al. proposed a framework to generate game levels using Reinforcement Learning (RL), though they did not evaluate it in Super Mario Bros. (Khalifa et al. 2020). We also draw on reinforcement learning for our agent; however, our approach differs from this prior work in its focus on explainability.

Co-creative systems

There are numerous prior co-creative systems for game design. These approaches traditionally have not made use of ML; instead, they rely on approaches like heuristic search, evolutionary algorithms, and grammars (Smith, Whitehead, and Mateas 2010; Liapis, Yannakakis, and Togelius 2013; Yannakakis, Liapis, and Alexopoulos 2014; Deterding et al. 2017; Baldwin et al. 2017; Charity, Khalifa, and Togelius 2020). ML methods have only recently been incorporated into co-creative game content generation. Guzdial et al. proposed a Deep RL agent for co-creative Procedural Level Generation via Machine Learning (PLGML) (Guzdial, Liao, and Riedl 2018). In other recent work, Schrum et al. presented a tool for applying interactive latent variable evolution to generative adversarial network models that produce video game levels (Schrum et al. 2020). The major difference between our approach and previous ones is that it explains an AI partner's actions based on what the partner learned during training.

It is important to note that we are not actually evaluating our approach in the context of co-creative interaction with a human subject study. We are only making use of data from prior studies in which humans interacted with ML and RL agents in co-creative systems.
Explainable Artificial Intelligence (XAI)

The majority of existing XAI approaches can be separated according to which of two general methods they rely on: (A) visualizing the learned features of a model (Erhan et al. 2009; Simonyan, Vedaldi, and Zisserman 2013; Nguyen, Yosinski, and Clune 2015; 2016; Nguyen et al. 2017; Olah, Mordvintsev, and Schubert 2017; Weidele, Strobelt, and Martino 2019) and (B) demonstrating the relationships between neurons (Zeiler and Fergus 2014; Fong and Vedaldi 2017; Selvaraju et al. 2017). Olah et al. developed a unified framework that includes both (A) and (B) methods (Olah et al. 2018).

There are a few prior works focused on XAI applied to game design and game playing. Guzdial et al. presented an approach to Explainable PCGML via design patterns, in which the design patterns act as a vocabulary and mode of interaction between user and model (Guzdial et al. 2018). Ehsan et al. introduced AI rationalization, an approach for explaining agent behavior for automated game playing based on how a human would explain a similar behavior (Ehsan et al. 2018). Zhu et al. proposed a new research area of eXplainable AI for Designers (XAID) to help game designers better utilize AI and ML in their design tasks through co-creation (Zhu et al. 2018).

There exist a few approaches to explaining RL agents' actions (Puiutta and Veith 2020). Madumal et al. presented an approach that learns structural causal models to derive causal explanations of the behavior of model-free RL agents (Madumal et al. 2019). Kumar et al. presented a deep reinforcement learning approach to control an energy storage system. They visualized the learned policies of the RL agent through the course of training and presented the strategies followed by the agent to users (Kumar 2019). Cruz et al. proposed memory-based explainable reinforcement learning (MXRL), in which an agent explains the reasons why some decisions were taken in certain situations using an episodic memory (Cruz, Dazeley, and Vamplew 2019). In another recent paper, an approach was presented that employs explanations as feedback from humans in a human-in-the-loop reinforcement learning system (Guan, Verma, and Kambhampati 2020).

To the best of our knowledge, this is the first XAI work focused on the training data of a target ML model. Our approach differs from existing XAI work in its detailed inspection and alteration of the training phase.

System Overview

In this paper, we present an approach for Explainable AI (XAI) that aims to answer the question "What did the AI agent learn during training that led it to make that specific action?". As is shown in Figure 1, the general steps of the approach are as follows. First, while training a DNN, we detect the training instance (or instances) that maximally alters each neuron inside the network. Second, during testing, we pass each instance through the network and find the neuron that is most activated (Erhan, Courville, and Bengio 2010). Then, given the information from the first step, we can identify an instance (or instances) from the training data that maximally impacted the most activated neuron. We refer to this as "the most responsible training instance" for the AI agent's action. The intuition is that the user can take this explanation as something akin to the end goal of the agent taking that action. Our hope is that it will help the user decide whether to keep or remove an addition by the AI. For example, in Figure 3, given the most responsible level as the explanation, the user might keep the lower of the two Goombas, despite the fact that it seems to be floating, if they can match it to the Goombas from the most responsible level.

Figure 1: General steps of our approach.
For this purpose, we pre-trained a Deep RL agent using data from interactions of human users with three different ML level design partners (LSTM, Markov Chain, and Bayes Net) to generate Super Mario Bros. levels. This is the same Deep RL architecture and data from prior work by Guzdial et al. (Guzdial, Liao, and Riedl 2018) for co-creative Procedural Level Generation via Machine Learning (PLGML), in which they made use of the level design editor from (Guzdial et al. 2017), which is publicly available online (https://github.com/mguzdial3/Morai-Maker-Engine). The agent is designed to take in a current level design state and to output additions to that level design, in order to iteratively complete a level with a human partner.

Our training inputs are states and the outputs are the Q-table values for taking a particular action in the particular state. The input comes into the network as a state of shape (40x15x34). The 40 is the width and 15 is the height of a level chunk. At each x,y location there are 34 possible level components (e.g. ground, goomba, pipe, mushroom, tree, Mario, flag, ...) that could be placed there. As is shown in the visualized architecture of the Convolutional Neural Network (CNN) in Figure 2, it has three convolutional layers and a fully connected layer followed by a reshaping function to put the output in the form of the action matrix, which is (40x15x32). The player (Mario) and the flag are level entities that cannot be counted as an action, so there are 32 possible action components instead of the 34 state entities. Our activation function is Leaky ReLU for every layer, the loss function is mean squared error, and the optimizer is Adam, with the network built in TensorFlow (Abadi et al. 2016). We make use of this existing agent and data since it is the only example of a co-creative PCGML agent where the data from a human subject study is publicly available.

Figure 2: Architecture of our Convolutional Neural Network (CNN).
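To make the architecture description above concrete, the following is a minimal Keras sketch of a network with matching shapes. The filter counts, kernel sizes, activations, loss, and optimizer follow the text; the padding, strides, fully connected width, and layer names ("conv1", "conv2", "conv3") are assumptions, not a reproduction of the original implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_level_design_cnn():
    """Sketch of the described CNN: a 40x15 level chunk with 34 possible
    components in, a 40x15x32 matrix of action Q-values out."""
    inputs = layers.Input(shape=(40, 15, 34))
    # Three convolutional layers; 'same' padding keeps the 40x15 spatial shape.
    x = layers.Conv2D(8, (4, 4), padding="same", name="conv1")(inputs)
    x = layers.LeakyReLU()(x)
    x = layers.Conv2D(16, (3, 3), padding="same", name="conv2")(x)
    x = layers.LeakyReLU()(x)
    x = layers.Conv2D(32, (3, 3), padding="same", name="conv3")(x)
    x = layers.LeakyReLU()(x)
    # Fully connected layer reshaped into the 40x15x32 action matrix.
    x = layers.Flatten()(x)
    x = layers.Dense(40 * 15 * 32)(x)
    x = layers.LeakyReLU()(x)
    outputs = layers.Reshape((40, 15, 32))(x)

    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model
```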
During each training epoch we employ a batch size of one to track when each training instance passes through the network. We calculate and store the change in neuron weights between batches. After training, by summing over the changes of each neuron weight with respect to the training data, we are able to identify which training instance maximally alters each neuron. Since positive and negative values can counteract each other's effects, it is important not to take absolute values until the end of training. We sum and store this information inside eight arrays of shape (4x4x34) for the first convolutional layer, 16 arrays of shape (3x3x8) for the second convolutional layer, and 32 arrays of shape (3x3x16) for the third convolutional layer. These are the shapes of the filters in each layer. We name these arrays Most Responsible Instance for each Neuron in each Convolutional layer (MRIN-Conv1, MRIN-Conv2, and MRIN-Conv3). These data representations link neurons to IDs representing a particular instance of a human user working with the AI in the co-creative tool. We can then search these arrays and find the ID of the training instance that is most responsible for changes to a particular weight.
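As an illustration of this bookkeeping, the sketch below accumulates per-instance weight changes for one convolutional layer and reduces them to an MRIN array. The training loop and the helper `train_step` are assumptions; only the accounting scheme (signed deltas accumulated per instance, with absolute values taken only at the end) follows the text. For the first convolutional layer of the sketch above, the kernel has shape (4, 4, 34, 8), i.e. eight (4x4x34) filters, matching the eight MRIN-Conv1 arrays.

```python
import numpy as np

def build_mrin_for_layer(model, layer_name, dataset, train_step, num_epochs=1):
    """For each weight in `layer_name`, record which training instance caused
    the largest total change to it (the MRIN array for that layer)."""
    get_kernel = lambda: model.get_layer(layer_name).get_weights()[0].copy()

    # One delta array per training instance, same shape as the layer kernel.
    # This is illustrative only; a real run would want a more memory-frugal scheme.
    deltas = {i: np.zeros_like(get_kernel()) for i in range(len(dataset))}

    for _ in range(num_epochs):
        for instance_id, (state, q_target) in enumerate(dataset):
            before = get_kernel()
            train_step(model, state, q_target)   # batch size of one
            after = get_kernel()
            # Keep the sign: positive and negative updates may cancel out.
            deltas[instance_id] += after - before

    # Only now take absolute values and pick, per weight, the instance whose
    # accumulated change is largest.
    stacked = np.stack([np.abs(deltas[i]) for i in range(len(dataset))])
    return np.argmax(stacked, axis=0)            # same shape as the kernel
```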
Our end goal is to determine the most responsible training instance for a particular prediction made by our trained CNN. To do that, we need to find out what part of the network was most important in making that prediction. We can then determine the most responsible instance for the final weights of this most important part of the network. The most activated filter of each convolutional layer is the filter that contributes the slice with the largest magnitude in the output of that layer. Hence the most activated filter can be considered the most important part of the convolutional layer for that specific test instance (Erhan, Courville, and Bengio 2010). For example, we pass a test instance into the network. A test instance is a (40x15x34) state that is a chunk of a partially designed level. Since the first convolutional layer has eight 4x4x34 filters with same padding, its output has shape (40x15x8). We then find the (40x15) slice with the largest values. The most activated filter is the (4x4x34) array in our convolutional layer which led to the slice with the greatest magnitude.

Finally, once we have the maximally activated filter we can identify the most responsible training instance (or instances) by querying the MRIN-Conv arrays we built during training. The most responsible training instance is the ID that is most repeated in the MRIN-Conv array associated with the maximally activated filter. We chose the most repeated ID since it is the one that most frequently impacted the majority of the neurons in the filter during training.
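The sketch below puts these two steps together for the first convolutional layer: find the filter whose (40x15) output slice has the largest magnitude for a given test state, then take the most frequent instance ID in the corresponding slice of the MRIN array built above. The sub-model construction and the summed-absolute-value reading of "largest magnitude" are assumptions; the selection of the most repeated ID follows the text.

```python
import numpy as np
import tensorflow as tf

def most_responsible_instance(model, mrin_conv1, state):
    """Return the ID of the most responsible training instance for the
    prediction the network makes on `state` (a 40x15x34 level chunk)."""
    # Sub-model exposing the first conv layer's output, shape (40, 15, 8).
    conv1_activations = tf.keras.Model(
        model.input, model.get_layer("conv1").output)
    activations = conv1_activations(
        state[np.newaxis, ...].astype("float32"))[0].numpy()

    # One reading of "largest magnitude": the filter whose (40x15) slice has
    # the largest summed absolute activation.
    filter_idx = int(np.argmax(np.abs(activations).sum(axis=(0, 1))))

    # mrin_conv1 has the kernel's shape, e.g. (4, 4, 34, 8); each entry is the
    # ID of the instance that most changed that weight during training.
    ids_for_filter = mrin_conv1[..., filter_idx].ravel()

    # The most responsible instance is the ID that appears most often.
    values, counts = np.unique(ids_for_filter, return_counts=True)
    return int(values[np.argmax(counts)])
```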
Evaluation

In this section, we present two evaluations of our system. We call the first evaluation our "Explainability Evaluation" as it addresses the ability of our system to provide explanations that help a user predict an AI agent's actions. We call the second evaluation our "User Labeling Error Evaluation" as it addresses the ability of our system to help human users identify positive and negative AI additions during the co-creative process. Both evaluations approximate the impact of our approach on human partners by using existing data of AI-human interactions. Essentially, we act as though the pre-recorded actions of the AI agent were outputs from our Deep RL agent and identify the responsible training instances as if this were the case. Because our system derives examples as explanations for the behavior of a co-creative Deep RL agent, a human subject study would be the natural way to evaluate our system. However, prior to a human subject study, we first wanted to gather some evidence of the value of this approach.

Explainability Evaluation

The first claim we made was that this approach can help human users better understand and predict the actions of an AI agent. In this experiment we use the most responsible level as an approximation of the AI agent's goal, in other words what final level the AI agent is working towards. The most responsible level refers to the level at the end of a human user's interactions with an AI agent. We identify this level by finding the most responsible training instance as above and identifying the level at the end of that training sequence. This experiment is meant to determine if this can help a user predict the AI agent's actions. To do this, we passed test instances into our network and found the most responsible training instances. We then compared the most responsible level for some current test instance to the AI agent's action in the next test instance. If the most responsible level is similar to the action, it would indicate that the most responsible level can be a potential explanation for the AI agent's action by priming the user to better predict future actions by the AI agent. In comparison, we randomly selected 20 levels from the training data and found their similarities to the AI agent's action in the next test instance. If our approach outperforms the random levels, it will support the claim that the responsible level is better suited to helping predict future AI agent actions compared to random levels.

We used two different sets of test data:
(A) Our first testset is derived from a study in which users interacted with pairs of three different ML agents, as mentioned in our System Overview section (Guzdial, Liao, and Riedl 2018). We used the same testset identified in that paper.
(B) Our second testset is obtained from a study in which expert level designer users interacted with the trained Deep RL agent (Guzdial et al. 2019).

If we find success with the first testset, that would indicate that our trained Deep RL agent is a good surrogate for the original three ML agents, since we would in effect be predicting the next action of one of these agents. Good results for the second testset would demonstrate the capability for prediction of the Deep RL agent's actions itself. Since the first convolutional layer is the layer that most directly reasons over the level structure, we decided to find the most responsible training instance of just the first convolutional layer. However, this setup puts our approach at a disadvantage, since we are going to compare only one most responsible level to 20 random ones.

For comparing the most responsible level and the random levels to the actions, we needed to define a suitable metric. We desired a metric that detects local overlaps and represents the similarity between a level and an action. We wanted to pick square windows which are not the same size as the first convolutional layer's filters, to capture some local structure without biasing the metric too far towards our first convolutional layer. As a result, we found all three-by-three non-empty patches for both a given level and an action. Then we counted the number of exact matches of these patches on both sides, removing matched patches from the set since we wanted to count each patch only once. Finally, we divided the total number of matched patches by the total number of patches in the action, since this was always smaller than the number from the level. We refer to this metric as the local overlap ratio.
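A minimal sketch of this metric is given below, assuming levels and actions are encoded as (40x15) arrays of component indices with 0 denoting an empty tile; the exact encoding is an assumption, while the patch extraction, single-count matching, and normalization follow the description above.

```python
import numpy as np

def local_overlap_ratio(level, action, empty=0):
    """Local overlap ratio: matched 3x3 non-empty patches divided by the
    number of non-empty 3x3 patches in the action."""
    def patches(grid):
        grid = np.asarray(grid)
        out = []
        h, w = grid.shape
        for x in range(h - 2):
            for y in range(w - 2):
                patch = grid[x:x + 3, y:y + 3]
                if np.any(patch != empty):          # skip entirely empty windows
                    out.append(tuple(patch.ravel()))
        return out

    level_patches = patches(level)
    action_patches = patches(action)

    matches = 0
    for patch in action_patches:
        if patch in level_patches:
            level_patches.remove(patch)             # count each match only once
            matches += 1
    # The action always yields fewer non-empty patches than the full level.
    return matches / max(len(action_patches), 1)
```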
Explainability Evaluation Results

We had 242 samples in the first testset and 69 samples in the second one. Since we wanted to compare instances in which the AI agent actually made substantial changes, we chose instances where the AI agent added more than 10 components in its next action. This left 38 and 46 instances from the first and second testsets, respectively.

Our approach outperforms the random baseline in 78.94 percent of the 38 instances for the ML agents data and 67.29 percent of the 46 instances for the Deep RL agent data. The average of the local overlap ratios is shown in Table 1 (higher is better). The minimum value of this metric is 0 for zero overlap and the maximum value is 1 for complete overlap between the action and the most responsible level or the random level. This normalization means that even small differences in this metric represent large perceptual differences. For example, a 0.04 difference in the local overlap ratio between the most responsible level and the random levels in Table 1 indicates the most responsible level has 20 more three-by-three non-empty overlaps. We expect that the reason the Deep RL agent values are generally lower is that the second study made use of published level designers rather than novices and an adaptive Deep RL agent, meaning that there was more varied behavior compared with the three ML agents.

TestSet      Most Responsible Level    Random Levels
ML Agents    0.4653                    0.3841
Deep RL      0.2880                    0.2472

Table 1: The average local overlap ratio of the most responsible levels compared to the average of the random levels for both testsets.

An example of explainability is demonstrated in Figure 3. As is shown in the figure, the AI agent made an action and added some components (e.g. a goomba and ground) to the existing state. By looking at the chunk of the most responsible level, the user might realize that the AI agent wants to generate a level including some goombas as enemies and some blocks in the middle of the screen. The AI agent also added ground at the bottom and top of the screen, which the user could identify as being consistent with both their input to the agent and the most responsible level.

Figure 3: An example of explaining an AI agent's action by presenting the most responsible level.

User Labeling Error Evaluation

For the second evaluation, we wanted to get some sense of whether this approach could be successful in assisting a human user in better understanding good and bad agent actions during the co-creation process. To do this, we needed to identify specific instances where our tool could be helpful in the data we have available. We defined two such concepts, based on the interactions between users and the AI partner during level generation:
(A) False-positive decisions are additions by the AI partner that the user kept at first but then deleted later.
(B) False-negative decisions are additions by the AI partner that the user deleted at first but then added later.

Given these concepts, if we could help the user avoid making these kinds of decisions, our approach could help a human user during level generation. We anticipated that one reason users made these kinds of decisions was a lack of context for the AI agent's action. Thus, if the user had context they might not delete or keep what they would otherwise keep or delete, respectively.

To accomplish this, we implemented an algorithmic way to determine false-positives and false-negatives among the two testsets described in the previous evaluation, as sketched below. In this algorithm, we first find all user decisions in terms of deleting or keeping an addition by the AI agent. Then we look at the level at the end of the user and AI agent's interaction. If a deleted AI addition exists in the final level, it is counted as a false-negative example, and if a kept addition does not exist in the final level it is counted as a false-positive example.

Once we discovered all false-negative and false-positive examples, we found the state before the example was added by the AI agent and named it the Introduction-state (I-state). We found the state in which the false-positivity or false-negativity occurred (i.e. when a user re-added a false-negative or deleted a false-positive) and named it the Contradiction-state (C-state). Since some change between the I-state and the C-state led to the user altering their decision, we wanted to see some sign that presenting the most responsible level to the user could change their mind before they reached this point. Thus we compared these two states to find all the changes that the AI agent or the user made and named this the Difference-state (D-state).

We compared each D-state with the final generated level derived from the most responsible training instance. We also compared each D-state with 20 other randomly selected levels from the existing data. For the comparison, we used the local overlap ratio defined in the previous evaluation. If our approach outperforms the random baseline, we will be able to say that there is some support for the responsible level helping the user avoid false-positives and false-negatives in comparison to random levels.
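The following is a minimal sketch of the labeling step under an assumed log format: each interaction is a time-ordered list of (actor, event, component, position) tuples plus the final level. The log format and helper names are assumptions; only the classification rule (deleted-but-present is a false negative, kept-but-absent is a false positive) follows the text.

```python
def label_user_decisions(interaction, final_level):
    """Classify AI additions as false-positive or false-negative decisions.

    `interaction`: time-ordered list of (actor, event, component, position);
    `final_level`: dict mapping position -> component at the session's end.
    """
    labels = {}
    for i, (actor, event, component, position) in enumerate(interaction):
        if actor != "ai" or event != "add":
            continue
        # The user's first decision on this addition: did they delete it?
        deleted_first = next(
            (e == "delete" for a, e, c, p in interaction[i + 1:]
             if a == "user" and c == component and p == position),
            False)
        present_at_end = final_level.get(position) == component

        if deleted_first and present_at_end:
            # Deleted at first, but back in the final level: false negative.
            labels[(component, position)] = "false_negative"
        elif not deleted_first and not present_at_end:
            # Kept at first, but gone from the final level: false positive.
            labels[(component, position)] = "false_positive"
    return labels
```

A similar pass over the log can then recover the I-state (the state just before the AI addition) and the C-state (the state at the contradicting user edit); their difference gives the D-state compared against the most responsible level with the local overlap ratio above.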
User Labeling Error Evaluation Results

We found five false-negative and 24 false-positive examples in the first testset, and five false-negative and 54 false-positive examples in the second one. The results of the evaluation are shown in Figure 4.

Figure 4: Results of the User Labeling Error Evaluation.

For the first dataset, which included the actions of the three ML agents, our approach outperformed the random baseline in 65.51 percent of the examples. The average of the local overlap ratio values for our approach was 0.1717, which is higher than the 0.1647 for the random levels. For the second dataset, obtained from the Deep RL agent, our approach outperformed the baseline in 59.32 percent of the examples. The averages of the local overlap ratio values were 0.2665 and 0.2328 for the most responsible level and the random levels, respectively. Again this represents a large perceptual difference of roughly 15 more non-empty 3x3 overlaps.

Interestingly, our approach outperforms the random levels in all of the false-negative examples in the second dataset, compared with just 20 percent of the false-negatives in the first dataset. Further, our approach performs around 1.5 times better than the random levels in 15 false-positive examples in the second dataset. These instances come from the study that used the same RL agent as we used to derive our explanations, which could account for this performance.

Discussion

In this paper, we present an XAI approach for a pre-trained Deep RL agent. Our hypothesis was that our method could be helpful to human users. We evaluated it by approximating this process for two tasks using two existing datasets. These datasets were obtained from studies using three ML partners and an RL agent. Essentially, we used the XAI-enabled agent in this paper as if it were the agents used in these datasets. The results of our first evaluation demonstrate that our method is able to present examples as explanations that help users predict an agent's next action. The results of our second evaluation support our hypothesis and give us an initial signal that this approach could help human users more efficiently cooperate with a Deep RL agent. This indicates the ability of our approach to help human designers by presenting an explanation for an AI agent's actions during a co-creation process.

A human subject study would be a more reasonable way to evaluate this system, since human users might be able to derive meaning from the responsible level that our similarity metric could not capture. Our approach performs better than our baseline of random levels in both evaluation methods, and this presents evidence towards its value at this task. However, we look forward to investigating a human subject study in order to fully validate these results.

There could be other alternatives to a human subject study. For example, a secondary AI agent that predicts our primary AI agent's actions could play a human partner's role in the co-creative system. Thus, making use of a secondary AI agent to evaluate our system before running a human subject study might be a simple next step.

It is important to mention that we only offer one most responsible level, from only the first convolutional layer, as an explanation. Providing a user with multiple responsible levels, or looking into the most responsible levels of the other layers, could be a potential way to further improve our approach. Our metric for determining the most responsible training instance is based on finding the most repeated instance inside the MRIN-Conv arrays associated with the most activated filter. We identified the most activated filter by looking at the absolute values. We plan to investigate other metrics, such as looking for the most activated neurons outside of the filters. In addition, considering negative and positive values separately in the maximal activation process could also lead to improved behavior. Negative values might indicate that an instance negatively impacted a neuron. It could then be the case that a filter is maximally activated because it was giving a very strong signal against some action.

One quirk of our current approach is that the most responsible training instance depends on the order in which the training data was presented to the model during training. Thus, this measure does not tell us about any inherent quality of a particular training data instance, only its relevance to a particular model that has undergone a particular training regimen.

In the future, we intend to explore how more general representations of responsibility, such as Shapley values, might intersect with this approach (Ghorbani and Zou 2019). Only the domain of a co-creative system for designing Super Mario Bros. levels is explored in this paper. Thus, making use of other games will be required to ensure this is a general method for level design co-creativity. Beyond that, we anticipate a need to demonstrate our approach on different domains outside of games. We look forward to running another study to apply our approach to human-in-the-loop reinforcement learning or other co-creative domains.

Conclusions

In this paper we present an approach to XAI that provides human users with the most responsible training instance as an explanation for an AI agent's action. In support of this approach, we present results from two evaluations. The first evaluation demonstrates the ability of our approach to offer explanations and to help a human partner predict an AI agent's actions. The second evaluation demonstrates the ability of our approach to help human users better identify good and bad instances of an AI agent's behavior. To the best of our knowledge this represents the first XAI approach focused on training instances.

Acknowledgements
We acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Alberta Machine Intelligence Institute (Amii).

References

Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. 2016. TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 265–283.

Adadi, A., and Berrada, M. 2018. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access 6:52138–52160.

Bach, S.; Binder, A.; Müller, K.-R.; and Samek, W. 2016. Controlling explanatory heatmap resolution and semantics via decomposition depth. In 2016 IEEE International Conference on Image Processing (ICIP), 2271–2275. IEEE.

Baldwin, A.; Dahlskog, S.; Font, J. M.; and Holmberg, J. 2017. Mixed-initiative procedural generation of dungeons using game design patterns. In 2017 IEEE Conference on Computational Intelligence and Games (CIG), 25–32. IEEE.

Biran, O., and Cotton, C. 2017. Explanation and justification in machine learning: A survey. In IJCAI-17 Workshop on Explainable AI (XAI), volume 8.

Boz, O., and Hillman, D. 2000. Converting a trained neural network to a decision tree: DecText decision tree extractor. Citeseer.

Charity, M.; Khalifa, A.; and Togelius, J. 2020. Baba is y'all: Collaborative mixed-initiative level design. arXiv preprint arXiv:2003.14294.

Che, Z.; Purushotham, S.; Khemani, R.; and Liu, Y. 2015. Distilling knowledge from deep networks with applications to healthcare domain. arXiv preprint arXiv:1512.03542.

Cook, M.; Colton, S.; Pease, A.; and Llano, M. T. 2019. Framing in computational creativity: A survey and taxonomy. In ICCC, 156–163.
Cortez, P., and Embrechts, M. J. 2011. Opening black box data mining models using sensitivity analysis. In 2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM), 341–348. IEEE.

Cortez, P., and Embrechts, M. J. 2013. Using sensitivity analysis and visualization techniques to open black box data mining models. Information Sciences 225:1–17.

Cruz, F.; Dazeley, R.; and Vamplew, P. 2019. Memory-based explainable reinforcement learning. In Australasian Joint Conference on Artificial Intelligence, 66–77. Springer.

Dabkowski, P., and Gal, Y. 2017. Real time image saliency for black box classifiers. In Advances in Neural Information Processing Systems, 6967–6976.

Dahlskog, S., and Togelius, J. 2014. A multi-level level generator. In 2014 IEEE Conference on Computational Intelligence and Games, 1–8. IEEE.

Deterding, S.; Hook, J.; Fiebrink, R.; Gillies, M.; Gow, J.; Akten, M.; Smith, G.; Liapis, A.; and Compton, K. 2017. Mixed-initiative creative interfaces. In Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems, 628–635.

Ehsan, U.; Harrison, B.; Chan, L.; and Riedl, M. O. 2018. Rationalization: A neural machine translation approach to generating natural language explanations. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, 81–87.

Erhan, D.; Bengio, Y.; Courville, A.; and Vincent, P. 2009. Visualizing higher-layer features of a deep network. University of Montreal 1341(3):1.

Erhan, D.; Courville, A.; and Bengio, Y. 2010. Understanding representations learned in deep architectures. Département d'Informatique et Recherche Opérationnelle, University of Montreal, QC, Canada, Tech. Rep. 1355:1.

Fong, R. C., and Vedaldi, A. 2017. Interpretable explanations of black boxes by meaningful perturbation. In Proceedings of the IEEE International Conference on Computer Vision, 3429–3437.

García, S.; Fernández, A.; and Herrera, F. 2009. Enhancing the effectiveness and interpretability of decision tree and rule induction classifiers with evolutionary training set selection over imbalanced problems. Applied Soft Computing 9(4):1304–1314.

Ghorbani, A., and Zou, J. 2019. Data Shapley: Equitable valuation of data for machine learning. arXiv preprint arXiv:1904.02868.

Goodfellow, I.; Bengio, Y.; and Courville, A. 2016. Deep Learning. MIT Press.

Guan, L.; Verma, M.; and Kambhampati, S. 2020. Explanation augmented feedback in human-in-the-loop reinforcement learning. arXiv preprint arXiv:2006.14804.

Guzdial, M. J.; Chen, J.; Chen, S.-Y.; and Riedl, M. 2017. A general level design editor for co-creative level design. In Thirteenth Artificial Intelligence and Interactive Digital Entertainment Conference.

Guzdial, M.; Liao, N.; Chen, J.; Chen, S.-Y.; Shah, S.; Shah, V.; Reno, J.; Smith, G.; and Riedl, M. O. 2019. Friend, collaborator, student, manager: How design of an AI-driven game level editor affects creators. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–13.

Guzdial, M.; Liao, N.; and Riedl, M. 2018. Co-creative level design via machine learning. arXiv preprint arXiv:1809.09420.

Guzdial, M.; Reno, J.; Chen, J.; Smith, G.; and Riedl, M. 2018. Explainable PCGML via game design patterns. arXiv preprint arXiv:1809.09419.

Hara, S., and Hayashi, K. 2018. Making tree ensembles interpretable: A Bayesian model selection approach. In International Conference on Artificial Intelligence and Statistics, 77–85.

Jain, R.; Isaksen, A.; Holmgård, C.; and Togelius, J. 2016. Autoencoders for level generation, repair, and recognition. In Proceedings of the ICCC Workshop on Computational Creativity and Games.

Khalifa, A.; Bontrager, P.; Earle, S.; and Togelius, J. 2020. PCGRL: Procedural content generation via reinforcement learning. arXiv preprint arXiv:2001.09212.

Kumar, H. 2019. Explainable AI: Deep reinforcement learning agents for residential demand side cost savings in smart grids. arXiv preprint arXiv:1910.08719.

Letham, B.; Rudin, C.; McCormick, T. H.; Madigan, D.; et al. 2015. Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model. The Annals of Applied Statistics 9(3):1350–1371.
Liapis, A.; Yannakakis, G. N.; and Togelius, J. 2013. Sentient Sketchbook: Computer-assisted game level authoring.

Madumal, P.; Miller, T.; Sonenberg, L.; and Vetere, F. 2019. Explainable reinforcement learning through a causal lens. arXiv preprint arXiv:1905.10958.

Nguyen, A.; Clune, J.; Bengio, Y.; Dosovitskiy, A.; and Yosinski, J. 2017. Plug & play generative networks: Conditional iterative generation of images in latent space. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4467–4477.

Nguyen, A.; Yosinski, J.; and Clune, J. 2015. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 427–436.

Nguyen, A.; Yosinski, J.; and Clune, J. 2016. Multifaceted feature visualization: Uncovering the different types of features learned by each neuron in deep neural networks. arXiv preprint arXiv:1602.03616.

Olah, C.; Satyanarayan, A.; Johnson, I.; Carter, S.; Schubert, L.; Ye, K.; and Mordvintsev, A. 2018. The building blocks of interpretability. Distill 3(3):e10.

Olah, C.; Mordvintsev, A.; and Schubert, L. 2017. Feature visualization. Distill 2(11):e7.

Puiutta, E., and Veith, E. 2020. Explainable reinforcement learning: A survey. arXiv preprint arXiv:2005.06247.

Schrum, J.; Gutierrez, J.; Volz, V.; Liu, J.; Lucas, S.; and Risi, S. 2020. Interactive evolution and exploration within latent level-design space of generative adversarial networks. arXiv preprint arXiv:2004.00151.

Selvaraju, R. R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; and Batra, D. 2017. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, 618–626.

Simonyan, K.; Vedaldi, A.; and Zisserman, A. 2013. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034.

Smith, G.; Whitehead, J.; and Mateas, M. 2010. Tanagra: A mixed-initiative level design tool. In Proceedings of the Fifth International Conference on the Foundations of Digital Games, 209–216.

Snodgrass, S., and Ontañón, S. 2016. Learning to generate video game maps using Markov models. IEEE Transactions on Computational Intelligence and AI in Games 9(4):410–422.

Summerville, A., and Mateas, M. 2016. Super Mario as a string: Platformer level generation via LSTMs. arXiv preprint arXiv:1603.00930.

Summerville, A. J.; Philip, S.; and Mateas, M. 2015. MCMCTS PCG 4 SMB: Monte Carlo tree search to guide platformer level generation. In Eleventh Artificial Intelligence and Interactive Digital Entertainment Conference.
Summerville, A.; Snodgrass, S.; Guzdial, M.; Holmgård, C.; Hoover, A. K.; Isaksen, A.; Nealen, A.; and Togelius, J. 2018. Procedural content generation via machine learning (PCGML). IEEE Transactions on Games 10(3):257–270.

Tan, S.; Caruana, R.; Hooker, G.; and Lou, Y. 2017. Detecting bias in black-box models using transparent model distillation. arXiv preprint arXiv:1710.06169.

Volz, V.; Schrum, J.; Liu, J.; Lucas, S. M.; Smith, A.; and Risi, S. 2018. Evolving Mario levels in the latent space of a deep convolutional generative adversarial network. In Proceedings of the Genetic and Evolutionary Computation Conference, 221–228.

Weidele, D.; Strobelt, H.; and Martino, M. 2019. Deepling: A visual interpretability system for convolutional neural networks. In Proceedings of SysML.

Xu, K.; Park, D. H.; Yi, C.; and Sutton, C. 2018. Interpreting deep classifier by visual distillation of dark knowledge. arXiv preprint arXiv:1803.04042.

Yannakakis, G. N.; Liapis, A.; and Alexopoulos, C. 2014. Mixed-initiative co-creativity.

Zednik, C. 2019. Solving the black box problem: A normative framework for explainable artificial intelligence. Philosophy & Technology 1–24.

Zeiler, M. D., and Fergus, R. 2014. Visualizing and understanding convolutional networks. In European Conference on Computer Vision, 818–833. Springer.

Zhu, J.; Liapis, A.; Risi, S.; Bidarra, R.; and Youngblood, G. M. 2018. Explainable AI for designers: A human-centered perspective on mixed-initiative co-creation. In 2018 IEEE Conference on Computational Intelligence and Games (CIG), 1–8. IEEE.