Quest: A Query-driven Explanation Framework for Black-Box Classifiers on Tabular Data

Nadja Geisler
Technical University of Darmstadt (TU Darmstadt), Department of Computer Science, Hochschulstraße 10, 64289 Darmstadt, Germany
nadja.geisler@cs.tu-darmstadt.de · https://www.dm.tu-darmstadt.de/ · ORCID: 0000-0002-5245-6718

DESIRES 2021 – 2nd International Conference on Design of Experimental Search & Information REtrieval Systems, September 15–18, 2021, Padua, Italy
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

Keywords: XAI, post-hoc explanation, model-agnostic, classification, black box

Figure 1: LIME/Anchors explanations near the decision boundary in the UCI adult dataset (from [1]).
Figure 2: Toy examples of areas for LIME, Anchors & Quest near the decision boundary (adapted from [1]).

Explainability efforts are well established in the ML and AI communities by now, with local, model-agnostic approaches currently being the tool of choice in information retrieval [2] and search [3] as well as many other areas. One major challenge in the field is the lack of sophisticated approaches for tabular/relational data, as opposed to text or images. Generic approaches, e.g., feature importance, limit expressiveness and readability.

LIME [4] still remains the basis of many approaches for local, model-agnostic explanations. It was adapted to tabular data to serve as a baseline for Anchors [1]. Both approaches use a white-box (surrogate) model to approximate a black-box model locally in order to explain its decisions. However, their resulting explanations are limited: LIME (as implemented by the authors) focuses on feature importance, while Anchors are derived as simple predicates that form if-then rules. Figure 1 shows examples for the UCI adult data set [5].

As an alternative, we suggest Quest, a framework for query-driven post-hoc explanations of individual classifier decisions on tabular data.

Query-driven explanation

First, we introduce a more expressive representation of explanations consisting of query predicates. A custom set of common query predicates is extremely expressive while still compact. Queries such as capital-gain > capital-loss or rel == 'married' AND children > 1 also have the benefit of being easily converted into the WHERE clause of a SQL statement, to be executed directly on any relational database. Still more importantly, using queries to explain black-box model behavior within local boundaries has the advantage of explaining not only why a model produced an outcome but also why not!

An explanation produced by Quest can be thought of as boundaries and a decision surface that separates classes within the boundaries. Samples on one side of the decision surface within the boundaries form the result set of a query Q. Alongside Q ("Why?") stands its complement Q̄ ("Why not?"), whose result set covers the samples on the opposite side of the decision surface. This view of queries as explanations gives the user an intuitive way of thinking about the local neighborhood, supports generalization on the user's side, and keeps the focus on the data instead of the surrogate model. The combination of Q and Q̄ ensures an explanation from both sides of the decision surface, something established systems for this task lack.

The query representation can be limited to a small number of operators without loss of expressiveness, and normalized to facilitate comparison and the elimination of duplicates. This is achieved through application-specific, user-defined functions for complex relationships and by reducing the set of logical operators while retaining functional completeness. An example would be the disjunctive normal form (DNF), using only AND, OR, and NOT operators.

We impose a complexity budget on Q/Q̄ to ensure users are able to understand them well; this budget can be adapted to the target group. The complexity of a query can be thought of as the number and type of its predicates, making it easily computable and comparable.

Framework approach

We now need to determine Q/Q̄ for a given data point such that it best explains the behavior of the black-box model in the local neighborhood within a fixed complexity budget. To leverage the flexibility of the query language, a very complex approach of generating queries suitable to a wide range of scenarios (numerical/categorical attributes, data distributions, sparsity, dependencies between attribute values, noise level, . . . ) would be necessary. A framework approach gives us the opportunity to be flexible and extensible but still conceptually straightforward.

We suggest a selection mechanism over several explanation classes to minimize the drawbacks of individual approaches and produce a good fit for the input data. Classes not applicable to the given data/scenario can be eliminated immediately, while the remaining process is essentially a hyper-parameter search, covering both the choice between explanation classes and their respective parameters. Strategies like pruning or successive halving can be applied after starting with several instances of applicable explanation classes for the given data point.

We propose three exemplary classes that vary in expressiveness (i.e., the complexity and form of the representations they produce) as well as other properties:

Decision trees make robust candidates: they can be applied to numerical and categorical attributes, can capture disjoint areas, and work with any condition type.

Adaptations of clustering techniques, using a suitable distance metric and constraints on labels, intuitively fit the task of grouping instances.

A linear model on a reduced feature set could be used for local relations between attributes.

Each class can produce a different "form" of neighborhood, in contrast to the rigidity of LIME/Anchors. Figure 2 shows a toy example for LIME, Anchors, and Quest for a point near the decision boundary. Note that the Quest boundaries (green, dashed) differ: the left one is distance-based, the right one linear. In both cases, the decision boundary is linear, with one attribute depending on the other.

The explanation classes allow for endless extensibility, as the quality metrics used for the selection process are defined on the query representation they all share. Within explanation classes, we suggest imposing a hard complexity constraint (that could be adapted to the context upfront) and then optimizing for accuracy. We compare explanation candidates (produced by Quest and baselines such as LIME/Anchors) regarding accuracy, coverage (area and/or proportion of original samples), and class balance within an explanation.

Acknowledgments

This research and development project is/was funded by the German Federal Ministry of Education and Research (BMBF) within the "The Future of Value Creation – Research on Production, Services and Work" program (funding number 02L19C150) and managed by the Project Management Agency Karlsruhe (PTKA). The author is responsible for the content of this publication.

References

[1] M. T. Ribeiro, S. Singh, C. Guestrin, Anchors: High-precision model-agnostic explanations, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018, pp. 1527–1535.
[2] M. Verma, D. Ganguly, LIRME: Locally interpretable ranking model explanation, in: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 1281–1284. URL: https://doi.org/10.1145/3331184.3331377. doi:10.1145/3331184.3331377.
[3] J. Singh, A. Anand, EXS: Explainable search using local model agnostic interpretability, in: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, WSDM '19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 770–773. URL: https://doi.org/10.1145/3289600.3290620. doi:10.1145/3289600.3290620.
[4] M. T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?" Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.
[5] D. Dua, C. Graff, UCI machine learning repository, 2017. URL: http://archive.ics.uci.edu/ml.
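Appendix: illustrative sketches

The query representation from the "Query-driven explanation" section can be made concrete in a few lines. The following is a hypothetical sketch, not the paper's implementation: predicates are stored as attribute/operator/value triples, serialized into the WHERE clause of a SQL statement for Q, negated via De Morgan's law to obtain the "Why not?" query Q̄, and counted for the complexity budget. All class and function names here are made up for illustration.

```python
# Hypothetical sketch of a Quest-style query representation:
# predicates -> SQL WHERE clause (Q), its negation (Q-bar), and a
# simple predicate-count complexity measure.
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    lhs: str  # attribute name
    op: str   # comparison operator, e.g. '>', '='
    rhs: str  # literal or another attribute name

    def to_sql(self) -> str:
        return f"{self.lhs} {self.op} {self.rhs}"


def q_to_where(predicates) -> str:
    # Q: conjunction of predicates describing one side of the decision surface.
    return " AND ".join(p.to_sql() for p in predicates)


def q_bar_to_where(predicates) -> str:
    # Q-bar ("Why not?"): with only AND/OR/NOT, negating a conjunction
    # yields a disjunction of negated predicates (De Morgan).
    return " OR ".join(f"NOT ({p.to_sql()})" for p in predicates)


def complexity(predicates) -> int:
    # Complexity as the number of predicates; a weighted variant could
    # additionally account for the predicate *type*, as the text suggests.
    return len(predicates)


q = [Predicate("rel", "=", "'married'"), Predicate("children", ">", "1")]
print("Q:     WHERE", q_to_where(q))
print("Q-bar: WHERE", q_bar_to_where(q))
print("complexity:", complexity(q))
```

Restricting the serialization to AND/OR/NOT keeps the output in a normal form (here, a conjunction and its DNF negation), which is what makes the duplicate elimination and complexity comparison described above cheap.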
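The decision-tree explanation class can be sketched similarly. Assuming scikit-learn as the tree implementation (an assumption; the paper does not prescribe one), the root-to-leaf path taken by the instance to explain is read off as a conjunction of query predicates. The attribute names and the linear ground truth below are invented to mirror the toy setting of Figure 2.

```python
# Hypothetical sketch of the decision-tree explanation class: fit a small
# surrogate tree on (here, synthetic) local samples, then turn the
# root-to-leaf path of the instance into query predicates.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def path_as_predicates(tree, x, feature_names):
    t = tree.tree_
    node, preds = 0, []
    while t.children_left[node] != -1:  # -1 marks a leaf in sklearn's arrays
        feat, thr = t.feature[node], t.threshold[node]
        if x[feat] <= thr:
            preds.append(f"{feature_names[feat]} <= {thr:.2f}")
            node = t.children_left[node]
        else:
            preds.append(f"{feature_names[feat]} > {thr:.2f}")
            node = t.children_right[node]
    return preds


rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = (X[:, 0] > X[:, 1]).astype(int)  # linear boundary, one attribute vs. the other
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
preds = path_as_predicates(tree, np.array([8.0, 1.0]), ["capital-gain", "capital-loss"])
print(preds)
```

The shallow depth acts as the hard complexity constraint: a tree of depth d can never emit more than d predicates per path, so the budget is enforced by construction rather than by post-hoc pruning.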
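Finally, the candidate comparison step (accuracy, coverage, class balance) could look roughly like the following minimal sketch. The concrete metric definitions here are illustrative assumptions, not the paper's exact formulas: coverage is the share of samples inside the explanation's boundaries, accuracy is the agreement of Q with the black-box labels inside those boundaries, and balance measures how evenly both classes are represented there.

```python
# Illustrative scoring of one explanation candidate against black-box labels.
# in_region(row): membership in the explanation's local boundaries.
# q(row): the candidate query Q (True = predicted positive side).
def score(rows, bb_labels, in_region, q):
    region = [(r, l) for r, l in zip(rows, bb_labels) if in_region(r)]
    coverage = len(region) / len(rows)           # share of samples covered
    if not region:
        return {"coverage": 0.0, "accuracy": 0.0, "balance": 0.0}
    accuracy = sum((1 if q(r) else 0) == l for r, l in region) / len(region)
    pos = sum(l for _, l in region) / len(region)
    balance = 1 - abs(2 * pos - 1)               # 1.0 = both classes equally present
    return {"coverage": coverage, "accuracy": accuracy, "balance": balance}


rows = [{"x": i} for i in range(10)]
bb_labels = [1 if i >= 5 else 0 for i in range(10)]
s = score(rows, bb_labels, lambda r: 3 <= r["x"] <= 7, lambda r: r["x"] >= 5)
print(s)
```

Because the score is computed on the shared query representation, candidates from different explanation classes (trees, clusterings, linear models) remain directly comparable, which is what the selection mechanism relies on.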