Logics for Binary-input Classifiers and Their Explanations

Xinghan Liu∗,†, Emiliano Lorini∗,†
IRIT-CNRS, University of Toulouse, 118 Route de Narbonne, 31062, Toulouse, France
∗ Corresponding author. † These authors contributed equally.
xinghan.liu@univ-toulouse.fr (X. Liu); Emiliano.Lorini@irit.fr (E. Lorini)

1st Workshop on Bias, Ethical AI, Explainability and the role of Logic and Logic Programming (BEWARE-22), co-located with AIxIA 2022, University of Udine, Udine, Italy, 2022

Abstract
We present our work [1, 2] on modal logics for binary-input classifiers and their explanations. These logics can represent classifiers that propositional logic cannot. In particular, a black box classifier is understood as uncertainty among the admissible classifiers that are coherent with an agent’s partial knowledge, and is represented in a product modal logic framework. We also briefly show the logics’ application to XAI.

Keywords
Boolean function, black box classifier, product modal logic, explainable AI (XAI)

1. Introduction

The notions of explanation and explainability have been extensively investigated by philosophers [3, 4, 5] and are key aspects of AI-based systems. Classifier systems compute a given function in the context of a classification or prediction task. Explaining why the system has classified a given instance in a certain way is crucial for making the system intelligible and for finding biases in the classification process. Thus, a variety of notions of explanation have been discussed in the area of explainable AI (XAI), including abductive, contrastive and counterfactual explanations [6, 7, 8, 9, 10, 11, 12, 13, 14, 15].

At the mathematical level, a Boolean classifier is nothing but a Boolean function 𝑓, traditionally represented by a propositional formula 𝜑. Using modal logic, we can model binary-input classifiers that have finite-valued output and are possibly partial. Moreover, modal logic enables us to represent black box classifiers, which are key objects of study in XAI. A classifier is a white box if it is fully determined and given in the model, while a black box is understood as uncertainty (indeterminacy) among the admissible classifiers that are coherent with an agent’s partial knowledge about the classifier.

In this paper we present four modal logics for binary-input classifiers in a unified framework: BCL (Binary-input Classifier Logic) and WBCL (Weak BCL), and PLC (Product Logic for binary-input Classifiers) and WPLC (Weak PLC), according to whether the set of atomic propositions is finite or countably infinite, and whether the represented classifiers are white box or black box; see Table 1.

Table 1
Four modal logics for classifiers

                            BCL         WBCL        PLC         WPLC
Cardinality of language     finite      infinite    finite      infinite
Classifiers are             white box   white box   black box   black box

Furthermore, we exemplify how to apply these logics to XAI by a) expressing abductive explanations in our language; b) defining their counterpart for black box classifiers; and c) formalizing explanation verification as a model checking problem.

2. Four Logics: BCL, WBCL, PLC and WPLC

Language
Let Atm0 be a countable set of atomic propositions with elements noted 𝑝, 𝑝′, …, which denote the input variables of classifiers.
We introduce a finite set Val to denote the output values (classifications, decisions) of the classifier. Elements of Val are noted 𝑐, 𝑐′, … for classes. For any 𝑐 ∈ Val, we call t(𝑐) a decision atom, to be read as “the actual decision (or output) takes value 𝑐”, and set Dec = {t(𝑐) ∶ 𝑐 ∈ Val}. Finally, let Atm = Atm0 ∪ Dec. The modal language ℒ(Atm) is then defined by the following grammar:

𝜑 ∶∶= 𝑝 ∣ t(𝑐) ∣ ¬𝜑 ∣ 𝜑 ∧ 𝜑 ∣ □I 𝜑 ∣ □F 𝜑,

where 𝑝 ranges over Atm0 and 𝑐 ranges over Val. We write 𝑋 ⊆fin Atm0 to indicate that 𝑋 is a finite subset of Atm0. Connectives ∨, →, ↔, ◇I and ◇F are defined in the usual way. Let Atm(𝜑), Atm0(𝜑) and Dec(𝜑) denote the sets of all atomic propositions, input variables and decision atoms occurring in the formula 𝜑, respectively. Finally, let ℒ−□F(Atm) denote the □F-free fragment of ℒ(Atm).

Semantics
The language is built to model (possibly partial) functions from 2^Atm0 to Val, and their interactions. Let us begin with the language ℒ−□F(Atm), which is interpreted relative to classifier models, defined as follows.

Definition 1 (Classifier model). A classifier model (CM) is a tuple 𝐶 = (𝑆, 𝑓) where:
• 𝑆 ⊆ 2^Atm0 is the set of states, and
• 𝑓 ∶ 𝑆 ⟶ Val is a decision (or classification) function.
A pointed CM is a pair (𝐶, 𝑠) where 𝐶 is a CM and 𝑠 ∈ 𝑆. We call 𝐶 = (𝑆, 𝑓) finite if 𝑆 is finite. The class of (finite) classifier models is noted CM (finite-CM).

Hence, the classifier 𝑓 has more than two output values if |ran(𝑓)| > 2; it has countably infinitely many input variables if Atm0 is countably infinite; and it is partial if 𝑆 ≠ 2^Atm0. Essentially, one can view a CM as an S5 model over Atm0 whose states are partitioned into cells labelled by the elements of Val; as the satisfaction relation below shows, □I is nothing but an S5 operator.

Definition 2 (Satisfaction relation 1). Let 𝜑 ∈ ℒ−□F(Atm), let 𝐶 = (𝑆, 𝑓) be a CM and let 𝑠 ∈ 𝑆:

(𝐶, 𝑠) ⊧ 𝑝 ⟺ 𝑝 ∈ 𝑠,
(𝐶, 𝑠) ⊧ t(𝑐) ⟺ 𝑓(𝑠) = 𝑐,
(𝐶, 𝑠) ⊧ ¬𝜑 ⟺ (𝐶, 𝑠) ⊭ 𝜑,
(𝐶, 𝑠) ⊧ 𝜑 ∧ 𝜓 ⟺ (𝐶, 𝑠) ⊧ 𝜑 and (𝐶, 𝑠) ⊧ 𝜓,
(𝐶, 𝑠) ⊧ □I 𝜑 ⟺ ∀𝑠′ ∈ 𝑆 ∶ (𝐶, 𝑠′) ⊧ 𝜑.

As mentioned, we think of a black box classifier as uncertainty over a set of admissible classifiers coherent with the agent’s partial knowledge. This idea is formalized by the multi-classifier model defined below, which is nothing but a set of CMs sharing the same set of states.

Definition 3 (Multi-classifier model). A multi-classifier model (MCM) is a pair Γ = (𝑆, Φ) where 𝑆 ⊆ 2^Atm0 and Φ ⊆ Val^𝑆, Val^𝑆 being the set of functions with domain 𝑆 and codomain Val. A pointed MCM is a triple (Γ, 𝑠, 𝑓) where Γ = (𝑆, Φ) is an MCM, 𝑠 ∈ 𝑆 and 𝑓 ∈ Φ. We call Γ = (𝑆, Φ) finite if 𝑆 is finite. The class of all (finite) multi-classifier models is noted MCM (finite-MCM).

Definition 4 (Satisfaction relation 2). Let 𝜑 ∈ ℒ(Atm), let Γ = (𝑆, Φ) be an MCM, 𝑠 ∈ 𝑆 and 𝑓 ∈ Φ:

(Γ, 𝑠, 𝑓) ⊧ 𝑝 ⟺ 𝑝 ∈ 𝑠,
(Γ, 𝑠, 𝑓) ⊧ t(𝑐) ⟺ 𝑓(𝑠) = 𝑐,
(Γ, 𝑠, 𝑓) ⊧ ¬𝜑 ⟺ (Γ, 𝑠, 𝑓) ⊭ 𝜑,
(Γ, 𝑠, 𝑓) ⊧ 𝜑 ∧ 𝜓 ⟺ (Γ, 𝑠, 𝑓) ⊧ 𝜑 and (Γ, 𝑠, 𝑓) ⊧ 𝜓,
(Γ, 𝑠, 𝑓) ⊧ □I 𝜑 ⟺ ∀𝑠′ ∈ 𝑆 ∶ (Γ, 𝑠′, 𝑓) ⊧ 𝜑,
(Γ, 𝑠, 𝑓) ⊧ □F 𝜑 ⟺ ∀𝑓′ ∈ Φ ∶ (Γ, 𝑠, 𝑓′) ⊧ 𝜑.

Both □I 𝜑 and □F 𝜑 have a standard modal reading but quantify over different sets. □I 𝜑 has to be read “𝜑 necessarily holds for the actual function, regardless of the input instance”, while its dual ◇I 𝜑 =def ¬□I ¬𝜑 has to be read “𝜑 possibly holds for the actual function, regardless of the input instance”. Similarly, □F 𝜑 has to be read “𝜑 necessarily holds for the actual input instance, regardless of the function”, and its dual ◇F 𝜑 has to be read “𝜑 possibly holds for the actual input instance, regardless of the function”. Therefore, the agent knows that the actual classification of 𝑠 is 𝑐 if (Γ, 𝑠, 𝑓) ⊧ □F t(𝑐), i.e. only classifiers outputting 𝑐 for 𝑠 are admissible; and (Γ, 𝑠, 𝑓) ⊧ ◇F t(𝑐) means that classifying 𝑠 as 𝑐 is coherent with the agent’s partial knowledge. With these two modal dimensions, our framework is an instance of so-called product modal logic.
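To make the two-dimensional semantics concrete, here is a minimal Python sketch of Definitions 3 and 4; a CM is recovered as the special case where Φ is a singleton. The encoding (states as frozensets of true input atoms, classifiers as dicts, formulas as nested tuples) and all names are our own illustrative assumptions, not from [1, 2].

```python
def sat(S, Phi, s, f, phi):
    """Satisfaction relation of Definition 4 at the pointed MCM ((S, Phi), s, f).
    Formulas are nested tuples: ('p', atom), ('t', value), ('not', a),
    ('and', a, b), ('boxI', a), ('boxF', a)."""
    op = phi[0]
    if op == 'p':                 # (Γ,s,f) ⊧ p  iff  p ∈ s
        return phi[1] in s
    if op == 't':                 # (Γ,s,f) ⊧ t(c)  iff  f(s) = c
        return f[s] == phi[1]
    if op == 'not':
        return not sat(S, Phi, s, f, phi[1])
    if op == 'and':
        return sat(S, Phi, s, f, phi[1]) and sat(S, Phi, s, f, phi[2])
    if op == 'boxI':              # □I: quantify over states, keep the function
        return all(sat(S, Phi, s2, f, phi[1]) for s2 in S)
    if op == 'boxF':              # □F: quantify over functions, keep the state
        return all(sat(S, Phi, s, f2, phi[1]) for f2 in Phi)
    raise ValueError(f'unknown connective: {op}')

# Two admissible classifiers over Atm0 = {p}, Val = {0, 1}.
S = [frozenset(), frozenset({'p'})]
f1 = {S[0]: 0, S[1]: 1}
f2 = {S[0]: 1, S[1]: 1}
Phi = [f1, f2]
diamondF_t1 = ('not', ('boxF', ('not', ('t', 1))))    # ◇F t(1)
assert sat(S, Phi, S[1], f1, ('boxF', ('t', 1)))      # the agent knows f({p}) = 1
assert not sat(S, Phi, S[0], f1, ('boxF', ('t', 1)))  # but is uncertain about ∅,
assert sat(S, Phi, S[0], f1, diamondF_t1)             # where t(1) is merely possible
```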
An important abbreviation is the following, where 𝑋 ⊆fin Atm0:

[𝑋]𝜑 =def ⋀_{𝑌⊆𝑋} ((⋀_{𝑝∈𝑌} 𝑝 ∧ ⋀_{𝑝∈𝑋⧵𝑌} ¬𝑝) → □I((⋀_{𝑝∈𝑌} 𝑝 ∧ ⋀_{𝑝∈𝑋⧵𝑌} ¬𝑝) → 𝜑)).

Complex as it seems, [𝑋]𝜑 means nothing but “𝜑 necessarily holds if the values of the input variables in 𝑋 are kept fixed”. This reading is justified by the fact that (Γ, 𝑠, 𝑓) ⊧ [𝑋]𝜑 if and only if, for all 𝑠′ ∈ 𝑆, if 𝑠 ∩ 𝑋 = 𝑠′ ∩ 𝑋 then (Γ, 𝑠′, 𝑓) ⊧ 𝜑. Its dual ⟨𝑋⟩𝜑 =def ¬[𝑋]¬𝜑 has to be read “𝜑 possibly holds if the values of the input variables in 𝑋 are kept fixed”. These modalities have a ceteris paribus reading and were first introduced in [16]. Notice that when 𝑋 = ∅, [∅] coincides with □I.

Axiomatics
We have to distinguish two cases: Atm0 finite and Atm0 countably infinite. The reason lies in the axiom Funct of Table 2, which expresses the “functionality” property syntactically. We define cn_{𝑋,Atm0} =def ⋀_{𝑝∈𝑋} 𝑝 ∧ ⋀_{𝑝∈Atm0⧵𝑋} ¬𝑝. When Atm0 is infinite, cn_{𝑋,Atm0} is not a well-formed formula, and Funct has to be abandoned.

Definition 5 (Axiomatics). We define PLC as the extension of classical propositional logic with all axioms and inference rules in Table 2; WPLC as PLC minus Funct; BCL as the extension with all □F-free axioms and inference rules in Table 2; and WBCL as BCL minus Funct.

Table 2
Axioms and rules of inference, with ■ ∈ {□I, □F}

(■𝜑 ∧ ■(𝜑 → 𝜓)) → ■𝜓                              (K■)
■𝜑 → 𝜑                                             (T■)
■𝜑 → ■■𝜑                                           (4■)
¬■𝜑 → ■¬■𝜑                                         (5■)
□F □I 𝜑 ↔ □I □F 𝜑                                   (Comm)
⋁_{𝑐∈Val} t(𝑐)                                     (AtLeast_{t(𝑐)})
t(𝑐) → ¬t(𝑐′), if 𝑐 ≠ 𝑐′                           (AtMost_{t(𝑐)})
(cn_{𝑋,Atm0} ∧ t(𝑐)) → □I(cn_{𝑋,Atm0} → t(𝑐))      (Funct)
𝑝 → □F 𝑝                                           (Indep_{□F,𝑝})
¬𝑝 → □F ¬𝑝                                         (Indep_{□F,¬𝑝})
from 𝜑, infer ■𝜑                                   (Nec■)

We obtained the technical results stated in Theorem 1 and Table 3; the proofs can be found in [1, 2].

Theorem 1. If Atm0 is finite, then BCL and PLC are sound and complete with respect to CM and MCM respectively. If Atm0 is countably infinite, then WBCL and WPLC are sound and complete with respect to CM and MCM respectively.

Table 3
Summary of complexity results

                           Finite variables    Infinite variables
Fragment ℒ−□F(Atm)         Polynomial          NP-complete
Full language ℒ(Atm)       Polynomial          in NEXPTIME

3. Classifier Explanations: Objective and Subjective

In the jargon of Boolean functions, a term (or property) is a conjunction of literals (atoms or their negations), which we denote by 𝜆. We use Term(𝑋) to denote the set of all terms whose atoms are in 𝑋. The recent XAI literature has focused on local explanations, namely answers to the question of why a given instance is classified the way it is. A central notion is the abductive explanation [11] (or sufficient reason [17]). It is expressible in ℒ(Atm) as follows:

AXp(𝜆, 𝑐) =def 𝜆 ∧ [Atm(𝜆)]t(𝑐) ∧ ⋀_{𝑝∈Atm(𝜆)} ⟨Atm(𝜆)⧵{𝑝}⟩¬t(𝑐).

The three conjuncts mean that 1) 𝜆 is a “part” of the instance; 2) as long as the atoms in Atm(𝜆) keep the same valuation as in 𝑠, t(𝑐) necessarily holds regardless of the other atoms; 3) 𝜆 is a “minimal” such property, in the sense that every proper part 𝜆′ ⊂ 𝜆 fails condition 2). (Since condition 2) is monotone under adding atoms, it is enough that removing any single atom from 𝜆 breaks it.) Hence, intuitively, it is both sufficient and necessary to answer the question of why the actual classification is 𝑐 by saying “because the instance has property 𝜆”.
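Since the truth condition of [𝑋]𝜑 only quantifies over the states agreeing with the current one on 𝑋, AXps of a finite CM can be found by brute force. The sketch below is our own naive illustration (not the procedures studied in [1, 2]); it encodes a candidate term by its set of atoms 𝑋, with polarities read off the instance 𝑠, and anticipates the classifier 𝑓1 of Example 1 below.

```python
from itertools import combinations

def sufficient(S, f, s, X, c):
    """The second conjunct of AXp, i.e. [X]t(c) at s: every state
    agreeing with s on X is classified as c."""
    return all(f[s2] == c for s2 in S if s2 & X == s & X)

def axps(S, f, s, atoms):
    """All X ⊆ atoms whose induced term is an AXp of f(s). Minimality is
    tested by single-atom removal, which suffices because sufficiency is
    monotone under adding atoms to X."""
    c, found = f[s], []
    for k in range(len(atoms) + 1):
        for X in map(frozenset, combinations(sorted(atoms), k)):
            if sufficient(S, f, s, X, c) and \
               all(not sufficient(S, f, s, X - {p}, c) for p in X):
                found.append(X)
    return found

# Example 1's classifier f1: accept (1) iff (or ∧ an) ∨ (cl ∧ an).
atoms = {'si', 'or', 'cl', 'an'}
S = [frozenset(x) for k in range(5) for x in combinations(sorted(atoms), k)]
f1 = {s: int(('or' in s and 'an' in s) or ('cl' in s and 'an' in s)) for s in S}
s1 = frozenset({'si', 'or', 'an'})
print(axps(S, f1, s1, atoms))   # [frozenset({'an', 'or'})], i.e. λ = or ∧ an
```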
Let 𝐶 = (𝑆, 𝑓) be a CM. We say that 𝑓 is 𝑋-definite, for 𝑋 ⊆fin Atm0, if for all 𝑠, 𝑠′ ∈ 𝑆, 𝑠 ∩ 𝑋 = 𝑠′ ∩ 𝑋 implies 𝑓(𝑠) = 𝑓(𝑠′). It is easy to see that 𝑓 is 𝑋-definite iff (𝐶, 𝑠) ⊧ Def(𝑋) for any 𝑠 ∈ 𝑆, where Def(𝑋) =def ⋀_{𝑐∈Val} □I(⟨𝑋⟩t(𝑐) → [𝑋]t(𝑐)). When the classifier is 𝑋-definite for some 𝑋 ⊆fin Atm0, an AXp of the actual classification always exists. We may call this the “principle of sufficient reason” (PSR), after Spinoza [Ethics, 1p11d2], and obtain the following validity:

⊧_CM (t(𝑐) ∧ Def(𝑋)) → ⋁_{𝜆∈Term(𝑋)} AXp(𝜆, 𝑐).

However, a sufficient reason may not be known to the agent when the classifier is a black box. We define 𝜆 as a subjective abductive explanation of the actual classification 𝑐, noted SubAXp(𝜆, 𝑐), if the agent knows that 𝜆 is an abductive explanation of the actual classification 𝑐, that is:

SubAXp(𝜆, 𝑐) =def □F AXp(𝜆, 𝑐).

To see how SubAXp can fail the PSR, consider the following example. Suppose a classifier is trained to decide whether a paper is acceptable for a conference, based on four input features: significance (𝑠𝑖), originality (𝑜𝑟), clarity (𝑐𝑙) and anonymity (𝑎𝑛). Let 1 and 0 denote acceptance and rejection respectively.

Example 1 (Failure of PSR for a black box). Let Γ = (𝑆, Φ) be an MCM of this black box, where 𝑆 = 2^{𝑠𝑖,𝑜𝑟,𝑐𝑙,𝑎𝑛} and 𝑠1 = {𝑠𝑖, 𝑜𝑟, 𝑎𝑛} ∈ 𝑆. Consider 𝑓1, 𝑓2 ∈ Φ whose syntactic expressions are □I(t(1) ↔ ((𝑜𝑟 ∧ 𝑎𝑛) ∨ (𝑐𝑙 ∧ 𝑎𝑛))) and □I(t(1) ↔ (𝑠𝑖 ∧ 𝑎𝑛)) respectively. Then,

(Γ, 𝑠1, 𝑓1) ⊧ AXp(𝑜𝑟 ∧ 𝑎𝑛, 1) ∧ ⋀_{𝜆∈Term({𝑠𝑖,𝑜𝑟,𝑐𝑙,𝑎𝑛})} ¬SubAXp(𝜆, 1).

Therefore, it is of particular interest to determine how much knowledge is needed to verify subjective AXps. This can be studied as a model checking problem. Let Γ_{Σ,𝑆,𝑠0} = (𝑆, Φ_{Σ,𝑆,𝑠0}) denote the MCM induced by a finite set Σ ⊂ ℒ−□F(Atm), a set of states 𝑆 ⊆ 2^Atm0 and a state 𝑠0 ∈ 𝑆, where Φ_{Σ,𝑆,𝑠0} =def {𝑓 ∈ Val^𝑆 ∶ ∀𝜓 ∈ Σ, ((𝑆, 𝑓), 𝑠0) ⊧ 𝜓}. We can then formalize the following model checking problem.

Subjective AXp existence
Given: a finite Σ ⊂ ℒ−□F(Atm), 𝑆 ⊆ 2^Atm0, 𝑠0 ∈ 𝑆.
Question: does there exist a term 𝜆 such that (Γ_{Σ,𝑆,𝑠0}, 𝑠0, 𝑓) ⊧ AXp(𝜆, 𝑓(𝑠0)) for all 𝑓 ∈ Φ_{Σ,𝑆,𝑠0}?

Many other notions of explanation, as well as logical extensions, are discussed in [1, 2]. We are currently working on applying this family of modal logics to further topics in XAI.
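The sketch below makes this decision problem concrete, reusing sufficient, axps and the Example 1 data (S, s1, atoms) from the previous sketch. As a simplifying assumption of ours, the knowledge base Σ is abstracted into a Boolean predicate on classifiers that stands in for the condition ∀𝜓 ∈ Σ, ((𝑆, 𝑓), 𝑠0) ⊧ 𝜓; enumerating Val^𝑆 is exponential, so this is feasible only for very small Atm0.

```python
from itertools import product

def admissible(S, val, knowledge):
    """Enumerate Φ_{Σ,S,s0}: all total f : S → Val passing the knowledge test."""
    for outputs in product(val, repeat=len(S)):
        f = dict(zip(S, outputs))
        if knowledge(f):
            yield f

def subjective_axp_exists(S, val, knowledge, s0, atoms):
    """Is some term λ an AXp of the actual classification under every
    admissible classifier, i.e. does a SubAXp exist at s0?"""
    common = None
    for f in admissible(S, val, knowledge):
        axp_set = set(axps(S, f, s0, atoms))
        common = axp_set if common is None else common & axp_set
        if not common:            # intersection already empty: early exit
            return False
    return common is not None     # False if no classifier is admissible

# With Example 1's state space and the bare knowledge "f(s1) = 1", no
# subjective AXp exists: Φ then contains both f1 and f2 of Example 1,
# whose AXp sets at s1 ({or ∧ an} vs. {si ∧ an}) are disjoint.
print(subjective_axp_exists(S, [0, 1], lambda f: f[s1] == 1, s1, atoms))  # False
```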
References

[1] X. Liu, E. Lorini, A logic for binary classifiers and their explanation, in: P. Baroni, C. Benzmüller, Y. N. Wáng (Eds.), Logic and Argumentation - 4th International Conference, CLAR 2021, Hangzhou, China, 2021, Proceedings, Springer, 2021, pp. 302–321.
[2] X. Liu, E. Lorini, A logic of “black box” classifier systems, in: Logic, Language, Information, and Computation: 28th International Workshop, WoLLIC 2022, Iași, Romania, 2022, Proceedings, Springer Nature, 2022, pp. 158–174.
[3] C. G. Hempel, P. Oppenheim, Studies in the logic of explanation, Philosophy of Science 15 (1948) 135–175.
[4] B. Kment, Counterfactuals and explanation, Mind 115 (2006) 261–310.
[5] J. Woodward, Explanation and invariance in the special sciences, The British Journal for the Philosophy of Science 51 (2000) 197–254.
[6] M. T. Ribeiro, S. Singh, C. Guestrin, “Why should I trust you?” Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.
[7] O. Biran, C. Cotton, Explanation and justification in machine learning: A survey, in: IJCAI-17 Workshop on Explainable AI (XAI), volume 8(1), 2017, pp. 8–13.
[8] S. Wachter, B. Mittelstadt, C. Russell, Counterfactual explanations without opening the black box: Automated decisions and the GDPR, Harvard Journal of Law & Technology 31 (2017) 841.
[9] A. Dhurandhar, P.-Y. Chen, R. Luss, C.-C. Tu, P. Ting, K. Shanmugam, P. Das, Explanations based on the missing: Towards contrastive explanations with pertinent negatives, in: Advances in Neural Information Processing Systems, 2018, pp. 592–603.
[10] A. Shih, A. Choi, A. Darwiche, A symbolic approach to explaining Bayesian network classifiers, arXiv preprint arXiv:1805.03364 (2018).
[11] A. Ignatiev, N. Narodytska, J. Marques-Silva, Abduction-based explanations for machine learning models, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, 2019, pp. 1511–1519.
[12] R. K. Mothilal, A. Sharma, C. Tan, Explaining machine learning classifiers through diverse counterfactual explanations, in: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 2020, pp. 607–617.
[13] S. Verma, J. Dickerson, K. Hines, Counterfactual explanations for machine learning: A review, arXiv preprint arXiv:2010.10596 (2020).
[14] T. Miller, Contrastive explanation: A structural-model approach, The Knowledge Engineering Review 36 (2021).
[15] X. Huang, Y. Izza, A. Ignatiev, M. Cooper, N. Asher, J. Marques-Silva, Tractable explanations for d-DNNF classifiers, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, 2022, pp. 5719–5728.
[16] D. Grossi, E. Lorini, F. Schwarzentruber, The ceteris paribus structure of logics of game forms, Journal of Artificial Intelligence Research 53 (2015) 91–126.
[17] A. Darwiche, A. Hirth, On the reasons behind decisions, in: ECAI 2020 - 24th European Conference on Artificial Intelligence, volume 325 of Frontiers in Artificial Intelligence and Applications, IOS Press, 2020, pp. 712–720.