An Approach To Formalization of an Extension of Floyd-Hoare Logic Artur Kornilowicz1 , Andrii Kryvolap2 , Mykola Nikitchenko2 , Ievgen Ivanov2 1 Institute of Informatics, University of Bialystok, Ciolkowskiego 1M, 15-245 Bialystok, Poland 2 Taras Shevchenko National University of Kyiv 64/13, Volodymyrska Street, 01601 Kyiv, Ukraine arturk@math.uwb.edu.pl, krivolapa@gmail.com, nikitchenko@unicyb.kiev.ua, ivanov.eugen@gmail.com Abstract. The classical Floyd-Hoare logic is defined for the case of to- tal pre- and postconditions and partial programs (i.e. programs can be undefined on some input data, but conditions must be defined on all data). In this paper we propose an extension of this logic for the case of partial conditions and partial programs on hierarchically organized data. These data are based on the naming relation and are called nomi- native data. They can conveniently represent many data structures used in programming. The semantics of the proposed logic is represented by special algebras of partial functions and predicates over nominative data. Operations of these algebras are called compositions. We present an in- ference system for the mentioned logic and propose an approach to its formalization in Mizar proof assistant. The obtained results can be used in software verification. Keywords. Formal methods, software verification, Floyd-Hoare logic, proof assistant, nominative data. Key Terms. Specification Process, Verification Process 1 Introduction Floyd-Hoare logic [1–3] is a formal system widely used for reasoning about pro- gram correctness. It is based on the notion of a Floyd-Hoare triple (assertion) which consists of a precondition, a program, and a postcondition and means the following requirement: when input data satisfies the precondition, the program output must satisfy the postcondition, if the program terminates. Specification of program properties in terms of Floyd-Hoare triples is natural and reasoning is convenient thanks to a compositional proof system. A survey of the important results on properties of the Hoare’s proof system (soundness, completeness in specific senses) and its extensions was given in [3]. In the classical Floyd-Hoare logic predicates (pre- and postconditions) are assumed to be total (defined on all data) and programs can be partial (in the sense that if a program does not terminate, its resulting value is undefined). The ability to deal with partiality is an important aspect, because partial oper- ations frequently arise in programming. In most programming languages some basic operations on data such as arithmetic division are already partial. Further- more, partiality of programs may be caused by non-termination which can arise from loop constructs and/or recursion. For similar reasons partiality can arise in software specifications. In [4] the following classification of partiality phenomena in software speci- fications was proposed: non-termination, i.e. if evaluation of an expression does not terminate, its value is assumed to be undefined and the operation is consid- ered partial; error value, i.e. if some values of an operation’s argument are illegal (e.g. division by zero, Pop operation applied to an empty stack, etc.), the result of the operation on such values is assumed to be undefined and the operation is considered partial; and nondeterminism, i.e. if a result of an operation on an argument value is not determined uniquely by the specification of this operation (operation is underspecified), the result of application of the operation to such a value is assumed to be undefined and the operation is considered partial. Other opinions on the meanings of partiality in software specifications can be found in [5–7]. In [5] a taxonomy of the ways of dealing with partiality in software specifica- tion languages and logics was proposed. Among different approaches notable are excluding partial functions from consideration and providing alternative nota- tions (e.g. graph of a partial function), using a three-valued (many-valued) logic, where the third value represents an undefined result, or making all function ap- plications denote [5]. It should be noted that almost all approaches that try to not allow partial programs and/or predicates that describe program guards or properties explicitly and reduce or translate them to the classical case of total functions and predicates have drawbacks analyzed in detail in [4–6]. A more natural and potentially fruitful approach is to allow partiality in both programs and program specifications and construct non-classical proof systems allowing explicit reasoning about properties of such programs and specifications. This approach is applied in this paper to Floyd-Hoare logic. More specifically, in the classical Floyd-Hoare logic the predicates describing program pre- and postconditions are assumed to be total. But obviously, it is desirable to be able to use partial operations in pre- and postconditions of programs, where partiality may be interpreted in one of the senses proposed in [4]. So it is desirable to obtain an extension of Floyd-Hoare logic that is able to deal with both partial programs and partial predicates. We consider such an extension in this paper. In the previous works [8, 9] we have considered extensions of Floyd-Hoare logic to partial mappings over data represented as partial mappings on named values (called nominative sets) and proposed the corresponding inference systems and investigated their soundness and extensional and intensional completeness. However, nominative sets (which can be considered as partial functions from names to values) naturally represent only a flat data organization in low-level programming. Using Floyd-Hoare logic with partial mappings over nominative sets for reasoning about programs which operate on complex data structures (e.g. trees) is inconvenient, because one needs to take into account many low-level details about data structure implementation. For this reason, in this paper we propose an extension of Floyd-Hoare logic for the case of partial conditions and partial programs on hierarchically organized data. As was shown in [10] hierarchically organized data (more specifically, a class of such data called nominative data [10]) is sufficient for representing many data structures (like multidimensional arrays, lists, trees, tables, etc.) that are frequently used in programming. To develop such an extension we will adopt the composition-nominative approach [11] to program formalization. This approach aims to propose a mathematical basis for development of formal methods of anal- ysis and synthesis of software systems and is grounded on several principles [11], including the Development principle (from abstract to concrete), the Principle of integrity of intensional and extensional aspects, the Principle of priority of semantics over syntax, Compositionality principle, and the Nominativity princi- ple. The latter Nominativity principle states that nominative data [2] (a special class of hierarchically organized data) are adequate mathematical models of var- ious forms of data that are processed and stored in computing systems. There exist several types of nominative data [11] (with simple or complex names and with simple or complex values), but all of them are based on naming relations that associate names and values. In the composition-nominative approach on the abstract level a computing system is modeled as a partial function that maps nominative data (input data) to nominative data (output data). Such functions are called binominative. Properties of data are represented as partial predicates on nominative data. Nominative functions and predicates can be composed in many ways, e.g. by sequential composition, branching, and so on. Operations that construct composed systems from constituents are called compositions. A set of compositions together with a set of functions obtained from a chosen set of basic functions by applications of compositions forms an algebraic system (program algebra) which is a semantic model of a programming language. The syntax of this language follows naturally from this semantic model: programs are represented as terms of the described algebra. The relation of the above mentioned notions to semantics of programming languages can be illustrated on a simple programming language WHILE [12]. This is an imperative language in which programs are composed from state- ments which contain boolean and arithmetic expressions. A program state is an assignment of values to variable names which can be modeled as a nomina- tive set. Semantics of a statement can be represented as a partial binominative function (a mapping from states to states) and semantics of a boolean or arith- metic expression can be represented as a function on nominative data which takes boolean or integer values. Semantics of statements are composed to obtain semantics of a program. In accordance with the composition-nominative approach the semantic com- ponent of our Floyd-Hoare logic extension will be based on program algebra (a set of functions and predicates on nominative data which can be obtained from some chosen basic functions and predicates using a specific set of compositions). In this paper the carrier sets of our program algebra will consist of partial func- tions and predicates over nominative data with complex names and complex values [10]. We will treat a Floyd-Hoare triple as a composition with two predicates on nominative data and a program (a partial binominative function which belongs to the carrier set of the program algebra) as arguments. The predicates represent pre- and postconditions and the result of the composition is a predicate. How- ever, the classical definition of Floyd-Hoare triple validity leads to Floyd-Hoare composition that is not monotone [8]. Monotonicity is one of the key proper- ties used for reasoning about programs. It is also important for reasoning about loop-free programs and using them as approximations of programs with loops. This explains the need of a special definition of Floyd-Hoare composition for the extension of Floyd-Hoare logic on partial predicates which is monotone, but converges to the classical definition, if predicates are total. Such a definition was presented in [8] and we will adapt it to the case considered in this paper. To make our Floyd-Hoare logic extension practically applicable for program verification one can implement it in a proof assistant software [13]. Many well known proof assistants (e.g. Isabelle, Coq, PVS, etc.) provide a substantial support for reasoning about total functions, programs, predicate and are convenient for either formulating the classical Floyd-Hoare logic ax- iomatically, or embedding it in their logics. For example, Isabelle proof assistant includes the “Hoare” HOL (Higher-Order Logic)-based theory that provides an implementation of Hoare logic for a simple imperative programming language with WHILE loops following [14, 15]. However, a support of reasoning about pro- grams using partial predicate pre- and postconditions is generally not developed. We propose an approach to formalization of our extended Floyd-Hoare logic which supports partial pre- and postconditions in the proof assistant Mizar [16, 17]. This proof assistant is based on first-order logic and axiomatic set theory (Tarski-Grothendieck set theory) and has one of the richest libraries of formalized mathematical theories that spans many branches of mathematics. This library is called Mizar Mathematical Library (MML). Consequently, Mizar has well devel- oped tools for working with partial functions and predicates and is well-suited for our purposes. Besides, the Mizar system has a degree of proof automation sup- port such as discovery of a list of proven facts that imply the current goal which may be used as basis for implementing software verification in a semi-automatic mode. To simplify and partially automate application of the Floyd-Hoare logic to proving program properties it is convenient to have a corresponding system of inference rules. The traditional inference system for the language WHILE [12] is sound and extensionally complete for the classical Floyd-Hoare logic with total predicates [12] (extensional completeness means that pre- and postconditions may be arbitrary predicates [12]; intensional completeness means that pre- and postconditions should be presented by formulas of a given language). Soundness and completeness are important for practical applicability of an inference system (if a system is not sound, assertions that can be inferred using this system may be false; if a system is not complete, some of the valid assertions could be impossible to infer). However, this inference system is not sound and complete for partial predicates as was shown in [8]. To deal with the soundness and completes problem we will modify the tradi- tional inference system for the language WHILE and introduce additional con- straints on inference rules that correspond to the new definition of validity of Floyd-Hoare assertions, and investigate its soundness and extensional complete- ness. The obtained results extend the results concerning inference systems for a Floyd-Hoare logic with partial predicates over flat (nonhierarchical) data ob- tained in [8, 9]. The paper is organized in the following way. In Section 2 we describe the notion of nominative data and define main operations on them. In Section 3 we describe the semantic base of our extended Floyd-Hoare logic. In Section 4 we specify the syntax of our extended Floyd-Hoare logic. In Section 5 we propose an inference system for our logic and consider problems of its soundness and completeness. In Section 6 we describe an approach to formalization of our extended Floyd-Hoare logic in Mizar. In Section 7 we describe the related work. In Section 8 we give conclusions. 2 Algebra of Nominative Data In the composition-nominative approach data are treated as nominative data. There are several types of nominative data, but all of them are based on naming relations. The simplest type of nominative data is the class of nominative sets which are partial mappings from a set of names (program variables) to a set of basic values. Other types of nominative data represent hierarchical data organi- zations [10]. Before giving definitions, let us introduce the following notation. p To distinguish total functions from partial we will use the symbol −→ for t n partial functions and −→ for total. We will also use the symbol −→ for partial p functions with finite graph. For any partial function f : D −→ D0 on some set D: – f (d) ↓ denotes that f is defined on d ∈ D; – f (d) ↓= d0 denotes that f is defined on d ∈ D with a value d0 ∈ D0 ; – f (d) ↑ denotes that f is undefined on d ∈ D; – dom(f ) = {d ∈ D | f (d) ↓} is the domain of a function (note that in different branches of mathematics there exist different definitions of the domain of a partial functions; we will adopt the convention used in recursion theory); We will denote by f1 (d1 ) ∼ = f2 (d2 ) the strong equality, i.e. the condition that f1 (d1 ) ↓ if and only if f2 (d2 ) ↓, and if f1 (d1 ) ↓, then f1 (d1 ) = f2 (d2 ). For any nonempty set V we will denote by V + the set of all nonempty finite sequences (words) of elements of V . For any word u ∈ V + we will denote by |u| its length. If u, v ∈ V + , we will denote by uv the concatenation of u and v. We will write u ≤ v, if u is a prefix of v, and u < v, if u ≤ v, u 6= v. For any set of names V and a set of basic values A the corresponding class V A of nominative sets is defined as V p A = V −→ A. We will use the following notations for nominative sets: – [v1 → a1 , . . . , vn → an ], where v1 , . . . , vn are names from V and a1 , ... an are atoms from A, denotes a nominative set with the graph {(v1 , a1 ), . . . , (vn , an )}; – [vi 7→ ai |i ∈ I], where I is some set of indices, means a nominative set with the graph {(vi , ai ) | i ∈ I}; – v 7→ a ∈ d, where d is a nominative set, means that d(v) ↓= a, i.e. the value of the variable v in d is a; – [] or ∅ denotes the empty nominative set (a nowhere defined function). Nominative data are built over classes of names V and basic values (atoms) A using naming relations name→value. A rough idea is that a nominative data d is either an atom from A, or has the form [v1 → d1 , . . . , vn → dn ], where v1 , . . . , vn are names from V and d1 , . . . , dn are either atoms or other nominative data. Nominative data are classified in accordance with the following 2 parameters [10]: values can be simple (unstructured) or complex (structured), and names can be simple (unstructured) or complex (structured). These parameters give 4 types of nominative data. To define the notion of a complex name we will use the Development principle (from abstract to concrete) and consider the simplest case of name construction: complex names are sequences of simple names which satisfy the associativity property [10]. More specifically, we will assume that complex names are constructed with the help of concatenation operation (which is associative). We will adopt the following Principle of associative construction and processing of complex names [10]: complex names are constructed from sim- ple names using concatenation, and data with complex names must be processed by operations that take into account associativity of names. Moreover, we will require that data with complex names satisfy the Principle of unambiguous as- sociative naming [10]: one complex name must have at most one corresponding value in any given data. The definitions of 4 types of nominative data (T N DSS , T N DSC , T N DCS , T N DCC ) are given below. We will assume that V and A are fixed nonempty sets of simple names and basic values. We will call the elements of V + complex names. 1. Data of the type T N DSS (data with simple names and simple values ) are n elements of the set V −→ A. For example, [u 7→ 1, v 7→ 2], if u, v ∈ V , 1, 2 ∈ A. 2. Data of the type T N DSC (data with simple names and complex values ) are elements of the set N D(V, A), where [ N D(V, A) = N Dk (V, A), k≥0   n N D0 (V, A) = A ∪ {∅}, N Dk+1 (V, A) = N Dk (V, A) ∪ V −→ N Dk (V, A) for all k = 0, 1, 2, .... Data of this class are hierarchically constructed, for example, [u 7→ 1, v 7→ [w 7→ 2]], if u, v, w ∈ V and 1, 2 ∈ A. Such data can be represented by oriented trees with arcs labeled by names and leafs labelled by atoms. We will call any finite sequence of names p = (v1 , v2 , ..., vk ) a path. A path in a given data d ∈ N D(V, A) is a path (v1 , v2 , ..., vk ) such that the value of the expression (...((d(v1 ))(v2 ))...(vk )) is defined (it corresponds to a path from the root to a leaf in a tree). If p = (v1 , v2 , ..., vk ) is a path in d, we will say that (...((d(v1 ))(v2 ))...(vk )) is the value of p in d and denote it as d(v1 , v2 , ..., vk ). A terminal path is a path with atomic or empty value. The rank of d is the least k such that d ∈ N Dk (V, A). 3. Data of the type T N DCS (data with complex names and simple values ) are elements of the set N DV S(V, A), where N DV S(V, A) is the set of all elements of n A ∪ (V + −→ A) n such that either d ∈ A, or d ∈ V + −→ A and all strings from dom(d) are pairwise incomparable in the sense of the prefix relation. For example, [uv 7→ 1, uw 7→ 2, w 7→ 3], if u, v, w ∈ V are different and 1, 2, 3 ∈ A. 4. Data of the type T N DCC (data with complex names and complex values) are elements of the set N DV C(V, A), where N DV C(V, A) is the set of all d ∈ N D(V + , A) such that for any two paths (u1 , u2 , ..., uk ) and (v1 , v2 , ..., vl ) in d, neither of which is a prefix of another, words u1 u2 ...uk and v1 v2 ...vl are incomparable in the sense of prefix relation (principle of unambiguous associative naming). For example, [uv 7→ 1, w 7→ [uw 7→ 2]], if u, v, w ∈ V are different, 1, 2 ∈ A. In [8] it was shown how conventional data structures can be represented by different kinds of nominative data. The most complex and interesting type of nominative data is T N DCC , so we will use it in this paper. The main operations on nominative data are operations of denaming (taking a value of a name), naming (assigning to a name a new value), and overlapping (overwriting). Below we give definitions of these operations for nominative data of the type T N DCC . For any word u ∈ V + and any data d ∈ N DV C(V, A) let us denote d/u = [v1 7→ d(v) | d(v) ↓, v = uv1 , v1 ∈ V + ] (division of d by u). p Definition 1. Associative denaming v ⇒a : N DV C(V, A) −→ N DV C(V, A) is an operation with a parameter v ∈ V + defined by induction on length of v as follows: – if |v| = 1, then v ⇒a (d) = d(v), if d(v) ↓; v ⇒a (d) = d/v if d(v) ↑ and d/v 6= ∅, and v ⇒a (d) ↑ otherwise. – if |v| = n > 1, then v ⇒a (d) ∼ = v1 ⇒a (x ⇒a (d)), where v = xv1 , x ∈ V , v1 ∈ V n−1 (principle of associative denaming). For example, (uv) ⇒a ([u 7→ [vw 7→ 1, u 7→ 2]]) = [w 7→ 1]. It is easy to check that v ⇒a satisfies the following property (associativity) u ⇒a (d) ∼ = un ⇒a (un−1 ⇒a (... u1 ⇒a (d)...)) for all complex names u, u1 , u2 , ..., un ∈ V + such that u = u1 u2 ...un . t Definition 2. Naming is an operation ⇒ v : N DV C(V, A) −→ N DV C(V, A) with a parameter v ∈ V + such that ⇒ v(d) = [v 7→ d]. Overlapping can be considered as an operation which updates values in the first argument with the values from the second argument. It joins two data and resolves name conflicts in favor of its second argument. We will define two kinds of overlapping: global and local. Global overlapping can be used for formalization of procedures calls and the local overlapping formalizes the assignment operator in programming languages. Definition 3. Global overlapping is a binary operation p ∇a : N DV C(V, A) × N DV C(V, A) −→ N DV C(V, A) defined inductively by the rank of the first argument as follows. Let N DV Ck (V, A) = N DV C(V, A)∩N Dk (V, A) (data of the rank not greater than k). Induction base. If d0 ∈ N DV C0 (V, A), then  d2 , d1 = ∅ and d2 ∈ N DV C(V, A)\A; d 1 ∇a d 2 ∼ = undef ined, d1 ∈ A or d2 ∈ A. Induction step. Assume that the value d1 ∇a d2 is defined for all d1 , d2 such that d1 ∈ N DV Ck (V, A). Let d1 ∈ N DV Ck+1 (V, A) and d1 ∈ / N DV Ck (V, A). Then d1 ∇a d2 = d, where d ∈ N DV C(V, A) is defined by its values on names u ∈ V +: – d(u) = d2 (u), if u ∈ dom(d2 ) and u does not have a proper prefix which belongs to dom(d1 ). – d(u) = d1 (u)∇a (d2 /u), if d1 (u) ↓ and d1 (u) ∈ / A, and u is a proper prefix of some element of dom(d2 ); – d(u) = d2 /u, if d1 (u) ↓ and d1 (u) ∈ A, and u is a proper prefix of some element of dom(d2 ); – d(u) = d1 (u), if d1 (u) ↓ and u is not comparable (in the sense of prefix relation) with any element of dom(d2 ); – d(u) ↑, otherwise. The following examples illustrate this operation: 1. [u 7→ d1 ]∇a [v 7→ d2 ] = [u 7→ d1 , v 7→ d2 ], if u, v are incomparable in the sense of prefix relation; 2. [uv 7→ d1 ]∇a [u 7→ d2 ] = [u 7→ d2 ], i.e. a value under a name in the sec- ond argument overwrites a value under extension of this name in the first argument; 3. [u 7→ d1 ]∇a [uv 7→ d2 ] = [u 7→ (d1 ∇s [v 7→ d2 ])] (d1 ∈ / A), i.e., a value under a name in the second argument modifies values under prefixes of this name in the first argument. Definition 4. Local overlapping is an operation p ∇va : N DV C(V, A) × N DV C(V, A) −→ N DV C(V, A) with a parameter v ∈ V + such that d 1 ∇v d 2 ∼ = d1 ∇a (⇒ v(d2 )). a For example, [u 7→ 1]∇va [w 7→ 2] = [u 7→ 1, v 7→ [w 7→ 2]]. Definition 5. An algebra of nominative data of type T N DCC (denoted as N DA(V, A)) is an algebra with the carrier N DV C(V, A) and the operations {⇒ v}v∈V + , {v ⇒a }v∈V + , {∇va }v∈V + . 3 Semantics of Extended Floyd-Hoare Logic p Let Bool = {F, T } denote the set of Boolean values, P rV,A = N DV C(V, A) −→ Bool. The elements of P rV,A are called partial nominative predicates. They can be used to represent semantics of conditions in programs. p Let F P rg V,A = N DV C(V, A) −→ N DV C(V, A). The elements of F P rg V,A are called binominative functions. They can be used to represent semantics of programs. Multi-sorted algebras on sets of partial nominative predicates and partial binominative functions can be used to define semantics of program logics [11, 18]. The operations of such algebras will be called compositions. There are many possible ways to define compositions that provide means to construct complex programs from simpler ones. We have chosen the following compositions to include them as basic to the logics of program level: – parametric assignment composition AS x which corresponds to assignment operator := ; – composition of identical program id which corresponds to the skip operator of the WHILE language; – composition of sequential execution •; – conditional composition IF which corresponds to the if-then-else operator; – cycle (loop) composition WH which corresponds to the while-do operator. We also need compositions that provide the possibility to construct predicates describing properties of programs. The main composition of this kind is the Floyd-Hoare composition FH. It takes a precondition, a postcondition, and a program as inputs and yields a predicate that represents respective Floyd-Hoare assertion. We will also define a composition of preimage predicate transformer inspired by weakest precondition introduced by Dijkstra [19]. Let us give definitions of the mentioned compositions. Definition 6. Disjunction is a binary composition t ∨ : P rV,A × P rV,A −→ P rV,A such that for all p, q ∈ P rV,A and d ∈ N DV C(V, A):   T, if p(d) ↓= T or q(d) ↓= T, (p ∨ q)(d) = F, if p(d) ↓= F and q(d) ↓= F,  undefined in other cases. Definition 7. Negation is a unary composition t ¬ : P rV,A −→ P rV,A such that for all p ∈ P rV,A and d ∈ N DV C(V, A):   F, if p(d) ↓= T, (¬p)(d) = T, if p(d) ↓= F,  undefined in other cases. We will consider conjunction p ∧ q of predicates p, q as an abbreviation for ¬(¬p ∨ ¬q). Definition 8. Existential quantification over hierarchical data is a unary com- position t ∃x : P rV,A −→ P rV,A with a parameter x ∈ V + such that for all p ∈ P rV,A and d ∈ N DV C(V, A):   T, if p(d∇xa d0 ) ↓= T for some d0 ∈ N DV C(V, A), (∃x p)(d) = F, if p(d∇xa d0 ) ↓= F for all d0 ∈ N DV C(V, A),  undefined in other cases. ˉn (V ) the set of all tuples (x1 , ..., xn ) ∈ For each n = 1, 2, 3, ... denote by U (V + )n of n complex names such that x1 , x2 , ..., xn are pairwise incomparable in the sense of prefix relation ≤. S Also, let us denote Uˉ (V ) = ∞ U ˉ n=1 n (V ). Definition 9. For each n = 1, 2, 3, ..., superposition of n functions into a func- tion is a n+1-ary composition t SFxˉ : (F P rg V,A )n+1 −→ F P rg V,A with a parameter x ˉn (V ) such that for all f, g1 , ..., gn ∈ ˉ = (x1 , ..., xn ) ∈ U F P rg V,A and d ∈ N DV C(V, A): SFxˉ (f, g1 , ..., gn )(d) ∼ = f (d∇a [x1 7→ g1 (d), ..., xn 7→ gn (d)]). Definition 10. For each n = 1, 2, 3, ..., superposition of n functions into a pred- icate is a n+1-ary composition t SPxˉ : P rV,A × (F P rg V,A )n −→ P rV,A ˉn (V ) such that for all p ∈ P rV,A , g1 , ..., gn ∈ ˉ = (x1 , ..., xn ) ∈ U with a parameter x V,A F P rg , and d ∈ N DV C(V, A): SPxˉ (p, g1 , ..., gn )(d) ∼ = p(d∇a [x1 7→ g1 (d), ..., xn 7→ gn (d)]). Definition 11. Denomination is a null-ary composition 0 x : F P rg V,A with a parameter x ∈ V + such that for each d ∈ N DV C(V, A): 0 x(d) ∼ = x ⇒a (d). Definition 12. Assignment over hierarchical data is a composition t AS x : F P rg V,A −→ F P rg V,A with a parameter x ∈ V + such that for each f ∈ F P rg V,A and d ∈ N DV C(V, A): AS x (f )(d) ∼ = d∇xa f (d). Definition 13. Identity program composition is a null-ary composition id : F P rg V,A such that for each d ∈ N DV C(V, A): id(d) = d. Definition 14. Sequential execution is a binary composition t • : F P rg V,A × F P rg V,A −→ F P rg V,A such that for all f, g ∈ F P rg V,A and d ∈ N DV C(V, A): (f • g)(d) ∼ = g(f (d)). Definition 15. Branching is a ternary composition t IF : P rV,A × F P rg V,A × F P rg V,A −→ F P rg V,A such that for all r ∈ P rV,A (condition), f, g ∈ F P rg V,A (branches bodies), and d ∈ N DV C(V, A):   f (d), if r(d) ↓= T and f (d) ↓, IF (r, f, g)(d) = g(d), if r(d) ↓= F and g(d) ↓,  undefined in other cases. Definition 16. While cycle is a binary composition t W H : P rV,A × F P rg V,A −→ F P rg V,A such that for each r ∈ P rV,A (condition), f ∈ F P rg V,A (loop body), and d ∈ N DV C(V, A): W H(p, f )(d) ↓= f (n) (d), if there exists an integer n ≥ 0 such that p(f (i) (d)) ↓= T for all i = 0, 1, ..., n − 1 and p(f (n) (d)) ↓= F , where f (i) denotes f • f • ... • f and f (0) = id; | {z } i and W H(p, f )(d) ↑, otherwise. Definition 17. Monotone Floyd-Hoare composition t F H : P rV,A × F P rg V,A × P rV,A −→ P rV,A is a composition such that for all p, q ∈ P rV,A (pre- and postcondition), f ∈ F P rg V,A (program), and d ∈ N DV C(V, A):   T, if p(d) ↓= F or q(f (d)) ↓= T, F H(p, f, q)(d) = F, if p(d) ↓= T and q(f (d)) ↓= F,  undefined in other cases. Definition 18. Predicate transformer composition is a binary composition t P C : F P rg V,A × P rV,A −→ P rV,A such that for all q ∈ P rV,A , f ∈ F P rg V,A , and d ∈ N DV C(V, A):   T, if f (d) ↓ and q(f (d)) ↓= T, P C(f, q)(d) = F, if f (d) ↓ and q(f (d)) ↓= F,  undefined in other cases. Predicate transformer composition is the same as Glushkov prediction op- eration (sequential execution of a function and a predicate) [10]. We call this composition (defined for partial predicates) as preimage predicate transformer composition in order to relate it to the weakest precondition predicate trans- former [19]. Note that F H(p, pr, q) = p → P C(pr, q). The Floyd-Hoare composition is called monotone, because it satisfies the following property, as was shown in [8]: p ⊆ p0 , q ⊆ q 0 , f ⊆ f 0 ⇒ F H(p, f, q) ⊆ F H(p0 , f 0 , q 0 ), where inclusion ⊆ is understood as inclusion of the graphs of functions and predicates. Definition 19. An (associative) nominative program algebra N P A(V, A) is a two-sorted algebra < P r V,A , F P rg V,A ; ∨, ¬, {∃x}x∈V + , {SPxˉ }xˉ∈Uˉ (V ) , {SFxˉ }xˉ∈Uˉ (V ) , {0 x}x∈V + , id, {AS x }x∈V + , •, IF, W H, F H, P C > . This algebra is the semantic base of our extension of Floyd-Hoare logic to partial predicates and (hierarchical) nominative data with complex names and complex values. 4 Syntax and Interpretation Terms of the algebra N P A(V, A) defined over some sets of predicate symbols P s and program symbols P rs specify the syntax (language) of the logic. Let us give definitions of the sets of program texts P tV,P s,P rs , formulas F rV,P s,P rs , and Floyd-Hoare assertions F HF r V,P s,P rs . The sets P tV,P s,P rs and F rV,P s,P rs are defined inductively (here we use the symbols of compositions in the purely syntactic sense, i.e. they are currently not associated with semantics): 1. if pr ∈ P rs, then pr ∈ P tV,P s,P rs ; 2. if n ≥ 1, pr, pr1 , ..., prn ∈ P rs, and x ˉn (V ), then ˉ∈U SFxˉ (pr, pr1 , ..., prn ) ∈P tV,P s,P rs ; 3. if x ∈ V + and pr ∈ P rs, then AS x (pr) ∈P tV,P s,P rs ; 4. 0 x ∈ P tV,P s,P rs for each x ∈ V + and id ∈ P tV,P s,P rs ; 5. if pr1 , pr2 ∈P tV,P s,P rs , then pr1 • pr2 ∈P tV,P s,P rs ; 6. if Φ ∈ F r V,P s,P rs and pr1 , pr2 ∈P tV,P s,P rs , then IF (Φ, pr1 , pr2 ) ∈P tV,P s,P rs ; 7. if Φ ∈ F rV,P s,P rs and pr ∈P tV,P s,P rs , then W H(Φ, pr) ∈P tV,P s,P rs ; 8. if P ∈ P s, then P ∈F rV,P s,P rs ; 9. if Φ, Ψ ∈F rV,P s,P rs , then Φ ∨ Ψ, Φ ∧ Ψ ∈F rV,P s,P rs ; 10. if Φ ∈F rV,P s,P rs , then ¬Φ ∈F rV,P s,P rs ; 11. if n ≥ 1, Φ ∈F rV,P s,P rs , pr1 , ..., prn ∈ P rs, and x ˉ∈U ˉn (V ), x ˉ V,P s,P rs then SP (Φ, pr1 , ..., prn ) ∈F r ; 12. if Φ ∈F rV,P s,P rs and v ∈ V + , then ∃vΦ ∈F rV,P s,P rs . The set F HF r V,P s,P rs is the set of all formulas of the form {p}f {q}, where p, q ∈ F rV,P s,P rs and f ∈ P tV,P s,P rs . Definition 20. Let V be a set of basic names, P s be a set of predicate symbols, P rs be a set of program symbols, and A be an arbitrary set. Then an interpre- t tation J is a triple (N DA(V, A), IP s , IP rs ), where IP s : P s −→ P rV,A is an t interpretation mapping for predicate symbols and IP rs : P rs −→ F P rg V,A is an interpretation mapping for program symbols. For any interpretation J = (N DA(V, A), IP s , IP rs ) we will denote by JF r and JP t the formula and program text interpretation mappings t JF r : F rV,P s,P rs −→ P rV,A , t JP t : P tV,P s,P rs −→ F P rg V,A which are the standard extensions of IP s and IP rs to F rV,P s,P rs and P tV,P s,P rs respectively (defined by structural induction). Also, we will denote by JF HF r the t interpretation mapping of Floyd-Hoare assertions JF HF r : F HF r V,P s,P rs −→ P rV,A defined as follows: JF HF r ({p}f {q}) = F H(JF r (p), JP t (f ), JF r (q)). In this paper we will not define interpretations explicitly expecting that they are clear from the context. For any P ∈F rV,P s,P rs or P ∈ F HF r V,P s,P rs we will denote by PJ or (P )J the predicate that corresponds to P under interpretation J. We will omit the index J when it is clear from the context. We will use the following notation for any predicate p: pT = {d | p(d) ↓= T } is the truth domain of a predicate p; pF = {d | p(d) ↓= F } is the falsity domain p. Definition 21. A formula P ∈F rV,P s,P rs or a Floyd-Hoare assertion P ∈F HF rV,P s,P rs is valid (irrefutable) in an interpretation J (denoted as J |= P ), if PJF = ∅. Definition 22. A formula P ∈F rV,P s,P rs or a Floyd-Hoare assertion P ∈F HF rV,P s,P rs is valid (denoted as |= P ), if it is valid in every interpretation. Let us define the logical consequence relation |= ⊆ F rV,P s,P rs ×F rV,P s,P rs : p |= q ⇔ |= p → q, where p → q means ¬p ∨ q for any p, q ∈F rV,P s,P rs . We will also need the following special logical consequence relations |=T , |=F ⊆ F rV,P s,P rs × F rV,P s,P rs such that – p |=T q ⇔ pTJ ⊆ qJT for every interpretation J; – p |=F q ⇔ qJF ⊆ pF J for every interpretation J. 5 Inference System for a Floyd-Hoare Logic with Partial Predicates over Nominative data To make the program logic which we have defined applicable to software verifi- cation problems it is necessary to present an inference system. Such an inference system could be based on the inference system for the classical Floyd-Hoare logic with total predicates for the language WHILE [12], but it is known be unsound in the case of partial predicates [9] which is considered in the paper. For this reason additional constraints need to be added to achieve a sound inference system. We will write ` X p to denote that a formula p is derived in some inference system X. An inference system X is sound, if ` X p ⇒ |= p for each formula p, and is complete, if |= p ⇒ ` X p for each p. Completeness can be treated in extensional or intensional approaches. For extensional completeness [9] pre- and postconditions can be arbitrary predicates. Intensional completeness requires that pre- and postconditions are presented by formulas in a given language. The classical inference system for the language WHILE [12] can be presented in semantic form as follows (x ∈ V ): R AS {S x (p, h)} AS x (h) {p} R SKIP {p} id {p} P {p} f {q}, {q} g {r} {r ∧ p} f {q}, {¬r ∧ p} g {q} R SEQ R IF {p} f • g {r} {p} IF (r, f, g) {q} {r ∧ p} f {p} {p0 } f {q 0 } R WH R CONS p → p0 , q 0 → q {p} W H(r, f ) {¬r ∧ p} {p} f {q} This inference system is sound and extensionally complete for total predi- cates, but for partial predicates it is not sound [9], because rules R SEQ, R WH, and R CONS do not guarantee a valid derivation from valid premises. As one of solutions we propose the following inference system AC with added constraints (x ∈ V + , n ≥ 1, x ˉ∈U ˉn ): R AS’ {S x (p, h)} AS x (h) {p} P R SFID’ {S xˉ (p, g , ..., g )}S xˉ (id, g , ..., g ){p} P 1 n F 1 n {p}SFxˉ (id, g1 , ..., gn ) • f {q} R SF’ {p}SFxˉ (f, g1 , ..., gn ){q} R SKIP’ {p} id {p} {p} f {q}, {q} g {r} R SEQ’ , p |= P C(f • g, r) {p} f • g {r} {r ∧ p} f {q}, {¬r ∧ p} g {q} R IF’ {p} IF (r, f, g) {q} {r ∧ p} f {p} R WH’ {p} W H(r, f ) {¬r ∧ p} , p |= P C(W H(r, f ), ¬r ∧ p) {p0 } f {q 0 } 0 0 R CONS’ {p} f {q} , p |=T p , q |=F q In this system constraints are presented via consequence relations |=, |=T , |=F . These relations can be substituted by the corresponding inference relations (`, `T , `F ), but this question is outside the scope of the paper and we will not go into detail here. Theorem 1. ` AC {P C(pr, q)}pr{q} for each pr and q. Proof (presented for some arbitrary interpretation J, but for simplicity sake we omit this J from the text of this proof ). Let us use induction over the structure of pr. Base of induction. Let us prove that ` AC {P C(AS x (h), q)}AS x (h){q}. By the rule R AS’, it is sufficient to show that P C(AS x (h), q) = SPx (q, h). Using the definition of P C and assignment it is easy to check that P C(AS x (h), q)(d) = q(d∇xa h(d)). Moreover, we have: SPx (q, h) = q(d∇a [x 7→ h(d)]) = q(d∇xa h(d)) = P C(AS x (h), q). Thus ` AC {P C(AS x (h), q)}AS x (h){q}. For ` AC {P C(id, q)}id{q} the proof is obvious. Inductive step. 1. The case when pr is SFxˉ (f, g1 , ..., gn ) is straightforward. 2. Consider the case of sequential execution. Given the premises `AC {P C(pr1 , P C(pr2 , q))}pr1 {P C(pr2 , q)} , `AC {P C(pr2 , q)}pr2 {q} , the required derivation `AC {P C(pr1 • pr2 , q)}pr1 • pr2 {q} can be proved in the same way as in the proof of the sequential execution case in [9, Theorem 4.2], so we omit a detailed proof here. 3. Consider the branching composition. Assume that `AC {P C(pr1 , q)}pr1 {q}, `AC {P C(pr2 , q)}pr2 {q}. Let us prove that `AC {P C(IF (r, pr1 , pr2 ), q)}IF (r, pr1 , pr2 ){q}. Let us prove ¬r ∧ P C(IF (r, pr‘1 , pr2 ), q) |=T P C(pr2 , q). If d ∈ (¬r∧P C(IF (r, pr1 , pr2 ), q))T , then we have r(d) ↓= F , IF (r, pr1 , pr2 )(d) ↓, and q(IF (r, pr1 , pr2 )(d)) ↓= T . Then IF (r, pr1 , pr2 )(d) ↓= pr2 (d). Thus pr2 (d) ↓ holds and q(pr2 (d)) ↓= T , so P C(pr2 , q)(d) ↓= T , so d ∈ P C(pr2 , q)T . Thus ¬r ∧ P C(IF (r, pr1 , pr2 ), q) |=T P C(pr2 , q). Similarly, r ∧ P C(IF (r, pr1 , pr2 ), q) |=T P C(pr1 , q). Then by R CONS’ `AC {r ∧ P C(IF (r, pr1 , pr2 ), q)}pr1 {q} , `AC {¬r ∧ P C(IF (r, pr1 , pr2 ), q)}pr2 {q} . Then using R IF we conclude that `AC {P C(IF (r, pr1 , pr2 ), q)}IF (r, pr1 , pr2 ){q} . 4. Let us prove `AC {P C(W H(r, pr), q)}W H(r, pr){q} for each pr, q, r. Let p = P C(W H(r, pr), q). Let p0 be a predicate such that: p0 (d) = T , if p(d) ↓= T and p0 (d) = F otherwise. We have `AC {P C(pr, p0 )}pr{p0 } . Let us show that r ∧ p0 |=T P C(pr, p0 ). Note that pT = p0T , whence p |=T p0 . Assume that (r ∧ p0 )(d) ↓= T . Then p0 (d) ↓= p(d) = T , whence q(W H(r, pr)(d)) ↓= T . Moreover, r(d) ↓= T , so there exists d0 = pr(d) such that W H(r, pr)(d0 ) ↓= W H(r, pr)(d) and q(W H(r, pr)(d0 )) ↓= T . Then p0 (d0 ) ↓= p(d0 ) = P C(W H(r, pr), q)(d0 ) = T. From p0 (d0 ) ↓= T and d0 = pr(d) we have P C(pr, p0 )(d) ↓= T . We conclude that r ∧ p0 |=T P C(pr, p0 ). Let us prove that ¬r ∧ p0 |=F q. Assume that q(d) ↓= F . If r(d) ↓= T then (¬r ∧ p0 )(d) ↓= F . If r(d) ↓= F , then p(d) = P C(W H(r, pr), q)(d) ↓= F , so p0 (d) = p(d) ↓= F and (¬r ∧ p0 )(d) ↓= F . If r(d) ↑, then p(d) = P C(W H(r, pr), q)(d) ↑, whence p0 (d) ↓= F and (¬r ∧ p0 )(d) ↓= F . In all cases, (¬r ∧ p0 )(d) ↓= F . Thus ¬r ∧ p0 |=F q. Let us show that p0 |=T P C(W H(r, pr), ¬r ∧ p0 ). Assume that p0 (d) ↓= T . Then p(d) ↓= T . Then there exists d0 such that W H(r, pr)(d) ↓= d0 and q(d0 ) = T , because p = P C(W H(r, pr), q). By the definition of W H, r(d0 ) ↓= F and W H(r, pr)(d0 ) ↓= d0 . Because W H(r, pr)(d0 ) ↓= d0 and q(d0 ) = T , we have p(d0 ) ↓= T = P C(W H(r, pr), q)(d0 ). Then p0 (d0 ) ↓= T and (¬r ∧ p0 )(d0 ) ↓= T . Then W H(r, pr)(d) ↓= d0 and P C(W H(r, pr), ¬r ∧ p0 )(d) ↓= T . We conclude that p0 |=T P C(W H(r, pr), ¬r ∧ p0 ). Then p0 |= P C(W H(r, pr), ¬r ∧p0 ). Moreover, from r ∧p0 |=T P C(pr, p0 ) and R CONS’, `AC {r ∧ p0 }pr{p0 } , so `AC {p0 }W H(r, pr){¬r ∧ p0 } by the rule R WH’. Because ¬r ∧ p0 |=F q and p |=T p0 , we have `AC {p}W H(r, pr){q} by R CONS’. Thus `AC {P C(W H(r, pr), q)}W H(r, pr){q} . We have considered all compositions, so, taking into account that an inter- pretation J was chosen arbitrary, we have that ` AC {P C(pr, q)}pr{q} for each pr and q. Theorem 2. The inference system AC is sound. This theorem can be proved analogously to [8, Theorem 4]. The completeness problems (extensional and intensional) are more difficult. In this case it is rea- sonable to identify different classes of predicates (similarly to [8, 9]) and to prove completeness for such classes of predicates. In this case Theorem 1 is used. 6 Towards Formalization of Extended Floyd-Hoare Logic in Mizar We proposed a formalization of nominative data in Mizar in [20]. In the men- tioned work different types of nominative data were defined as Mizar modes with set parameters V and A which meant the sets of basic names and atomic val- ues respectively. We can use this formalization as a basis for formalization of the extended Floyd-Hoare logic for programs over nominative data of the types T N DCC and T N DSC . 1) Define the mode of binominative functions over nominative data of the type T N DSC (please refer to [20] for the definition of TypeSCNominativeData mentioned below): reserve V for set; reserve A for set; reserve D for TypeSCNominativeData of V, A; definition let V, A, D; mode TypeSCBinominativeFunction of D is PartFunc of D, D; end; and, similarly, the mode of binominative functions over nominative data of the type T N DCC . This gives us a formalization of F P rg V,A defined in Section 3. 2) Analogously define the modes of partial predicates over nominative data of the types T N DSC , T N DCC over V, A (P rV,A ). 3) Formalize P tV,P s,P rs (program texts), F rV,P s,P rs (formulas), F HF r V,P s,P rs (and Floyd-Hoare assertions) as Mizar modes with parameters V , P s, P rs. 4) Formalize the interpretation mapping in accordance with Definition 20. 5) Formalize the relation of validity in an interpretation in accordance with Definition 21 and the relation of validity in accordance with Definition 22. 6) Formalize the logical consequence relation p |= q ⇔|= p → q for p, q ∈F rV,P s,P rs and the special logical consequence relations p |=T q ⇔ pTJ ⊆ qJT for any interpretation J, and p |=F q ⇔ qJF ⊆ pF J for any interpretation J. 7) Formulate the inference rules of the AC inference system described in Section 5 as Mizar schemes and prove their semantic validity. Finally, the proven inference rules can be used to prove semantics properties of programs defined by concrete program texts (elements of P tV,P s,P rs ). 7 Related Work Logical approaches to program specification and reasoning about program prop- erties were used in the works by R. Floyd [1] and T. Hoare [2]. These approaches were based on axiomatic systems with total predicates and used triples of a precondition, program, and a postcondition. Later it became evident that par- tiality of predicates and programs needs to be taken into account which gave rise to three-valued logics which represented undefinedness of a predicate by a spe- cial third value. In particular, such logics were studied by Lukasiewicz, Kleene, Bochvar and others. At the same time many other extensions of the Floyd-Hoare logic were proposed as a basis for program verification, including Dynamic logic and Separation logic. In Dynamic logic [21] special modalities that allow usage of program texts and specifications alongside are used. A Floyd-Hoare assertion {p}pr{q} can be replaced with a formula p → [pr]q, where [pr]q indicates that if a program pr terminates, then necessarily holds q. For deterministic programs, [pr]q is equal to our preimage composition which also allows use of specifications and program texts together. Separation logic [22] was introduced to deal with widespread usage of heap and pointers in programming. This logic has special means for specifying heap properties. But only a heap function that maps mem- ory addresses to values is assumed to be partial. In other aspects only total predicates are considered. The ways of dealing with partiality in current software specification languages (VDM, RSL, Z, etc.) are described in [4]. Most approaches use either a many- valued logic or underspecification for dealing with partiality. 8 Conclusions We have proposed an extension of the Floyd-Hoare logic for the case of partial conditions and programs on hierarchically organized data called nominative data. Such data can be used to conveniently represent many data structures used in programming. We have proposed an inference system for our extended Floyd- Hoare logic and investigated its soundness and extensional completeness and propose an approach to its formalization in the Mizar system. In the future works we plan to implement the proposed approach and apply the results to software verification tasks. References 1. Floyd, R.: Assigning meanings to programs. Mathematical aspects of computer science 19 (1967) 2. Hoare, C.: An axiomatic basis for computer programming. Commun. ACM 12 (1969) 576–580 3. Apt, K.: Ten years of Hoare’s logic: A survey – part I. ACM Trans. Program. Lang. Syst. 3 (1981) 431–483 4. Hähnle, R.: Many-valued logic, partiality, and abstraction in formal specification languages. Logic Journal of the IGPL 13 (2005) 415–433 5. Jones, C.: Reasoning about partial functions in the formal development of pro- grams. In: Proceedings of AVoCS’05. Volume 145., Elsevier, Electronic Notes in Theoretical Computer Science (2006) 3–25 6. Gries, D., Schneider, F.: Avoiding the undefined by underspecification. Technical report, Ithaca, NY, USA (1995) 7. Duzi, M.: Do we have to deal with partiality? Miscellanea Logica 5 (2003) 45–76 8. Kryvolap, A., Nikitchenko, M., Schreiner, W.: Extending Floyd-Hoare logic for partial pre- and postconditions. In Ermolayev, V., Mayr, H., Nikitchenko, M., Spi- vakovsky, A., Zholtkevych, G., eds.: Information and Communication Technologies in Education, Research, and Industrial Applications. Volume 412 of Communica- tions in Computer and Information Science. Springer International Publishing (2013) 355–378 9. Nikitchenko, M., Kryvolap, A.: Properties of inference systems for Floyd-Hoare logic with partial predicates. Acta Electrotechnica et Informatica 13 (2013) 70–78 10. Skobelev, V., Nikitchenko, M., Ivanov, I.: On algebraic properties of nominative data and functions. In Ermolayev, V., Mayr, H., Nikitchenko, M., Spivakovsky, A., Zholtkevych, G., eds.: Information and Communication Technologies in Ed- ucation, Research, and Industrial Applications. Volume 469 of Communications in Computer and Information Science. Springer International Publishing (2014) 117–138 11. Nikitchenko, M., Shkilniak, S.: Mathematical logic and theory of algorithms. Publishing house of Taras Shevchenko National University of Kyiv, Ukraine (in Ukrainian) (2008) 12. Nielson, H., Nielson, F.: Semantics with applications – a formal introduction. Wiley professional computing. Wiley (1992) 13. Wiedijk, F.: The seventeen provers of the world. Foreword by Dana S. Scott. Volume 3600 of Lecture Notes in Artificial Intelligence. Springer-Verlag Berlin Heidelberg (2006) 14. Gordon, M.: Mechanizing programming logics in higher order logic. In Birtwistle, G., Subrahmanyam, P., eds.: In Current Trends in Hardware Verification and Au- tomated Theorem Proving. Springer-Verlag (1989) 15. Von Wright, J., Hekanaho, J., Luostarinen, P., Langbacka, T.: Mechanizing some advanced refinement concepts. Formal Methods in System Design (1993) 49–81 16. Bancerek, G., Byliński, C., Grabowski, A., Kornilowicz, A., Matuszewski, R., Nau- mowicz, A., Pa̧k, K., Urban, J.: Mizar: State-of-the-art and beyond. Volume 9150 of Lecture Notes in Computer Science. Springer (2015) 261–279 17. Grabowski, A., Kornilowicz, A., Naumowicz, A.: Four decades of Mizar. Journal of Automated Reasoning 55 (2015) 191–198 18. Nikitchenko, M., Tymofieiev, V.: Satisfiability in composition-nominative logics. Central Europ. J. Computer Science 2 (2012) 194–213 19. Dijkstra, E.: A Discipline of Programming. 1st edn. Prentice Hall PTR, Upper Saddle River, NJ, USA (1997) 20. Ivanov, I., Kornilowicz, A., Nikitchenko, M.: Formalization of nominative data in Mizar. Proceedings of TAAPSD 2015, 23-26 December 2015, Taras Shevchenko National University of Kyiv, Ukraine (2015) 82–85 21. Harel, D., Tiuryn, J., Kozen, D.: Dynamic Logic. MIT Press, Cambridge, MA, USA (2000) 22. Sannella, D., Tarlecki, A.: Foundations of Algebraic Specification and Formal Software Development. Monographs in Theoretical Computer Science. An EATCS Series. Springer (2012)