A Meta Language for Mathematical Reasoning Michael Junk Stefan Hölle University of Constance University of Constance 78457 Konstanz, Germany 78457 Konstanz, Germany michael.junk@uni-konstanz.de stefan.hoelle@uni-konstanz.de Sebastian Sahli University of Constance 78457 Konstanz, Germany sebastian.sahli@uni-konstanz.de Abstract We present a formal system which establishes a meta-language for the description and the subsequent reasoning in general mathematical mod- els and theories. The basic expressions of the language act like functions and types at the same time. As crucial ingredient, the evaluation oper- ation is only admissible for compatible arguments. The initial function objects and base expressions allow list formation, restriction of func- tions, formulation of theorems, equality and existence statements. We will present examples of the expressiveness of the language, the basic syntactic structure and inferential rules. 1 Starting Point Our approach to formal mathematics was initiated by requirements arising from research projects and teaching in applied mathematics and scientific computing. The idea was to establish a formalism which allows to specify mathematical models intuitively and consistently in the whole range from word problems in elementary school up to full fledged models in industrial projects, while being easy to convey even to first year students. In the past four years, we have mainly concentrated on the design of syntactical elements and rules which are easily comprehensible for math students attending the preparatory course, the basic math lectures and the modeling lecture. With consistency, conciseness and student-acceptance playing the role of environmental pressure, an evolutionary process has led to the meta-language MATh as acronym for meta-language for models, algorithms and theories [1]. Now we are interested in discussions about similarities and differences to existing approaches and hope to contribute some ideas which resulted from our particular starting point. 2 Mathematical Models Our understanding of mathematical models is quite broad. In particular, we do not assume a particular set theory or logic from the beginning but rather consider these as mathematical models themselves. In general, a model (or theory according to [2]) comprises of a finite number of parameters which represent abstract mathematical objects whose meaning emerges from a list of axioms which describe their mutual relations. Copyright c by the paper’s authors. Copying permitted for private and academic purposes. In: O. Hasan, C. Kaliszyk, A. Naumowicz (eds.): Proceedings of the Workshop Formal Mathematics for Mathematicians (FMM), Hagenberg, Austria, 13-Aug-2018, published at http://ceur-ws.org Once the model assumptions are fixed, consequences are drawn from these assumptions according to precisely stated rules. While concluding from the axioms can be viewed as an interior aspect of a model, there are also exterior aspects because models may be used as building blocks within other models. As example, we consider the concept probability space as a mathematical model of uncertainty which we call pSpace here. It starts out with parameters Ω, F, P which represent a sample space Ω, a set F of events and a probability measure P . Important consequences are related, for example, to the notion random variable, expectation and independence which are defined within the model. Similarly, the model pSpace is a consequence of measure theory which belongs to the standard model of real numbers which again belongs to the consequences of set theory and classical logic. But pSpace is also used sideways in many specific models which deal with uncertainty. Simple textbook examples (coin-flipping, fair dice, lottery etc.) and numerous statistical models specify concrete sample spaces, event sets and probability measures which satisfy the axioms of pSpace once they replace Ω, F, P . By applying the model in this way, all its true statements translate into the more specific scenario and are ready to be used to derive other consequences. This aspect of using pSpace is very similar to applying a theorem in order to access its implication. However, it is quite likely that also the expectation functional E or the adjective independent defined in pSpace need to be translated to the specific scenario. To achieve this, one could think of pSpace as a class-constructor which assigns to an admissible triple (D, A, Q) an object S := pSpace(D, A, Q) which contains the specific versions of E and independent as named components, i.e. S.E and S.independent. Naturally, one would consider S of type pSpace in this case, i.e. S : pSpace. From this example we derive the important observation that (1) models are built within other models (proofs are required to ensure that the consequences are valid) and (2) models are used within other models and produce return values for that purpose (proofs are required to check the model assumptions for the arguments). To reflect these different purposes syntactically, we introduce so called frames which consist of a list of parameters followed by a list of axioms, a list of consequences and a resulting expression where each of these blocks may be empty. Using an exemplary grammar, a sketch of the pSpace-description may look like pSpace := (Omega, F, P) with (Omega, F, P) : measureSpace; P(Omega)=1; consequences rV(G with G:sigmaAlgebra) := X with X:measurable(F,G) end; E(X with X in L1(P)) := integral(X,P); prove forall X with X:bounded holds X in L1(P) directly ...qed .. . result (’randVar’ >> rV, ’Expectation’ >> E, ...); end Here the result is a record with field names randVar, Expectation and so on. It turns out, that the same syntactic structure can be used for various other purposes like definig plain functions tan := x with not(cos(x)=0) result sin(x)/cos(x) end Here, the consequences are often empty (unless well-definition cannot be determined automatically and needs proof steps from the user). Of course, one would generally recast this definition in a more intuitive form with the same content tan(x with not(cos(x)=0)) := sin(x)/cos(x) Since functions are special frames, the type-statement 0:tan is valid (being of type tan means being formed by tan, i.e. being in the range of tan). To access the argument condition of a frame, we use the expression Def(tan) which is equal to the frame D := x with not(cos(x)=0) end In this case, no result is provided which means implicitly that the frame parameter itself is used, i.e. D is an identity function. Also here, being of type D means being in the range of D, but since range and domain of definition coincide, it also means satisfying the frame condition. For this reason, frames returning simply the frame parameters take over the role of collections (like sets) and are used to define all standard mathematical notions, e.g. injective := f with forall x,y with f(x)=f(y) holds x=y end Altogether, models, theories, notions, functions, sets, classes and types can be summarized as frames which again are essentially functions enriched with statements derived from the assumptions on the arguments. Since every frame is defined within another frame, a hierarchical organization of mathematical content follows naturally. To initialize this hierarchy, the MATh-interpreter provides an initial context in which frames can be built based on rules which are presented in the following section. 3 Formal Aspects of MATh A MATh-text essentially describes a sequence of actions which modify a context represented in the MATh- interpreter. Such actions may be definitions, opening and closing of sub-contexts, claim checking, adding new objects and axioms etc. which generally take MATh-expressions as arguments. The context is essentially char- acterized by a list of parameters and axioms, a list of defined names with the expressions they point to and a list of true statements which have been deduced from the axioms. If a sub-context is opened, names and true statements are inherited, although local names may mask parent names. Expressions in MATh can be admissible and if they are, they can hold (i.e. be true). These attributes are controlled by corresponding rules which are realized as strategies within the MATh-interpreter. If statements are to be proved, references to these strategies are required (like in many proof assistants[4, 5]). Due to automatic strategy searches and recursive strategy calls, some sort of automatic theorem proving can be achieved, although this is not our primary concern. Currently, the interpreter is only a tool to check that the proposed language rules are practicable. It should be noted that MATh expects its input in terms of expression trees which are built by combining elementary expression data-structures of an intermediate language. In this way, the realization of the initial context is not tied to any comfortable syntax and can be kept rather slim. The actual human readable syntax of a full implementation will offer more constructs which are mapped to the base expressions after parsing (our current implementation uses the ANTLR parser[3] - the essential grammar description is online [1]). We will now introduce the base expressions of the intermediate language in a textual form (table 1). Table 1: Base Expressions: x can be replaced by a name or a list of pairwise different names and f,a,y,Ai ,Bj can be replaced by expressions map x->y evaluation f(a) list (A1 ,...,An ) condition x with A1 ;...;An end theorem forall x with A1 ;...;An hold B1 ;...;Bm end Initially admissible expressions are given by seven parameter names which we discuss in the sequel. indicator, Def, equal, example, list, restriction, frame The parameter indicator is used to realize the type relation x:T which is only a shorthand of indicator(T)(x). It parallels the idea of indicator functions 1U of sets U which evaluate to 1U (y) = 1 if and only if y ∈ U holds. In this spirit, we say that x is of type T if indicator(T)(x) holds. The parameter Def controls function evaluation in the sense that f(x) is admissible if f,x are admissible and if x:Def(f) holds. Since admissibility of f(x) depends on admissibility of three other evaluations in indicator(Def(f))(x), additional rules ensure that certain combinations of Def and indicator evaluations are admissible. The parameter equal realizes the equality expression x=y as abbreviation of x:equal(y). Rules ensure that arbitrary admissible expressions can be combined in equality statements. The expression example is used to formulate existence statements. Here the idea is that example assigns a witness to each non-empty type, i.e. rules ensure that example(T):T holds whenever it is admissible. With this usage in mind, Def(example) discriminates between empty and non-empty types so that T:Def(example) is reasonably abbreviated as exists T, meaning that examples of type T exist. Since the availability of such a choice function entails the axiom of choice if sets are modeled as special types, the explicit usage of example has to be suppressed if the axiom of choice is not desired. In this case, access is only possible in proof steps which allow working with witnesses of valid existence statements. Another usage of example is its restriction to unique types which gives rise to a definite description operator. This is important when working with types that possess exactly one example. A list like L:=(a,b) with admissible components is considered as a function whose arguments are 1,2 and whose corresponding values are a,b. In particular, strategies ensure that Def(L)=(1,2), L(1)=a, L(2)=b are true as well as and a:L, b:L. More generally, lists of length n are examples of type list(n) so that Def(list) can be viewed as the type of unsigned integers. In this way, n-ary functions are functions on subtypes of list(n) which can be naturally combined with the syntax convention that f(a,b,c) abbreviates f((a,b,c)). The base expression x->y is admissible if the expression y is partially admissible, meaning that the Def- constraints in building up y are neglected. Together with the rule, that u:Def(x->y) holds if y is admissible after replacing x by u, this allows the construction of partial functions. Next, we consider the base expression E := x with A1 ;...;An end which essentially describes an identity function and therefore acts set-like. It accepts a list of expressions A1 ;...;An and is admissible if all x->Ai are. Rules ensuring the admissibility of u:E (i.e. the truth of u:Def(indicator(E))) for certain types of u or E are typically formulated as axioms which have to be chosen carefully to avoid contradictions. We comment on this in the last section. Once the admissibility is assured, u:E holds iff each (x->Ai )(u) holds. In this case, E(u) is identical to u. Syntactically, we allow also definitions N:=z in the list A1 ;...;An . When forming the actual base expression, this is tranformed to z:admissible while the meta-information about N as name for z is extracted and used to update the state of the interpreter. With restriction, the domain of x->y can be modified. More specifically, for F:=restriction(D,R) the rules assure that u:Def(F) holds iff u:D and u:Def(R) are true with the result F(u)=R(u). In this way, general function definitions like those in the previous section are possible. For such general functions, the type relation y:F is equivalent to exists x with y=F(x) end. The base expression forall x with A1 ;...;An hold B1 ;...;Bm end introduces the theorem-concept in MATh. The statement is true if in a sub-context with additional parameter x and axioms A1 ;...;An the truth of B1 ;...;Bm can be shown (which is the direct proof strategy). With a modus ponens strategy, theorems can be applied. As example, the subtype relation A forall n with n:nat; n:X holds succ(n):X; forall X with 0:X; step(X) holds nat forall n with n:nat; X(n) holds X(succ(n)); forall X with X:formula; X(0); step(X) holds forall n with n:nat holds X(n); The way in which the Def-concept is introduced, obvious contradictions in the spirit of Russell’s antinomy are avoided. In a model where not is the classical negation function, a Russell-type expression would be R:= x with not(x:x) end which is admissible. However, there is no rule ensuring R:Def(indicator(R)) so that we don’t know whether R:R as abbreviation of indicator(R)(R) is admissible. In particular, the dangerous tertium-non- datur statement (R:R) or not(R:R) cannot be formed. If we assume that R:R is admissible, then the standard argument leads to a contradiction which finally proves that not(R:Def(indicator(R))) holds which means that R:R is not admissible. While the Def-concept avoids obvious contradictions it also blocks a lot of (seemingly) uncritical arguments. In particular, axioms on the structure of Def(indicator(T)) are needed to allow constructive conclusions of the form x:T. As example, we mention the axiom that x:T is admissible for any x if T is a list or a condition of the particular form x with x:A end. Similarly, the usual ZF set theory results, if the ZF set constructions are formulated as condition valued mappings and if x:T is postulated admissible for all x of type set. References [1] MATh-Homepage, http://www.math.uni-konstanz.de/mmath. (currently in German) [2] A. Tarski. Introduction to Logic and to the Methodology of Deductive Sciences. Oxford University Press, 1941. [3] T. Parr. The Definitive ANTLR 4 Reference. The Pragmatic Programmers, 2012. [4] Nuprl-Homepage, http://www.nuprl.org/. [5] Isabelle-Homepage, https://isabelle.in.tum.de/.