<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Textbook Mathematics in the Naproche-SAD System</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Peter Koepke</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Mathematical Institute, University of Bonn</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>8</fpage>
      <lpage>12</lpage>
      <abstract>
        <p>We report on the Naproche-SAD project (Natural Proof Checking System for Automated Deduction) as part of a wider programme of Natural Formal Mathematics. After recalling the classical SAD system with its natural proof language ForTheL (Formula Theory Language), we describe its further development into Naproche-SAD, and present snippets from recent formalizations of textbook material.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Theorem. The set of prime numbers is infinite.</p>
      <p>Proof. Let A be a finite set of prime numbers. Take a function p and a number r such that p lists A in r steps.
ranp N+. Qir=1 pi 6= 0. Take n = Qir=1 pi + 1. n is nontrivial. Take a prime divisor q of n. Let us show that
q is not an element of A. Assume the contrary. Take i such that (1 i r and q = pi). pi divides Qir=1 pi (by
MultProd). Then q divides 1 (by DivMin). Contradiction. qed.</p>
      <p>Hence A is not the set of prime numbers.</p>
      <p>This text is written in the ForTheL language of SAD as available for download and through a web interface
at [LVP]. ForTheL supports the Definition - Theorem - Proof structure typical of modern mathematics. The
formulation uses complete and grammatical English sentences interwoven with symbolic phrases. Texts are
parsed into an annotated first-order presentation for further processing and checking.</p>
      <p>The example illustrates the use of notions in ForTheL to achieve a soft typing comparable to typing by
nouns in natural languages. A notion (noun) like number is introduced by a Signature command. Notions can
be modified by predicates (adjectives): we define the predicate is prime and obtain a modified notion prime
number. ForTheL provides standard constructors of complicated terms like set of. The combinations of such
mechanisms enable elegant variable-free formulations of first-order statements like The set of prime numbers is
infinite.</p>
      <p>Further mechanisms familiar from mathematical texts are provided. To avoid repetitive type declarations,
variables can be pretyped. Linguistic and symbolic patterns for terms and predicates can be introduced freely,
as long as the parser is able to identify the positions of variables. On the other hand there is a wide scope for
further extensions and improvements of the original ForTheL language; a phrase like “ p lists A in r steps”, e.g.,
should be replaced by more familiar constructs like A = fp1; : : : ; prg.</p>
      <p>Naproche-SAD. The Naproche project (Natural Proof Checking) at the University of Bonn pursues similar
aims as EA and SAD, with particular emphasis on a linguistically correct input grammar. In his PhD work
Marcos Cramer [Cra13] built the Naproche proof checker in which he successfully formalized some initial part of
Landau’s Grundlagen.</p>
      <p>Andrei Paskevich and Marcos Cramer did not continue this work after about 2013. In 2017, in accordance
and consultation with Andrei Paskevich, the Naproche project has adopted the SAD system, aiming to further
its efficiency and coverage. The ForTheL language has been extended by several frequently used constructs,
and the software has been partially rewritten, extended and annotated in the Master project of Steffen Frerix
(see [FK18]). The Naproche-SAD system is now able to efficiently check some chapter-sized texts at the level of
beginning university mathematics.</p>
      <p>The checking of ForTheL input texts proceeds as in SAD: after parsing, a reasoner module sequentially
processes the statements of the text within the context of previous statements. First there is a weak type
checking (“ontological checking”) to see whether the typing presuppositions of each application of a term or a
predicate are satisfied; since terms can be substituted into other terms etc. the number of these checks can be
high. Thereafter the logical check is performed. All checks are first attempted by the reasoner; if this fails they
are exported to an external first-order ATP (Automatic Theorem Prover); if the external prover fails there are
a number of retries with successively stronger premisses. The specific organisation of these processes is decisive
for the correctness and efficiency of the proving. Various intermediate proof results about soft types are cached
for later use.</p>
      <p>Naproche-SAD is written in Haskell. It uses E prover as its ATP, but other first-order provers could easily be
attached instead. There is a LATEX filter which accepts text written in LATEX and transforms them to ForTheL
for checking. Naproche-SAD has now been integrated into the jedit PIDE (Proof Integrated Development
Environment) known from Isabelle to support interactive formalizations. The code is freely available on Github
[Nap18].</p>
      <p>Set Theory. Since the language and techniques of set theory are omnipresent in modern pure mathematics,
a comprehensive natural formalization project must include basic set theory. A wide-ranging formation of
abstraction terms fx j 'g is at the heart of set theory, so that mechanisms for the translation of such collections
into first-order logic have to be provided. In SAD, abstraction terms can be formed with arbitrary ForTheL
formulas ', and are registered as sets. Thus simply writing the term fx j x 62 xg in SAD introduces the Russell
contradiction into a text. If fx j x 62 xg is registered as a set a then
In SAD, it is the author’s responsibility to avoid contradictory abstraction terms (as in many presentations of
naive set theory). In Naproche-SAD, we employ some ideas from class theory: we use the notions of class, object
and set ; abstraction terms are registered as classes whose elements are objects; sets are defined to be classes that
are objects.</p>
      <p>The notion of function is as basic as that of a set or class. So it is predefined in Naproche-SAD, and a
flexible mechanism for the definition of functions by cases and recursion is provided. We are in the process of
formalizing a standard course in set theory and have so far covered relations, functions, ordinals and cardinals.
The formalization of the crucial Zermelo Wellordering Theorem is close to the lecture notes.</p>
      <p>Theorem. For every set x there exist an ordinal and function f such that f : $ x.</p>
      <p>Proof. Let x be a set. Define
for in Ord.
(1) There exists such that F ( ) = x.</p>
      <p>Proof . Assume the contrary. ...</p>
      <p>F ( ) =
x; if x n F [ ] = ;
some v 2 x n F [ ]; if x n F [ ] has an element</p>
      <p>In the definition of F the \cases construct from LATEX is also accepted by Naproche-SAD. This construct
spawns further proof obligations, e.g., that the two cases don’t overlap and cover all possibilities. Also the axiom
of choice enters the definition by the choice of v in the second case, indicated by the word “some v”. Naproche-SAD
supports several forms of recursions; in this case, the definition of F ( ) recurs to the set F [ ] = fF ( ) j &lt; g
of “previous” values of the function which is justified by the strong wellfoundedness of the 2-relation.</p>
      <p>Principles of Mathematical Analysis by Walter Rudin. The Principles of Mathematical Analysis
[Rud86] have been a standard textbook for decades. In a practical Bachelor module we have formalized parts
of some initial chapters. To demonstrate the similarity between [Rud86] and our formalization let us first quote
the original Theorem 1.20(a) of [Rud86].</p>
      <p>Theorem (120a). (a) If x 2 R, y 2 R, and x &gt; 0, then there is a positive integer n such that</p>
      <p>nx &gt; y:
Proof. Let A be the set of all nx, where n runs through the positive integers. If (a) were false, then y would be
an upper bound of A. But then A has a least upper bound in R. Put = sup A. Since x &gt; 0, x &lt; , and
x is not an upper bound of A. Hence x &lt; mx for some positive integer m. But then &lt; (m + 1)x 2 A,
which is impossible, since is an upper bound of A.</p>
      <p>Here the de Bruijn factor is close to 1:</p>
      <p>Theorem (Naproche-SAD version). If x 2 R and y 2 R and x &gt; 0 then there is a positive integer n such that
n x &gt; y:
Proof. Define X = fn x j n is a positive integerg. Assume the contrary. Then y is an upper bound of X. Take
a least upper bound of X. x &lt; and x is not an upper bound of X. Take an element z of X such
that not z x. Take a positive integer m such that z = m x. Then x &lt; m x (by 15b).
= (</p>
      <p>x) + x &lt; (m x) + x = (m + 1) x:
(m + 1) x is an element of X. Contradiction. Indeed</p>
      <p>is an upper bound of X.</p>
      <p>The Appendix of General Topology by John L. Kelley. The well-known General Topology by Kelley
[Kel75] is written in a formalistic style which lends itself to formalizations. We are currently formalizing the
Appendix which introduces the Kelley-Morse theory of classes which is related to the foundational
class-objectset theory that we have implemented in Naproche-SAD. The Appendix is structured as a sequence of about
200 numbered axioms, definitions and theorems, most of them without proofs. Naproche-SAD together with E
prover is often able to find proofs automatically, so that we achieve the same proof granularity as the original
text. One could even consider extending ForTheL so that some parts of Kelley are formally correct ForTheL
texts.</p>
      <p>We present our formalization of Kelley’s version of the Kuratowski ordered pair. Kelley uses the universe U
of all sets as an indicator for “undefined". Later in the text the ordered pair will be treated as an undefined
basic notion with the axiomatic property stated in Theorem 55. Thereafter ordered pairs will be used to define
relations and functions.</p>
      <p>Ordered Pairs
Definition (48). (x; y) = ffxg; fx; ygg. Let the ordered pair of x and y stand for (x; y).</p>
      <p>Theorem (49a). (x; y) is a set iff x is a set and y is a set.</p>
      <p>Theorem (49b). If (x; y) is not a set then (x; y) = U .</p>
      <p>Theorem (50). If x and y are sets then S(x; y) = fx; yg and T(x; y) = fxg and S T(x; y) = x and T T(x; y) =
x and S S(x; y) = x [ y and T S(x; y) = x \ y.</p>
      <p>Theorem (Unnumbered statement). If x is not a set or y is not a set then S T(x; y) = 0 and T T(x; y) = U
and S S(x; y) = U and T S(x; y) = 0.</p>
      <p>Definition (51). 1stz = T T z. Let the first coordinate of z stand for 1stz.</p>
      <p>Definition (52). 2ndz = (T S z) [ ((S S z)
Theorem (53). 2ndU = U .</p>
      <p>Theorem (54a). If x and y are sets then 1st(x; y) = x.</p>
      <p>Theorem (54b). If x and y are sets then 2nd(x; y) = y.</p>
      <p>S T z). Let the second coordinate of z stand for 2ndz.</p>
      <p>Proof. Let x and y be sets. 2nd(x; y) = (T S(x; y)) [ ((S S(x; y))
S T(x; y)) = (x \ y) [ ((x [ y)
x) = y.</p>
      <p>Theorem (54c). If x is not a set or y is not a set then 1st(x; y) = U and 2nd(x; y) = U .</p>
      <p>Theorem (55). If x and y are sets and (x; y) = (r; s) then x = r and y = s.</p>
      <p>Discussion and Further Plans. Our examples and experiences demonstrate that natural formalizations of
undergraduate mathematics in the familiar textbook style are realistically possible. The main components for a
successful system are a sophisticated model of the language of mathematics as implemented in ForTheL and its
parser, the efficient generation and organization of a large number of first-order proof obligations, and powerful
first-order ATPs. Much work is needed to expand the coverage of the language and to create interlinked libraries
of basic mathematical theories. The possibility of combining Naproche-SAD with big systems like Isabelle/ZF has
to be investigated. Currently we are continuing the development of Naproche-SAD and textbook formalizations
as above as Bachelor- and Mastertheses and practicals. Two Bachelor projects deal with the formalization of
standard set theory. A practical module is concerned with integrating the ForTheL language into LATEX and
translating from ForTheL to Lean.
[BR02]</p>
      <p>Grzegorz Bancerek and Piotr Rudnicki. A compendium of continuous lattices in MIZAR. J. Autom. Reasoning,
29(3-4):189–224, 2002.</p>
      <p>Marcos Cramer. Proof-checking mathematical texts in controlled natural language. PhD thesis, University of
Bonn, 2013.</p>
      <p>L. C. Paulson. The relative consistency of the axiom of choice - mechanized using Isabelle/ZF. LMS Journal
of Computation and Mathematics, 6:198–248, 2003.
[Rud86]
[vBJ77]</p>
      <p>W. Rudin. Principles of Mathematical Analysis. McGraw - Hill Book C., 1986.</p>
      <p>Lambert S. van Benthem Jutting. Checking Landau’s “Grundlagen” in the Automath system. PhD thesis,
Eindhoven, 1977.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>