Leveraging Information Contained
                                in Theory Presentations

                                                  Yasmine Sharoda
                                               sharodym@mcmaster.ca
                              Supervisors: Jacques Carette and William Farmer
                               Computing and Software, McMaster University, Canada


    Building large libraries of mathematics is an active area of research in Mathematical Knowledge Management
(MKM). A library structures information using some unit of knowledge. Whether that unit is called a theory,
specification, module, locale or something else, we can abstractly see it as a way for encapsulating knowledge. We
want to study the units of encapsulated information in formal systems, abstracting over the language details. Our
abstraction consider units in which one can define sorts, typed symbols (optionally providing their definitions)
and axioms stating the properties of the defined sorts and symbols. We use the term theory presentation to refer
to this abstract unit. The theory of this abstraction has been well-studied under the name universal algebra
[5]. Our goal is to identify what knowledge can be automatically generated from theory presentations. We want
to provide a library of algorithms to generate this knowledge. This way, we provide tool support for library
development and formalization tasks by removing some of the boilerplate associated with defining constructs
that can be automatically generated. Not only does generating these new pieces save time and effort for the
developer, but, more importantly, leverages the information contained in the theory graph by establishing new
connections between the theories. A theory graph is a collection of theories and theory morphisms. We want to
be able to generate the new pieces of knowledge and the morphisms connecting them to the graph. We describe
our contributions as

  • Evaluating the status of current popular libraries of mathematics, in terms of modularity and reusability.

  • Compiling a list of constructions that can be automatically generated from the syntax of a theory presen-
    tation.

  • Building a library of generic algorithms for computing these constructions, given a theory written in an
    abstract description language. At this point, we only consider theory presentations in their most abstract
    format as a collection of sorts, function symbols and axioms written in some formal language.

  • Implementing generators that can produce these constructions for a specific theory in a specific language,
    taking into considerations all similarities and differences between different formal systems.

   We now list some of the constructions that can be generated from theory presentations:
signatures, sub-algebras, product-algebras, projection, congruence, quotient algebra, record definition, homo-
morphism, homomorphism- equality, isomorphism, endomorphism, automorphism, composition-of- morphisms,
closed term language, open term language, staged-term- language, structural induction principle, evaluation
functions, rewrite rules, equivalence of terms, and parse trees.
Some of these constructions are inspired by universal algebra, others are more related to programming disciplines,
like meta-programming.

Copyright © by the paper’s authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
In: C. Kaliszyk, E. Brady, J. Davenport, W.M. Farmer, A. Kohlhase, M. Kohlhase, D. Müller, K. Pąk, and C. Sacerdoti Coen (eds.):
Joint Proceedings of the FMM and LML Workshops, Doctoral Program and Work in Progress at the Conference on Intelligent
Computer Mathematics 2019 co-located with the 12th Conference on Intelligent Computer Mathematics (CICM 2019), Prague,
Czech Republic, July 8–12, 2019, published at http://ceur-ws.org


                                                               1
Approach
We are interested in syntactically manipulating theory presentations, in their most abstract form. To achieve a
high level of abstraction, we use Mmt [3] as a data description language (DDL). Mmt is a foundation independent
framework for managing mathematical knowledge. Knowledge is structured as a theory graph. Being foundation
independent, it avoids the commitment to any specific logical foundation and allows the translation between the
different ones. This makes it appealing to our research work as it enables us to abstract over language details,
explore the minimal logic in which the different pieces of knowledge can be generated, and explore whether these
pieces can be formalized in other logics.
   After generating the different pieces of knowledge, we want to be able to translate them into different formal
systems. Considering systems with very different foundations, like Isabelle/HOL and Agda. The first is based
on HOL, while the second is based on dependent type theory. They have different representations of theories,
yet the information contained in them is the same. For example, a Monoid theory consists of a sort, a point,
a binary operation and three axioms. These pieces do not change based on the logic, but the way they are
formalized does. We are interested in automating the generation of theories in different formal systems. For this,
we aim to develop a feature model exploring the similarities and differences between the different systems. This
feature model will be used to design generators from our abstract model to the different systems. This way, the
information is generated once, and translated automatically to different representations.

Evaluation
We use the MathScheme library [1] to test our implementation of combinators to generate information. The
library contains over 1000 axiomatic theories, defined by using the combinators introduced in [2] and structured
as a theory graph. To be able to use the library, we translate it into Mmt. The combinators used to build
the library are interpreted as diagram level combinators [4]. Theories in the library have different foundations.
Some theories are based on simple logics, like equational logic, but others are based on more expressive1 ones,
like dependent type theory.


In conclusion, we aim at supporting the development of libraries by generating some related construction that
can be made available by syntactic manipulation of theory presentations. As an outcome of our work, a user
should be able to write something like
BooleanAlgebra := Theory {
  U : type ;
  * : (U , U ) -> U ;
  + : (U , U ) -> U ;
  --> : (U , U ) -> U ;
  0 : U;
  1 : U;
  compl : U -> U ;
  axiom unipotent_ - - > _1 : forall x : U . ( x --> x ) = (1);
  ....
}
generate ( termLang , simplify )
and then ask the system to evaluate simplify ((x * y) + 1) without any extra effort to define the term
language or simplification rules.

References
[1] Jacques Carette, William M Farmer, Filip Jeremic, Vincent Maccio, Russell O’Connor, and Quang M Tran.
    The mathscheme library: Some preliminary experiments. arXiv preprint arXiv:1106.1862, 2011.

[2] Jacques Carette and Russell O’Connor. Theory presentation combinators. In International Conference on
    Intelligent Computer Mathematics, pages 202–215. Springer, 2012.
  1 By expressivity we mean practical expressivity, in terms that information is expressed in a practical and convenient way


                                                                2
[3] Michael Kohlhase, Florian Rabe, and Vyacheslav Zholudev. Towards MKM in the large: Modular represen-
    tation and scalable software architecture. In International Conference on Intelligent Computer Mathematics,
    pages 370–384. Springer, 2010.

[4] Florian Rabe and Yasmine Sharoda. Diagram combinators in mmt. In International Conference on Intelligent
    Computer Mathematics. Springer, 2019.
[5] Wolfgang Wechler. Universal algebra for computer scientists, volume 25 of eatcs monographs on theoretical
    computer science, 1992.


                                                      3