=Paper= {{Paper |id=Vol-2307/paper53 |storemode=property |title=None |pdfUrl=https://ceur-ws.org/Vol-2307/paper53.pdf |volume=Vol-2307 |dblpUrl=https://dblp.org/rec/conf/mkm/WiesingK18 }} ==None== https://ceur-ws.org/Vol-2307/paper53.pdf
 A Proposal for an OpenMath JSON Encoding

                           Tom Wiesing                 Michael Kohlhase
                      Computer Science, FAU Erlangen-Nürnberg, Germany
                                     http://kwarc.info



                                              Abstract

                   OpenMath is a semantic representation format of mathe-
                   matical objects and formulae. There are several encodings
                   of OpenMath Objects, most notably the XML and Binary
                   encodings. In this paper we propose another one based on
                   JSON – a lightweight data-interchange format used heavily
                   in the Web Applications area.
                   We survey two existing OpenMath JSON encodings already
                   and show how their advantages can be combined and their
                   disadvantages be alleviated. We give a thorough specifica-
                   tion of the encoding, present a JSON schema implemented in
                   TypeScript, and provide a web service that validates JSON-
                   encoded OpenMath and transforms OpenMath objects be-
                   tween XML and JSON encodings.

1    Introduction
OpenMath [BusCapCar:oms04] is a semantic representation format of mathematical objects
and formulae. In a nutshell, OpenMath standardizes six basic object types (symbols, variables,
numbers, strings, and foreign objects), three ways of building complex objects: (function) ap-
plication, binding, and management facilities like structure sharing and error reporting. The
OpenMath object model underlies Content MathML [CarlisleEd:MathML3:on], making it
well-integrated with MathML presentation.
   There are several encodings of OpenMath Objects, most notably the XML and Binary en-
codings. In this paper we propose another one based on JSON, a lightweight data-interchange
format that used heavily in the Web Applications area.
   JSON [JSON:web], short for JavaScript Object Notation, is a lightweight data-interchange
format. While being a subset of JavaScript, it is defined independently. JSON can represent
both primitive types and composite types.

Copyright c by the paper’s authors. Copying permitted for private and academic purposes.
In: O. Hasan, J. Davenport, M. Kohlhase, (eds.): Proceedings of the 29th Openmath Workshop, Hagenberg,
2018, published at http://ceur-ws.org
   Primitive JSON data types are strings (e.g. "Hello world"), Numbers (e.g. 42 or
3.14159265), Booleans (true and false) and null.         Composite JSON types are either
(non-homogeneous) arrays (e.g. [1, "two", false]) or key-value pairs called objects (e.g.
{"foo": "bar", "answer": 42}).
   Constructs corresponding to JSON objects are found in most programming languages. Fur-
thermore, the syntax is very simple; hence many languages have built-in facilities for translating
their existing data structures to and from JSON. The use for an OpenMath JSON encoding is
clear: It would enable easy use of OpenMath across many languages.
   In the next Section we survey two existing OpenMath JSON encodings. Section 3 proposes
a new encoding that combines the advantages and alleviates their disadvantages. We give a
thorough specification of the encoding, present a JSON schema implemented in TypeScript,
and provide a web service that validates JSON-encoded OpenMath and transforms OpenMath
objects between XML and JSON encodings. Section 5 concludes the paper.

2   Existing JSON Encodings for OpenMath
There are existing approaches for encoding OpenMath as JSON. We will discuss two particular
ones here.

XML as JSON
The JSONML standard [jsonml:webpage] allows generic encoding of arbitrary XML as JSON.
This can easily be adapted to the case of OpenMath. To encode an OpenMath object as JSON,
one first encodes it as XML and then makes use of JSONML in a second step. Using this
method, the term plus(x, 5) would correspond to:
[
    "OMOBJ",
    {"xmlns":"http://www.openmath.org/OpenMath"},
    [
        "OMA",
        [
            "OMS",
            {"cd": "arith1", "name": "plus"}
        ],
        [
            "OMV",
            {"name": "x"}
        ],
        [
            "OMI",
            "5"
        ]
    ]
]

  This translation has the advantage that it is near-trivial to translate between the XML and
JSON encodings of OpenMath. It also has some disadvantages:
    • The encoding does not use the native JSON datatypes. One of the advantages of JSON is
      that it can encode most basic data types directly, without having to turn the data values
      into strings. To encode the floating point value 1e-10 (a valid JSON number literal) using
      the JSONML encoding, one can not directly place it into the result. Instead, one has to turn
      it into a string first. Despite many JSON implementations providing such a functionality,
      in practice this would require frequent translation between strings and high-level datatypes.
      This is not what JSON is intended for, instead the provided data types should be used.
    • The awkwardness of some of the XML encoding remains. Due to the nature of XML the
      XML encoding sometimes needs to introduce elements that do not directly correspond to
      any OpenMath objects. For example, the OMATP element is used to encode a set of
      attribute / value pairs. This introduces unnecessary overhead into JSON, as an array of
      values could be used instead.
    • Many languages use JSON-like structures to implement structured data types. Thus it
      stands to reason that an OpenMath JSON encoding should also provide a schema to allow
      languages to implement OpenMath easily. This is not the case for a JSONML encoding.
OpenMath-JS
The openmath-js [openmathjs:webpage] encoding takes a different approach. It is an (incom-
plete) implementation of OpenMath in JavaScript and was developed by Nathan Carter for use
with Lurch [CarterMonks:OM:CICM-WS-WiP2013] on the web. It is written in literate
coffee script, a derivative language of JavaScript.
  In this encoding, the term plus(x, 5) would correspond to:
{
    "t":"a",
    "c": [
       {
          "t":"sy",
          "cd":"arith1",
          "n":"plus"
       },
       {
          "t":"v",
          "n":"x"
       },
       {
          "t":"i",
          "v":"5"
       }
    ]
}
  This encoding solves some of the disadvantages of the JSONML encoding, however it still has
some drawbacks:
    • It was written as a JavaScript, not JSON, encoding. The existing library provides JavaScript
      functions to encode OpenMath objects. However, the resulting JSON has only minimal
      names. This makes it difficult for humans to read and write directly.
    • No formal schema exists, like in the JSONML encoding.




3     The OpenMath-JSON encoding

Given the disadvantages with the existing encodings we propose a new one that combines the
advantages and alleviates the problems. In particular, the new encoding should be close to the
OpenMath XML encoding, and at the same time make use of native JSON concepts.
  Furthermore, we want to formalize this encoding by providing a JSON schema for easy vali-
dation, which is not achieved by any existing approach.
   Concretely, we will use JSON Schema [handrewsjsonschema:18]. This defines a vocabulary
allowing us to validate and annotate JSON documents. Additionally, tools for programatic
verification exist in many languages.
   Unfortunately, JSON schema is often tedious to read and write for humans. This is espe-
cially true when it comes to recursively defined data structures. OpenMath has many recursive
structures. Instead of writing our encoding in JSON Schema directly, we decided to write the
schema in a different language and then compile it to JSON Schema.
   For this purpose, we decided to make use of TypeScript [typescript:webpage]. TypeScript
is a language derived from JavaScript – TypeScript files are JavaScript plus type annotations.
As such, it can be easily written and understood by humans. On top of TypeScript, we make
use of a compiler [vega-ts-jscon-schema-generator:webpage] from TypeScript definitions
into JSON Schema.
    In general, objects in our encoding look similar to the following:
{
     "kind": "OMV",
     "id": "something",
     "name": "x"
}

   The kind attribute specifies which kind of OpenMath object this is. These values correspond
to the element names used in the XML encoding. This correspondence lays the foundations
of easy translation between the two. In TypeScript this property is also referred to as a Type
Guard, because it guards the type of object that is represented.
  As in the XML encoding it is possible to make use of structure sharing. For this purpose the
id attribute can be used. We will come back to this in more detail below, when we define to
the OMR type.
  In the following we will go over the details of our encoding. For this we will make use of
a TypeScript-like syntax, that is easily readable. In our description we omit the id attribute,
which can be added to any encoded object. The complete source code of our encoding – and
details on how to use it – can be found on Github [URL:openmathjson:github].
3.1    Object Constructor – OMOBJ
The OpenMath Object Constructor – OMOBJ – is defined as follows:
{
      "kind": "OMOBJ",
      /** optional version of openmath being used */
      "openmath"?: "2.0",
      /** the actual object */
      "object": omel /* any element */
}
Concretely, the integer 3 encapsulated in an object constructor using this encoding is as follows:
{
      "kind": "OMOBJ",
      "openmath": "2.0",
      "object": {
          "kind": "OMI",
          "integer": 3
      }
}
   Let us have a look at this first example attribute for attribute.
   The first attribute – kind – represents the type of OpenMath object in question. Notice that
it occurs twice – once in the OMOBJ and a second time in the wrapped OMI. We will talk in detail
about integer representation below, and hence only care about this first one.
   The second attribute – openmath – is defined as optional by our schema. This indicates the
version of OpenMath that is being used – "2.0" in our case.
   The third and final attribute is the object attribute. This contains the wrapped object –
it is defined as of omel type. This type omel can contain any OpenMath object – concretely
primitive objects (Symbols OMS, Variables OMV, Integers OMI, Floats OMF, Bytes OMB, Strings
OMSTR), composite objects (Application OMA, Attribution OMATTR, Binding OMBIND) or Errors
OME and References OMR. In this particular case, we just have the integer 3.

3.2    Symbols – OMS
An OpenMath Symbol is encoded as follows:
{
      "kind": "OMS",
      /** the base for the cd, optional */
      "cdbase"?: uri, /* any valid URI */,
      /** content dictonary the symbol is in, any name */
      "cd": name,
      /** name of the symbol */
      "name": name /* any valid symbol name */
}
  Notice the uri and name types in the definition. These are not directly JSON types. We
define the uri type to be a any JSON string that represents a valid URI. Similarly, we define
the name type to be any JSON string that represents a valid symbol name.
    For example to encode the sin symbol from the transc1 CD:
{
      "kind": "OMS",
      "cd": "transc1",
      "name": "sin"
}

3.3    Variables – OMV
An OpenMath Variable is encoded as follows:
{
      "kind": "OMV",
      /** name of the variable */
      "name": name
}
    We again make use of the name type here.
    For example to encode a variable x:
{
      "kind": "OMV",
      "name": "x"
}

3.4    Integers – OMI
Unlike the previous elements, our encoding allows integers to be encoded in three different ways.
In particular, we define them as follows:
{
      "kind": "OMI",
      //
      // exactly one of the following
      //

      /* any json integer */
      "integer": integer,
      /* any string matching ^-?[0-9]+$ */
      "decimal": decimalInteger,
      /* any string matching ^-?x[0-9A-F]+.$ */
      "hexadecimal": hexInteger
}
    We allow integers to be represented as one of the following:

JSON Integers Representing an OpenMath Integer as a JSON integer allows making use of
   datatypes that JSON offers. This was one of the goals we wanted to achieve with our
   encoding.
      JSON has no integer type in and of itself – it only provides a number type. To work around
      this in our schema, we define a custom integer type by restricting the number type to
    integers.

    OpenMath integers can be of arbitrary size. While the JSON specification does not limit
    the size of numbers, it also allows any implementation the freedom to pick some limit. Thus
    this representation only work well for reasonably sized integers. But for larger numbers,
    the other two variants should be used.



Decimal Strings A second way to encode an OpenMath Integer is to make use of the straight-
   forward decimal string encoding. This is any string consisting only of digits (and potentially
   having a minus sign in the case of a negative integer).



Hexademical String We can allow to encode an OpenMath Integer as a hexadecimal integer.
   To make this different from decimally encoded integers, we require a lower-case x to be
   placed before the string. This also has the advantage that this corresponds exactly to the
   XML encoding.



  We can thus encode the integer −120 in three different ways:



  • as a JSON integer
    {
        "kind": "OMI",
        "integer": -120
    }




  • as a decimal-encoded string
    {
        "kind": "OMI",
        "decimal": "-120"
    }




  • as a hexadecimal-encoded string
    {
        "kind": "OMI",
        "hexadecimal": "-x78"
    }
3.5       Floats – OMF

Like integers, we allow floats to be encoded in three different ways:
{
      "kind": "OMF",

      //
      // exactly one of the following
      //

      /* any json number */
      "float": number,
      /* any string matching
          (-?)([0-9]+)?(\.[0-9]+)?([eE](-?)[0-9]+)? */
      "decimal": decimalFloat,
      /* any string matching ^([0-9A-F]+)$ */
      "hexadecimal": hexFloat
}

    Here, the different cases work exactly like in the integer case.
  We can represent a float either as a native JSON number, a string in decimal representation
or a string in hexadecimal representation. Just like above, the string representations correspond
to the ones allowed by the XML encoding.
    For example the floating point number 10−10 can be represented in three different ways:


    • as a JSON float
      {
           "kind": "OMF",
           "float": 1e-10
      }



    • as a decimal-encoded string
      {
           "kind": "OMF",
           "decimal": "0.0000000001"
      }



    • as a hexadecimal-encoded string
      {
           "kind": "OMF",
           "hexaecimal": "3DDB7CDFD9D7BDBB"
      }
3.6       Bytes – OMB
Bytes can be encoded in two different ways:
{
      "kind": "OMB",
      //
      // exactly one of the following
      //

      /** an array of bytes
          where a byte is an integer from 0 to 255 */
      "bytes": byte[],
      /** a base64 encoded string */
      "base64": base64string
}


Array of bytes This encoding again makes use of JSON data structures – representing bytes
   as a concrete list of bytes. As the byte datatype does not exist directly in JSON, we
   represent a single byte as an integer between 0 and 255 (inclusive).

Base64 encoded string A base64-encoded string of bytes corresponds to the XML encoding.

    For example we can encode the ascii bytes of the string hello world

    • as a byte array
      {
           "kind": "OMB",
           "bytes": [
               104, 101, 108, 108, 111, 32,
               119, 111, 114, 108, 100
           ]
      }

    • as a base64-encoded string
      {
           "kind": "OMB",
           "base64": "aGVsbG8gd29ybGQ="
      }

3.7       Strings – OMS
The encoding of strings is straightforward – we can make use of native JSON strings.
{
      "kind": "OMSTR",
      /** the string */
      "string": string
}
    Thus the string "Hello world" is encoded as follows:
{
      "kind": "OMSTR",
      "string": "Hello world"
}




3.8    Applications – OMA



OpenMath Applications are encoded as follows:
{
      "kind": "OMA",
      /** the base for the cd, optional */
      "cdbase"?: uri,
      /** the term that is being applied */
      "applicant": omel,
      /** the arguments that the applicant
          is being applied to. Optional and
          assumed to be empty if omitted */
      "arguments"?: omel[]
}


    Here we again make use of the already known omel and uri types.

    For example when encoding sin(x) we get:
{
      "kind": "OMA",
      "applicant": {
          "kind": "OMS",
          "cd": "transc1",
          "name": "sin"
      },
      "arguments": [{
          "kind": "OMV",
          "name": "x"
      }]
}
3.9    Attributions – OMATTR
OpenMath Attributions are encoded as follows:
{
      "kind": "OMATTR",
      /** the base for the cd, optional */
      "cdbase": uri,
      /** attributes attributed to this object, non-empty */
      "attributes": ([
          OMS, omel|OMFOREIGN
      ])[],
      /** object that is being attributed */
      "object": omel
}
  Note that each attribute is represented as a pair of the name of the attribute and its corre-
sponding value. This gives us the attributes property in the definiton above being an array of
pairs. Here we have a somewhat significant derivation from the XML encoding. This introduced
an extra element OMATP to represent the pairs – this is not necessary for our case.
  As an example, to attribute a variable x as having real type:
{
      "kind": "OMATTR",
      "attributes": [
          [
              { "kind": "OMS", "cd": "ecc", "name": "type" },
              { "kind": "OMS", "cd": "ecc", "name": "real" }
          ]
      ],
      "object": {
          "kind": "OMV",
          "name": "x"
      }
}

3.10    Bindings – OMB
We encode bindings as:
{
      "kind": "OMBIND",
      /** the base for the cd, optional */
      "cdbase"?: uri,
      /** the binder being used */
      "binder": omel,
      /** the variables being bound, non-empty */
      "variables": (OMV | attvar)[],
      /** the object that is being bound */
      "object": omel
}
    Here, the variables property is defined as a non-empty array of



    • a variable OMV, or



    • an attributed variable (the type attvar), that is an OMATTR with the object being at-
      tributed being a variable.



    For example to encode λx. sin(x):
{
     "kind": "OMBIND",
     "binder":
         { "kind": "OMS", "cd": "fns1", "name": "lambda" },
     "variables": [
         { "kind": "OMV", "name": "x" }
     ],
     "object": {
         "kind": "OMA",
         "applicant":
             { "kind": "OMS", "cd": "transc1", "name": "sin" },
         "arguments": [
             { "kind": "OMV", "name": "x" }
         ]
     }
}




3.11     Errors – OME

An OpenMath Error is encoded as follows:
{
     "kind": "OME",
     /** the error that has occured */
     "error": OMS,
     /** arguments to the error, optional */
     "arguments"?: (omel|OMFOREIGN)[]
}

    For example, to annotate a division by zero error in x/0:
{
    "kind": "OME",
    "error":
        { "kind": "OMS", "cd": "aritherror",
          "name": "DivisionByZero" },
    "arguments": [{
        "kind": "OMA",
        "applicant": { "kind": "OMS", "cd": "arith1",
                      "name": "divide" },
        "arguments": [
            { "kind": "OMV", "name": "x" },
            { "kind": "OMI", "integer": 0}
        ]
    }]
}


3.12   Foreign Objects – OMFOREIGN
An OpenMath Foreign Object is an object that is not part of OpenMath. In JSON, we can
encode one such object as follows:
{
    "kind": "OMFOREIGN",
    /** encoding of the foreign object, optional */
    "encoding"?: string,
    /** the foreign object */
    "foreign": any
}
  By nature of being JSON, foreign objects inside the JSON encoding are obviously limited of
being representable as JSON.
  As a very simple example a LATEXmath term could be represented as:
{
    "kind": "OMFOREIGN",
    "encoding": "text/x-latex",
    "foreign": "$\sin(x)$"
}


3.13   References – OMR
Finally, OpenMath References are represented as:
{
    "kind": "OMR"
    /** element that is being referenced */
    "href": uri
}
  These can be used for structure sharing. Concretely, one can take any OpenMath object with
an id attribute and refer to it in another place using the OMR object with an appropriate href
attribute.
  Consider as an example the term f (f (f (a, a), f (a, a)), f (f (a, a), f (a, a))). Informally, we can
write this as                                  y

                                          f (f (f (a, a), y), x)
                                                   x

. In our encoding, this corresponds to
{
    "kind": "OMOBJ",
    "object": {
        "kind": "OMA",
        "applicant": { "kind": "OMV", "name": "f" },
        "arguments": [{
            "kind": "OMA", "id": "x",
            "applicant": { "kind": "OMV", "name": "f" },
            "arguments": [{
                "kind": "OMA", "id": "y",
                "applicant": { "kind": "OMV", "name": "f" },
                "arguments":
                    [{ "kind": "OMV", "name": "a" },
                    { "kind": "OMV", "name": "a" }]
            }, { "kind": "OMR", "href": "#y" }]
        }, {
            "kind": "OMR", "href": "#x"
        }]
    }
}

4    A JSON Validation and XML/JSON Translation Web Service
To demonstrate our OpenMath-JSON encoding, we have created a web site which can be found
at [openmathjson:web]. This site is implemented in TypeScript and encapsulated using a
docker [docker:webpage] container. It serves three purposes:
   Primarily, it serves as a presentation of the encoding, providing examples and documenting
it’s usage.
   Secondly, it enables validation of OpenMath JSON objects. This can be seen in Figure 1. The
user can enter some JSON, press the Validate JSON button, and receive immediate feedback if
their JSON is a valid OpenMath object or not. In particular, the user can also see a detailed
error message if their object is not valid OpenMath JSON.
   This makes use of the OpenMath JSON schema, and validates the users’ JSON using a generic
JSON Schema Validator. Furthermore, this is also exposed using a REST API, enabling easy
validation of OpenMath JSON in other applications.
   Finally, it translates between XML and JSON encoded OpenMath objects. As for validation,
the site enables the user to enter some JSON and be presented with some XML and vice-versa;
see Figures 2 and 3.
   As we designed our encoding with this translatability goal in mind, the implementation of
it was straight-forward. For programmatic access, translation and validation are also exposed
  Figure 1: Interface for validating an OpenMath object.




Figure 2: Interface for converting OpenMath XML to JSON.
                Figure 3: Interface for converting OpenMath JSON to XML.
using a REST API.

5   Conclusion
We have proposed a new OpenMath JSON encoding that facilitates the content-oriented com-
munication of mathematical objects and formulae in the web applications and web services
area. Our encoding combines the advantages of existing approaches, alleviates their problems,
and enables validation via a JSON schema we provide. To support development with the new
encoding, we supply a JSON validation service and a JSON/XML encoding translation service.
  In the future we hope that the OpenMath Society will canonicalize a JSON encoding for
OpenMath using our proposal as a basis.

Acknowledgements
The authors gratefully acknowledge fruitful discussions with the OpenMath Community. The
work reported here was supported by the OpenDreamKit Horizon 2020 European Research
Infrastructures project (#676541) and DFG project RA-18723-1 OAF.