<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">The Usage of Negation in Real-World JSON Schema Documents</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Mohamed-Amine</forename><surname>Baazizi</surname></persName>
							<email>baazizi@ia.lip6.fr</email>
							<affiliation key="aff0">
								<orgName type="laboratory">LIP6 UMR 7606</orgName>
								<orgName type="institution">Sorbonne Université</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Dario</forename><surname>Colazzo</surname></persName>
							<email>dario.colazzo@dauphine.fr</email>
							<affiliation key="aff1">
								<orgName type="institution" key="instit1">Université Paris-Dauphine</orgName>
								<orgName type="institution" key="instit2">PSL Research University</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Giorgio</forename><surname>Ghelli</surname></persName>
							<email>ghelli@di.unipi.it</email>
							<affiliation key="aff2">
								<orgName type="department">Dipartimento di Informatica</orgName>
								<orgName type="institution">Università di Pisa</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Carlo</forename><surname>Sartiani</surname></persName>
							<email>carlo.sartiani@unibas.it</email>
							<affiliation key="aff3">
								<orgName type="department">DIMIE</orgName>
								<orgName type="institution">Università della Basilicata</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Stefanie</forename><surname>Scherzinger</surname></persName>
							<email>stefanie.scherzinger@uni-passau.de</email>
							<affiliation key="aff4">
								<orgName type="institution">Universität Passau</orgName>
								<address>
									<settlement>Passau</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">The Usage of Negation in Real-World JSON Schema Documents</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">4A2ED2260179EA9ED57A2BA2B3AFDC72</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-23T20:31+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Empirical Study</term>
					<term>Conceptual Modeling</term>
					<term>JSON Schema</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Many software tools, but also formal frameworks for working with JSON Schema, do not fully support negation. This motivates us to study whether negation is actually used in practice, for which aims, and whether it could, in principle, be replaced by simpler operators. We have collected a large corpus of 80k open source JSON Schema documents. We perform a systematic analysis, quantify usage patterns of negation, and also qualitatively analyze schemas. We show that negation is indeed used, albeit infrequently, following a stable set of patterns.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>JSON has become one of the most popular formats for data exchange. While many schema languages for JSON have been proposed <ref type="bibr" target="#b0">[1]</ref>, JSON Schema <ref type="bibr" target="#b1">[2]</ref> is receiving considerable attention. In this language, a schema is a logical combination of assertions, describing classes of constraints on objects, arrays, and base values. JSON Schema is constantly evolving, and new drafts always introduce new features. The language is increasingly used for defining domain-specific data exchange formats <ref type="bibr" target="#b2">[3]</ref> and as a meta-language for defining other languages; a subset of JSON Schema serves as the schema language inside MongoDB <ref type="bibr" target="#b3">[4]</ref>. As a consequence, an active and quite broad development community is releasing JSON Schema tools (validators <ref type="bibr" target="#b4">[5]</ref>, in particular).</p><p>JSON Schema is powerful but complex, and its semantics is based on an intricate interplay among logical assertions. A distinctive feature is the not operator, whereby negation can be applied to any assertion. Negation is quite rare in type and schema languages, as it poses severe challenges.</p><p>(a)
1 { "not":
2     { "required": ["DisplaceModules"] }
3 }</p><p>(b)
1 { "description": "...",
2   "@errorMessages":
3     { "not": "Invalid target: ..." },
4   "not": { "pattern": "..." } ... }</p><p>(c)
1 { "title": "Object w/ required foo.",
2   "type": "object",
3   "properties": {
4     "foo": { "type": "integer" },
5     "bar": { "type": "string" } },
6   "patternProperties": {
7     "f.*o": { "type": "integer" } },
8   "required": ["foo"]
9 }</p><p>Example 1. One usage of not that startles novices (as discussed on StackOverflow <ref type="bibr" target="#b6">[6]</ref>) is in combination with the keyword required, as shown in Figure <ref type="figure" target="#fig_0">1</ref>(a). While "not required" may sound like "optional", it enforces that the object must violate the assertion, so member "DisplaceModules" must be absent. Indeed, the not operator is often not fully supported, whether in academic prototype tools <ref type="bibr" target="#b7">[7]</ref>, commercial tools (e.g., <ref type="bibr" target="#b3">[4]</ref>), or even formal frameworks <ref type="bibr" target="#b8">[8]</ref>, mostly because of the inherent complexity of handling negation. This inspired us to investigate the usage of this operator in real-world schemas, in a principled analysis of 80k JSON Schema documents crawled from GitHub. We formulate these research questions: (1) how frequent is negation in practice, (2) how is negation used, and (3) what are common usage patterns?</p><p>Contributions. The contribution of this systematic empirical study is threefold. We first established a method for collecting and preparing JSON Schema documents. Next, we measured the frequency of use of JSON Schema operators and of paths that include not, and quantified the main patterns of use. Finally, we identified well-supported jargons, i.e., common uses of not that have the potential to mature into JSON Schema design patterns. An extended version of this study can be found in <ref type="bibr" target="#b9">[9]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Preliminaries</head><p>Annotations only indicate information that should be associated with the instance. Since we are mostly interested in validation, and since, moreover, annotations are removed by the not operator, we will ignore them.</p><p>Example 2. In the schema in Figure <ref type="figure" target="#fig_0">1</ref>(c), inspired by <ref type="bibr" target="#b4">[5]</ref>, line 1 carries an annotation. In defining an object (line 2), applicators define constraints on properties (line 3) and the type of the properties matching a pattern (line 6). Using an assertion, it is possible to indicate required properties (line 8).</p><p>Example 3. JSON Schema is an open standard: in Figure <ref type="figure" target="#fig_0">1</ref>(b), @errorMessages is a user-defined keyword whose value is an object that describes the error, and not a JSON Schema assertion. Hence, not in line 3 is just a member name, whereas negation does occur in line 4. The same string token has different semantics depending on its context, which complicates parsing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Pattern Queries</head><p>To study which keywords occur below an instance of the not operator, we introduce a simple path language. A path such as .**.not.required matches any path that ends with an object field named required found inside an object field whose name is not. Paths are expressed using the following language. Path matching is defined as in JSONPath <ref type="bibr" target="#b10">[10]</ref>.</p><p>Complex sub-schemas. We say that not has a complex sub-schema when its object argument contains more than one keyword. In this case, we say these keywords co-occur in the negated schema; otherwise, a sub-schema is simple. As an example, consider the schema of Figure <ref type="figure">3(b)</ref>: the argument of not is complex, and we match the paths .not.enum and .not.type.</p></div>
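<div xmlns="http://www.tei-c.org/ns/1.0"><p>The path language described above can be illustrated with a minimal Python sketch. It is not the implementation used in the study (which evaluated queries in SQL); the function names step and evaluate are hypothetical, and a path is assumed to be given as an already tokenized list of steps such as [".**", ".not", ".required"].</p>
<p>
```python
def step(nodes, s):
    """Apply one path step to a list of JSON nodes and return the reached nodes."""
    out = []
    for node in nodes:
        if s == ".*":
            # all member values of an object
            if isinstance(node, dict):
                out.extend(node.values())
        elif s == "[*]":
            # all items of an array
            if isinstance(node, list):
                out.extend(node)
        elif s == ".**":
            # reflexive and transitive closure: the node itself and all descendants
            stack = [node]
            while stack:
                cur = stack.pop()
                out.append(cur)
                if isinstance(cur, dict):
                    stack.extend(cur.values())
                elif isinstance(cur, list):
                    stack.extend(cur)
        else:
            # a step of the form ".key" selects one named member
            key = s[1:]
            if isinstance(node, dict) and key in node:
                out.append(node[key])
    return out


def evaluate(steps, root):
    """Evaluate a tokenized path against a parsed JSON document."""
    nodes = [root]
    for s in steps:
        nodes = step(nodes, s)
    return nodes


# Illustrative schema (an assumption, not taken from the corpus).
schema = {
    "properties": {"a": {"not": {"required": ["x"]}}},
    "items": [{"not": {"required": ["y"], "type": "object"}}],
}
hits = evaluate([".**", ".not", ".required"], schema)
```
</p>
<p>The sketch matches on member names only, so a member called not below a user-defined keyword would also be reported, which corresponds to a purely syntactic reading of the path.</p></div>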
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head><p>Context. We explored GitHub for open source JSON Schema documents. We identified 91.6k URLs in July 2020, of which 85.6k could be retrieved (using wget). Discarding files with invalid syntax yields 82k files.</p><p>For each retrieved file, we analyzed the $schema declarations to identify the version of JSON Schema. Draft 2019-09 is still quite new and hardly represented. Draft-04 is declared in the vast majority of the files (79%), while Draft-07, Draft-06, and the old Draft-03 are each below 5%. An analysis of the file contents showed that the actual version that a schema follows is often different from the version declared.</p><p>Data Preparation. As a first step, we replaced each reference ($ref) with a new keyword $eref that has the target of the reference as its child, but we did not expand references recursively. We expanded references to external documents, provided that we were able to locate the referenced document (e.g., either contained within our corpus, or by downloading the document). References were renamed to $fref when expansion failed. We observed that by expanding references we lose the conceptual information encoded in the reference path itself. Thus, $ref is often more than just a syntactic macro.</p><p>The schema corpus contains a large share of near-duplicate schemas, with small variations in syntax. We performed duplicate elimination by comparing compact schema signatures, defined as a function that maps each keyword to the number of its occurrences in the schema (encoded as a vector of keyword counts); we assumed that two schemas with the same signature are, with high probability, versions of the same schema, and we retained just one. After duplicate elimination, our corpus shrank to 11,500 distinct schemas.</p><p>As illustrated in Example 3, correctly recognizing keywords can be a challenge. For this reason, we renamed all property names to avoid confusion when searching for patterns that involve the keyword not. As schema authors can define their own keywords, we have no way to know whether their value should be interpreted as an assertion. We experimented with two approaches: a "strict" approach in which we renamed everything that was inside a user-defined keyword, hence making it inaccessible to the analysis, and a "lax" approach in which we kept the content of any user-defined keyword, so that all instances of not in Figure <ref type="figure" target="#fig_0">1</ref>(b) would be counted as keywords. With the strict approach, some interesting usage patterns are lost, and keyword usage is under-estimated. With the lax approach, we risk "false positives", and hence over-estimation. We decided that the over-estimation of the lax approach was preferable.</p><p>Analysis Process. The bulk of our effort was actually invested in data preparation. After experimenting with different data analysis platforms, we resorted to a relational encoding of the JSON Schema documents in PostgreSQL. This setup met our performance expectations, and allowed us to write queries in plain SQL.</p></div>
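<div xmlns="http://www.tei-c.org/ns/1.0"><p>The signature-based duplicate elimination described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the names keyword_signature and deduplicate are hypothetical, and the sketch counts every object member name, whereas the study counts keywords after property names have been renamed.</p>
<p>
```python
from collections import Counter


def keyword_signature(schema):
    """Map each member name in the schema tree to its number of occurrences."""
    counts = Counter()
    stack = [schema]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            for name, value in node.items():
                counts[name] += 1
                stack.append(value)
        elif isinstance(node, list):
            stack.extend(node)
    # A sorted tuple of (name, count) pairs is hashable and order-independent.
    return tuple(sorted(counts.items()))


def deduplicate(schemas):
    """Keep the first schema seen for each signature."""
    seen = {}
    for s in schemas:
        seen.setdefault(keyword_signature(s), s)
    return list(seen.values())


# Illustrative inputs (assumptions): a and b differ only in a value, c differs in keywords.
a = {"type": "object", "properties": {"x": {"type": "string"}}}
b = {"type": "object", "properties": {"x": {"type": "integer"}}}
c = {"type": "array", "items": {"type": "string"}}
unique = deduplicate([a, b, c])
```
</p>
<p>Because the signature ignores values and member order, two schemas that differ only in those respects collapse into one, which is the intended behavior for near-duplicates but can also merge unrelated schemas that happen to use the same keywords equally often.</p></div>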
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results of the Study</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">RQ1: How frequent is negation in practice?</head><p>We study the frequency of JSON Schema keywords within our corpus, including the Boolean operators (among them, negation). The reported absolute values are mainly of interest as indicators of the relative frequency of operators. Figure <ref type="figure" target="#fig_2">2</ref> visualizes the results. From left to right, keywords are sorted by their number of occurrences (note the log-scaled vertical axes). We also show the number of files in which keywords occur, as a further indicator of keyword relevance.</p><p>The operator not appears in approx. 3% of all schemas, and occupies the 30th position out of 46 keywords analyzed. Thus, it is a comparatively rare operator. The most common Boolean operator is oneOf, which is more frequent than anyOf; allOf is even less common. The Boolean operator if-then-else is even less common than not, but was only introduced in Draft-07. We found the dissemination of oneOf surprising, since the exclusive-disjunctive semantics of oneOf is more complicated than the purely disjunctive anyOf: oneOf takes as argument a collection of subschemas 𝑆₁, …, 𝑆ₙ, and a value 𝐽 satisfies oneOf only if it matches exactly one subschema; anyOf is satisfied by any value 𝐽 that matches at least one of the subschemas. Our hypothesis is that the description of a class as a oneOf-combination of a set of "subclasses" is familiar from the exclusive-subclassing mechanism of object-oriented languages.</p><p>The operator not appears 787 times in 298 different files out of 11,500. While not very frequent, its usage nevertheless merits a systematic study.</p></div>
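<div xmlns="http://www.tei-c.org/ns/1.0"><p>The difference between the exclusive semantics of oneOf and the purely disjunctive anyOf can be made concrete with a small sketch. The function matches below is a hypothetical toy validator covering only type (for integer, number, and string), anyOf, and oneOf; it is an assumption made for illustration and not a full JSON Schema validator.</p>
<p>
```python
def matches(value, schema):
    """Toy validator for a small fragment: type, anyOf, oneOf."""
    ok = True
    if "type" in schema:
        t = schema["type"]
        is_bool = isinstance(value, bool)  # bool is a subclass of int in Python
        if t == "integer":
            ok = ok and isinstance(value, int) and not is_bool
        elif t == "number":
            ok = ok and isinstance(value, (int, float)) and not is_bool
        elif t == "string":
            ok = ok and isinstance(value, str)
        # other type names are not handled by this sketch and are accepted
    if "anyOf" in schema:
        # at least one branch must match
        ok = ok and any(matches(value, s) for s in schema["anyOf"])
    if "oneOf" in schema:
        # exactly one branch must match
        ok = ok and sum(1 for s in schema["oneOf"] if matches(value, s)) == 1
    return ok


# Overlapping branches (an illustrative assumption): every integer is also a number.
branches = [{"type": "integer"}, {"type": "number"}]
any_of_3 = matches(3, {"anyOf": branches})
one_of_3 = matches(3, {"oneOf": branches})
```
</p>
<p>With overlapping branches such as these, the value 3 satisfies the anyOf schema but not the oneOf schema, because it matches both branches. The two operators coincide only when the branches are mutually exclusive.</p></div>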
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">RQ2: How is negation used in practice?</head><p>We evaluated pattern queries to identify keywords below not. Table <ref type="table" target="#tab_1">1</ref> summarizes the results. Consider the left half. We match the path .**.not.* 840 times (#Occ) in 289 files (#Files). Below the top summary row, we list the individual keywords, breaking down shares of matches in percent (visualized by progress bars). The right half of the table provides statistics for subschemas that are negated and referenced, and therefore reachable via a path .**.not.$eref.*.</p><p>In the following, we will omit the prefix ".**" from path queries, assuming the context is clear to our readers. We sorted the table on the total number of not.𝑘+not.$eref.𝑘 occurrences, and it is interesting to compare the weight of different keywords in both parts.</p><p>An occurrence of not does not match any not.* pattern when it is followed by { }. We found 16 such occurrences, expressing the schema false, which is not satisfied by any instance. This use of not is a consequence of the fact that false has only been introduced with Draft-06.</p><p>Table <ref type="table" target="#tab_1">1</ref> indicates a total of 840 occurrences of not.*, whereas Figure <ref type="figure" target="#fig_2">2</ref> reported 787 occurrences of not. The values differ since the negated sub-schema can be complex. Most instances of not have a simple sub-schema. Most negated complex schemas have two keywords, but some have three or four. The situation is very different with $eref, i.e., references expanded in pre-processing. Here, 93 occurrences of not.$eref correspond to 338 occurrences of not.$eref.*. Thanks to the mediation of $eref, the schema designer implicitly applies negation to a complex argument, with an average of 3-4 members. The most common argument of negation is required. The pattern not.items is the second most common, followed by not.type and not.properties.</p><p>While not.required dominates the not.* case, the two most common cases of the not.$eref group are not.$eref.type, whose value is object in 80% of the cases, and not.$eref.properties, which indicates that not.$eref is mostly used to negate complex object definitions. This explains the much higher occurrence of descriptive keywords inside the referenced argument.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">RQ3: What are common real-world usage patterns?</head><p>Field and value exclusion. Field exclusion via not.required is the most frequent path.</p><p>Paths not.enum and not.const are used to exclude values. Snippets of example schemas are given in Figure <ref type="figure">3</ref>.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: Snippets of JSON Schema documents.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>𝑝 ::= 𝑠𝑡𝑒𝑝 | 𝑠𝑡𝑒𝑝 𝑝
𝑠𝑡𝑒𝑝 ::= .𝑘𝑒𝑦 | .* | [*] | .**
The step .* retrieves all member values of an object, [*] retrieves all items of an array, and .** is the reflexive and transitive closure of the union of .* and [*], navigating to all nodes of the JSON tree to which it is applied.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Number of total occurrences (#Occ), and number of files (#Files), where a JSON Schema keyword appears. Boolean operators are highlighted.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>JSON data model and JSON Schema</head><label></label><figDesc>JSON data model. The grammar below captures the syntax of JSON values, which are basic values, objects, or arrays. Basic values 𝐵 include the null value, booleans, numbers 𝑛, and strings 𝑠. Objects 𝑂 represent sets of members, each member being a name-value pair, and arrays 𝐴 represent sequences of values.
JSON Schema. JSON Schema is a language for defining constraints and requirements on the content of JSON documents. We discuss here the main keywords, and continue with two illustrative examples.
Assertions include required, enum, const, pattern and type, and indicate a test that is performed on the corresponding instance.
Applicators include the boolean operators anyOf, allOf, oneOf, not, the object operators properties, patternProperties, additionalProperties, the array operator items, and the reference operator $ref. Applicators indicate a request to apply a different operator to the same instance or to a component of the current instance.
Annotations include title, description, and $comment; they do not affect validation.</figDesc><table><row><cell>𝐽 ::= 𝐵 | 𝑂 | 𝐴</cell><cell></cell><cell>JSON expressions</cell></row><row><cell>𝐵 ::= null | true | false | 𝑛 | 𝑠</cell><cell>𝑛 ∈ Num, 𝑠 ∈ Str</cell><cell>Basic values</cell></row><row><cell>𝑂 ::= {𝑙₁ : 𝐽₁, …, 𝑙ₙ : 𝐽ₙ}</cell><cell>𝑛 ≥ 0, 𝑖 ≠ 𝑗 ⇒ 𝑙ᵢ ≠ 𝑙ⱼ</cell><cell>Objects</cell></row><row><cell>𝐴 ::= [𝐽₁, …, 𝐽ₙ]</cell><cell>𝑛 ≥ 0</cell><cell>Arrays</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1</head><label>1</label><figDesc>Occurrences of not.𝑘 paths (overall #Occ, and counting #Files).</figDesc><table><row><cell>Path</cell><cell>#Occ</cell><cell>#Files</cell><cell>Path</cell><cell>#Occ</cell><cell>#Files</cell></row><row><cell>not.*</cell><cell>840</cell><cell>289</cell><cell>not.$eref.*</cell><cell>338</cell><cell>28</cell></row><row><cell>required</cell><cell>28.6 %</cell><cell>29.1 %</cell><cell>required</cell><cell>10.7 %</cell><cell>53.6 %</cell></row><row><cell>items</cell><cell>15.0 %</cell><cell>9.3 %</cell><cell>items</cell><cell>0.0 %</cell><cell>0.0 %</cell></row><row><cell>type</cell><cell>7.4 %</cell><cell>17.7 %</cell><cell>type</cell><cell>15.1 %</cell><cell>71.4 %</cell></row><row><cell>properties</cell><cell>8.5 %</cell><cell>16.3 %</cell><cell>properties</cell><cell>11.8 %</cell><cell>64.3 %</cell></row><row><cell>$eref</cell><cell>11.1 %</cell><cell>9.7 %</cell><cell>$eref</cell><cell>0.0 %</cell><cell>0.0 %</cell></row><row><cell>enum</cell><cell>7.3 %</cell><cell>18.0 %</cell><cell>enum</cell><cell>3.6 %</cell><cell>28.6 %</cell></row><row><cell>allOf</cell><cell>2.7 %</cell><cell>8.0 %</cell><cell>allOf</cell><cell>11.2 %</cell><cell>17.9 %</cell></row><row><cell>pattern</cell><cell>5.6 %</cell><cell>9.7 %</cell><cell>pattern</cell><cell>0.0 %</cell><cell>0.0 %</cell></row><row><cell>anyOf</cell><cell>5.4 %</cell><cell>12.5 %</cell><cell>anyOf</cell><cell>0.6 %</cell><cell>7.1 %</cell></row><row><cell>description</cell><cell>0.5 %</cell><cell>1.4 %</cell><cell>description</cell><cell>12.1 %</cell><cell>25.0 %</cell></row><row><cell>title</cell><cell>0.2 %</cell><cell>0.7 %</cell><cell>title</cell><cell>11.5 %</cell><cell>25.0 %</cell></row><row><cell>$schema</cell><cell>0.0 %</cell><cell>0.0 %</cell><cell>$schema</cell><cell>12.1 %</cell><cell>32.1 %</cell></row><row><cell>$fref</cell><cell>3.2 %</cell><cell>4.8 %</cell><cell>$fref</cell><cell>0.0 
%</cell><cell>0.0 %</cell></row><row><cell>oneOf</cell><cell>0.7 %</cell><cell>1.4 %</cell><cell>oneOf</cell><cell>5.3 %</cell><cell>10.7 %</cell></row><row><cell>additionalProperties</cell><cell>1.3 %</cell><cell>3.8 %</cell><cell>additionalProperties</cell><cell>2.7 %</cell><cell>25.0 %</cell></row><row><cell>patternProperties</cell><cell>1.8 %</cell><cell>5.2 %</cell><cell>patternProperties</cell><cell>0.0 %</cell><cell>0.0 %</cell></row><row><cell>const</cell><cell>0.7 %</cell><cell>0.4 %</cell><cell>const</cell><cell>0.0 %</cell><cell>0.0 %</cell></row><row><cell>definitions</cell><cell>0.0 %</cell><cell>0.0 %</cell><cell>definitions</cell><cell>0.9 %</cell><cell>10.7 %</cell></row><row><cell>id</cell><cell>0.0 %</cell><cell>0.0 %</cell><cell>id</cell><cell>0.6 %</cell><cell>7.1 %</cell></row><row><cell>dependencies</cell><cell>0.0 %</cell><cell>0.0 %</cell><cell>dependencies</cell><cell>0.6 %</cell><cell>7.1 %</cell></row><row><cell>not</cell><cell>0.0 %</cell><cell>0.0 %</cell><cell>not</cell><cell>0.6 %</cell><cell>7.1 %</cell></row><row><cell>$ref</cell><cell>0.0 %</cell><cell>0.0 %</cell><cell>$ref</cell><cell>0.6 %</cell><cell>7.1 %</cell></row><row><cell>$comment</cell><cell>0.1 %</cell><cell>0.4 %</cell><cell>$comment</cell><cell>0.0 %</cell><cell>0.0 %</cell></row></table></figure>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>(a)
"not": { "enum": ["markdown", "code", "raw"] }</p><p>(b)
"not": { "enum": ["generic-linux"], "type": "string" }</p><p>(c)
"not": { "items": { "not": {
    "type": "string",
    "enum": ["Dataset", "Image", "Video", "Sound", "Text"] } } }</p><p>(d)
{ "type": "object",
  "oneOf": [
    { "properties": { "when": { "enum": ["delayed"] } },
      "required": ["when", "start_in"] },
    { "properties": { "when": { "not": { "enum": ["delayed"] } } } } ] }</p><p>(e)
{ "type": "object",
  "if": { "required": ["when"],
          "properties": { "when": { "enum": ["delayed"] } } },
  "then": { "properties": { "when": { "enum": ["delayed"] } },
            "required": ["when", "start_in"] } }</p><p>The schemas in Figures <ref type="figure">3</ref>(a) and (b) have an obvious interpretation: the instance may have any type and must be different from the string or strings listed. In the majority of cases, the sub-schema is simple, as in Figure <ref type="figure">3</ref>(a). In the complex cases, enum is always paired with a "type": "string" assertion, as in Figure <ref type="figure">3</ref>(b). This assertion is redundant, since all values listed by enum are strings. This co-occurrence is not specific to negation, since in positive schemas, too, enum is paired with a type assertion in the vast majority of cases.</p><p>Paraphrasing contains. The pattern not.items is among the most common not-paths. All such schemas have either the structure not.items.not (as in Figure <ref type="figure">3</ref>(c)) or not.items.enum.</p><p>The items assertion is satisfied by any instance that is not an array, or that is an empty array, or that is an array where every element satisfies the schema associated with items. Hence, it is only violated by instances that are arrays and that contain at least one element that violates the schema. While items specifies a universally quantified property, not.items can be used to specify an existentially quantified property, as does the contains keyword. The jargon not.items.enum specifies that the array must contain at least one value that is not listed in the argument of enum. The jargon not.items.not specifies that the instance is an array that contains at least one value that satisfies 𝑆, according to the following equivalence:</p><p>{ "not": { "items": { "not": 𝑆 } } } ⇔ { "type": "array", "contains": 𝑆 }</p><p>These two cases cover, with minimal variations, all occurrences of not.items. To sum up, not.items can be used to express contains. This is an instance of a pattern that may be replaced by a single (and thus simpler) operator.</p><p>Paraphrasing Discriminated Unions. The schema snippet in Figure <ref type="figure">3</ref>(d) allows interesting observations about the use of oneOf. JSON Schema specifications do not prescribe that the branches of oneOf are mutually exclusive, but they state that a value must match a single branch only. In this schema, however, the two branches of oneOf happen to be mutually exclusive: if "when" is absent, then only the second branch holds. If it is present, then it is associated with complementary types in the two branches, so here, oneOf is actually anyOf. Applying equivalent rewritings (from ¬𝑎∨𝑏 to 𝑎 ⇒ 𝑏, and pushing down negation), the schema can be rewritten as shown in Figure <ref type="figure">3</ref>(e). Now the specification is clearer: if "when" has the value "delayed", then "start_in" is required.</p><p>This suggests that oneOf is used to express a form of discriminated unions.</p></div>			</div>
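<div xmlns="http://www.tei-c.org/ns/1.0"><p>The equivalence between the not.items.not jargon and contains, stated in the discussion of not.items, can be checked on sample values with a small sketch. The function matches is a hypothetical toy validator that covers only type (for array, string, and object), enum, items, contains, and not; the sub-schema S and the sample values are assumptions chosen for illustration, so the check is a sanity test and not a proof.</p>
<p>
```python
def matches(value, schema):
    """Toy validator for a small fragment: type, enum, items, contains, not."""
    if "type" in schema:
        expected = {"array": list, "string": str, "object": dict}.get(schema["type"])
        if expected is not None and not isinstance(value, expected):
            return False
    if "enum" in schema and value not in schema["enum"]:
        return False
    # items and contains only constrain arrays; other values satisfy them trivially
    if "items" in schema and isinstance(value, list):
        if not all(matches(v, schema["items"]) for v in value):
            return False
    if "contains" in schema and isinstance(value, list):
        if not any(matches(v, schema["contains"]) for v in value):
            return False
    if "not" in schema and matches(value, schema["not"]):
        return False
    return True


# Hypothetical sub-schema S and the two sides of the equivalence.
S = {"type": "string", "enum": ["Dataset", "Image"]}
lhs = {"not": {"items": {"not": S}}}
rhs = {"type": "array", "contains": S}

samples = [[], ["Image"], ["x", "Dataset"], ["x"], [1, 2], "Image", {"a": 1}]
agree = all(matches(v, lhs) == matches(v, rhs) for v in samples)
```
</p>
<p>The samples include the corner cases that make the equivalence hold: the empty array and non-array values satisfy items vacuously, so they violate the negated schema and are likewise rejected by the right-hand side through "type": "array" or contains.</p></div>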
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Schemas and types for JSON data: From theory to practice</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Baazizi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Colazzo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ghelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sartiani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. SIGMOD 2019</title>
				<meeting>SIGMOD 2019</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="2060" to="2063" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<ptr target="https://json-schema.org" />
		<title level="m">json-schema org, JSON Schema</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">What Are Real JSON Schemas Like? -An Empirical Analysis of Structural Properties</title>
		<author>
			<persName><forename type="first">B</forename><surname>Maiwald</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Riedle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Scherzinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. EmpER 2019</title>
				<meeting>EmpER 2019</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="95" to="105" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<author>
			<orgName>MongoDB, Inc.</orgName>
		</author>
	</analytic>
	<monogr>
		<title level="m">MongoDB Manual: $jsonSchema</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">4</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<ptr target="https://github.com/json-schema-org/JSON-Schema-Test-Suite" />
		<title level="m">JSON Schema Test Suite, version of commit hash #09fd353</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><surname>Stackoverflow</surname></persName>
		</author>
		<ptr target="https://stackoverflow.com/questions/30515253/json-schema-valid-if-object-does-not-contain-a-particular-property" />
		<title level="m">JSON Schema -valid if object does *not* contain a particular property</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Challenges in Checking JSON Schema Containment over Evolving Real-World Schemas</title>
		<author>
			<persName><forename type="first">M</forename><surname>Fruth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Baazizi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Colazzo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ghelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sartiani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Scherzinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. EmpER 2020</title>
				<meeting>EmpER 2020</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="220" to="230" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Finding data compatibility bugs with JSON subschema checking</title>
		<author>
			<persName><forename type="first">A</forename><surname>Habib</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Shinnar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hirzel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pradel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ISSTA 2021</title>
				<meeting>ISSTA 2021</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="620" to="632" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">An empirical study on the &quot;usage of not&quot; in real-world JSON schema documents</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Baazizi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Colazzo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ghelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sartiani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Scherzinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of ER 2021</title>
				<meeting>ER 2021</meeting>
		<imprint>
			<date type="published" when="2021">October 18-21, 2021</date>
			<biblScope unit="page" from="102" to="112" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Java XML and JSON: Document Processing for Java SE</title>
		<author>
			<persName><forename type="first">J</forename><surname>Friesen</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019</date>
			<publisher>Apress</publisher>
			<biblScope unit="page" from="299" to="322" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
