mirror of
https://github.com/elasota/Aerofoil.git
synced 2025-12-15 04:29:37 +00:00
Convert DITLs to JSON
This commit is contained in:
505
rapidjson/doc/schema.md
Normal file
505
rapidjson/doc/schema.md
Normal file
@@ -0,0 +1,505 @@
|
||||
# Schema
|
||||
|
||||
(This feature was released in v1.1.0)
|
||||
|
||||
JSON Schema is a draft standard for describing the format of JSON data. The schema itself is also JSON data. By validating a JSON structure with JSON Schema, your code can safely access the DOM without manually checking types, or whether a key exists, etc. It can also ensure that the serialized JSON conform to a specified schema.
|
||||
|
||||
RapidJSON implemented a JSON Schema validator for [JSON Schema Draft v4](http://json-schema.org/documentation.html). If you are not familiar with JSON Schema, you may refer to [Understanding JSON Schema](http://spacetelescope.github.io/understanding-json-schema/).
|
||||
|
||||
[TOC]
|
||||
|
||||
# Basic Usage {#Basic}
|
||||
|
||||
First of all, you need to parse a JSON Schema into `Document`, and then compile the `Document` into a `SchemaDocument`.
|
||||
|
||||
Secondly, construct a `SchemaValidator` with the `SchemaDocument`. It is similar to a `Writer` in the sense of handling SAX events. So, you can use `document.Accept(validator)` to validate a document, and then check the validity.
|
||||
|
||||
~~~cpp
|
||||
#include "rapidjson/schema.h"
|
||||
|
||||
// ...
|
||||
|
||||
Document sd;
|
||||
if (sd.Parse(schemaJson).HasParseError()) {
|
||||
// the schema is not a valid JSON.
|
||||
// ...
|
||||
}
|
||||
SchemaDocument schema(sd); // Compile a Document to SchemaDocument
|
||||
// sd is no longer needed here.
|
||||
|
||||
Document d;
|
||||
if (d.Parse(inputJson).HasParseError()) {
|
||||
// the input is not a valid JSON.
|
||||
// ...
|
||||
}
|
||||
|
||||
SchemaValidator validator(schema);
|
||||
if (!d.Accept(validator)) {
|
||||
// Input JSON is invalid according to the schema
|
||||
// Output diagnostic information
|
||||
StringBuffer sb;
|
||||
validator.GetInvalidSchemaPointer().StringifyUriFragment(sb);
|
||||
printf("Invalid schema: %s\n", sb.GetString());
|
||||
printf("Invalid keyword: %s\n", validator.GetInvalidSchemaKeyword());
|
||||
sb.Clear();
|
||||
validator.GetInvalidDocumentPointer().StringifyUriFragment(sb);
|
||||
printf("Invalid document: %s\n", sb.GetString());
|
||||
}
|
||||
~~~
|
||||
|
||||
Some notes:
|
||||
|
||||
* One `SchemaDocument` can be referenced by multiple `SchemaValidator`s. It will not be modified by `SchemaValidator`s.
|
||||
* A `SchemaValidator` may be reused to validate multiple documents. To run it for other documents, call `validator.Reset()` first.
|
||||
|
||||
# Validation during parsing/serialization {#Fused}
|
||||
|
||||
Unlike most JSON Schema validator implementations, RapidJSON provides a SAX-based schema validator. Therefore, you can parse a JSON from a stream while validating it on the fly. If the validator encounters a JSON value that invalidates the supplied schema, the parsing will be terminated immediately. This design is especially useful for parsing large JSON files.
|
||||
|
||||
## DOM parsing {#DOM}
|
||||
|
||||
For using DOM in parsing, `Document` needs some preparation and finalizing tasks, in addition to receiving SAX events, thus it needs some work to route the reader, validator and the document. `SchemaValidatingReader` is a helper class that doing such work.
|
||||
|
||||
~~~cpp
|
||||
#include "rapidjson/filereadstream.h"
|
||||
|
||||
// ...
|
||||
SchemaDocument schema(sd); // Compile a Document to SchemaDocument
|
||||
|
||||
// Use reader to parse the JSON
|
||||
FILE* fp = fopen("big.json", "r");
|
||||
FileReadStream is(fp, buffer, sizeof(buffer));
|
||||
|
||||
// Parse JSON from reader, validate the SAX events, and store in d.
|
||||
Document d;
|
||||
SchemaValidatingReader<kParseDefaultFlags, FileReadStream, UTF8<> > reader(is, schema);
|
||||
d.Populate(reader);
|
||||
|
||||
if (!reader.GetParseResult()) {
|
||||
// Not a valid JSON
|
||||
// When reader.GetParseResult().Code() == kParseErrorTermination,
|
||||
// it may be terminated by:
|
||||
// (1) the validator found that the JSON is invalid according to schema; or
|
||||
// (2) the input stream has I/O error.
|
||||
|
||||
// Check the validation result
|
||||
if (!reader.IsValid()) {
|
||||
// Input JSON is invalid according to the schema
|
||||
// Output diagnostic information
|
||||
StringBuffer sb;
|
||||
reader.GetInvalidSchemaPointer().StringifyUriFragment(sb);
|
||||
printf("Invalid schema: %s\n", sb.GetString());
|
||||
printf("Invalid keyword: %s\n", reader.GetInvalidSchemaKeyword());
|
||||
sb.Clear();
|
||||
reader.GetInvalidDocumentPointer().StringifyUriFragment(sb);
|
||||
printf("Invalid document: %s\n", sb.GetString());
|
||||
}
|
||||
}
|
||||
~~~
|
||||
|
||||
## SAX parsing {#SAX}
|
||||
|
||||
For using SAX in parsing, it is much simpler. If it only need to validate the JSON without further processing, it is simply:
|
||||
|
||||
~~~
|
||||
SchemaValidator validator(schema);
|
||||
Reader reader;
|
||||
if (!reader.Parse(stream, validator)) {
|
||||
if (!validator.IsValid()) {
|
||||
// ...
|
||||
}
|
||||
}
|
||||
~~~
|
||||
|
||||
This is exactly the method used in the [schemavalidator](example/schemavalidator/schemavalidator.cpp) example. The distinct advantage is low memory usage, no matter how big the JSON was (the memory usage depends on the complexity of the schema).
|
||||
|
||||
If you need to handle the SAX events further, then you need to use the template class `GenericSchemaValidator` to set the output handler of the validator:
|
||||
|
||||
~~~
|
||||
MyHandler handler;
|
||||
GenericSchemaValidator<SchemaDocument, MyHandler> validator(schema, handler);
|
||||
Reader reader;
|
||||
if (!reader.Parse(ss, validator)) {
|
||||
if (!validator.IsValid()) {
|
||||
// ...
|
||||
}
|
||||
}
|
||||
~~~
|
||||
|
||||
## Serialization {#Serialization}
|
||||
|
||||
It is also possible to do validation during serializing. This can ensure the result JSON is valid according to the JSON schema.
|
||||
|
||||
~~~
|
||||
StringBuffer sb;
|
||||
Writer<StringBuffer> writer(sb);
|
||||
GenericSchemaValidator<SchemaDocument, Writer<StringBuffer> > validator(s, writer);
|
||||
if (!d.Accept(validator)) {
|
||||
// Some problem during Accept(), it may be validation or encoding issues.
|
||||
if (!validator.IsValid()) {
|
||||
// ...
|
||||
}
|
||||
}
|
||||
~~~
|
||||
|
||||
Of course, if your application only needs SAX-style serialization, it can simply send SAX events to `SchemaValidator` instead of `Writer`.
|
||||
|
||||
# Remote Schema {#Remote}
|
||||
|
||||
JSON Schema supports [`$ref` keyword](http://spacetelescope.github.io/understanding-json-schema/structuring.html), which is a [JSON pointer](doc/pointer.md) referencing to a local or remote schema. Local pointer is prefixed with `#`, while remote pointer is an relative or absolute URI. For example:
|
||||
|
||||
~~~js
|
||||
{ "$ref": "definitions.json#/address" }
|
||||
~~~
|
||||
|
||||
As `SchemaDocument` does not know how to resolve such URI, it needs a user-provided `IRemoteSchemaDocumentProvider` instance to do so.
|
||||
|
||||
~~~
|
||||
class MyRemoteSchemaDocumentProvider : public IRemoteSchemaDocumentProvider {
|
||||
public:
|
||||
virtual const SchemaDocument* GetRemoteDocument(const char* uri, SizeType length) {
|
||||
// Resolve the uri and returns a pointer to that schema.
|
||||
}
|
||||
};
|
||||
|
||||
// ...
|
||||
|
||||
MyRemoteSchemaDocumentProvider provider;
|
||||
SchemaDocument schema(sd, &provider);
|
||||
~~~
|
||||
|
||||
# Conformance {#Conformance}
|
||||
|
||||
RapidJSON passed 262 out of 263 tests in [JSON Schema Test Suite](https://github.com/json-schema/JSON-Schema-Test-Suite) (Json Schema draft 4).
|
||||
|
||||
The failed test is "changed scope ref invalid" of "change resolution scope" in `refRemote.json`. It is due to that `id` schema keyword and URI combining function are not implemented.
|
||||
|
||||
Besides, the `format` schema keyword for string values is ignored, since it is not required by the specification.
|
||||
|
||||
## Regular Expression {#Regex}
|
||||
|
||||
The schema keyword `pattern` and `patternProperties` uses regular expression to match the required pattern.
|
||||
|
||||
RapidJSON implemented a simple NFA regular expression engine, which is used by default. It supports the following syntax.
|
||||
|
||||
|Syntax|Description|
|
||||
|------|-----------|
|
||||
|`ab` | Concatenation |
|
||||
|<code>a|b</code> | Alternation |
|
||||
|`a?` | Zero or one |
|
||||
|`a*` | Zero or more |
|
||||
|`a+` | One or more |
|
||||
|`a{3}` | Exactly 3 times |
|
||||
|`a{3,}` | At least 3 times |
|
||||
|`a{3,5}`| 3 to 5 times |
|
||||
|`(ab)` | Grouping |
|
||||
|`^a` | At the beginning |
|
||||
|`a$` | At the end |
|
||||
|`.` | Any character |
|
||||
|`[abc]` | Character classes |
|
||||
|`[a-c]` | Character class range |
|
||||
|`[a-z0-9_]` | Character class combination |
|
||||
|`[^abc]` | Negated character classes |
|
||||
|`[^a-c]` | Negated character class range |
|
||||
|`[\b]` | Backspace (U+0008) |
|
||||
|<code>\\|</code>, `\\`, ... | Escape characters |
|
||||
|`\f` | Form feed (U+000C) |
|
||||
|`\n` | Line feed (U+000A) |
|
||||
|`\r` | Carriage return (U+000D) |
|
||||
|`\t` | Tab (U+0009) |
|
||||
|`\v` | Vertical tab (U+000B) |
|
||||
|
||||
For C++11 compiler, it is also possible to use the `std::regex` by defining `RAPIDJSON_SCHEMA_USE_INTERNALREGEX=0` and `RAPIDJSON_SCHEMA_USE_STDREGEX=1`. If your schemas do not need `pattern` and `patternProperties`, you can set both macros to zero to disable this feature, which will reduce some code size.
|
||||
|
||||
# Performance {#Performance}
|
||||
|
||||
Most C++ JSON libraries do not yet support JSON Schema. So we tried to evaluate the performance of RapidJSON's JSON Schema validator according to [json-schema-benchmark](https://github.com/ebdrup/json-schema-benchmark), which tests 11 JavaScript libraries running on Node.js.
|
||||
|
||||
That benchmark runs validations on [JSON Schema Test Suite](https://github.com/json-schema/JSON-Schema-Test-Suite), in which some test suites and tests are excluded. We made the same benchmarking procedure in [`schematest.cpp`](test/perftest/schematest.cpp).
|
||||
|
||||
On a Mac Book Pro (2.8 GHz Intel Core i7), the following results are collected.
|
||||
|
||||
|Validator|Relative speed|Number of test runs per second|
|
||||
|---------|:------------:|:----------------------------:|
|
||||
|RapidJSON|155%|30682|
|
||||
|[`ajv`](https://github.com/epoberezkin/ajv)|100%|19770 (± 1.31%)|
|
||||
|[`is-my-json-valid`](https://github.com/mafintosh/is-my-json-valid)|70%|13835 (± 2.84%)|
|
||||
|[`jsen`](https://github.com/bugventure/jsen)|57.7%|11411 (± 1.27%)|
|
||||
|[`schemasaurus`](https://github.com/AlexeyGrishin/schemasaurus)|26%|5145 (± 1.62%)|
|
||||
|[`themis`](https://github.com/playlyfe/themis)|19.9%|3935 (± 2.69%)|
|
||||
|[`z-schema`](https://github.com/zaggino/z-schema)|7%|1388 (± 0.84%)|
|
||||
|[`jsck`](https://github.com/pandastrike/jsck#readme)|3.1%|606 (± 2.84%)|
|
||||
|[`jsonschema`](https://github.com/tdegrunt/jsonschema#readme)|0.9%|185 (± 1.01%)|
|
||||
|[`skeemas`](https://github.com/Prestaul/skeemas#readme)|0.8%|154 (± 0.79%)|
|
||||
|tv4|0.5%|93 (± 0.94%)|
|
||||
|[`jayschema`](https://github.com/natesilva/jayschema)|0.1%|21 (± 1.14%)|
|
||||
|
||||
That is, RapidJSON is about 1.5x faster than the fastest JavaScript library (ajv). And 1400x faster than the slowest one.
|
||||
|
||||
# Schema violation reporting {#Reporting}
|
||||
|
||||
(Unreleased as of 2017-09-20)
|
||||
|
||||
When validating an instance against a JSON Schema,
|
||||
it is often desirable to report not only whether the instance is valid,
|
||||
but also the ways in which it violates the schema.
|
||||
|
||||
The `SchemaValidator` class
|
||||
collects errors encountered during validation
|
||||
into a JSON `Value`.
|
||||
This error object can then be accessed as `validator.GetError()`.
|
||||
|
||||
The structure of the error object is subject to change
|
||||
in future versions of RapidJSON,
|
||||
as there is no standard schema for violations.
|
||||
The details below this point are provisional only.
|
||||
|
||||
## General provisions {#ReportingGeneral}
|
||||
|
||||
Validation of an instance value against a schema
|
||||
produces an error value.
|
||||
The error value is always an object.
|
||||
An empty object `{}` indicates the instance is valid.
|
||||
|
||||
* The name of each member
|
||||
corresponds to the JSON Schema keyword that is violated.
|
||||
* The value is either an object describing a single violation,
|
||||
or an array of such objects.
|
||||
|
||||
Each violation object contains two string-valued members
|
||||
named `instanceRef` and `schemaRef`.
|
||||
`instanceRef` contains the URI fragment serialization
|
||||
of a JSON Pointer to the instance subobject
|
||||
in which the violation was detected.
|
||||
`schemaRef` contains the URI of the schema
|
||||
and the fragment serialization of a JSON Pointer
|
||||
to the subschema that was violated.
|
||||
|
||||
Individual violation objects can contain other keyword-specific members.
|
||||
These are detailed further.
|
||||
|
||||
For example, validating this instance:
|
||||
|
||||
~~~json
|
||||
{"numbers": [1, 2, "3", 4, 5]}
|
||||
~~~
|
||||
|
||||
against this schema:
|
||||
|
||||
~~~json
|
||||
{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"numbers": {"$ref": "numbers.schema.json"}
|
||||
}
|
||||
}
|
||||
~~~
|
||||
|
||||
where `numbers.schema.json` refers
|
||||
(via a suitable `IRemoteSchemaDocumentProvider`)
|
||||
to this schema:
|
||||
|
||||
~~~json
|
||||
{
|
||||
"type": "array",
|
||||
"items": {"type": "number"}
|
||||
}
|
||||
~~~
|
||||
|
||||
produces the following error object:
|
||||
|
||||
~~~json
|
||||
{
|
||||
"type": {
|
||||
"instanceRef": "#/numbers/2",
|
||||
"schemaRef": "numbers.schema.json#/items",
|
||||
"expected": ["number"],
|
||||
"actual": "string"
|
||||
}
|
||||
}
|
||||
~~~
|
||||
|
||||
## Validation keywords for numbers {#Numbers}
|
||||
|
||||
### multipleOf {#multipleof}
|
||||
|
||||
* `expected`: required number strictly greater than 0.
|
||||
The value of the `multipleOf` keyword specified in the schema.
|
||||
* `actual`: required number.
|
||||
The instance value.
|
||||
|
||||
### maximum {#maximum}
|
||||
|
||||
* `expected`: required number.
|
||||
The value of the `maximum` keyword specified in the schema.
|
||||
* `exclusiveMaximum`: optional boolean.
|
||||
This will be true if the schema specified `"exclusiveMaximum": true`,
|
||||
and will be omitted otherwise.
|
||||
* `actual`: required number.
|
||||
The instance value.
|
||||
|
||||
### minimum {#minimum}
|
||||
|
||||
* `expected`: required number.
|
||||
The value of the `minimum` keyword specified in the schema.
|
||||
* `exclusiveMinimum`: optional boolean.
|
||||
This will be true if the schema specified `"exclusiveMinimum": true`,
|
||||
and will be omitted otherwise.
|
||||
* `actual`: required number.
|
||||
The instance value.
|
||||
|
||||
## Validation keywords for strings {#Strings}
|
||||
|
||||
### maxLength {#maxLength}
|
||||
|
||||
* `expected`: required number greater than or equal to 0.
|
||||
The value of the `maxLength` keyword specified in the schema.
|
||||
* `actual`: required string.
|
||||
The instance value.
|
||||
|
||||
### minLength {#minLength}
|
||||
|
||||
* `expected`: required number greater than or equal to 0.
|
||||
The value of the `minLength` keyword specified in the schema.
|
||||
* `actual`: required string.
|
||||
The instance value.
|
||||
|
||||
### pattern {#pattern}
|
||||
|
||||
* `actual`: required string.
|
||||
The instance value.
|
||||
|
||||
(The expected pattern is not reported
|
||||
because the internal representation in `SchemaDocument`
|
||||
does not store the pattern in original string form.)
|
||||
|
||||
## Validation keywords for arrays {#Arrays}
|
||||
|
||||
### additionalItems {#additionalItems}
|
||||
|
||||
This keyword is reported
|
||||
when the value of `items` schema keyword is an array,
|
||||
the value of `additionalItems` is `false`,
|
||||
and the instance is an array
|
||||
with more items than specified in the `items` array.
|
||||
|
||||
* `disallowed`: required integer greater than or equal to 0.
|
||||
The index of the first item that has no corresponding schema.
|
||||
|
||||
### maxItems and minItems {#maxItems-minItems}
|
||||
|
||||
* `expected`: required integer greater than or equal to 0.
|
||||
The value of `maxItems` (respectively, `minItems`)
|
||||
specified in the schema.
|
||||
* `actual`: required integer greater than or equal to 0.
|
||||
Number of items in the instance array.
|
||||
|
||||
### uniqueItems {#uniqueItems}
|
||||
|
||||
* `duplicates`: required array
|
||||
whose items are integers greater than or equal to 0.
|
||||
Indices of items of the instance that are equal.
|
||||
|
||||
(RapidJSON only reports the first two equal items,
|
||||
for performance reasons.)
|
||||
|
||||
## Validation keywords for objects
|
||||
|
||||
### maxProperties and minProperties {#maxProperties-minProperties}
|
||||
|
||||
* `expected`: required integer greater than or equal to 0.
|
||||
The value of `maxProperties` (respectively, `minProperties`)
|
||||
specified in the schema.
|
||||
* `actual`: required integer greater than or equal to 0.
|
||||
Number of properties in the instance object.
|
||||
|
||||
### required {#required}
|
||||
|
||||
* `missing`: required array of one or more unique strings.
|
||||
The names of properties
|
||||
that are listed in the value of the `required` schema keyword
|
||||
but not present in the instance object.
|
||||
|
||||
### additionalProperties {#additionalProperties}
|
||||
|
||||
This keyword is reported
|
||||
when the schema specifies `additionalProperties: false`
|
||||
and the name of a property of the instance is
|
||||
neither listed in the `properties` keyword
|
||||
nor matches any regular expression in the `patternProperties` keyword.
|
||||
|
||||
* `disallowed`: required string.
|
||||
Name of the offending property of the instance.
|
||||
|
||||
(For performance reasons,
|
||||
RapidJSON only reports the first such property encountered.)
|
||||
|
||||
### dependencies {#dependencies}
|
||||
|
||||
* `errors`: required object with one or more properties.
|
||||
Names and values of its properties are described below.
|
||||
|
||||
Recall that JSON Schema Draft 04 supports
|
||||
*schema dependencies*,
|
||||
where presence of a named *controlling* property
|
||||
requires the instance object to be valid against a subschema,
|
||||
and *property dependencies*,
|
||||
where presence of a controlling property
|
||||
requires other *dependent* properties to be also present.
|
||||
|
||||
For a violated schema dependency,
|
||||
`errors` will contain a property
|
||||
with the name of the controlling property
|
||||
and its value will be the error object
|
||||
produced by validating the instance object
|
||||
against the dependent schema.
|
||||
|
||||
For a violated property dependency,
|
||||
`errors` will contain a property
|
||||
with the name of the controlling property
|
||||
and its value will be an array of one or more unique strings
|
||||
listing the missing dependent properties.
|
||||
|
||||
## Validation keywords for any instance type {#AnyTypes}
|
||||
|
||||
### enum {#enum}
|
||||
|
||||
This keyword has no additional properties
|
||||
beyond `instanceRef` and `schemaRef`.
|
||||
|
||||
* The allowed values are not listed
|
||||
because `SchemaDocument` does not store them in original form.
|
||||
* The violating value is not reported
|
||||
because it might be unwieldy.
|
||||
|
||||
If you need to report these details to your users,
|
||||
you can access the necessary information
|
||||
by following `instanceRef` and `schemaRef`.
|
||||
|
||||
### type {#type}
|
||||
|
||||
* `expected`: required array of one or more unique strings,
|
||||
each of which is one of the seven primitive types
|
||||
defined by the JSON Schema Draft 04 Core specification.
|
||||
Lists the types allowed by the `type` schema keyword.
|
||||
* `actual`: required string, also one of seven primitive types.
|
||||
The primitive type of the instance.
|
||||
|
||||
### allOf, anyOf, and oneOf {#allOf-anyOf-oneOf}
|
||||
|
||||
* `errors`: required array of at least one object.
|
||||
There will be as many items as there are subschemas
|
||||
in the `allOf`, `anyOf` or `oneOf` schema keyword, respectively.
|
||||
Each item will be the error value
|
||||
produced by validating the instance
|
||||
against the corresponding subschema.
|
||||
|
||||
For `allOf`, at least one error value will be non-empty.
|
||||
For `anyOf`, all error values will be non-empty.
|
||||
For `oneOf`, either all error values will be non-empty,
|
||||
or more than one will be empty.
|
||||
|
||||
### not {#not}
|
||||
|
||||
This keyword has no additional properties
|
||||
apart from `instanceRef` and `schemaRef`.
|
||||
Reference in New Issue
Block a user