JSON standards for the data and partition schemas
When you define the data schema or partition schema manually via the UI using the JSON view, follow these JSON standards for the schemas.
Data schema
The data schema (dataSchema) applies when you define the columns of your dataset manually and select the JSON view in the UI.
The data schema is expressed as a subset of JSON Schema. Field order matters, and some JSON parsers do not preserve the order of object properties, so you must list the column names in the required array in the order that you want. See JSON Schema.
Data schema OpenAPI spec
dataSchema:
  type: object
  description: JSON Schema for the dataset
  required:
    - properties
    - required
  properties:
    properties:
      type: object
      description: The definition of the dataset's schema
    required:
      type: array
      items:
        type: string
      description: Defines the order of the fields in the schema
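The shape that this spec describes can be checked before you paste a schema into the JSON view. The following is a minimal sketch in Python, not part of the product: the function name check_schema_shape is hypothetical, and it tests only the two top-level keys that the spec marks as required.

```python
def check_schema_shape(schema):
    """Return a list of problems with the top-level shape of a schema.

    Hypothetical helper: it mirrors the spec above, which requires a
    'properties' object and a 'required' array of strings.
    """
    if not isinstance(schema, dict):
        return ["schema must be a JSON object"]

    problems = []
    if not isinstance(schema.get("properties"), dict):
        problems.append("'properties' must be an object")

    required = schema.get("required")
    if not isinstance(required, list) or not all(
        isinstance(name, str) for name in required
    ):
        problems.append("'required' must be an array of strings")
    return problems
```

An empty list means that both keys are present with the expected JSON types. The sketch does not inspect the individual column definitions.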
Data schema example
Here is an example of a data schema for a dataset with three columns, including an object column that contains two string fields. The required array sets the order of the columns.
{
  "properties": {
    "zipcode": {
      "type": "integer"
    },
    "city": {
      "type": "string"
    },
    "room": {
      "type": "object",
      "properties": {
        "roomfloor": { "type": "string" },
        "roomname": { "type": "string" }
      }
    }
  },
  "required": ["zipcode", "city", "room"]
}
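To illustrate why the required array carries the column order, the following Python sketch reads the column order from required instead of relying on the key order of the properties object. The function name ordered_columns is hypothetical and the code is an illustration only, not how the service itself processes the schema.

```python
import json


def ordered_columns(schema):
    """Return (column name, type) pairs in the order given by 'required'.

    Hypothetical helper: raises KeyError if 'required' names a column
    that is missing from 'properties'.
    """
    props = schema["properties"]
    return [(name, props[name].get("type")) for name in schema["required"]]


# The example schema from this section, with the properties deliberately
# written in a different order than the required array.
example = json.loads("""
{
  "properties": {
    "room": {
      "type": "object",
      "properties": {
        "roomfloor": { "type": "string" },
        "roomname": { "type": "string" }
      }
    },
    "city": { "type": "string" },
    "zipcode": { "type": "integer" }
  },
  "required": ["zipcode", "city", "room"]
}
""")

print(ordered_columns(example))
# [('zipcode', 'integer'), ('city', 'string'), ('room', 'object')]
```

The result follows the required array even though the properties object lists the columns in another order.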
Data types supported by the data schema
The following data types are supported in the data schema (dataSchema). The data schema supports only a single data type per instance of the type keyword. The null type is not required or supported for schema input as all fields are treated as nullable by default.
For CSV and JSON formats, integer and number data types are not statically enforced at read time. You can choose the precision based on the range of values in the column, or omit format, in which case the default is int64. For Parquet, the schema must match the landed integer data types exactly.
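As an illustration of choosing integer precision from the value range for CSV or JSON data, the following Python sketch picks int32 only when every sampled value fits in a signed 32-bit integer. The function name pick_integer_format is hypothetical, and sampling values this way is an assumption about your workflow rather than a documented feature.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def pick_integer_format(values):
    """Suggest 'int32' or 'int64' for a column from sample values.

    Hypothetical helper. None stands for a null value and is ignored.
    With no non-null samples it falls back to int64, the documented default.
    It does not check that values fit in 64 bits.
    """
    present = [v for v in values if v is not None]
    if present and all(INT32_MIN <= v <= INT32_MAX for v in present):
        return "int32"
    return "int64"
```

For example, pick_integer_format([10001, 94105, None]) suggests int32, while a sample that contains 2**31 suggests int64. For Parquet, this kind of inference does not apply, because the schema must match the landed types.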
| Type | Description | Example |
|---|---|---|
| Array | A sequence of arbitrary length where each item matches the same schema. Null values can be included. | {"type": "array", "items": {"type": "any_type_in_this_list"}} |
| Boolean | Represents logical true or false values. | {"type": "boolean"} |
| Integer 32 | A 32-bit signed integer. | {"type": "integer", "format": "int32"} |
| Integer 64 | A 64-bit signed integer (default for integer types). | {"type": "integer"} or {"type": "integer", "format": "int64"} |
| Float | A single-precision floating-point number. | {"type": "number", "format": "float"} |
| Double | A double-precision floating-point number. | {"type": "number", "format": "double"} |
| Object | A collection of key-value pairs defined by properties. | {"type": "object", "properties": {"prop1": {"type": "string"}}} |
| String | A standard sequence of characters. | {"type": "string"} |
| Date String | A string formatted with Splunk time variables. | {"type": "string", "format": "%Y"} |
| BYTE_ARRAY | Binary data (Parquet only). | {"type": "string", "format": "byte_array"} |
| FIXED_LEN_BYTE_ARRAY | Fixed-length binary data (Parquet only). | {"type": "string", "format": "fixed_len_byte_array"} |
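The table can be read as a lookup from a type and format pair to a supported data type. The following Python sketch encodes that lookup under several stated assumptions: the function name classify is hypothetical, the table is treated as exhaustive (so a number without a format is reported as unrecognized), and any string format that contains a percent sign is treated as a date string.

```python
def classify(field):
    """Map a field definition to a row name from the table, or None.

    Hypothetical helper. Assumes the table is exhaustive.
    """
    t = field.get("type")
    fmt = field.get("format")

    # Only a single data type per type keyword: lists such as
    # ["string", "null"] are rejected, as is a non-string format.
    if not isinstance(t, str) or (fmt is not None and not isinstance(fmt, str)):
        return None

    if t == "boolean":
        return "Boolean"
    if t == "integer":
        return {None: "Integer 64", "int64": "Integer 64", "int32": "Integer 32"}.get(fmt)
    if t == "number":
        return {"float": "Float", "double": "Double"}.get(fmt)
    if t == "string":
        if fmt is None:
            return "String"
        if fmt == "byte_array":
            return "BYTE_ARRAY"
        if fmt == "fixed_len_byte_array":
            return "FIXED_LEN_BYTE_ARRAY"
        # Assumption: a format containing '%' is a time-variable pattern.
        return "Date String" if "%" in fmt else None
    if t == "array":
        items = field.get("items")
        ok = isinstance(items, dict) and classify(items) is not None
        return "Array" if ok else None
    if t == "object":
        props = field.get("properties")
        ok = isinstance(props, dict) and all(
            isinstance(p, dict) and classify(p) is not None for p in props.values()
        )
        return "Object" if ok else None
    return None  # includes "null", which is not supported for schema input
```

Array items and object properties are checked recursively, so a nested definition is recognized only if every nested field is also recognized.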
Partition schema
The partition schema (dataPartition.PartitionSchema) applies when you define the partition keys of your dataset manually and select the JSON view in the UI.
The partition schema is expressed as a subset of JSON Schema, using the same format as the data schema (dataSchema). Field order matters, and some JSON parsers do not preserve the order of object properties, so you must list the partition keys in the required array in the order that you want. See JSON Schema.
Partition schema OpenAPI spec
partition:
  type: object
  description: JSON Schema (same subset as dataSchema) defining the partition fields
  required:
    - properties
    - required
  properties:
    properties:
      type: object
      description: The definition of the partition fields
    required:
      type: array
      items:
        type: string
      description: Defines the order of the partition fields
Partition schema example
Here is an example of a partition schema for a set of date-based partitions: year, month, and day. The required array is used to set the required order of the partition keys.
{
  "properties": {
    "year": {
      "type": "integer",
      "format": "int32"
    },
    "month": {
      "type": "integer",
      "format": "int32"
    },
    "day": {
      "type": "integer",
      "format": "int32"
    }
  },
  "required": ["year", "month", "day"]
}
Data types supported by the partition schema
The only data types supported by the partition schema (dataPartition.PartitionSchema) are string and integer.
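The type restriction can be checked with a few lines of Python. This is a minimal sketch: the function name unsupported_partition_keys is hypothetical, and it looks only at the type keyword of each key, not at format.

```python
PARTITION_TYPES = {"string", "integer"}


def unsupported_partition_keys(schema):
    """Return partition keys, in 'required' order, whose type is not
    string or integer. Hypothetical helper; 'format' is not inspected.
    """
    props = schema["properties"]
    bad = []
    for name in schema["required"]:
        t = props[name].get("type")
        if not (isinstance(t, str) and t in PARTITION_TYPES):
            bad.append(name)
    return bad


date_partitions = {
    "properties": {
        "year": {"type": "integer", "format": "int32"},
        "month": {"type": "integer", "format": "int32"},
        "day": {"type": "integer", "format": "int32"},
    },
    "required": ["year", "month", "day"],
}

print(unsupported_partition_keys(date_partitions))  # []
```

A partition key declared as boolean, number, array, or object would appear in the returned list.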