Custom data types
In addition to using the built-in data types that SPL2 supports by default, you can create and use custom data types to define data characteristics and constraints beyond what the built-in types provide.
Data types are classifications that define the characteristics of a given piece of data, such as the allowed format and range of values. By constraining your data to types, you can specify how that data can be structured, what values it can include, and how that data can be used in commands and functions.
The built-in data types in SPL2 include common types such as string, number, array, and more. If the available data types don’t meet your requirements, you can create custom data types with more fine-tuned definitions.
For example, assume that your data includes an event field named test_scores that is intended to store numerical values ranging from 0 to 100. The most appropriate built-in data type for this test_scores field would be the int type. However, the int type allows values that are beyond the range of 0 to 100, such as -36 and 12000. To constrain the test_scores field to meet your requirements, you must create a custom data type that refines the definition of the int data type to be more specific.
The following sections on this page provide detailed information about how to create custom data types:
- Defining custom data types
- Combining data types
- Supported custom data type expressions
For information about how to check whether your data matches a custom data type, see the Checking data against custom data types section on this page.
For information about the built-in data types that are available in SPL2, see Built-in data types.
For more information about how you can use data types to schematize your data and support various practical use cases, see Creating and using data schemas with SPL2 data types.
Defining custom data types
A custom data type is always derived from an existing data type. This existing data type can be a built-in type or another custom type. To define a custom data type, refine the definition of an existing data type to meet your requirements by doing one or a combination of the following:
- Limiting the range of supported values
- Restricting the values to a particular format
- Adding or subtracting the criteria of another data type definition
Use the following syntax to define a custom data type. The required syntax is in bold.
type <type-name> = <type-expression>
The <type-name> argument specifies the name for identifying the custom data type. If this name includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the name in single quotation marks ( ' ).
The <type-expression> argument specifies the SPL2 expression for deriving the definition of the custom data type from an existing type definition. The exact syntax of this expression varies depending on how you want to refine the data type that you're using as the starting point. For information about the supported syntax, see Supported custom data type expressions on this page.
For example, the following expression defines a custom data type named score that allows integer values ranging from 0 to 100:
type score = int WHERE $value BETWEEN 0 and 100
As another example, the following expression defines a custom data type named grade that allows the values A, B, C, D, and F:
type grade = string WHERE $value IN ("A", "B", "C", "D", "F")
Combining data types
Because custom data types are derived from existing types, you can combine and layer multiple data types together to create custom data types with finely tuned definitions that support your specific use cases.
For example, after defining the score and grade data types as described in the previous section, you can then define a custom data type named results that combines both of those types. This results data type allows the letter grade values A, B, C, D, and F as well as test score values between 0 and 100:
type results = grade | score
You can use parentheses to specify the order of operations in a type expression that has multiple parts. For example, the following data types are defined differently:
type final_course_results = grade | score[]
type assignment_marks = (grade | score)[]
The final_course_results data type allows values that are either a single letter grade or an array of the exam scores used to calculate the final letter grade. Examples of valid data values for the final_course_results type would be A and [97, 90, 87, 91].
The assignment_marks data type only allows an array of values, but the individual values in the array can be either letter grades or score numbers. Examples of valid data values for the assignment_marks data type would be [A, B, A, C], [97, 80, 91, 70], and [A, 80, 77, B].
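The difference between the two parenthesizations can be sketched outside of SPL2. The following is a minimal Python model, not SPL2 code, in which each data type is represented as a predicate function. The helper names union and array_of are hypothetical and exist only for this illustration.

```python
# Python model of the SPL2 type expressions above; each "type" is a predicate.
def score(value):
    # Models: int WHERE $value BETWEEN 0 and 100 (bool is excluded on purpose).
    return isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 100

def grade(value):
    # Models: string WHERE $value IN ("A", "B", "C", "D", "F")
    return isinstance(value, str) and value in ("A", "B", "C", "D", "F")

def union(*types):
    # A value matches a union type if it matches any of the member types.
    return lambda value: any(t(value) for t in types)

def array_of(element_type):
    # A value matches an array type if it is a list whose items all match.
    return lambda value: isinstance(value, list) and all(element_type(v) for v in value)

results = union(grade, score)                          # grade | score
final_course_results = union(grade, array_of(score))   # grade | score[]
assignment_marks = array_of(union(grade, score))       # (grade | score)[]

print(final_course_results("A"))               # True
print(final_course_results([97, 90, 87, 91]))  # True
print(final_course_results(["A", 80]))         # False: the array may hold only scores
print(assignment_marks(["A", 80, 77, "B"]))    # True
print(assignment_marks("A"))                   # False: the value must be an array
```

In this model, the array suffix binds to score alone unless parentheses group the union first, which is the behavior that the two definitions above describe.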
Checking data against custom data types
To check whether your data matches a custom data type, use the IS operator in a predicate expression. This predicate expression can be the return expression of a custom eval function. For more information, see IS operator and Custom eval functions.
For example, assume that the error_code custom data type has been defined. The following eval command checks whether the values in the code_number field match the error_code data type, and then returns either true or false in a field named type_check_results:
... | eval type_check_results = if((code_number IS error_code), "true", "false")
As another example, the following expression defines a custom function named type_check that checks whether the input data matches the error_code data type, and then returns either true or false:
function type_check($data) {
    return if(($data IS error_code), "true", "false")
}
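To illustrate what these checks produce, here is a small Python model of the same logic. It is not SPL2 code. The definition of error_code is not given on this page, so the range used below (integers from 400 to 599) is purely a hypothetical stand-in, as are the sample events.

```python
# Hypothetical stand-in for the error_code type; its real definition is not shown on this page.
def error_code(value):
    return isinstance(value, int) and not isinstance(value, bool) and 400 <= value <= 599

def type_check(data):
    # Mirrors: return if(($data IS error_code), "true", "false")
    return "true" if error_code(data) else "false"

# Made-up events for illustration only.
events = [{"code_number": 404}, {"code_number": 200}, {"code_number": "oops"}]
for event in events:
    # Mirrors: eval type_check_results = if((code_number IS error_code), "true", "false")
    event["type_check_results"] = type_check(event["code_number"])

print([event["type_check_results"] for event in events])  # ['true', 'false', 'false']
```

Note that, as in the SPL2 examples, the result is the string "true" or "false" rather than a Boolean value.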
Supported custom data type expressions
SPL2 supports a variety of expressions that you can use to define a custom data type. You can use these expressions to limit the range of values that the type supports, restrict the values of that type to a particular format, and add or subtract criteria from another data type definition.
The following table categorizes and describes the type expressions that you can use to define a custom data type in SPL2:
| Category | Description |
|---|---|
| Additive types | A derived data type that adds together the properties of 2 structured data types into a single data type. For an example, see the Additive types section on this page. |
| Array types | A derived data type that defines an ordered collection of homogeneous values. Use array types to describe JSON arrays of same-typed values. For examples, see the Array types section on this page. |
| Constrained types | A derived data type that applies a constraint, in the form of a predicate expression, on an existing data type in order to make the type definition more narrow or precise. For examples, see the Constrained types section on this page. |
| Structured types | A derived data type that defines an object containing one or more properties, and optionally specifies the data type of each property. Use structured types to describe your data as JSON objects. For examples, see the Structured types section on this page. |
| Subtractive types | A derived data type that subtracts the properties defined in one structured data type from the list of properties defined in another structured data type. For an example, see the Subtractive types section on this page. |
| Union types | A derived type that defines a collection of existing data types. A data value will match the union type if the value matches any of the data types listed in the union type definition. For an example, see the Union types section on this page. |
Additive types
An additive data type is a derived data type that adds together the properties of 2 structured data types into a single structured data type. Use an additive type expression to describe data as a JSON object that contains properties that are already defined in existing structured data types.
If the structured types that you’re adding together contain properties that have the same name, then be aware of the following:
- Those properties must be associated with the same data type, or else an error occurs.
- If one property is required while another is optional, the resulting property in the additive data type will be required.
Syntax
The required syntax for additive types is in bold:
type <type-name> = <structured-data-type> + <structured-data-type>
The required arguments are defined as follows:
type-name
Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).
Description: The name for identifying the custom data type.
structured-data-type
Syntax: <structured-data-type>
Description: The name of an existing structured data type that defines the object properties you want to include in the additive type.
Examples
Suppose you had created the following structured data types named customer and address:
type customer = {
    id: int,
    name: string,
    age: int
}
type address = {
    street: string,
    city: string,
    state: string,
    zip: int
}
The customer data type is a JSON object containing 3 properties: id, name, and age. The address data type is a JSON object containing 4 properties: street, city, state, and zip.
You can use the following expression to create an additive type named customerWithAddress that is a JSON object with all 7 of the properties defined in the customer and address types: id, name, age, street, city, state, and zip.
type customerWithAddress = customer + address
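The rules for same-named properties can be sketched as follows. This is a minimal Python model, not SPL2 code. It assumes a hypothetical representation in which each structured type is a dictionary that maps a property name to a (data type, required) pair, and the function name add_types is invented for this illustration.

```python
def add_types(left, right):
    """Model of `left + right` for structured types.

    Each type is a dict that maps property name -> (data_type, required).
    """
    merged = dict(left)
    for name, (data_type, required) in right.items():
        if name in merged:
            existing_type, existing_required = merged[name]
            if existing_type != data_type:
                # Same-named properties must have the same data type.
                raise TypeError(f"property {name!r} has conflicting data types")
            # If one side is required and the other optional, the result is required.
            required = required or existing_required
        merged[name] = (data_type, required)
    return merged

customer = {"id": ("int", True), "name": ("string", True), "age": ("int", True)}
address = {"street": ("string", True), "city": ("string", True),
           "state": ("string", True), "zip": ("int", True)}

customer_with_address = add_types(customer, address)
print(list(customer_with_address))
# ['id', 'name', 'age', 'street', 'city', 'state', 'zip']
```

In this model, adding a type with an optional age property to a type with a required age property yields a required age property, and adding types that disagree on the data type of a shared property raises an error.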
Array types
An array data type is a derived data type that defines an ordered collection of homogeneous values. Use an array type expression to describe data as a JSON array of same-typed values.
Syntax
The required syntax for array types is in bold:
type <type-name> = <data-type>[]
The required arguments are defined as follows:
type-name
Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).
Description: The name for identifying the custom data type.
data-type
Syntax: <data-type>
Description: The name of an existing data type that you want to constrain into a JSON array format. When specifying the object data type, you can choose to use the notation {*} instead of the name object.
Examples
The following expression defines a custom data type named fullnames that is an array of strings. An example of a valid fullnames value would be ["john_anderson", "sasha_patel", "wei_zhang"].
type fullnames = string[]
The following expression defines a custom data type named personnel that is an array of objects. An example of a valid personnel value would be [{name: "Claudia Garcia", id: 109651}, {name: "Ikraam Rahat", id: 109562}].
type personnel = object[]
You can also define the personnel data type using this expression:
type personnel = {*}[]
Constrained types
A constrained data type is a derived data type that adds constraints to an existing data type in order to make the type definition more narrow or precise. Use a constrained type expression to refine an existing data type by including more specific requirements.
When specifying the constraint, you can use the match function to require data values to match a regular expression. Regular expression matching allows you to define custom data types that capture complex and varied data values, such as IP addresses, credit card numbers, and social security numbers.
Syntax
The required syntax for constrained types is in bold:
type <type-name> = <data-type> WHERE <predicate-expression>
The required arguments are defined as follows:
type-name
Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).
Description: The name for identifying the custom data type.
data-type
Syntax: <data-type>
Description: The name of an existing data type that you want to add constraints to.
predicate-expression
Syntax: A predicate expression or a match function. For more information, see Predicate expressions in the current manual and match in the SPL2 Search Reference.
Description: An expression describing the constraint that you want to apply to the data type definition. The resulting custom data type accepts values only if this expression resolves to TRUE for those values. You can use the $value variable to access the values in your data that you want this expression to evaluate. Additionally, you can use dot notation in the format $value.<property-name> to access a property in a structured data type.
Examples
The following expression defines a custom data type named positive_integer that is an integer greater than or equal to 0. Examples of valid positive_integer values include 1, 24, and 300.
type positive_integer = int WHERE $value >= 0
This next expression defines a custom data type named ipv4 that is a string that matches a regular expression for valid IPv4 addresses:
type ipv4 = string WHERE match($value, "(([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\\.){3}([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])")
As another example of using regular expression matches as the constraint, the following expression defines a custom data type named ipv6 that is a string that matches a regular expression for valid IPv6 addresses:
type ipv6 = string WHERE match($value, ("((([0-9a-fA-F]){1,4})\\:){7}([0-9a-fA-F]){1,4}"))
This next example demonstrates how to create a constrained data type that is based on a property in a structured data type. Consider the following structured data type called person:
type person = {
    firstname: string,
    surname: string,
    age: int
}
You can create a constrained data type called elderly_person that refers to the person data type and puts a constraint on the age property. In this example, the value for the age property in the person data type must be greater than 70:
type elderly_person = person WHERE $value.age > 70
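The way a constraint narrows a base type can be sketched in Python. The following is an illustrative model, not SPL2 code. The helper name constrained is hypothetical, and the sketch assumes that the regular expression is meant to match the entire value, so it uses a full match.

```python
import re

def constrained(base_check, predicate):
    # Models: <data-type> WHERE <predicate-expression>
    # The value must match the base type AND the predicate must be true.
    return lambda value: base_check(value) and bool(predicate(value))

def is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)

def is_str(v):
    return isinstance(v, str)

# Models: int WHERE $value >= 0
positive_integer = constrained(is_int, lambda v: v >= 0)

# Same pattern as the ipv4 example; the SPL2 string "\\." is the regex "\."
OCTET = r"([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"
IPV4 = re.compile(rf"({OCTET}\.){{3}}{OCTET}")
# Assumption: the whole value must match, so fullmatch is used here.
ipv4 = constrained(is_str, lambda v: IPV4.fullmatch(v))

def is_person(v):
    # Simplified model of the person structured type.
    return (isinstance(v, dict) and is_str(v.get("firstname"))
            and is_str(v.get("surname")) and is_int(v.get("age")))

# Models: person WHERE $value.age > 70
elderly_person = constrained(is_person, lambda v: v["age"] > 70)

print(positive_integer(24), positive_integer(-36))  # True False
print(ipv4("192.168.0.1"), ipv4("256.1.1.1"))       # True False
print(elderly_person({"firstname": "wei", "surname": "zhang", "age": 71}))  # True
```

Because the base type is checked first, a value of the wrong base type, such as a number passed to ipv4, is rejected before the predicate is evaluated.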
Structured types
A structured data type is a derived data type that defines an object containing one or more properties. Use a structured type expression to describe data as a JSON object.
Syntax
The required syntax for structured types is in bold:
type <type-name> = {<property-name> [: <property-data-type>], ...}
Separate each property with a comma. Enclose properties in curly brackets ( { } ).
The required arguments are defined as follows:
type-name
Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).
Description: The name for identifying the custom data type.
property-name
Syntax: A string. The following rules apply:
- If the string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).
- If you append a question mark ( ? ) to the string, then the property identified by the string is treated as optional. If the string is enclosed in single quotation marks ( ' ), the question mark must be appended after the closing quotation mark.
Description: The name of a property in the JSON object. You can specify an optional property by appending a question mark ( ? ) to the property name. As an alternative to specifying a property name, you can use an asterisk ( * ) as a wildcard character that allows the property to have any name.
The optional arguments are defined as follows:
property-data-type
Syntax: A string of characters or a single asterisk ( * ).
Description: The data type of the property. You can specify the name of any built-in or custom data type. As an alternative to specifying the name of a data type, you can use an asterisk ( * ) as a wildcard character that allows the property to be of any data type.
Default: any
Examples
The following expression defines a custom data type named person that is an object with two properties: firstname and surname. The data types of the properties are not specified, so they default to the any type.
type person = {
    firstname,
    surname
}
For comparison, this next expression defines a custom data type named person_with_age that is an object with three properties: firstname, surname, and age. The firstname and surname properties must be strings, and the age property must be an integer.
type person_with_age = {
    firstname: string,
    surname: string,
    age: int
}
You can use an asterisk ( * ) as a wildcard character that allows a property to have any name or data type. For example, the following expression defines a custom data type named person_open_ended that is an object that must contain properties named firstname, surname, and age, but can also contain additional properties with any name or data type:
type person_open_ended = {
    firstname: string,
    surname: string,
    age: int,
    *
}
You can choose to use the wildcard for the property name only or the property data type only, as shown in these next two examples:
type person_property_name_varied = {
    firstname: string,
    surname: string,
    age: int,
    *: string
}
type person_property_type_varied = {
    firstname: string,
    surname: string,
    age: int,
    details: *
}
You can specify an optional property by appending a question mark ( ? ) to the property name. For example, the following expression defines a custom data type named person_flexible that must contain properties named firstname and surname, and can optionally contain a property named age:
type person_flexible = {
    firstname: string,
    surname: string,
    age?: int
}
Both of the following JSON objects meet the requirements of the person_flexible data type:
{firstname: "taylor", surname: "lee"}
{firstname: "john", surname: "anderson", age: 46}
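The interaction between required properties, optional properties, and wildcards can be sketched as a small matcher. This is a Python model, not SPL2 code. The schema representation and the function name matches are hypothetical. The sketch also assumes that an object with extra properties is rejected when the type has no wildcard property, which the examples above imply but do not state outright.

```python
def is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)

def is_str(v):
    return isinstance(v, str)

def is_any(v):
    return True

def matches(obj, schema):
    """Check a dict against a schema of {property-name: type-check}.

    A trailing "?" on a name marks the property as optional. The name "*"
    is a wildcard that admits extra properties whose values pass its check.
    """
    if not isinstance(obj, dict):
        return False
    wildcard = schema.get("*")
    named = {}
    for key, check in schema.items():
        if key == "*":
            continue
        optional = key.endswith("?")
        named[key[:-1] if optional else key] = (check, optional)
    for name, (check, optional) in named.items():
        if name in obj:
            if not check(obj[name]):
                return False
        elif not optional:
            return False
    extras = [name for name in obj if name not in named]
    if wildcard is None:
        # Assumption: without a wildcard, extra properties are rejected.
        return not extras
    return all(wildcard(obj[name]) for name in extras)

person_flexible = {"firstname": is_str, "surname": is_str, "age?": is_int}
person_open_ended = {"firstname": is_str, "surname": is_str, "age": is_int, "*": is_any}
person_property_name_varied = {"firstname": is_str, "surname": is_str,
                               "age": is_int, "*": is_str}

print(matches({"firstname": "taylor", "surname": "lee"}, person_flexible))             # True
print(matches({"firstname": "john", "surname": "anderson", "age": 46}, person_flexible))  # True
print(matches({"firstname": "john", "surname": "anderson", "age": "46"}, person_flexible))  # False
```

In this model, person_property_name_varied accepts additional properties only when their values are strings, while person_open_ended accepts additional properties of any type.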
Subtractive types
A subtractive data type is a derived data type that subtracts the properties defined in one structured data type from the list of properties defined in another structured data type. Use a subtractive type expression to describe data as a JSON object that contains only a subset of the properties that are already defined in existing structured data types.
When you define a subtractive data type, you specify 2 structured data types, and the properties in the structured type that’s specified later are removed from the structured type that’s specified before it. The properties must meet the following requirements:
- The property in the later structured type must be of type any, or else an error occurs.
- If a property in the later structured type is required, then the same property must also exist in the earlier structured type, or else an error occurs.
- Wildcard fields denoted by asterisks ( * ) are treated like regular fields. If both structured types contain a wildcard field, the field will be removed and the resulting subtractive data type will not contain the wildcard field.
Syntax
The required syntax for subtractive types is in bold:
type <type-name> = <structured-data-type> - <structured-data-type>
The required arguments are defined as follows:
type-name
Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).
Description: The name for identifying the custom data type.
structured-data-type
Syntax: <structured-data-type>
Description: The name of an existing structured data type. The properties from the structured type that is specified later in the definition are removed from the structured type that is specified before it.
Examples
Suppose you had created the following structured data types named client and person:
type client = {
    id: int,
    name: string,
    age: int
}
type person = {
    name: string,
    age: int
}
The client data type is a JSON object containing 3 properties: id, name, and age. The person data type is a JSON object containing 2 of the same properties: name and age.
You can then create a subtractive type named client_anonymized that is a JSON object with only the id property:
type client_anonymized = client - person
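The property removal can be sketched in Python as follows. This is an illustrative model, not SPL2 code. It uses a hypothetical representation in which each structured type is a dictionary that maps a property name to a (data type, required) pair, and the function name subtract_types is invented for this illustration. The sketch models removal by property name, the rule for required properties, and the treatment of wildcard fields as regular fields. It does not model the rule about properties of type any.

```python
def subtract_types(earlier, later):
    """Model of `earlier - later` for structured types.

    Each type is a dict that maps property name -> (data_type, required).
    A wildcard field is stored under the name "*" and treated like any other field.
    """
    result = dict(earlier)
    for name, (_data_type, required) in later.items():
        if required and name not in earlier:
            # A required property in the later type must exist in the earlier type.
            raise TypeError(f"required property {name!r} is missing from the earlier type")
        result.pop(name, None)
    return result

client = {"id": ("int", True), "name": ("string", True), "age": ("int", True)}
person = {"name": ("string", True), "age": ("int", True)}

client_anonymized = subtract_types(client, person)
print(client_anonymized)  # {'id': ('int', True)}
```

In this model, subtracting a type that contains a wildcard field from another type that also contains one removes the wildcard field from the result.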
Union types
A union data type is a derived data type that defines a collection of existing data types. Use union type expressions to create a custom data type that groups data of different types into a single data type.
Syntax
The required syntax for union types is in bold:
type <type-name> = <data-type> | <data-type> ...
Separate each data type in the union type definition with a pipe character ( | ), which works like an OR operator between each type.
The required arguments are defined as follows:
type-name
Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).
Description: The name for identifying the custom data type.
data-type
Syntax: <data-type>
Description: The name of an existing data type that you want to include in the union type.
Example
Suppose you had created constrained types named ipv4 and ipv6 that match IPv4 and IPv6 addresses, respectively. You can then create a union data type that groups the ipv4 and ipv6 data types together. This union data type would allow you to validate IP addresses in your data regardless of their specific format by checking the values against one data type instead of two. The following expression defines a union data type named ipaddress that groups the ipv4 and ipv6 data types together:
type ipaddress = ipv4 | ipv6
For the definitions of the ipv4 and ipv6 data types, see the Examples section for constrained data types.