Custom data types

SPL2 supports a set of built-in data types by default. You can also create and use custom data types to define data characteristics and constraints beyond what the built-in types provide.

Data types are classifications that define the characteristics of a given piece of data, such as the allowed format and range of values. By constraining your data to types, you can specify how that data can be structured, what values it can include, and how that data can be used in commands and functions.

The built-in data types in SPL2 include common types such as string, number, array, and more. If the available data types don’t meet your requirements, you can create custom data types with more fine-tuned definitions.

For example, assume that your data includes an event field named test_scores that is intended to store numerical values ranging from 0 to 100. The most appropriate built-in data type for this test_scores field would be the int type. However, the int type allows values that are beyond the range of 0 to 100, such as -36 and 12000. To constrain the test_scores field to meet your requirements, you must create a custom data type that refines the definition of the int data type to be more specific.

The following sections on this page provide detailed information about how to create custom data types:

  • Defining custom data types

  • Combining data types

  • Supported custom data type expressions

For information about how to check whether your data matches a custom data type, see the Checking data against custom data types section on this page.

For information about the built-in data types that are available in SPL2, see Built-in data types.

For more information about how you can use data types to schematize your data and support various practical use cases, see Creating and using data schemas with SPL2 data types.

Defining custom data types

A custom data type is always derived from an existing data type. This existing data type can be a built-in type or another custom type. To define a custom data type, refine the definition of an existing data type to meet your requirements by doing one or a combination of the following:

  • Limiting the range of supported values

  • Restricting the values to a particular format

  • Adding or subtracting the criteria of another data type definition

Use the following syntax to define a custom data type. The required syntax is in bold.

type <type-name> = <type-expression>

The <type-name> argument specifies the name for identifying the custom data type. If this name includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the name in single quotation marks ( ' ).

The <type-expression> argument specifies the SPL2 expression for deriving the definition of the custom data type from an existing type definition. The exact syntax of this expression varies depending on how you want to refine the data type that you're using as the starting point. For information about the supported syntax, see Supported custom data type expressions on this page.

For example, the following expression defines a custom data type named score that allows integer values ranging from 0 to 100:

type score = int WHERE $value BETWEEN 0 AND 100

As another example, the following expression defines a custom data type named grade that allows the values A, B, C, D, and F:

type grade = string WHERE $value IN ("A", "B", "C", "D", "F")

Combining data types

Because custom data types are derived from existing types, you can combine and layer multiple data types together to create custom data types with finely tuned definitions that support your specific use cases.

For example, after defining the score and grade data types as described in the previous section, you can then define a custom data type named results that combines both of those types. This results data type allows the letter grade values A, B, C, D, and F as well as test score values between 0 and 100:

type results = grade | score

You can use parentheses to control the order of operations in a type expression that has multiple parts. For example, the following two data types look similar but are defined differently:

type final_course_results = grade | score[]
type assignment_marks = (grade | score)[]

The final_course_results data type allows values that are either a single letter grade or an array of the exam scores used to calculate the final letter grade, because the array brackets apply only to the score type. Examples of valid data values for the final_course_results type would be "A" and [97, 90, 87, 91].

The assignment_marks data type allows only an array of values, but the individual values in the array can be either letter grades or score numbers. Examples of valid data values for the assignment_marks data type would be ["A", "B", "A", "C"], [97, 80, 91, 70], and ["A", 80, 77, "B"].
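
Layering also works through constraints, because a custom type can be derived from another custom type. As a minimal sketch, assuming the score and grade types from the previous section are already defined, the following expressions derive a narrower type from score and then reuse it in a union. The names passing_score and pass_or_grade and the threshold of 60 are hypothetical and used only for illustration:

```spl2
type passing_score = score WHERE $value >= 60
type pass_or_grade = passing_score | grade
```

Under these definitions, a value such as 75 would be expected to match both score and passing_score, while a value such as 45 would match score but not passing_score.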

Checking data against custom data types

To check whether your data matches a custom data type, use the IS operator in a predicate expression. This predicate expression can be the return expression of a custom eval function. For more information, see IS operator and Custom eval functions.

For example, assume that the error_code custom data type has been defined. The following eval command checks whether the values in the code_number field match the error_code data type, and then returns either true or false in a field named type_check_results:

... | eval type_check_results = if((code_number IS error_code), "true", "false")

As another example, the following expression defines a custom function named type_check that checks whether the input data matches the error_code data type, and then returns either true or false:

function type_check($data) {
    return if(($data IS error_code), "true", "false")
}
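
To tie a type definition and a check together, the following sketch reuses the score type and the test_scores field from the earlier examples on this page. The output field name score_check is a hypothetical name chosen for illustration:

```spl2
type score = int WHERE $value BETWEEN 0 AND 100

... | eval score_check = if((test_scores IS score), "true", "false")
```

With this definition, a test_scores value of 85 should produce true, and a value of 12000 should produce false.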

Supported custom data type expressions

SPL2 supports a variety of expressions that you can use to define a custom data type. You can use these expressions to limit the range of values that the type supports, restrict the values of that type to a particular format, and add or subtract criteria from another data type definition.

The following table categorizes and describes the type expressions that you can use to define a custom data type in SPL2:

Category Description

Additive types

A derived data type that adds together the properties of 2 structured data types into a single data type.

For example, assume that customer is a structured data type that contains the properties id and name, and that demographic is a structured data type that contains the properties age and location. The following expression defines an additive type named customer_details that contains all 4 of those properties:

type customer_details = customer + demographic

Array types

A derived data type that defines an ordered collection of homogeneous values. Use array types to describe JSON arrays of same-typed values.

For example, the following expression defines an array type named fullnames that is a JSON array of string values:

type fullnames = string[]

Constrained types

A derived data type that applies a constraint, in the form of a predicate expression, on an existing data type in order to make the type definition more narrow or precise.

For example, the following expression defines a constrained type named http_error that only allows the integer values 403, 404, and 408:

type http_error = int WHERE $value IN (403, 404, 408)

Structured types

A derived data type that defines an object containing one or more properties, and optionally specifies the data type of each property. Use structured types to describe your data as JSON objects.

For example, the following expression defines a structured type named person that is a JSON object containing the properties firstname, surname, and age. The firstname and surname properties must be strings, and the age property must be an integer:

type person = {
   firstname: string,
   surname: string,
   age: int
}

Subtractive types

A derived data type that subtracts the properties defined in one structured data type from the list of properties defined in another structured data type.

For example, assume the following:

  • customer_details is a structured type that contains the properties id, name, age, and location.

  • demographic is a structured data type that contains the properties age and location.

The following expression defines a subtractive type named customer that contains only the id and name properties:

type customer = customer_details - demographic

Union types

A derived type that defines a collection of existing data types. A data value will match the union type if the value matches any of the data types listed in the union type definition.

For example, the following expression defines a union type named json that groups together the built-in types called array and object. Any valid JSON arrays and JSON objects in your data will match this union type and be considered to be valid json values.

type json = array | object

Additive types

An additive data type is a derived data type that adds together the properties of 2 structured data types into a single structured data type. Use an additive type expression to describe data as a JSON object that contains properties that are already defined in existing structured data types.

Note: Structured data types describe data as JSON objects. See Structured types on this page for more information.

If the structured types that you’re adding together contain properties that have the same name, then be aware of the following:

  • Those properties must be associated with the same data type, or else an error occurs.

  • If one property is required while another is optional, the resulting property in the additive data type will be required.

Syntax

The required syntax for additive types is in bold:

type <type-name> = <structured-data-type> + <structured-data-type>

The required arguments are defined as follows:

type-name

Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).

Description: The name for identifying the custom data type.

structured-data-type

Syntax: <structured-data-type>

Description: The name of an existing structured data type that defines the object properties you want to include in the additive type.

Examples

Suppose you had created the following structured data types named customer and address:

type customer = { 
   id: int,
   name: string,
   age: int
}
type address = { 
   street: string,
   city: string,
   state: string,
   zip: int
}

The customer data type is a JSON object containing 3 properties: id, name, and age. The address data type is a JSON object containing 4 properties: street, city, state, and zip.

You can use the following expression to create an additive type named customerWithAddress that is a JSON object with all 7 of the properties defined in the customer and address types: id, name, age, street, city, state, and zip.

type customerWithAddress = customer + address
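
The rules for same-named properties described earlier in this section can be illustrated with a short sketch. The type names contact, subscriber, and subscribed_contact are hypothetical. Both structured types define name and email as strings, so the property types agree, but email is optional in contact and required in subscriber:

```spl2
type contact = {
   name: string,
   email?: string
}
type subscriber = {
   name: string,
   email: string
}
type subscribed_contact = contact + subscriber
```

Based on those rules, the email property in subscribed_contact would be required. If email were defined as an int in one of the two types, the additive expression would be expected to produce an error.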

Array types

An array data type is a derived data type that defines an ordered collection of homogeneous values. Use an array type expression to describe data as a JSON array of same-typed values.

Syntax

The required syntax for array types is in bold:

type <type-name> = <data-type>[]

The required arguments are defined as follows:

type-name

Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).

Description: The name for identifying the custom data type.

data-type

Syntax: <data-type>

Description: The name of an existing data type that you want to constrain into a JSON array format. When specifying the object data type, you can choose to use the notation {*} instead of the name object.

Examples

The following expression defines a custom data type named fullnames that is an array of strings. An example of a valid fullnames value would be ["john_anderson", "sasha_patel", "wei_zhang"].

type fullnames = string[]

The following expression defines a custom data type named personnel that is an array of objects. An example of a valid personnel value would be [{name: "Claudia Garcia", id: 109651}, {name: "Ikraam Rahat", id: 109562}].

type personnel = object[]

You can also define the personnel data type using this expression:

type personnel = {*}[]
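
The <data-type> argument can be any existing data type, including a custom one. As a small sketch, assuming that the person structured type described in the Structured types section on this page is already defined, the following expression defines an array of person objects. The name people is hypothetical:

```spl2
type people = person[]
```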

Constrained types

A constrained data type is a derived data type that adds constraints to an existing data type in order to make the type definition more narrow or precise. Use a constrained type expression to refine an existing data type by including more specific requirements.

When specifying the constraint, you can use the match function to require data values to match a regular expression. Regular expression matching allows you to define custom data types that capture complex and varied data values, such as IP addresses, credit card numbers, and social security numbers.

Syntax

The required syntax for constrained types is in bold:

type <type-name> = <data-type> WHERE <predicate-expression>

The required arguments are defined as follows:

type-name

Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).

Description: The name for identifying the custom data type.

data-type

Syntax: <data-type>

Description: The name of an existing data type that you want to add constraints to.

predicate-expression

Syntax: A predicate expression or a match function. For more information, see Predicate expressions in the current manual and match in the SPL2 Search Reference.

Description: An expression describing the constraint that you want to apply to the data type definition. The resulting custom data type accepts values only if this expression resolves to TRUE for those values. You can use the $value variable to access the values in your data that you want this expression to evaluate. Additionally, you can use dot notation in the format $value.<property-name> to access a property in a structured data type.

Examples

The following expression defines a custom data type named positive_integer that is an integer greater than 0. Examples of valid positive_integer values include 1, 24, and 300.

type positive_integer = int WHERE $value > 0

This next expression defines a custom data type named ipv4 that is a string that matches a regular expression for valid IPv4 addresses:

type ipv4 = string WHERE match($value, "^(([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\\.){3}([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])$")

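As another example of using a regular expression match as the constraint, the following expression defines a custom data type named ipv6 that is a string that matches a regular expression for IPv6 addresses written in full form, with all 8 groups of hexadecimal digits present. This expression does not match abbreviated addresses that use the double colon ( :: ) shorthand:

type ipv6 = string WHERE match($value, "^((([0-9a-fA-F]){1,4})\\:){7}([0-9a-fA-F]){1,4}$")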

This next example demonstrates how to create a constrained data type that is based on a property in a structured data type. Consider the following structured data type called person:

type person = {
   firstname: string,
   surname: string,
   age: int
}

You can create a constrained data type called elderly_person that refers to the person data type and puts a constraint on the age property. In this example, the value for the age property in the person data type must be greater than 70:

type elderly_person = person WHERE $value.age > 70
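
As a shorter illustration of a regular expression constraint, the following sketch defines a type for 5-digit postal codes stored as strings. The type name us_zip_code and the pattern are hypothetical and are not part of the examples above. The ^ and $ anchors are included on the assumption that the match function can otherwise succeed on a substring of the value:

```spl2
type us_zip_code = string WHERE match($value, "^[0-9]{5}$")
```

Under that assumption, a value such as "98101" would be expected to match this type, while "981012" and "9810a" would not.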

Structured types

A structured data type is a derived data type that defines an object containing one or more properties. Use a structured type expression to describe data as a JSON object.

Syntax

The required syntax for structured types is in bold:

type <type-name> = {<property-name> [: <property-data-type>], ...}

Separate each property with a comma. Enclose properties in curly brackets ( { } ).

The required arguments are defined as follows:

type-name

Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).

Description: The name for identifying the custom data type.

property-name

Syntax: A string of characters or a single asterisk ( * ). If you specify a string of characters, it must follow these conventions:

  • If the string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).

  • If you append a question mark ( ? ) to the string, then the property identified by the string is treated as optional. If the string is enclosed in single quotation marks ( ' ), append the question mark after the closing quotation mark.

Description: The name of a property in the JSON object. You can specify an optional property by appending a question mark ( ? ) to the property name. As an alternative to specifying a property name, you can use an asterisk ( * ) as a wildcard character that allows the property to have any name.

The optional arguments are defined as follows:

property-data-type

Syntax: A string of characters or a single asterisk ( * ).

Description: The data type of the property. You can specify the name of any built-in or custom data type. As an alternative to specifying the name of a data type, you can use an asterisk ( * ) as a wildcard character that allows the property to be of any data type.

Default: any

Examples

The following expression defines a custom data type named person that is an object with two properties: firstname and surname. The data types of the properties are not specified, so they default to the any type.

type person = { 
   firstname, 
   surname 
}

For comparison, this next expression defines a custom data type named person_with_age that is an object with three properties: firstname, surname, and age. The firstname and surname properties must be strings, and the age property must be an integer.

type person_with_age = { 
   firstname: string, 
   surname: string,
   age: int
}

You can use an asterisk ( * ) as a wildcard character that allows a property to have any name or data type. For example, the following expression defines a custom data type named person_open_ended that is an object that must contain properties named firstname, surname, and age, but can also contain additional properties with any name or data type:

type person_open_ended = { 
   firstname: string, 
   surname: string,
   age: int,
   *
}

You can choose to use the wildcard for the property name only or the property data type only, as shown in these next two examples:

type person_property_name_varied = { 
   firstname: string, 
   surname: string,
   age: int,
   *: string
}
type person_property_type_varied = { 
   firstname: string, 
   surname: string,
   age: int,
   details: *
}

You can specify an optional property by appending a question mark ( ? ) to the property name. For example, the following expression defines a custom data type named person_flexible that must contain properties named firstname and surname, and can optionally contain a property named age:

type person_flexible = { 
   firstname: string, 
   surname: string,
   age?: int
}

Both of the following JSON objects meet the requirements of the person_flexible data type:

  • {firstname: "taylor", surname: "lee"}

  • {firstname: "john", surname: "anderson", age: 46}
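
Because a property data type can be the name of any built-in or custom data type, structured types can be nested. The following sketch uses hypothetical type and property names to show one structured type used as the data type of a property in another. It also shows an optional property whose name must be quoted because it contains a hyphen, with the question mark placed after the closing quotation mark:

```spl2
type address = {
   street: string,
   city: string
}
type employee = {
   name: string,
   home_address: address,
   'badge-id'?: int
}
```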

Subtractive types

A subtractive data type is a derived data type that subtracts the properties defined in one structured data type from the list of properties defined in another structured data type. Use a subtractive type expression to describe data as a JSON object that contains only a subset of the properties that are already defined in existing structured data types.

Note: Structured data types describe data as JSON objects. See Structured types on this page for more information.

When you define a subtractive data type, you specify 2 structured data types, and the properties in the structured type that’s specified later are removed from the structured type that’s specified before it. The properties must meet the following requirements:

  • The property in the later structured type must be of type any, or else an error occurs.

  • If a property in the later structured type is required, then the same property must also exist in the earlier structured type, or else an error occurs.

  • Wildcard properties denoted by asterisks ( * ) are treated like regular properties. If both structured types contain a wildcard property, that property is removed, and the resulting subtractive data type does not contain the wildcard property.

Syntax

The required syntax for subtractive types is in bold:

type <type-name> = <structured-data-type> - <structured-data-type>

The required arguments are defined as follows:

type-name

Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).

Description: The name for identifying the custom data type.

structured-data-type

Syntax: <structured-data-type>

Description: The name of an existing structured data type. The properties from the structured type that is specified later in the definition are removed from the structured type that is specified before it.

Examples

Suppose you had created the following structured data types named client and person:

type client = {
   id: int,
   name: string,
   age: int
}
type person = {
   name: string,
   age: int
}

The client data type is a JSON object containing 3 properties: id, name, and age. The person data type is a JSON object containing 2 of the same properties: name and age. You can then create a subtractive type named client_anonymized that is a JSON object with only the id property:

type client_anonymized = client - person

Union types

A union data type is a derived data type that defines a collection of existing data types. Use union type expressions to create a custom data type that groups data of different types into a single data type.

Syntax

The required syntax for union types is in bold:

type <type-name> = <data-type> | <data-type> ...

Separate each data type in the union type definition with a pipe character ( | ), which works like an OR operator between each type.

The required arguments are defined as follows:

type-name

Syntax: A string. If this string includes any character other than a-z, A-Z, 0-9, or the underscore ( _ ) character, then you must enclose the string in single quotation marks ( ' ).

Description: The name for identifying the custom data type.

data-type

Syntax: <data-type>

Description: The name of an existing data type that you want to include in the union type.

Example

Suppose you had created constrained types named ipv4 and ipv6 that match IPv4 and IPv6 addresses, respectively. You can then create a union data type that groups the ipv4 and ipv6 data types together. This union data type would allow you to validate IP addresses in your data regardless of their specific format by checking the values against one data type instead of two. The following expression defines a union data type named ipaddress that groups the ipv4 and ipv6 data types together:

type ipaddress = ipv4 | ipv6

Note: For the regular expressions used to create the ipv4 and ipv6 data types, see the Examples in the Constrained types section on this page.
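
After the union type is defined, you can check values against it with the IS operator, as described in the Checking data against custom data types section on this page. The following is a minimal sketch that assumes the ipaddress type is already defined. The field names src_ip and ip_check are hypothetical:

```spl2
... | eval ip_check = if((src_ip IS ipaddress), "true", "false")
```

A src_ip value is expected to produce true if it matches either the ipv4 or the ipv6 data type.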