OCSF data conversion process
The Ingest Processor parses the incoming data, detects its event type, and then maps the data to an OCSF schema. Specifically, the conversion involves the following steps:
- Parse the incoming data according to the data format that is associated with the specified source type.
- Identify the type of event that the data represents.
- Map the data to an appropriate OCSF schema based on the source type and event type.
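The three steps can be sketched in Python. This is an illustrative sketch only, not the Ingest Processor's actual implementation: the regular expression, the message-ID table, and the `convert_to_ocsf` helper are all assumptions made for the example.

```python
import re

# Illustrative message-ID table: maps a Cisco ASA message ID to an
# OCSF event class (611101 -> Authentication, class_uid 3002).
EVENT_CLASSES = {"611101": ("Authentication", 3002)}

def convert_to_ocsf(raw: str) -> dict:
    # Step 1: parse the raw line according to the cisco:asa syslog format.
    m = re.search(r"%ASA-(\d)-(\d+): (.*)", raw)
    if m is None:
        raise ValueError("unparseable cisco:asa log")
    level, msg_id, message = m.groups()
    # Step 2: identify the event type from the message ID.
    if msg_id not in EVENT_CLASSES:
        raise ValueError(f"unsupported event type: {msg_id}")
    _, class_uid = EVENT_CLASSES[msg_id]
    # Step 3: map the parsed fields onto the matching OCSF schema.
    return {
        "class_uid": class_uid,
        "metadata": {"event_code": msg_id},
        "message": message,
        "unmapped": {"level": level},
    }

result = convert_to_ocsf(
    "<166>Jan 05 2024 03:21:14 10.194.183.195 : %ASA-6-611101: "
    "User authentication succeeded: Uname: <sasha_patel>"
)
print(result["class_uid"])  # 3002
```

The real conversion fills in many more attributes, as the example output below shows; the point here is only the parse, identify, map sequence.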
For example, assume that the Ingest Processor receives the following log from the cisco:asa source type:

<166>Jan 05 2024 03:21:14 10.194.183.195 : %ASA-6-611101: User authentication succeeded: Uname: <sasha_patel>

When configured to process cisco:asa data and convert it to OCSF format, the Ingest Processor identifies the log as an authentication event based on the 611101 message ID, and then maps the log to the Authentication schema from OCSF. The converted data looks like the following:

{
category_uid: 3,
metadata: {
uid: "13003627b465aab8433481239578db50",
product: {
name: "ASA",
vendor_name: "Cisco"
},
log_name: "Syslog",
event_code: "611101",
profiles: [
"host"
],
original_time: "Jan 05 2024 03:21:14",
version: "1.5.0"
},
session: {
is_vpn: true
},
message: "User authentication succeeded: Uname: <sasha_patel>",
unmapped: {
level: "6",
facility: 20
},
status_id: 1,
service: {
name: "ASA"
},
activity_id: 1,
class_uid: 3002,
dst_endpoint: {
ip: "10.194.183.195"
},
severity_id: 1,
time: 1704424874000,
device: {
type_id: 9,
ip: "10.194.183.195"
},
user: {
name: "sasha_patel"
},
type_uid: 300201
}
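In the output above, type_uid encodes both the event class and the activity: OCSF defines it as class_uid * 100 + activity_id. A quick check against the values in this example:

```python
class_uid = 3002   # Authentication
activity_id = 1    # Logon
type_uid = class_uid * 100 + activity_id
print(type_uid)  # 300201, matching the converted event above
```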
For information about the Authentication schema, see “Authentication” in the OCSF schema browser: https://schema.ocsf.io/classes/authentication.
Fallback behavior for failed conversions
The OCSF conversion can fail for any of the following reasons:
- The source type or event type of the data is not supported. For more information, see Supported source types and event types.
- The data doesn't match the source type specified in the pipeline configuration.
- The source type is unknown due to a configuration error.
If the conversion fails, then the Ingest Processor maps the data to the generic Application Error schema from OCSF. For information about this schema, see “Application Error” in the OCSF schema browser: https://schema.ocsf.io/classes/application_error.
For example, assume that the Ingest Processor receives the following cisco:asa log:

<166>Jan 05 2024 03:21:14 10.194.183.195 : %ASA-6-611101: User authentication succeeded: Uname: <sasha_patel>

Because the pipeline is configured to convert pan:globalprotect data, the conversion fails and the Ingest Processor produces the following result. Notice that the result includes a message attribute that contains an error message, and a raw_data attribute that contains a copy of the original raw data.

{
category_uid: 6,
metadata: {
uid: "ecb1da171382f0b33d3608eeab4b902a",
product: {
path: "/processor",
feature: {
name: "to_ocsf eval function"
},
name: "Splunk SPL2 Processor",
vendor_name: "Splunk"
},
version: "1.5.0"
},
status_id: 2,
activity_id: 2,
class_uid: 6008,
severity_id: 6,
time: 1748295979566,
message: "OCSF translation failed: no matched predicate rules and no default rule; source type \"pan:globalprotect\": no translation",
raw_data: "<166>Jan 05 2024 03:21:14 10.194.183.195 : %ASA-6-611101: User authentication succeeded: Uname: <sasha_patel>",
type_uid: 600802
}
Application Error events are identified by the class_uid value 6008. You can configure a pipeline to filter failed OCSF conversion results out of your data by using the following where command:

| where json_extract(_raw, "class_uid") != "6008"
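The filter above keeps only events whose class_uid is not 6008. The equivalent logic, sketched in Python purely for illustration (the sample event strings are made up):

```python
import json

events = [
    '{"class_uid": 3002, "activity_id": 1}',  # successful conversion
    '{"class_uid": 6008, "activity_id": 2}',  # failed conversion
]

# Keep only events that are not Application Error (class_uid 6008) records.
kept = [e for e in events if json.loads(e).get("class_uid") != 6008]
print(len(kept))  # 1
```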
Alternatively, you can use the route command to send failed OCSF conversion results to a different destination than successfully converted data. For example, the following pipeline sends failed conversion results to $destination2, and sends successfully converted data to $destination:

import { ocsf, route } from /splunk/ingest/commands
$pipeline = | from $source
| ocsf
| route json_extract(_raw, "class_uid") == 6008, [
| into $destination2
]
| into $destination;
For more information about the route command, see Process a subset of data using Ingest Processor.
Retaining a copy of the original data
To retain a copy of the original raw data in the OCSF-formatted output, do one of the following:
- When configuring the Convert _raw to OCSF format pipeline action, which represents the ocsf command, turn on the Include original raw data option.
- When configuring the to_ocsf function, set the include_raw option to true.
When either option is turned on, the OCSF-formatted output includes a raw_data attribute containing a copy of the original data.
Alternatively, you can use the thru command in your pipeline to create a backup copy of the original data and send it to a different data destination than the OCSF-formatted data. For example, the following pipeline sends an unaltered copy of the original data to $destination2, and then sends the OCSF-formatted data to $destination:

import ocsf from /splunk/ingest/commands
$pipeline = | from $source | thru [
| into $destination2
]
| ocsf
| into $destination;
For more information about the thru command, see Process a copy of data using Ingest Processor.
Including sibling strings for enum attributes
In OCSF, ID values from the data are stored in enum attributes. Some enum attributes are paired with other attributes, known as sibling strings, that provide descriptive labels for the ID values in the enum attributes.
By default, the Ingest Processor doesn't include sibling strings for enum attributes when converting data to OCSF format. However, you can choose to include sibling strings by turning on the Include sibling strings for enum attributes option in the ocsf command or the add_enum_siblings option in the to_ocsf function.
For example, assume that the converted data includes the key-value pair severity_id: 1. According to the Authentication schema in OCSF, severity_id is an enum attribute that has a corresponding sibling string called severity. If you configure the Ingest Processor to include sibling strings, then the converted data will include this key-value pair: severity: Informational.
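The sibling-string behavior amounts to a lookup from the enum value to its descriptive label. A minimal sketch, where the label table and the `add_enum_siblings` helper are illustrative; the table covers only the OCSF severity values that appear in this topic (1 is Informational, 6 is Fatal):

```python
# OCSF severity labels for the values shown in this topic's examples.
SEVERITY_LABELS = {1: "Informational", 6: "Fatal"}

def add_enum_siblings(event: dict) -> dict:
    # Add the descriptive sibling string next to the enum attribute.
    if "severity_id" in event:
        event["severity"] = SEVERITY_LABELS.get(event["severity_id"], "Unknown")
    return event

event = add_enum_siblings({"severity_id": 1, "class_uid": 3002})
print(event["severity"])  # Informational
```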
For information about the Authentication schema, see “Authentication” in the OCSF schema browser: https://schema.ocsf.io/classes/authentication.
Including observables
A security observable is any piece of information from your event data that is especially relevant for detecting and analyzing potential security threats. Observables can be found in a variety of different fields or attributes in your events. For example, IP addresses are observables, and they might be found in attributes such as device, src_endpoint, and dst_endpoint.
When converting data to OCSF format, you can choose to summarize the attributes that contain observables into an array of objects called observables. You can then verify the presence of a specific observable by checking this observables array instead of checking multiple attributes individually.
For example, the following shows the observables array from an OCSF-formatted event:

{
...
observables: [
{
type_id: 20,
name: "dst_endpoint"
},
{
type_id: 2,
name: "dst_endpoint.ip",
value: "10.160.0.10"
},
{
type_id: 20,
name: "device"
},
{
type_id: 2,
name: "device.ip",
value: "10.160.0.10"
},
{
type_id: 20,
name: "src_endpoint"
},
{
type_id: 2,
name: "src_endpoint.ip",
value: "10.160.39.123"
}
],
...
}
By default, the Ingest Processor doesn't include the observables array when converting data to OCSF format. However, you can choose to include it by turning on the Include observables option in the ocsf command or the add_observables option in the to_ocsf function.
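Building the observables array amounts to walking the attributes that can hold observable values and emitting one entry per hit. A simplified sketch under these assumptions: the `build_observables` helper is hypothetical, it handles only the device and endpoint attributes, and the type IDs (20 for an endpoint object, 2 for an IP address) are taken from the example array above:

```python
def build_observables(event: dict) -> list:
    # Summarize device/endpoint observables into a single array,
    # mirroring the observables array shown in the example above.
    observables = []
    for attr in ("device", "src_endpoint", "dst_endpoint"):
        obj = event.get(attr)
        if not isinstance(obj, dict):
            continue
        observables.append({"type_id": 20, "name": attr})
        if "ip" in obj:
            observables.append(
                {"type_id": 2, "name": f"{attr}.ip", "value": obj["ip"]}
            )
    return observables

event = {
    "device": {"type_id": 9, "ip": "10.160.0.10"},
    "dst_endpoint": {"ip": "10.160.0.10"},
    "src_endpoint": {"ip": "10.160.39.123"},
}
obs = build_observables(event)
print(len(obs))  # 6: one endpoint entry plus one IP entry per attribute
```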
See also
-
“Understanding the Open Security Schema Framework” on GitHub: https://github.com/ocsf/ocsf-docs/blob/main/Understanding%20OCSF.md
-
The OCSF schema browser: https://schema.ocsf.io