OCSF data conversion process
The Edge Processor parses the incoming data, detects its event type, and then maps the data to an OCSF schema. During the conversion, the Edge Processor does the following:
- Parses the incoming data according to the data format that is associated with the specified source type.
- Identifies the type of event that the data represents.
- Maps the data to an appropriate OCSF schema based on the source type and event type.
For example, assume that you have the following log from the cisco:asa source type:
<166>Jan 05 2024 03:21:14 10.194.183.195 : %ASA-6-611101: User authentication succeeded: Uname: <sasha_patel>
If you configure a pipeline to receive cisco:asa data and convert it to OCSF format, the Edge Processor identifies the log as an authentication event based on the 611101 message ID, and then maps the log to the Authentication schema from OCSF. The converted data looks like the following:
{
    category_uid: 3,
    metadata: {
        uid: "13003627b465aab8433481239578db50",
        product: {
            name: "ASA",
            vendor_name: "Cisco"
        },
        log_name: "Syslog",
        event_code: "611101",
        profiles: [
            "host"
        ],
        original_time: "Jan 05 2024 03:21:14",
        version: "1.5.0"
    },
    session: {
        is_vpn: true
    },
    message: "User authentication succeeded: Uname: <sasha_patel>",
    unmapped: {
        level: "6",
        facility: 20
    },
    status_id: 1,
    service: {
        name: "ASA"
    },
    activity_id: 1,
    class_uid: 3002,
    dst_endpoint: {
        ip: "10.194.183.195"
    },
    severity_id: 1,
    time: 1704424874000,
    device: {
        type_id: 9,
        ip: "10.194.183.195"
    },
    user: {
        name: "sasha_patel"
    },
    type_uid: 300201
}
For information about the Authentication schema, see “Authentication” in the OCSF schema browser: https://schema.ocsf.io/classes/authentication.
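The parse, identify, and map steps can be illustrated with a small Python sketch that processes the sample cisco:asa log outside of the Edge Processor. This is not how the Edge Processor is implemented. The regular expression, the EVENT_TYPES lookup table, and the convert_asa function are hypothetical names introduced for illustration, and the sketch assumes that the log timestamp is in UTC, which is consistent with the time value in the converted data above.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern for the sample cisco:asa syslog line shown above.
ASA_LINE = re.compile(
    r"^<(?P<pri>\d+)>"
    r"(?P<ts>\w{3} \d{2} \d{4} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) : "
    r"%ASA-(?P<level>\d)-(?P<event_code>\d+): "
    r"(?P<message>.*)$"
)

# Hypothetical lookup table: ASA message ID -> (class_uid, activity_id).
EVENT_TYPES = {"611101": (3002, 1)}


def convert_asa(raw: str) -> dict:
    # Step 1: parse the raw line according to the source type's format.
    match = ASA_LINE.match(raw)
    if match is None:
        raise ValueError("data does not match the cisco:asa format")
    fields = match.groupdict()

    # Step 2: identify the event type from the message ID.
    if fields["event_code"] not in EVENT_TYPES:
        raise ValueError("unsupported event type: " + fields["event_code"])
    class_uid, activity_id = EVENT_TYPES[fields["event_code"]]

    # Step 3: map the parsed fields to OCSF attributes.
    parsed = datetime.strptime(fields["ts"], "%b %d %Y %H:%M:%S")
    # Assumption: the timestamp is UTC. OCSF time is in epoch milliseconds.
    time_ms = int(parsed.replace(tzinfo=timezone.utc).timestamp()) * 1000
    user = re.search(r"Uname: <(\w+)>", fields["message"])
    return {
        "class_uid": class_uid,
        "activity_id": activity_id,
        # In OCSF, type_uid is class_uid * 100 + activity_id.
        "type_uid": class_uid * 100 + activity_id,
        "time": time_ms,
        "message": fields["message"],
        "metadata": {
            "event_code": fields["event_code"],
            "original_time": fields["ts"],
        },
        # The syslog priority encodes facility * 8 + severity level.
        "unmapped": {"level": fields["level"], "facility": int(fields["pri"]) // 8},
        "dst_endpoint": {"ip": fields["host"]},
        "user": {"name": user.group(1) if user else None},
    }


SAMPLE = (
    "<166>Jan 05 2024 03:21:14 10.194.183.195 : %ASA-6-611101: "
    "User authentication succeeded: Uname: <sasha_patel>"
)
event = convert_asa(SAMPLE)
```

In this sketch, the syslog priority value 166 decodes to facility 20 and level 6, which matches the unmapped attribute in the converted data above.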
Fallback behavior for failed conversions
The conversion to OCSF format can fail in the following situations:
- The source type or event type of the data is not supported. For more information, see Supported source types and event types.
- The data doesn't match the source type specified in the pipeline configuration.
- The source type is unknown due to a configuration error.
If the conversion fails, then the Edge Processor maps the data to the generic Application Error schema from OCSF. For information about this schema, see “Application Error” in the OCSF schema browser: https://schema.ocsf.io/classes/application_error.
For example, assume that you have the following cisco:asa log:
<166>Jan 05 2024 03:21:14 10.194.183.195 : %ASA-6-611101: User authentication succeeded: Uname: <sasha_patel>
If the pipeline is configured to convert this log as pan:globalprotect data, the conversion fails and the Edge Processor produces the following result. Notice that the result includes a message attribute that contains an error message, and a raw_data attribute that contains a copy of the original raw data.
{
    category_uid: 6,
    metadata: {
        uid: "ecb1da171382f0b33d3608eeab4b902a",
        product: {
            path: "/processor",
            feature: {
                name: "to_ocsf eval function"
            },
            name: "Splunk SPL2 Processor",
            vendor_name: "Splunk"
        },
        version: "1.5.0"
    },
    status_id: 2,
    activity_id: 2,
    class_uid: 6008,
    severity_id: 6,
    time: 1748295979566,
    message: "OCSF translation failed: no matched predicate rules and no default rule; source type "pan:globalprotect": no translation",
    raw_data: "<166>Jan 05 2024 03:21:14 10.194.183.195 : %ASA-6-611101: User authentication succeeded: Uname: <sasha_patel>",
    type_uid: 600802
}
Application Error events are identified by the class_uid value 6008. You can configure a pipeline to filter failed OCSF conversion results out of your data by using the following where command:
| where json_extract(_raw, "class_uid") != "6008"
You can also use the route command to send failed OCSF conversion results to a different destination than successfully converted data. For example, the following pipeline sends failed conversion results to $destination2, and sends successfully converted data to $destination:
import { ocsf, route } from /splunk/ingest/commands
$pipeline = | from $source 
| ocsf
| route json_extract(_raw, "class_uid") == 6008, [
    | into $destination2
]
| into $destination;
For more information about the route command, see Process a subset of data using an Edge Processor.
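If failed conversion results still reach a downstream consumer, the same class_uid check can be applied there. The following Python sketch is only an illustration of that check and is not part of the Edge Processor. The function names are hypothetical, and the sketch assumes that each event has already been decoded into a dictionary.

```python
APPLICATION_ERROR_CLASS_UID = 6008


def is_failed_conversion(event: dict) -> bool:
    # Compare as strings so that both 6008 and "6008" are recognized.
    return str(event.get("class_uid")) == str(APPLICATION_ERROR_CLASS_UID)


def split_events(events: list) -> tuple:
    """Separate successfully converted events from failed conversions."""
    converted, failed = [], []
    for event in events:
        if is_failed_conversion(event):
            failed.append(event)
        else:
            converted.append(event)
    return converted, failed


events = [
    {"class_uid": 3002, "type_uid": 300201},
    {"class_uid": 6008, "type_uid": 600802},
]
converted, failed = split_events(events)
```

Comparing the values as strings keeps the check valid whether class_uid arrives as a number or as a string.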
Retaining a copy of the original data
You can retain a copy of the original data in the OCSF-formatted output by doing one of the following:
- When configuring the Convert _raw to OCSF format pipeline action, which represents the ocsf command, turn on the Include original raw data option.
- When configuring the to_ocsf function, set the include_raw option to true.
If these options are turned on, the OCSF-formatted output will include a raw_data attribute containing a copy of the original data.
You can also use the thru command in your pipeline to create a backup copy of the original data and send it to a different data destination than the OCSF-formatted data. For example, the following pipeline sends an unaltered copy of the original data to $destination2, and then sends the OCSF-formatted data to $destination:
import ocsf from /splunk/ingest/commands
$pipeline = | from $source
| thru [
    | into $destination2
]
| ocsf
| into $destination;
For more information about the thru command, see Process a copy of data using an Edge Processor.
Including sibling strings for enum attributes
In OCSF, ID values from the data are stored in enum attributes. Some enum attributes are paired with other attributes, known as sibling strings, that provide descriptive labels for the ID values in the enum attributes.
By default, Edge Processors don’t include sibling strings for enum attributes when converting data to OCSF format. However, you can choose to include sibling strings by turning on the Include sibling strings for enum attributes option in the ocsf command or the add_enum_siblings option in the to_ocsf function.
For example, assume that the converted data includes the key-value pair severity_id: 1. According to the Authentication schema in OCSF, severity_id is an enum attribute that has a corresponding sibling string called severity. If you configure the Edge Processor to include sibling strings, then the converted data will include this key-value pair: severity: Informational.
For information about the Authentication schema, see “Authentication” in the OCSF schema browser: https://schema.ocsf.io/classes/authentication.
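The relationship between an enum attribute and its sibling string can be sketched in Python as a lookup from ID to label. This is only an illustration of the concept, not the Edge Processor's implementation. The add_severity_sibling function is a hypothetical name, and the labels are the severity_id values defined in the OCSF schema.

```python
# severity_id labels as defined in the OCSF schema.
SEVERITY_LABELS = {
    0: "Unknown",
    1: "Informational",
    2: "Low",
    3: "Medium",
    4: "High",
    5: "Critical",
    6: "Fatal",
    99: "Other",
}


def add_severity_sibling(event: dict) -> dict:
    """Return a copy of the event with the severity sibling string added."""
    enriched = dict(event)
    label = SEVERITY_LABELS.get(event.get("severity_id"))
    if label is not None:
        enriched["severity"] = label
    return enriched


enriched = add_severity_sibling({"class_uid": 3002, "severity_id": 1})
```

The sketch leaves the event unchanged when severity_id is missing or not in the table, so no label is guessed.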
Including observables
A security observable is any piece of information from your event data that is especially relevant for detecting and analyzing potential security threats. Observables can be found in a variety of different fields or attributes in your events. For example, IP addresses are observables, and they might be found in attributes such as device, src_endpoint, and dst_endpoint.
When converting data to OCSF format, you can choose to summarize the attributes that contain observables into an array of objects called observables. You can then verify the presence of a specific observable by checking this observables array instead of checking multiple attributes individually.
The following example shows an observables array from an OCSF-formatted event:
{
...
    observables: [
        {
            type_id: 20,
            name: "dst_endpoint"
        },
        {
            type_id: 2,
            name: "dst_endpoint.ip",
            value: "10.160.0.10"
        },
        {
            type_id: 20,
            name: "device"
        },
        {
            type_id: 2,
            name: "device.ip",
            value: "10.160.0.10"
        },
        {
            type_id: 20,
            name: "src_endpoint"
        },
        {
            type_id: 2,
            name: "src_endpoint.ip",
            value: "10.160.39.123"
        }
    ],
... 
}
By default, Edge Processors don’t include the observables array when converting data to OCSF format. However, you can choose to include it by turning on the Include observables option in the ocsf command or the add_observables option in the to_ocsf function. 
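To show why a single observables array is convenient, the following Python sketch checks whether a given IP address appears anywhere in an event by scanning that array, instead of inspecting device, src_endpoint, and dst_endpoint one by one. The has_observable function is a hypothetical helper. The sketch assumes that the event has been decoded into a dictionary and that type_id 2 marks IP address entries, as in the example array above.

```python
IP_ADDRESS_TYPE_ID = 2  # type_id of the IP entries in the example array


def has_observable(event: dict, type_id: int, value: str) -> bool:
    """Return True if the observables array contains the given value."""
    return any(
        item.get("type_id") == type_id and item.get("value") == value
        for item in event.get("observables", [])
    )


event = {
    "observables": [
        {"type_id": 20, "name": "dst_endpoint"},
        {"type_id": 2, "name": "dst_endpoint.ip", "value": "10.160.0.10"},
        {"type_id": 20, "name": "src_endpoint"},
        {"type_id": 2, "name": "src_endpoint.ip", "value": "10.160.39.123"},
    ]
}
found = has_observable(event, IP_ADDRESS_TYPE_ID, "10.160.39.123")
```

If an event was converted without the observables array, the helper returns False, so a caller that needs certainty would still have to check the individual attributes.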
See also
- “Understanding the Open Security Schema Framework” on GitHub: https://github.com/ocsf/ocsf-docs/blob/main/overview/understanding-ocsf.md 
- The OCSF schema browser: https://schema.ocsf.io