Convert data in a specified event field to OCSF format

Use the to_ocsf SPL2 function in an Ingest Processor pipeline to convert data from a raw text format to the Open Cybersecurity Schema Framework (OCSF) format.

If the data that you want to convert to OCSF format is not stored in an event field named _raw, or if you want to convert the data from the _raw field but store the results in a different field, then use the to_ocsf SPL2 evaluation function in your pipeline.
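
For instance, the following eval command is a minimal sketch that reads the data to be converted from a hypothetical field named message and stores the result in a hypothetical field named ocsf_data:
| eval ocsf_data = to_ocsf(message, "cisco:asa")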

Unlike the ocsf command, the to_ocsf function cannot modify other fields in your data, such as sourcetype. To make sure that your data is associated with a source type that is supported by the OCSF-CIM Add-on for Splunk and Splunk Enterprise Security, you need to use an eval command to prefix the sourcetype values with ocsf:.

  1. On the Pipelines page, select New pipeline. Follow the on-screen instructions to define a partition, optionally enter sample data, and select a data destination.
    After you complete the on-screen instructions, the pipeline editor displays the SPL2 statement for your pipeline.
  2. Import the to_ocsf function so that it is available for use in the pipeline. In the SPL2 editor, on a new line, enter the following import statement:
    import to_ocsf from /splunk.ingest.commands
  3. Use the to_ocsf function in an eval command to convert the incoming data into OCSF format. In the SPL2 editor, in an appropriate location between the from and into commands in the $pipeline statement, enter the following eval command:
    | eval <destination_field> = to_ocsf(<source_field>, <source_type>, <include_raw>, <add_enum_siblings>, <add_observables>)
    Replace the placeholders in this command as follows:
    <destination_field>
    The name of the event field where you want to store the converted data. If you want to overwrite the original data with the converted data, then specify the same field name as the <source_field> placeholder.

    <source_field>
    The name of the event field containing the data that you want to convert.

    <source_type>
    The source type that you want the data to be parsed as during conversion. You can replace this placeholder with any of the following:
    • A source type name enclosed in double quotation marks ( " ).
    • The name of an event field that contains the exact source type.
    • An SPL2 expression that resolves to the source type.

    <include_raw>
    If you want the converted data to include an attribute named raw_data that contains a copy of the original data, then replace the placeholder with true. Otherwise, omit this placeholder or replace it with false.

    <add_enum_siblings>
    If you want the converted data to include descriptive labels for ID values, then replace the placeholder with true. Otherwise, omit this placeholder or replace it with false. For more information about this configuration option, see Including sibling strings for enum attributes.

    <add_observables>
    If you want the converted data to include an observables array that summarizes the attributes that contain security observables, then replace the placeholder with true. Otherwise, omit this placeholder or replace it with false. For more information about this configuration option, see Including observables.

    For example:
    | eval ocsf_formatted_data = to_ocsf(log_messages, "cisco:asa")
    This eval command does the following:
    • Parses the data from a field named log_messages as data that is associated with the cisco:asa source type.

    • Converts the data to OCSF format.

    • Stores the results in a field named ocsf_formatted_data.

    Because the three optional arguments are omitted, the converted data does not include the raw_data attribute, descriptive labels for ID values, or the observables array.
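
    As another sketch, this eval command passes an SPL2 expression as the <source_type> argument and sets <include_raw> to true. The vendor and product field names are hypothetical:
    | eval ocsf_formatted_data = to_ocsf(_raw, lower(vendor) + ":" + product, true)
    Because the remaining optional arguments are omitted, the converted data does not include descriptive labels for ID values or the observables array.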

  4. To make sure that your data is supported by the OCSF-CIM Add-on for Splunk and Splunk Enterprise Security, prefix the source type of the data with ocsf: and then store the updated value in the sourcetype field.
    For example, the following eval command sets the sourcetype field to ocsf:cisco:asa:
    | eval sourcetype = "ocsf:cisco:asa"
    As another example, the following eval command adds ocsf: to the existing values in the sourcetype field:
    | eval sourcetype = "ocsf:" + sourcetype
  5. (Optional) To filter your data for failed OCSF conversions and send those results to a different destination than the successfully converted data, do the following:
    1. Select the plus icon (+) in the Actions section of the pipeline builder, and then select Route a subset of data.
    2. Configure the options in the Route data dialog box as follows:
      Field: Set the drop-down list to Expression, and then enter json_extract(_raw, "class_uid").

      Action: Include

      Operator: = equals

      Value: 6008

      Match case: Leave this option unchanged. It is not used when matching numbers.

    3. Select Apply.
      The pipeline editor updates the import statement to include the route command, and adds a route command to your pipeline.
    4. In the Actions section of the pipeline builder, select Send data to $destination2. Select the destination that you want to send the failed OCSF conversions to, and then select Apply.

    For information about how the Ingest Processor handles failed OCSF conversions, see Fallback behavior for failed conversions.
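
    For reference, the generated route command resembles the following sketch. The exact statement that the pipeline builder produces can differ:
    | route json_extract(_raw, "class_uid") == 6008, [ | into $destination2 ]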

  6. Save your pipeline, and then apply it to your Ingest Processor as needed. For more information, see Apply a pipeline.
You now have a pipeline that converts the data in a specified event field from a raw text format into the OCSF format.

Example: Use the to_ocsf function to convert data

Assume that the pipeline receives the following event:
log_messages_2:
<166>Oct 06 2021 12:56:34 10.160.0.10 : %ASA-6-611101: User authentication succeeded: IP address: 10.160.39.123, Uname: admin

sourcetype:
cisco:asa

You can process that event using the following pipeline:
import to_ocsf from /splunk.ingest.commands

$pipeline = | from $source 
| eval ocsf_formatted_data = to_ocsf(log_messages_2, sourcetype, true, true, true)
| eval sourcetype = "ocsf:" + sourcetype
| into $destination;
This pipeline does the following:
  • Parses the data from the log_messages_2 field based on the source type indicated in the sourcetype field.

  • Converts the data into OCSF format.

  • Includes the following additional information in the converted data:
    • A raw_data attribute containing a copy of the original data.

    • The severity, activity_name, type_name, category_name, class_name, and status attributes. These attributes provide descriptive labels for the severity_id, activity_id, type_id, category_uid, class_uid, and status_id enum attributes.

    • The observables array, which summarizes the attributes that contain security observables.

  • Stores the converted data in a field named ocsf_formatted_data.

  • Prefixes the values in the sourcetype field with ocsf:.

The resulting event looks like this:
ocsf_formatted_data:
{
    severity: "Informational",
    activity_name: "Logon",
    category_uid: 3,
    metadata: { 
        uid: "fec88ee6aa54566fb109cfa091dbbe63",
        product: {
            name: "ASA",
            vendor_name: "Cisco"
        },
        log_name: "Syslog",
        event_code: "611101",
        profiles: [
            "host"
        ],
        original_time: "Oct 06 2021 12:56:34",
        version: "1.5.0"
    },
    type_name: "Authentication: Logon",
    category_name: "Identity & Access Management",
    session: {
        is_vpn: true
    },
    src_endpoint: {
        ip: "10.160.39.123"
    },
    message: "User authentication succeeded: IP address: 10.160.39.123, Uname: admin",
    unmapped: {
        level: "6",
        facility: 20
    },
    observables: [
        {
            type_id: 20,
            name: "dst_endpoint",
            type: "Endpoint"
        },
        {
            type_id: 2,
            name: "dst_endpoint.ip",
            type: "IP Address",
            value: "10.160.0.10"
        },
        {
            type_id: 20,
            name: "device",
            type: "Endpoint"
        },
        {
            type_id: 2,
            name: "device.ip",
            type: "IP Address",
            value: "10.160.0.10"
        },
        {
            type_id: 20,
            name: "src_endpoint",
            type: "Endpoint"
        },
        {
            type_id: 2,
            name: "src_endpoint.ip",
            type: "IP Address",
            value: "10.160.39.123"
        },
        {
            type_id: 21,
            name: "user",
            type: "User"
        },
        {
            type_id: 4,
            name: "user.name",
            type: "User Name",
            value: "admin"
        }
    ],
    status_id: 1,
    service: { 
        name: "ASA"
    },
    activity_id: 1,
    class_uid: 3002,
    dst_endpoint: {
        ip: "10.160.0.10"
    },
    severity_id: 1,
    time: 1633524994000,
    class_name: "Authentication",
    device: {
        type_id: 9,
        ip: "10.160.0.10",
        type: "Firewall"
    },
    raw_data: "<166>Oct 06 2021 12:56:34 10.160.0.10 : %ASA-6-611101: User authentication succeeded: IP address: 10.160.39.123, Uname: admin",
    user: {
        name: "admin"
    },
    type_uid: 300201,
    status: "Success"
}

log_messages_2:
<166>Oct 06 2021 12:56:34 10.160.0.10 : %ASA-6-611101: User authentication succeeded: IP address: 10.160.39.123, Uname: admin

sourcetype:
ocsf:cisco:asa
