Machine Learning Toolkit Macros in Splunk Enterprise Security

Machine Learning Toolkit macros act as shortcuts and wrappers around longer searches. To find the macros in Splunk Enterprise, select Settings > Advanced Search > Search macros.

The following example uses a macro to apply data to the model app:failures_by_src_count_1d, with a qualitative_id of medium and a field of failure:

... | `mltk_apply_upper("app:failures_by_src_count_1d", "medium", "failure")`

The equivalent search without the macro:

... | apply app:failures_by_src_count_1d [| inputlookup append=T qualitative_thresholds_lookup where qualitative_id="medium" | rename threshold as upper_threshold | return upper_threshold | eval search=replace(search,"\"","")] | search "IsOutlier(failure)"=1

Macros used in SPL

You might use the following macros to apply data to your models.

[mltk_apply]

This is approximately equivalent to the xsWhere command, for applying to either upper or lower bounds.

[mltk_apply(3)]
args       = model,qualitative_id,field
definition = apply $model$ [| `get_qualitative_threshold($qualitative_id$)`] | search "IsOutlier($field$)"=1

The macro takes the following arguments:

model

The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.

qualitative_id

The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.

field

The name of the field that you're searching or counting to find outliers, such as failure.
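
As an illustration, the following sketch shows a call to mltk_apply and the search it should expand to, based on the definitions of mltk_apply and get_qualitative_threshold in this topic. The model name, qualitative ID, and field are the example values used earlier, and the leading ... stands for whatever search produces the failure field.

```spl
... | `mltk_apply("app:failures_by_src_count_1d", "medium", "failure")`
```

Expanded by hand, this should be roughly equivalent to:

```spl
... | apply app:failures_by_src_count_1d [| inputlookup append=T qualitative_thresholds_lookup where qualitative_id="medium" | return threshold | eval search=replace(search,"\"","")] | search "IsOutlier(failure)"=1
```

Unlike the upper and lower variants, the subsearch returns the threshold field without renaming it, so the value is passed to apply as threshold rather than as upper_threshold or lower_threshold.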

[mltk_apply_lower]

This is approximately equivalent to the xsWhere command, for applying to lower bounds.

[mltk_apply_lower(3)]
args       = model,qualitative_id,field
definition = apply $model$ [| `get_qualitative_lower_threshold($qualitative_id$)`] | search "IsOutlier($field$)"=1

The macro takes the following arguments:

model

The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.

qualitative_id

The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.

field

The name of the field that you're searching or counting to find outliers, such as failure.

[mltk_apply_upper]

This is approximately equivalent to the xsWhere command, for applying to upper bounds.

[mltk_apply_upper(3)]
args       = model,qualitative_id,field
definition = apply $model$ [| `get_qualitative_upper_threshold($qualitative_id$)`] | search "IsOutlier($field$)"=1

The macro takes the following arguments:

model

The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.

qualitative_id

The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.

field

The name of the field that you're searching or counting to find outliers, such as failure.
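
The following sketch places mltk_apply_upper at the end of a complete search. The upstream portion, which counts authentication failures per source from the Authentication data model, is an assumption made for illustration and is not defined in this topic. Substitute whatever search produces the field that your model was fit on.

```spl
| from datamodel:"Authentication"."Authentication"
| stats count(eval('action'=="failure")) as failure by src
| `mltk_apply_upper("app:failures_by_src_count_1d", "medium", "failure")`
```

Only the results that the model flags as upper-bound outliers for the failure field at the medium threshold should remain after the macro runs.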

[mltk_findbest]

This is approximately equivalent to the xsFindBestConcept command. For each value, this macro tells you in which threshold range the value falls on the distribution curve.

[mltk_findbest(1)]
args       = model
definition = apply $model$ as findbest [| `get_findbest_thresholds`] | eval [| `get_findbest_qualitative`] | fields - BoundaryRanges,findbest*

The macro takes the following argument:

model

The name of the model for applying data and comparing against standards to find outliers, such as app:failures_by_src_count_1d.

Note that this macro doesn't take a field parameter like the other macros. It performs the findbest operation on the exact field that the Model Gen fit command was performed on. For example:

  • If the Model Gen search performed ... | fit DensityFunction current_count dist=norm into app:total_risk_1d, the mltk_findbest macro matches only on the current_count field.
  • This means that the portion of the search that comes before the mltk_findbest macro must contain the current_count field.
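
To illustrate the requirement above, here is a minimal sketch. It assumes that the app:total_risk_1d model exists and was fit on current_count. The makeresults row and the value 250 are placeholders standing in for a real search.

```spl
| makeresults
| eval current_count=250
| `mltk_findbest("app:total_risk_1d")`
```

Based on the macro definition, the result should gain a qualitative field that holds the label of the matching threshold range, if the value falls in one, with the intermediate BoundaryRanges and findbest* fields removed.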

Macros used by other macros

The macros used in SPL are built from the following macros.

[get_qualitative_threshold]

This is a building block for [mltk_apply]. You might not use this one by itself.

[get_qualitative_threshold(1)]
args       = qualitative_id
definition = inputlookup append=T qualitative_thresholds_lookup where qualitative_id="$qualitative_id$" | return threshold | eval search=replace(search,"\"","")

The macro takes the following argument:

qualitative_id

The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.
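
The last two commands in the definition turn a lookup row into an argument for the apply command. The following sketch emulates that step without the lookup. The threshold value 0.05 is a hypothetical placeholder, not a documented default.

```spl
| makeresults
| eval threshold="0.05"
| return threshold
| eval search=replace(search,"\"","")
```

The return command emits a single search field containing a quoted key-value pair, and the eval strips the double quotes, so the result should read threshold=0.05. Inside the subsearch in mltk_apply, that string becomes the threshold argument to apply. To inspect the real value, you can also run | `get_qualitative_threshold("medium")` as a search on its own.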

[get_qualitative_lower_threshold]

This is a building block for [mltk_apply_lower]. You might not use this one by itself.

[get_qualitative_lower_threshold(1)]
args       = qualitative_id
definition = inputlookup append=T qualitative_thresholds_lookup where qualitative_id="$qualitative_id$" | rename threshold as lower_threshold | return lower_threshold | eval search=replace(search,"\"","")

The macro takes the following argument:

qualitative_id

The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.

[get_qualitative_upper_threshold]

This is a building block for [mltk_apply_upper]. You might not use this one by itself.

[get_qualitative_upper_threshold(1)]
args       = qualitative_id
definition = inputlookup append=T qualitative_thresholds_lookup where qualitative_id="$qualitative_id$" | rename threshold as upper_threshold | return upper_threshold | eval search=replace(search,"\"","")

The macro takes the following argument:

qualitative_id

The default IDs that correspond to percentages of deviation, representing where on the distribution curve to start looking for the outliers, such as medium.

[get_findbest_thresholds]

This is a building block for [mltk_findbest]. You might not use this one by itself.

[get_findbest_thresholds]
definition = inputlookup append=T qualitative_thresholds_lookup | stats values(threshold) as search | eval search="threshold=\"".mvjoin(mvsort(search), ",")."\""
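
To show the string that this definition builds, the following sketch replaces the lookup with three generated rows. The threshold values 0.1, 0.01, and 0.05 are hypothetical and chosen only for illustration.

```spl
| makeresults count=3
| streamstats count as n
| eval threshold=case(n=1, "0.1", n=2, "0.01", n=3, "0.05")
| stats values(threshold) as search
| eval search="threshold=\"".mvjoin(mvsort(search), ",")."\""
```

With these rows, the search field should contain threshold="0.01,0.05,0.1", a single comma-separated list that mltk_findbest passes to apply so that every threshold is evaluated in one pass.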

[get_findbest_qualitative]

This is a building block for [mltk_findbest]. You might not use this one by itself.

[get_findbest_qualitative]
definition = inputlookup append=T qualitative_thresholds_lookup | eval threshold_id="findbest_th=".threshold | sort threshold | eval subcase="'".threshold_id."'=\"1.0\",\"".qualitative_label."\"" | stats values(subcase) as search | eval search="qualitative=case(".mvjoin(search, ",").")"
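
The following sketch emulates this definition with three generated rows in place of the lookup, to show the eval expression it assembles. The threshold values and the labels medium, extreme, and high are hypothetical.

```spl
| makeresults count=3
| streamstats count as n
| eval threshold=case(n=1, "0.1", n=2, "0.01", n=3, "0.05")
| eval qualitative_label=case(n=1, "medium", n=2, "extreme", n=3, "high")
| eval threshold_id="findbest_th=".threshold
| sort threshold
| eval subcase="'".threshold_id."'=\"1.0\",\"".qualitative_label."\""
| stats values(subcase) as search
| eval search="qualitative=case(".mvjoin(search, ",").")"
```

With these rows, the search field should contain qualitative=case('findbest_th=0.01'="1.0","extreme",'findbest_th=0.05'="1.0","high",'findbest_th=0.1'="1.0","medium"). Inside mltk_findbest, this string is the argument to eval, so each result is labeled with the first threshold whose findbest_th field equals 1.0.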