Create KPI base searches in ITSI

KPI base searches let you share a search definition across multiple KPIs in IT Service Intelligence (ITSI). Create base searches to consolidate multiple similar KPIs, reduce search load, and improve search performance. For example, if you have similar ad hoc searches whose only difference is an entity or threshold field, you can consolidate these searches into a single base search definition and achieve better search performance.

ITSI module base searches

ITSI includes several pre-configured KPI base searches based on ITSI modules that you can use with your services. The titles of these base searches begin with "DA-ITSI". KPI base searches that come with ITSI modules are read-only and cannot be modified or deleted. To customize a base search that comes from an ITSI module, clone the base search, then perform your edits on the clone.

Service templates and base searches

Service templates use base searches for their KPIs. When a service template is created from a service, all of the KPIs in the service are imported into the template. Any service KPIs that use ad hoc searches, data model searches, or metrics searches are converted into base searches. These base searches are listed on the KPI Base Searches lister page and are available to use for KPIs in any service, just like any other base search. Base searches that are created for service template KPIs use the following naming standard: <service name>:<KPI name>_<last 8 digits of KPI ID>. For more information about service templates, see Overview of service templates in ITSI.
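As a rough illustration of the naming standard above, the following Python sketch assembles a title from its three parts. The function name and the sample values are hypothetical, and the sketch assumes that "last 8 digits" means the final 8 characters of the KPI ID string; ITSI generates these names itself, so this is only a way to recognize or predict them.

```python
def template_base_search_title(service_name: str, kpi_name: str, kpi_id: str) -> str:
    """Build a title of the form <service name>:<KPI name>_<last 8 digits of KPI ID>."""
    # Assumption: "last 8 digits" is read as the final 8 characters of the ID string.
    return f"{service_name}:{kpi_name}_{kpi_id[-8:]}"


# Hypothetical service name, KPI name, and KPI ID, for illustration only.
print(template_base_search_title("Web Store", "CPU Utilization", "a1b2c3d4e5f60718293a4b5c"))
# prints: Web Store:CPU Utilization_293a4b5c
```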

Create a new base search

Build new KPIs from a base search

Delete a KPI base search

Wildcards in KPI base searches

As a best practice, don't use wildcards for entities in a KPI base search. When you use wildcards in an entity alias, the entity filter rule returns multiple entities. As a result, those entities act like pseudo entities with N/A entity_keys. This might also lead to an entity broadcasting situation where pseudo entities contribute to the KPI calculations even though that KPI has a strict entity filter. For more information about pseudo entities and entity broadcast, see Manually create pseudo entities in ITSI in the Entity Integrations manual.

As an alternative to using wildcards, consider using entity pivot to associate pseudo entities to a particular service, even through a shared base search without entity broadcast. For examples, see Entity pivot.

KPI base search performance considerations

The performance of KPI base searches is dependent on the following factors:

  • The number of KPIs that use the base search.
  • The number of services that contain KPIs that use the base search.
  • The number of entities matching service entity rules.

Most of the KPI base searches delivered with ITSI are configured to run every minute. Based on testing on a system with 32 cores and 16 GB of memory, a single KPI base search can support up to 5,000 KPIs with 15 entities matched by service entity rules reasonably well.

In general, a KPI base search can support fewer KPIs with many entities or many KPIs with fewer entities. It's not advised to use a single KPI base search for both a high number of KPIs and a high number of entities. As the number of services or matching entities increases, the search runtime also increases.

You can check the runtime for your KPI base searches on the Activity > Jobs page. The runtime is the actual time it takes to run the search. Compare the runtime against the search schedule, or frequency, of the KPI base search. If a search is scheduled to run every minute and its runtime is longer than 1 minute, the search is taking too long to run.

To reduce a search's runtime, reduce the number of KPIs that use the base search, the number of services that contain those KPIs, or the number of entities matched for each service. The easiest solution is to clone the KPI base search and use the cloned base search for some of the KPIs.

Fix truncated or incorrect KPI values

Search results are processed, created, and written to the itsi_summary index via an alert action. The default limit on the number of rows that can be written is 50,000 as specified in the $SPLUNK_HOME/etc/system/default/limits.conf file. You can increase this limit if necessary. Calculate the number of the result rows generated by a shared base search using the following formula:

(<number of services> x <number of KPIs in each service> x <number of entities per service entity rule>) + (<number of services> x 2)

The second term adds two rows for each service: one for the service aggregation result and one for the service maximum result.

For example, for 500 services with 10 KPIs in each service and 15 matching entities, the expected number of result rows is 500 x 10 x 15 + 500 x 2 = 76,000 rows.
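The same arithmetic can be sketched in a few lines of Python, which may be convenient when checking several base searches. The function and constant names are illustrative and not part of ITSI; the 50,000 figure is the default row limit mentioned above.

```python
DEFAULT_MAX_ACTION_RESULTS = 50_000  # default row limit described in this section


def expected_result_rows(services: int, kpis_per_service: int, entities_per_service: int) -> int:
    """Estimate the result rows generated by a shared base search."""
    # One row per service, KPI, and matching entity combination.
    per_entity_rows = services * kpis_per_service * entities_per_service
    # Two extra rows per service: the aggregation result and the maximum result.
    service_level_rows = services * 2
    return per_entity_rows + service_level_rows


rows = expected_result_rows(services=500, kpis_per_service=10, entities_per_service=15)
print(rows)                               # prints: 76000
print(rows > DEFAULT_MAX_ACTION_RESULTS)  # prints: True
```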

If the number of result rows expected is more than 50,000, ITSI truncates the results and displays incorrect KPI values. If you think you're running into this limitation, perform the following steps:

Prerequisites

  • Only users with file system access, such as system administrators, can increase write search result limits using a configuration file.
  • Review the steps in How to edit a configuration file in the Admin Manual.

CAUTION: Never change or copy the configuration files in the default directory. The files in the default directory must remain intact and in their original location.

Steps

  1. Open or create a local limits.conf file at $SPLUNK_HOME/etc/apps/SA-ITOA/local.
  2. Add the following stanza:
    [scheduler]
    max_action_results = 1000000
    
  3. Set the value for max_action_results to a number higher than 50,000. In the example above it's set to 1,000,000.

Increase the KV store bulk get limit

The KPI base search tries to get all the relevant services from the KV store internally for thresholding-related operations. When a KPI base search is attached to many services, the bulk get might reach the KV store bulk get size limit. The default limit is 500 MB.

As a guideline, for one service with 20 fully populated KPIs in which all KPIs have custom thresholds with time policies configured, as well as cohesive anomaly detection configured, the size is roughly 0.8 MB in the KV store.
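To get a rough sense of when the default limit might be reached, the guideline figure can be scaled up, as in the Python sketch below. It assumes that size grows linearly with the number of services and KPIs, which the guideline does not guarantee, and the names are illustrative. Treat the result as an order-of-magnitude estimate only.

```python
MB_PER_SERVICE_AT_20_KPIS = 0.8   # guideline figure for one fully populated service
DEFAULT_BULK_GET_LIMIT_MB = 500   # default KV store bulk get limit


def estimated_bulk_get_mb(services: int, kpis_per_service: int = 20) -> float:
    """Estimate the bulk get size in MB for services attached to one base search."""
    # Assumption: size scales linearly with both service count and KPI count.
    return services * (kpis_per_service / 20) * MB_PER_SERVICE_AT_20_KPIS


estimate = estimated_bulk_get_mb(services=700)
print(f"{estimate:.1f} MB")                  # prints: 560.0 MB
print(estimate > DEFAULT_BULK_GET_LIMIT_MB)  # prints: True
```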

If you have a large number of services that contain many KPIs and a lot of metadata, increase the KV store bulk get limit in $SPLUNK_HOME/etc/apps/SA-ITOA/local/limits.conf. Raise the max_size_per_result_mb value as necessary.

[kvstore]
# The maximum size, in megabytes (MB), of the result that will be returned for a single query to a collection.
# ITSI requires approximately 50MB per 1,000 KPIs. Override this value if necessary.
# Default: 500 MB
max_size_per_result_mb = 500