Guidelines for using entity discovery sources
Using entity discovery sources for enrichment
Entity discovery in Exposure Analytics primarily relies on active discovery sources. These sources return records that include a time field indicating when the entity was last seen on the network. While active sources are valuable for real-time discovery, they often provide only limited context about the entity. For example, a Windows security event may include an IP address, user_id, and hostname, but not richer asset details such as asset type, operating system, or serial number. Similarly, many sources may identify a user by user_id without including additional attributes such as name, title, or location.
Although Exposure Analytics supports enrichment through rules and other mechanisms, it is often helpful to include entity discovery sources that are intended primarily to add context to discovered assets and users.
For user entity enrichment, LDAP or HR data is a common choice, because these sources can provide attributes such as title, full name, location, manager, and email address. The predefined LDAP - General LDAP user search entity discovery source can be used as a template for importing LDAP user information, either from an existing lookup or from indexed data. These sources can also be used to link related user accounts. See Linking related user accounts below.
For asset entity enrichment, asset management platforms such as ServiceNow, or directory services such as Active Directory, are strong sources of additional context. The Splunk Add-on for Exposure Analytics also provides a useful active method for collecting asset enrichment data directly from endpoints. Predefined entity discovery sources exist for each of these.
When importing user or asset lists for enrichment, exclude disabled or decommissioned entities so that they are not added to the inventories unnecessarily. Because this type of data typically changes infrequently, these sources usually need to be pulled no more than once per day. After the initial import, the source can be updated to pull only new or modified records, which supports incremental updates.
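The filtering and incremental-update guidance above can be sketched in Python as follows. This is an illustration only, not how Exposure Analytics implements it: the record layout, the status and modified field names, and the select_for_import helper are all hypothetical, and a real entity discovery source would express the same logic in its own search.

```python
from datetime import datetime, timezone

# Assumption: each enrichment record carries a "status" and a "modified"
# timestamp. These field names are hypothetical, chosen for illustration.
EXCLUDED_STATUSES = {"disabled", "decommissioned"}

def select_for_import(records, last_import=None):
    """Return the records worth importing.

    Disabled or decommissioned entities are always skipped. When
    last_import is given (any run after the initial import), only
    records modified after that time are kept.
    """
    selected = []
    for rec in records:
        if rec.get("status") in EXCLUDED_STATUSES:
            continue
        if last_import is not None and rec["modified"] <= last_import:
            continue
        selected.append(rec)
    return selected

if __name__ == "__main__":
    records = [
        {"user_id": "user123", "status": "active",
         "modified": datetime(2024, 1, 2, tzinfo=timezone.utc)},
        {"user_id": "user456", "status": "disabled",
         "modified": datetime(2024, 1, 3, tzinfo=timezone.utc)},
    ]
    # Initial import: everything except disabled or decommissioned entities.
    print(select_for_import(records))
```

Passing the time of the previous run as last_import on later runs limits the pull to new or modified records.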
Entity discovery sources that are used solely for enrichment, and that do not include a last seen timestamp indicating actual network activity, should be marked as Passive. This ensures the source contributes enrichment data without indicating that the asset or user was actively discovered on the network.
Linking related user accounts
In some cases, multiple user accounts may belong to the same individual. For example, a person might have a standard network account (user123), an administrative account (user123_adm), and a cloud account (user123@splunk.com). Although these accounts may appear as separate users, they often share the same contextual attributes, such as first name, last name, title, and location, and may need to be recognized as related identities.
To support this, an entity discovery source can define a single user entry where the user_id field contains multiple linked user account IDs separated by pipes (|). In the example above, the user_id value would be user123|user123_adm|user123@splunk.com. During entity discovery processing, these linked IDs are split into separate user records so each account can be individually tracked, while the full combined value is stored in the user_alternate field for each record. When analyzing any of these users, the related linked accounts are displayed and can be explored for additional context. When populating the Asset and Identity (A&I) identity lookup, the user_alternate field is used as the identity field value in the lookup if it is populated; otherwise, the user_id field is used.
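The splitting behavior described above can be illustrated with a short Python sketch. It mirrors the documented outcome (one record per linked ID, with the full combined value in user_alternate) but is not the product's implementation. The function names are hypothetical, and leaving user_alternate unset for an entry with a single ID is an assumption.

```python
def expand_linked_users(entry):
    """Split a pipe-delimited user_id into one record per linked account."""
    combined = entry["user_id"]
    ids = [part.strip() for part in combined.split("|") if part.strip()]
    if len(ids) <= 1:
        # Assumption: a single ID has no linked accounts, so
        # user_alternate is left unpopulated.
        return [dict(entry)]
    # Each record keeps the shared attributes, gets its own user_id,
    # and stores the full combined value in user_alternate.
    return [{**entry, "user_id": uid, "user_alternate": combined} for uid in ids]

def identity_value(record):
    """Pick the identity lookup value: user_alternate if populated, else user_id."""
    return record.get("user_alternate") or record["user_id"]

if __name__ == "__main__":
    entry = {
        "user_id": "user123|user123_adm|user123@splunk.com",
        "title": "Engineer",
    }
    for rec in expand_linked_users(entry):
        print(rec["user_id"], "->", identity_value(rec))
```

Each of the three resulting records shares the title attribute and resolves to the same combined identity value.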
Primary user and host
Entity discovery of users supports an optional primary_user field that preserves user attribution when activity is performed through a delegated or non-human identity. In these cases, user_id may represent the active token, service principal, or agentic identity, while primary_user identifies the associated human user who originally received or initiated that access. When this information is present in an entity discovery source, the corresponding source fields can be mapped to user_id and primary_user so that both are recorded in the discovered user inventory.
The asset inventory includes an optional primary_host field to preserve host attribution when a discovered asset is associated with a broader parent system. This is particularly useful when network or endpoint telemetry identifies an asset by an observed hostname in nt_host, while primary_host identifies the underlying physical host, hypervisor, or other parent system. When available in an entity discovery source, these values can be mapped to the corresponding inventory fields to improve asset context and support more accurate analysis.
Both primary_user and primary_host are surfaced in Entity Analysis to provide additional context during investigations. Drilling into either field pivots directly to analysis of the associated primary user or host.
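As a rough illustration of the field mapping described in this section, the following Python sketch renames source fields to the inventory field names user_id, primary_user, nt_host, and primary_host. The source field names (token_id, delegating_user, vm_name, hypervisor) and the map_fields helper are hypothetical. In practice this mapping is configured in the entity discovery source itself.

```python
# Assumption: the source field names on the left are invented for
# illustration. Only the target names on the right come from the
# inventory fields described in this section.
USER_FIELD_MAP = {
    "token_id": "user_id",              # delegated or non-human identity
    "delegating_user": "primary_user",  # associated human user
}
ASSET_FIELD_MAP = {
    "vm_name": "nt_host",               # observed hostname
    "hypervisor": "primary_host",       # parent system
}

def map_fields(event, field_map):
    """Rename source fields to inventory fields, skipping absent or empty values."""
    return {
        target: event[source]
        for source, target in field_map.items()
        if event.get(source)
    }

if __name__ == "__main__":
    user_event = {"token_id": "svc-principal-01", "delegating_user": "user123"}
    asset_event = {"vm_name": "web-vm-07", "hypervisor": "esx-host-02"}
    print(map_fields(user_event, USER_FIELD_MAP))
    print(map_fields(asset_event, ASSET_FIELD_MAP))
```

Because both primary fields are optional, the sketch omits them when the source event does not supply a value.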