Configure the Prometheus receiver to collect Milvus metrics

Collect Milvus metrics with the Splunk Distribution of the OpenTelemetry Collector.

You can monitor the performance of Milvus vector databases by configuring the Splunk Distribution of the OpenTelemetry Collector to send Milvus metrics to Splunk Observability Cloud.

This solution uses the Prometheus receiver to collect metrics from Milvus and its subcomponents. Each Milvus component exposes an endpoint at http://<component-host>:9091/metrics that publishes Prometheus-compatible metrics.

To configure the Prometheus receiver to collect metrics from Milvus vector databases, you must first deploy Milvus locally or on a cloud server in either standalone or distributed mode. For instructions, see Overview of Milvus Deployment Options in the Milvus documentation.

  1. Deploy the Splunk Distribution of the OpenTelemetry Collector to your host or container platform.
  2. To activate the Prometheus receiver for Milvus manually in the Collector configuration, make the following changes to your values.yaml configuration file:
    1. Add prometheus/milvus to the receivers section. For example, with Milvus deployed in distributed mode:
      prometheus/milvus:
        config:
          global:
            scrape_interval: 10s
          scrape_configs:
            - job_name: 'milvus-scraper'
              metrics_path: /metrics
              static_configs:
                # One target per Milvus component; each serves metrics on port 9091
                - targets:
                    - 'milvus-proxy:9091'
                    - 'milvus-querynode:9091'
                    - 'milvus-datanode:9091'
                    - 'milvus-indexnode:9091'
                    - 'milvus-rootcoord:9091'
    2. By default, Milvus exposes a large number of metrics. Add the following configuration under the processors section to keep only the metrics that the built-in dashboard requires and drop all others:
      processors:  
        filter/milvus_metrics:  
          metrics:  
            include:  
              match_type: strict  
              metric_names:  
                - "milvus_datacoord_compaction_task_num"  
                - "milvus_datacoord_datanode_num"  
                - "milvus_datacoord_segment_num"  
                - "milvus_datacoord_stored_binlog_size"  
                - "milvus_datacoord_stored_index_files_size"  
                - "milvus_datanode_compaction_delete_count"  
                - "milvus_datanode_compaction_latency"  
                - "milvus_datanode_compaction_missing_delete_count"  
                - "milvus_datanode_flush_buffer_op_count"  
                - "milvus_datanode_flushed_data_rows"  
                - "milvus_datanode_msg_rows_count"  
                - "milvus_num_node"  
                - "milvus_proxy_cache_hit_count"  
                - "milvus_proxy_delete_vectors_count"  
                - "milvus_proxy_insert_vectors_count"  
                - "milvus_proxy_mutation_latency"  
                - "milvus_proxy_req_count"  
                - "milvus_proxy_req_latency"  
                - "milvus_proxy_search_vectors_count"  
                - "milvus_proxy_sq_latency"  
                - "milvus_querycoord_querynode_num"  
                - "milvus_querycoord_task_num"  
                - "milvus_querynode_entity_num"  
                - "milvus_querynode_read_task_concurrency"  
                - "milvus_querynode_segment_num"  
                - "milvus_querynode_sq_queue_latency"  
                - "milvus_querynode_wait_processing_msg_count"  
                - "milvus_rootcoord_ddl_req_count"  
                - "milvus_rootcoord_ddl_req_latency"  
                - "milvus_rootcoord_force_deny_writing_counter"  
                - "milvus_rootcoord_proxy_num"  
                - "milvus_storage_kv_size"  
                - "milvus_storage_op_count" 
    3. Add a new pipeline under the service section for Milvus and export the metrics to the target endpoints. For example:
      service: 
        pipelines: 
          metrics/milvus: 
            receivers:  
              - prometheus/milvus 
            processors:  
              - filter/milvus_metrics  
              - memory_limiter 
              - batch 
              - resourcedetection 
              - resource 
            exporters: 
              - signalfx/histograms 
  3. Restart the Splunk Distribution of the OpenTelemetry Collector.
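
The receiver example in step 2 targets a distributed deployment, with one scrape target per component. For a standalone deployment, a minimal sketch of the same receiver might list a single target. The hostname milvus-standalone is a placeholder assumption, not a value from this procedure. The port and metrics path follow the endpoint described earlier in this topic.

```yaml
receivers:
  prometheus/milvus:
    config:
      global:
        scrape_interval: 10s
      scrape_configs:
        - job_name: 'milvus-scraper'
          metrics_path: /metrics
          static_configs:
            - targets:
                # Assumption: hypothetical hostname; replace it with the
                # host or service name of your standalone Milvus instance.
                - 'milvus-standalone:9091'
```

The processor and pipeline settings from steps 2.2 and 2.3 would stay the same, because they refer to the receiver by name rather than by target.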

Configuration settings

Learn about the configuration settings for the Prometheus receiver.

To view the configuration options for the Prometheus receiver, see Settings.

Metrics

The following metrics are available for Milvus databases. These metrics fall under the default metric category. For more information on these metrics, see Milvus Metrics Dashboard in the Milvus documentation.

Milvus metrics

| Metric name | Metric type | Description |
| --- | --- | --- |
| milvus_datacoord_compaction_task_num | gauge | Current number of active compaction tasks in DataCoord. |
| milvus_datacoord_datanode_num | gauge | Current number of active DataNodes managed by DataCoord. |
| milvus_datacoord_segment_num | gauge | Current number of segments managed by DataCoord. |
| milvus_datacoord_stored_binlog_size | gauge | Total binlog size (in bytes) of all healthy segments managed by DataCoord. |
| milvus_datacoord_stored_index_files_size | gauge | Total size (in bytes) of index files for all segments managed by DataCoord. |
| milvus_datanode_compaction_delete_count | counter | Total number of delete entries processed during segment compaction in a DataNode. |
| milvus_datanode_compaction_latency | histogram | Latency (in milliseconds) of segment compaction operations in a DataNode. |
| milvus_datanode_compaction_missing_delete_count | counter | Total number of delete entries that were expected but not applied during segment compaction in a DataNode. |
| milvus_datanode_flush_buffer_op_count | counter | Total number of buffer flush operations performed by the DataNode. |
| milvus_datanode_flushed_data_rows | counter | Total number of data rows successfully flushed from memory to storage by the DataNode. |
| milvus_datanode_msg_rows_count | counter | Total number of data rows consumed from the message stream by the DataNode. |
| milvus_num_node | gauge | Current number of active service nodes and coordinators in the Milvus cluster. |
| milvus_proxy_cache_hit_count | counter | Total number of cache hits recorded by the Milvus Proxy. |
| milvus_proxy_delete_vectors_count | counter | Total number of vectors successfully deleted through the Milvus Proxy. |
| milvus_proxy_insert_vectors_count | counter | Total number of vectors successfully inserted through the Milvus Proxy. |
| milvus_proxy_mutation_latency | histogram | Latency (in milliseconds) of successful insert and delete operations handled by the Milvus Proxy. |
| milvus_proxy_req_count | counter | Total number of client operations executed through the Milvus Proxy. |
| milvus_proxy_req_latency | histogram | Latency (in milliseconds) of client requests processed by the Milvus Proxy. |
| milvus_proxy_search_vectors_count | counter | Total number of vectors successfully searched through the Milvus Proxy. |
| milvus_proxy_sq_latency | histogram | Latency (in milliseconds) of successful search and query operations processed by the Milvus Proxy. |
| milvus_querycoord_querynode_num | gauge | Current number of QueryNodes managed by the QueryCoord component. |
| milvus_querycoord_task_num | gauge | Current number of tasks in the QueryCoord scheduler. |
| milvus_querynode_entity_num | gauge | Number of searchable/queryable entities in the QueryNode, grouped by collection, partition, and state. |
| milvus_querynode_read_task_concurrency | gauge | Number of read tasks currently executing concurrently in the QueryNode. |
| milvus_querynode_segment_num | gauge | Number of segments currently loaded in the QueryNode, grouped by collection, partition, state, and number of indexed fields. |
| milvus_querynode_sq_queue_latency | histogram | Latency (in milliseconds) that search and query requests spend waiting in the QueryNode queue. |
| milvus_querynode_wait_processing_msg_count | gauge | Number of messages currently waiting to be processed in the QueryNode. |
| milvus_rootcoord_ddl_req_count | counter | Total number of DDL operations processed by the RootCoord. |
| milvus_rootcoord_ddl_req_latency | histogram | Latency (in milliseconds) of DDL operations processed by the RootCoord. |
| milvus_rootcoord_force_deny_writing_counter | counter | Total number of times Milvus entered a force-deny-writing state enforced by RootCoord. |
| milvus_rootcoord_proxy_num | gauge | Current number of Proxy nodes managed by the RootCoord. |
| milvus_storage_kv_size | histogram | Size of key-value data stored in Milvus storage (in bytes). |
| milvus_storage_op_count | counter | Total number of persistent data operations performed in Milvus storage. |
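
The filter in the configuration procedure lists every metric name with match_type: strict. If you need only part of this table, the filter processor also accepts regular expressions. The following sketch keeps only the proxy metrics. The processor name filter/milvus_proxy_only is a hypothetical label, and this setup is an illustration rather than part of the built-in dashboard configuration.

```yaml
processors:
  # Assumption: hypothetical processor name; reference it in the
  # metrics/milvus pipeline in place of filter/milvus_metrics.
  filter/milvus_proxy_only:
    metrics:
      include:
        match_type: regexp
        metric_names:
          # Matches every metric whose name starts with milvus_proxy_
          - "milvus_proxy_.*"
```

A pattern this broad also admits any proxy metric that Milvus exposes but that isn't listed in the table, and dashboard charts that rely on the other components' metrics would no longer receive data.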

Attributes

The following resource attributes are available for Milvus databases.

Milvus attributes

| Attribute name | Description | Values |
| --- | --- | --- |
| node_id | The unique identity of a role. | A globally unique ID generated by Milvus. |
| status | The status of a processed operation or request. | abandon, success, fail |
| query_type | The type of read request. | search, query |
| msg_type | The type of message. | insert, delete, search, query |
| segment_state | The status of a segment. | Sealed, Growing, Flushed, Flushing, Dropped, Importing |
| cache_state | The status of a cached object. | hit, miss |
| cache_name | The name of a cached object. This label is used together with the label cache_state. | Examples of possible values: CollectionID, Schema |
| channel_name | Physical topics in message storage (Pulsar or Kafka). | Examples of possible values: by-dev-rootcoord-dml_0, by-dev-rootcoord-dml_255 |
| function_name | The name of a function that handles certain requests. | Examples of possible values: CreateCollection, CreatePartition, CreateIndex |
| user_name | The username used for authentication. | A username of your preference. |
| index_task_status | The status of an index task in meta storage. | unissued, in-progress, failed, finished, recycled |