Prometheus Extension for Machine Agent
Overview
The Prometheus Monitoring Extension for AppDynamics Machine Agent enables you to seamlessly integrate Prometheus metrics into your AppDynamics monitoring ecosystem. This native extension automates the collection, transformation, and forwarding of metrics from any Prometheus exporter directly to your AppDynamics Controller, providing centralized visibility and alerting for your infrastructure.
Key Capabilities
- Collect Metrics: Integrate with any Prometheus exporter, including Node Exporter, cAdvisor, database exporters, or your custom application metrics endpoints.
- Transform Metrics: Apply filters, mappings, aggregations, and computed formulas to your Prometheus metrics for tailored monitoring and reporting.
- Send to AppDynamics Controller: Forward all collected metrics to your AppDynamics Controller, enabling unified monitoring and alerting alongside your existing data sources.
- Seamless Scalability: Scale from a single Prometheus exporter to hundreds across your environment to simplify large-scale metric collection.
Workflow
- Enable the extension in controller-info.xml and start your Prometheus exporter (for example, Node Exporter on the monitored hosts).
- Configure global and exporter-specific settings in the respective YAML files.
- Restart the Machine Agent.
Use the Prometheus Extension
- Update controller-info.xml with <prometheus-enabled>true</prometheus-enabled>.
- Run the required Prometheus exporter, such as Node Exporter or cAdvisor. If the exporter is already running, skip this step.
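Before restarting the Machine Agent, it can help to confirm that the exporter is actually serving metrics. The following is a minimal sketch, not part of the extension: `count_samples` and `check_exporter` are hypothetical helper names, and the URL assumes a Node Exporter listening on its usual port 9100 on the local host.

```python
from urllib.request import urlopen


def count_samples(exposition_text: str) -> int:
    """Count sample lines in Prometheus text exposition output.

    Blank lines and lines starting with '#' (HELP/TYPE comments) are skipped.
    """
    count = 0
    for line in exposition_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


def check_exporter(url: str = "http://localhost:9100/metrics",
                   timeout: float = 5.0) -> int:
    """Fetch the metrics endpoint and return the number of samples it exposes.

    The default URL is an assumption (local Node Exporter); adjust host, port,
    and path to match your exporter.
    """
    with urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return count_samples(body)
```

Calling `check_exporter()` on the monitored host raises an exception if the endpoint is unreachable and otherwise returns the number of samples, which gives a rough idea of how many metrics the extension will see per scrape.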
Configure the YAML files
The extension introduces two new configuration files located under conf/prometheus:
- prometheus-config.yaml: This YAML contains global settings applied to all exporters unless specifically overridden. See prometheus-config-reference.yaml for configuration options and usage.

# ==================================================================================
# AppDynamics Prometheus Configuration - Reference Template
# ==================================================================================
# This is a comprehensive reference file documenting all available configuration
# options for the AppDynamics Prometheus integration.
#
# Use this file as a reference when creating your own prometheus-config.yaml
#
# Documentation: <add link>
# Version: 1.0
# ==================================================================================
prometheus:
  # ================================================================================
  # GLOBAL SETTINGS
  # ================================================================================
  # These settings apply to all exporters unless explicitly overridden in the
  # individual exporter configuration files.
  # ================================================================================
  global:
    # --------------------------------------------------------------------------
    # scrapeInterval (integer, seconds)
    # --------------------------------------------------------------------------
    # How often to scrape metrics from all exporters
    #
    # Default: 60 seconds
    # Range: 10-3600 seconds
    # Recommendation:
    #   - 30-60s for most use cases
    #   - 10-30s for high-frequency monitoring
    #   - 120-300s for low-frequency infrastructure metrics
    #
    # Note: Can be overridden per exporter
    # --------------------------------------------------------------------------
    scrapeInterval: 60
    # --------------------------------------------------------------------------
    # scrapeTimeout (integer, seconds)
    # --------------------------------------------------------------------------
    # Maximum time to wait for a scrape request to complete
    #
    # Default: 10 seconds
    # Range: 5-60 seconds
    # Recommendation: Set to < scrapeInterval to avoid overlapping scrapes
    #
    # Note: If a scrape takes longer than this, it will be cancelled and an
    #       error will be logged. The next scrape will still occur on schedule.
    # --------------------------------------------------------------------------
    scrapeTimeout: 10
    # --------------------------------------------------------------------------
    # connectTimeout (integer, seconds)
    # --------------------------------------------------------------------------
    # Maximum time to wait for HTTP connection establishment
    #
    # Default: 5 seconds
    # Range: 1-30 seconds
    # Recommendation: 5-10s for remote exporters, 2-5s for local exporters
    # --------------------------------------------------------------------------
    connectTimeout: 5
    # --------------------------------------------------------------------------
    # maxMetricsPerScrape (integer)
    # --------------------------------------------------------------------------
    # Maximum number of metrics to collect in a single scrape operation
    #
    # Default: 10000
    # Range: 100-100000
    # Purpose: Prevents memory issues and runaway metric collection
    #
    # Note: If an exporter returns more metrics than this limit, only the
    #       first N metrics will be processed and a warning will be logged.
    # --------------------------------------------------------------------------
    maxMetricsPerScrape: 10000
    # --------------------------------------------------------------------------
    # defaultMetricPathPrefix (string)
    # --------------------------------------------------------------------------
    # Default prefix for all metrics
    #
    # Default: "Custom Metrics|Prometheus"
    # Examples:
    #   - "Custom Metrics|Prometheus"
    #   - "Custom Metrics|Infrastructure"
    #   - "Hardware Resources"
    #
    # Note: Individual exporters can override this with their own prefix
    # --------------------------------------------------------------------------
    defaultMetricPathPrefix: "Custom Metrics|Prometheus"
    # --------------------------------------------------------------------------
    # includeUnmappedMetrics (boolean)
    # --------------------------------------------------------------------------
    # Whether to include metrics that don't have explicit mappings
    #
    # Default: false
    # Options: true | false
    #
    # When true: All metrics from exporters will be reported, even if they
    #            don't have a mapping rule defined
    # When false: Only metrics with explicit mappings will be reported
    #
    # Recommendation: Start with false to avoid metric explosion, then enable
    #                 for specific exporters as needed
    # --------------------------------------------------------------------------
    includeUnmappedMetrics: false
    # --------------------------------------------------------------------------
    # userAgent (string)
    # --------------------------------------------------------------------------
    # User-Agent header sent with all HTTP requests to exporters
    #
    # Default: "AppDynamics-MachineAgent-Prometheus/1.0"
    # Purpose: Helps identify the scraper in exporter logs and metrics
    #
    # Format: Usually "<Product>/<Version>"
    # --------------------------------------------------------------------------
    userAgent: "AppDynamics-MachineAgent-Prometheus/1.0"
    # --------------------------------------------------------------------------
    # THREADING CONFIGURATION
    # --------------------------------------------------------------------------
    # Controls the thread pool used for concurrent scraping operations
    # --------------------------------------------------------------------------
    threading:
      # ------------------------------------------------------------------------
      # poolSize (integer)
      # ------------------------------------------------------------------------
      # Number of worker threads in the scraper thread pool
      #
      # Default: max(4, number_of_cpu_cores)
      # Range: 1-100
      #
      # Recommendation:
      #   - 1-2 threads: For 1-2 exporters or limited resources
      #   - 4-8 threads: For 3-10 exporters (typical use case)
      #   - 10+ threads: For 10+ exporters with sufficient CPU/memory
      #
      # Note: Each scrape operation uses one thread. If you have 5 exporters
      #       scraping every 60s, a pool of 5-10 threads is sufficient.
      # ------------------------------------------------------------------------
      poolSize: 4
      # ------------------------------------------------------------------------
      # threadNamePrefix (string)
      # ------------------------------------------------------------------------
      # Prefix for thread names in the pool (useful for debugging and monitoring)
      #
      # Default: "appd-prom-scraper"
      #
      # Purpose: Makes it easy to identify Prometheus scraper threads in
      #          thread dumps and profilers
      #
      # Thread naming pattern: "{prefix}-{number}"
      # Example: "appd-prom-scraper-1", "appd-prom-scraper-2", etc.
      # ------------------------------------------------------------------------
      threadNamePrefix: "appd-prom-scraper"
      # ------------------------------------------------------------------------
      # daemonThreads (boolean)
      # ------------------------------------------------------------------------
      # Whether scraper threads should be daemon threads
      #
      # Default: true
      # Options: true | false
      #
      # When true: Threads won't prevent JVM shutdown
      # When false: JVM will wait for threads to complete before shutdown
      #
      # Recommendation: Keep as true (default) for normal operation
      # ------------------------------------------------------------------------
      daemonThreads: true
  # ================================================================================
  # EXPORTER REFERENCES
  # ================================================================================
  # List of individual exporter configuration files to load
  #
  # Each exporter configuration is defined in a separate YAML file, which allows:
  #   - Modular configuration management
  #   - Easy addition/removal of exporters
  #   - Clean separation of concerns
  #   - Version control friendly
  #
  # File Path Resolution:
  #   - Relative paths are resolved relative to this configuration file's directory
  #   - Absolute paths can also be used
  #   - Example: "exporters/my-exporter.yaml" → "conf/prometheus/exporters/my-exporter.yaml"
  #
  # Loading Behavior:
  #   - Files are loaded in the order specified
  #   - If a file is missing or invalid, an error is logged and that exporter is skipped
  #   - Other exporters continue to function normally
  # ================================================================================
  exporters:
    # Example 1: NVIDIA DCGM GPU Exporter (GPU monitoring)
    - configFile: "exporters/dcgm-exporter.yaml"
    # Example 2: Node Exporter (System metrics - CPU, memory, disk, network)
    # Uncomment to enable:
    # - configFile: "exporters/node-exporter.yaml"
    # Example 3: Custom application exporter
    # - configFile: "exporters/my-app-exporter.yaml"
    # Example 4: Kafka exporter
    # - configFile: "exporters/kafka-exporter.yaml"
    # Example 5: MongoDB exporter
    # - configFile: "exporters/mongodb-exporter.yaml"
    # You can add as many exporter references as needed
    # Each exporter runs independently with its own scrape schedule
# ==================================================================================
# TROUBLESHOOTING
# ==================================================================================
#
# Problem: No metrics appearing in Controller
# Solution:
#   - Check Machine Agent logs for errors
#   - Verify exporter is enabled: enabled: true
#   - Ensure generic monitoring is enabled: -Dappdynamics.sim.prometheus.enabled=true
#   - Verify exporter endpoint is accessible
#
# Problem: Too many metrics being collected
# Solution:
#   - Add filters to restrict metric collection
#   - Set includeUnmappedMetrics: false
#   - Reduce maxMetricsPerScrape if needed
#
# Problem: Scrapes timing out
# Solution:
#   - Increase scrapeTimeout
#   - Reduce maxMetricsPerScrape
#   - Check exporter performance
#   - Verify network connectivity
#
# Problem: Metrics delayed or missing
# Solution:
#   - Check scrape interval isn't too long
#   - Verify thread pool size is sufficient
#   - Review logs for scrape failures
#
# ==================================================================================
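The value ranges documented for the global settings can be checked before the Machine Agent is restarted. The sketch below is illustrative only: `validate_global` is a hypothetical helper, not part of the extension, and it assumes the `prometheus.global` section has already been parsed into a Python dictionary (for example with a YAML library of your choice). The ranges and fallback defaults are copied from the comments in the reference file.

```python
from __future__ import annotations

# Ranges taken from the comments in prometheus-config-reference.yaml.
DOCUMENTED_RANGES = {
    "scrapeInterval": (10, 3600),
    "scrapeTimeout": (5, 60),
    "connectTimeout": (1, 30),
    "maxMetricsPerScrape": (100, 100000),
}


def validate_global(settings: dict) -> list[str]:
    """Return human-readable problems found in a 'prometheus.global' mapping."""
    problems = []

    # Check each numeric setting that is present against its documented range.
    for key, (low, high) in DOCUMENTED_RANGES.items():
        if key in settings and not low <= settings[key] <= high:
            problems.append(f"{key}={settings[key]} is outside {low}-{high}")

    # The reference recommends scrapeTimeout < scrapeInterval so that scrapes
    # do not overlap; fall back to the documented defaults when a key is absent.
    interval = settings.get("scrapeInterval", 60)
    timeout = settings.get("scrapeTimeout", 10)
    if timeout >= interval:
        problems.append("scrapeTimeout should be less than scrapeInterval")

    # threading.poolSize has a documented range of 1-100.
    pool_size = settings.get("threading", {}).get("poolSize")
    if pool_size is not None and not 1 <= pool_size <= 100:
        problems.append(f"threading.poolSize={pool_size} is outside 1-100")

    return problems
```

An empty list means the values fall inside the documented ranges; it does not guarantee that the agent accepts the file, since the agent's own validation rules are not described here.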
- exporters/exporter-template.yaml: This YAML contains the settings for an individual exporter. Duplicate this file for each Prometheus exporter you wish to monitor and customize it as necessary.

# ==================================================================================
# AppDynamics Prometheus Exporter Configuration - Template
# ==================================================================================
# This is a comprehensive template documenting all available configuration options
# for a single Prometheus exporter.
#
# Use this file as a reference when creating configuration for a new exporter.
#
# Copy this file, rename it (e.g., "my-exporter.yaml"), customize the values,
# and reference it in prometheus-config.yaml
#
# Documentation: <add link>
# Version: 1.0
# ==================================================================================
# ==============================================================================
# EXPORTER IDENTIFICATION
# ==============================================================================
# ------------------------------------------------------------------------------
# name (string, REQUIRED)
# ------------------------------------------------------------------------------
# Unique identifier for this exporter
#
# Requirements:
#   - Must be unique across all exporters
#   - Use alphanumeric characters, hyphens, and underscores
#   - No spaces allowed
#
# Examples: "node-exporter", "kafka-broker-1", "app-metrics"
#
# Purpose: Used in logs, metrics paths, and internal tracking
# ------------------------------------------------------------------------------
name: "example-exporter"
# ------------------------------------------------------------------------------
# type (string, optional)
# ------------------------------------------------------------------------------
# Exporter type/category for documentation and organization
#
# Common types:
#   - "node" - System metrics (CPU, memory, disk, network)
#   - "dcgm" - NVIDIA GPU metrics
#   - "application" - Application-specific metrics
#   - "database" - Database metrics (MySQL, PostgreSQL, MongoDB)
#   - "messaging" - Message queue metrics (Kafka, RabbitMQ)
#   - "container" - Container runtime metrics (cAdvisor)
#   - "custom" - Custom/proprietary metrics
#
# Purpose: Helps with categorization and filtering
# ------------------------------------------------------------------------------
type: "custom"
# ------------------------------------------------------------------------------
# enabled (boolean)
# ------------------------------------------------------------------------------
# Whether this exporter should be actively scraped
#
# Default: true
# Options: true | false
#
# Use Cases:
#   - Temporarily disable an exporter without removing its configuration
#   - Environment-specific enablement (dev vs prod)
#   - A/B testing of different configurations
# ------------------------------------------------------------------------------
enabled: true
# ==============================================================================
# SCRAPE CONFIGURATION
# ==============================================================================
# ------------------------------------------------------------------------------
# scrapeInterval (integer, seconds, optional)
# ------------------------------------------------------------------------------
# How often to scrape metrics from this exporter
#
# Default: Inherits from global.scrapeInterval
# Range: 10-3600 seconds
#
# Override the global setting when:
#   - This exporter updates metrics more/less frequently than others
#   - You want to reduce load on a slow exporter
#   - You need high-frequency monitoring for specific metrics
#
# ------------------------------------------------------------------------------
scrapeInterval: 60
# ------------------------------------------------------------------------------
# scrapeTimeout (integer, seconds, optional)
# ------------------------------------------------------------------------------
# Maximum time to wait for this exporter to respond
#
# Default: Inherits from global.scrapeTimeout
# Range: 5-60 seconds
#
# Recommendation: Set < scrapeInterval to avoid overlapping scrapes
#
# Override when:
#   - This exporter is slow to respond (large metric payloads)
#   - Network latency is higher than usual
#   - Exporter performs computation during scrape
# ------------------------------------------------------------------------------
scrapeTimeout: 10
# ==============================================================================
# SERVICE DISCOVERY
# ==============================================================================
# Configuration for locating and connecting to the Prometheus exporter endpoint
#
# Supported discovery types:
#   - static: Direct connection using host/port/path
#   - kubernetes: Service DNS name constructed from serviceName and namespace
# ==============================================================================
serviceDiscovery:
  # ----------------------------------------------------------------------------
  # type (string)
  # ----------------------------------------------------------------------------
  # Service discovery mechanism
  #
  # Supported Values:
  #   - "static" : Direct host/IP connection (bare metal, VMs, or Kubernetes DNS)
  #   - "kubernetes" : Kubernetes service discovery (constructs DNS automatically)
  #
  # Default: "static"
  # ----------------------------------------------------------------------------
  type: "static"
  # ============================================================================
  # STATIC DISCOVERY CONFIGURATION
  # ============================================================================
  # Use this for bare metal, VMs, or when you want to specify the full hostname
  # (including Kubernetes service DNS names manually)
  # ----------------------------------------------------------------------------
  # host (string, REQUIRED for static)
  # ----------------------------------------------------------------------------
  # Hostname or IP address of the exporter
  #
  # Examples:
  #   - "localhost" - Local exporters
  #   - "192.168.1.100" - Direct IP address
  #   - "metrics.example.com" - DNS hostname
  #   - "exporter.namespace.svc.cluster.local" - Kubernetes service (manual)
  #
  # Environment Variables: Supports ${env:VAR_NAME:default} syntax
  # Example: "${env:EXPORTER_HOST:localhost}"
  # ----------------------------------------------------------------------------
  host: "localhost"
  # ----------------------------------------------------------------------------
  # port (integer, REQUIRED)
  # ----------------------------------------------------------------------------
  # TCP port number where the exporter is listening
  #
  # Range: 1-65535
  #
  # Common Prometheus exporter ports:
  #   - 9100: Node Exporter
  #   - 9400: DCGM Exporter (GPU)
  #   - 9090: Prometheus itself
  #   - 9308: Kafka Exporter
  #   - 9216: MongoDB Exporter
  #   - 9104: MySQL Exporter
  #
  # Check your exporter's documentation for the default port
  #
  # Can use environment variables: ${env:EXPORTER_PORT:9090}
  # ----------------------------------------------------------------------------
  port: 9090
  # ----------------------------------------------------------------------------
  # path (string)
  # ----------------------------------------------------------------------------
  # URL path to the metrics endpoint
  #
  # Default: "/metrics" (Prometheus standard)
  #
  # Examples:
  #   - "/metrics" - Standard Prometheus endpoint
  #   - "/actuator/prometheus" - Spring Boot Actuator
  #   - "/federate" - Prometheus federation endpoint
  #   - "/api/metrics" - Custom application endpoint
  #
  # Can use environment variables: ${env:METRICS_PATH:/metrics}
  # ----------------------------------------------------------------------------
  path: "/metrics"
  # ============================================================================
  # KUBERNETES DISCOVERY CONFIGURATION
  # ============================================================================
  # When type is "kubernetes", the Machine Agent will automatically construct
  # the Kubernetes service DNS name as: <serviceName>.<namespace>.svc.cluster.local
  #
  # Example Configuration (uncomment to use):
  # ----------------------------------------------------------------------------
  # type: "kubernetes"
  #
  # # ----------------------------------------------------------------------------
  # # serviceName (string, REQUIRED for kubernetes)
  # # ----------------------------------------------------------------------------
  # # The Kubernetes service name
  # #
  # # Examples:
  # #   - "dcgm-exporter"
  # #   - "node-exporter"
  # #   - "prometheus-server"
  # #
  # # Can use environment variables: ${env:K8S_SERVICE_NAME:dcgm-exporter}
  # # ----------------------------------------------------------------------------
  # serviceName: "dcgm-exporter"
  #
  # # ----------------------------------------------------------------------------
  # # namespace (string, REQUIRED for kubernetes)
  # # ----------------------------------------------------------------------------
  # # The Kubernetes namespace where the service is deployed
  # #
  # # Examples:
  # #   - "default"
  # #   - "monitoring"
  # #   - "gpu-operator"
  # #
  # # Can use environment variables: ${env:K8S_NAMESPACE:default}
  # # ----------------------------------------------------------------------------
  # namespace: "gpu-operator"
  #
  # # Port and path work the same way as static discovery
  # port: 9400
  # path: "/metrics"
  # ----------------------------------------------------------------------------
  # ----------------------------------------------------------------------------
  # scheme (string, optional)
  # ----------------------------------------------------------------------------
  # Protocol scheme for the connection
  #
  # Supported: "http", "https"
  # Default: "http"
  #
  # Use "https" if your exporter uses TLS/SSL
  # ----------------------------------------------------------------------------
  # scheme: "http"
# ==============================================================================
# AUTHENTICATION
# ==============================================================================
# Configure authentication for accessing the exporter endpoint
#
# Supported types:
#   - none: No authentication (default)
#   - basic: HTTP Basic Authentication (username/password)
#   - bearer: Bearer token authentication (OAuth2, API tokens)
# ==============================================================================
authentication:
  # ----------------------------------------------------------------------------
  # type (string)
  # ----------------------------------------------------------------------------
  # Authentication mechanism
  #
  # Options: "none" | "basic" | "bearer"
  # Default: "none"
  # ----------------------------------------------------------------------------
  type: "none"
  # ----------------------------------------------------------------------------
  # BASIC AUTHENTICATION
  # ----------------------------------------------------------------------------
  # Used when type: "basic"
  #
  # Sends credentials in Authorization header: "Basic base64(username:password)"
  #
  # Security Best Practices:
  #   - Use environment variables for credentials: ${env:USERNAME}
  # ----------------------------------------------------------------------------
  # basic:
  #   # Username for authentication
  #   username: "${env:EXPORTER_USERNAME}"
  #
  #   # Password for authentication
  #   password: "${env:EXPORTER_PASSWORD}"
  # ----------------------------------------------------------------------------
  # BEARER TOKEN AUTHENTICATION
  # ----------------------------------------------------------------------------
  # Used when type: "bearer"
  #
  # Sends token in Authorization header: "Bearer <token>"
  #
  # Two options for providing the token:
  #   1. Directly in configuration (less secure, use env vars)
  #   2. Read from a file (more secure, e.g., Kubernetes secrets)
  #
  # ----------------------------------------------------------------------------
  # bearer:
  #   # Option 1: Token directly (use env var)
  #   token: "${env:EXPORTER_TOKEN}"
  #
  #   # Option 2: Read token from file (preferred for Kubernetes)
  #   # tokenFile: "/var/run/secrets/exporter/token"
# ==============================================================================
# METRIC CONFIGURATION
# ==============================================================================
# Controls how metrics are collected, filtered, transformed, and mapped
# ==============================================================================
metricConfig:
  # ----------------------------------------------------------------------------
  # metricPathPrefix (string, optional)
  # ----------------------------------------------------------------------------
  # Prefix for all metrics from this exporter in the AppDynamics metric tree
  #
  # Default: Inherits from global.defaultMetricPathPrefix
  #
  # Examples:
  #   - "Custom Metrics|Infrastructure|Node"
  #   - "Hardware Resources|GPU"
  #   - "Application|MyApp|Metrics"
  #
  # Purpose: Organizes metrics in AppDynamics metric browser
  #
  # Note: Individual metric mappings can override this with absolute paths
  # ----------------------------------------------------------------------------
  metricPathPrefix: "Custom Metrics|Prometheus"
  # ----------------------------------------------------------------------------
  # includeUnmappedMetrics (boolean, optional)
  # ----------------------------------------------------------------------------
  # Whether to include metrics without explicit mappings
  #
  # Default: Inherits from global.includeUnmappedMetrics (usually false)
  #
  # When true: All metrics are reported, mapped and unmapped
  #            Unmapped metrics will appear under: metricPathPrefix|exporterName|metricName
  # When false: Only metrics with explicit mappings are reported
  #
  # Use Cases:
  #   - Set to true temporarily to discover available metrics
  #   - Enable for exporters with stable, well-known metrics
  #   - Disable to prevent metric explosion from dynamic labels
  # ----------------------------------------------------------------------------
  includeUnmappedMetrics: false
  # ============================================================================
  # METRIC FILTERS
  # ============================================================================
  # Include/exclude metrics based on regex patterns
  #
  # Filter Order:
  #   1. Include patterns are evaluated first (if defined)
  #   2. Exclude patterns are evaluated second (if defined)
  #   3. A metric must match include AND not match exclude to be collected
  #
  # Regex Syntax:
  #   - Uses Java regex syntax
  #   - Patterns are matched against the full metric name
  #   - Use .* for wildcards
  #   - Use ^ and $ for exact matches
  # ============================================================================
  filters:
    # --------------------------------------------------------------------------
    # include (array of strings, optional)
    # --------------------------------------------------------------------------
    # Regex patterns for metrics to INCLUDE
    #
    # Behavior:
    #   - If defined, ONLY metrics matching at least one pattern are collected
    #   - If not defined, all metrics pass the include filter
    #
    # Examples:
    #   - "http_.*" - All HTTP metrics
    #   - "node_cpu_.*" - All CPU metrics
    #   - "myapp_(requests|errors)_total" - Specific counters
    #
    # --------------------------------------------------------------------------
    include:
      - ".*" # Match all metrics (effectively no filtering)
      # - "http_requests_.*"
      # - "database_queries_.*"
      # - "cache_.*"
    # --------------------------------------------------------------------------
    # exclude (array of strings, optional)
    # --------------------------------------------------------------------------
    # Regex patterns for metrics to EXCLUDE
    #
    # Behavior:
    #   - Evaluated AFTER include patterns
    #   - Metrics matching any exclude pattern are dropped
    #
    # Common Exclusions:
    #   - ".*_bucket" - Histogram buckets (high cardinality)
    #   - ".*_created" - Metric creation timestamps
    #   - ".*_debug.*" - Debug/internal metrics
    #   - ".*_test.*" - Test metrics
    #
    # --------------------------------------------------------------------------
    exclude:
      - ".*_bucket" # Exclude histogram buckets
      - ".*_created" # Exclude creation timestamps
      # - ".*_debug_.*"
      # - "go_.*" # Exclude Go runtime metrics
  # ============================================================================
  # PROPERTY MAPPINGS
  # ============================================================================
  # Extract values from metric labels and report them as Machine Agent properties
  # ============================================================================
  propertyMappings:
    # --------------------------------------------------------------------------
    # Property Mapping Entry
    # --------------------------------------------------------------------------
    # Each entry extracts a value from metric labels and creates a property
    # --------------------------------------------------------------------------
    # Example 1: Count unique values (e.g., number of CPUs, GPUs, nodes)
    - propertyName: "Example|Total Instances"
      sourceLabelName: "instance"
      extractionType: "COUNT_UNIQUE_VALUES"
      description: "Total number of unique instances"
    # Example 2: Get first value (e.g., version, hostname)
    # - propertyName: "Example|Version"
    #   sourceLabelName: "version"
    #   extractionType: "FIRST_VALUE"
    #   description: "Application version"
    # Example 3: Concatenate all values (e.g., list of hosts)
    # - propertyName: "Example|All Instances"
    #   sourceLabelName: "instance"
    #   extractionType: "CONCATENATE_VALUES"
    #   description: "Comma-separated list of all instances"
    # Example 4: Pattern transformation (e.g., formatted strings)
    # - propertyName: "Hardware|GPU Model"
    #   sourceLabelName: "gpu_model"
    #   extractionType: "PATTERN_TRANSFORM"
    #   valuePattern: "GPU Model: {value}"
    #   description: "Formatted GPU model string"
    # Example 5: Filtered extraction (e.g., only specific label values)
    # - propertyName: "Application|Production Nodes"
    #   sourceLabelName: "hostname"
    #   extractionType: "CONCATENATE_VALUES"
    #   labelValueFilter: "prod-.*"
    #   description: "List of production hostnames only"
    # --------------------------------------------------------------------------
    # propertyName (string, REQUIRED)
    # --------------------------------------------------------------------------
    # Name of the property in Machine Agent
    # Use "|" as a separator for hierarchical names
    #
    # Examples:
    #   - "Hardware|GPU|Count"
    #   - "Application|Database|Version"
    #   - "System|Hostname"
    #
    # Best Practices:
    #   - Use clear, descriptive names
    #   - Organize hierarchically (Category|Subcategory|Property)
    #   - Avoid special characters except "|"
    # --------------------------------------------------------------------------
    # --------------------------------------------------------------------------
    # sourceLabelName (string, REQUIRED)
    # --------------------------------------------------------------------------
    # Name of the Prometheus label to extract the value from
    #
    # Examples:
    #   - "instance" - Instance/host identifier
    #   - "version" - Application/library version
    #   - "cpu" - CPU core identifier
    #   - "gpu" - GPU device identifier
    #   - "model" - Hardware model name
    #
    # Note: Label must exist in at least one metric for extraction to occur
    # --------------------------------------------------------------------------
    # --------------------------------------------------------------------------
    # extractionType (string, REQUIRED)
    # --------------------------------------------------------------------------
    # How to extract/aggregate values from the label across all metrics
    #
    # Options:
    #
    # 1. DIRECT_VALUE
    #    Extract the label value directly without transformation
    #    Use for: Static values, identifiers, names
    #    Example: Label "hostname"="server-01" → Property "server-01"
    #
    # 2. FIRST_VALUE
    #    Use the first encountered value for this label
    #    Use for: Properties that should be consistent (version, hostname)
    #    Example: First metric has "version"="1.2.3" → Property "1.2.3"
    #
    # 3. COUNT_UNIQUE_VALUES
    #    Count distinct values for this label across all metrics
    #    Use for: Counting resources (CPUs, GPUs, nodes, instances)
    #    Example: Label "gpu" has values "0","1","2" → Property "3"
    #
    # 4. COUNT_OCCURRENCES
    #    Count total number of metrics that have this label
    #    Use for: Total metric count, activity indicators
    #    Example: 15 metrics have label "gpu" → Property "15"
    #
    # 5. CONCATENATE_VALUES
    #    Join all unique values with commas (sorted alphabetically)
    #    Use for: Lists of resources, inventories
    #    Example: Label "cpu" has values "0","1","2" → Property "0,1,2"
    #
    # 6. PATTERN_TRANSFORM
    #    Transform value using a pattern with {value} placeholder
    #    Use for: Formatted strings, prefixed/suffixed values
    #    Example: Pattern "GPU Count: {value}" + value "3" → "GPU Count: 3"
    #    Requires: valuePattern field to be specified
    #
    # --------------------------------------------------------------------------
    # --------------------------------------------------------------------------
    # valuePattern (string, optional)
    # --------------------------------------------------------------------------
    # Pattern for transforming label values (only for PATTERN_TRANSFORM type)
    #
    # Use {value} as placeholder for the actual label value
    #
    # Examples:
    #   - "GPU Model: {value}" → "GPU Model: Tesla V100"
    #   - "Version {value}" → "Version 1.2.3"
    #   - "Host: {value}" → "Host: server-01"
    #   - "{value} GPUs Available" → "4 GPUs Available"
    #
    # Note: Only used when extractionType is PATTERN_TRANSFORM
    # --------------------------------------------------------------------------
    # --------------------------------------------------------------------------
    # labelValueFilter (string, optional)
    # --------------------------------------------------------------------------
    # Regex pattern to filter which label values to include
    #
    # Use Cases:
    #   - Extract properties only from specific instances/nodes
    #   - Filter by environment (prod, staging, dev)
    #   - Include only specific device types
    #
    # Examples:
    #   - "prod-.*" - Only production nodes
    #   - "gpu[0-3]" - Only GPUs 0-3
    #   - "^worker.*" - Only worker nodes
    #   - ".*-primary$" - Only primary instances
    #
    # Behavior:
    #   - If specified, only metrics with label values matching the pattern
    #     will be included in property extraction
    #   - Uses Java regex syntax
    #   - If not specified, all label values are included
    #
    # Example Usage:
    #   propertyName: "Infrastructure|Production Nodes"
    #   sourceLabelName: "hostname"
    #   extractionType: "COUNT_UNIQUE_VALUES"
    #   labelValueFilter: "prod-.*"
    #
    # Result: Counts only hostnames starting with "prod-"
    # --------------------------------------------------------------------------
    # --------------------------------------------------------------------------
    # description (string, optional)
    # --------------------------------------------------------------------------
    # Human-readable description of the property
    # Used for documentation purposes only
    # --------------------------------------------------------------------------
  # ============================================================================
  # METRIC MAPPINGS
  # ============================================================================
  # Transform Prometheus metrics into AppDynamics metrics
  #
  # Purpose:
  #   - Map Prometheus metric names to AppDynamics metric paths
  #   - Transform metric values (unit conversion, scaling)
  #   - Transform labels into metric path segments
  #   - Set aggregation type for AppDynamics
  #   - Filter metrics based on label values
  # ============================================================================
  mappings:
    # --------------------------------------------------------------------------
    # Metric Mapping Entry
    # --------------------------------------------------------------------------
    # Each entry defines how one Prometheus metric maps to AppDynamics
    # --------------------------------------------------------------------------
    # Example 1: Simple metric with label transformation
    - sourceMetricName: "http_requests_total"
      targetMetricPath: "Application|HTTP|{method}|{status}|Requests"
      aggregationType: "SUM"
      labelMappings:
        method: "{value}" # Keep as-is: GET, POST, etc.
        status: "status_{value}" # Transform: 200 → status_200
    # Example 2: Metric with unit conversion (bytes to MB)
    # - sourceMetricName: "memory_bytes"
    #   targetMetricPath: "System|Memory|Used (MB)"
    #   aggregationType: "OBSERVATION"
    #   multiplier: 0.000001 # bytes to MB
    # Example 3: Metric with conditional filtering
    # - sourceMetricName: "node_cpu_seconds_total"
    #   targetMetricPath: "System|CPU|{cpu}|{mode}|Seconds"
    #   aggregationType: "AVERAGE"
    #   labelMappings:
    #     cpu: "cpu{value}"
    #     mode: "{value}"
    # --------------------------------------------------------------------------
    # sourceMetricName (string, REQUIRED)
    # --------------------------------------------------------------------------
    # Exact name of the Prometheus metric to map
    #
    # Examples:
    #   - "http_requests_total"
    #   - "node_memory_MemAvailable_bytes"
    #   - "DCGM_FI_DEV_GPU_UTIL"
    #
    # Note: Must match exactly (case-sensitive)
    # --------------------------------------------------------------------------
    # --------------------------------------------------------------------------
    # targetMetricPath (string, REQUIRED)
    # --------------------------------------------------------------------------
    # Metric path in AppDynamics metric tree
    #
    # Supports placeholders: {labelName}
    # Placeholders are replaced with label values from the metric
    #
    # Examples:
    #   - "Application|HTTP|Requests" - Static path
    #   - "Application|HTTP|{method}|Requests" - Dynamic with method label
    #   - "System|CPU|{cpu}|{mode}|Utilization" - Multiple placeholders
    #
    # Best Practices:
    #   - Use clear, hierarchical paths
    #   - Place most specific values at the end
    #   - Include units in path: "Memory (MB)", "Latency (ms)"
    #
-------------------------------------------------------------------------- # -------------------------------------------------------------------------- # aggregationType (string, REQUIRED) # -------------------------------------------------------------------------- # How AppDynamics should aggregate this metric over time # # Options: # - OBSERVATION: Point-in-time measurement (gauges) # Use for: CPU%, memory usage, queue size, temperature # # - AVERAGE: Average over time window # Use for: Response times, latencies, utilization percentages # # - SUM: Sum over time window (counters) # Use for: Request counts, error counts, bytes transferred # # Prometheus Metric Type → Aggregation Type mapping: # - Gauge → OBSERVATION or AVERAGE # - Counter → SUM (use rate() or increase() on Prometheus side first) # - Histogram → AVERAGE for quantiles, SUM for counts/sums # - Summary → AVERAGE for quantiles, SUM for counts/sums # -------------------------------------------------------------------------- # -------------------------------------------------------------------------- # multiplier (number, optional) # -------------------------------------------------------------------------- # Multiply metric value by this factor # # Use Cases: # - Unit conversion: # - Bytes to MB: 0.000001 # - Bytes to GB: 0.000000001 # - Seconds to milliseconds: 1000 # - Decimal to percentage: 100 # - Scaling: # - Convert from basis points: 0.01 # # Examples: # multiplier: 0.000001 # bytes → MB # multiplier: 100 # decimal → percentage # multiplier: 1000 # seconds → milliseconds # -------------------------------------------------------------------------- # -------------------------------------------------------------------------- # unit (string, optional) # -------------------------------------------------------------------------- # Unit of measurement for documentation # # Examples: "bytes", "MB", "ms", "%", "count", "ops/sec" # # Note: This is informational only and doesn't affect metric values # 
Consider including units in targetMetricPath instead # -------------------------------------------------------------------------- # -------------------------------------------------------------------------- # labelMappings (map, optional) # -------------------------------------------------------------------------- # Transform label values before inserting into metric path # # Format: labelName: "pattern" # # Two transformation types: # 1. Pattern with {value}: Replace {value} with original label value # Example: "cpu{value}" transforms "0" → "cpu0" # # 2. Static string: Replace label value with static string # Example: "primary" transforms any value → "primary" # # Examples: # labelMappings: # cpu: "cpu{value}" # 0 → cpu0, 1 → cpu1 # mode: "{value}" # keep as-is # status: "HTTP_{value}" # 200 → HTTP_200 # -------------------------------------------------------------------------- # ============================================================================ # AGGREGATE METRICS # ============================================================================ # Calculate aggregate statistics across multiple individual metrics # # Purpose: # - Report cluster-wide or system-wide statistics # - Reduce metric cardinality for high-level views # - Provide summary metrics for dashboards # # Timing: Calculated after individual metrics are collected # # Use Cases: # - Average CPU utilization across all cores # - Total memory used across all nodes # - Maximum temperature across all GPUs # - Count of active instances # ============================================================================ aggregates: # -------------------------------------------------------------------------- # Aggregate Metric Entry # -------------------------------------------------------------------------- # Each entry defines one aggregate metric calculation # -------------------------------------------------------------------------- # Example 1: Average across a label (e.g., average CPU usage across 
cores) - sourceMetricName: "cpu_usage_percent" targetMetricPath: "System|CPU|Average Usage (%)" aggregationFunction: "AVERAGE" aggregateAcrossLabel: "cpu" # Example 2: Maximum value (e.g., hottest temperature) # - sourceMetricName: "temperature_celsius" # targetMetricPath: "Hardware|Maximum Temperature (C)" # aggregationFunction: "MAX" # aggregateAcrossLabel: "sensor" # Example 3: Sum (e.g., total memory across nodes) # - sourceMetricName: "memory_used_bytes" # targetMetricPath: "System|Total Memory Used (MB)" # aggregationFunction: "SUM" # aggregateAcrossLabel: "node" # multiplier: 0.000001 # Convert to MB # Example 4: Count (e.g., number of instances) # - sourceMetricName: "up" # targetMetricPath: "System|Instance Count" # aggregationFunction: "COUNT" # aggregateAcrossLabel: "instance" # Example 5: Filtered aggregate (e.g., average for specific nodes) # - sourceMetricName: "cpu_usage_percent" # targetMetricPath: "System|Worker Nodes|Average CPU (%)" # aggregationFunction: "AVERAGE" # aggregateAcrossLabel: "node" # filterLabels: # role: ["worker"] # Only worker nodes # -------------------------------------------------------------------------- # sourceMetricName (string, REQUIRED) # -------------------------------------------------------------------------- # Prometheus metric name to aggregate # # Note: This should match a sourceMetricName in the mappings section, # or be a metric that was collected # -------------------------------------------------------------------------- # -------------------------------------------------------------------------- # targetMetricPath (string, REQUIRED) # -------------------------------------------------------------------------- # Metric path for the aggregate metric in AppDynamics # # Best Practices: # - Use descriptive names: "Average", "Total", "Maximum" # - Include units: "(MB)", "(%)", "(ms)" # - Place aggregates in logical hierarchies # -------------------------------------------------------------------------- # 
-------------------------------------------------------------------------- # aggregationFunction (string, REQUIRED) # -------------------------------------------------------------------------- # Statistical function to apply # # Options: # - AVERAGE: Mean value across all instances # - SUM: Total sum across all instances # - MIN: Minimum value across all instances # - MAX: Maximum value across all instances # - COUNT: Number of instances (count of unique label values) # # Use Cases: # - AVERAGE: CPU utilization, memory usage, response time # - SUM: Total requests, total bytes, total errors # - MAX: Peak temperature, max latency, highest queue depth # - MIN: Minimum available memory, lowest utilization # - COUNT: Number of nodes, pods, instances # -------------------------------------------------------------------------- # -------------------------------------------------------------------------- # aggregateAcrossLabel (string, REQUIRED) # -------------------------------------------------------------------------- # Label name to aggregate across # # The aggregation will group all metrics with different values of this # label and apply the aggregation function # # Examples: # - "cpu" - Aggregate across all CPUs (cpu0, cpu1, cpu2, ...) 
# - "node" - Aggregate across all nodes # - "pod" - Aggregate across all pods # - "instance" - Aggregate across all instances # -------------------------------------------------------------------------- # -------------------------------------------------------------------------- # multiplier (number, optional) # -------------------------------------------------------------------------- # Apply unit conversion to aggregated value # # Same as multiplier in regular mappings # Applied AFTER aggregation # # Example: Sum memory in bytes, then convert to MB with multiplier: 0.000001 # -------------------------------------------------------------------------- # -------------------------------------------------------------------------- # filterLabels (map, optional) # -------------------------------------------------------------------------- # Filter which metrics to include in aggregation based on label values # # Format: labelName: [value1, value2, ...] # # Use Cases: # - Aggregate only specific subsets (e.g., worker nodes only) # - Create multiple aggregates for different groups # # Example: # filterLabels: # role: ["worker", "compute"] # environment: ["production"] # -------------------------------------------------------------------------- # ================================================================================== # USAGE EXAMPLES # ================================================================================== # # Example 1: Simple HTTP metrics exporter # ----------------------------------------- # name: "app-metrics" # type: "application" # enabled: true # serviceDiscovery: # type: "static" # host: "localhost" # port: 8080 # path: "/metrics" # metricConfig: # metricPathPrefix: "Application|MyApp" # filters: # include: # - "http_.*" # mappings: # - sourceMetricName: "http_requests_total" # targetMetricPath: "Application|MyApp|HTTP|{method}|Requests" # aggregationType: "SUM" # # Example 2: Authenticated database exporter # 
------------------------------------------- # name: "postgres-metrics" # type: "database" # serviceDiscovery: # type: "static" # host: "postgres.prod.example.com" # port: 9187 # authentication: # type: "basic" # basic: # username: "${env:POSTGRES_EXPORTER_USER}" # password: "${env:POSTGRES_EXPORTER_PASSWORD}" # metricConfig: # metricPathPrefix: "Database|PostgreSQL" # filters: # include: # - "pg_stat_.*" # # ==================================================================================
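As a sketch of how the propertyMappings options described in the reference template combine, the fragment below pairs PATTERN_TRANSFORM with valuePattern, and COUNT_UNIQUE_VALUES with labelValueFilter. The property and label names (model, hostname) are taken from the reference's own examples. Whether your exporter exposes those labels, and the nesting of propertyMappings under metricConfig, are assumptions to verify against your exporter file.

```yaml
# Illustrative fragment only; the labels must exist on your exporter's metrics.
metricConfig:
  propertyMappings:
    # PATTERN_TRANSFORM requires valuePattern; {value} is replaced by the label value.
    - propertyName: "Hardware|GPU|Model"
      sourceLabelName: "model"
      extractionType: "PATTERN_TRANSFORM"
      valuePattern: "GPU Model: {value}"
      description: "GPU model with a readable prefix"
    # labelValueFilter (Java regex) limits the count to hostnames starting with "prod-".
    - propertyName: "Infrastructure|Production Nodes"
      sourceLabelName: "hostname"
      extractionType: "COUNT_UNIQUE_VALUES"
      labelValueFilter: "prod-.*"
      description: "Number of production hosts"
```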
The following example shows a prometheus-config.yaml edited from the reference template:
prometheus:
  global:
    scrapeInterval: 60
    defaultMetricPathPrefix: "Custom Metrics|Prometheus"
    includeUnmappedMetrics: false
    # (other global settings…)
  exporters:
    # --- Enabled Exporters (edit as needed) ---
    # GPU Exporter (enabled)
    - configFile: "exporters/dcgm-exporter.yaml"
    # To enable Node Exporter (system metrics), uncomment below:
    # - configFile: "exporters/node-exporter.yaml"
    # To add your own exporter, copy this pattern:
    # - configFile: "exporters/my-custom-app-exporter.yaml"
In this example:
- defaultMetricPathPrefix: "Custom Metrics|Prometheus" groups all Prometheus-based metrics under a clearly named folder in the metric tree, so you can locate and manage them easily in the AppDynamics Controller.
- scrapeInterval: 60 means the agent queries (scrapes) each configured Prometheus exporter once every 60 seconds, unless an exporter file sets its own scrapeInterval.
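To collect from Node Exporter as well, the commented entry in the example can be uncommented. The following is a sketch of the resulting file, assuming both exporter files exist under conf/prometheus/exporters and that the exporters list is nested under the prometheus key:

```yaml
prometheus:
  global:
    scrapeInterval: 60            # default for every exporter, in seconds
    defaultMetricPathPrefix: "Custom Metrics|Prometheus"
    includeUnmappedMetrics: false
  exporters:
    - configFile: "exporters/dcgm-exporter.yaml"   # GPU metrics
    - configFile: "exporters/node-exporter.yaml"   # system metrics, now enabled
```

Restart the Machine Agent after saving the file so that the change takes effect.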
The following is an example of an exporter configuration file (here, exporters/node-exporter.yaml):
# Node Exporter Configuration
name: "node-exporter"
type: "node"
enabled: true
scrapeInterval: 30
serviceDiscovery:
  type: "static"
  host: "localhost"
  port: 9100
  path: "/metrics"
authentication:
  type: "none"
metricConfig:
  metricPathPrefix: "Custom Metrics|Infrastructure|Node"
  includeUnmappedMetrics: false
  filters:
    include:
      - "node_cpu_.*"
      - "node_memory_.*"
      - "node_disk_.*"
      - "node_network_.*"
      - "node_filesystem_.*"
      - "node_load.*"
    exclude:
      - ".*_bucket"
      - ".*_created"
  propertyMappings:
    - propertyName: "Node|Hostname"
      sourceLabelName: "instance"
      extractionType: "FIRST_VALUE"
      description: "Node hostname"
    - propertyName: "Node|CPU Count"
      sourceLabelName: "cpu"
      extractionType: "COUNT_UNIQUE_VALUES"
      description: "Number of CPU cores"
  mappings:
    - sourceMetricName: "node_cpu_seconds_total"
      targetMetricPath: "Custom Metrics|Infrastructure|Node|CPU|{cpu}|{mode} (seconds)"
      aggregationType: "AVERAGE"
      labelMappings:
        cpu: "cpu{value}"
        mode: "{value}"
    # ... more metric mappings ...
In this example:
- name - Unique identifier for the exporter instance.
- type - Categorizes the exporter. In this example, node.
- enabled - Determines whether this exporter is active. In this example, true.
- scrapeInterval - How often, in seconds, the agent collects metrics from this exporter. The value 30 here overrides the global setting.
- serviceDiscovery - How the agent locates the exporter endpoint. In this example, a static host, port, and path.
- authentication - The credentials or method used to connect to protected exporters. In this example, none.
- metricConfig - Controls how metrics are collected, filtered, and mapped.
- aggregates - Defines how to compute aggregate metrics (for example, average or sum) across labels. Not shown in this example.
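The Node Exporter example does not show an aggregates section. The following is a minimal sketch of one, using only the fields documented in the reference template (sourceMetricName, targetMetricPath, aggregationFunction, aggregateAcrossLabel, multiplier). The metric node_filesystem_avail_bytes and its mountpoint label come from Node Exporter. The target paths and the placement of aggregates under metricConfig are assumptions to verify against prometheus-config-reference.yaml.

```yaml
# Illustrative fragment only; the nesting under metricConfig is assumed.
metricConfig:
  aggregates:
    # Sum of available bytes across all mount points, converted to MB.
    # Per the reference, the multiplier is applied after aggregation.
    - sourceMetricName: "node_filesystem_avail_bytes"
      targetMetricPath: "Custom Metrics|Infrastructure|Node|Filesystem|Total Available (MB)"
      aggregationFunction: "SUM"
      aggregateAcrossLabel: "mountpoint"
      multiplier: 0.000001
    # Number of CPU cores, counted as unique values of the cpu label.
    - sourceMetricName: "node_cpu_seconds_total"
      targetMetricPath: "Custom Metrics|Infrastructure|Node|CPU|Core Count"
      aggregationFunction: "COUNT"
      aggregateAcrossLabel: "cpu"
```

Both source metrics match the include filters in the example (node_filesystem_.* and node_cpu_.*), so they would be collected before the aggregates are calculated.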