Metrics Monitored for Aviatrix Resources

Aviatrix Controller captures the system metrics and network metrics that you can access in CoPilot. The Controller pulls the data from the virtual machines (instances/hosts) that Aviatrix Gateways run on and feeds it to CoPilot. Some metrics can trigger actions such as alerting and Gateway Scaling. You can also use metrics to monitor the performance of gateway hosts from the Monitor > Performance page in CoPilot. In addition, the Aviatrix Network Insights API lets you analyze the performance and health of your Aviatrix-managed resources in external monitoring systems. For information about using the Metrics and Status APIs, see External Monitoring with the Metrics and Status APIs.

About Metrics that are Monitored for Aviatrix Resources

Aviatrix Controller captures system metrics and network metrics for the virtual machines (instances/hosts) that Aviatrix Gateways run on. Health metrics are also captured for the Controller and CoPilot virtual machines (see Global Control Plane Health Alert). Metrics that are monitored by Aviatrix Controller and Aviatrix CoPilot include the following:

System Metrics for Triggering Notifications or Other Actions
Network Metrics for Triggering Notifications or Other Actions
Health Metrics for Triggering Notifications or Other Actions

On the CoPilot Monitor > Performance page, you can select metrics to monitor the performance of your resource VMs. On the CoPilot Monitor > Notifications > Alert Configurations page, you can configure alerts based on the predefined set of metrics so that you are notified about events in your network, such as performance bottlenecks or other problems. To better understand how notifications and alerts work and how to configure them in CoPilot, see Notifications (Alerts) About Network Events. For more information about integrating the Aviatrix Metrics and Status APIs with external monitoring tools, see External Monitoring with the Metrics and Status APIs.

About Percentage Metrics

Several network metrics in the tables below are expressed as percentages (metrics with names beginning with per_). All percentage metrics share the same denominator: total attempted packets per second (rate_pkt_attempted), which is the sum of successfully processed packets (rx + tx) and all failure events across both directions. Because the denominator is bidirectional, a directional percentage (such as per_rx_drop) represents that failure type as a fraction of all traffic, not just inbound or outbound traffic.
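The effect of the bidirectional denominator can be illustrated with a small worked example. The following is a minimal sketch, not CoPilot's implementation: the helper name percentage_of_attempted, the sample traffic values, and the choice to report 0% when there is no traffic are all assumptions made for illustration.

```python
def percentage_of_attempted(failure_rate, rate_pkt_attempted):
    """Express a failure rate as a percentage of total attempted packets."""
    if rate_pkt_attempted <= 0:
        return 0.0  # assumption: report 0% when no traffic was attempted
    return 100.0 * failure_rate / rate_pkt_attempted


# Hypothetical sample values, all in packets (or events) per second.
rx_pps = 6000.0        # successfully received packets
tx_pps = 3900.0        # successfully transmitted packets
rate_rx_drop = 100.0   # inbound drops, the only failure type in this sample
rate_pkt_fail = rate_rx_drop

# Denominator: successful packets in both directions plus all failure events.
rate_pkt_attempted = rx_pps + tx_pps + rate_pkt_fail

per_rx_drop = percentage_of_attempted(rate_rx_drop, rate_pkt_attempted)

# For comparison only: the same drops measured against inbound traffic alone.
inbound_only = 100.0 * rate_rx_drop / (rx_pps + rate_rx_drop)

print(per_rx_drop)             # 1.0
print(round(inbound_only, 2))  # 1.64
```

In this sample, per_rx_drop reads 1.0% even though about 1.64% of inbound packets were dropped, because outbound traffic is also counted in the denominator.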

Health Metrics for Triggering Notifications or Other Actions

The following health metrics are available in CoPilot. They are listed in alphabetical order, by the name used in the CoPilot UI.

Name (Health Metric)	Description	Internal Metric Name
BGP Peering Status	Any BGP peering status change triggers an alert.	BGPpeeringStatus
Connection Status	Any connection status change on the specified gateways/connections triggers an alert.	ConnectionStatus
Gateway Status	Any gateway status change triggers an alert.	GatewayStatus
Underlay Connection Status	Monitors the syslog from any connection that includes the host as the source or destination. The alert is triggered when syslog data from both directions of a connection between that host and another host indicates a potential problem, and the two indications occur within 30 seconds of each other. On the same connection, if the syslog data later indicates from either direction that the problem is resolved, the alert is automatically resolved.	UnderlayConnectionStatus
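The trigger and resolve conditions described for Underlay Connection Status can be sketched as two small predicates. This is only an illustration of the rule as stated in the table: the function names, the use of numeric timestamps in seconds, and the choice to treat exactly 30 seconds as inside the window are assumptions, not the actual CoPilot logic.

```python
WINDOW_SECONDS = 30  # window stated in the table above


def underlay_alert_triggered(problem_a_to_b, problem_b_to_a):
    """Return True when both directions reported a problem close together.

    Each argument is the timestamp (in seconds) of the latest problem
    indication seen for that direction, or None if none was seen.
    """
    if problem_a_to_b is None or problem_b_to_a is None:
        return False  # a problem in only one direction does not trigger
    # Assumption: a gap of exactly 30 seconds still counts as "within".
    return abs(problem_a_to_b - problem_b_to_a) <= WINDOW_SECONDS


def underlay_alert_resolved(resolved_a_to_b, resolved_b_to_a):
    """Return True when either direction reports the problem as resolved."""
    return bool(resolved_a_to_b or resolved_b_to_a)


print(underlay_alert_triggered(1000.0, 1012.0))  # True
print(underlay_alert_triggered(1000.0, 1045.0))  # False
print(underlay_alert_triggered(1000.0, None))    # False
print(underlay_alert_resolved(False, True))      # True
```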

System Metrics for Triggering Notifications or Other Actions

For Aviatrix Controller and Aviatrix gateways, you can configure alerts based on the following system metrics. Aviatrix gateways report live Linux system statistics (such as memory, CPU, I/O, processes, and swap) for the instances/virtual machines on which they run. Metrics are listed in alphabetical order, by the name used in the CoPilot UI.

Name (System Metric)	Description	Internal Metric Name	Accessible by API
CPU Idle (%)	Of the total CPU time, the percentage of time the CPU(s) spent idle. Collected as a 3-second average from the gateway.	cpu_idle	✓
CPU Kernel Space (%)	Of the total CPU time, the percentage of time spent running kernel code (system mode).	cpu_ks	✓
CPU Steal (%)	Of the total CPU time, the percentage of time a virtual CPU waited for a real CPU while the hypervisor serviced another virtual processor. Relevant for shared-tenancy instances.	cpu_steal
CPU Used (%)	The percentage of CPU used, calculated as 100% minus CPU Idle.	cpu_used_per
CPU User Space (%)	Of the total CPU time, the percentage of time spent running user-space (non-kernel) code.	cpu_us	✓
CPU Wait (%)	Of the total CPU time, the percentage of time spent waiting for I/O operations to complete.	cpu_wait	✓
Disk Free	The free (unused) storage space on the disk volume, in bytes.	hdisk_free
Disk Free (%)	Of the total storage space on the disk volume, the percentage that is free and unused.	hdisk_free_per
Disk Total	The total storage capacity of the disk volume, in bytes.	hdisk_tot
IO Blocks In	The number of blocks per second received from block devices during the sampling interval.	io_blk_in
IO Blocks Out	The number of blocks per second sent to block devices during the sampling interval.	io_blk_out
Memory Available	The amount of memory (in bytes) available to be allocated to new or existing processes, including free memory and reclaimable caches.	memory_available	✓
Memory Available (%)	Of the total memory, the percentage that is available to be allocated to new or existing processes.	memory_available_per
Memory Buffer	The amount of memory (in bytes) used by kernel buffers.	memory_buf	✓
Memory Cache	The amount of memory (in bytes) used by the page cache.	memory_cached	✓
Memory Free	The amount of memory (in bytes) that is completely unused and available. Unlike Memory Available, this does not include reclaimable caches.	memory_free	✓
Memory Swapped	The amount of memory (in bytes) written to swap space. Reports 0 when swap is not in use.	memory_swpd	✓
Memory Total	The total physical memory (in bytes) on the host.	memory_tot
Memory Used	The amount of memory (in bytes) actively in use by processes.	memory_used
Memory Used (%)	Of the total memory, the percentage actively in use by processes.	memory_used_per
Processes Uninterruptible Sleep	The number of processes in an uninterruptible sleep state, typically blocked waiting for I/O to complete.	nproc_non_int_sleep
Processes Waiting To Be Run	The number of processes that are currently running or are in the run queue waiting for CPU time.	nproc_running
Swaps From Disk	The amount of memory (in kilobytes) swapped in from disk per second.	swap_from_disk
Swaps To Disk	The amount of memory (in kilobytes) swapped out to disk per second.	swap_to_disk
System Context Switches	The number of CPU context switches per second.	system_cs
System Interrupts	The number of hardware interrupts per second, including the clock interrupt.	system_int
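Several metrics in this table are derived from others: CPU Used (%) is 100% minus CPU Idle, and the percentage forms of the memory and disk metrics are expressed relative to their totals. The sketch below mirrors those descriptions with hypothetical sample values. The helper names and the zero-total handling are assumptions for illustration, not the way the Controller or CoPilot computes the values.

```python
def cpu_used_per(cpu_idle):
    """CPU Used (%) as described above: 100% minus CPU Idle (%)."""
    return 100.0 - cpu_idle


def percent_of_total(part, total):
    """Express part as a percentage of total (both in the same unit)."""
    if total <= 0:
        return 0.0  # assumption: avoid dividing by zero on a missing total
    return 100.0 * part / total


# Hypothetical sample readings (bytes for memory and disk).
GIB = 1024 ** 3
sample = {
    "cpu_idle": 82.5,
    "memory_tot": 8 * GIB,
    "memory_available": 2 * GIB,
    "hdisk_tot": 64 * GIB,
    "hdisk_free": 48 * GIB,
}

print(cpu_used_per(sample["cpu_idle"]))                                 # 17.5
print(percent_of_total(sample["memory_available"], sample["memory_tot"]))  # 25.0
print(percent_of_total(sample["hdisk_free"], sample["hdisk_tot"]))         # 75.0
```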

Per-vCPU Metrics

Starting in CoPilot 4.32, the following per-vCPU metrics are available through the Metrics API. These metrics provide CPU utilization broken down by individual virtual CPU core for each gateway, enabling identification of single-core bottlenecks that aggregated CPU metrics may mask.

Name (vCPU Metric)	Description	Internal Metric Name	Accessible by API
vCPU Average Usage (%)	The average CPU utilization percentage for an individual vCPU core over the sampling interval.	vcpu_avg_usage	✓
vCPU Minimum Usage (%)	The minimum CPU utilization percentage observed for an individual vCPU core during the sampling interval.	vcpu_min_usage	✓
vCPU Maximum Usage (%)	The maximum CPU utilization percentage observed for an individual vCPU core during the sampling interval.	vcpu_max_usage	✓
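To illustrate how per-vCPU data can expose a single-core bottleneck that an aggregated figure hides, the following minimal sketch scans a set of per-core readings. The dictionary shape, the find_hot_vcpus helper, the 90% threshold, and the sample values are all hypothetical; the Metrics API response format is not shown here.

```python
def find_hot_vcpus(vcpu_avg_usage, threshold=90.0):
    """Return the vCPU indexes whose average usage meets the threshold.

    vcpu_avg_usage maps a vCPU index to its vcpu_avg_usage value (%).
    The 90% default threshold is an arbitrary illustrative choice.
    """
    return [core for core, usage in sorted(vcpu_avg_usage.items())
            if usage >= threshold]


# Hypothetical readings for a 4-vCPU gateway.
samples = {0: 97.0, 1: 12.0, 2: 9.0, 3: 10.0}

aggregate = sum(samples.values()) / len(samples)
hot = find_hot_vcpus(samples)

print(aggregate)  # 32.0
print(hot)        # [0]
```

Here the aggregate of 32% looks healthy, while vCPU 0 is close to saturation.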

Network Metrics for Triggering Notifications or Other Actions

For Aviatrix Controller and Aviatrix gateways, you can configure alerts based on the following network metrics. Metrics are listed in alphabetical order, by the name used in the CoPilot UI.

Cumulative Counters

Cumulative counters represent running totals since the interface was last reset. CoPilot uses the difference between consecutive readings to compute per-second rate and percentage metrics.

Name (Network Metric)	Description	Internal Metric Name	Accessible by API
Bandwidth Egress Limit Exceeded	(AWS Only) The cumulative count of events where the outbound (egress) bandwidth allowance for the instance type was exceeded. Sourced from the Elastic Network Adapter (ENA) driver.	bandwidth_egress_limit_exceeded
Bandwidth Ingress Limit Exceeded	(AWS Only) The cumulative count of events where the inbound (ingress) bandwidth allowance for the instance type was exceeded. Sourced from the ENA driver.	bandwidth_ingress_limit_exceeded	✓
Collisions during Transmission	The cumulative count of collisions detected during packet transmission on the interface.	tx_colls
Compressed Packets Received	The cumulative count of compressed packets received by the interface.	rx_compressed
Compressed Packets Transmitted	The cumulative count of compressed packets transmitted by the interface.	tx_compressed
Conntrack Allowance Available	(AWS Only) The number of tracked connections that can still be established before the instance’s connection-tracking allowance is exhausted. Sourced from the ENA driver.	conntrack_allowance_available
Conntrack Limit Exceeded	(AWS Only) The cumulative count of events where the connection-tracking (conntrack) table limit for the instance type was exceeded. Sourced from the ENA driver.	conntrack_limit_exceeded
Errored Packets Received	The cumulative count of packets received with errors as reported by the network interface (e.g., CRC errors, framing errors).	rx_errs
Errored Packets Transmitted	The cumulative count of packets that encountered errors during transmission.	tx_errs
Linklocal Limit Exceeded	(AWS Only) The cumulative count of events where the link-local packet rate limit for the instance type was exceeded. Sourced from the ENA driver.	linklocal_limit_exceeded
Multicast Packets Received	The cumulative count of multicast packets received by the interface.	rx_multicast
Packets Dropped during Transmission	The cumulative count of outbound packets dropped by the interface, typically due to resource constraints such as transmit queue overflow.	tx_drop	✓
Packets Dropped while Receiving	The cumulative count of inbound packets dropped by the interface, typically due to resource limitations such as receive buffer overflow.	rx_drop	✓
PPS Limit Exceeded	(AWS Only) The cumulative count of events where the packets-per-second allowance for the instance type was exceeded. This is a single aggregate counter covering both inbound and outbound directions. Sourced from the ENA driver.	pps_limit_exceeded	✓
Received Bytes	The cumulative count of bytes received by the interface.	rx_bytes
Received Frames	The cumulative count of frame alignment errors on received packets.	rx_frame
Received Packets	The cumulative count of packets successfully received by the interface.	rx_packets
Receiver FIFO Frames	The cumulative count of FIFO buffer overflow events when receiving packets.	rx_fifo
Transmission FIFO Frames	The cumulative count of FIFO buffer underrun events when transmitting packets.	tx_fifo
Transmitted Bytes	The cumulative count of bytes transmitted by the interface.	tx_bytes
Transmitted Carrier Frames	The cumulative count of carrier sense errors encountered during transmission (e.g., loss of link signal).	tx_carrier
Transmitted Packets	The cumulative count of packets successfully transmitted by the interface.	tx_packets
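The conversion from cumulative counters to per-second rates can be sketched as a delta divided by the collection interval. This is a minimal illustration under stated assumptions: the per_second_rate helper, the 60-second interval, and the decision to return None when a counter goes backwards (for example, after an interface reset) are not taken from CoPilot.

```python
def per_second_rate(previous, current, interval_seconds):
    """Compute a per-second rate from two cumulative counter readings."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = current - previous
    if delta < 0:
        # Assumption: a decreasing counter means the interface was reset,
        # so no meaningful rate can be derived for this interval.
        return None
    return delta / interval_seconds


# Hypothetical rx_drop readings taken 60 seconds apart.
print(per_second_rate(1200, 1500, 60))  # 5.0
print(per_second_rate(1500, 40, 60))    # None
```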

Per-Second Rates

CoPilot computes per-second rates from cumulative counter deltas between consecutive collection intervals. Throughput rates (rate_sent, rate_received, rate_total) are reported in bits per second.

Name (Network Metric)	Description	Internal Metric Name	Accessible by API
Bandwidth Egress Limit Exceeded Rate	(AWS Only) The per-second rate of egress bandwidth-limit-exceeded events. Sourced from the ENA driver.	rate_bandwidth_egress_limit_exceeded
Bandwidth Ingress Limit Exceeded Rate	(AWS Only) The per-second rate of ingress bandwidth-limit-exceeded events. Sourced from the ENA driver.	rate_bandwidth_ingress_limit_exceeded
Collisions Rate during Transmission	The per-second rate of collisions during packet transmission.	rate_tx_colls
Compressed Packets Received Rate	The per-second rate of compressed packets received.	rate_rx_compressed
Compressed Packets Transmitted Rate	The per-second rate of compressed packets transmitted.	rate_tx_compressed
Conntrack Limit Exceeded Rate	(AWS Only) The per-second rate of connection-tracking limit-exceeded events.	rate_conntrack_limit_exceeded
Conntrack Usage Rate	(AWS Only) The rate at which connection-tracking capacity is being consumed, reported in packets per second. Only available on instances where the Conntrack Allowance Available metric is present.	conntrack_usage_rate
Drop Rate during Transmission	The per-second rate of packets dropped during transmission.	rate_tx_drop	✓
Drop Rate while Receiving	The per-second rate of packets dropped while receiving.	rate_rx_drop	✓
Errored Packets Received Rate	The per-second rate of packets received with errors.	rate_rx_errs
Errored Packets Transmitted Rate	The per-second rate of packet transmission errors.	rate_tx_errs
Limit Exceeded Rate (PPS) - AWS Only	(AWS Only) The per-second rate of packets-per-second limit-exceeded events on the instance.	rate_pps_limit_exceeded
Linklocal Limit Exceeded Rate	(AWS Only) The per-second rate of link-local packet rate limit-exceeded events.	rate_linklocal_limit_exceeded
Multicast Packets Received Rate	The per-second rate of multicast packets received.	rate_rx_multicast
Packet Drop Rate	The per-second rate of all dropped packets across both directions. Computed as the sum of Drop Rate during Transmission and Drop Rate while Receiving.	rate_pkt_drop	✓
Packet Failure Rate	The aggregate per-second rate of all network failure events. This is the sum of 10 individual failure-type rates: bandwidth egress/ingress limit exceeded, conntrack limit exceeded, linklocal limit exceeded, PPS limit exceeded, rx/tx drops, rx/tx errors, and received frame errors.	rate_pkt_fail
Peak Received Rate	The peak inbound throughput in bits per second as reported by the gateway for the collection interval.	rate_peak_received
Peak Total Rate	The peak bidirectional throughput in bits per second as reported by the gateway for the collection interval.	rate_peak_total
Peak Transmitted Rate	The peak outbound throughput in bits per second as reported by the gateway for the collection interval.	rate_peak_sent
Received Frames Rate	The per-second rate of frame alignment errors on received packets.	rate_rx_frame
Received Rate	The inbound throughput in bits per second on the interface. Computed from the byte counter delta.	rate_received	✓
Received Rate (PPS)	The inbound packet throughput in packets per second.	pkt_rx_rate
Receiver FIFO Frames Rate	The per-second rate of receive FIFO buffer overflow events.	rate_rx_fifo
Total Attempted Rate	The total bidirectional packet rate including both successfully processed packets and all failure events, in packets per second. Computed as Total Rate (in packets) + Packet Failure Rate. Used as the denominator for all percentage metrics.	rate_pkt_attempted
Total Rate	The total bidirectional throughput in bits per second on the interface. Computed as the sum of Received Rate and Transmitted Rate.	rate_total	✓
Total Rate (in packets)	The total bidirectional packet throughput in packets per second. Computed as the sum of Received Rate (PPS) and Transmitted Rate (PPS). Instance size determines how many packets per second a gateway can handle.	pkt_rate_total
Transmission FIFO Frames Rate	The per-second rate of transmit FIFO buffer underrun events.	rate_tx_fifo
Transmitted Carrier Frames Rate	The per-second rate of carrier sense errors during transmission.	rate_tx_carrier
Transmitted Rate	The outbound throughput in bits per second on the interface. Computed from the byte counter delta.	rate_sent	✓
Transmitted Rate (PPS)	The outbound packet throughput in packets per second.	pkt_tx_rate
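The relationships among the rates in this table can be sketched as follows: throughput is a byte-counter delta converted to bits per second, Packet Failure Rate sums the 10 failure-type rates, and Total Attempted Rate adds that sum to the packet rates in both directions. The function names, the dictionary input, and the sample values are assumptions for illustration; only the metric names come from the table.

```python
# The 10 failure-type rates that make up rate_pkt_fail, per the table above.
FAILURE_RATE_NAMES = (
    "rate_bandwidth_egress_limit_exceeded",
    "rate_bandwidth_ingress_limit_exceeded",
    "rate_conntrack_limit_exceeded",
    "rate_linklocal_limit_exceeded",
    "rate_pps_limit_exceeded",
    "rate_rx_drop",
    "rate_tx_drop",
    "rate_rx_errs",
    "rate_tx_errs",
    "rate_rx_frame",
)


def bits_per_second(previous_bytes, current_bytes, interval_seconds):
    """Throughput from a byte-counter delta (8 bits per byte)."""
    return (current_bytes - previous_bytes) * 8 / interval_seconds


def packet_failure_rate(rates):
    """Sum the failure-type rates; missing entries count as 0 (assumption)."""
    return sum(rates.get(name, 0.0) for name in FAILURE_RATE_NAMES)


def total_attempted_rate(pkt_rx_rate, pkt_tx_rate, rate_pkt_fail):
    """Total Rate (in packets) plus Packet Failure Rate."""
    return pkt_rx_rate + pkt_tx_rate + rate_pkt_fail


# Hypothetical readings over a 60-second interval.
rate_received = bits_per_second(0, 75_000_000, 60)
rates = {"rate_rx_drop": 3.0, "rate_tx_errs": 1.0, "rate_pps_limit_exceeded": 6.0}
rate_pkt_fail = packet_failure_rate(rates)
rate_pkt_attempted = total_attempted_rate(5000.0, 4990.0, rate_pkt_fail)

print(rate_received)       # 10000000.0
print(rate_pkt_fail)       # 10.0
print(rate_pkt_attempted)  # 10000.0
```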

Percentage Metrics

Percentage metrics express a specific failure rate as a fraction of total attempted packets. All percentage metrics use the same bidirectional denominator: rate_pkt_attempted (see About Percentage Metrics).

Name (Network Metric)	Description	Internal Metric Name
Bandwidth Egress Limit Exceeded (%)	(AWS Only) Egress bandwidth-limit-exceeded events as a percentage of total attempted packets.	per_bandwidth_egress_limit
Bandwidth Ingress Limit Exceeded (%)	(AWS Only) Ingress bandwidth-limit-exceeded events as a percentage of total attempted packets.	per_bandwidth_ingress_limit_exceeded
Conntrack Limit Exceeded (%)	(AWS Only) Connection-tracking limit-exceeded events as a percentage of total attempted packets.	per_conntrack_limit_exceeded
Interface Drops during Transmission (%)	Packets dropped during transmission as a percentage of total attempted packets.	per_tx_drop
Interface Drops while Receiving (%)	Packets dropped while receiving as a percentage of total attempted packets.	per_rx_drop
Interface Errors during Transmission (%)	Transmission errors as a percentage of total attempted packets.	per_tx_errs
Interface Errors while Receiving (%)	Receive errors as a percentage of total attempted packets.	per_rx_errs
Linklocal Limit Exceeded (%)	(AWS Only) Link-local rate limit-exceeded events as a percentage of total attempted packets.	per_linklocal_limit_exceeded
Packet Drop (%)	All dropped packets (rx + tx) as a percentage of total attempted packets.	per_pkt_drop
Packet Failure (%)	All failure events (drops, errors, and limit-exceeded events) as a percentage of total attempted packets. This is the broadest failure percentage metric.	per_pkt_fail
PPS Limit Exceeded Drop (%)	(AWS Only) Packets-per-second limit-exceeded events as a percentage of total attempted packets.	per_pps_limit_exceeded
