diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst new file mode 100644 index 000000000000..4cd8e869762b --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst @@ -0,0 +1,1302 @@ +.. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +.. include:: + +================ +Ethtool counters +================ + +:Copyright: |copy| 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. + +Contents +======== + +- `Overview`_ +- `Groups`_ +- `Types`_ +- `Descriptions`_ + +Overview +======== + +There are several counter groups based on where the counter is being counted. In +addition, each group of counters may have different counter types. + +These counter groups are based on which component in a networking setup, +illustrated below, that they describe:: + + ---------------------------------------- + | | + ---------------------------------------- ---------------------------------------- | + | Hypervisor | | VM | | + | | | | | + | ------------------- --------------- | | ------------------- --------------- | | + | | Ethernet driver | | RDMA driver | | | | Ethernet driver | | RDMA driver | | | + | ------------------- --------------- | | ------------------- --------------- | | + | | | | | | | | | + | ------------------- | | ------------------- | | + | | | | | |-- + ---------------------------------------- ---------------------------------------- + | | + ------------- ----------------------------- + | | + ------ ------ ------ ------ ------ ------ ------ + -----| PF |----------------------| VF |-| VF |-| VF |----- --| PF |--- --| PF |--- --| PF |--- + | ------ ------ ------ ------ | | ------ | | ------ | | ------ | + | | | | | | | | + | | | | | | | | + | | | | | | | | + | eSwitch | | eSwitch | | eSwitch | | eSwitch | + ---------------------------------------------------------- ----------- ----------- ----------- + 
------------------------------------------------------------------------------- + | | + | | + | Uplink (no counters) | + ------------------------------------------------------------------------------- + --------------------------------------------------------------- + | | + | | + | MPFS (no counters) | + --------------------------------------------------------------- + | + | + | Port + +Groups +====== + +Ring + Software counters populated by the driver stack. + +Netdev + An aggregation of software ring counters. + +vPort counters + Traffic counters and drops due to steering or no buffers. May indicate issues + with the NIC. These counters include Ethernet traffic counters (including Raw + Ethernet) and RDMA/RoCE traffic counters. + +Physical port counters + Counters that collect statistics about the PFs and VFs. May indicate issues + with the NIC, link, or network. This measuring point holds information on + standardized counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and + additional counters like flow control, FEC and more. Physical port counters + are not exposed to virtual machines. + +Priority Port Counters + A set of the physical port counters, per priority per port. + +Types +===== + +Counters are divided into three types. + +Traffic Informative Counters + Counters which count traffic. These counters can be used for load estimation + or for general debug. + +Traffic Acceleration Counters + Counters which count traffic that was accelerated by the Mellanox driver or by + hardware. The counters are an additional layer to the informative counter set, + and the same traffic is counted in both informative and acceleration counters. + +.. [#accel] Traffic acceleration counter. + +Error Counters + An increment of these counters might indicate a problem. Each of these counters + has an explanation and a corrective action. + +Statistics can be fetched via the `ip link` or `ethtool` commands. 
`ethtool` +provides more detailed information.:: + + ip -s link show + ethtool -S + +Descriptions +============ + +XSK, PTP, and QoS counters that are similar to counters defined previously will +not be separately listed. For example, `ptp_tx[i]_packets` will not be +explicitly documented since `tx[i]_packets` describes the behavior of both +counters, except `ptp_tx[i]_packets` is only counted when precision time +protocol is used. + +Ring / Netdev Counter +---------------------------- +The following counters are available per ring or software port. + +These counters provide information on the amount of traffic that was accelerated +by the NIC. The counters count the accelerated traffic in addition to the +standard counters, which also count it (i.e. accelerated traffic is counted twice). + +The counter names in the table below refer to both ring and port counters. The +notation for ring counters includes the [i] index without the brackets. The +notation for port counters doesn't include the [i]. A counter name +`rx[i]_packets` will be printed as `rx0_packets` for ring 0 and `rx_packets` for +the software port. + +.. flat-table:: Ring / Software Port Counter Table + :widths: 2 3 1 + + * - Counter + - Description + - Type + + * - `rx[i]_packets` + - The number of packets received on ring i. + - Informative + + * - `rx[i]_bytes` + - The number of bytes received on ring i. + - Informative + + * - `tx[i]_packets` + - The number of packets transmitted on ring i. + - Informative + + * - `tx[i]_bytes` + - The number of bytes transmitted on ring i. + - Informative + + * - `tx[i]_recover` + - The number of times the SQ was recovered. + - Error + + * - `tx[i]_cqes` + - The number of CQE events issued on the SQ of ring i. + - Informative + + * - `tx[i]_cqe_err` + - The number of error CQEs encountered on the SQ for ring i. + - Error + + * - `tx[i]_tso_packets` + - The number of TSO packets transmitted on ring i [#accel]_. 
+ - Acceleration + + * - `tx[i]_tso_bytes` + - The number of TSO bytes transmitted on ring i [#accel]_. + - Acceleration + + * - `tx[i]_tso_inner_packets` + - The number of TSO packets indicated to carry inner encapsulation that + were transmitted on ring i [#accel]_. + - Acceleration + + * - `tx[i]_tso_inner_bytes` + - The number of TSO bytes indicated to carry inner encapsulation that + were transmitted on ring i [#accel]_. + - Acceleration + + * - `rx[i]_gro_packets` + - Number of received packets processed using hardware-accelerated GRO. The + number of hardware GRO offloaded packets received on ring i. + - Acceleration + + * - `rx[i]_gro_bytes` + - Number of received bytes processed using hardware-accelerated GRO. The + number of hardware GRO offloaded bytes received on ring i. + - Acceleration + + * - `rx[i]_gro_skbs` + - The number of receive SKBs constructed while performing + hardware-accelerated GRO. + - Informative + + * - `rx[i]_gro_match_packets` + - Number of received packets processed using hardware-accelerated GRO that + met the flow table match criteria. + - Informative + + * - `rx[i]_gro_large_hds` + - Number of received packets using hardware-accelerated GRO that have large + headers that require additional memory to be allocated. + - Informative + + * - `rx[i]_lro_packets` + - The number of LRO packets received on ring i [#accel]_. + - Acceleration + + * - `rx[i]_lro_bytes` + - The number of LRO bytes received on ring i [#accel]_. + - Acceleration + + * - `rx[i]_ecn_mark` + - The number of received packets where the ECN mark was turned on. + - Informative + + * - `rx_oversize_pkts_buffer` + - The number of received packets dropped because their length, on arrival + at the RQ, exceeded the software buffer size allocated by the device for + incoming traffic. It might imply that the device MTU is larger than the + software buffer size. 
+ - Error + + * - `rx_oversize_pkts_sw_drop` + - Number of received packets dropped in software because the CQE data is + larger than the MTU size. + - Error + + * - `rx[i]_csum_unnecessary` + - Packets received with a `CHECKSUM_UNNECESSARY` on ring i [#accel]_. + - Acceleration + + * - `rx[i]_csum_unnecessary_inner` + - Packets received with inner encapsulation with a `CHECKSUM_UNNECESSARY` + on ring i [#accel]_. + - Acceleration + + * - `rx[i]_csum_none` + - Packets received with a `CHECKSUM_NONE` on ring i [#accel]_. + - Acceleration + + * - `rx[i]_csum_complete` + - Packets received with a `CHECKSUM_COMPLETE` on ring i [#accel]_. + - Acceleration + + * - `rx[i]_csum_complete_tail` + - Number of received packets that had checksum calculation computed, + potentially needed padding, and were able to do so with + `CHECKSUM_PARTIAL`. + - Informative + + * - `rx[i]_csum_complete_tail_slow` + - Number of received packets that need padding larger than eight bytes for + the checksum. + - Informative + + * - `tx[i]_csum_partial` + - Packets transmitted with a `CHECKSUM_PARTIAL` on ring i [#accel]_. + - Acceleration + + * - `tx[i]_csum_partial_inner` + - Packets transmitted with inner encapsulation with a `CHECKSUM_PARTIAL` on + ring i [#accel]_. + - Acceleration + + * - `tx[i]_csum_none` + - Packets transmitted with no hardware checksum acceleration on ring i. + - Informative + + * - `tx[i]_stopped` / `tx_queue_stopped` [#ring_global]_ + - Events where SQ was full on ring i. If this counter is increased, check + the amount of buffers allocated for transmission. + - Informative + + * - `tx[i]_wake` / `tx_queue_wake` [#ring_global]_ + - Events where SQ was full and has become not full on ring i. + - Informative + + * - `tx[i]_dropped` / `tx_queue_dropped` [#ring_global]_ + - Packets transmitted that were dropped due to DMA mapping failure on + ring i. If this counter is increased, check the amount of buffers + allocated for transmission. 
+ - Error + + * - `tx[i]_nop` + - The number of NOP WQEs (empty WQEs) inserted into the SQ (related to + ring i) on reaching the end of the cyclic buffer. When nearing the end + of the cyclic buffer, the driver may add these empty WQEs to avoid + handling a state where a WQE starts at the end of the queue and ends at + the beginning of the queue. This is a normal condition. + - Informative + + * - `tx[i]_added_vlan_packets` + - The number of packets sent where VLAN tag insertion was offloaded to the + hardware. + - Acceleration + + * - `rx[i]_removed_vlan_packets` + - The number of packets received where VLAN tag stripping was offloaded to + the hardware. + - Acceleration + + * - `rx[i]_wqe_err` + - The number of wrong opcodes received on ring i. + - Error + + * - `rx[i]_mpwqe_frag` + - The number of WQEs for which allocating a compound page failed, so + fragmented MPWQEs (Multi Packet WQEs) were used on ring i. If this + counter rises, it may suggest that there is not enough memory for large + pages, so the driver allocated fragmented pages. This is not an abnormal + condition. + - Informative + + * - `rx[i]_mpwqe_filler_cqes` + - The number of filler CQE events that were issued on ring i. + - Informative + + * - `rx[i]_mpwqe_filler_strides` + - The number of strides consumed by filler CQEs on ring i. + - Informative + + * - `tx[i]_mpwqe_blks` + - The number of send blocks processed from Multi-Packet WQEs (mpwqe). + - Informative + + * - `tx[i]_mpwqe_pkts` + - The number of send packets processed from Multi-Packet WQEs (mpwqe). + - Informative + + * - `rx[i]_cqe_compress_blks` + - The number of receive blocks with CQE compression on ring i [#accel]_. + - Acceleration + + * - `rx[i]_cqe_compress_pkts` + - The number of receive packets with CQE compression on ring i [#accel]_. + - Acceleration + + * - `rx[i]_cache_reuse` + - The number of events of successful reuse of a page from a driver's + internal page cache. 
+ - Acceleration + + * - `rx[i]_cache_full` + - The number of events of full internal page cache where driver can't put a + page back to the cache for recycling (page will be freed). + - Acceleration + + * - `rx[i]_cache_empty` + - The number of events where cache was empty - no page to give. Driver + shall allocate new page. + - Acceleration + + * - `rx[i]_cache_busy` + - The number of events where cache head was busy and cannot be recycled. + Driver allocated new page. + - Acceleration + + * - `rx[i]_cache_waive` + - The number of cache evacuation. This can occur due to page move to + another NUMA node or page was pfmemalloc-ed and should be freed as soon + as possible. + - Acceleration + + * - `rx[i]_arfs_err` + - Number of flow rules that failed to be added to the flow table. + - Error + + * - `rx[i]_recover` + - The number of times the RQ was recovered. + - Error + + * - `tx[i]_xmit_more` + - The number of packets sent with `xmit_more` indication set on the skbuff + (no doorbell). + - Acceleration + + * - `ch[i]_poll` + - The number of invocations of NAPI poll of channel i. + - Informative + + * - `ch[i]_arm` + - The number of times the NAPI poll function completed and armed the + completion queues on channel i. + - Informative + + * - `ch[i]_aff_change` + - The number of times the NAPI poll function explicitly stopped execution + on a CPU due to a change in affinity, on channel i. + - Informative + + * - `ch[i]_events` + - The number of hard interrupt events on the completion queues of channel i. + - Informative + + * - `ch[i]_eq_rearm` + - The number of times the EQ was recovered. + - Error + + * - `ch[i]_force_irq` + - Number of times NAPI is triggered by XSK wakeups by posting a NOP to + ICOSQ. + - Acceleration + + * - `rx[i]_congst_umr` + - The number of times an outstanding UMR request is delayed due to + congestion, on ring i. + - Informative + + * - `rx_pp_alloc_fast` + - Number of successful fast path allocations. 
+ - Informative + + * - `rx_pp_alloc_slow` + - Number of slow path order-0 allocations. + - Informative + + * - `rx_pp_alloc_slow_high_order` + - Number of slow path high order allocations. + - Informative + + * - `rx_pp_alloc_empty` + - Counter is incremented when the ptr ring is empty, so a slow path + allocation was forced. + - Informative + + * - `rx_pp_alloc_refill` + - Counter is incremented when an allocation triggered a refill of the + cache. + - Informative + + * - `rx_pp_alloc_waive` + - Counter is incremented when pages obtained from the ptr ring cannot be + added to the cache due to a NUMA mismatch. + - Informative + + * - `rx_pp_recycle_cached` + - Counter is incremented when recycling placed a page in the page pool + cache. + - Informative + + * - `rx_pp_recycle_cache_full` + - Counter is incremented when the page pool cache was full. + - Informative + + * - `rx_pp_recycle_ring` + - Counter is incremented when a page is placed into the ptr ring. + - Informative + + * - `rx_pp_recycle_ring_full` + - Counter is incremented when a page is released from the page pool + because the ptr ring was full. + - Informative + + * - `rx_pp_recycle_released_ref` + - Counter is incremented when a page is released (and not recycled) + because refcnt > 1. + - Informative + + * - `rx[i]_xsk_buff_alloc_err` + - The number of times allocating an skb or XSK buffer failed in the XSK RQ + context. + - Error + + * - `rx[i]_xsk_arfs_err` + - aRFS (accelerated Receive Flow Steering) does not occur in the XSK RQ + context, so this counter should never increment. + - Error + + * - `rx[i]_xdp_tx_xmit` + - The number of packets forwarded back to the port due to XDP program + `XDP_TX` action (bouncing). These packets are not counted by other + software counters. These packets are counted by physical port and vPort + counters. + - Informative + + * - `rx[i]_xdp_tx_mpwqe` + - Number of multi-packet WQEs transmitted by the netdev and `XDP_TX`-ed by + the netdev during the RQ context. 
+ - Acceleration + + * - `rx[i]_xdp_tx_inlnw` + - Number of WQE data segments transmitted where the data could be inlined + in the WQE and then `XDP_TX`-ed during the RQ context. + - Acceleration + + * - `rx[i]_xdp_tx_nops` + - Number of NOP WQEBBs (WQE building blocks) posted to the XDP SQ. + - Acceleration + + * - `rx[i]_xdp_tx_full` + - The number of packets that should have been forwarded back to the port + due to `XDP_TX` action but were dropped due to a full tx queue. These + packets are not counted by other software counters. These packets are + counted by physical port and vPort counters. You may open more rx queues + and spread rx traffic over all queues and/or increase the rx ring size. + - Error + + * - `rx[i]_xdp_tx_err` + - The number of times an `XDP_TX` error, such as frame too long or frame + too short, occurred on the `XDP_TX` ring of the RX ring. + - Error + + * - `rx[i]_xdp_tx_cqes` / `rx_xdp_tx_cqe` [#ring_global]_ + - The number of completions received on the CQ of the `XDP_TX` ring. + - Informative + + * - `rx[i]_xdp_drop` + - The number of packets dropped due to XDP program `XDP_DROP` action. These + packets are not counted by other software counters. These packets are + counted by physical port and vPort counters. + - Informative + + * - `rx[i]_xdp_redirect` + - The number of times an XDP redirect action was triggered on ring i. + - Acceleration + + * - `tx[i]_xdp_xmit` + - The number of packets redirected to the interface (due to XDP redirect). + These packets are not counted by other software counters. These packets + are counted by physical port and vPort counters. + - Informative + + * - `tx[i]_xdp_full` + - The number of packets redirected to the interface (due to XDP redirect), + but dropped due to a full tx queue. These packets are not counted by + other software counters. You may enlarge tx queues. 
+ - Informative + + * - `tx[i]_xdp_mpwqe` + - Number of multi-packet WQEs offloaded onto the NIC that were + `XDP_REDIRECT`-ed from other netdevs. + - Acceleration + + * - `tx[i]_xdp_inlnw` + - Number of WQE data segments where the data could be inlined in the WQE + where the data segments were `XDP_REDIRECT`-ed from other netdevs. + - Acceleration + + * - `tx[i]_xdp_nops` + - Number of NOP WQEBBs (WQE building blocks) posted to the SQ that were + `XDP_REDIRECT`-ed from other netdevs. + - Acceleration + + * - `tx[i]_xdp_err` + - The number of packets redirected to the interface(due to XDP redirect) + but were dropped due to error such as frame too long and frame too short. + - Error + + * - `tx[i]_xdp_cqes` + - The number of completions received for packets redirected to the + interface(due to XDP redirect) on the CQ. + - Informative + + * - `tx[i]_xsk_xmit` + - The number of packets transmitted using XSK zerocopy functionality. + - Acceleration + + * - `tx[i]_xsk_mpwqe` + - Number of multi-packet WQEs offloaded onto the NIC that were + `XDP_REDIRECT`-ed from other netdevs. + - Acceleration + + * - `tx[i]_xsk_inlnw` + - Number of WQE data segments where the data could be inlined in the WQE + that are transmitted using XSK zerocopy. + - Acceleration + + * - `tx[i]_xsk_full` + - Number of times doorbell is rung in XSK zerocopy mode when SQ is full. + - Error + + * - `tx[i]_xsk_err` + - Number of errors that occurred in XSK zerocopy mode such as if the data + size is larger than the MTU size. + - Error + + * - `tx[i]_xsk_cqes` + - Number of CQEs processed in XSK zerocopy mode. + - Acceleration + + * - `tx_tls_ctx` + - Number of TLS TX HW offload contexts added to device for encryption. + - Acceleration + + * - `tx_tls_del` + - Number of TLS TX HW offload contexts removed from device (connection + closed). + - Acceleration + + * - `tx_tls_pool_alloc` + - Number of times a unit of work is successfully allocated in the TLS HW + offload pool. 
+ - Acceleration + + * - `tx_tls_pool_free` + - Number of times a unit of work is freed in the TLS HW offload pool. + - Acceleration + + * - `rx_tls_ctx` + - Number of TLS RX HW offload contexts added to device for decryption. + - Acceleration + + * - `rx_tls_del` + - Number of TLS RX HW offload contexts deleted from device (connection has + finished). + - Acceleration + + * - `rx[i]_tls_decrypted_packets` + - Number of successfully decrypted RX packets which were part of a TLS + stream. + - Acceleration + + * - `rx[i]_tls_decrypted_bytes` + - Number of TLS payload bytes in RX packets which were successfully + decrypted. + - Acceleration + + * - `rx[i]_tls_resync_req_pkt` + - Number of received TLS packets with a resync request. + - Acceleration + + * - `rx[i]_tls_resync_req_start` + - Number of times the TLS async resync request was started. + - Acceleration + + * - `rx[i]_tls_resync_req_end` + - Number of times the TLS async resync request properly ended with + providing the HW tracked tcp-seq. + - Acceleration + + * - `rx[i]_tls_resync_req_skip` + - Number of times the TLS async resync request procedure was started but + not properly ended. + - Error + + * - `rx[i]_tls_resync_res_ok` + - Number of times the TLS resync response call to the driver was + successfully handled. + - Acceleration + + * - `rx[i]_tls_resync_res_retry` + - Number of times the TLS resync response call to the driver was + reattempted when ICOSQ is full. + - Error + + * - `rx[i]_tls_resync_res_skip` + - Number of times the TLS resync response call to the driver was terminated + unsuccessfully. + - Error + + * - `rx[i]_tls_err` + - Number of times when CQE TLS offload was problematic. + - Error + + * - `tx[i]_tls_encrypted_packets` + - The number of send packets that are TLS encrypted by the kernel. + - Acceleration + + * - `tx[i]_tls_encrypted_bytes` + - The number of send bytes that are TLS encrypted by the kernel. 
+ - Acceleration + + * - `tx[i]_tls_ooo` + - Number of times out of order TLS SQE fragments were handled on ring i. + - Acceleration + + * - `tx[i]_tls_dump_packets` + - Number of TLS decrypted packets copied over from NIC over DMA. + - Acceleration + + * - `tx[i]_tls_dump_bytes` + - Number of TLS decrypted bytes copied over from NIC over DMA. + - Acceleration + + * - `tx[i]_tls_resync_bytes` + - Number of TLS bytes requested to be resynchronized in order to be + decrypted. + - Acceleration + + * - `tx[i]_tls_skip_no_sync_data` + - Number of TLS send data that can safely be skipped / do not need to be + decrypted. + - Acceleration + + * - `tx[i]_tls_drop_no_sync_data` + - Number of TLS send data that were dropped due to retransmission of TLS + data. + - Acceleration + + * - `ptp_cq[i]_abort` + - Number of times a CQE has to be skipped in precision time protocol due to + a skew between the port timestamp and CQE timestamp being greater than + 128 seconds. + - Error + + * - `ptp_cq[i]_abort_abs_diff_ns` + - Accumulation of time differences between the port timestamp and CQE + timestamp when the difference is greater than 128 seconds in precision + time protocol. + - Error + +.. [#ring_global] The corresponding ring and global counters do not share the + same name (i.e. do not follow the common naming scheme). + +vPort Counters +-------------- +Counters on the NIC port that is connected to an eSwitch. + +.. flat-table:: vPort Counter Table + :widths: 2 3 1 + + * - Counter + - Description + - Type + + * - `rx_vport_unicast_packets` + - Unicast packets received, steered to a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. + - Informative + + * - `rx_vport_unicast_bytes` + - Unicast bytes received, steered to a port including Raw Ethernet QP/DPDK + traffic, excluding RDMA traffic. + - Informative + + * - `tx_vport_unicast_packets` + - Unicast packets transmitted, steered from a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. 
+ - Informative + + * - `tx_vport_unicast_bytes` + - Unicast bytes transmitted, steered from a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. + - Informative + + * - `rx_vport_multicast_packets` + - Multicast packets received, steered to a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. + - Informative + + * - `rx_vport_multicast_bytes` + - Multicast bytes received, steered to a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. + - Informative + + * - `tx_vport_multicast_packets` + - Multicast packets transmitted, steered from a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. + - Informative + + * - `tx_vport_multicast_bytes` + - Multicast bytes transmitted, steered from a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. + - Informative + + * - `rx_vport_broadcast_packets` + - Broadcast packets received, steered to a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. + - Informative + + * - `rx_vport_broadcast_bytes` + - Broadcast bytes received, steered to a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. + - Informative + + * - `tx_vport_broadcast_packets` + - Broadcast packets transmitted, steered from a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. + - Informative + + * - `tx_vport_broadcast_bytes` + - Broadcast bytes transmitted, steered from a port including Raw Ethernet + QP/DPDK traffic, excluding RDMA traffic. + - Informative + + * - `rx_vport_rdma_unicast_packets` + - RDMA unicast packets received, steered to a port (counter counts + RoCE/UD/RC traffic) [#accel]_. + - Acceleration + + * - `rx_vport_rdma_unicast_bytes` + - RDMA unicast bytes received, steered to a port (counter counts + RoCE/UD/RC traffic) [#accel]_. 
+ - Acceleration + + * - `tx_vport_rdma_unicast_packets` + - RDMA unicast packets transmitted, steered from a port (counter counts + RoCE/UD/RC traffic) [#accel]_. + - Acceleration + + * - `tx_vport_rdma_unicast_bytes` + - RDMA unicast bytes transmitted, steered from a port (counter counts + RoCE/UD/RC traffic) [#accel]_. + - Acceleration + + * - `rx_vport_rdma_multicast_packets` + - RDMA multicast packets received, steered to a port (counter counts + RoCE/UD/RC traffic) [#accel]_. + - Acceleration + + * - `rx_vport_rdma_multicast_bytes` + - RDMA multicast bytes received, steered to a port (counter counts + RoCE/UD/RC traffic) [#accel]_. + - Acceleration + + * - `tx_vport_rdma_multicast_packets` + - RDMA multicast packets transmitted, steered from a port (counter counts + RoCE/UD/RC traffic) [#accel]_. + - Acceleration + + * - `tx_vport_rdma_multicast_bytes` + - RDMA multicast bytes transmitted, steered from a port (counter counts + RoCE/UD/RC traffic) [#accel]_. + - Acceleration + + * - `rx_steer_missed_packets` + - Number of packets that were received by the NIC but discarded because + they did not match any flow in the NIC flow table. + - Error + + * - `rx_packets` + - Representor only: packets received that were handled by the hypervisor. + - Informative + + * - `rx_bytes` + - Representor only: bytes received that were handled by the hypervisor. + - Informative + + * - `tx_packets` + - Representor only: packets transmitted that were handled by the + hypervisor. + - Informative + + * - `tx_bytes` + - Representor only: bytes transmitted that were handled by the hypervisor. + - Informative + + * - `dev_internal_queue_oob` + - The number of dropped packets due to lack of receive WQEs for an internal + device RQ. + - Error + +Physical Port Counters +---------------------- +The physical port counters are the counters on the external port connecting the +adapter to the network. 
This measuring point holds information on standardized +counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and additional counters +like flow control, FEC and more. + +.. flat-table:: Physical Port Counter Table + :widths: 2 3 1 + + * - Counter + - Description + - Type + + * - `rx_packets_phy` + - The number of packets received on the physical port. This counter does + not include packets that were discarded due to FCS, frame size and + similar errors. + - Informative + + * - `tx_packets_phy` + - The number of packets transmitted on the physical port. + - Informative + + * - `rx_bytes_phy` + - The number of bytes received on the physical port, including Ethernet + header and FCS. + - Informative + + * - `tx_bytes_phy` + - The number of bytes transmitted on the physical port. + - Informative + + * - `rx_multicast_phy` + - The number of multicast packets received on the physical port. + - Informative + + * - `tx_multicast_phy` + - The number of multicast packets transmitted on the physical port. + - Informative + + * - `rx_broadcast_phy` + - The number of broadcast packets received on the physical port. + - Informative + + * - `tx_broadcast_phy` + - The number of broadcast packets transmitted on the physical port. + - Informative + + * - `rx_crc_errors_phy` + - The number of received packets dropped due to FCS (Frame Check Sequence) + error on the physical port. If this counter increases at a high rate, + check the link quality using the `rx_symbol_err_phy` and + `rx_corrected_bits_phy` counters below. + - Error + + * - `rx_in_range_len_errors_phy` + - The number of received packets dropped due to length/type errors on a + physical port. + - Error + + * - `rx_out_of_range_len_phy` + - The number of received packets dropped due to length greater than allowed + on a physical port. If this counter is increasing, it implies that the + peer connected to the adapter has a larger MTU configured. Using the same + MTU configuration should resolve this issue. 
+ - Error + + * - `rx_oversize_pkts_phy` + - The number of received packets dropped because their length exceeds the + MTU size on a physical port. If this counter is increasing, it implies + that the peer connected to the adapter has a larger MTU configured. Using + the same MTU configuration should resolve this issue. + - Error + + * - `rx_symbol_err_phy` + - The number of received packets dropped due to physical coding errors + (symbol errors) on a physical port. + - Error + + * - `rx_mac_control_phy` + - The number of MAC control packets received on the physical port. + - Informative + + * - `tx_mac_control_phy` + - The number of MAC control packets transmitted on the physical port. + - Informative + + * - `rx_pause_ctrl_phy` + - The number of link layer pause packets received on a physical port. If + this counter is increasing, it implies that the network is congested and + cannot absorb the traffic coming from the adapter. + - Informative + + * - `tx_pause_ctrl_phy` + - The number of link layer pause packets transmitted on a physical port. If + this counter is increasing, it implies that the NIC is congested and + cannot absorb the traffic coming from the network. + - Informative + + * - `rx_unsupported_op_phy` + - The number of MAC control packets received with unsupported opcode on a + physical port. + - Error + + * - `rx_discards_phy` + - The number of received packets dropped due to lack of buffers on a + physical port. If this counter is increasing, it implies that the adapter + is congested and cannot absorb the traffic coming from the network. + - Error + + * - `tx_discards_phy` + - The number of packets which were discarded on transmission, even though + no errors were detected. The drop might occur due to link in down state, + head of line drop, pause from the network, etc. + - Error + + * - `tx_errors_phy` + - The number of transmitted packets dropped because their length exceeds + the MTU size on a physical port. 
+ - Error + + * - `rx_undersize_pkts_phy` + - The number of received packets dropped due to length which is shorter + than 64 bytes on a physical port. If this counter is increasing, it + implies that the peer connected to the adapter has a non-standard MTU + configured or a malformed packet had arrived. + - Error + + * - `rx_fragments_phy` + - The number of received packets dropped due to a length which is shorter + than 64 bytes and has FCS error on a physical port. If this counter is + increasing, it implies that the peer connected to the adapter has a + non-standard MTU configured. + - Error + + * - `rx_jabbers_phy` + - The number of received packets dropped due to a length which is longer + than 64 bytes and had FCS error on a physical port. + - Error + + * - `rx_64_bytes_phy` + - The number of packets received on the physical port with size of 64 bytes. + - Informative + + * - `rx_65_to_127_bytes_phy` + - The number of packets received on the physical port with size of 65 to + 127 bytes. + - Informative + + * - `rx_128_to_255_bytes_phy` + - The number of packets received on the physical port with size of 128 to + 255 bytes. + - Informative + + * - `rx_256_to_511_bytes_phy` + - The number of packets received on the physical port with size of 256 to + 511 bytes. + - Informative + + * - `rx_512_to_1023_bytes_phy` + - The number of packets received on the physical port with size of 512 to + 1023 bytes. + - Informative + + * - `rx_1024_to_1518_bytes_phy` + - The number of packets received on the physical port with size of 1024 to + 1518 bytes. + - Informative + + * - `rx_1519_to_2047_bytes_phy` + - The number of packets received on the physical port with size of 1519 to + 2047 bytes. + - Informative + + * - `rx_2048_to_4095_bytes_phy` + - The number of packets received on the physical port with size of 2048 to + 4095 bytes. + - Informative + + * - `rx_4096_to_8191_bytes_phy` + - The number of packets received on the physical port with size of 4096 to + 8191 bytes. 
+ - Informative + + * - `rx_8192_to_10239_bytes_phy` + - The number of packets received on the physical port with size of 8192 to + 10239 bytes. + - Informative + + * - `link_down_events_phy` + - The number of times the link operative state changed to down. If + this counter is increasing, it may indicate port flapping. You may + need to replace the cable/transceiver. + - Error + + * - `rx_out_of_buffer` + - The number of times the receive queue had no software buffers allocated for the + adapter's incoming traffic. + - Error + + * - `module_bus_stuck` + - The number of times that a short-wire on the module's I\ :sup:`2`\C bus + (data or clock) was detected. You may need to replace the cable/transceiver. + - Error + + * - `module_high_temp` + - The number of times that the module temperature was too high. If this + issue persists, you may need to check the ambient temperature or replace + the cable/transceiver module. + - Error + + * - `module_bad_shorted` + - The number of times that the module cables were shorted. You may need to + replace the cable/transceiver module. + - Error + + * - `module_unplug` + - The number of times that the module was ejected. + - Informative + + * - `rx_buffer_passed_thres_phy` + - The number of events where the port receive buffer was over 85% full. + - Informative + + * - `tx_pause_storm_warning_events` + - The number of times the device was sending pauses for a long period of + time. + - Informative + + * - `tx_pause_storm_error_events` + - The number of times the device was sending pauses for a long period of + time, reaching the timeout and disabling transmission of pause frames. During + the period when pause frames were disabled, drops could have + occurred. + - Error + + * - `rx[i]_buff_alloc_err` + - The number of times a buffer allocation for a received packet (or SKB) failed on ring i.
+ - Error + + * - `rx_bits_phy` + - This counter provides information on the total amount of traffic that + could have been received and can be used as a guideline to measure the + ratio of errored traffic in `rx_pcs_symbol_err_phy` and + `rx_corrected_bits_phy`. + - Informative + + * - `rx_pcs_symbol_err_phy` + - The number of symbol errors that were not corrected by the + FEC correction algorithm, or that occurred while the FEC algorithm was not active on this + interface. If this counter is increasing, it implies that the link + between the NIC and the network is suffering from high BER, and that + traffic is lost. You may need to replace the cable/transceiver. The error + rate is the number of `rx_pcs_symbol_err_phy` divided by the number of + `rx_bits_phy` over a specific time frame. + - Error + + * - `rx_corrected_bits_phy` + - The number of corrected bits on this port according to active FEC + (RS/FC). If this counter is increasing, it implies that the link between + the NIC and the network is suffering from high BER. The corrected bit + rate is the number of `rx_corrected_bits_phy` divided by the number of + `rx_bits_phy` over a specific time frame. + - Error + + * - `rx_err_lane_[l]_phy` + - The number of physical raw errors on lane index l. + The counter counts errors before FEC corrections. If this counter is + increasing, it implies that the link between the NIC and the network is + suffering from high BER, and that traffic might be lost. You may need to + replace the cable/transceiver. Check this counter together with + `rx_corrected_bits_phy`. + - Error + + * - `rx_global_pause` + - The number of pause packets received on the physical port. If this + counter is increasing, it implies that the network is congested and + cannot absorb the traffic coming from the adapter. Note: This counter is + only enabled when global pause mode is enabled.
+ - Informative + + * - `rx_global_pause_duration` + - The duration of pause received (in microseconds) on the physical port. The + counter represents the time the port did not send any traffic. If this + counter is increasing, it implies that the network is congested and + cannot absorb the traffic coming from the adapter. Note: This counter is + only enabled when global pause mode is enabled. + - Informative + + * - `tx_global_pause` + - The number of pause packets transmitted on a physical port. If this + counter is increasing, it implies that the adapter is congested and + cannot absorb the traffic coming from the network. Note: This counter is + only enabled when global pause mode is enabled. + - Informative + + * - `tx_global_pause_duration` + - The duration of pause transmitted (in microseconds) on the physical port. + Note: This counter is only enabled when global pause mode is enabled. + - Informative + + * - `rx_global_pause_transition` + - The number of times a transition from Xoff to Xon on the physical port + has occurred. Note: This counter is only enabled when global pause mode + is enabled. + - Informative + + * - `rx_if_down_packets` + - The number of received packets that were dropped due to interface down. + - Informative + +Priority Port Counters +---------------------- +The following counters are physical port counters that are counted per L2 +priority (0-7). + +**Note:** `p` in the counter name represents the priority. + +.. flat-table:: Priority Port Counter Table + :widths: 2 3 1 + + * - Counter + - Description + - Type + + * - `rx_prio[p]_bytes` + - The number of bytes received with priority p on the physical port. + - Informative + + * - `rx_prio[p]_packets` + - The number of packets received with priority p on the physical port. + - Informative + + * - `tx_prio[p]_bytes` + - The number of bytes transmitted on priority p on the physical port.
+ - Informative + + * - `tx_prio[p]_packets` + - The number of packets transmitted on priority p on the physical port. + - Informative + + * - `rx_prio[p]_pause` + - The number of pause packets received with priority p on a physical port. + If this counter is increasing, it implies that the network is congested + and cannot absorb the traffic coming from the adapter. Note: This counter + is available only if PFC was enabled on priority p. + - Informative + + * - `rx_prio[p]_pause_duration` + - The duration of pause received (in microseconds) on priority p on the + physical port. The counter represents the time the port did not send any + traffic on this priority. If this counter is increasing, it implies that + the network is congested and cannot absorb the traffic coming from the + adapter. Note: This counter is available only if PFC was enabled on + priority p. + - Informative + + * - `rx_prio[p]_pause_transition` + - The number of times a transition from Xoff to Xon on priority p on the + physical port has occurred. Note: This counter is available only if PFC + was enabled on priority p. + - Informative + + * - `tx_prio[p]_pause` + - The number of pause packets transmitted on priority p on a physical port. + If this counter is increasing, it implies that the adapter is congested + and cannot absorb the traffic coming from the network. Note: This counter + is available only if PFC was enabled on priority p. + - Informative + + * - `tx_prio[p]_pause_duration` + - The duration of pause transmitted (in microseconds) on priority p on the + physical port. Note: This counter is available only if PFC was enabled on + priority p. + - Informative + + * - `rx_prio[p]_buf_discard` + - The number of packets discarded by the device due to lack of per host receive + buffers. + - Informative + + * - `rx_prio[p]_cong_discard` + - The number of packets discarded by the device due to per host congestion.
+ - Informative + + * - `rx_prio[p]_marked` + - The number of packets ECN-marked by the device due to per host congestion. + - Informative + + * - `rx_prio[p]_discards` + - The number of packets discarded by the device due to lack of receive buffers. + - Informative + +Device Counters +--------------- +.. flat-table:: Device Counter Table + :widths: 2 3 1 + + * - Counter + - Description + - Type + + * - `rx_pci_signal_integrity` + - Counts physical layer PCIe signal integrity errors, the number of + transitions to recovery due to framing errors and CRC (dlp and tlp). If + this counter is rising, try moving the adapter card to a different slot + to rule out a bad PCI slot. Validate that you are running with the latest + firmware available and latest server BIOS version. + - Error + + * - `tx_pci_signal_integrity` + - Counts physical layer PCIe signal integrity errors, the number of + transitions to recovery initiated by the other side (moving to recovery + due to getting TS/EIEOS). If this counter is rising, try moving the + adapter card to a different slot to rule out a bad PCI slot. Validate + that you are running with the latest firmware available and latest server + BIOS version. + - Error + + * - `outbound_pci_buffer_overflow` + - The number of packets dropped due to PCI buffer overflow. If this counter + is rising at a high rate, it might indicate that the receive traffic rate + for a host is larger than the PCIe bus can carry and therefore congestion occurs. + - Informative + + * - `outbound_pci_stalled_rd` + - The percentage (in the range 0...100) of time within the last second that + the NIC had outbound non-posted read requests but could not perform the + operation due to insufficient posted credits. + - Informative + + * - `outbound_pci_stalled_wr` + - The percentage (in the range 0...100) of time within the last second that + the NIC had outbound posted write requests but could not perform the + operation due to insufficient posted credits.
+ - Informative + + * - `outbound_pci_stalled_rd_events` + - The number of seconds where `outbound_pci_stalled_rd` was above 30%. + - Informative + + * - `outbound_pci_stalled_wr_events` + - The number of seconds where `outbound_pci_stalled_wr` was above 30%. + - Informative + + * - `dev_out_of_buffer` + - The number of times the device-owned queue did not have enough buffers + allocated. + - Error diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst index 2346459ae6cc..3fdcd6b61ccf 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst @@ -16,6 +16,7 @@ Contents: devlink switchdev tracepoints + counters .. only:: subproject and html