Merge branch 'QorIQ-DPAA-FMan-erratum-A050385-workaround'

Madalin Bucur says:

====================
QorIQ DPAA FMan erratum A050385 workaround

Changes in v2:
 - added CONFIG_DPAA_ERRATUM_A050385
 - removed unnecessary parenthesis
 - changed alignment defines to use only decimal values

The patch set implements the workaround for FMan erratum A050385:

FMAN DMA read or writes under heavy traffic load may cause FMAN
internal resource leak; thus stopping further packet processing.
To reproduce this issue when the workaround is not applied, one
needs to ensure the FMan DMA transaction queue is already full
when a transaction split occurs so the system must be under high
traffic load (i.e. multiple ports at line rate). After the errata
occurs, the traffic stops. The only SoC impacted by this is the
LS1043A, the other ARM DPAA 1 SoC or the PPC DPAA 1 SoCs do not
have this erratum.

The FMAN internal queue can overflow when FMAN splits single
read or write transactions into multiple smaller transactions
such that more than 17 AXI transactions are in flight from FMAN
to interconnect. When the FMAN internal queue overflows, it can
stall further packet processing. The issue can occur with any one
of the following three conditions:

  1. FMAN AXI transaction crosses 4K address boundary (Errata
         A010022)
  2. FMAN DMA address for an AXI transaction is not 16 byte
         aligned, i.e. the last 4 bits of an address are non-zero
  3. Scatter Gather (SG) frames have more than one SG buffer in
         the SG list and any one of the buffers, except the last
         buffer in the SG list has data size that is not a multiple
         of 16 bytes, i.e., other than 16, 32, 48, 64, etc.

With any one of the above three conditions present, there is
likelihood of stalled FMAN packet processing, especially under
stress with multiple ports injecting line-rate traffic.

To avoid situations that stall FMAN packet processing, all of the
above three conditions must be avoided; therefore, configure the
system with the following rules:

  1. Frame buffers must not span a 4KB address boundary, unless
         the frame start address is 256 byte aligned
  2. All FMAN DMA start addresses (for example, BMAN buffer
         address, FD[address] + FD[offset]) are 16B aligned
  3. SG table and buffer addresses are 16B aligned and the size
         of SG buffers are multiple of 16 bytes, except for the last
         SG buffer that can be of any size.

Additional workaround notes:
- Address alignment of 64 bytes is recommended for maximally
efficient system bus transactions (although 16 byte alignment is
sufficient to avoid the stall condition)
- To support frame sizes that are larger than 4K bytes, there are
two options:
  1. Large single buffer frames that span a 4KB page boundary can
         be converted into SG frames to avoid transaction splits at
         the 4KB boundary,
  2. Align the large single buffer to 256B address boundaries,
         ensure that the frame address plus offset is 256B aligned.
- If software generated SG frames have buffers that are unaligned
and with random non-multiple of 16 byte lengths, before
transmitting such frames via FMAN, frames will need to be copied
into a new single buffer or multiple buffer SG frame that is
compliant with the three rules listed above.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:
David S. Miller 2020-03-06 21:55:32 -08:00
commit 172fd3eb38
6 changed files with 167 additions and 3 deletions

View File

@ -110,6 +110,13 @@ PROPERTIES
Usage: required
Definition: See soc/fsl/qman.txt and soc/fsl/bman.txt
- fsl,erratum-a050385
Usage: optional
Value type: boolean
Definition: A boolean property. Indicates the presence of the
erratum A050385 which indicates that DMA transactions that are
split can result in a FMan lock.
=============================================================================
FMan MURAM Node

View File

@ -20,6 +20,8 @@
};
&fman0 {
fsl,erratum-a050385;
/* these aliases provide the FMan ports mapping */
enet0: ethernet@e0000 {
};

View File

@ -1,4 +1,5 @@
/* Copyright 2008 - 2016 Freescale Semiconductor Inc.
* Copyright 2020 NXP
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
@ -123,7 +124,22 @@ MODULE_PARM_DESC(tx_timeout, "The Tx timeout in ms");
#define FSL_QMAN_MAX_OAL 127
/* Default alignment for start of data in an Rx FD */
#ifdef CONFIG_DPAA_ERRATUM_A050385
/* aligning data start to 64 avoids DMA transaction splits, unless the buffer
* is crossing a 4k page boundary
*/
#define DPAA_FD_DATA_ALIGNMENT (fman_has_errata_a050385() ? 64 : 16)
/* aligning to 256 avoids DMA transaction splits caused by 4k page boundary
* crossings; also, all SG fragments except the last must have a size multiple
* of 256 to avoid DMA transaction splits
*/
#define DPAA_A050385_ALIGN 256
#define DPAA_FD_RX_DATA_ALIGNMENT (fman_has_errata_a050385() ? \
DPAA_A050385_ALIGN : 16)
#else
#define DPAA_FD_DATA_ALIGNMENT 16
#define DPAA_FD_RX_DATA_ALIGNMENT DPAA_FD_DATA_ALIGNMENT
#endif
/* The DPAA requires 256 bytes reserved and mapped for the SGT */
#define DPAA_SGT_SIZE 256
@ -158,8 +174,13 @@ MODULE_PARM_DESC(tx_timeout, "The Tx timeout in ms");
#define DPAA_PARSE_RESULTS_SIZE sizeof(struct fman_prs_result)
#define DPAA_TIME_STAMP_SIZE 8
#define DPAA_HASH_RESULTS_SIZE 8
#ifdef CONFIG_DPAA_ERRATUM_A050385
#define DPAA_RX_PRIV_DATA_SIZE (DPAA_A050385_ALIGN - (DPAA_PARSE_RESULTS_SIZE\
+ DPAA_TIME_STAMP_SIZE + DPAA_HASH_RESULTS_SIZE))
#else
#define DPAA_RX_PRIV_DATA_SIZE (u16)(DPAA_TX_PRIV_DATA_SIZE + \
dpaa_rx_extra_headroom)
#endif
#define DPAA_ETH_PCD_RXQ_NUM 128
@ -180,7 +201,12 @@ static struct dpaa_bp *dpaa_bp_array[BM_MAX_NUM_OF_POOLS];
#define DPAA_BP_RAW_SIZE 4096
#ifdef CONFIG_DPAA_ERRATUM_A050385
#define dpaa_bp_size(raw_size) (SKB_WITH_OVERHEAD(raw_size) & \
~(DPAA_A050385_ALIGN - 1))
#else
#define dpaa_bp_size(raw_size) SKB_WITH_OVERHEAD(raw_size)
#endif
static int dpaa_max_frm;
@ -1192,7 +1218,7 @@ static int dpaa_eth_init_rx_port(struct fman_port *port, struct dpaa_bp *bp,
buf_prefix_content.pass_prs_result = true;
buf_prefix_content.pass_hash_result = true;
buf_prefix_content.pass_time_stamp = true;
buf_prefix_content.data_align = DPAA_FD_DATA_ALIGNMENT;
buf_prefix_content.data_align = DPAA_FD_RX_DATA_ALIGNMENT;
rx_p = &params.specific_params.rx_params;
rx_p->err_fqid = errq->fqid;
@ -1662,6 +1688,8 @@ static u8 rx_csum_offload(const struct dpaa_priv *priv, const struct qm_fd *fd)
return CHECKSUM_NONE;
}
#define PTR_IS_ALIGNED(x, a) (IS_ALIGNED((unsigned long)(x), (a)))
/* Build a linear skb around the received buffer.
* We are guaranteed there is enough room at the end of the data buffer to
* accommodate the shared info area of the skb.
@ -1733,8 +1761,7 @@ static struct sk_buff *sg_fd_to_skb(const struct dpaa_priv *priv,
sg_addr = qm_sg_addr(&sgt[i]);
sg_vaddr = phys_to_virt(sg_addr);
WARN_ON(!IS_ALIGNED((unsigned long)sg_vaddr,
SMP_CACHE_BYTES));
WARN_ON(!PTR_IS_ALIGNED(sg_vaddr, SMP_CACHE_BYTES));
dma_unmap_page(priv->rx_dma_dev, sg_addr,
DPAA_BP_RAW_SIZE, DMA_FROM_DEVICE);
@ -2022,6 +2049,75 @@ static inline int dpaa_xmit(struct dpaa_priv *priv,
return 0;
}
#ifdef CONFIG_DPAA_ERRATUM_A050385
int dpaa_a050385_wa(struct net_device *net_dev, struct sk_buff **s)
{
struct dpaa_priv *priv = netdev_priv(net_dev);
struct sk_buff *new_skb, *skb = *s;
unsigned char *start, i;
/* check linear buffer alignment */
if (!PTR_IS_ALIGNED(skb->data, DPAA_A050385_ALIGN))
goto workaround;
/* linear buffers just need to have an aligned start */
if (!skb_is_nonlinear(skb))
return 0;
/* linear data size for nonlinear skbs needs to be aligned */
if (!IS_ALIGNED(skb_headlen(skb), DPAA_A050385_ALIGN))
goto workaround;
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
/* all fragments need to have aligned start addresses */
if (!IS_ALIGNED(skb_frag_off(frag), DPAA_A050385_ALIGN))
goto workaround;
/* all but last fragment need to have aligned sizes */
if (!IS_ALIGNED(skb_frag_size(frag), DPAA_A050385_ALIGN) &&
(i < skb_shinfo(skb)->nr_frags - 1))
goto workaround;
}
return 0;
workaround:
/* copy all the skb content into a new linear buffer */
new_skb = netdev_alloc_skb(net_dev, skb->len + DPAA_A050385_ALIGN - 1 +
priv->tx_headroom);
if (!new_skb)
return -ENOMEM;
/* NET_SKB_PAD bytes already reserved, adding up to tx_headroom */
skb_reserve(new_skb, priv->tx_headroom - NET_SKB_PAD);
/* Workaround for DPAA_A050385 requires data start to be aligned */
start = PTR_ALIGN(new_skb->data, DPAA_A050385_ALIGN);
if (start - new_skb->data != 0)
skb_reserve(new_skb, start - new_skb->data);
skb_put(new_skb, skb->len);
skb_copy_bits(skb, 0, new_skb->data, skb->len);
skb_copy_header(new_skb, skb);
new_skb->dev = skb->dev;
/* We move the headroom when we align it so we have to reset the
* network and transport header offsets relative to the new data
* pointer. The checksum offload relies on these offsets.
*/
skb_set_network_header(new_skb, skb_network_offset(skb));
skb_set_transport_header(new_skb, skb_transport_offset(skb));
/* TODO: does timestamping need the result in the old skb? */
dev_kfree_skb(skb);
*s = new_skb;
return 0;
}
#endif
static netdev_tx_t
dpaa_start_xmit(struct sk_buff *skb, struct net_device *net_dev)
{
@ -2068,6 +2164,14 @@ dpaa_start_xmit(struct sk_buff *skb, struct net_device *net_dev)
nonlinear = skb_is_nonlinear(skb);
}
#ifdef CONFIG_DPAA_ERRATUM_A050385
if (unlikely(fman_has_errata_a050385())) {
if (dpaa_a050385_wa(net_dev, &skb))
goto enomem;
nonlinear = skb_is_nonlinear(skb);
}
#endif
if (nonlinear) {
/* Just create a S/G fd based on the skb */
err = skb_to_sg_fd(priv, skb, &fd);

View File

@ -8,3 +8,31 @@ config FSL_FMAN
help
Freescale Data-Path Acceleration Architecture Frame Manager
(FMan) support
config DPAA_ERRATUM_A050385
bool
depends on ARM64 && FSL_DPAA
default y
help
DPAA FMan erratum A050385 software workaround implementation:
align buffers, data start, SG fragment length to avoid FMan DMA
splits.
FMAN DMA read or writes under heavy traffic load may cause FMAN
internal resource leak thus stopping further packet processing.
The FMAN internal queue can overflow when FMAN splits single
read or write transactions into multiple smaller transactions
such that more than 17 AXI transactions are in flight from FMAN
to interconnect. When the FMAN internal queue overflows, it can
stall further packet processing. The issue can occur with any
one of the following three conditions:
1. FMAN AXI transaction crosses 4K address boundary (Errata
A010022)
2. FMAN DMA address for an AXI transaction is not 16 byte
aligned, i.e. the last 4 bits of an address are non-zero
3. Scatter Gather (SG) frames have more than one SG buffer in
the SG list and any one of the buffers, except the last
buffer in the SG list has data size that is not a multiple
of 16 bytes, i.e., other than 16, 32, 48, 64, etc.
With any one of the above three conditions present, there is
likelihood of stalled FMAN packet processing, especially under
stress with multiple ports injecting line-rate traffic.

View File

@ -1,5 +1,6 @@
/*
* Copyright 2008-2015 Freescale Semiconductor Inc.
* Copyright 2020 NXP
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
@ -566,6 +567,10 @@ struct fman_cfg {
u32 qmi_def_tnums_thresh;
};
#ifdef CONFIG_DPAA_ERRATUM_A050385
static bool fman_has_err_a050385;
#endif
static irqreturn_t fman_exceptions(struct fman *fman,
enum fman_exceptions exception)
{
@ -2518,6 +2523,14 @@ struct fman *fman_bind(struct device *fm_dev)
}
EXPORT_SYMBOL(fman_bind);
#ifdef CONFIG_DPAA_ERRATUM_A050385
bool fman_has_errata_a050385(void)
{
return fman_has_err_a050385;
}
EXPORT_SYMBOL(fman_has_errata_a050385);
#endif
static irqreturn_t fman_err_irq(int irq, void *handle)
{
struct fman *fman = (struct fman *)handle;
@ -2845,6 +2858,11 @@ static struct fman *read_dts_node(struct platform_device *of_dev)
goto fman_free;
}
#ifdef CONFIG_DPAA_ERRATUM_A050385
fman_has_err_a050385 =
of_property_read_bool(fm_node, "fsl,erratum-a050385");
#endif
return fman;
fman_node_put:

View File

@ -1,5 +1,6 @@
/*
* Copyright 2008-2015 Freescale Semiconductor Inc.
* Copyright 2020 NXP
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
@ -398,6 +399,10 @@ u16 fman_get_max_frm(void);
int fman_get_rx_extra_headroom(void);
#ifdef CONFIG_DPAA_ERRATUM_A050385
bool fman_has_errata_a050385(void);
#endif
struct fman *fman_bind(struct device *dev);
#endif /* __FM_H */