RDMA/rxe: Get rid of pkt resend on err

Currently the rxe driver detects packets dropped by ip_local_out(),
which happens before the packet is sent on the wire, and attempts to
resend them. This is redundant with the usual retry mechanism, which
covers packets that get dropped in transit to or from the remote node.

The way this is implemented is not robust since it sets need_req_skb and
waits for the number of local skbs outstanding for this qp to drop below a
low water mark. This is racy since the skb may be sent to the destructor
before the requester can set the need_req_skb flag. This will cause a
deadlock in the send path for that qp.
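
The ordering problem described above can be illustrated with a small
user-space model. This is only a sketch, not driver code: struct toy_qp,
the toy_* functions and LOW_WATER_MARK are hypothetical stand-ins, and
only the need_req_skb name is taken from the driver. The model assumes
the destructor wakes the requester only when the flag is already set.

```c
#include <stdbool.h>

#define LOW_WATER_MARK 1 /* hypothetical threshold, not the driver's value */

/* Toy stand-in for the per-qp state; not the real struct rxe_qp. */
struct toy_qp {
	int skb_out;       /* local skbs still outstanding */
	bool need_req_skb; /* requester is waiting for skbs to drain */
	bool rescheduled;  /* set when the send task is woken again */
};

/* Models the skb destructor: wake the requester only if it asked. */
static void toy_skb_destructor(struct toy_qp *qp)
{
	qp->skb_out--;
	if (qp->need_req_skb && qp->skb_out < LOW_WATER_MARK) {
		qp->need_req_skb = false;
		qp->rescheduled = true;
	}
}

/* Models the requester noticing a local drop and going idle. */
static void toy_requester_wait(struct toy_qp *qp)
{
	qp->need_req_skb = true;
}

/* Returns true if the requester is woken again for the given ordering. */
static bool toy_run(bool destructor_first)
{
	struct toy_qp qp = { .skb_out = 1 };

	if (destructor_first) {
		toy_skb_destructor(&qp); /* flag not set yet: no wakeup */
		toy_requester_wait(&qp); /* now waits with nothing left to wake it */
	} else {
		toy_requester_wait(&qp);
		toy_skb_destructor(&qp); /* sees the flag and reschedules */
	}
	return qp.rescheduled;
}
```

In this model toy_run(false) reports a wakeup, while toy_run(true) leaves
need_req_skb set with no outstanding skb to clear it, which corresponds
to the stuck send path described above.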

Remove this mechanism. The normal retry path will correct the error and
resend the packet, and it makes no difference whether the packet is
dropped locally or later.
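
As a rough illustration of why the drop location does not matter to a
timeout-based retry, consider the toy requester below. It is a sketch
under stated assumptions, not the driver's retry code: enum toy_drop and
toy_send_until_acked() are hypothetical, and only the first transmission
is ever dropped.

```c
#include <stdbool.h>

/* Hypothetical places where the first transmission can be lost. */
enum toy_drop { TOY_NO_DROP, TOY_DROP_LOCAL, TOY_DROP_IN_TRANSIT };

/*
 * Toy requester: send, wait for an ack, and resend on timeout.
 * Only the first attempt is dropped (at the given place). Returns the
 * number of transmissions used, or -1 if max_retries is exhausted.
 */
static int toy_send_until_acked(enum toy_drop first_attempt, int max_retries)
{
	for (int attempt = 0; attempt <= max_retries; attempt++) {
		enum toy_drop drop = attempt == 0 ? first_attempt : TOY_NO_DROP;
		bool acked = (drop == TOY_NO_DROP);

		if (acked)
			return attempt + 1;
		/* No ack arrived: the retry timer fires and we resend.
		 * Where the packet was lost is never consulted here.
		 */
	}
	return -1;
}
```

Both TOY_DROP_LOCAL and TOY_DROP_IN_TRANSIT take the same path through
the loop and need the same number of transmissions, since the only
signal the toy requester acts on is the missing ack.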

Link: https://lore.kernel.org/r/20240329145513.35381-14-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Bob Pearson 2024-03-29 09:55:14 -05:00 committed by Jason Gunthorpe
parent 55bec1c440
commit 9cc6290991
2 changed files with 3 additions and 18 deletions


@@ -371,12 +371,7 @@ static int rxe_send(struct sk_buff *skb, struct rxe_pkt_info *pkt)
 	else
 		err = ip6_local_out(dev_net(skb_dst(skb)->dev), skb->sk, skb);
-	if (unlikely(net_xmit_eval(err))) {
-		rxe_dbg_qp(pkt->qp, "error sending packet: %d\n", err);
-		return -EAGAIN;
-	}
-	return 0;
+	return err;
 }
 /* fix up a send packet to match the packets


@@ -802,18 +802,8 @@ int rxe_requester(struct rxe_qp *qp)
 	err = rxe_xmit_packet(qp, &pkt, skb);
 	if (err) {
-		if (err != -EAGAIN) {
-			wqe->status = IB_WC_LOC_QP_OP_ERR;
-			goto err;
-		}
-		/* force a delay until the dropped packet is freed and
-		 * the send queue is drained below the low water mark
-		 */
-		qp->need_req_skb = 1;
-		rxe_sched_task(&qp->send_task);
-		goto exit;
+		wqe->status = IB_WC_LOC_QP_OP_ERR;
+		goto err;
 	}
 	update_wqe_state(qp, wqe, &pkt);