IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This reverts commit 723ad916134784b317b72f3f6cf0f7ba774e5dae
When client requests channel or ring size larger than what the server
can support the server will cap the request to the supported max. So,
the client would not be able to successfully request resources that
exceed the server limit.
Fixes: 723ad9161347 ("ibmvnic: Add ethtool private flag for driver-defined queue limits")
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Link: https://lore.kernel.org/r/20220427235146.23189-1-drt@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The Time-Specified Departure feature is indeed mutually exclusive with
TX IP checksumming in ENETC, but TX checksumming in itself is broken and
was removed from this driver in commit 82728b91f124 ("enetc: Remove Tx
checksumming offload code").
The blamed commit declared NETIF_F_HW_CSUM in dev->features to comply
with software TSO's expectations, and still did the checksumming in
software by calling skb_checksum_help(). So there isn't any restriction
for the Time-Specified Departure feature.
However, enetc_setup_tc_txtime() doesn't understand that, and blindly
looks for NETIF_F_CSUM_MASK.
Instead of checking for things which can literally never happen in the
current code base, just remove the check and let the driver offload
tc-etf qdiscs.
Fixes: acede3c5dad5 ("net: enetc: declare NETIF_F_HW_CSUM and do it in software")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220427203017.1291634-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The VF driver can forward any IPsec flags and such makes the function
is not extendable and prone to backward/forward incompatibility.
If new software runs on VF, it won't know that PF configured something
completely different as it "knows" only XFRM_OFFLOAD_INBOUND flag.
Fixes: eda0333ac293 ("ixgbe: add VF IPsec management")
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220427173152.443102-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Put device node in error path in fec_enet_init_stop_mode().
Fixes: 8a448bf832af ("net: ethernet: fec: move GPR register offset and bit into DT")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220426125231.375688-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-04-26
This series contains updates to ice driver only.
Ivan Vecera removes races related to VF message processing by changing
mutex_trylock() call to mutex_lock() and moving additional operations
to occur under mutex.
Petr Oros increases wait time after firmware flash as current time is
not sufficient.
Jake resolves a use-after-free issue for mailbox snapshot.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 4b5923249b8fa4 ("net: dsa: lantiq_gswip: Configure all remaining
GSWIP_MII_CFG bits") added all known bits in the GSWIP_MII_CFGp
register. It helped bring this register into a well-defined state so the
driver has to rely less on the bootloader to do things right.
Unfortunately it also sets the GSWIP_MII_CFG_RMII_CLK bit without any
possibility to configure it. Upon further testing it turns out that all
boards which are supported by the GSWIP driver in OpenWrt which use an
RMII PHY have a dedicated oscillator on the board which provides the
50MHz RMII reference clock.
Don't set the GSWIP_MII_CFG_RMII_CLK bit (but keep the code which always
clears it) to fix support for the Fritz!Box 7362 SL in OpenWrt. This is
a board with two Atheros AR8030 RMII PHYs. With the "RMII clock" bit set
the MAC also generates the RMII reference clock whose signal then
conflicts with the signal from the oscillator on the board. This results
in a constant cycle of the PHY detecting link up/down (and as a result
of that: the two ports using the AR8030 PHYs are not working).
At the time of writing this patch there's no known board where the MAC
(GSWIP) has to generate the RMII reference clock. If needed this can be
implemented in future by providing a device-tree flag so the
GSWIP_MII_CFG_RMII_CLK bit can be toggled per port.
Fixes: 4b5923249b8fa4 ("net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits")
Tested-by: Jan Hoffmann <jan@3e8.eu>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Link: https://lore.kernel.org/r/20220425152027.2220750-1-martin.blumenstingl@googlemail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We need to wait 5 s for EMP reset after firmware flash. Code was extracted
from OOT driver (ice v1.8.3 downloaded from sourceforge). Without this
wait, fw_activate let card in inconsistent state and recoverable only
by second flash/activate. Flash was tested on these fw's:
From -> To
3.00 -> 3.10/3.20
3.10 -> 3.00/3.20
3.20 -> 3.00/3.10
Reproducer:
[root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
Preparing to flash
[fw.mgmt] Erasing
[fw.mgmt] Erasing done
[fw.mgmt] Flashing 100%
[fw.mgmt] Flashing done 100%
[fw.undi] Erasing
[fw.undi] Erasing done
[fw.undi] Flashing 100%
[fw.undi] Flashing done 100%
[fw.netlist] Erasing
[fw.netlist] Erasing done
[fw.netlist] Flashing 100%
[fw.netlist] Flashing done 100%
Activate new firmware by devlink reload
[root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
reload_actions_performed:
fw_activate
[root@host ~]# ip link show ens7f0
71: ens7f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
altname enp202s0f0
dmesg after flash:
[ 55.120788] ice: Copyright (c) 2018, Intel Corporation.
[ 55.274734] ice 0000:ca:00.0: Get PHY capabilities failed status = -5, continuing anyway
[ 55.569797] ice 0000:ca:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.28.0
[ 55.603629] ice 0000:ca:00.0: Get PHY capability failed.
[ 55.608951] ice 0000:ca:00.0: ice_init_nvm_phy_type failed: -5
[ 55.647348] ice 0000:ca:00.0: PTP init successful
[ 55.675536] ice 0000:ca:00.0: DCB is enabled in the hardware, max number of TCs supported on this port are 8
[ 55.685365] ice 0000:ca:00.0: FW LLDP is disabled, DCBx/LLDP in SW mode.
[ 55.692179] ice 0000:ca:00.0: Commit DCB Configuration to the hardware
[ 55.701382] ice 0000:ca:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:c9:02.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
Reboot doesn’t help, only second flash/activate with OOT or patched
driver put card back in consistent state.
After patch:
[root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
Preparing to flash
[fw.mgmt] Erasing
[fw.mgmt] Erasing done
[fw.mgmt] Flashing 100%
[fw.mgmt] Flashing done 100%
[fw.undi] Erasing
[fw.undi] Erasing done
[fw.undi] Flashing 100%
[fw.undi] Flashing done 100%
[fw.netlist] Erasing
[fw.netlist] Erasing done
[fw.netlist] Flashing 100%
[fw.netlist] Flashing done 100%
Activate new firmware by devlink reload
[root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
reload_actions_performed:
fw_activate
[root@host ~]# ip link show ens7f0
19: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
altname enp202s0f0
Fixes: 399e27dbbd9e94 ("ice: support immediate firmware activation via devlink reload")
Signed-off-by: Petr Oros <poros@redhat.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Previous patch labelled "ice: Fix incorrect locking in
ice_vc_process_vf_msg()" fixed an issue with ignored messages
sent by VF driver but a small race window still left.
Recently caught trace during 'ip link set ... vf 0 vlan ...' operation:
[ 7332.995625] ice 0000:3b:00.0: Clearing port VLAN on VF 0
[ 7333.001023] iavf 0000:3b:01.0: Reset indication received from the PF
[ 7333.007391] iavf 0000:3b:01.0: Scheduling reset task
[ 7333.059575] iavf 0000:3b:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 3
[ 7333.059626] ice 0000:3b:00.0: Invalid message from VF 0, opcode 3, len 4, error -1
Setting of VLAN for VF causes a reset of the affected VF using
ice_reset_vf() function that runs with cfg_lock taken:
1. ice_notify_vf_reset() informs IAVF driver that reset is needed and
IAVF schedules its own reset procedure
2. Bit ICE_VF_STATE_DIS is set in vf->vf_state
3. Misc initialization steps
4. ice_sriov_post_vsi_rebuild() -> ice_vf_set_initialized() and that
clears ICE_VF_STATE_DIS in vf->vf_state
Step 3 is mentioned race window because IAVF reset procedure runs in
parallel and one of its step is sending of VIRTCHNL_OP_GET_VF_RESOURCES
message (opcode==3). This message is handled in ice_vc_process_vf_msg()
and if it is received during the mentioned race window then it's
marked as invalid and error is returned to VF driver.
Protect vf_state check in ice_vc_process_vf_msg() by cfg_lock to avoid
this race condition.
Fixes: e6ba5273d4ed ("ice: Fix race conditions between virtchnl handling and VF ndo ops")
Tested-by: Fei Liu <feliu@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Usage of mutex_trylock() in ice_vc_process_vf_msg() is incorrect
because message sent from VF is ignored and never processed.
Use mutex_lock() instead to fix the issue. It is safe because this
mutex is used to prevent races between VF related NDOs and
handlers processing request messages from VF and these handlers
are running in ice_service_task() context. Additionally move this
mutex lock prior ice_vc_is_opcode_allowed() call to avoid potential
races during allowlist access.
Fixes: e6ba5273d4ed ("ice: Fix race conditions between virtchnl handling and VF ndo ops")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
We received a report[1] of kernel crashes when Cilium is used in XDP
mode with virtio_net after updating to newer kernels. After
investigating the reason it turned out that when using mergeable bufs
with an XDP program which adjusts xdp.data or xdp.data_meta page_to_buf()
calculates the build_skb address wrong because the offset can become less
than the headroom so it gets the address of the previous page (-X bytes
depending on how lower offset is):
page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256
This is a pr_err() I added in the beginning of page_to_skb which clearly
shows offset that is less than headroom by adding 4 bytes of metadata
via an xdp prog. The calculations done are:
receive_mergeable():
headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes
offset = xdp.data - page_address(xdp_page) -
vi->hdr_len - metasize;
page_to_skb():
p = page_address(page) + offset;
...
buf = p - headroom;
Now buf goes -4 bytes from the page's starting address as can be seen
above which is set as skb->head and skb->data by build_skb later. Depending
on what's done with the skb (when it's freed most often) we get all kinds
of corruptions and BUG_ON() triggers in mm[2]. We have to recalculate
the new headroom after the xdp program has run, similar to how offset
and len are recalculated. Headroom is directly related to
data_hard_start, data and data_meta, so we use them to get the new size.
The result is correct (similar pr_err() in page_to_skb, one case of
xdp_page and one case of virtnet buf):
a) Case with 4 bytes of metadata
[ 115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252
[ 121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252
b) Case of pushing data +32 bytes
[ 153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288
[ 158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288
c) Case of pushing data -33 bytes
[ 835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223
[ 840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223
Offset and headroom are equal because offset points to the start of
reserved bytes for the virtio_net header which are at buf start +
headroom, while data points at buf start + vnet hdr size + headroom so
when data or data_meta are adjusted by the xdp prog both the headroom size
and the offset change equally. We can use data_hard_start to compute the
new headroom after the xdp prog (linearized / page start case, the
virtnet buf case is similar just with bigger base offset):
xdp.data_hard_start = page_address + vnet_hdr
xdp.data = page_address + vnet_hdr + headroom
new headroom after xdp prog = xdp.data - xdp.data_hard_start - metasize
An example reproducer xdp prog[3] is below.
[1] https://github.com/cilium/cilium/issues/19453
[2] Two of the many traces:
[ 40.437400] BUG: Bad page state in process swapper/0 pfn:14940
[ 40.916726] BUG: Bad page state in process systemd-resolve pfn:053b7
[ 41.300891] kernel BUG at include/linux/mm.h:720!
[ 41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G B W 5.18.0-rc1+ #37
[ 41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
[ 41.306018] RIP: 0010:page_frag_free+0x79/0xe0
[ 41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6
[ 41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292
[ 41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000
[ 41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff
[ 41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff
[ 41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600
[ 41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c
[ 41.317700] FS: 00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000
[ 41.319150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0
[ 41.321387] Call Trace:
[ 41.321819] <TASK>
[ 41.322193] skb_release_data+0x13f/0x1c0
[ 41.322902] __kfree_skb+0x20/0x30
[ 41.343870] tcp_recvmsg_locked+0x671/0x880
[ 41.363764] tcp_recvmsg+0x5e/0x1c0
[ 41.384102] inet_recvmsg+0x42/0x100
[ 41.406783] ? sock_recvmsg+0x1d/0x70
[ 41.428201] sock_read_iter+0x84/0xd0
[ 41.445592] ? 0xffffffffa3000000
[ 41.462442] new_sync_read+0x148/0x160
[ 41.479314] ? 0xffffffffa3000000
[ 41.496937] vfs_read+0x138/0x190
[ 41.517198] ksys_read+0x87/0xc0
[ 41.535336] do_syscall_64+0x3b/0x90
[ 41.551637] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 41.568050] RIP: 0033:0x48765b
[ 41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[ 41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000
[ 41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b
[ 41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016
[ 41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4
[ 41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9
[ 41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff
[ 41.744254] </TASK>
[ 41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net
and
[ 33.524802] BUG: Bad page state in process systemd-network pfn:11e60
[ 33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e
[ 33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60
[ 33.532463] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
[ 33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000
[ 33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000
[ 33.532471] page dumped because: nonzero mapcount
[ 33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
[ 33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
[ 33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
[ 33.532484] Call Trace:
[ 33.532496] <TASK>
[ 33.532500] dump_stack_lvl+0x45/0x5a
[ 33.532506] bad_page.cold+0x63/0x94
[ 33.532510] free_pcp_prepare+0x290/0x420
[ 33.532515] free_unref_page+0x1b/0x100
[ 33.532518] skb_release_data+0x13f/0x1c0
[ 33.532524] kfree_skb_reason+0x3e/0xc0
[ 33.532527] ip6_mc_input+0x23c/0x2b0
[ 33.532531] ip6_sublist_rcv_finish+0x83/0x90
[ 33.532534] ip6_sublist_rcv+0x22b/0x2b0
[3] XDP program to reproduce(xdp_pass.c):
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
SEC("xdp_pass")
int xdp_pkt_pass(struct xdp_md *ctx)
{
bpf_xdp_adjust_head(ctx, -(int)32);
return XDP_PASS;
}
char _license[] SEC("license") = "GPL";
compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass
CC: stable@vger.kernel.org
CC: Jason Wang <jasowang@redhat.com>
CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
CC: Daniel Borkmann <daniel@iogearbox.net>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: virtualization@lists.linux-foundation.org
Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20220425103703.3067292-1-razor@blackwall.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The other port_hidden functions rely on the port_read/port_write
functions to access the hidden control port. These functions apply the
offset for port_base_addr where applicable. Update port_hidden_wait to
use the port_wait_bit so that port_base_addr offsets are accounted for
when waiting for the busy bit to change.
Without the offset the port_hidden_wait function would timeout on
devices that have a non-zero port_base_addr (e.g. MV88E6141), however
devices that have a zero port_base_addr would operate correctly (e.g.
MV88E6390).
Fixes: 609070133aff ("net: dsa: mv88e6xxx: update code operating on hidden registers")
Signed-off-by: Nathan Rossi <nathan@nathanrossi.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220425070454.348584-1-nathan@nathanrossi.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Return back the error value that we get from phy_read_mmd().
Fixes: c84786fa8f91 ("net: phy: marvell10g: read copper results from CSSR1")
Signed-off-by: Baruch Siach <baruch.siach@siklu.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/f47cb031aeae873bb008ba35001607304a171a20.1650868058.git.baruch@tkos.co.il
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The hardware checksum offloading requires use of a transmit
status block inserted before the outgoing frame data, this was
updated in '9a9ba2a4aaaa ("net: bcmgenet: always enable status blocks")'
However, skb_tx_timestamp() assumes that it is passed a raw frame
and PTP parsing chokes on this status block.
Fix this by calling __skb_pull(), which hides the TSB before calling
skb_tx_timestamp(), so an outgoing PTP packet is parsed correctly.
As the data in the skb has already been set up for DMA, and the
dma_unmap_* calls use a separately stored address, there is no
no effective change in the data transmission.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220424165307.591145-1-jonathan.lemon@gmail.com
Fixes: d03825fba459 ("net: bcmgenet: add skb_tx_timestamp call")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
DSA, through dsa_port_bridge_leave(), first notifies the port of the
fact that it left a bridge, then, if that bridge was VLAN-aware, it
notifies the port of the change in VLAN awareness state, towards
VLAN-unaware mode.
So ocelot_port_vlan_filtering() can be called when ocelot_port->bridge
is NULL, and this makes ocelot_add_vlan_unaware_pvid() create a struct
ocelot_bridge_vlan with a vid of 0 and an "untagged" setting of true on
that port.
In a way this structure correctly reflects the reality, but by design,
VID 0 (OCELOT_STANDALONE_PVID) was not meant to be kept in the bridge
VLAN list of the driver, but managed separately.
Having OCELOT_STANDALONE_PVID in ocelot->vlans makes us trip up on
several sanity checks that did not expect to have this VID there.
For example, after we leave a VLAN-aware bridge and we re-join it, we
can no longer program egress-tagged VLANs to hardware:
# ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up
# ip link set swp0 master br0
# ip link set swp0 nomaster
# ip link set swp0 master br0
# bridge vlan add dev swp0 vid 100
Error: mscc_ocelot_switch_lib: Port with more than one egress-untagged VLAN cannot have egress-tagged VLANs.
But this configuration is in fact supported by the hardware, since we
could use OCELOT_PORT_TAG_NATIVE. According to its comment:
/* all VLANs except the native VLAN and VID 0 are egress-tagged */
yet when assessing the eligibility for this mode, we do not check for
VID 0 in ocelot_port_uses_native_vlan(), instead we just ensure that
ocelot_port_num_untagged_vlans() == 1. This is simply because VID 0
doesn't have a bridge VLAN structure.
The way I identify the problem is that ocelot_port_vlan_filtering(false)
only means to call ocelot_add_vlan_unaware_pvid() when we dynamically
turn off VLAN awareness for a bridge we are under, and the PVID changes
from the bridge PVID to a reserved PVID based on the bridge number.
Since OCELOT_STANDALONE_PVID is statically added to the VLAN table
during ocelot_vlan_init() and never removed afterwards, calling
ocelot_add_vlan_unaware_pvid() for it is not intended and does not serve
any purpose.
Fix the issue by avoiding the call to ocelot_add_vlan_unaware_pvid(vid=0)
when we're resetting VLAN awareness after leaving the bridge, to become
a standalone port.
Fixes: 54c319846086 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both the felix DSA driver and ocelot switchdev driver declare
dev->features & NETIF_F_HW_VLAN_CTAG_FILTER under certain circumstances*,
so the 8021q module will add VID 0 to our RX filter when the port goes
up, to ensure 802.1p traffic is not dropped.
We treat VID 0 as a special value (OCELOT_STANDALONE_PVID) which
deliberately does not have a struct ocelot_bridge_vlan associated with
it. Instead, this gets programmed to the VLAN table in ocelot_vlan_init().
If we allow external calls to modify VID 0, we reach the following
situation:
# ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up
# ip link set swp0 master br0
# ip link set swp0 up # this adds VID 0 to ocelot->vlans with untagged=false
bridge vlan
port vlan-id
swp0 1 PVID Egress Untagged # the bridge also adds VID 1
br0 1 PVID Egress Untagged
# bridge vlan add dev swp0 vid 100 untagged
Error: mscc_ocelot_switch_lib: Port with egress-tagged VLANs cannot have more than one egress-untagged (native) VLAN.
This configuration should have been accepted, because
ocelot_port_manage_port_tag() should select OCELOT_PORT_TAG_NATIVE.
Yet it isn't, because we have an entry in ocelot->vlans which says
VID 0 should be egress-tagged, something the hardware can't do.
Fix this by suppressing additions/deletions on VID 0 and managing this
VLAN exclusively using OCELOT_STANDALONE_PVID.
*DSA toggles it when the port becomes VLAN-aware by joining a VLAN-aware
bridge. Ocelot declares it unconditionally for some reason.
Fixes: 54c319846086 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The lan966x->ports[] array has lan966x->num_phys_ports elements. These
are assigned in lan966x_probe(). That means the > comparison should be
changed to >=.
The first off by one check is harmless but the second one could lead to
an out of bounds access and a crash.
Fixes: 5ccd66e01cbe ("net: lan966x: add support for interrupts from analyzer")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In one of the error paths of the device_for_each_child_node() loop
in hns_mac_init, add missing call to fwnode_handle_put.
Signed-off-by: Peng Wu <wupeng58@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, there are some querying mailboxes sent from VF to PF,
and VF will wait the PF's handling result. For mailbox
HCLGE_MBX_GET_QID_IN_PF and HCLGE_MBX_GET_RSS_KEY, it may fail
when the input parameter is invalid, but the prototype of their
handler function is void. In this case, PF always return success
to VF, which may cause the VF get incorrect result.
Fixes it by adding return value for these function.
Fixes: 63b1279d9905 ("net: hns3: check queue id range before using")
Fixes: 532cfc0df1e4 ("net: hns3: add a check for index in hclge_get_rss_key()")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add validity check for message data length in function
hclge_send_mbx_msg(), avoid unexpected overflow.
Fixes: dde1a86e93ca ("net: hns3: Add mailbox support to PF driver")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, function hclge_get_ring_chain_from_mbx will return -ENOMEM if
ring_num is bigger than HCLGE_MBX_MAX_RING_CHAIN_PARAM_NUM. It is better to
return -EINVAL for the invalid parameter case.
So this patch fixes it by return -EINVAL in this abnormal branch.
Fixes: 5d02a58dae60 ("net: hns3: fix for buffer overflow smatch warning")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The comments in function hclge_comm_tqps_update_stats is not right,
so fix it.
Fixes: 287db5c40d15 ("net: hns3: create new set of common tqp stats APIs for PF and VF reuse")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For debugfs node rx/tx_queue_info and rx/tx_bd_info, their output info is
aligned to the right, it's not aligned with output of other debugfs node,
so uniform their output info.
Fixes: 907676b13071 ("net: hns3: use tx bounce buffer for small packets")
Fixes: e44c495d95e0 ("net: hns3: refactor queue info of debugfs")
Fixes: 77e9184869c9 ("net: hns3: refactor dump bd info of debugfs")
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If failed to register netdev, it needs to clear INITED state and stop
client in case of cause problem when concurrency with uninitialized
process of driver.
Fixes: a289a7e5c1d4 ("net: hns3: put off calling register_netdev() until client initialize complete")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I made a mistake with the commit a6aaa0032424 ("net: ethernet: stmmac:
fix altr_tse_pcs function when using a fixed-link"). I should have
tested against both scenario of having a SGMII interface and one
without.
Without the SGMII PCS TSE adpater, the sgmii_adapter_base address is
NULL, thus a write to this address will fail.
Cc: stable@vger.kernel.org
Fixes: a6aaa0032424 ("net: ethernet: stmmac: fix altr_tse_pcs function when using a fixed-link")
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Link: https://lore.kernel.org/r/20220420152345.27415-1-dinguyen@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is no need to add new compatible strings for each new supported
chip version. The compatible string is used only to select the subdriver
(rtl8365mb.c or rtl8366rb.c). Once in the subdriver, it will detect the
chip model by itself, ignoring which compatible string was used.
Link: https://lore.kernel.org/netdev/20220414014055.m4wbmr7tdz6hsa3m@bang-olufsen.dk/
Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20220418233558.13541-2-luizluca@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If major equal 0, register_chrdev() returns error code when it fails.
This function dynamically allocate a major and return its number on
success, so we should use "< 0" to check it instead of "!".
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Acked-By: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a deadlock in rr_close(), which is shown below:
(Thread 1) | (Thread 2)
| rr_open()
rr_close() | add_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | rr_timer()
del_timer_sync() | spin_lock_irqsave() //(2)
(wait timer to stop) | ...
We hold rrpriv->lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need rrpriv->lock in position (2) of thread 2.
As a result, rr_close() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_irqsave(), which could let timer handler to obtain
the needed lock.
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/20220417125519.82618-1-duoming@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The init_systime() may be invoked in atomic state. We have observed the
following call trace when running "phc_ctl /dev/ptp0 set" on a Intel
Agilex board.
BUG: sleeping function called from invalid context at drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c:74
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 381, name: phc_ctl
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
Preemption disabled at:
[<ffff80000892ef78>] stmmac_set_time+0x34/0x8c
CPU: 2 PID: 381 Comm: phc_ctl Not tainted 5.18.0-rc2-next-20220414-yocto-standard+ #567
Hardware name: SoCFPGA Agilex SoCDK (DT)
Call trace:
dump_backtrace.part.0+0xc4/0xd0
show_stack+0x24/0x40
dump_stack_lvl+0x7c/0xa0
dump_stack+0x18/0x34
__might_resched+0x154/0x1c0
__might_sleep+0x58/0x90
init_systime+0x78/0x120
stmmac_set_time+0x64/0x8c
ptp_clock_settime+0x60/0x9c
pc_clock_settime+0x6c/0xc0
__arm64_sys_clock_settime+0x88/0xf0
invoke_syscall+0x5c/0x130
el0_svc_common.constprop.0+0x4c/0x100
do_el0_svc+0x7c/0xa0
el0_svc+0x58/0xcc
el0t_64_sync_handler+0xa4/0x130
el0t_64_sync+0x18c/0x190
So we should use readl_poll_timeout_atomic() here instead of
readl_poll_timeout().
Also adjust the delay time to 10us to fix a "__bad_udelay" build error
reported by "kernel test robot <lkp@intel.com>". I have tested this on
Intel Agilex and NXP S32G boards, there is no delay needed at all.
So the 10us delay should be long enough for most cases.
Fixes: ff8ed737860e ("net: stmmac: use readl_poll_timeout() function in init_systime()")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the user runs:
bridge link set dev $br_port mcast_flood on
this command should affect not only L2 multicast, but also IPv4 and IPv6
multicast.
In the Ocelot switch, unknown multicast gets flooded according to
different PGIDs according to its type, and PGID_MC only handles L2
multicast. Therefore, by leaving PGID_MCIPV4 and PGID_MCIPV6 at their
default value of 0, unknown IP multicast traffic is never flooded.
Fixes: 421741ea5672 ("net: mscc: ocelot: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220415151950.219660-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This will reset deeply on freeze and thaw instead of suspend and
resume and prevent null pointer dereferences of the uninitialized ring
0 buffer while thawing.
The impact is an indefinitely hanging kernel. You can't switch
consoles after this and the only possible user interaction is SysRq.
BUG: kernel NULL pointer dereference
RIP: 0010:aq_ring_rx_fill+0xcf/0x210 [atlantic]
aq_vec_init+0x85/0xe0 [atlantic]
aq_nic_init+0xf7/0x1d0 [atlantic]
atl_resume_common+0x4f/0x100 [atlantic]
pci_pm_thaw+0x42/0xa0
resolves in aq_ring.o to
```
0000000000000ae0 <aq_ring_rx_fill>:
{
/* ... */
baf: 48 8b 43 08 mov 0x8(%rbx),%rax
buff->flags = 0U; /* buff is NULL */
```
The bug has been present since the introduction of the new pm code in
8aaa112a57c1 ("net: atlantic: refactoring pm logic") and was hidden
until 8ce84271697a ("net: atlantic: changes for multi-TC support"),
which refactored the aq_vec_{free,alloc} functions into
aq_vec_{,ring}_{free,alloc}, but is technically not wrong. The
original functions just always reinitialized the buffers on S3/S4. If
the interface is down before freezing, the bug does not occur. It does
not matter, whether the initrd contains and loads the module before
thawing.
So the fix is to invert the boolean parameter deep in all pm function
calls, which was clearly intended to be set like that.
First report was on Github [1], which you have to guess from the
resume logs in the posted dmesg snippet. Recently I posted one on
Bugzilla [2], since I did not have an AQC device so far.
#regzbot introduced: 8ce84271697a
#regzbot from: koo5 <kolman.jindrich@gmail.com>
#regzbot monitor: https://github.com/Aquantia/AQtion/issues/32
Fixes: 8aaa112a57c1 ("net: atlantic: refactoring pm logic")
Link: https://github.com/Aquantia/AQtion/issues/32 [1]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215798 [2]
Cc: stable@vger.kernel.org
Reported-by: koo5 <kolman.jindrich@gmail.com>
Signed-off-by: Manuel Ullmann <labre@posteo.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit b5f862180d70 was introduced to discard lowest hash bit for layer3+4 hashing
but it also removes last bit from non layer3+4 hashing
Below script shows layer2+3 hashing will result in same slave to be used with above commit.
$ cat hash.py
#/usr/bin/python3.6
h_dests=[0xa0, 0xa1]
h_source=0xe3
hproto=0x8
saddr=0x1e7aa8c0
daddr=0x17aa8c0
for h_dest in h_dests:
hash = (h_dest ^ h_source ^ hproto ^ saddr ^ daddr)
hash ^= hash >> 16
hash ^= hash >> 8
print(hash)
print("with last bit removed")
for h_dest in h_dests:
hash = (h_dest ^ h_source ^ hproto ^ saddr ^ daddr)
hash ^= hash >> 16
hash ^= hash >> 8
hash = hash >> 1
print(hash)
Output:
$ python3.6 hash.py
522133332
522133333 <-------------- will result in both slaves being used
with last bit removed
261066666
261066666 <-------------- only single slave used
Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the lan966x driver is removed make sure to remove also the ptp_irq
IRQ.
Fixes: e85a96e48e3309 ("net: lan966x: Add support for ptp interrupts")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20220413195716.3796467-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido reported that the commit referenced in the Fixes tag broke
a gre use case with dummy devices. Add a check to ip_tunnel_init_flow
to see if the oif is an l3mdev port and if so set the oif to 0 to
avoid the oif comparison in fib_lookup_good_nhc.
Fixes: 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-04-13
This series contains updates to igc and e1000e drivers.
Sasha removes waiting for hardware semaphore as it could cause an
infinite loop and changes usleep_range() calls done under atomic
context to udelay() for igc. For e1000e, he changes some variables from
u16 to u32 to prevent possible overflow of values.
Vinicius disables PTM when going to suspend as it is causing hang issues
on some platforms for igc.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000e: Fix possible overflow in LTR decoding
igc: Fix suspending when PTM is active
igc: Fix BUG: scheduling while atomic
igc: Fix infinite loop in release_swfw_sync
====================
Link: https://lore.kernel.org/r/20220413170814.2066855-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The displayed list of Ethernet devices in make menuconfig
has gotten out of order. This is mostly due to changes in vendor
names etc, but also because of new Microsoft entry in wrong place.
This restores so that the display is in order even if the names
of the sub directories are not.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-04-14
This series contains updates to ice driver only.
Maciej adjusts implementation in __ice_alloc_rx_bufs_zc() for when
ice_fill_rx_descs() does not return the entire buffer request and fixes a
return value for !CONFIG_NET_SWITCHDEV configuration which was preventing
VF creation.
Wojciech prevents eswitch transmit when VFs are being removed which was
causing NULL pointer dereference.
Jianglei Nie fixes a memory leak on error path of getting OROM data.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Added the phy_poll_cable_test flag for the lan937x phy driver.
Tested using command - ethtool --cable-test <dev>
Fixes: 680baca546f2 ("net: phy: added the LAN937x phy support")
Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
netfilter.
Current release - regressions:
- smc: fix af_ops of child socket pointing to released memory
- wifi: ath9k: fix usage of driver-private space in tx_info
Previous releases - regressions:
- ipv6: fix panic when forwarding a pkt with no in6 dev
- sctp: use the correct skb for security_sctp_assoc_request
- smc: fix NULL pointer dereference in smc_pnet_find_ib()
- sched: fix initialization order when updating chain 0 head
- phy: don't defer probe forever if PHY IRQ provider is missing
- dsa: revert "net: dsa: setup master before ports"
- dsa: felix: fix tagging protocol changes with multiple CPU ports
- eth: ice:
- fix use-after-free when freeing @rx_cpu_rmap
- revert "iavf: fix deadlock occurrence during resetting VF interface"
- eth: lan966x: stop processing the MAC entry is port is wrong
Previous releases - always broken:
- sched
- flower: fix parsing of ethertype following VLAN header
- taprio: check if socket flags are valid
- nfc: add flush_workqueue to prevent uaf
- veth: ensure eth header is in skb's linear part
- eth: stmmac: fix altr_tse_pcs function when using a fixed-link
- eth: macb: restart tx only if queue pointer is lagging
- eth: macvlan: fix leaking skb in source mode with nodst option
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmJX6CoSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkMRIP/0CKmGetL0i1WQ0LD8regv+E6NizyXDB
+YCchHMMgYJ/aIRVps9GSWqD6ncU2gCSW27aCxHsN+Esw/HytrmsaHfS1SWRgIfb
6hn1CB3/Ojd1eXZOdXgVrlayhJKj//c0KdoHQoY+sQaLKNvULx5VfTq4y1yjyXy2
upfwW4JaQaUlkaTbdhjRhq5TuRY2JCwscc7EmJbwrqmqWG7nSEngrLU0XLHs8tWr
X/gQbI/wDepf/KidE59rLV7yMYlovCkoZBVUrN4oLZRqxUIBtU0a8nZNX5NFWiVD
KwTahrUh8eOizzSMEzjO0HpuGBWQFrmyC0eOb8KwixMJrRfDtqAWJKwHfxZtzHr/
JpUHlgvCN9iQoPYY0LZ8uU2CLFJM+p4hn1sOt/swoo23iAvCiJWWbRoAvsvwLbZ3
4pAsof8w6SRb34J/6Lcu78LECD/y62TJ27Ay82vdjrYbz1g8wb1wGlAMAbJ7K7sE
NQ4iB6wzd6UVF3Qm4qo7kRJT5TTY4TxZMQUKl1gj/OEV9hrJ8zF1MjRhxNWRC2R9
+A8Rw3weL8zYKoezCEEbogaDp+h7IdeNlbZE8r19p2CMiJ/Y+NQd//p4rSJ0Oqcw
1tOtg5x7LVVMiSz38uHk0HbiKJ9UbHLJJZ35N5oi/IDPg+LWFfar23I4BfuII2kB
Wo5Ii/0DMozb
=Ha9M
-----END PGP SIGNATURE-----
Merge tag 'net-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from wireless and netfilter.
Current release - regressions:
- smc: fix af_ops of child socket pointing to released memory
- wifi: ath9k: fix usage of driver-private space in tx_info
Previous releases - regressions:
- ipv6: fix panic when forwarding a pkt with no in6 dev
- sctp: use the correct skb for security_sctp_assoc_request
- smc: fix NULL pointer dereference in smc_pnet_find_ib()
- sched: fix initialization order when updating chain 0 head
- phy: don't defer probe forever if PHY IRQ provider is missing
- dsa: revert "net: dsa: setup master before ports"
- dsa: felix: fix tagging protocol changes with multiple CPU ports
- eth: ice:
- fix use-after-free when freeing @rx_cpu_rmap
- revert "iavf: fix deadlock occurrence during resetting VF
interface"
- eth: lan966x: stop processing the MAC entry is port is wrong
Previous releases - always broken:
- sched:
- flower: fix parsing of ethertype following VLAN header
- taprio: check if socket flags are valid
- nfc: add flush_workqueue to prevent uaf
- veth: ensure eth header is in skb's linear part
- eth: stmmac: fix altr_tse_pcs function when using a fixed-link
- eth: macb: restart tx only if queue pointer is lagging
- eth: macvlan: fix leaking skb in source mode with nodst option"
* tag 'net-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
net: bcmgenet: Revert "Use stronger register read/writes to assure ordering"
rtnetlink: Fix handling of disabled L3 stats in RTM_GETSTATS replies
net: dsa: felix: fix tagging protocol changes with multiple CPU ports
tun: annotate access to queue->trans_start
nfc: nci: add flush_workqueue to prevent uaf
net: dsa: realtek: don't parse compatible string for RTL8366S
net: dsa: realtek: fix Kconfig to assure consistent driver linkage
net: ftgmac100: access hardware register after clock ready
Revert "net: dsa: setup master before ports"
macvlan: Fix leaking skb in source mode with nodst option
netfilter: nf_tables: nft_parse_register can return a negative value
net: lan966x: Stop processing the MAC entry is port is wrong.
net: lan966x: Fix when a port's upper is changed.
net: lan966x: Fix IGMP snooping when frames have vlan tag
net: lan966x: Update lan966x_ptp_get_nominal_value
sctp: Initialize daddr on peeled off socket
net/smc: Fix af_ops of child socket pointing to released memory
net/smc: Fix NULL pointer dereference in smc_pnet_find_ib()
net/smc: use memcpy instead of snprintf to avoid out of bounds read
net: macb: Restart tx only if queue pointer is lagging
...
A memory chunk was allocated for orom_data in ice_get_orom_civd_data()
by vzmalloc(). But when ice_read_flash_module() fails, the allocated
memory is not freed, which will lead to a memory leak.
We can fix it by freeing the orom_data when ce_read_flash_module() fails.
Fixes: af18d8866c80 ("ice: reduce time to read Option ROM CIVD data")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently for !CONFIG_NET_SWITCHDEV kernel builds it is not possible to
create VFs properly as call to ice_eswitch_configure() returns
-EOPNOTSUPP for us. This is because CONFIG_ICE_SWITCHDEV depends on
CONFIG_NET_SWITCHDEV.
Change the ice_eswitch_configure() implementation for
!CONFIG_ICE_SWITCHDEV to return 0 instead -EOPNOTSUPP and let
ice_ena_vfs() finish its work properly.
CC: Grzegorz Nitka <grzegorz.nitka@intel.com>
Fixes: 1a1c40df2e80 ("ice: set and release switchdev environment")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
__ice_alloc_rx_bufs_zc() checks if a number of the descriptors to be
allocated would cause the ring wrap. In that case, driver will issue two
calls to xsk_buff_alloc_batch() - one that will fill the ring up to the
end and the second one that will start with filling descriptors from the
beginning of the ring.
ice_fill_rx_descs() is a wrapper for taking care of what
xsk_buff_alloc_batch() gave back to the driver. It works in a best
effort approach, so for example when driver asks for 64 buffers,
ice_fill_rx_descs() could assign only 32. Such case needs to be checked
when ring is being filled up to the end, because in that situation ntu
might not reached the end of the ring.
Fix the ring wrap by checking if nb_buffs_extra has the expected value.
If not, bump ntu and go directly to tail update.
Fixes: 3876ff525de7 ("ice: xsk: Handle SW XDP ring wrap and bump tail more often")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Shwetha Nagaraju <Shwetha.nagaraju@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When the device tree has 2 CPU ports defined, a single one is active
(has any dp->cpu_dp pointers point to it). Yet the second one is still a
CPU port, and DSA still calls ->change_tag_protocol on it.
On the NXP LS1028A, the CPU ports are ports 4 and 5. Port 4 is the
active CPU port and port 5 is inactive.
After the following commands:
# Initial setting
cat /sys/class/net/eno2/dsa/tagging
ocelot
echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
echo ocelot > /sys/class/net/eno2/dsa/tagging
traffic is now broken, because the driver has moved the NPI port from
port 4 to port 5, unbeknown to DSA.
The problem can be avoided by detecting that the second CPU port is
unused, and not doing anything for it. Further rework will be needed
when proper support for multiple CPU ports is added.
Treat this as a bug and prepare current kernels to work in single-CPU
mode with multiple-CPU DT blobs.
Fixes: adb3dccf090b ("net: dsa: felix: convert to the new .change_tag_protocol DSA API")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220412172209.2531865-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Commit 5337824f4dc4 ("net: annotate accesses to queue->trans_start")
introduced a new helper, txq_trans_cond_update, to update
queue->trans_start using WRITE_ONCE. One snippet in drivers/net/tun.c
was missed, as it was introduced roughly at the same time.
Fixes: 5337824f4dc4 ("net: annotate accesses to queue->trans_start")
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220412135852.466386-1-atenart@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When we decode the latency and the max_latency, u16 value may not fit
the required size and could lead to the wrong LTR representation.
Scaling is represented as:
scale 0 - 1 (2^(5*0)) = 2^0
scale 1 - 32 (2^(5 *1))= 2^5
scale 2 - 1024 (2^(5 *2)) =2^10
scale 3 - 32768 (2^(5 *3)) =2^15
scale 4 - 1048576 (2^(5 *4)) = 2^20
scale 5 - 33554432 (2^(5 *4)) = 2^25
scale 4 and scale 5 required 20 and 25 bits respectively.
scale 6 reserved.
Replace the u16 type with the u32 type and allow corrected LTR
representation.
Cc: stable@vger.kernel.org
Fixes: 44a13a5d99c7 ("e1000e: Fix the max snoop/no-snoop latency for 10M")
Reported-by: James Hutchinson <jahutchinson99@googlemail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215689
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Tested-by: James Hutchinson <jahutchinson99@googlemail.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Some mainboard/CPU combinations, in particular, Alder Lake-S with a
W680 mainboard, have shown problems (system hangs usually, no kernel
logs) with suspend/resume when PCIe PTM is enabled and active. In some
cases, it could be reproduced when removing the igc module.
The best we can do is to stop PTM dialogs from the downstream/device
side before the interface is brought down. PCIe PTM will be re-enabled
when the interface is being brought up.
Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>