1145694 Commits

Author SHA1 Message Date
Michael Riesch
1069470070 media: v4l2-mediabus: add support for dual edge sampling
Some devices support sampling of the parallel data at both edges of the
interface pixel clock in order to reduce the pixel clock by two.
Add a mediabus flag that represents this feature.

Signed-off-by: Michael Riesch <michael.riesch@wolfvision.net>
Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2022-12-07 17:58:46 +01:00
Michael Riesch
55f6f743e9 dt-bindings: media: video-interfaces: add support for dual edge sampling
Some devices support sampling of the parallel data at both edges of the
interface pixel clock in order to reduce the pixel clock by two.
Use the pclk-sample property to reflect this feature in the device tree.

Signed-off-by: Michael Riesch <michael.riesch@wolfvision.net>
Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2022-12-07 17:58:46 +01:00
Christophe JAILLET
f0ed939b6a media: pt3: Use dma_set_mask_and_coherent() and simplify code
Use dma_set_mask_and_coherent() instead of unrolling it with some
dma_set_mask()+dma_set_coherent_mask().

Moreover, as stated in [1], dma_set_mask() with a 64-bit mask will never
fail if dev->dma_mask is non-NULL.
So, if it fails, the 32 bits case will also fail for the same reason.

Simplify code and remove some dead code accordingly.

[1]: https://lkml.org/lkml/2021/6/7/398

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Akihiro Tsukada <tskd08@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2022-12-07 17:58:46 +01:00
Adam Borowski
6e616668a1 media: ipu3-cio2: make the bridge depend on i2c
drivers/media/pci/intel/ipu3/cio2-bridge.c: In function ‘cio2_bridge_unregister_sensors’:
drivers/media/pci/intel/ipu3/cio2-bridge.c:258:17: error: implicit declaration of function ‘i2c_unregister_device’; did you mean ‘spi_unregister_device’? [-Werror=implicit-function-declaration]
  258 |                 i2c_unregister_device(sensor->vcm_i2c_client);
      |                 ^~~~~~~~~~~~~~~~~~~~~
      |                 spi_unregister_device

Link: https://lore.kernel.org/linux-media/S230142AbiJTWql/20221020224641Z+958@vger.kernel.org
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2022-12-07 17:58:41 +01:00
Hans de Goede
7e837a5c50 media: MAINTAINERS: Add Hans de Goede as staging/atomisp maintainer
Add myself as maintainer for the drivers/staging/media/atomisp code.

Link: https://lore.kernel.org/linux-media/20221123161447.15834-1-hdegoede@redhat.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2022-12-07 17:58:15 +01:00
Thomas Gleixner
6132a490f9 irqchip updates for 6.2
- More APCI fixes and improvements for the LoongArch architecture,
   adding support for the HTVEC irqchip, suspend-resume, and some
   PCI INTx workarounds
 
 - Initial DT support for LoongArch. I'm not even kidding.
 
 - Support for the MTK CIRQv2, a minor deviation from the original version
 
 - Error handling fixes for wpcm450, GIC...
 
 - BE detection for a FSL controller
 
 - Declare the Sifive PLIC as wake-up agnostic
 
 - Simplify fishing out the device data for the ST irqchip
 
 - Mark some data structures as __initconst in the apple-aic driver
 
 - Switch over from strtobool to kstrtobool
 
 - COMPILE_TEST fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmOQsZ8PHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDQMwQAJWBLgkSHnPSJizG+KLpDSljRmkUif7j3PBk
 TVCwfj/xXiCUzP7PBhq2DUZWLvrKIiYl79Z6j/Jz4xlavPAVp6Cs+afxmJF2n/wy
 OwWF8uevHvFhKIFJUFxrSDaODWGI6afmDT7iflE3/Uz2ahVYoLkz5a10ji78N3JU
 peeX81i/kZE19w1aCs0sdMgzJufjK98hj55917GGv/IRBAjj2qi2tOZnPdQuE6d4
 +vDiRtKDjDZYp9toaaH7DGBlBMG/yQzBhPaXtMnxH9wGK5858sYmHZroeM4FFbPR
 sJ/rCiaBEuvC6F1JkShKsRqT2DW1BgXYjSPxh+qhzwCOSG32DDjkj6LqVkygSoKQ
 MogCIkC3C4D0BIiokSdJByN1PWjoynwtYJrUkYPk4fMg7JxHYwee9dCptT81irMM
 +3L1mqx9CAll7YkzPVwrCbFyJF9K4ax9kjEvSH3E8xZxLKGnS9iqjQBfE9NKILCl
 tT6dmCJXTXsoGFjK4zz4o2N8i8aOeUz/B3GX5eYD+x0s0riXhg4kYIWFVXsxwJlX
 VSxBphr5fAQrdgBdOOESmiqNxPv9HuLRyR2mZ4blE4ylDa9NRUd+4UTy8x0bRmxD
 BO7qPof0CnrplBNEQi7kDtc91q4UxQoOV/G8LoFp454d9imMlAuPMbizTAULpWZu
 JxPXyrJl
 =fhi8
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates frim Marc Zyngier:

 - More APCI fixes and improvements for the LoongArch architecture,
   adding support for the HTVEC irqchip, suspend-resume, and some
   PCI INTx workarounds

 - Initial DT support for LoongArch. I'm not even kidding.

 - Support for the MTK CIRQv2, a minor deviation from the original version

 - Error handling fixes for wpcm450, GIC...

 - BE detection for a FSL controller

 - Declare the Sifive PLIC as wake-up agnostic

 - Simplify fishing out the device data for the ST irqchip

 - Mark some data structures as __initconst in the apple-aic driver

 - Switch over from strtobool to kstrtobool

 - COMPILE_TEST fixes
2022-12-07 17:50:44 +01:00
Christoph Hellwig
c34b7ac650 block: remove bio_set_op_attrs
This macro is obsolete, so replace the last few uses with open coded
bi_opf assignments.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Coly Li <colyli@suse.de <mailto:colyli@suse.de>>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20221206144057.720846-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 09:43:12 -07:00
Jens Axboe
e18a9c18c3 nvme fixes for Linux 6.1
- initialize core quirks before calling nvme_init_subsystem
    (Pankaj Raghav)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmOQnWwLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPB9BAAmNjkq4F/OOHOXGyZKIbwq8ylFxtkRBD+608c3Woa
 juVI4+bXwhbG4k6h4TeXD2l2DZdjaKfNyG8ePD1GQ3KDU8unQqEUcFbuauW4vFYS
 9FSeIvj2rIUGkiD0t5rNbxKHX4TZ2l8OB3sN1N3abuwZ34S211peCr09Tq80RnrR
 MOzvAB+RqUSaDyAqNfP+N9tvFLGQZXoruNogPY60b6NY6Z4n4QplvymIaU8wpFms
 7lMAROLi9XQTBrrHkSivIDvxscy/HHtl5aYIzF7m/09svIzrYNu1VnqpnVl21kh8
 5qb1FFbzJCxZvqcVWTC3Ozx19LFN4EZx3MvOGrWQ3wWGtpS1aYzIyol783bZe0w0
 Oo6im77jKBocFHNB63vzwckD5vhfNJ9CPIRqBM1UGcU6NYTwOIl32PWAQn3etrW5
 LElYtXdgRoXKMYho8sZAI8EGL3RPDbW/yba6t+VjvOYup2edp3A6iEjRk0HAcOeN
 vaJp1Rdu4DsXNPdJ00ctPABofD1jL8AwOrAz4ZLmCmnA83mf1UWcq2KpMBL6DBZi
 h61hDe4X88wko/3hWHfqwHVxeFuuIvPhceFjhFlGSygxw/kofs3y7JEHSq/uGTBx
 pmXsnsHIxhU/3G2PfM219iwogVuL1qLu7R4mr9TRK7km6SqAcDiL0nDypVy/gGRt
 zRE=
 =/4ct
 -----END PGP SIGNATURE-----

Merge tag 'nvme-6.1-2022-12-07' of git://git.infradead.org/nvme into block-6.1

Pull NVMe fix from Christoph:

"nvme fixes for Linux 6.1

 - initialize core quirks before calling nvme_init_subsystem
   (Pankaj Raghav)"

* tag 'nvme-6.1-2022-12-07' of git://git.infradead.org/nvme:
  nvme initialize core quirks before calling nvme_init_subsystem
2022-12-07 08:55:27 -07:00
Pavel Begunkov
f66f73421f io_uring: skip spinlocking for ->task_complete
->task_complete was added to serialised CQE posting by doing it from
the task context only (or fallback wq when the task is dead), and now we
can use that to avoid taking ->completion_lock while filling CQ entries.
The patch skips spinlocking only in two spots,
__io_submit_flush_completions() and flushing in io_aux_cqe, it's safer
and covers all cases we care about. Extra care is taken to force taking
the lock while queueing overflow entries.

It fundamentally relies on SINGLE_ISSUER to have only one task posting
events. It also need to take into account overflowed CQEs, flushing of
which happens in the cq wait path, and so this implementation also needs
DEFER_TASKRUN to limit waiters. For the same reason we disable it for
SQPOLL, and for IOPOLL as it won't benefit from it in any case.
DEFER_TASKRUN, SQPOLL and IOPOLL requirement may be relaxed in the
future.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2a8c91fd82cfcdcc1d2e5bac7051fe2c183bda73.1670384893.git.asml.silence@gmail.com
[axboe: modify to apply]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 08:51:08 -07:00
Pavel Begunkov
6d043ee116 io_uring: do msg_ring in target task via tw
While executing in a context of one io_uring instance msg_ring
manipulates another ring. We're trying to keep CQEs posting contained in
the context of the ring-owner task, use task_work to send the request to
the target ring's task when we're modifying its CQ or trying to install
a file. Note, we can't safely use io_uring task_work infra and have to
use task_work directly.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4d76c7b28ed5d71b520de4482fbb7f660f21cd80.1670384893.git.asml.silence@gmail.com
[axboe: use TWA_SIGNAL_NO_IPI]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 08:50:57 -07:00
Vladimir Oltean
87a39882b5 net: dsa: mv88e6xxx: accept phy-mode = "internal" for internal PHY ports
The ethernet-controller dt-schema, mostly pushed forward by Linux, has
the "internal" PHY mode for denoting MAC connections to an internal PHY.

U-Boot may provide device tree blobs where this phy-mode is specified,
so make the Linux driver accept them.

It appears that the current behavior with phy-mode = "internal" was
introduced when mv88e6xxx started reporting supported_interfaces to
phylink. Prior to that, I don't think it would have any issues accepting
this phy-mode.

Fixes: d4ebf12bcec4 ("net: dsa: mv88e6xxx: populate supported_interfaces and mac_capabilities")
Link: https://lore.kernel.org/linux-arm-kernel/20221205172709.kglithpbhdbsakvd@skbuf/T/
Reported-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Tim Harvey <tharvey@gateworks.com> # imx6q-gw904.dts
Link: https://lore.kernel.org/r/20221205194845.2131161-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 16:08:06 +01:00
Juergen Gross
7dfa764e02 xen/netback: fix build warning
Commit ad7f402ae4f4 ("xen/netback: Ensure protocol headers don't fall in
the non-linear area") introduced a (valid) build warning. There have
even been reports of this problem breaking networking of Xen guests.

Fixes: ad7f402ae4f4 ("xen/netback: Ensure protocol headers don't fall in the non-linear area")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Tested-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-12-07 16:03:21 +01:00
Yang Yingliang
9e62465185 xen/netback: don't call kfree_skb() under spin_lock_irqsave()
It is not allowed to call kfree_skb() from hardware interrupt
context or with interrupts being disabled. So replace kfree_skb()
with dev_kfree_skb_irq() under spin_lock_irqsave().

Fixes: be81992f9086 ("xen/netback: don't queue unlimited number of packages")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Link: https://lore.kernel.org/r/20221205141333.3974565-1-yangyingliang@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 16:00:33 +01:00
Mario Limonciello
e4678483f9 platform/x86/amd: pmc: Add a workaround for an s0i3 issue on Cezanne
Cezanne platforms under the right circumstances have a synchronization
problem where attempting to enter s2idle may fail if the x86 cores are
put into HLT before hardware resume from the previous attempt has
completed.

To avoid this issue add a 10-20ms delay before entering s2idle another
time. This workaround will only be applied on interrupts that wake the
hardware but don't break the s2idle loop.

Cc: stable@vger.kernel.org # 6.1
Cc: "Mahapatra, Rajib" <Rajib.Mahapatra@amd.com>
Cc: "Raul Rangel" <rrangel@chromium.org>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20221116154341.13382-1-mario.limonciello@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-12-07 15:27:49 +01:00
bayi cheng
8330e9e826
spi: spi-mtk-nor: Add recovery mechanism for dma read timeout
The state machine of MTK spi nor controller may be disturbed by some
glitch signals from the relevant BUS during dma read, Although the
possibility of causing the dma read to fail is next to nothing,
However, if error-handling is not implemented, which makes the feature
somewhat risky.

Add an error-handling mechanism here, reset the state machine and
re-read the data when an error occurs.

Signed-off-by: bayi cheng <bayi.cheng@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20221207055435.30557-1-bayi.cheng@mediatek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-12-07 14:19:22 +00:00
Han Xu
bc9ab1b7a6
spi: spi-fsl-lpspi: add num-cs binding for lpspi
Add num-cs property to support multiple cs for lpspi. This property is
optional.

Signed-off-by: Han Xu <han.xu@nxp.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221206225410.604482-2-han.xu@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-12-07 14:19:21 +00:00
Han Xu
5f947746f0
spi: spi-fsl-lpspi: support multiple cs for lpspi
support to get chip select number from DT file.

Signed-off-by: Han Xu <han.xu@nxp.com>
Link: https://lore.kernel.org/r/20221206225410.604482-1-han.xu@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-12-07 14:19:20 +00:00
Yuan Can
cf34ac6aa2
regulator: qcom-labibb: Fix missing of_node_put() in qcom_labibb_regulator_probe()
The reg_node needs to be released through of_node_put() in the error
handling path when of_irq_get_byname() failed.

Fixes: 390af53e0411 ("regulator: qcom-labibb: Implement short-circuit and over-current IRQs")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Link: https://lore.kernel.org/r/20221203062109.115043-1-yuancan@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-12-07 14:19:18 +00:00
Wang Kefeng
73a0b6ee5d ARM: 9278/1: kfence: only handle translation faults
This is a similar fixup like arm64 does, only handle translation faults
in case of unexpected kfence report when alignment faults on ARM, see
more from commit 0bb1fbffc631 ("arm64: mm: kfence: only handle translation
faults").

Fixes: 75969686ec0d ("ARM: 9166/1: Support KFENCE for ARM")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-12-07 14:08:09 +00:00
Sagi Grimberg
19b00e0069 nvmet: don't open-code NVME_NS_ATTR_RO enumeration
It is already there, just go ahead and use it.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-12-07 15:03:09 +01:00
Christoph Hellwig
0da7feaa59 nvme-pci: use the tagset alloc/free helpers
Use the common helpers to allocate and free the tagsets.  To make this
work the generic nvme_ctrl now needs to be stored in the hctx private
data instead of the nvme_dev.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-12-07 15:02:31 +01:00
Christoph Hellwig
93b24f579c nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set
Add the apple shared tag workaround to nvme_alloc_io_tag_set to prepare
for using that helper in the PCIe driver.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-12-07 15:02:27 +01:00
Christoph Hellwig
b794d1c2ad nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers
The reserved_tags are only needed for fabrics controllers.  Right now only
fabrics drivers call this helper, so this is harmless, but we'll use it
in the PCIe driver soon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-12-07 15:02:24 +01:00
Christoph Hellwig
db45e1a5dd nvme: consolidate setting the tagset flags
All nvme transports should be using the same flags for their tagsets,
with the exception for the blocking flag that should only be set for
transports that can block in ->queue_rq.

Add a NVME_F_BLOCKING flag to nvme_ctrl_ops to control the blocking
behavior and lift setting the flags into nvme_alloc_{admin,io}_tag_set.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-12-07 15:02:20 +01:00
Christoph Hellwig
dcef77274a nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set
Don't look at ctrl->ops as only RDMA and TCP actually support multiple
maps.

Fixes: 6dfba1c09c10 ("nvme-fc: use the tagset alloc/free helpers")
Fixes: ceee1953f923 ("nvme-loop: use the tagset alloc/free helpers")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-12-07 15:02:15 +01:00
Pavel Begunkov
1721131016 io_uring: extract a io_msg_install_complete helper
Extract a helper called io_msg_install_complete() from io_msg_send_fd(),
will be used later.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1500ca1054cc4286a3ee1c60aacead57fcdfa02a.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:47:13 -07:00
Pavel Begunkov
11373026f2 io_uring: get rid of double locking
We don't need to take both uring_locks at once, msg_ring can be split in
two parts, first getting a file from the filetable of the first ring and
then installing it into the second one.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a80ecc2bc99c3b3f2cf20015d618b7c51419a797.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:47:13 -07:00
Pavel Begunkov
77e443ab29 io_uring: never run tw and fallback in parallel
Once we fallback a tw we want all requests to that task to be given to
the fallback wq so we dont run it in parallel with the last, i.e. post
PF_EXITING, tw run of the task.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/96f4987265c4312f376f206511c6af3e77aaf5ac.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:47:13 -07:00
Pavel Begunkov
d34b1b0b67 io_uring: use tw for putting rsrc
Use task_work for completing rsrc removals, it'll be needed later for
spinlock optimisations.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/cbba5d53a11ee6fc2194dacea262c1d733c8b529.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:47:13 -07:00
Pavel Begunkov
17add5cea2 io_uring: force multishot CQEs into task context
Multishot are posting CQEs outside of the normal request completion
path, which is usually done from within a task work handler. However, it
might be not the case when it's yet to be polled but has been punted to
io-wq. Make it abide ->task_complete and push it to the polling path
when executed by io-wq.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d7714aaff583096769a0f26e8e747759e556feb1.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:47:13 -07:00
Pavel Begunkov
e6aeb2721d io_uring: complete all requests in task context
This patch adds ctx->task_complete flag. If set, we'll complete all
requests in the context of the original task. Note, this extends to
completion CQE posting only but not io_kiocb cleanup / free, e.g. io-wq
may free the requests in the free calllback. This flag will be used
later for optimisations purposes.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/21ece72953f76bb2e77659a72a14326227ab6460.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:47:13 -07:00
Pavel Begunkov
1b346e4aa8 io_uring: don't check overflow flush failures
The only way to fail overflowed CQEs flush is for CQ to be fully packed.
There is one place checking for flush failures, i.e. io_cqring_wait(),
but we limit the number to be waited for by the CQ size, so getting a
failure automatically means that we're done with waiting.

Don't check for failures, rarely but they might spuriously fail CQ
waiting with -EBUSY.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6b720a45c03345655517f8202cbd0bece2848fb2.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:47:13 -07:00
Pavel Begunkov
a85381d832 io_uring: skip overflow CQE posting for dying ring
After io_ring_ctx_wait_and_kill() is called there should be no users
poking into rings and so there is no need to post CQEs. So, instead of
trying to post overflowed CQEs into the CQ, drop them. Also, do it
in io_ring_exit_work() in a loop to reduce the number of contexts it
can be executed from and even when it struggles to quiesce the ring we
won't be leaving memory allocated for longer than needed.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/26d13751155a735a3029e24f8d9ca992f810419d.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:47:13 -07:00
Pavel Begunkov
4c979eaefa io_uring: improve io_double_lock_ctx fail handling
msg_ring will fail the request if it can't lock rings, instead punt it
to io-wq as was originally intended.

Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4697f05afcc37df5c8f89e2fe6d9c7c19f0241f9.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:47:13 -07:00
Pavel Begunkov
ef0ec1ad03 io_uring: dont remove file from msg_ring reqs
We should not be messing with req->file outside of core paths. Clearing
it makes msg_ring non reentrant, i.e. luckily io_msg_send_fd() fails the
request on failed io_double_lock_ctx() but clearly was originally
intended to do retries instead.

Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e5ac9edadb574fe33f6d727cb8f14ce68262a684.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:47:13 -07:00
Harshit Mogalapalli
998b30c394 io_uring: Fix a null-ptr-deref in io_tctx_exit_cb()
Syzkaller reports a NULL deref bug as follows:

 BUG: KASAN: null-ptr-deref in io_tctx_exit_cb+0x53/0xd3
 Read of size 4 at addr 0000000000000138 by task file1/1955

 CPU: 1 PID: 1955 Comm: file1 Not tainted 6.1.0-rc7-00103-gef4d3ea40565 #75
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0xcd/0x134
  ? io_tctx_exit_cb+0x53/0xd3
  kasan_report+0xbb/0x1f0
  ? io_tctx_exit_cb+0x53/0xd3
  kasan_check_range+0x140/0x190
  io_tctx_exit_cb+0x53/0xd3
  task_work_run+0x164/0x250
  ? task_work_cancel+0x30/0x30
  get_signal+0x1c3/0x2440
  ? lock_downgrade+0x6e0/0x6e0
  ? lock_downgrade+0x6e0/0x6e0
  ? exit_signals+0x8b0/0x8b0
  ? do_raw_read_unlock+0x3b/0x70
  ? do_raw_spin_unlock+0x50/0x230
  arch_do_signal_or_restart+0x82/0x2470
  ? kmem_cache_free+0x260/0x4b0
  ? putname+0xfe/0x140
  ? get_sigframe_size+0x10/0x10
  ? do_execveat_common.isra.0+0x226/0x710
  ? lockdep_hardirqs_on+0x79/0x100
  ? putname+0xfe/0x140
  ? do_execveat_common.isra.0+0x238/0x710
  exit_to_user_mode_prepare+0x15f/0x250
  syscall_exit_to_user_mode+0x19/0x50
  do_syscall_64+0x42/0xb0
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 RIP: 0023:0x0
 Code: Unable to access opcode bytes at 0xffffffffffffffd6.
 RSP: 002b:00000000fffb7790 EFLAGS: 00000200 ORIG_RAX: 000000000000000b
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  </TASK>
 Kernel panic - not syncing: panic_on_warn set ...

This happens because the adding of task_work from io_ring_exit_work()
isn't synchronized with canceling all work items from eg exec. The
execution of the two are ordered in that they are both run by the task
itself, but if io_tctx_exit_cb() is queued while we're canceling all
work items off exec AND gets executed when the task exits to userspace
rather than in the main loop in io_uring_cancel_generic(), then we can
find current->io_uring == NULL and hit the above crash.

It's safe to add this NULL check here, because the execution of the two
paths are done by the task itself.

Cc: stable@vger.kernel.org
Fixes: d56d938b4bef ("io_uring: do ctx initiated file note removal")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Link: https://lore.kernel.org/r/20221206093833.3812138-1-harshit.m.mogalapalli@oracle.com
[axboe: add code comment and also put an explanation in the commit msg]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07 06:45:20 -07:00
Paolo Abeni
92439a8590 Merge tag 'ieee802154-for-net-2022-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
Stefan Schmidt says:

====================
pull-request: ieee802154 for net 2022-12-05

An update from ieee802154 for your *net* tree:

Three small fixes this time around.

Ziyang Xuan fixed an error code for a timeout during initialization of the
cc2520 driver.
Hauke Mehrtens fixed a crash in the ca8210 driver SPI communication due
uninitialized SPI structures.
Wei Yongjun added INIT_LIST_HEAD ieee802154_if_add() to avoid a potential
null pointer dereference.
====================

Link: https://lore.kernel.org/r/20221205122515.1720539-1-stefan@datenfreihafen.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 13:50:16 +01:00
Yuan Can
4fad22a128 dpaa2-switch: Fix memory leak in dpaa2_switch_acl_entry_add() and dpaa2_switch_acl_entry_remove()
The cmd_buff needs to be freed when error happened in
dpaa2_switch_acl_entry_add() and dpaa2_switch_acl_entry_remove().

Fixes: 1110318d83e8 ("dpaa2-switch: add tc flower hardware offload on ingress traffic")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Link: https://lore.kernel.org/r/20221205061515.115012-1-yuancan@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 12:14:12 +01:00
Zhang Changzhong
063a932b64 ethernet: aeroflex: fix potential skb leak in greth_init_rings()
The greth_init_rings() function won't free the newly allocated skb when
dma_mapping_error() returns error, so add dev_kfree_skb() to fix it.

Compile tested only.

Fixes: d4c41139df6e ("net: Add Aeroflex Gaisler 10/100/1G Ethernet MAC driver")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/1670134149-29516-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:45:52 +01:00
Xin Long
88956177db tipc: call tipc_lxc_xmit without holding node_read_lock
When sending packets between nodes in netns, it calls tipc_lxc_xmit() for
peer node to receive the packets where tipc_sk_mcast_rcv()/tipc_sk_rcv()
might be called, and it's pretty much like in tipc_rcv().

Currently the local 'node rw lock' is held during calling tipc_lxc_xmit()
to protect the peer_net not being freed by another thread. However, when
receiving these packets, tipc_node_add_conn() might be called where the
peer 'node rw lock' is acquired. Then a dead lock warning is triggered by
lockdep detector, although it is not a real dead lock:

    WARNING: possible recursive locking detected
    --------------------------------------------
    conn_server/1086 is trying to acquire lock:
    ffff8880065cb020 (&n->lock#2){++--}-{2:2}, \
                     at: tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]

    but task is already holding lock:
    ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
                     at: tipc_node_xmit+0x285/0xb30 [tipc]

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(&n->lock#2);
      lock(&n->lock#2);

     *** DEADLOCK ***

     May be due to missing lock nesting notation

    4 locks held by conn_server/1086:
     #0: ffff8880036d1e40 (sk_lock-AF_TIPC){+.+.}-{0:0}, \
                          at: tipc_accept+0x9c0/0x10b0 [tipc]
     #1: ffff8880036d5f80 (sk_lock-AF_TIPC/1){+.+.}-{0:0}, \
                          at: tipc_accept+0x363/0x10b0 [tipc]
     #2: ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
                          at: tipc_node_xmit+0x285/0xb30 [tipc]
     #3: ffff888012e13370 (slock-AF_TIPC){+...}-{2:2}, \
                          at: tipc_sk_rcv+0x2da/0x1b40 [tipc]

    Call Trace:
     <TASK>
     dump_stack_lvl+0x44/0x5b
     __lock_acquire.cold.77+0x1f2/0x3d7
     lock_acquire+0x1d2/0x610
     _raw_write_lock_bh+0x38/0x80
     tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]
     tipc_sk_finish_conn+0x21e/0x640 [tipc]
     tipc_sk_filter_rcv+0x147b/0x3030 [tipc]
     tipc_sk_rcv+0xbb4/0x1b40 [tipc]
     tipc_lxc_xmit+0x225/0x26b [tipc]
     tipc_node_xmit.cold.82+0x4a/0x102 [tipc]
     __tipc_sendstream+0x879/0xff0 [tipc]
     tipc_accept+0x966/0x10b0 [tipc]
     do_accept+0x37d/0x590

This patch avoids this warning by not holding the 'node rw lock' before
calling tipc_lxc_xmit(). As to protect the 'peer_net', rcu_read_lock()
should be enough, as in cleanup_net() when freeing the netns, it calls
synchronize_rcu() before the free is continued.

Also since tipc_lxc_xmit() is like the RX path in tipc_rcv(), it makes
sense to call it under rcu_read_lock(). Note that the right lock order
must be:

   rcu_read_lock();
   tipc_node_read_lock(n);
   tipc_node_read_unlock(n);
   tipc_lxc_xmit();
   rcu_read_unlock();

instead of:

   tipc_node_read_lock(n);
   rcu_read_lock();
   tipc_node_read_unlock(n);
   tipc_lxc_xmit();
   rcu_read_unlock();

and we have to call tipc_node_read_lock/unlock() twice in
tipc_node_xmit().

Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/5bdd1f8fee9db695cfff4528a48c9b9d0523fb00.1670110641.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:32:04 +01:00
Frank Jungclaus
918ee4911f can: esd_usb: Allow REC and TEC to return to zero
We don't get any further EVENT from an esd CAN USB device for changes
on REC or TEC while those counters converge to 0 (with ecc == 0). So
when handling the "Back to Error Active"-event force txerr = rxerr =
0, otherwise the berr-counters might stay on values like 95 forever.

Also, to make life easier during the ongoing development a
netdev_dbg() has been introduced to allow dumping error events send by
an esd CAN USB device.

Fixes: 96d8e90382dc ("can: Add driver for esd CAN-USB/2 device")
Signed-off-by: Frank Jungclaus <frank.jungclaus@esd.eu>
Link: https://lore.kernel.org/all/20221130202242.3998219-2-frank.jungclaus@esd.eu
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-07 10:32:48 +01:00
Max Staudt
f4a4d121eb can: can327: flush TX_work on ldisc .close()
Additionally, remove it from .ndo_stop().

This ensures that the worker is not called after being freed, and that
the UART TX queue remains active to send final commands when the
netdev is stopped.

Thanks to Jiri Slaby for finding this in slcan:

  https://lore.kernel.org/linux-can/20221201073426.17328-1-jirislaby@kernel.org/

A variant of this patch for slcan, with the flush in .ndo_stop() still
present, has been tested successfully on physical hardware:

  https://bugzilla.suse.com/show_bug.cgi?id=1205597

Fixes: 43da2f07622f ("can: can327: CAN/ldisc driver for ELM327 based OBD-II adapters")
Cc: "Jiri Slaby (SUSE)" <jirislaby@kernel.org>
Cc: Max Staudt <max@enpas.org>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-can@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Max Staudt <max@enpas.org>
Link: https://lore.kernel.org/all/20221202160148.282564-1-max@enpas.org
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-07 10:32:36 +01:00
Jiri Slaby (SUSE)
fb855e9f3b can: slcan: fix freed work crash
The LTP test pty03 is causing a crash in slcan:
  BUG: kernel NULL pointer dereference, address: 0000000000000008
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 0 PID: 348 Comm: kworker/0:3 Not tainted 6.0.8-1-default #1 openSUSE Tumbleweed 9d20364b934f5aab0a9bdf84e8f45cfdfae39dab
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
  Workqueue:  0x0 (events)
  RIP: 0010:process_one_work (/home/rich/kernel/linux/kernel/workqueue.c:706 /home/rich/kernel/linux/kernel/workqueue.c:2185)
  Code: 49 89 ff 41 56 41 55 41 54 55 53 48 89 f3 48 83 ec 10 48 8b 06 48 8b 6f 48 49 89 c4 45 30 e4 a8 04 b8 00 00 00 00 4c 0f 44 e0 <49> 8b 44 24 08 44 8b a8 00 01 00 00 41 83 e5 20 f6 45 10 04 75 0e
  RSP: 0018:ffffaf7b40f47e98 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff9d644e1b8b48 RCX: ffff9d649e439968
  RDX: 00000000ffff8455 RSI: ffff9d644e1b8b48 RDI: ffff9d64764aa6c0
  RBP: ffff9d649e4335c0 R08: 0000000000000c00 R09: ffff9d64764aa734
  R10: 0000000000000007 R11: 0000000000000001 R12: 0000000000000000
  R13: ffff9d649e4335e8 R14: ffff9d64490da780 R15: ffff9d64764aa6c0
  FS:  0000000000000000(0000) GS:ffff9d649e400000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000008 CR3: 0000000036424000 CR4: 00000000000006f0
  Call Trace:
   <TASK>
  worker_thread (/home/rich/kernel/linux/kernel/workqueue.c:2436)
  kthread (/home/rich/kernel/linux/kernel/kthread.c:376)
  ret_from_fork (/home/rich/kernel/linux/arch/x86/entry/entry_64.S:312)

Apparently, the slcan's tx_work is freed while being scheduled. While
slcan_netdev_close() (netdev side) calls flush_work(&sl->tx_work),
slcan_close() (tty side) does not. So when the netdev is never set UP,
but the tty is stuffed with bytes and forced to wakeup write, the work
is scheduled, but never flushed.

So add an additional flush_work() to slcan_close() to be sure the work
is flushed under all circumstances.

The Fixes commit below moved flush_work() from slcan_close() to
slcan_netdev_close(). What was the rationale behind it? Maybe we can
drop the one in slcan_netdev_close()?

I see the same pattern in can327. So it perhaps needs the very same fix.

Fixes: cfcb4465e992 ("can: slcan: remove legacy infrastructure")
Link: https://bugzilla.suse.com/show_bug.cgi?id=1205597
Reported-by: Richard Palethorpe <richard.palethorpe@suse.com>
Tested-by: Petr Vorel <petr.vorel@suse.com>
Cc: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-can@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: Max Staudt <max@enpas.org>
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Reviewed-by: Max Staudt <max@enpas.org>
Link: https://lore.kernel.org/all/20221201073426.17328-1-jirislaby@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-07 10:32:24 +01:00
Oliver Hartkopp
0acc442309 can: af_can: fix NULL pointer dereference in can_rcv_filter
Analogue to commit 8aa59e355949 ("can: af_can: fix NULL pointer
dereference in can_rx_register()") we need to check for a missing
initialization of ml_priv in the receive path of CAN frames.

Since commit 4e096a18867a ("net: introduce CAN specific pointer in the
struct net_device") the check for dev->type to be ARPHRD_CAN is not
sufficient anymore since bonding or tun netdevices claim to be CAN
devices but do not initialize ml_priv accordingly.

Fixes: 4e096a18867a ("net: introduce CAN specific pointer in the struct net_device")
Reported-by: syzbot+2d7f58292cb5b29eb5ad@syzkaller.appspotmail.com
Reported-by: Wei Chen <harperchen1110@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/all/20221206201259.3028-1-socketcan@hartkopp.net
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-07 10:30:47 +01:00
Jakub Kicinski
1799c1b85e Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-12-05 (i40e)

Michal clears XPS init flag on reset to allow for updated values to be
written.

Sylwester adds sleep to VF reset to resolve issue of VFs not getting
resources.

Przemyslaw rejects filters for raw IPv4 or IPv6 l4_4_bytes filters as they
 are not supported.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  i40e: Disallow ip4 and ip6 l4_4_bytes
  i40e: Fix for VF MAC address 0
  i40e: Fix not setting default xps_cpus after reset
====================

Link: https://lore.kernel.org/r/20221205212523.3197565-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:46:33 -08:00
Zhengchao Shao
78a9ea43fc net: dsa: sja1105: fix memory leak in sja1105_setup_devlink_regions()
When dsa_devlink_region_create failed in sja1105_setup_devlink_regions(),
priv->regions is not released.

Fixes: bf425b82059e ("net: dsa: sja1105: expose static config as devlink region")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221205012132.2110979-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:36:38 -08:00
Jakub Kicinski
e40febfb9c Merge branch 'ipv4-two-bug-fixes'
Ido Schimmel says:

====================
ipv4: Two small fixes for bugs in IPv4 routing code.

A variation of the second bug was reported by an FRR 5.0 (released
06/18) user as this version was setting a table ID of 0 for the
default VRF, unlike iproute2 and newer FRR versions.

The first bug was discovered while fixing the second.

Both bugs are not regressions (never worked) and are not critical
in my opinion, so the fixes can be applied to net-next, if desired.

No regressions in other tests:

 # ./fib_tests.sh
 ...
 Tests passed: 191
 Tests failed:   0
====================

Link: https://lore.kernel.org/r/20221204075045.3780097-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:34:46 -08:00
Ido Schimmel
c0d999348e ipv4: Fix incorrect route flushing when table ID 0 is used
Cited commit added the table ID to the FIB info structure, but did not
properly initialize it when table ID 0 is used. This can lead to a route
in the default VRF with a preferred source address not being flushed
when the address is deleted.

Consider the following example:

 # ip address add dev dummy1 192.0.2.1/28
 # ip address add dev dummy1 192.0.2.17/28
 # ip route add 198.51.100.0/24 via 192.0.2.2 src 192.0.2.17 metric 100
 # ip route add table 0 198.51.100.0/24 via 192.0.2.2 src 192.0.2.17 metric 200
 # ip route show 198.51.100.0/24
 198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 100
 198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 200

Both routes are installed in the default VRF, but they are using two
different FIB info structures. One with a metric of 100 and table ID of
254 (main) and one with a metric of 200 and table ID of 0. Therefore,
when the preferred source address is deleted from the default VRF,
the second route is not flushed:

 # ip address del dev dummy1 192.0.2.17/28
 # ip route show 198.51.100.0/24
 198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 200

Fix by storing a table ID of 254 instead of 0 in the route configuration
structure.

Add a test case that fails before the fix:

 # ./fib_tests.sh -t ipv4_del_addr

 IPv4 delete address route tests
     Regular FIB info
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Identical FIB info with different table ID
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Table ID 0
     TEST: Route removed in default VRF when source address deleted      [FAIL]

 Tests passed:   8
 Tests failed:   1

And passes after:

 # ./fib_tests.sh -t ipv4_del_addr

 IPv4 delete address route tests
     Regular FIB info
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Identical FIB info with different table ID
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Table ID 0
     TEST: Route removed in default VRF when source address deleted      [ OK ]

 Tests passed:   9
 Tests failed:   0

Fixes: 5a56a0b3a45d ("net: Don't delete routes in different VRFs")
Reported-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:34:43 -08:00
Ido Schimmel
f96a3d7455 ipv4: Fix incorrect route flushing when source address is deleted
Cited commit added the table ID to the FIB info structure, but did not
prevent structures with different table IDs from being consolidated.
This can lead to routes being flushed from a VRF when an address is
deleted from a different VRF.

Fix by taking the table ID into account when looking for a matching FIB
info. This is already done for FIB info structures backed by a nexthop
object in fib_find_info_nh().

Add test cases that fail before the fix:

 # ./fib_tests.sh -t ipv4_del_addr

 IPv4 delete address route tests
     Regular FIB info
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Identical FIB info with different table ID
     TEST: Route removed from VRF when source address deleted            [FAIL]
     TEST: Route in default VRF not removed                              [ OK ]
 RTNETLINK answers: File exists
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [FAIL]

 Tests passed:   6
 Tests failed:   2

And pass after:

 # ./fib_tests.sh -t ipv4_del_addr

 IPv4 delete address route tests
     Regular FIB info
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]
     Identical FIB info with different table ID
     TEST: Route removed from VRF when source address deleted            [ OK ]
     TEST: Route in default VRF not removed                              [ OK ]
     TEST: Route removed in default VRF when source address deleted      [ OK ]
     TEST: Route in VRF is not removed by address delete                 [ OK ]

 Tests passed:   8
 Tests failed:   0

Fixes: 5a56a0b3a45d ("net: Don't delete routes in different VRFs")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:34:43 -08:00
Rasmus Villemoes
7e6303567c net: fec: properly guard irq coalesce setup
Prior to the Fixes: commit, the initialization code went through the
same fec_enet_set_coalesce() function as used by ethtool, and that
function correctly checks whether the current variant has support for
irq coalescing.

Now that the initialization code instead calls fec_enet_itr_coal_set()
directly, that call needs to be guarded by a check for the
FEC_QUIRK_HAS_COALESCE bit.

Fixes: df727d4547de (net: fec: don't reset irq coalesce settings to defaults on "ip link up")
Reported-by: Greg Ungerer <gregungerer@westnet.com.au>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221205204604.869853-1-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:22:34 -08:00