Commit Graph

1123427 Commits

Author SHA1 Message Date
Janis Schoetterl-Glausch
b3cefd6bf1 KVM: s390: Pass initialized arg even if unused
This silences smatch warnings reported by kbuild bot:
arch/s390/kvm/gaccess.c:859 guest_range_to_gpas() error: uninitialized symbol 'prot'.
arch/s390/kvm/gaccess.c:1064 access_guest_with_key() error: uninitialized symbol 'prot'.

This is because it cannot tell that the value is not used in this case.
The trans_exc* only examine prot if code is PGM_PROTECTION.
Pass a dummy value for other codes.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20220825192540.1560559-1-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-09-21 16:18:35 +02:00
Matthew Rosato
e8c924a4fb KVM: s390: pci: fix plain integer as NULL pointer warnings
Fix some sparse warnings that a plain integer 0 is being used instead of
NULL.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220915175514.167899-1-mjrosato@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-09-21 16:18:30 +02:00
Jakub Kicinski
375a683321 linux-can-fixes-for-6.0-20220921
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEBsvAIBsPu6mG7thcrX5LkNig010FAmMqy5UTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCtfkuQ2KDTXSEiB/93YAPIO2GlO1tGmvCvTiiNrzGR21au
 BNCHLFrFTKWrOnceA8c6Jm0YjeAXDFM0OMN0LR3JF9aVmSuKSGeqjDVqdKvB1ykq
 93V2vQEnIVaOI9zkyo9zK+xVrJx8sso3+RypNeDp75NnBM6IiXJvSvCYQFt8zSvn
 qqV7RPo+R+e5g+o3iQnfb38Ej9BLDHCtOODMBTD2VmIiNSmWBI9COTscBGSD1vNO
 nxJbBBSKc0UFoAYOc7D9DejG8Dd0rlBDHvwpXA2Y28zYnTAfttCQOvdlO2r2waF2
 y3A37MpmCQg9iMAUh7LJQbHzL28UNCIlUNB9Zl9FxX7cZ+49oheJAKuV
 =5hoh
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-6.0-20220921' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2022-09-21

The 1st patch is by me, targets the flexcan driver and fixes a
potential system hang on single core systems under high CAN packet
rate.

The next 2 patches are also by me and target the gs_usb driver. A
potential race condition during the ndo_open callback as well as the
return value if the ethtool identify feature is not supported are
fixed.

* tag 'linux-can-fixes-for-6.0-20220921' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported
  can: gs_usb: gs_can_open(): fix race dev->can.state condition
  can: flexcan: flexcan_mailbox_read() fix return value for drop = true
====================

Link: https://lore.kernel.org/r/20220921083609.419768-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-21 06:52:32 -07:00
Lieven Hey
babd04386b perf jit: Include program header in ELF files
The missing header makes it hard for programs like elfutils to open
these files.

Fixes: 2d86612aac ("perf symbol: Correct address for bss symbols")
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Lieven Hey <lieven.hey@kdab.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220915092910.711036-1-lieven.hey@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-21 10:30:55 -03:00
Namhyung Kim
7901086014 perf test: Add a new test for perf stat cgroup BPF counter
$ sudo ./perf test -v each-cgroup
   96: perf stat --bpf-counters --for-each-cgroup test                 :
  --- start ---
  test child forked, pid 79600
  test child finished with 0
  ---- end ----
  perf stat --bpf-counters --for-each-cgroup test: Ok

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220916184132.1161506-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-21 10:30:55 -03:00
Namhyung Kim
8a92605daa perf stat: Use evsel->core.cpus to iterate cpus in BPF cgroup counters
If it mixes core and uncore events, each evsel would have different cpu map.
But it assumed they are same with evlist's all_cpus and accessed by the same
index.  This resulted in a crash like below.

  $ perf stat -a --bpf-counters --for-each_cgroup ^. -e cycles,imc/cas_count_read/ sleep 1
  Segmentation fault

While it's not recommended to use uncore events for cgroup aggregation, it
should not crash.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220916184132.1161506-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-21 10:30:55 -03:00
Namhyung Kim
3da35231d9 perf stat: Fix cpu map index in bperf cgroup code
The previous cpu map introduced a bug in the bperf cgroup counter.  This
results in a failure when user gives a partial cpu map starting from
non-zero.

  $ sudo ./perf stat -C 1-2 --bpf-counters --for-each-cgroup ^. sleep 1
  libbpf: prog 'on_cgrp_switch': failed to create BPF link for perf_event FD 0:
                                 -9 (Bad file descriptor)
  Failed to attach cgroup program

To get the FD of an evsel, it should use a map index not the CPU number.

Fixes: 0255571a16 ("perf cpumap: Switch to using perf_cpu_map API")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: bpf@vger.kernel.org
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20220916184132.1161506-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-21 10:30:55 -03:00
Namhyung Kim
0d77326c33 perf stat: Fix BPF program section name
It seems the recent libbpf got more strict about the section name.
I'm seeing a failure like this:

  $ sudo ./perf stat -a --bpf-counters --for-each-cgroup ^. sleep 1
  libbpf: prog 'on_cgrp_switch': missing BPF prog type, check ELF section name 'perf_events'
  libbpf: prog 'on_cgrp_switch': failed to load: -22
  libbpf: failed to load object 'bperf_cgroup_bpf'
  libbpf: failed to load BPF skeleton 'bperf_cgroup_bpf': -22
  Failed to load cgroup skeleton

The section name should be 'perf_event' (without the trailing 's').
Although it's related to the libbpf change, it'd be better fix the
section name in the first place.

Fixes: 944138f048 ("perf stat: Enable BPF counter with --for-each-cgroup")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: bpf@vger.kernel.org
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20220916184132.1161506-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-21 10:30:55 -03:00
Jianglei Nie
65e5d27df6 net: atlantic: fix potential memory leak in aq_ndev_close()
If aq_nic_stop() fails, aq_ndev_close() returns err without calling
aq_nic_deinit() to release the relevant memory and resource, which
will lead to a memory leak.

We can fix it by deleting the if condition judgment and goto statement to
call aq_nic_deinit() directly after aq_nic_stop() to fix the memory leak.

Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-21 12:50:57 +01:00
Yi Liu
1548978070 iommu/vt-d: Check correct capability for sagaw determination
Check 5-level paging capability for 57 bits address width instead of
checking 1GB large page capability.

Fixes: 53fc7ad6ed ("iommu/vt-d: Correctly calculate sagaw value of IOMMU")
Cc: stable@vger.kernel.org
Reported-by: Raghunathan Srinivasan <raghunathan.srinivasan@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Raghunathan Srinivasan <raghunathan.srinivasan@intel.com>
Link: https://lore.kernel.org/r/20220916071212.2223869-2-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-21 10:22:54 +02:00
Lu Baolu
7ebb5f8e00 Revert "iommu/vt-d: Fix possible recursive locking in intel_iommu_init()"
This reverts commit 9cd4f14344.

Some issues were reported on the original commit. Some thunderbolt devices
don't work anymore due to the following DMA fault.

DMAR: DRHD: handling fault status reg 2
DMAR: [INTR-REMAP] Request device [09:00.0] fault index 0x8080
      [fault reason 0x25]
      Blocked a compatibility format interrupt request

Bring it back for now to avoid functional regression.

Fixes: 9cd4f14344 ("iommu/vt-d: Fix possible recursive locking in intel_iommu_init()")
Link: https://lore.kernel.org/linux-iommu/485A6EA5-6D58-42EA-B298-8571E97422DE@getmailspring.com/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216497
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: <stable@vger.kernel.org> # 5.19.x
Reported-and-tested-by: George Hilliard <thirtythreeforty@gmail.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220920081701.3453504-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-21 10:22:54 +02:00
David S. Miller
79a392a3b1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:

====================
netfilter: bugfixes for net

The following set contains netfilter fixes for the *net* tree.

Regressions (rc only):
recent ebtables crash fix was incomplete, it added a memory leak.

The patch to fix possible buffer overrun for BIG TCP in ftp conntrack
tried to be too clever, we cannot re-use ct->lock: NAT engine might
grab it again -> deadlock.  Revert back to a global spinlock.
Both from myself.

Remove the documentation for the recently removed
'nf_conntrack_helper' sysctl as well, from Pablo Neira.

The static_branch_inc() that guards the 'chain stats enabled' path
needs to be deferred further, until the entire transaction was created.
From Tetsuo Handa.

Older bugs:
Since 5.3:
nf_tables_addchain may leak pcpu memory in error path when
offloading fails. Also from Tetsuo Handa.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-21 09:07:53 +01:00
Marc Kleine-Budde
0f2211f1cf can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported
Until commit 409c188c57 ("can: tree-wide: advertise software
timestamping capabilities") the ethtool_ops was only assigned for
devices which support the GS_CAN_FEATURE_IDENTIFY feature. That commit
assigns ethtool_ops unconditionally.

This results on controllers without GS_CAN_FEATURE_IDENTIFY support
for the following ethtool error:

| $ ethtool -p can0 1
| Cannot identify NIC: Broken pipe

Restore the correct error value by checking for
GS_CAN_FEATURE_IDENTIFY in the gs_usb_set_phys_id() function.

| $ ethtool -p can0 1
| Cannot identify NIC: Operation not supported

While there use the variable "netdev" for the "struct net_device"
pointer and "dev" for the "struct gs_can" pointer as in the rest of
the driver.

Fixes: 409c188c57 ("can: tree-wide: advertise software timestamping capabilities")
Link: http://lore.kernel.org/all/20220818143853.2671854-1-mkl@pengutronix.de
Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-09-21 09:48:52 +02:00
Marc Kleine-Budde
5440428b3d can: gs_usb: gs_can_open(): fix race dev->can.state condition
The dev->can.state is set to CAN_STATE_ERROR_ACTIVE, after the device
has been started. On busy networks the CAN controller might receive
CAN frame between and go into an error state before the dev->can.state
is assigned.

Assign dev->can.state before starting the controller to close the race
window.

Fixes: d08e973a77 ("can: gs_usb: Added support for the GS_USB CAN devices")
Link: https://lore.kernel.org/all/20220920195216.232481-1-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-09-21 09:48:51 +02:00
Marc Kleine-Budde
a09721dd47 can: flexcan: flexcan_mailbox_read() fix return value for drop = true
The following happened on an i.MX25 using flexcan with many packets on
the bus:

The rx-offload queue reached a length more than skb_queue_len_max. In
can_rx_offload_offload_one() the drop variable was set to true which
made the call to .mailbox_read() (here: flexcan_mailbox_read()) to
_always_ return ERR_PTR(-ENOBUFS) and drop the rx'ed CAN frame. So
can_rx_offload_offload_one() returned ERR_PTR(-ENOBUFS), too.

can_rx_offload_irq_offload_fifo() looks as follows:

| 	while (1) {
| 		skb = can_rx_offload_offload_one(offload, 0);
| 		if (IS_ERR(skb))
| 			continue;
| 		if (!skb)
| 			break;
| 		...
| 	}

The flexcan driver wrongly always returns ERR_PTR(-ENOBUFS) if drop is
requested, even if there is no CAN frame pending. As the i.MX25 is a
single core CPU, while the rx-offload processing is active, there is
no thread to process packets from the offload queue. So the queue
doesn't get any shorter and this results is a tight loop.

Instead of always returning ERR_PTR(-ENOBUFS) if drop is requested,
return NULL if no CAN frame is pending.

Changes since v1: https://lore.kernel.org/all/20220810144536.389237-1-u.kleine-koenig@pengutronix.de
- don't break in can_rx_offload_irq_offload_fifo() in case of an error,
  return NULL in flexcan_mailbox_read() in case of no pending CAN frame
  instead

Fixes: 4e9c9484b0 ("can: rx-offload: Prepare for CAN FD support")
Link: https://lore.kernel.org/all/20220811094254.1864367-1-mkl@pengutronix.de
Cc: stable@vger.kernel.org # v5.5
Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Thorsten Scherer <t.scherer@eckelmann.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-09-21 09:48:51 +02:00
Meng Li
69bef19d6b gpiolib: cdev: Set lineevent_state::irq after IRQ register successfully
When running gpio test on nxp-ls1028 platform with below command
gpiomon --num-events=3 --rising-edge gpiochip1 25
There will be a warning trace as below:
Call trace:
free_irq+0x204/0x360
lineevent_free+0x64/0x70
gpio_ioctl+0x598/0x6a0
__arm64_sys_ioctl+0xb4/0x100
invoke_syscall+0x5c/0x130
......
el0t_64_sync+0x1a0/0x1a4
The reason of this issue is that calling request_threaded_irq()
function failed, and then lineevent_free() is invoked to release
the resource. Since the lineevent_state::irq was already set, so
the subsequent invocation of free_irq() would trigger the above
warning call trace. To fix this issue, set the lineevent_state::irq
after the IRQ register successfully.

Fixes: 4682427241 ("gpiolib: cdev: refactor lineevent cleanup into lineevent_free")
Cc: stable@vger.kernel.org
Signed-off-by: Meng Li <Meng.Li@windriver.com>
Reviewed-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-09-21 09:32:11 +02:00
Dongliang Mu
21a9acc162 gpio: tqmx86: fix uninitialized variable girq
The commit 924610607f ("gpio: tpmx86: Move PM device over to
irq domain") adds a dereference of girq that may be uninitialized.

Fix this by moving irq_domain_set_pm_device into if true branch
as suggested by Marc Zyngier.

Fixes: 924610607f ("gpio: tpmx86: Move PM device over to irq domain")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-09-21 09:31:22 +02:00
David Gow
bd71558d58 arch: um: Mark the stack non-executable to fix a binutils warning
Since binutils 2.39, ld will print a warning if any stack section is
executable, which is the default for stack sections on files without a
.note.GNU-stack section.

This was fixed for x86 in commit ffcf9c5700 ("x86: link vdso and boot with -z noexecstack --no-warn-rwx-segments"),
but remained broken for UML, resulting in several warnings:

/usr/bin/ld: warning: arch/x86/um/vdso/vdso.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms1 has a LOAD segment with RWX permissions
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms1.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms2 has a LOAD segment with RWX permissions
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms2.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/bin/ld: warning: vmlinux has a LOAD segment with RWX permissions

Link both the VDSO and vmlinux with -z noexecstack, fixing the warnings
about .note.GNU-stack sections. In addition, pass --no-warn-rwx-segments
to dodge the remaining warnings about LOAD segments with RWX permissions
in the kallsyms objects. (Note that this flag is apparently not
available on lld, so hide it behind a test for BFD, which is what the
x86 patch does.)

Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ffcf9c5700e49c0aee42dcba9a12ba21338e8136
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Richard Weinberger <richard@nod.at>
2022-09-21 09:11:42 +02:00
Geert Uytterhoeven
6a1dbfefda net: sh_eth: Fix PHY state warning splat during system resume
Since commit 744d23c71a ("net: phy: Warn about incorrect
mdio_bus_phy_resume() state"), a warning splat is printed during system
resume with Wake-on-LAN disabled:

	WARNING: CPU: 0 PID: 626 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0xbc/0xe4

As the Renesas SuperH Ethernet driver already calls phy_{stop,start}()
in its suspend/resume callbacks, it is sufficient to just mark the MAC
responsible for managing the power state of the PHY.

Fixes: fba863b816 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/c6e1331b9bef61225fa4c09db3ba3e2e7214ba2d.1663598886.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 17:05:50 -07:00
Geert Uytterhoeven
4924c0cdce net: ravb: Fix PHY state warning splat during system resume
Since commit 744d23c71a ("net: phy: Warn about incorrect
mdio_bus_phy_resume() state"), a warning splat is printed during system
resume with Wake-on-LAN disabled:

        WARNING: CPU: 0 PID: 1197 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0xbc/0xc8

As the Renesas Ethernet AVB driver already calls phy_{stop,start}() in
its suspend/resume callbacks, it is sufficient to just mark the MAC
responsible for managing the power state of the PHY.

Fixes: fba863b816 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/8ec796f47620980fdd0403e21bd8b7200b4fa1d4.1663598796.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 17:05:46 -07:00
Florian Westphal
d250889322 netfilter: nf_ct_ftp: fix deadlock when nat rewrite is needed
We can't use ct->lock, this is already used by the seqadj internals.
When using ftp helper + nat, seqadj will attempt to acquire ct->lock
again.

Revert back to a global lock for now.

Fixes: c783a29c7e ("netfilter: nf_ct_ftp: prefer skb_linearize")
Reported-by: Bruno de Paula Larini <bruno.larini@riosoft.com.br>
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-20 23:50:03 +02:00
Florian Westphal
62ce44c4ff netfilter: ebtables: fix memory leak when blob is malformed
The bug fix was incomplete, it "replaced" crash with a memory leak.
The old code had an assignment to "ret" embedded into the conditional,
restore this.

Fixes: 7997eff828 ("netfilter: ebtables: reject blobs that don't provide all entry points")
Reported-and-tested-by: syzbot+a24c5252f3e3ab733464@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-20 23:50:03 +02:00
Tetsuo Handa
9a4d6dd554 netfilter: nf_tables: fix percpu memory leak at nf_tables_addchain()
It seems to me that percpu memory for chain stats started leaking since
commit 3bc158f8d0 ("netfilter: nf_tables: map basechain priority to
hardware priority") when nft_chain_offload_priority() returned an error.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 3bc158f8d0 ("netfilter: nf_tables: map basechain priority to hardware priority")
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-20 23:50:03 +02:00
Tetsuo Handa
921ebde3c0 netfilter: nf_tables: fix nft_counters_enabled underflow at nf_tables_addchain()
syzbot is reporting underflow of nft_counters_enabled counter at
nf_tables_addchain() [1], for commit 43eb8949cf ("netfilter:
nf_tables: do not leave chain stats enabled on error") missed that
nf_tables_chain_destroy() after nft_basechain_init() in the error path of
nf_tables_addchain() decrements the counter because nft_basechain_init()
makes nft_is_base_chain() return true by setting NFT_CHAIN_BASE flag.

Increment the counter immediately after returning from
nft_basechain_init().

Link:  https://syzkaller.appspot.com/bug?extid=b5d82a651b71cd8a75ab [1]
Reported-by: syzbot <syzbot+b5d82a651b71cd8a75ab@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+b5d82a651b71cd8a75ab@syzkaller.appspotmail.com>
Fixes: 43eb8949cf ("netfilter: nf_tables: do not leave chain stats enabled on error")
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-20 23:50:03 +02:00
Pablo Neira Ayuso
76b907ee00 netfilter: conntrack: remove nf_conntrack_helper documentation
This toggle has been already remove by b118509076 ("netfilter: remove
nf_conntrack_helper sysctl and modparam toggles").

Remove the documentation entry for this toggle too.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-20 23:50:03 +02:00
Bhupesh Sharma
603ccb3aca MAINTAINERS: Add myself as a reviewer for Qualcomm ETHQOS Ethernet driver
As suggested by Vinod, adding myself as the reviewer
for the Qualcomm ETHQOS Ethernet driver.

Recently I have enabled this driver on a few Qualcomm
SoCs / boards and hence trying to keep a close eye on
it.

Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20220915112804.3950680-1-bhupesh.sharma@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 13:42:55 -07:00
Mateusz Palczewski
8ac7132704 ice: Fix interface being down after reset with link-down-on-close flag on
When performing a reset on ice driver with link-down-on-close flag on
interface would always stay down. Fix this by moving a check of this
flag to ice_stop() that is called only when user wants to bring
interface down.

Fixes: ab4ab73fc1 ("ice: Add ethtool private flag to make forcing link down optional")
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Petr Oros <poros@redhat.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-20 13:30:51 -07:00
Michal Swiatkowski
122045ca77 ice: config netdev tc before setting queues number
After lowering number of tx queues the warning appears:
"Number of in use tx queues changed invalidating tc mappings. Priority
traffic classification disabled!"
Example command to reproduce:
ethtool -L enp24s0f0 tx 36 rx 36

Fix this by setting correct tc mapping before setting real number of
queues on netdev.

Fixes: 0754d65bd4 ("ice: Add infrastructure for mqprio support via ndo_setup_tc")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-20 13:30:51 -07:00
Jakub Kicinski
da847246ab Merge branch 'fixes-for-tc-taprio-software-mode'
Vladimir Oltean says:

====================
Fixes for tc-taprio software mode

While working on some new features for tc-taprio, I found some strange
behavior which looked like bugs. I was able to eventually trigger a NULL
pointer dereference. This patch set fixes 2 issues I saw. Detailed
explanation in patches.
====================

Link: https://lore.kernel.org/r/20220915100802.2308279-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:41:18 -07:00
Vladimir Oltean
1461d212ab net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs
taprio can only operate as root qdisc, and to that end, there exists the
following check in taprio_init(), just as in mqprio:

	if (sch->parent != TC_H_ROOT)
		return -EOPNOTSUPP;

And indeed, when we try to attach taprio to an mqprio child, it fails as
expected:

$ tc qdisc add dev swp0 root handle 1: mqprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
	base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
	flags 0x0 clockid CLOCK_TAI
Error: sch_taprio: Can only be attached as root qdisc.

(extack message added by me)

But when we try to attach a taprio child to a taprio root qdisc,
surprisingly it doesn't fail:

$ tc qdisc replace dev swp0 root handle 1: taprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
	base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
	flags 0x0 clockid CLOCK_TAI
$ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
	base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
	flags 0x0 clockid CLOCK_TAI

This is because tc_modify_qdisc() behaves differently when mqprio is
root, vs when taprio is root.

In the mqprio case, it finds the parent qdisc through
p = qdisc_lookup(dev, TC_H_MAJ(clid)), and then the child qdisc through
q = qdisc_leaf(p, clid). This leaf qdisc q has handle 0, so it is
ignored according to the comment right below ("It may be default qdisc,
ignore it"). As a result, tc_modify_qdisc() goes through the
qdisc_create() code path, and this gives taprio_init() a chance to check
for sch_parent != TC_H_ROOT and error out.

Whereas in the taprio case, the returned q = qdisc_leaf(p, clid) is
different. It is not the default qdisc created for each netdev queue
(both taprio and mqprio call qdisc_create_dflt() and keep them in
a private q->qdiscs[], or priv->qdiscs[], respectively). Instead, taprio
makes qdisc_leaf() return the _root_ qdisc, aka itself.

When taprio does that, tc_modify_qdisc() goes through the qdisc_change()
code path, because the qdisc layer never finds out about the child qdisc
of the root. And through the ->change() ops, taprio has no reason to
check whether its parent is root or not, just through ->init(), which is
not called.

The problem is the taprio_leaf() implementation. Even though code wise,
it does the exact same thing as mqprio_leaf() which it is copied from,
it works with different input data. This is because mqprio does not
attach itself (the root) to each device TX queue, but one of the default
qdiscs from its private array.

In fact, since commit 13511704f8 ("net: taprio offload: enforce qdisc
to netdev queue mapping"), taprio does this too, but just for the full
offload case. So if we tried to attach a taprio child to a fully
offloaded taprio root qdisc, it would properly fail too; just not to a
software root taprio.

To fix the problem, stop looking at the Qdisc that's attached to the TX
queue, and instead, always return the default qdiscs that we've
allocated (and to which we privately enqueue and dequeue, in software
scheduling mode).

Since Qdisc_class_ops :: leaf  is only called from tc_modify_qdisc(),
the risk of unforeseen side effects introduced by this change is
minimal.

Fixes: 5a781ccbd1 ("tc: Add support for configuring the taprio scheduler")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:41:14 -07:00
Vladimir Oltean
db46e3a88a net/sched: taprio: avoid disabling offload when it was never enabled
In an incredibly strange API design decision, qdisc->destroy() gets
called even if qdisc->init() never succeeded, not exclusively since
commit 87b60cfacf ("net_sched: fix error recovery at qdisc creation"),
but apparently also earlier (in the case of qdisc_create_dflt()).

The taprio qdisc does not fully acknowledge this when it attempts full
offload, because it starts off with q->flags = TAPRIO_FLAGS_INVALID in
taprio_init(), then it replaces q->flags with TCA_TAPRIO_ATTR_FLAGS
parsed from netlink (in taprio_change(), tail called from taprio_init()).

But in taprio_destroy(), we call taprio_disable_offload(), and this
determines what to do based on FULL_OFFLOAD_IS_ENABLED(q->flags).

But looking at the implementation of FULL_OFFLOAD_IS_ENABLED()
(a bitwise check of bit 1 in q->flags), it is invalid to call this macro
on q->flags when it contains TAPRIO_FLAGS_INVALID, because that is set
to U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on
an invalid set of flags.

As a result, it is possible to crash the kernel if user space forces an
error between setting q->flags = TAPRIO_FLAGS_INVALID, and the calling
of taprio_enable_offload(). This is because drivers do not expect the
offload to be disabled when it was never enabled.

The error that we force here is to attach taprio as a non-root qdisc,
but instead as child of an mqprio root qdisc:

$ tc qdisc add dev swp0 root handle 1: \
	mqprio num_tc 8 map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc qdisc replace dev swp0 parent 1:1 \
	taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
	sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
	flags 0x0 clockid CLOCK_TAI
Unable to handle kernel paging request at virtual address fffffffffffffff8
[fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Call trace:
 taprio_dump+0x27c/0x310
 vsc9959_port_setup_tc+0x1f4/0x460
 felix_port_setup_tc+0x24/0x3c
 dsa_slave_setup_tc+0x54/0x27c
 taprio_disable_offload.isra.0+0x58/0xe0
 taprio_destroy+0x80/0x104
 qdisc_create+0x240/0x470
 tc_modify_qdisc+0x1fc/0x6b0
 rtnetlink_rcv_msg+0x12c/0x390
 netlink_rcv_skb+0x5c/0x130
 rtnetlink_rcv+0x1c/0x2c

Fix this by keeping track of the operations we made, and undo the
offload only if we actually did it.

I've added "bool offloaded" inside a 4 byte hole between "int clockid"
and "atomic64_t picos_per_byte". Now the first cache line looks like
below:

$ pahole -C taprio_sched net/sched/sch_taprio.o
struct taprio_sched {
        struct Qdisc * *           qdiscs;               /*     0     8 */
        struct Qdisc *             root;                 /*     8     8 */
        u32                        flags;                /*    16     4 */
        enum tk_offsets            tk_offset;            /*    20     4 */
        int                        clockid;              /*    24     4 */
        bool                       offloaded;            /*    28     1 */

        /* XXX 3 bytes hole, try to pack */

        atomic64_t                 picos_per_byte;       /*    32     0 */

        /* XXX 8 bytes hole, try to pack */

        spinlock_t                 current_entry_lock;   /*    40     0 */

        /* XXX 8 bytes hole, try to pack */

        struct sched_entry *       current_entry;        /*    48     8 */
        struct sched_gate_list *   oper_sched;           /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */

Fixes: 9c66d15646 ("taprio: Add support for hardware offloading")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:41:14 -07:00
Ido Schimmel
76dd072813 ipv6: Fix crash when IPv6 is administratively disabled
The global 'raw_v6_hashinfo' variable can be accessed even when IPv6 is
administratively disabled via the 'ipv6.disable=1' kernel command line
option, leading to a crash [1].

Fix by restoring the original behavior and always initializing the
variable, regardless of IPv6 support being administratively disabled or
not.

[1]
 BUG: unable to handle page fault for address: ffffffffffffffc8
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 173e18067 P4D 173e18067 PUD 173e1a067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP KASAN
 CPU: 3 PID: 271 Comm: ss Not tainted 6.0.0-rc4-custom-00136-g0727a9a5fbc1 #1396
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
 RIP: 0010:raw_diag_dump+0x310/0x7f0
 [...]
 Call Trace:
  <TASK>
  __inet_diag_dump+0x10f/0x2e0
  netlink_dump+0x575/0xfd0
  __netlink_dump_start+0x67b/0x940
  inet_diag_handler_cmd+0x273/0x2d0
  sock_diag_rcv_msg+0x317/0x440
  netlink_rcv_skb+0x15e/0x430
  sock_diag_rcv+0x2b/0x40
  netlink_unicast+0x53b/0x800
  netlink_sendmsg+0x945/0xe60
  ____sys_sendmsg+0x747/0x960
  ___sys_sendmsg+0x13a/0x1e0
  __sys_sendmsg+0x118/0x1e0
  do_syscall_64+0x34/0x80
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Fixes: 0daf07e527 ("raw: convert raw sockets to RCU")
Reported-by: Roberto Ricci <rroberto2r@gmail.com>
Tested-by: Roberto Ricci <rroberto2r@gmail.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220916084821.229287-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:27:32 -07:00
Vladimir Oltean
5641c751fe net: enetc: deny offload of tc-based TSN features on VF interfaces
TSN features on the ENETC (taprio, cbs, gate, police) are configured
through a mix of command BD ring messages and port registers:
enetc_port_rd(), enetc_port_wr().

Port registers are a region of the ENETC memory map which are only
accessible from the PCIe Physical Function. They are not accessible from
the Virtual Functions.

Moreover, attempting to access these registers crashes the kernel:

$ echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/sriov_numvfs
pci 0000:00:01.0: [1957:ef00] type 00 class 0x020001
fsl_enetc_vf 0000:00:01.0: Adding to iommu group 15
fsl_enetc_vf 0000:00:01.0: enabling device (0000 -> 0002)
fsl_enetc_vf 0000:00:01.0 eno0vf0: renamed from eth0
$ tc qdisc replace dev eno0vf0 root taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
	sched-entry S 0x7f 900000 sched-entry S 0x80 100000 flags 0x2
Unable to handle kernel paging request at virtual address ffff800009551a08
Internal error: Oops: 96000007 [#1] PREEMPT SMP
pc : enetc_setup_tc_taprio+0x170/0x47c
lr : enetc_setup_tc_taprio+0x16c/0x47c
Call trace:
 enetc_setup_tc_taprio+0x170/0x47c
 enetc_setup_tc+0x38/0x2dc
 taprio_change+0x43c/0x970
 taprio_init+0x188/0x1e0
 qdisc_create+0x114/0x470
 tc_modify_qdisc+0x1fc/0x6c0
 rtnetlink_rcv_msg+0x12c/0x390

Split enetc_setup_tc() into separate functions for the PF and for the
VF drivers. Also remove enetc_qos.o from being included into
enetc-vf.ko, since it serves absolutely no purpose there.

Fixes: 34c6adf197 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220916133209.3351399-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:27:10 -07:00
Vladimir Oltean
fed38e64d9 net: enetc: move enetc_set_psfp() out of the common enetc_set_features()
The VF netdev driver shouldn't respond to changes in the NETIF_F_HW_TC
flag; only PFs should. Moreover, TSN-specific code should go to
enetc_qos.c, which should not be included in the VF driver.

Fixes: 79e499829f ("net: enetc: add hw tc hw offload features for PSPF capability")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220916133209.3351399-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:27:09 -07:00
Jakub Kicinski
0507246d9e Merge branch 'wireguard-patches-for-6-0-rc6'
Jason A. Donenfeld says:

====================
wireguard patches for 6.0-rc6

1) The ratelimiter timing test doesn't help outside of development, yet
   it is currently preventing the module from being inserted on some
   kernels when it flakes at insertion time. So we disable it.

2) A fix for a build error on UML, caused by a recent change in a
   different tree.

3) A WARN_ON() is triggered by Kees' new fortified memcpy() patch, due
   to memcpy()ing over a sockaddr pointer with the size of a
   sockaddr_in[6]. The type safe fix is pretty simple. Given how classic
   of a thing sockaddr punning is, I suspect this may be the first in a
   few patches like this throughout the net tree, once Kees' fortify
   series is more widely deployed (current it's just in next).
====================

Link: https://lore.kernel.org/r/20220916143740.831881-1-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:26:19 -07:00
Jason A. Donenfeld
26c013108c wireguard: netlink: avoid variable-sized memcpy on sockaddr
Doing a variable-sized memcpy is slower, and the compiler isn't smart
enough to turn this into a constant-size assignment.

Further, Kees' latest fortified memcpy will actually bark, because the
destination pointer is type sockaddr, not explicitly sockaddr_in or
sockaddr_in6, so it thinks there's an overflow:

    memcpy: detected field-spanning write (size 28) of single field
    "&endpoint.addr" at drivers/net/wireguard/netlink.c:446 (size 16)

Fix this by just assigning by using explicit casts for each checked
case.

Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reported-by: syzbot+a448cda4dba2dac50de5@syzkaller.appspotmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:26:14 -07:00
Jason A. Donenfeld
8e25c02b8c wireguard: selftests: do not install headers on UML
Since 1b620d539c ("kbuild: disable header exports for UML in a
straightforward way"), installing headers fails on UML, so just disable
installing them, since they're not needed anyway on the architecture.

Fixes: b438b3b8d6 ("wireguard: selftests: support UML")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:26:14 -07:00
Jason A. Donenfeld
684dec3cf4 wireguard: ratelimiter: disable timings test by default
A previous commit tried to make the ratelimiter timings test more
reliable but in the process made it less reliable on other
configurations. This is an impossible problem to solve without
increasingly ridiculous heuristics. And it's not even a problem that
actually needs to be solved in any comprehensive way, since this is only
ever used during development. So just cordon this off with a DEBUG_
ifdef, just like we do for the trie's randomized tests, so it can be
enabled while hacking on the code, and otherwise disabled in CI. In the
process we also revert 151c8e499f.

Fixes: 151c8e499f ("wireguard: ratelimiter: use hrtimer in selftest")
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:26:13 -07:00
Íñigo Huguet
589c6eded1 sfc/siena: fix null pointer dereference in efx_hard_start_xmit
Like in previous patch for sfc, prevent potential (but unlikely) NULL
pointer dereference.

Fixes: 12804793b1 ("sfc: decouple TXQ type from label")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Link: https://lore.kernel.org/r/20220915141958.16458-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:21:28 -07:00
Íñigo Huguet
974bb793ad sfc/siena: fix TX channel offset when using legacy interrupts
As in previous commit for sfc, fix TX channels offset when
efx_siena_separate_tx_channels is false (the default)

Fixes: 25bde571b4 ("sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Link: https://lore.kernel.org/r/20220915141653.15504-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:21:13 -07:00
Linus Torvalds
60891ec99e for-6.0-rc6-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmMpskIACgkQxWXV+ddt
 WDtxGA//Z4Z9e0p9CTwBGla9eqflpfPQLya93ANEBqhV/S1wxgvQtj+Q2XpGIqhj
 AVR4ZqEmnFPmAOay5s/mGQ+wZ3dyR+n/XLZ8XsViXY5yBLnRpZJi8p5ozqYuSm59
 1A4FF0ZciD73jql8hPodsd1VFkKqtOTmPFyCxHk2lt/Z36FFYKCUm4P8ALdMxlct
 6uEp67PI9Pb6PANq4mj8lpNTnsD2wTKDHqQ3WkHBwuHkEOCVkPbRsBlUkUqpYi0h
 Lc0XhjcnPX0alfiLFwwNdPZ8vrLE4egktzWA6PqEg1YzBPQQNnuQTHmO25KOqrm1
 bW20PGOIF7WFg85w1P20G4I8UdT2CWBEloPSjYTDlD2KTdqBOp95oo7MUQlrDFNm
 lxns3npylswlvia8nH39iOlwUPL75cDe4U8LkOV+rSHmTmt7B6XK/MfI6sYgmveH
 V4DUI7BnbfEALbJMsJesHAR/3tnsAPqnLtv+lEF9hM70YXdN2o5iN/D0G/vms3Sr
 RGVpEFJyJPnzvAg6y3PNTdMEpDtouQHQhHBtPKnfOzRJsgtzk5CTpEBkWPSRLiqm
 DQj25JdcT8j8Xa8nWppEvogC0hfctqs1ROuZux7KajkxUHEDfXs2l0RR1dEpMvs7
 v+Bhw3zLPS0e/b+9HqBSwCo0JAkIWzm6TE00LlKCYsnzNwLZT9k=
 =4Hu8
 -----END PGP SIGNATURE-----

Merge tag 'for-6.0-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - two fixes for hangs in the umount sequence where threads depend on
   each other and the work must be finished in the right order

 - in zoned mode, wait for flushing all block group metadata IO before
   finishing the zone

* tag 'for-6.0-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: zoned: wait for extent buffer IOs before finishing a zone
  btrfs: fix hang during unmount when stopping a space reclaim worker
  btrfs: fix hang during unmount when stopping block group reclaim worker
2022-09-20 10:23:24 -07:00
Linus Torvalds
84a3193883 fs.fixes.v6.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYymPoQAKCRCRxhvAZXjc
 ounZAQDGLmHjqby6KFLbNIHkgIMzODUk3OCLo3jNRsSw+SsJFQD/cW1eBM5P+ctO
 bePiCHMZv4Gh+G1dR2cchd3Etwks4A0=
 =7kI/
 -----END PGP SIGNATURE-----

Merge tag 'fs.fixes.v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping

Pull vfs fix from Christian Brauner:
 "Beginning of the merge window we introduced the vfs{g,u}id_t types in
  b27c82e129 ("attr: port attribute changes to new types") and changed
  various codepaths over including chown_common().

  When userspace passes -1 for an ownership change the ownership fields
  in struct iattr stay uninitialized. Usually this is fine because any
  code making use of any fields in struct iattr must check the
  ->ia_valid field whether the value of interest has been initialized.
  That's true for all struct iattr passing code.

  However, over the course of the last year with more heavy use of KMSAN
  we found quite a few places that got this wrong. A recent one I fixed
  was 3cb6ee9914 ("9p: only copy valid iattrs in 9P2000.L setattr
  implementation").

  But we also have LSM hooks. Actually we have two. The first one is
  security_inode_setattr() in notify_change() which does the right thing
  and passes the full struct iattr down to LSMs and thus LSMs can check
  whether it is initialized.

  But then we also have security_path_chown() which passes down a path
  argument and the target ownership as the filesystem would see it. For
  the latter we now generate the target values based on struct iattr and
  pass it down. However, when userspace passes -1 then struct iattr
  isn't initialized.

  This patch simply initializes ->ia_vfs{g,u}id with INVALID_VFS{G,U}ID
  so the hook continue to see invalid ownership when -1 is passed from
  userspace. The only LSM that cares about the actual values is Tomoyo.

  The vfs codepaths don't look at these fields without ->ia_valid being
  set so there's no harm in initializing ->ia_vfs{g,u}id. Arguably this
  is also safer since we can't end up copying valid ownership values
  when invalid ownership values should be passed.

  This only affects mainline. No kernel has been released with this and
  thus no backport is needed. The commit is thus marked with a Fixes:
  tag but annotated with "# mainline only" (I didn't quite remember what
  Greg said about how to tell stable autoselect to not bother with fixes
  for mainline only)"

* tag 'fs.fixes.v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  open: always initialize ownership fields
2022-09-20 10:08:37 -07:00
Guilherme G. Piccoli
7da5b13dcc efi: efibc: Guard against allocation failure
There is a single kmalloc in this driver, and it's not currently
guarded against allocation failure. Do it here by just bailing-out
the reboot handler, in case this tentative allocation fails.

Fixes: 416581e486 ("efi: efibc: avoid efivar API for setting variables")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-20 18:42:55 +02:00
Linus Torvalds
f489921dba execve reverts for v6.0-rc7
- Remove the recent "unshare time namespace on vfork+exec" feature (Andrei Vagin)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmMoxpIWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJpd/D/9V7iLUZoquMvXFonv//sRH21P+
 u7vH03q0X4lSov73jdjizq8znZl9RVO14IYi+6lQE8VHyOjzjBoTALRPnirNCyGa
 Ia8P+LPaOHDTDQmGqt+9xmPKp3z0qwrpWWyTrFHLo7GRzWtI0QjQsSlgUTIz7jCw
 dSwLRWN6n7d3hzNzFWt9VUOOlzpip8NTcnAbC9YA5dPFLO85+wZ4ZpMYYfFJMcQj
 N/Zm63lrqAU0wy7EhonkKJQDjgRP/zYUs6VJMejHqYl951SrZJ+DgXEGaAwR14Sz
 IZAUhSM5Fl8alhkrcmlkiy9A5P014iVRR6AaSyeT2616fac97wY1EWHxvBMqzNsB
 AJJqjPHoN+mc8cqt9lMyIhbmS8WkTuyTHziEcFyyTVsNYGYN6x9hVVZalqPrl8o3
 Y3zC6MfRK33JNVB2GZVUzsf5EZC3mjz9VJKKmLwYmG4X7/JOvIVCiW123b060T7z
 b49PzI+0rTG8SHTk1I/T8NpWuvLRTCglzZK06q971uyT80xPoGD/HmSpmm+86dHs
 k3WV2qBoz31Eaoewa3NJqn6pBxQLy9WAZP6rJb3aQSFwDRCuvKO4CUpHAXILt5U+
 SoarR5445zVzY3NYHaf/3BRsEnCQS06U67ma0lAmMWk4J3ehFOY0DrRqtLJ02iwd
 sKJD/KnKC+IEcLjrAA==
 =yFGx
 -----END PGP SIGNATURE-----

Merge tag 'execve-v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull execve reverts from Kees Cook:
 "The recent work to support time namespace unsharing turns out to have
  some undesirable corner cases, so rather than allowing the API to stay
  exposed for another release, it'd be best to remove it ASAP, with the
  replacement getting another cycle of testing. Nothing is known to use
  this yet, so no userspace breakage is expected.

  For more details, see:

    https://lore.kernel.org/lkml/ed418e43ad28b8688cfea2b7c90fce1c@ispras.ru

  Summary:

   - Remove the recent 'unshare time namespace on vfork+exec' feature
     (Andrei Vagin)"

* tag 'execve-v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  Revert "fs/exec: allow to unshare a time namespace on vfork+exec"
  Revert "selftests/timens: add a test for vfork+exit"
2022-09-20 08:38:55 -07:00
Tetsuo Handa
d547c1b717 net: clear msg_get_inq in __get_compat_msghdr()
syzbot is still complaining uninit-value in tcp_recvmsg(), for
commit 1228b34c8d ("net: clear msg_get_inq in __sys_recvfrom() and
__copy_msghdr_from_user()") missed that __get_compat_msghdr() is called
instead of copy_msghdr_from_user() when MSG_CMSG_COMPAT is specified.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 1228b34c8d ("net: clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()")
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/d06d0f7f-696c-83b4-b2d5-70b5f2730a37@I-love.SAKURA.ne.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 08:23:20 -07:00
Jakub Kicinski
68fe503c2b Merge branch 'ipmr-always-call-ip-6-_mr_forward-from-rcu-read-side-critical-section'
Ido Schimmel says:

====================
ipmr: Always call ip{,6}_mr_forward() from RCU read-side critical section

Patch #1 fixes a bug in ipmr code.

Patch #2 adds corresponding test cases.
====================

Link: https://lore.kernel.org/r/20220914075339.4074096-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 08:22:20 -07:00
Ido Schimmel
2b5a8c8f59 selftests: forwarding: Add test cases for unresolved multicast routes
Add IPv4 and IPv6 test cases for unresolved multicast routes, testing
that queued packets are forwarded after installing a matching (S, G)
route.

The test cases can be used to reproduce the bugs fixed in "ipmr: Always
call ip{,6}_mr_forward() from RCU read-side critical section".

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 08:22:15 -07:00
Ido Schimmel
b07a9b26e2 ipmr: Always call ip{,6}_mr_forward() from RCU read-side critical section
These functions expect to be called from RCU read-side critical section,
but this only happens when invoked from the data path via
ip{,6}_mr_input(). They can also be invoked from process context in
response to user space adding a multicast route which resolves a cache
entry with queued packets [1][2].

Fix by adding missing rcu_read_lock() / rcu_read_unlock() in these call
paths.

[1]
WARNING: suspicious RCU usage
6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387 Not tainted
-----------------------------
net/ipv4/ipmr.c:84 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by smcrouted/246:
 #0: ffffffff862389b0 (rtnl_mutex){+.+.}-{3:3}, at: ip_mroute_setsockopt+0x11c/0x1420

stack backtrace:
CPU: 0 PID: 246 Comm: smcrouted Not tainted 6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x91/0xb9
 vif_dev_read+0xbf/0xd0
 ipmr_queue_xmit+0x135/0x1ab0
 ip_mr_forward+0xe7b/0x13d0
 ipmr_mfc_add+0x1a06/0x2ad0
 ip_mroute_setsockopt+0x5c1/0x1420
 do_ip_setsockopt+0x23d/0x37f0
 ip_setsockopt+0x56/0x80
 raw_setsockopt+0x219/0x290
 __sys_setsockopt+0x236/0x4d0
 __x64_sys_setsockopt+0xbe/0x160
 do_syscall_64+0x34/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

[2]
WARNING: suspicious RCU usage
6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387 Not tainted
-----------------------------
net/ipv6/ip6mr.c:69 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by smcrouted/246:
 #0: ffffffff862389b0 (rtnl_mutex){+.+.}-{3:3}, at: ip6_mroute_setsockopt+0x6b9/0x2630

stack backtrace:
CPU: 1 PID: 246 Comm: smcrouted Not tainted 6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x91/0xb9
 vif_dev_read+0xbf/0xd0
 ip6mr_forward2.isra.0+0xc9/0x1160
 ip6_mr_forward+0xef0/0x13f0
 ip6mr_mfc_add+0x1ff2/0x31f0
 ip6_mroute_setsockopt+0x1825/0x2630
 do_ipv6_setsockopt+0x462/0x4440
 ipv6_setsockopt+0x105/0x140
 rawv6_setsockopt+0xd8/0x690
 __sys_setsockopt+0x236/0x4d0
 __x64_sys_setsockopt+0xbe/0x160
 do_syscall_64+0x34/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: ebc3197963 ("ipmr: add rcu protection over (struct vif_device)->dev")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 08:22:15 -07:00
Alex Elder
cf412ec333 net: ipa: properly limit modem routing table use
IPA can route packets between IPA-connected entities.  The AP and
modem are currently the only such entities supported, and no routing
is required to transfer packets between them.

The number of entries in each routing table is fixed, and defined at
initialization time.  Some of these entries are designated for use
by the modem, and the rest are available for the AP to use.  The AP
sends a QMI message to the modem which describes (among other
things) information about routing table memory available for the
modem to use.

Currently the QMI initialization packet gives wrong information in
its description of routing tables.  What *should* be supplied is the
maximum index that the modem can use for the routing table memory
located at a given location.  The current code instead supplies the
total *number* of routing table entries.  Furthermore, the modem is
granted the entire table, not just the subset it's supposed to use.

This patch fixes this.  First, the ipa_mem_bounds structure is
generalized so its "end" field can be interpreted either as a final
byte offset, or a final array index.  Second, the IPv4 and IPv6
(non-hashed and hashed) table information fields in the QMI
ipa_init_modem_driver_req structure are changed to be ipa_mem_bounds
rather than ipa_mem_array structures.  Third, we set the "end" value
for each routing table to be the last index, rather than setting the
"count" to be the number of indices.  Finally, instead of allowing
the modem to use all of a routing table's memory, it is limited to
just the portion meant to be used by the modem.  In all versions of
IPA currently supported, that is IPA_ROUTE_MODEM_COUNT (8) entries.

Update a few comments for clarity.

Fixes: 530f9216a9 ("soc: qcom: ipa: AP/modem communications")
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20220913204602.1803004-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 08:11:13 -07:00
Liang He
1c48709e6d of: mdio: Add of_node_put() when breaking out of for_each_xx
In of_mdiobus_register(), we should call of_node_put() for 'child'
escaped out of for_each_available_child_of_node().

Fixes: 66bdede495 ("of_mdio: Fix broken PHY IRQ in case of probe deferral")
Co-developed-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Liang He <windhl@126.com>
Link: https://lore.kernel.org/r/20220913125659.3331969-1-windhl@126.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 07:32:09 -07:00