1044823 Commits

Author SHA1 Message Date
Eric W. Biederman
5ebcbe342b ucounts: Move get_ucounts from cred_alloc_blank to key_change_session_keyring
Setting cred->ucounts in cred_alloc_blank does not make sense.  The
uid and user_ns are deliberately not set in cred_alloc_blank but
instead the setting is delayed until key_change_session_keyring.

So move dealing with ucounts into key_change_session_keyring as well.

Unfortunately that movement of get_ucounts adds a new failure mode to
key_change_session_keyring.  I do not see anything stopping the parent
process from calling setuid and changing the relevant part of it's
cred while keyctl_session_to_parent is running making it fundamentally
necessary to call get_ucounts in key_change_session_keyring.  Which
means that the new failure mode cannot be avoided.

A failure of key_change_session_keyring results in a single threaded
parent keeping it's existing credentials.  Which results in the parent
process not being able to access the session keyring and whichever
keys are in the new keyring.

Further get_ucounts is only expected to fail if the number of bits in
the refernece count for the structure is too few.

Since the code has no other way to report the failure of get_ucounts
and because such failures are not expected to be common add a WARN_ONCE
to report this problem to userspace.

Between the WARN_ONCE and the parent process not having access to
the keys in the new session keyring I expect any failure of get_ucounts
will be noticed and reported and we can find another way to handle this
condition.  (Possibly by just making ucounts->count an atomic_long_t).

Cc: stable@vger.kernel.org
Fixes: 905ae01c4ae2 ("Add a reference to ucounts for each cred")
Link: https://lkml.kernel.org/r/7k0ias0uf.fsf_-_@disp2133
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-20 10:34:20 -05:00
Arnd Bergmann
36b6dcbc12 Reset controller fixes for v5.15
Fix the status bit polarity in the brcmstb-rescal driver, re-enable
 pistachio driver selection after MACH_PISTACHIO removal, add transfer
 error handling in the tegra-bpmp driver, and fix probing of the
 reset-socfpga driver after recent device link changes.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQRRO6F6WdpH1R0vGibVhaclGDdiwAUCYW7f4RcccC56YWJlbEBw
 ZW5ndXRyb25peC5kZQAKCRDVhaclGDdiwHbBAQC60RYdccw5jg82QaySxbFkkBT9
 3ULxXp/Tr/s36VV6NQEA6kdiyfvHJi8YrkJS3rBd5sWvCjaqLCb0xl7wjCibhg0=
 =kZov
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmFwKs0ACgkQmmx57+YA
 GNkdYg//YJsCoovbeGeL7E4V7eKTHACUZWWzenQqsH2A3RA9h79EbUWfcEV2GVq7
 DbEdEVxC36DBM+MgjQMQIOTmUfv+YS3wXycwpC1INuXPB6uKsc1v/jJzQ2h2eGR8
 MxiH68Uw6wopS6iFCToebXCLU+AXQW1Vd9WbYfz3su+1uQEY/SROzeaf9saPElKR
 TJUI4sSy0cU6VWJ0r44g2jGNpBdFnV9q7Fi3UW58FXd43/ms1PelND5ECijiMDPo
 ytbnxBEmBSFkI0oSsmRNty9zU5qd+4PDXXgtoLJzakYvIDMF47SuKqrKJyR4fceO
 j3u7sRPVFeE56f3Gz4qIo1w2vxIPilQwa6Zj0NEcAxOwKQLHFMchU3yhlFd1cs7C
 vm29WawT0A4xRx3y62nfz4YjO2N584c5Th4fbZXCbZpJaKygOBth+5nKd8IQXVSg
 Dzn+KRigrw3qbCiiTmHwowwgXTOxR/DzV3yYY/bCA8K2ddv0r3MgNOp8eviEPidh
 9NlYriSPUbRNrZCcMd0YRxDygCVy9vkndirX3MrBmvxqJVH6h8Q7htPKlcCoYcIE
 aRJE/m4rNOazGkH6ePWRQ46zd1qlUYCPefIWfsFFAskMUOxXRiW1sd+NKwMkqJBN
 LCyvOLB/yd5OO9EHoyOin+sSQNL8R+ddSiKIpN7MLAieAzk5i7M=
 =GNV4
 -----END PGP SIGNATURE-----

Merge tag 'reset-fixes-for-v5.15' of git://git.pengutronix.de/pza/linux into arm/fixes

Reset controller fixes for v5.15

Fix the status bit polarity in the brcmstb-rescal driver, re-enable
pistachio driver selection after MACH_PISTACHIO removal, add transfer
error handling in the tegra-bpmp driver, and fix probing of the
reset-socfpga driver after recent device link changes.

* tag 'reset-fixes-for-v5.15' of git://git.pengutronix.de/pza/linux:
  reset: socfpga: add empty driver allowing consumers to probe
  reset: tegra-bpmp: Handle errors in BPMP response
  reset: pistachio: Re-enable driver selection
  reset: brcmstb-rescal: fix incorrect polarity of status bit

Link: https://lore.kernel.org/r/b378f2aae54538db6a13c98561b4cbcacbef937c.camel@pengutronix.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-20 16:42:21 +02:00
Arnd Bergmann
e23c7487f5 Two patches to fix the GMAC PHY mode on some boards.
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYW0pwAAKCRDj7w1vZxhR
 xcbPAQDtUoZcjLPx7X5Pn28UiqRaNr7bVR/A7lzXvyCRUGrqpgD/UyXP/KlWSXAY
 YXRzrksiJvPh7cLDPqLQCutLFMQKJAY=
 =VLd/
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmFwHusACgkQmmx57+YA
 GNm2exAAmvpGmTiUXd359nciMXudXbgVmYa9EqnTcYPKftNqCZmC4tyrogVcyJXI
 La06ucXq7+YAFZwBA6Dz/dNFvLSXyhl7Q18ouDlPPiSk+8VL1yGIaa97xvEiJFqQ
 LQtoxkzbipRLef9zQKWcYSSvAHBpFqeFRnG8b3CEFWz64pQO2AJiqcE4Q8O61kSh
 A6PTSlFq5y9qOX2mvLPukzUGc+ZlvaJ80hgEimonddkPSEOAwifuJW6v5jndVOiU
 xql4cqczwk4DlZsCREPdX8NbvCEYAzh+Iqhovyk53AQaGk3OQIHILVv/Ix2G1bpT
 crbBnm92IMMRVbnRmkfOUFq6bawJFiUTPVh9jdS4nBiIm2VA+f/ES0pR74aint/i
 gpKzx+71G0g2Km8OaF0ZN+sZaitMzKTrF6y5b1VbOdNqFIPXGcp/LUjGJn6Gj5Py
 az5zg4lqdjFdIbLj8Ek2CaAvRDIWQzwH/Ut/Y7btSZvf2MxV+FAyqF+SC7BGFUjb
 DNE1lOoH8OT7+Dw52mnSPa7dbMTVuYpwq1Bm5oOre5rCmOzJJlQfTKEAhR/PgR44
 e/HRprmYXmlk3bGf9RP1puhKWHEbSrBtBThZhmjmY7sCX4A5feGMWnWecoUUfyqR
 sc3TlqR+UhJJwNTT6u2iK1Qier4YpiaiN8XYRwzC6Ca+i7camgs=
 =Eb+K
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-fixes-for-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes

Two patches to fix the GMAC PHY mode on some boards.

* tag 'sunxi-fixes-for-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  ARM: dts: sun7i: A20-olinuxino-lime2: Fix ethernet phy-mode
  arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node

Link: https://lore.kernel.org/r/d4c41c71-f1ff-4464-a26f-1bfd4b52fd78.lettre@localhost
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-20 15:51:38 +02:00
Arnd Bergmann
72cd4e3bde i.MX fixes for 5.15, round 4:
- A series from Frieder Schrempf to fix i.MX8MM Kontron N801x board on
   various aspects, voltage settings, GPIO polarity, SPI clock frequency
   and Ethernet PHY type.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCgAyFiEEFmJXigPl4LoGSz08UFdYWoewfM4FAmFsuuoUHHNoYXduZ3Vv
 QGtlcm5lbC5vcmcACgkQUFdYWoewfM525QgAlFAdFWkUddaPS4QNc5ZsCgVkl8ER
 lwWWnFpTFMQuiif8LtQeWTpVOM+O9czkYk0MlOfuDowNa5J3/5lRFlScwNFPWnnt
 uccY9UI2LG1olLSMJTgK6XfHVf/6C13l6Hg1zS+ukBwozTQCrbdMIVG89iC5Gxsp
 NgN9jvmw6Zs8dqYYiDe9thcPplcdreLwySnVmVg48LWQKGEGM1pTA0HMM7L6zGHz
 f1irBfX7U5HE6nJLKtD/4+C+1nPdWxm7ESM1lWiPCnDYOnvSKjY2iuCEugnYfAFC
 yzTTrlckcTpy/oohT0Ni6J/ZLqKEPdfuVg87tSmNd+tU9MYovzLxvweSLA==
 =YjPJ
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmFwHnIACgkQmmx57+YA
 GNl4YQ/+Or42vIZvBaECWVAc8cd0RNAOt8isbUYowcXdG9TevFMHK2EfeYKe/HCg
 BDTtIpapy01nYoHxYvEVPEuU52GVNZMqfdcC9hvmQXwJ+0vAlyEpQ5CNvoouRlbu
 3YyckHIOZ58AlJGiF9hV6f4V20bQaIWdE8wEKN0nZNewcq3Y0vjODDP6KoQcflVW
 lHBlt2N5LKf+B0+F7sHO/bEzZUFy2z3Vs0tpNo+zE6DHkhUBVVGZvx232nkhbcPY
 UANLHq/a+3dEQKvm9N2ppK4c5yMYLWrjo4NnEv4TyPW+e3lTEyyZK8ZMY/CPM7dx
 sLgSNILjUvml5fvCAVVYpi6bZlvPerFbfDs2l262/iTL4vhTZghDsquQgMybCzm4
 bwW41nGGOrj+2dJXXvHlN02Y7Se90O2wGRhLnWAfMTNbAZrkjfMhVYNtxSgB1X1W
 o9IcavZQ1MELdJdxU7x8toxJXCAMMhugssWPAfT/vCdfDCPeEGasFqwGijmjhdNT
 /gK37CRDWynXpQk1aYjG48lYTVkM6M3imgnzZG39S9emCUlb8P+RI2dtjLgOm+Xy
 qwPZAr05UnqobinApYWbq5ZJHc4O4Iuux0VRfF/4C6GrLyL43eMJQkhdKo9KGkrA
 Q2+5mIvbLCiZY9KUd9XokOD9AeMSNvsuDzPVa/j/LeSOVeYiBSc=
 =/+Bt
 -----END PGP SIGNATURE-----

Merge tag 'imx-fixes-5.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes

i.MX fixes for 5.15, round 4:

- A series from Frieder Schrempf to fix i.MX8MM Kontron N801x board on
  various aspects, voltage settings, GPIO polarity, SPI clock frequency
  and Ethernet PHY type.

* tag 'imx-fixes-5.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  arm64: dts: imx8mm-kontron: Fix connection type for VSC8531 RGMII PHY
  arm64: dts: imx8mm-kontron: Fix CAN SPI clock frequency
  arm64: dts: imx8mm-kontron: Fix polarity of reg_rst_eth2
  arm64: dts: imx8mm-kontron: Set lower limit of VDD_SNVS to 800 mV
  arm64: dts: imx8mm-kontron: Make sure SOC and DRAM supply voltages are correct

Link: https://lore.kernel.org/r/20211018000958.GC25810@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-20 15:49:38 +02:00
Yang Yingliang
4225fea1cb ptp: Fix possible memory leak in ptp_clock_register()
I got memory leak as follows when doing fault injection test:

unreferenced object 0xffff88800906c618 (size 8):
  comm "i2c-idt82p33931", pid 4421, jiffies 4294948083 (age 13.188s)
  hex dump (first 8 bytes):
    70 74 70 30 00 00 00 00                          ptp0....
  backtrace:
    [<00000000312ed458>] __kmalloc_track_caller+0x19f/0x3a0
    [<0000000079f6e2ff>] kvasprintf+0xb5/0x150
    [<0000000026aae54f>] kvasprintf_const+0x60/0x190
    [<00000000f323a5f7>] kobject_set_name_vargs+0x56/0x150
    [<000000004e35abdd>] dev_set_name+0xc0/0x100
    [<00000000f20cfe25>] ptp_clock_register+0x9f4/0xd30 [ptp]
    [<000000008bb9f0de>] idt82p33_probe.cold+0x8b6/0x1561 [ptp_idt82p33]

When posix_clock_register() returns an error, the name allocated
in dev_set_name() will be leaked, the put_device() should be used
to give up the device reference, then the name will be freed in
kobject_cleanup() and other memory will be freed in ptp_clock_release().

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: a33121e5487b ("ptp: fix the race between the release of ptp_clock and cdev")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 14:44:33 +01:00
Kurt Kanzenbach
3cb958027c net: stmmac: Fix E2E delay mechanism
When utilizing End to End delay mechanism, the following error messages show up:

|root@ehl1:~# ptp4l --tx_timestamp_timeout=50 -H -i eno2 -E -m
|ptp4l[950.573]: selected /dev/ptp3 as PTP clock
|ptp4l[950.586]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
|ptp4l[950.586]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
|ptp4l[952.879]: port 1: new foreign master 001395.fffe.4897b4-1
|ptp4l[956.879]: selected best master clock 001395.fffe.4897b4
|ptp4l[956.879]: port 1: assuming the grand master role
|ptp4l[956.879]: port 1: LISTENING to GRAND_MASTER on RS_GRAND_MASTER
|ptp4l[962.017]: port 1: received DELAY_REQ without timestamp
|ptp4l[962.273]: port 1: received DELAY_REQ without timestamp
|ptp4l[963.090]: port 1: received DELAY_REQ without timestamp

Commit f2fb6b6275eb ("net: stmmac: enable timestamp snapshot for required PTP
packets in dwmac v5.10a") already addresses this problem for the dwmac
v5.10. However, same holds true for all dwmacs above version v4.10. Correct the
check accordingly. Afterwards everything works as expected.

Tested on Intel Atom(R) x6414RE Processor.

Fixes: 14f347334bf2 ("net: stmmac: Correctly take timestamp for PTPv2")
Fixes: f2fb6b6275eb ("net: stmmac: enable timestamp snapshot for required PTP packets in dwmac v5.10a")
Suggested-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 14:43:53 +01:00
Uwe Kleine-König
641e3fd1a0 nfc: st95hf: Make spi remove() callback return zero
If something goes wrong in the remove callback, returning an error code
just results in an error message. The device still disappears.

So don't skip disabling the regulator in st95hf_remove() if resetting
the controller via spi fails. Also don't return an error code which just
results in two error messages.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 14:41:20 +01:00
David S. Miller
323e9a957d Merge branch 'hns3-fixes'
Guangbin Huang says:

====================
net: hns3: add some fixes for -net

This series adds some fixes for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Peng Li
0dd8a25f35 net: hns3: disable sriov before unload hclge layer
HNS3 driver includes hns3.ko, hnae3.ko and hclge.ko.
hns3.ko includes network stack and pci_driver, hclge.ko includes
HW device action, algo_ops and timer task, hnae3.ko includes some
register function.

When SRIOV is enable and hclge.ko is removed, HW device is unloaded
but VF still exists, PF will not reply VF mbx messages, and cause
errors.

This patch fix it by disable SRIOV before remove hclge.ko.

Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Yufeng Mo
1385cc81ba net: hns3: fix vf reset workqueue cannot exit
The task of VF reset is performed through the workqueue. It checks the
value of hdev->reset_pending to determine whether to exit the loop.
However, the value of hdev->reset_pending may also be assigned by
the interrupt function hclgevf_misc_irq_handle(), which may cause the
loop fail to exit and keep occupying the workqueue. This loop is not
necessary, so remove it and the workqueue will be rescheduled if the
reset needs to be retried or a new reset occurs.

Fixes: 1cc9bc6e5867 ("net: hns3: split hclgevf_reset() into preparing and rebuilding part")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Yunsheng Lin
68752b24f5 net: hns3: schedule the polling again when allocation fails
Currently when there is a rx page allocation failure, it is
possible that polling may be stopped if there is no more packet
to be reveiced, which may cause queue stall problem under memory
pressure.

This patch makes sure polling is scheduled again when there is
any rx page allocation failure, and polling will try to allocate
receive buffers until it succeeds.

Now the allocation retry is added, it is unnecessary to do the rx
page allocation at the end of rx cleaning, so remove it. And reset
the unused_count to zero after calling hns3_nic_alloc_rx_buffers()
to avoid calling hns3_nic_alloc_rx_buffers() repeatedly under
memory pressure.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Yunsheng Lin
9f9f0f1999 net: hns3: fix for miscalculation of rx unused desc
rx unused desc is the desc that need attatching new buffer
before refilling to hw to receive new packet, the number of
desc need attatching new buffer is calculated using next_to_use
and next_to_clean. when next_to_use == next_to_clean, currently
hns3 driver assumes that all the desc has the buffer attatched,
but 'next_to_use == next_to_clean' also means all the desc need
attatching new buffer if hw has comsumed all the desc and the
driver has not attatched any buffer to the desc yet.

This patch adds 'refill' in desc_cb to indicate whether a new
buffer has been refilled to a desc.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Yunsheng Lin
adfb7b4966 net: hns3: fix the max tx size according to user manual
Currently the max tx size supported by the hw is calculated by
using the max BD num supported by the hw. According to the hw
user manual, the max tx size is fixed value for both non-TSO and
TSO skb.

This patch updates the max tx size according to the manual.

Fixes: 8ae10cfb5089("net: hns3: support tx-scatter-gather-fraglist feature")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Guangbin Huang
731797fdff net: hns3: add limit ets dwrr bandwidth cannot be 0
If ets dwrr bandwidth of tc is set to 0, the hardware will switch to SP
mode. In this case, this tc may occupy all the tx bandwidth if it has
huge traffic, so it violates the purpose of the user setting.

To fix this problem, limit the ets dwrr bandwidth must greater than 0.

Fixes: cacde272dd00 ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Guangbin Huang
b63fcaab95 net: hns3: reset DWRR of unused tc to zero
Currently, DWRR of tc will be initialized to a fixed value when this tc
is enabled, but it is not been reset to 0 when this tc is disabled. It
cause a problem that the DWRR of unused tc is not 0 after using tc tool
to add and delete multi-tc parameters.

For examples, after enabling 4 TCs and restoring to 1 TC by follow
tc commands:

$ tc qdisc add dev eth0 root mqprio num_tc 4 map 0 1 2 3 0 1 2 3 queues \
  8@0 8@8 8@16 8@24 hw 1 mode channel
$ tc qdisc del dev eth0 root

Now there is just one TC is enabled for eth0, but the tc info querying by
debugfs is shown as follow:

$ cat /mnt/hns3/0000:7d:00.0/tm/tc_sch_info
enabled tc number: 1
weight_offset: 14
TC    MODE  WEIGHT
0     dwrr    100
1     dwrr    100
2     dwrr    100
3     dwrr    100
4     dwrr      0
5     dwrr      0
6     dwrr      0
7     dwrr      0

This patch fixes it by resetting DWRR of tc to 0 when tc is disabled.

Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Jiaran Zhang
60484103d5 net: hns3: Add configuration of TM QCN error event
Add configuration of interrupt type and fifo interrupt enable of TM QCN
error event if enabled, otherwise this event will not be reported when
there is error.

Fixes: d914971df022 ("net: hns3: remove redundant query in hclge_config_tm_hw_err_int()")
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:38:11 +01:00
Nathan Lynch
787252a10d powerpc/smp: do not decrement idle task preempt count in CPU offline
With PREEMPT_COUNT=y, when a CPU is offlined and then onlined again, we
get:

BUG: scheduling while atomic: swapper/1/0/0x00000000
no locks held by swapper/1/0.
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.0-rc2+ #100
Call Trace:
 dump_stack_lvl+0xac/0x108
 __schedule_bug+0xac/0xe0
 __schedule+0xcf8/0x10d0
 schedule_idle+0x3c/0x70
 do_idle+0x2d8/0x4a0
 cpu_startup_entry+0x38/0x40
 start_secondary+0x2ec/0x3a0
 start_secondary_prolog+0x10/0x14

This is because powerpc's arch_cpu_idle_dead() decrements the idle task's
preempt count, for reasons explained in commit a7c2bb8279d2 ("powerpc:
Re-enable preemption before cpu_die()"), specifically "start_secondary()
expects a preempt_count() of 0."

However, since commit 2c669ef6979c ("powerpc/preempt: Don't touch the idle
task's preempt_count during hotplug") and commit f1a0a376ca0c ("sched/core:
Initialize the idle task with preemption disabled"), that justification no
longer holds.

The idle task isn't supposed to re-enable preemption, so remove the
vestigial preempt_enable() from the CPU offline path.

Tested with pseries and powernv in qemu, and pseries on PowerVM.

Fixes: 2c669ef6979c ("powerpc/preempt: Don't touch the idle task's preempt_count during hotplug")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211015173902.2278118-1-nathanl@linux.ibm.com
2021-10-20 21:38:01 +11:00
Michael Ellerman
496c5fe25c powerpc/idle: Don't corrupt back chain when going idle
In isa206_idle_insn_mayloss() we store various registers into the stack
red zone, which is allowed.

However inside the IDLE_STATE_ENTER_SEQ_NORET macro we save r2 again,
to 0(r1), which corrupts the stack back chain.

We used to do the same in isa206_idle_insn_mayloss() itself, but we
fixed that in 73287caa9210 ("powerpc64/idle: Fix SP offsets when saving
GPRs"), however we missed that the macro also corrupts the back chain.

Corrupting the back chain is bad for debuggability but doesn't
necessarily cause a bug.

However we recently changed the stack handling in some KVM code, and it
now relies on the stack back chain being valid when it returns. The
corruption causes that code to return with r1 pointing somewhere in
kernel data, at some point LR is restored from the stack and we branch
to NULL or somewhere else invalid.

Only affects Power8 hosts running KVM guests, with dynamic_mt_modes
enabled (which it is by default).

The fixes tag below points to the commit that changed the KVM stack
handling, exposing this bug. The actual corruption of the back chain has
always existed since 948cf67c4726 ("powerpc: Add NAP mode support on
Power7 in HV mode").

Fixes: 9b4416c5095c ("KVM: PPC: Book3S HV: Fix stack handling in idle_kvm_start_guest()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020094826.3222052-1-mpe@ellerman.id.au
2021-10-20 21:37:58 +11:00
Eugene Crosser
55161e67d4 vrf: Revert "Reset skb conntrack connection..."
This reverts commit 09e856d54bda5f288ef8437a90ab2b9b3eab83d1.

When an interface is enslaved in a VRF, prerouting conntrack hook is
called twice: once in the context of the original input interface, and
once in the context of the VRF interface. If no special precausions are
taken, this leads to creation of two conntrack entries instead of one,
and breaks SNAT.

Commit above was intended to avoid creation of extra conntrack entries
when input interface is enslaved in a VRF. It did so by resetting
conntrack related data associated with the skb when it enters VRF context.

However it breaks netfilter operation. Imagine a use case when conntrack
zone must be assigned based on the original input interface, rather than
VRF interface (that would make original interfaces indistinguishable). One
could create netfilter rules similar to these:

        chain rawprerouting {
                type filter hook prerouting priority raw;
                iif realiface1 ct zone set 1 return
                iif realiface2 ct zone set 2 return
        }

This works before the mentioned commit, but not after: zone assignment
is "forgotten", and any subsequent NAT or filtering that is dependent
on the conntrack zone does not work.

Here is a reproducer script that demonstrates the difference in behaviour.

==========
#!/bin/sh

# This script demonstrates unexpected change of nftables behaviour
# caused by commit 09e856d54bda5f28 ""vrf: Reset skb conntrack
# connection on VRF rcv"
# https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=09e856d54bda5f288ef8437a90ab2b9b3eab83d1
#
# Before the commit, it was possible to assign conntrack zone to a
# packet (or mark it for `notracking`) in the prerouting chanin, raw
# priority, based on the `iif` (interface from which the packet
# arrived).
# After the change, # if the interface is enslaved in a VRF, such
# assignment is lost. Instead, assignment based on the `iif` matching
# the VRF master interface is honored. Thus it is impossible to
# distinguish packets based on the original interface.
#
# This script demonstrates this change of behaviour: conntrack zone 1
# or 2 is assigned depending on the match with the original interface
# or the vrf master interface. It can be observed that conntrack entry
# appears in different zone in the kernel versions before and after
# the commit.

IPIN=172.30.30.1
IPOUT=172.30.30.2
PFXL=30

ip li sh vein >/dev/null 2>&1 && ip li del vein
ip li sh tvrf >/dev/null 2>&1 && ip li del tvrf
nft list table testct >/dev/null 2>&1 && nft delete table testct

ip li add vein type veth peer veout
ip li add tvrf type vrf table 9876
ip li set veout master tvrf
ip li set vein up
ip li set veout up
ip li set tvrf up
/sbin/sysctl -w net.ipv4.conf.veout.accept_local=1
/sbin/sysctl -w net.ipv4.conf.veout.rp_filter=0
ip addr add $IPIN/$PFXL dev vein
ip addr add $IPOUT/$PFXL dev veout

nft -f - <<__END__
table testct {
	chain rawpre {
		type filter hook prerouting priority raw;
		iif { veout, tvrf } meta nftrace set 1
		iif veout ct zone set 1 return
		iif tvrf ct zone set 2 return
		notrack
	}
	chain rawout {
		type filter hook output priority raw;
		notrack
	}
}
__END__

uname -rv
conntrack -F
ping -W 1 -c 1 -I vein $IPOUT
conntrack -L

Signed-off-by: Eugene Crosser <crosser@average.org>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20 11:27:19 +01:00
Marios Makassikis
0d994cd482 ksmbd: add buffer validation in session setup
Make sure the security buffer's length/offset are valid with regards to
the packet length.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-20 00:07:10 -05:00
Namjae Jeon
621be84a9d ksmbd: throttle session setup failures to avoid dictionary attacks
To avoid dictionary attacks (repeated session setups rapidly sent) to
connect to server, ksmbd make a delay of a 5 seconds on session setup
failure to make it harder to send enough random connection requests
to break into a server if a user insert the wrong password 10 times
in a row.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-20 00:07:10 -05:00
Hyunchul Lee
34061d6b76 ksmbd: validate OutputBufferLength of QUERY_DIR, QUERY_INFO, IOCTL requests
Validate OutputBufferLength of QUERY_DIR, QUERY_INFO, IOCTL requests and
check the free size of response buffer for these requests.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-20 00:07:10 -05:00
Russ Weight
f09f6dfef8
spi: altera: Change to dynamic allocation of spi id
The spi-altera driver has two flavors: platform and dfl. I'm seeing
a case where I have both device types in the same machine, and they
are conflicting on the SPI ID:

... kernel: couldn't get idr
... kernel: WARNING: CPU: 28 PID: 912 at drivers/spi/spi.c:2920 spi_register_controller.cold+0x84/0xc0a

Both the platform and dfl drivers use the parent's driver ID as the SPI
ID. In the error case, the parent devices are dfl_dev.4 and
subdev_spi_altera.4.auto. When the second spi-master is created, the
failure occurs because the SPI ID of 4 has already been allocated.

Change the ID allocation to dynamic (by initializing bus_num to -1) to
avoid duplicate SPI IDs.

Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20211019002401.24041-1-russell.h.weight@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-20 01:53:15 +01:00
Mustafa Ismail
2dace185ca RDMA/irdma: Do not hold qos mutex twice on QP resume
When irdma_ws_add fails, irdma_ws_remove is used to cleanup the leaf node.
This lead to holding the qos mutex twice in the QP resume path. Fix this
by avoiding the call to irdma_ws_remove and unwinding the error in
irdma_ws_add. This skips the call to irdma_tc_in_use function which is not
needed in the error unwind cases.

Fixes: 3ae331c75128 ("RDMA/irdma: Add QoS definitions")
Link: https://lore.kernel.org/r/20211019151654.1943-2-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-19 20:22:01 -03:00
Mustafa Ismail
cc07b73ef1 RDMA/irdma: Set VLAN in UD work completion correctly
Currently VLAN is reported in UD work completion when VLAN id is zero,
i.e. no VLAN case.

Report VLAN in UD work completion only when VLAN id is non-zero.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20211019151654.1943-1-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-19 20:22:01 -03:00
Aharon Landau
5508546631 RDMA/mlx5: Initialize the ODP xarray when creating an ODP MR
Normally the zero fill would hide the missing initialization, but an
errant set to desc_size in reg_create() causes a crash:

  BUG: unable to handle page fault for address: 0000000800000000
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 5 PID: 890 Comm: ib_write_bw Not tainted 5.15.0-rc4+ #47
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:mlx5_ib_dereg_mr+0x14/0x3b0 [mlx5_ib]
  Code: 48 63 cd 4c 89 f7 48 89 0c 24 e8 37 30 03 e1 48 8b 0c 24 eb a0 90 0f 1f 44 00 00 41 56 41 55 41 54 55 53 48 89 fb 48 83 ec 30 <48> 8b 2f 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 8b 87 c8
  RSP: 0018:ffff88811afa3a60 EFLAGS: 00010286
  RAX: 000000000000001c RBX: 0000000800000000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000800000000
  RBP: 0000000800000000 R08: 0000000000000000 R09: c0000000fffff7ff
  R10: ffff88811afa38f8 R11: ffff88811afa38f0 R12: ffffffffa02c7ac0
  R13: 0000000000000000 R14: ffff88811afa3cd8 R15: ffff88810772fa00
  FS:  00007f47b9080740(0000) GS:ffff88852cd40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000800000000 CR3: 000000010761e003 CR4: 0000000000370ea0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   mlx5_ib_free_odp_mr+0x95/0xc0 [mlx5_ib]
   mlx5_ib_dereg_mr+0x128/0x3b0 [mlx5_ib]
   ib_dereg_mr_user+0x45/0xb0 [ib_core]
   ? xas_load+0x8/0x80
   destroy_hw_idr_uobject+0x1a/0x50 [ib_uverbs]
   uverbs_destroy_uobject+0x2f/0x150 [ib_uverbs]
   uobj_destroy+0x3c/0x70 [ib_uverbs]
   ib_uverbs_cmd_verbs+0x467/0xb00 [ib_uverbs]
   ? uverbs_finalize_object+0x60/0x60 [ib_uverbs]
   ? ttwu_queue_wakelist+0xa9/0xe0
   ? pty_write+0x85/0x90
   ? file_tty_write.isra.33+0x214/0x330
   ? process_echoes+0x60/0x60
   ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs]
   __x64_sys_ioctl+0x10d/0x8e0
   ? vfs_write+0x17f/0x260
   do_syscall_64+0x3c/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Add the missing xarray initialization and remove the desc_size set.

Fixes: a639e66703ee ("RDMA/mlx5: Zero out ODP related items in the mlx5_ib_mr")
Link: https://lore.kernel.org/r/a4846a11c9de834663e521770da895007f9f0d30.1634642730.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-19 20:19:59 -03:00
Prabhakar Kushwaha
60fab10766 rdma/qedr: Fix crash due to redundant release of device's qp memory
Device's QP memory should only be allocated and released by IB layer.
This patch removes the redundant release of the device's qp memory and
uses completion APIs to make sure that .destroy_qp() only return, when qp
reference becomes 0.

Fixes: 514aee660df4 ("RDMA: Globally allocate and release QP memory")
Link: https://lore.kernel.org/r/20211019082212.7052-1-pkushwaha@marvell.com
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-19 20:17:44 -03:00
Pavel Begunkov
bc369921d6 io-wq: max_worker fixes
First, fix nr_workers checks against max_workers, with max_worker
registration, it may pretty easily happen that nr_workers > max_workers.

Also, synchronise writing to acct->max_worker with wqe->lock. It's not
an actual problem, but as we don't care about io_wqe_create_worker(),
it's better than WRITE_ONCE()/READ_ONCE().

Fixes: 2e480058ddc2 ("io-wq: provide a way to limit max number of workers")
Reported-by: Beld Zhang <beldzhang@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/11f90e6b49410b7d1a88f5d04fb8d95bb86b8cf3.1634671835.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 17:09:34 -06:00
Christophe JAILLET
ba69fd9101 net: dsa: Fix an error handling path in 'dsa_switch_parse_ports_of()'
If we return before the end of the 'for_each_child_of_node()' iterator, the
reference taken on 'port' must be released.

Add the missing 'of_node_put()' calls.

Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/15d5310d1d55ad51c1af80775865306d92432e03.1634587046.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-19 15:41:16 -07:00
Rafael J. Wysocki
bc28368596 ACPI: PM: Do not turn off power resources in unknown state
Commit 6381195ad7d0 ("ACPI: power: Rework turning off unused power
resources") caused power resources in unknown state with reference
counters equal to zero to be turned off too, but that caused issues
to appear in the field, so modify the code to only turn off power
resources that are known to be "on".

Link: https://lore.kernel.org/linux-acpi/6faf4b92-78d5-47a4-63df-cc2bab7769d0@molgen.mpg.de/
Fixes: 6381195ad7d0 ("ACPI: power: Rework turning off unused power resources")
Reported-by: Andreas K. Huettel <andreas.huettel@ur.de>
Tested-by: Andreas K. Huettel <andreas.huettel@ur.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 5.14+ <stable@vger.kernel.org> # 5.14+
2021-10-19 19:28:56 +02:00
Eric W. Biederman
34dc2fd6e6 ucounts: Proper error handling in set_cred_ucounts
Instead of leaking the ucounts in new if alloc_ucounts fails, store
the result of alloc_ucounts into a temporary variable, which is later
assigned to new->ucounts.

Cc: stable@vger.kernel.org
Fixes: 905ae01c4ae2 ("Add a reference to ucounts for each cred")
Link: https://lkml.kernel.org/r/87pms2s0v8.fsf_-_@disp2133
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-19 11:04:25 -05:00
Eric W. Biederman
629715adc6 ucounts: Pair inc_rlimit_ucounts with dec_rlimit_ucoutns in commit_creds
The purpose of inc_rlimit_ucounts and dec_rlimit_ucounts in commit_creds
is to change which rlimit counter is used to track a process when the
credentials changes.

Use the same test for both to guarantee the tracking is correct.

Cc: stable@vger.kernel.org
Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts")
Link: https://lkml.kernel.org/r/87v91us0w4.fsf_-_@disp2133
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-19 11:01:52 -05:00
Woody Lin
63acd42c0d sched/scs: Reset the shadow stack when idle_task_exit
Commit f1a0a376ca0c ("sched/core: Initialize the idle task with
preemption disabled") removed the init_idle() call from
idle_thread_get(). This was the sole call-path on hotplug that resets
the Shadow Call Stack (scs) Stack Pointer (sp).

Not resetting the scs-sp leads to scs overflow after enough hotplug
cycles. Therefore add an explicit scs_task_reset() to the hotplug code
to make sure the scs-sp does get reset on hotplug.

Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
Signed-off-by: Woody Lin <woodylin@google.com>
[peterz: Changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lore.kernel.org/r/20211012083521.973587-1-woodylin@google.com
2021-10-19 17:46:11 +02:00
Linus Torvalds
d9abdee5fd Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "19 patches.

  Subsystems affected by this patch series: mm (userfaultfd, migration,
  memblock, mempolicy, slub, secretmem, and thp), ocfs2, binfmt, vfs,
  and misc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mailmap: add Andrej Shadura
  mm/thp: decrease nr_thps in file's mapping on THP split
  mm/secretmem: fix NULL page->mapping dereference in page_is_secretmem()
  vfs: check fd has read access in kernel_read_file_from_fd()
  elfcore: correct reference to CONFIG_UML
  mm, slub: fix incorrect memcg slab count for bulk free
  mm, slub: fix potential use-after-free in slab_debugfs_fops
  mm, slub: fix potential memoryleak in kmem_cache_open()
  mm, slub: fix mismatch between reconstructed freelist depth and cnt
  mm, slub: fix two bugs in slab_debug_trace_open()
  mm/mempolicy: do not allow illegal MPOL_F_NUMA_BALANCING | MPOL_LOCAL in mbind()
  memblock: check memory total_size
  ocfs2: mount fails with buffer overflow in strlen
  ocfs2: fix data corruption after conversion from inline format
  mm/migrate: fix CPUHP state to update node demotion order
  mm/migrate: add CPU hotplug to demotion #ifdef
  mm/migrate: optimize hotplug-time demotion order updates
  userfaultfd: fix a race between writeprotect and exit_mmap()
  mm/userfaultfd: selftests: fix memory corruption with thp enabled
2021-10-19 05:41:36 -10:00
David S. Miller
04ee2752a5 linux-can-fixes-for-5.15-20211019
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEK3kIWJt9yTYMP3ehqclaivrt76kFAmFub7gTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCpyVqK+u3vqf6yCACAiYQuk1sRh7/b6G4Qk2QUTWUxhG/T
 NWAOVuYMJkvGAMRAxjA7ecwRqA19/WIS2ys00t8UW62mvFGrGfcSBXWj3baPnx7w
 rfhyd+gk1LqOjWcaQriDmLO2ERY27roNA7FMtcMCoRlbYkrnYcFyKUMdhQXSW3/R
 /5KIEWBFzmr6ISUBUpkOeDs3irjy3ScIE0fEBt1+Uu3XydFWt7qnrMqBGlwoc9Xs
 du5VN5nWTZzdB9ju7p0PyrYdwoHsa1L1XbnIJd1hTLCua38028nuL9v7OGy3QvwN
 1Tz10PTnTk+2KMCkR4AwuzEBcHBVIcNmTrpTQAKiecB0QbXQlR9es0J5
 =9TSz
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-5.15-20211019' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2021-10-19

this is a pull request of a single patch for net/master.

The patch is by me and fixes the error handling in case of a FC
timeout in the TX path of the ISOTOP CAN protocol.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19 13:14:36 +01:00
Zheyu Ma
c69b2f4687 cavium: Fix return values of the probe function
During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19 13:09:57 +01:00
Zheyu Ma
e211210098 mISDN: Fix return values of the probe function
During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19 13:09:28 +01:00
Arnd Bergmann
48ccc8edf5 ARM: 9141/1: only warn about XIP address when not compile testing
In randconfig builds, we sometimes come across this warning:

arm-linux-gnueabi-ld: XIP start address may cause MPU programming issues

While this is helpful for actual systems to figure out why it
fails, the warning does not provide any benefit for build testing,
so guard it in a check for CONFIG_COMPILE_TEST, which is usually
set on randconfig builds.

Fixes: 216218308cfb ("ARM: 8713/1: NOMMU: Support MPU in XIP configuration")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:39:50 +01:00
Arnd Bergmann
1f323127ca ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
With extra warnings enabled, gcc complains about this function
definition:

arch/arm/probes/kprobes/core.c: In function 'arch_init_kprobes':
arch/arm/probes/kprobes/core.c:465:12: warning: old-style function definition [-Wold-style-definition]
  465 | int __init arch_init_kprobes()

Link: https://lore.kernel.org/all/20201027093057.c685a14b386acacb3c449e3d@kernel.org/

Fixes: 24ba613c9d6c ("ARM kprobes: core code")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:39:50 +01:00
Arnd Bergmann
44cc6412e6 ARM: 9138/1: fix link warning with XIP + frame-pointer
When frame pointers are used instead of the ARM unwinder,
and the kernel is built using clang with an external assembler
and CONFIG_XIP_KERNEL, every file produces two warnings
like:

arm-linux-gnueabi-ld: warning: orphan section `.ARM.extab' from `net/mac802154/util.o' being placed in section `.ARM.extab'
arm-linux-gnueabi-ld: warning: orphan section `.ARM.exidx' from `net/mac802154/util.o' being placed in section `.ARM.exidx'

The same fix was already merged for the normal (non-XIP)

linker script, with a longer description.

Fixes: c39866f268f8 ("arm/build: Always handle .ARM.exidx and .ARM.extab sections")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:38:22 +01:00
Arnd Bergmann
eaf6cc7165 ARM: 9134/1: remove duplicate memcpy() definition
Both the decompressor code and the kasan logic try to override
the memcpy() and memmove()  definitions, which leading to a clash
in a KASAN-enabled kernel with XZ decompression:

arch/arm/boot/compressed/decompress.c:50:9: error: 'memmove' macro redefined [-Werror,-Wmacro-redefined]
 #define memmove memmove
        ^
arch/arm/include/asm/string.h:59:9: note: previous definition is here
 #define memmove(dst, src, len) __memmove(dst, src, len)
        ^
arch/arm/boot/compressed/decompress.c:51:9: error: 'memcpy' macro redefined [-Werror,-Wmacro-redefined]
 #define memcpy memcpy
        ^
arch/arm/include/asm/string.h:58:9: note: previous definition is here
 #define memcpy(dst, src, len) __memcpy(dst, src, len)
        ^

Here we want the set of functions from the decompressor, so undefine
the other macros before the override.

Link: https://lore.kernel.org/linux-arm-kernel/CACRpkdZYJogU_SN3H9oeVq=zJkRgRT1gDz3xp59gdqWXxw-B=w@mail.gmail.com/
Link: https://lore.kernel.org/lkml/202105091112.F5rmd4By-lkp@intel.com/

Fixes: d6d51a96c7d6 ("ARM: 9014/2: Replace string mem* functions for KASan")
Fixes: a7f464f3db93 ("ARM: 7001/2: Wire up support for the XZ decompressor")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:37:36 +01:00
Nick Desaulniers
e6a0c958bd ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
A kernel built with CONFIG_THUMB2_KERNEL=y and using clang as the
assembler could generate non-naturally-aligned v7wbi_tlb_fns which
results in a boot failure. The original commit adding the macro missed
the .align directive on this data.

Link: https://github.com/ClangBuiltLinux/linux/issues/1447
Link: https://lore.kernel.org/all/0699da7b-354f-aecc-a62f-e25693209af4@linaro.org/
Debugged-by: Ard Biesheuvel <ardb@kernel.org>
Debugged-by: Nathan Chancellor <nathan@kernel.org>
Debugged-by: Richard Henderson <richard.henderson@linaro.org>

Fixes: 66a625a88174 ("ARM: mm: proc-macros: Add generic proc/cache/tlb struct definition macros")
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:37:35 +01:00
Lexi Shao
df909df077 ARM: 9132/1: Fix __get_user_check failure with ARM KASAN images
ARM: kasan: Fix __get_user_check failure with kasan

In macro __get_user_check defined in arch/arm/include/asm/uaccess.h,
error code is store in register int __e(r0). When kasan is
enabled, assigning value to kernel address might trigger kasan check,
which unexpectedly overwrites r0 and causes undefined behavior on arm
kasan images.

One example is failure in do_futex and results in process soft lockup.
Log:
watchdog: BUG: soft lockup - CPU#0 stuck for 62946ms! [rs:main
Q:Reg:1151]
...
(__asan_store4) from (futex_wait_setup+0xf8/0x2b4)
(futex_wait_setup) from (futex_wait+0x138/0x394)
(futex_wait) from (do_futex+0x164/0xe40)
(do_futex) from (sys_futex_time32+0x178/0x230)
(sys_futex_time32) from (ret_fast_syscall+0x0/0x50)

The soft lockup happens in function futex_wait_setup. The reason is
function get_futex_value_locked always return EINVAL, thus pc jump
back to retry label and causes looping.

This line in function get_futex_value_locked
	ret = __get_user(*dest, from);
is expanded to
	*dest = (typeof(*(p))) __r2; ,
in macro __get_user_check. Writing to pointer dest triggers kasan check
and overwrites the return value of __get_user_x function.
The assembly code of get_futex_value_locked in kernel/futex.c:
...
c01f6dc8:       eb0b020e        bl      c04b7608 <__get_user_4>
// "x = (typeof(*(p))) __r2;" triggers kasan check and r0 is overwritten
c01f6dCc:       e1a00007        mov     r0, r7
c01f6dd0:       e1a05002        mov     r5, r2
c01f6dd4:       eb04f1e6        bl      c0333574 <__asan_store4>
c01f6dd8:       e5875000        str     r5, [r7]
// save ret value of __get_user(*dest, from), which is dest address now
c01f6ddc:       e1a05000        mov     r5, r0
...
// checking return value of __get_user failed
c01f6e00:       e3550000        cmp     r5, #0
...
c01f6e0c:       01a00005        moveq   r0, r5
// assign return value to EINVAL
c01f6e10:       13e0000d        mvnne   r0, #13

Return value is the destination address of get_user thus certainly
non-zero, so get_futex_value_locked always return EINVAL.

Fix it by using a tmp vairable to store the error code before the
assignment. This fix has no effects to non-kasan images thanks to compiler
optimization. It only affects cases that overwrite r0 due to kasan check.

This should fix bug discussed in Link:
[1] https://lore.kernel.org/linux-arm-kernel/0ef7c2a5-5d8b-c5e0-63fa-31693fd4495c@gmail.com/

Fixes: 421015713b30 ("ARM: 9017/2: Enable KASan for ARM")
Signed-off-by: Lexi Shao <shaolexi@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:37:35 +01:00
Ard Biesheuvel
00d43d13da ARM: 9125/1: fix incorrect use of get_kernel_nofault()
Commit 344179fc7ef4 ("ARM: 9106/1: traps: use get_kernel_nofault instead
of set_fs()") replaced an occurrence of __get_user() with
get_kernel_nofault(), but inverted the sense of the conditional in the
process, resulting in no values to be printed at all.

I.e., every exception stack now looks like this:

Exception stack(0xc18d1fb0 to 0xc18d1ff8)
1fa0:                                     ???????? ???????? ???????? ????????
1fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
1fe0: ???????? ???????? ???????? ???????? ???????? ????????

which is rather unhelpful.

Fixes: 344179fc7ef4 ("ARM: 9106/1: traps: use get_kernel_nofault instead of set_fs()")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:37:34 +01:00
Nick Desaulniers
9d417cbe36 ARM: 9122/1: select HAVE_FUTEX_CMPXCHG
tglx notes:
  This function [futex_detect_cmpxchg] is only needed when an
  architecture has to runtime discover whether the CPU supports it or
  not.  ARM has unconditional support for this, so the obvious thing to
  do is the below.

Fixes linkage failure from Clang randconfigs:
kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text'
and boot failures for CONFIG_THUMB2_KERNEL.

Link: https://github.com/ClangBuiltLinux/linux/issues/325

Comments from Nick Desaulniers:

 See-also: 03b8c7b623c8 ("futex: Allow architectures to skip
 futex_atomic_cmpxchg_inatomic() test")

Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable@vger.kernel.org # v3.14+
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-19 10:37:34 +01:00
Jeff Layton
1bd85aa65d ceph: fix handling of "meta" errors
Currently, we check the wb_err too early for directories, before all of
the unsafe child requests have been waited on. In order to fix that we
need to check the mapping->wb_err later nearer to the end of ceph_fsync.

We also have an overly-complex method for tracking errors after
blocklisting. The errors recorded in cleanup_session_requests go to a
completely separate field in the inode, but we end up reporting them the
same way we would for any other error (in fsync).

There's no real benefit to tracking these errors in two different
places, since the only reporting mechanism for them is in fsync, and
we'd need to advance them both every time.

Given that, we can just remove i_meta_err, and convert the places that
used it to instead just use mapping->wb_err instead. That also fixes
the original problem by ensuring that we do a check_and_advance of the
wb_err at the end of the fsync op.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/52864
Reported-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-10-19 09:36:06 +02:00
Jeff Layton
98d0a6fb73 ceph: skip existing superblocks that are blocklisted or shut down when mounting
Currently when mounting, we may end up finding an existing superblock
that corresponds to a blocklisted MDS client. This means that the new
mount ends up being unusable.

If we've found an existing superblock with a client that is already
blocklisted, and the client is not configured to recover on its own,
fail the match. Ditto if the superblock has been forcibly unmounted.

While we're in here, also rename "other" to the more conventional "fsc".

Cc: stable@vger.kernel.org
URL: https://bugzilla.redhat.com/show_bug.cgi?id=1901499
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-10-19 09:36:06 +02:00
Marc Kleine-Budde
d674a8f123 can: isotp: isotp_sendmsg(): fix return error on FC timeout on TX path
When the a large chunk of data send and the receiver does not send a
Flow Control frame back in time, the sendmsg() does not return a error
code, but the number of bytes sent corresponding to the size of the
packet.

If a timeout occurs the isotp_tx_timer_handler() is fired, sets
sk->sk_err and calls the sk->sk_error_report() function. It was
wrongly expected that the error would be propagated to user space in
every case. For isotp_sendmsg() blocking on wait_event_interruptible()
this is not the case.

This patch fixes the problem by checking if sk->sk_err is set and
returning the error to user space.

Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol")
Link: https://github.com/hartkopp/can-isotp/issues/42
Link: https://github.com/hartkopp/can-isotp/pull/43
Link: https://lore.kernel.org/all/20210507091839.1366379-1-mkl@pengutronix.de
Cc: stable@vger.kernel.org
Reported-by: Sottas Guillaume (LMB) <Guillaume.Sottas@liebherr.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-19 09:10:30 +02:00
Andrej Shadura
362d5dfc48 mailmap: add Andrej Shadura
Add a mapping for my old work email for BelDisplayTech to the personal
email, and make sure the Collabora email has the correct spelling of the
first name.

Link: https://lkml.kernel.org/r/20210917091016.30232-1-andrew.shadura@collabora.co.uk
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00
Marek Szyprowski
1ca7554d05 mm/thp: decrease nr_thps in file's mapping on THP split
Decrease nr_thps counter in file's mapping to ensure that the page cache
won't be dropped excessively on file write access if page has been
already split.

I've tried a test scenario running a big binary, kernel remaps it with
THPs, then force a THP split with /sys/kernel/debug/split_huge_pages.
During any further open of that binary with O_RDWR or O_WRITEONLY kernel
drops page cache for it, because of non-zero thps counter.

Link: https://lkml.kernel.org/r/20211012120237.2600-1-m.szyprowski@samsung.com
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 09d91cda0e82 ("mm,thp: avoid writes to file with THP in pagecache")
Fixes: 06d3eff62d9d ("mm/thp: fix node page state in split_huge_page_to_list()")
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: <sfoon.kim@samsung.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 20:22:03 -10:00