It turns out access to j1939_can_rx_register() needs to be serialized,
otherwise j1939_priv can be corrupted when parallel threads call
j1939_netdev_start() and j1939_can_rx_register() fails. This issue is
thoroughly covered in another commit, which serializes access to
j1939_can_rx_register().
Change j1939_netdev_lock type to mutex so that we do not need to remove
GFP_KERNEL from can_rx_register().
j1939_netdev_lock seems to be used in normal contexts where mutex usage
is not prohibited.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Suggested-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20230526171910.227615-2-pchelkin@ispras.ru
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch addresses an issue within the j1939_sk_send_loop_abort()
function in the j1939/socket.c file, specifically in the context of
Transport Protocol (TP) sessions.
Without this patch, when a TP session is initiated and a Clear To Send
(CTS) frame is received from the remote side requesting one data packet,
the kernel dispatches the first Data Transport (DT) frame and then waits
for the next CTS. If the remote side doesn't respond with another CTS,
the kernel aborts due to a timeout. This leads to the user-space
receiving an EPOLLERR on the socket, and the socket becomes active.
However, when trying to read the error queue from the socket with
sock.recvmsg(, , socket.MSG_ERRQUEUE), it returns -EAGAIN,
given that the socket is non-blocking. This situation results in an
infinite loop: the user-space repeatedly calls epoll(), epoll() returns
the socket file descriptor with EPOLLERR, but the socket then blocks on
the recv() of ERRQUEUE.
This patch introduces an additional check for the J1939_SOCK_ERRQUEUE
flag within the j1939_sk_send_loop_abort() function. If the flag is set,
it indicates that the application has subscribed to receive error queue
messages. In such cases, the kernel can communicate the current transfer
state via the error queue. This allows for the function to return early,
preventing the unnecessary setting of the socket into an error state,
and breaking the infinite loop. It is crucial to note that a socket
error is only needed if the application isn't using the error queue, as,
without it, the application wouldn't be aware of transfer issues.
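The early-return decision described above can be sketched as a small user-space model. This is not the kernel code: MODEL_SOCK_ERRQUEUE, struct model_sock and model_send_loop_abort() are hypothetical stand-ins chosen for illustration, and the real function does more than this.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the J1939_SOCK_ERRQUEUE state bit. */
#define MODEL_SOCK_ERRQUEUE (1u << 0)

struct model_sock {
	unsigned int state;  /* flag bits such as MODEL_SOCK_ERRQUEUE */
	int sk_err;          /* pending socket error, 0 when none */
};

/*
 * Model of the abort decision: returns true when the socket was put into
 * an error state, false when the error queue is left to report the abort.
 */
static bool model_send_loop_abort(struct model_sock *sk, int err)
{
	if (sk->state & MODEL_SOCK_ERRQUEUE)
		return false;  /* subscriber learns the transfer state from the error queue */

	sk->sk_err = err;      /* no error queue: the socket error is the only signal */
	return true;
}
```

With the flag set the model leaves sk_err untouched, so a poll loop would not keep seeing an error condition that the error queue cannot explain.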
Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Reported-by: David Jander <david@protonic.nl>
Tested-by: David Jander <david@protonic.nl>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20230526081946.715190-1-o.rempel@pengutronix.de
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Unlinked list recovery requires that errors removing the inode from
the unlinked list get fed back to the main recovery loop. Now that
we offload the unlinking to the inodegc work, we don't get errors
being fed back when we trip over a corruption that prevents the
inode from being removed from the unlinked list.
This means we never clear the corrupt unlinked list bucket,
resulting in runtime operations eventually tripping over it and
shutting down.
Fix this by collecting inodegc worker errors and feeding them
back to the flush caller. This is largely best effort - the only
context that really cares is log recovery, and it only flushes a
single inode at a time so we don't need complex synchronised
handling. Essentially the inodegc workers will capture the first
error that occurs and the next flush will gather them and clear
them. The flush itself will only report the first gathered error.
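The "capture the first error, gather and clear on flush" scheme described above can be modelled in a few lines. This is a single-threaded sketch under stated assumptions: struct model_gc, model_gc_worker_record() and model_gc_flush() are hypothetical names, and the real workers run concurrently on per-CPU structures.

```c
/* One per worker: remembers only the first error it saw, 0 if none. */
struct model_gc {
	int error;
};

/* Worker side: keep the first error, ignore later ones. */
static void model_gc_worker_record(struct model_gc *gc, int err)
{
	if (err && !gc->error)
		gc->error = err;
}

/*
 * Flush side: gather every worker's stored error, clear them all, and
 * report only the first one gathered.
 */
static int model_gc_flush(struct model_gc *gcs, int nr)
{
	int first = 0;

	for (int i = 0; i < nr; i++) {
		if (gcs[i].error && !first)
			first = gcs[i].error;
		gcs[i].error = 0;
	}
	return first;
}
```

Because the flush clears what it gathers, a second flush with no new failures reports success, which matches the best-effort behaviour the message describes.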
In the cases where callers can return errors, propagate the
collected inodegc flush error up the error handling chain.
In the case of inode unlinked list recovery, there are several
superfluous calls to flush queued unlinked inodes -
xlog_recover_iunlink_bucket() guarantees that it has flushed the
inodegc and collected errors before it returns. Hence nothing in the
calling path needs to run a flush, even when an error is returned.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Bad things happen in deferred extent freeing operations if they are
passed a bad block number in the xefi. This can come from a bogus
agno/agbno pair from deferred agfl freeing, or just a bad fsbno
being passed to __xfs_free_extent_later(). Either way, it's very
difficult to diagnose where a null perag oops in EFI creation
is coming from when the operation that queued the xefi has already
been completed and there's no longer any trace of it around....
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
If the AGFL or the indexing in the AGF has been corrupted, getting a
block from the AGFL could return an invalid block number. If this
happens, bad things happen. Check the agbno we pull off the AGFL
and return -EFSCORRUPTED if we find something bad.
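A minimal sketch of such a check, assuming a block number is "bad" when it is the null marker or lies beyond the end of the AG. The names model_agfl_get_block(), MODEL_NULLAGBLOCK and MODEL_EFSCORRUPTED are illustrative only; the kernel uses its own verification helpers and may test more conditions.

```c
#include <stdint.h>

#define MODEL_NULLAGBLOCK  UINT32_MAX  /* stand-in for the "no block" marker */
#define MODEL_EFSCORRUPTED 117         /* illustrative errno value */

/*
 * Pull one block number off a model free list and validate it before
 * handing it to the caller. *agbno is written only on success.
 */
static int model_agfl_get_block(const uint32_t *agfl, uint32_t index,
				uint32_t ag_blocks, uint32_t *agbno)
{
	uint32_t bno = agfl[index];

	if (bno == MODEL_NULLAGBLOCK || bno >= ag_blocks)
		return -MODEL_EFSCORRUPTED;

	*agbno = bno;
	return 0;
}
```

Rejecting the value at the point it leaves the free list keeps the failure close to the corrupted structure instead of surfacing later in an unrelated operation.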
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
When a v4 filesystem has fl_last - fl_first != fl_count, we do not
detect the corruption and allow the AGF to be used as if it was
fully valid. On V5 filesystems, we reset the AGFL to empty in these
cases and avoid the corruption at a small cost of leaked blocks.
If we don't catch the corruption on V4 filesystems, bad things
happen later when an allocation attempts to trim the free list
and either double-frees stale entries in the AGFL or tries to free
NULLAGBNO entries.
Either way, this is bad. Prevent this from happening by using the
AGFL_NEED_RESET logic for v4 filesystems, too.
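The consistency condition can be sketched as follows, assuming the free list is a circular array in which fl_first and fl_last are inclusive indices, so a non-empty list holds last - first + 1 entries (with wrap-around). model_agfl_needs_reset() is a hypothetical user-space model, not the kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when first/last/count disagree with each other or with
 * the list size, i.e. when the free list should be reset to empty.
 */
static bool model_agfl_needs_reset(uint32_t first, uint32_t last,
				   uint32_t count, uint32_t size)
{
	uint32_t active;

	/* Indices or count outside the list are always inconsistent. */
	if (first >= size || last >= size || count > size)
		return true;

	if (count == 0)
		active = 0;
	else if (last >= first)
		active = last - first + 1;         /* no wrap */
	else
		active = size - first + last + 1;  /* wrapped past the end */

	return active != count;
}
```

For a list of size 10, first = 8 and last = 1 cover slots 8, 9, 0 and 1, so a count of 4 is consistent and any other count triggers the reset.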
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
xfs_bmap_longest_free_extent() can return an error when accessing
the AGF fails. In this case, the behaviour of
xfs_filestream_pick_ag() is conditional on the error. We may
continue the loop, or break out of it. The error handling after the
loop cleans up the perag reference held when the break occurs. If we
continue, the next loop iteration handles cleaning up the perag
reference.
Either way, we don't need to release the active perag reference when
xfs_bmap_longest_free_extent() fails. Doing so means we do a double
decrement on the active reference count, and this causes the active
reference count to fall to zero. At this point, new active
references will fail.
This leads to unmount hanging because it tries to grab active
references to that perag, only for it to fail. This happens inside a
loop that retries until an inode tree radix tree tag is cleared,
which cannot happen because we can't get an active reference to the
perag.
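The effect of the extra release can be shown with a toy reference count. This sketch assumes, as the text implies, that taking a new active reference fails once the count has reached zero; struct model_perag, model_perag_grab() and model_perag_rele() are hypothetical and non-atomic.

```c
#include <stdbool.h>

struct model_perag {
	int active_ref;  /* starts at 1 for the mount's own reference */
};

/* Take an active reference; refuses once the count has hit zero. */
static bool model_perag_grab(struct model_perag *pag)
{
	if (pag->active_ref == 0)
		return false;
	pag->active_ref++;
	return true;
}

/* Drop an active reference. */
static void model_perag_rele(struct model_perag *pag)
{
	pag->active_ref--;
}
```

One grab followed by two releases takes the count from 1 to 0, after which every later grab fails. That is the state in which the unmount loop retries without ever making progress.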
The unmount livelocks in this path:
xfs_reclaim_inodes+0x80/0xc0
xfs_unmount_flush_inodes+0x5b/0x70
xfs_unmountfs+0x5b/0x1a0
xfs_fs_put_super+0x49/0x110
generic_shutdown_super+0x7c/0x1a0
kill_block_super+0x27/0x50
deactivate_locked_super+0x30/0x90
deactivate_super+0x3c/0x50
cleanup_mnt+0xc2/0x160
__cleanup_mnt+0x12/0x20
task_work_run+0x5e/0xa0
exit_to_user_mode_prepare+0x1bc/0x1c0
syscall_exit_to_user_mode+0x16/0x40
do_syscall_64+0x40/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Fixes: eb70aa2d8e ("xfs: use for_each_perag_wrap in xfs_filestream_pick_ag")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Commit 6bc6c99a944c was a well-intentioned effort to initiate
consolidation of adjacent bmbt mapping records by setting the PREEN
flag. Consolidation can only happen if the length of the combined
record doesn't overflow the 21-bit blockcount field of the bmbt
recordset. Unfortunately, the length test is inverted, leading to it
triggering on data forks like these:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..16777207]: 76110848..92888055 0 (76110848..92888055) 16777208
1: [16777208..20639743]: 92888056..96750591 0 (92888056..96750591) 3862536
Note that record 0 has a length of 16777208 512b blocks. This
corresponds to 2097151 4k fsblocks, which is the maximum. Hence the two
records cannot be merged.
However, the logic is still wrong even if we change the in-loop
comparison, because the scope of our examination isn't broad enough
inside the loop to detect mappings like this:
0: [0..9]: 76110838..76110847 0 (76110838..76110847) 10
1: [10..16777217]: 76110848..92888055 0 (76110848..92888055) 16777208
2: [16777218..20639753]: 92888056..96750591 0 (92888056..96750591) 3862536
These three records could be merged into two, but one cannot determine
this purely from looking at records 0-1 or 1-2 in isolation.
Hoist the mergeability detection outside the loop, and base its decision
making on whether or not a merged mapping could be expressed in fewer
bmbt records. While we're at it, fix the incorrect return type of the
iter function.
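The "fewer records" test can be sketched as below. It assumes lengths in filesystem blocks, a maximum record length of 2^21 - 1 = 2097151 blocks (16777208 512-byte sectors divided by 8 sectors per 4k block), and that two records are mergeable when both file offset and disk block are contiguous. The names are hypothetical and the real scrubber may also compare extent state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_EXTLEN ((1ULL << 21) - 1)  /* 2097151 fsblocks */

struct model_irec {
	uint64_t startoff;    /* file offset, in fsblocks */
	uint64_t startblock;  /* disk block, in fsblocks */
	uint64_t blockcount;  /* length, in fsblocks */
};

static bool model_contiguous(const struct model_irec *a,
			     const struct model_irec *b)
{
	return a->startoff + a->blockcount == b->startoff &&
	       a->startblock + a->blockcount == b->startblock;
}

/* Fewest records that could describe the same mappings. */
static uint64_t model_min_records(const struct model_irec *recs, size_t nr)
{
	uint64_t needed = 0, run = 0;

	for (size_t i = 0; i < nr; i++) {
		if (i > 0 && !model_contiguous(&recs[i - 1], &recs[i])) {
			/* Close the contiguous run: round up to whole records. */
			needed += (run + MODEL_MAX_EXTLEN - 1) / MODEL_MAX_EXTLEN;
			run = 0;
		}
		run += recs[i].blockcount;
	}
	if (run)
		needed += (run + MODEL_MAX_EXTLEN - 1) / MODEL_MAX_EXTLEN;
	return needed;
}

static bool model_could_merge(const struct model_irec *recs, size_t nr)
{
	return model_min_records(recs, nr) < nr;
}
```

A maximal record followed by a 482817-block record still needs two records, while a 1-block record in front of that pair gives 2579969 contiguous blocks, which fit in two records instead of three.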
Fixes: 336642f792 ("xfs: alert the user about data/attr fork mappings that could be merged")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
With gcc-5:
In file included from ./include/trace/define_trace.h:102:0,
from ./fs/xfs/scrub/trace.h:988,
from fs/xfs/scrub/trace.c:40:
./fs/xfs/./scrub/trace.h: In function ‘trace_raw_output_xchk_fsgate_class’:
./fs/xfs/scrub/scrub.h:111:28: error: initializer element is not constant
#define XREP_ALREADY_FIXED (1 << 31) /* checking our repair work */
^
Shifting the (signed) value 1 into the sign bit is undefined behavior.
Fix this for all definitions in the file by shifting "1U" instead of
"1".
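A small illustration of the unsigned form, assuming a 32-bit int. MODEL_ALREADY_FIXED and the helpers are hypothetical names used only here; the static assertion shows that the unsigned shift is a valid integer constant expression.

```c
#include <stdint.h>

/* Unsigned shift: well defined, yields bit 31 as a constant. */
#define MODEL_ALREADY_FIXED (1U << 31)

_Static_assert(MODEL_ALREADY_FIXED == 0x80000000u,
	       "bit 31 must be representable as an unsigned constant");

static uint32_t model_set_flag(uint32_t flags)
{
	return flags | MODEL_ALREADY_FIXED;
}

static int model_flag_is_set(uint32_t flags)
{
	return (flags & MODEL_ALREADY_FIXED) != 0;
}
```

With a signed 1, the same shift would move a bit into the sign position of int, which is the case the compiler rejected as a non-constant initializer.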
This was exposed by the first user added in commit 466c525d6d
("xfs: minimize overhead of drain wakeups by using jump labels").
Fixes: 160b5a7845 ("xfs: hoist the already_fixed variable to the scrub context")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Lock order in XFS is AGI -> AGF, hence for operations involving
inode unlinked list operations we always lock the AGI first. Inode
unlinked list operations operate on the inode cluster buffer,
so the lock order there is AGI -> inode cluster buffer.
For O_TMPFILE operations, this now means the lock order set down in
xfs_rename and xfs_link is AGI -> inode cluster buffer -> AGF as the
unlinked ops are done before the directory modifications that may
allocate space and lock the AGF.
Unfortunately, we also now lock the inode cluster buffer when
logging an inode so that we can attach the inode to the cluster
buffer and pin it in memory. This creates a lock order of AGF ->
inode cluster buffer in directory operations as we have to log the
inode after we've allocated new space for it.
This creates a lock inversion between the AGF and the inode cluster
buffer. Because the inode cluster buffer is shared across multiple
inodes, the inversion is not specific to individual inodes but can
occur when inodes in the same cluster buffer are accessed in
different orders.
To fix this we need to move all the inode log item cluster buffer
interactions to the end of the current transaction. Unfortunately,
xfs_trans_log_inode() calls are littered throughout the transactions
with no thought to ordering against other items or locking. This
makes it difficult to do anything that involves changing the call
sites of xfs_trans_log_inode() to change locking orders.
However, we do now have a mechanism that allows us to postpone dirty
item processing to just before we commit the transaction: the
->iop_precommit method. This will be called after all the
modifications are done and high level objects like AGI and AGF
buffers have been locked and modified, thereby providing a mechanism
that guarantees we don't lock the inode cluster buffer before those
high level objects are locked.
This change is largely moving the guts of xfs_trans_log_inode() to
xfs_inode_item_precommit() and providing an extra flag context in
the inode log item to track the dirty state of the inode in the
current transaction. This also means we do a lot less repeated work
in xfs_trans_log_inode() by only doing it once per transaction when
all the work is done.
Fixes: 298f7bec50 ("xfs: pin inode backing buffer to the inode log item")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
To fix an AGI-AGF-inode cluster buffer deadlock, we need to move
inode cluster buffer operations to the ->iop_precommit() method.
However, this means that deferred operations can require precommits
to be run on the final transaction that the deferred ops pass back
to xfs_trans_commit() context. This will be exposed by attribute
handling, in that the last changes to the inode in the attr set
state machine "disappear" because the precommit operation is not run.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
It was accidentally dropped when refactoring the allocation code,
resulting in the AG iteration always doing blocking AG iteration.
This results in a small performance regression for a specific fsmark
test that runs more user data writer threads than there are AGs.
Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 2edf06a50f ("xfs: factor xfs_alloc_vextent_this_ag() for _iterate_ags()")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
When a buffer is unpinned by xfs_buf_item_unpin(), we need to access
the buffer after we've dropped the buffer log item reference count.
This opens a window where we can have two racing unpins for the
buffer item (e.g. shutdown checkpoint context callback processing
racing with journal IO iclog completion processing) and both attempt
to access the buffer after dropping the BLI reference count. If we
are unlucky, the "BLI freed" context wins the race and frees the
buffer before the "BLI still active" case checks the buffer pin
count.
This results in a use after free that can only be triggered
in active filesystem shutdown situations.
To fix this, we need to ensure that buffer existence extends beyond
the BLI reference count checks and until the unpin processing is
complete. This implies that a buffer pin operation must also take a
buffer reference to ensure that the buffer cannot be freed until the
buffer unpin processing is complete.
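The lifetime rule can be modelled with two counters. This is a single-threaded sketch, not the kernel implementation: struct model_buf and its helpers are hypothetical, and "freed" is just a flag so the ordering can be observed.

```c
#include <stdbool.h>

struct model_buf {
	int hold;       /* buffer references */
	int pin_count;  /* outstanding pins */
	bool freed;     /* set when the last reference is dropped */
};

static void model_buf_hold(struct model_buf *bp)
{
	bp->hold++;
}

static void model_buf_rele(struct model_buf *bp)
{
	if (--bp->hold == 0)
		bp->freed = true;
}

/* Pinning also takes a buffer reference. */
static void model_buf_pin(struct model_buf *bp)
{
	model_buf_hold(bp);
	bp->pin_count++;
}

/* Unpin touches the buffer first and drops its reference last. */
static void model_buf_unpin(struct model_buf *bp)
{
	bp->pin_count--;
	model_buf_rele(bp);
}
```

Even if the log item's own reference is released while the buffer is still pinned, the reference taken by the pin keeps the buffer alive until the unpin has finished with it.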
Reported-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Merge tag 'irq_urgent_for_v6.4_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Borislav Petkov:
- Fix open firmware quirks validation so that they don't get applied
wrongly
* tag 'irq_urgent_for_v6.4_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic: Correctly validate OF quirk descriptors
This patch fixes the following sparse warning:
net/sched/sch_api.c:2305:1: sparse: warning: symbol 'tc_skip_wrapper' was not declared. Should it be static?
No functional change intended.
Signed-off-by: Min-Hua Chen <minhuadotchen@gmail.com>
Acked-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Fang says:
====================
net: enetc: correct the statistics of rx bytes
The purpose of this patch set is to fix the issue of rx bytes
statistics. The first patch corrects the rx bytes statistics
of normal kernel protocol stack path, and the second patch is
used to correct the rx bytes statistics of XDP.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The rx_bytes statistics of XDP are always zero, because rx_byte_cnt
is not updated after it is initialized to 0. So fix it.
Fixes: d1b15102dd ("net: enetc: add support for XDP_DROP and XDP_PASS")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rx_bytes of struct net_device_stats should count the length of
Ethernet frames excluding the FCS. However, there are two problems
with the rx_bytes statistics of the current enetc driver. One is
that the length of the VLAN header is not counted if the VLAN extraction
feature is enabled. The other is that the length of the L2 header is not
counted, because eth_type_trans(), which subtracts the length of the
L2 header from skb->len, is invoked before rx_bytes is updated.
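The arithmetic behind the two undercounts can be sketched like this, assuming a 14-byte Ethernet header and a 4-byte VLAN tag. model_rx_bytes() is a hypothetical helper that reconstructs the frame length from the post-eth_type_trans() length; the actual fix may simply account the bytes at an earlier point.

```c
#include <stdbool.h>

#define MODEL_ETH_HLEN  14  /* destination MAC + source MAC + EtherType */
#define MODEL_VLAN_HLEN 4   /* 802.1Q tag removed by hardware extraction */

/*
 * Frame length without FCS, rebuilt from the length left after the L2
 * header has been pulled and, optionally, the VLAN tag extracted.
 */
static unsigned int model_rx_bytes(unsigned int len_after_pull,
				   bool vlan_extracted)
{
	unsigned int bytes = len_after_pull + MODEL_ETH_HLEN;

	if (vlan_extracted)
		bytes += MODEL_VLAN_HLEN;
	return bytes;
}
```

For a 60-byte untagged frame the post-pull length is 46, and for a 64-byte tagged frame with extraction enabled it is also 46, so counting the post-pull length alone would report 46 bytes in both cases.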
BTW, the rx_bytes statistics of the XDP path have a similar problem,
which I will fix in another patch.
Fixes: a800abd3ec ("net: enetc: move skb creation into enetc_build_skb")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'media/v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"Some driver fixes:
- a regression fix for the verisilicon driver
- uvcvideo: don't expose unsupported video formats to userspace
- camss-video: don't zero subdev format after init
- mediatek: some fixes for 4K decoder formats
- fix a Sphinx build warning (missing doc for client_caps)
- some fixes for imx and atomisp staging drivers
And two CEC core fixes:
- don't set last_initiator if TX in progress
- disable adapter in cec_devnode_unregister"
* tag 'media/v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: uvcvideo: Don't expose unsupported formats to userspace
media: v4l2-subdev: Fix missing kerneldoc for client_caps
media: staging: media: imx: initialize hs_settle to avoid warning
media: v4l2-mc: Drop subdev check in v4l2_create_fwnode_links_to_pad()
media: staging: media: atomisp: init high & low vars
media: cec: core: don't set last_initiator if tx in progress
media: cec: core: disable adapter in cec_devnode_unregister
media: mediatek: vcodec: Only apply 4K frame sizes on decoder formats
media: camss: camss-video: Don't zero subdev format again after initialization
media: verisilicon: Additional fix for the crash when opening the driver
Here are a bunch of tiny char/misc/other driver fixes for 6.4-rc5 that
resolve a number of reported issues. Included in here are:
- iio driver fixes
- fpga driver fixes
- test_firmware bugfixes
- fastrpc driver tiny bugfixes
- MAINTAINERS file updates for some subsystems
All of these have been in linux-next this past week with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a bunch of tiny char/misc/other driver fixes for 6.4-rc5 that
resolve a number of reported issues. Included in here are:
- iio driver fixes
- fpga driver fixes
- test_firmware bugfixes
- fastrpc driver tiny bugfixes
- MAINTAINERS file updates for some subsystems
All of these have been in linux-next this past week with no reported
issues"
* tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (34 commits)
test_firmware: fix the memory leak of the allocated firmware buffer
test_firmware: fix a memory leak with reqs buffer
test_firmware: prevent race conditions by a correct implementation of locking
firmware_loader: Fix a NULL vs IS_ERR() check
MAINTAINERS: Vaibhav Gupta is the new ipack maintainer
dt-bindings: fpga: replace Ivan Bornyakov maintainership
MAINTAINERS: update Microchip MPF FPGA reviewers
misc: fastrpc: reject new invocations during device removal
misc: fastrpc: return -EPIPE to invocations on device removal
misc: fastrpc: Reassign memory ownership only for remote heap
misc: fastrpc: Pass proper scm arguments for secure map request
iio: imu: inv_icm42600: fix timestamp reset
iio: adc: ad_sigma_delta: Fix IRQ issue by setting IRQ_DISABLE_UNLAZY flag
dt-bindings: iio: adc: renesas,rcar-gyroadc: Fix adi,ad7476 compatible value
iio: dac: mcp4725: Fix i2c_master_send() return value handling
iio: accel: kx022a fix irq getting
iio: bu27034: Ensure reset is written
iio: dac: build ad5758 driver when AD5758 is selected
iio: addac: ad74413: fix resistance input processing
iio: light: vcnl4035: fixed chip ID check
...
The final production baseboard had a different chip select than
earlier prototype boards. When the newer board was released,
the SPI stopped working because the wrong pin was used in the device
tree and conflicted with the UART RTS. Fix the pinmux for
production boards.
Fixes: 36ca3c8ccb ("arm64: dts: imx: Add Beacon i.MX8M Nano development kit")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Here are 2 small driver core cacheinfo fixes for 6.4-rc5 that resolve a
number of reported issues with that file. These changes have been in
linux-next this past week with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'driver-core-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two small driver core cacheinfo fixes for 6.4-rc5 that
resolve a number of reported issues with that file. These changes have
been in linux-next this past week with no reported problems"
* tag 'driver-core-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
drivers: base: cacheinfo: Update cpu_map_populated during CPU Hotplug
drivers: base: cacheinfo: Fix shared_cpu_map changes in event of CPU hotplug
Here are some small tty/serial driver fixes for 6.4-rc5 that have all
been in linux-next this past week with no reported problems. Included
in here are:
- 8250_tegra driver bugfix
- fsl uart driver bugfixes
- Kconfig fix for dependency issue
- dt-bindings fix for the 8250_omap driver
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'tty-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small tty/serial driver fixes for 6.4-rc5 that have all
been in linux-next this past week with no reported problems. Included
in here are:
- 8250_tegra driver bugfix
- fsl uart driver bugfixes
- Kconfig fix for dependency issue
- dt-bindings fix for the 8250_omap driver"
* tag 'tty-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
dt-bindings: serial: 8250_omap: add rs485-rts-active-high
serial: cpm_uart: Fix a COMPILE_TEST dependency
soc: fsl: cpm1: Fix TSA and QMC dependencies in case of COMPILE_TEST
tty: serial: fsl_lpuart: use UARTCTRL_TXINV to send break instead of UARTCTRL_SBK
serial: 8250_tegra: Fix an error handling path in tegra_uart_probe()
Here are some USB driver and core fixes for 6.4-rc5. Most of these are
tiny driver fixes, including:
- udc driver bugfix
- f_fs gadget driver bugfix
- cdns3 driver bugfix
- typec bugfixes
But the "big" thing in here is a fix yet-again for how the USB buffers
are handled from userspace when dealing with DMA issues. The changes
were discussed a lot, and tested a lot, on the list, and acked by the
relevant mm maintainers and have been in linux-next all this past week
with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'usb-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some USB driver and core fixes for 6.4-rc5. Most of these are
tiny driver fixes, including:
- udc driver bugfix
- f_fs gadget driver bugfix
- cdns3 driver bugfix
- typec bugfixes
But the "big" thing in here is a fix yet-again for how the USB buffers
are handled from userspace when dealing with DMA issues. The changes
were discussed a lot, and tested a lot, on the list, and acked by the
relevant mm maintainers and have been in linux-next all this past week
with no reported problems"
* tag 'usb-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: typec: tps6598x: Fix broken polling mode after system suspend/resume
mm: page_table_check: Ensure user pages are not slab pages
mm: page_table_check: Make it dependent on EXCLUSIVE_SYSTEM_RAM
usb: usbfs: Use consistent mmap functions
usb: usbfs: Enforce page requirements for mmap
dt-bindings: usb: snps,dwc3: Fix "snps,hsphy_interface" type
usb: gadget: udc: fix NULL dereference in remove()
usb: gadget: f_fs: Add unbind event before functionfs_unbind
usb: cdns3: fix NCM gadget RX speed 20x slow than expection at iMX8QM
* Address some fallout of the locking rework, this time affecting
the way the vgic is configured
* Fix an issue where the page table walker frees a subtree and
then proceeds with walking what it has just freed...
* Check that a given PA donated to the guest is actually memory
(only affecting pKVM)
* Correctly handle MTE CMOs by Set/Way
* Fix the reported address of a watchpoint forwarded to userspace
* Fix the freeing of the root of stage-2 page tables
* Stop creating spurious PMU events to perform detection of the
default PMU and use the existing PMU list instead.
x86:
* Fix a memslot lookup bug in the NX recovery thread that could
theoretically let userspace bypass the NX hugepage mitigation
* Fix a s/BLOCKING/PENDING bug in SVM's vNMI support
* Account exit stats for fastpath VM-Exits that never leave the super
tight run-loop
* Fix an out-of-bounds bug in the optimized APIC map code, and add a
regression test for the race.
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Address some fallout of the locking rework, this time affecting the
way the vgic is configured
- Fix an issue where the page table walker frees a subtree and then
proceeds with walking what it has just freed...
- Check that a given PA donated to the guest is actually memory (only
affecting pKVM)
- Correctly handle MTE CMOs by Set/Way
- Fix the reported address of a watchpoint forwarded to userspace
- Fix the freeing of the root of stage-2 page tables
- Stop creating spurious PMU events to perform detection of the
default PMU and use the existing PMU list instead
x86:
- Fix a memslot lookup bug in the NX recovery thread that could
theoretically let userspace bypass the NX hugepage mitigation
- Fix a s/BLOCKING/PENDING bug in SVM's vNMI support
- Account exit stats for fastpath VM-Exits that never leave the super
tight run-loop
- Fix an out-of-bounds bug in the optimized APIC map code, and add a
regression test for the race"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: selftests: Add test for race in kvm_recalculate_apic_map()
KVM: x86: Bail from kvm_recalculate_phys_map() if x2APIC ID is out-of-bounds
KVM: x86: Account fastpath-only VM-Exits in vCPU stats
KVM: SVM: vNMI pending bit is V_NMI_PENDING_MASK not V_NMI_BLOCKING_MASK
KVM: x86/mmu: Grab memslot for correct address space in NX recovery worker
KVM: arm64: Document default vPMU behavior on heterogeneous systems
KVM: arm64: Iterate arm_pmus list to probe for default PMU
KVM: arm64: Drop last page ref in kvm_pgtable_stage2_free_removed()
KVM: arm64: Populate fault info for watchpoint
KVM: arm64: Reload PTE after invoking walker callback on preorder traversal
KVM: arm64: Handle trap of tagged Set/Way CMOs
arm64: Add missing Set/Way CMO encodings
KVM: arm64: Prevent unconditional donation of unmapped regions from the host
KVM: arm64: vgic: Fix a comment
KVM: arm64: vgic: Fix locking comment
KVM: arm64: vgic: Wrap vgic_its_create() with config_lock
KVM: arm64: vgic: Fix a circular locking issue
- Fix link errors in new aes-gcm-p10 code when built-in with other drivers.
- Limit number of TCEs passed to H_STUFF_TCE hcall as per spec.
- Use KSYM_NAME_LEN in xmon array size to avoid possible OOB write.
Thanks to: Gaurav Batra, Maninder Singh, Vishal Chourasia.
Merge tag 'powerpc-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix link errors in new aes-gcm-p10 code when built-in with other
drivers
- Limit number of TCEs passed to H_STUFF_TCE hcall as per spec
- Use KSYM_NAME_LEN in xmon array size to avoid possible OOB write
Thanks to Gaurav Batra, Maninder Singh, and Vishal Chourasia.
* tag 'powerpc-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/xmon: Use KSYM_NAME_LEN in array size
powerpc/iommu: Limit number of TCEs to 512 for H_STUFF_TCE hcall
powerpc/crypto: Fix aes-gcm-p10 link errors
The nr_active counter continues to increase over time, which causes
blk_mq_get_tag() to hang until the thread is rescheduled to a different
core, even though tags are still available.
kernel-stack
INFO: task inboundIOReacto:3014879 blocked for more than 2 seconds
Not tainted 6.1.15-amd64 #1 Debian 6.1.15~debian11
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:inboundIOReacto state:D stack:0 pid:3014879 ppid:4557 flags:0x00000000
Call Trace:
<TASK>
__schedule+0x351/0xa20
scheduler+0x5d/0xe0
io_schedule+0x42/0x70
blk_mq_get_tag+0x11a/0x2a0
? dequeue_task_stop+0x70/0x70
__blk_mq_alloc_requests+0x191/0x2e0
kprobe output showing that the RQF_MQ_INFLIGHT bit is not cleared before
__blk_mq_free_request() is called:
320 320 kworker/29:1H __blk_mq_free_request rq_flags 0x220c0 in-flight 1
b'__blk_mq_free_request+0x1 [kernel]'
b'bt_iter+0x50 [kernel]'
b'blk_mq_queue_tag_busy_iter+0x318 [kernel]'
b'blk_mq_timeout_work+0x7c [kernel]'
b'process_one_work+0x1c4 [kernel]'
b'worker_thread+0x4d [kernel]'
b'kthread+0xe6 [kernel]'
b'ret_from_fork+0x1f [kernel]'
Signed-off-by: Tian Lan <tian.lan@twosigma.com>
Fixes: 2e315dc07d ("blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230513221227.497327-1-tilan7663@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
SMCRv1 has a similar issue to SMCRv2 (see link below): it may access
invalid MRs of RMBs when constructing LLC ADD LINK CONT messages.
BUG: kernel NULL pointer dereference, address: 0000000000000014
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 5 PID: 48 Comm: kworker/5:0 Kdump: loaded Tainted: G W E 6.4.0-rc3+ #49
Workqueue: events smc_llc_add_link_work [smc]
RIP: 0010:smc_llc_add_link_cont+0x160/0x270 [smc]
RSP: 0018:ffffa737801d3d50 EFLAGS: 00010286
RAX: ffff964f82144000 RBX: ffffa737801d3dd8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff964f81370c30
RBP: ffffa737801d3dd4 R08: ffff964f81370000 R09: ffffa737801d3db0
R10: 0000000000000001 R11: 0000000000000060 R12: ffff964f82e70000
R13: ffff964f81370c38 R14: ffffa737801d3dd3 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff9652bfd40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000014 CR3: 000000008fa20004 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
smc_llc_srv_rkey_exchange+0xa7/0x190 [smc]
smc_llc_srv_add_link+0x3ae/0x5a0 [smc]
smc_llc_add_link_work+0xb8/0x140 [smc]
process_one_work+0x1e5/0x3f0
worker_thread+0x4d/0x2f0
? __pfx_worker_thread+0x10/0x10
kthread+0xe5/0x120
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
When an alternate RNIC is available in the system, SMC will try to add a
new link based on that RNIC for resilience. All the RMBs in use will be
mapped to the new link. Then the RMBs' MRs corresponding to the new link
will be filled into LLC messages. For SMCRv1, they are ADD LINK CONT
messages.
However, smc_llc_add_link_cont() may mistakenly access unused RMBs which
haven't been mapped to the new link and have no valid MRs, thus causing a
crash. So this patch fixes it.
Fixes: 87f88cda21 ("net/smc: rkey processing for a new link as SMC client")
Link: https://lore.kernel.org/r/1685101741-74826-3-git-send-email-guwen@linux.alibaba.com
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This resolves the issue where the generated binary shows up as an untracked git file after every kernel build.
Signed-off-by: Weihao Gao <weihaogao@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix a memslot lookup bug in the NX recovery thread that could
theoretically let userspace bypass the NX hugepage mitigation
- Fix a s/BLOCKING/PENDING bug in SVM's vNMI support
- Account exit stats for fastpath VM-Exits that never leave the super
tight run-loop
- Fix an out-of-bounds bug in the optimized APIC map code, and add a
regression test for the race.
Merge tag 'kvm-x86-fixes-6.4' of https://github.com/kvm-x86/linux into HEAD
KVM x86 fixes for 6.4
- Fix a memslot lookup bug in the NX recovery thread that could
theoretically let userspace bypass the NX hugepage mitigation
- Fix a s/BLOCKING/PENDING bug in SVM's vNMI support
- Account exit stats for fastpath VM-Exits that never leave the super
tight run-loop
- Fix an out-of-bounds bug in the optimized APIC map code, and add a
regression test for the race.
- Fix the reported address of a watchpoint forwarded to userspace
- Fix the freeing of the root of stage-2 page tables
- Stop creating spurious PMU events to perform detection of the
default PMU and use the existing PMU list instead.
Merge tag 'kvmarm-fixes-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.4, take #3
- Fix the reported address of a watchpoint forwarded to userspace
- Fix the freeing of the root of stage-2 page tables
- Stop creating spurious PMU events to perform detection of the
default PMU and use the existing PMU list instead.
- Address some fallout of the locking rework, this time affecting
the way the vgic is configured
- Fix an issue where the page table walker frees a subtree and
then proceeds with walking what it has just freed...
- Check that a given PA donated to the guest is actually memory
(only affecting pKVM)
- Correctly handle MTE CMOs by Set/Way
Merge tag 'kvmarm-fixes-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.4, take #2
- Address some fallout of the locking rework, this time affecting
the way the vgic is configured
- Fix an issue where the page table walker frees a subtree and
then proceeds with walking what it has just freed...
- Check that a given PA donated to the guest is actually memory
(only affecting pKVM)
- Correctly handle MTE CMOs by Set/Way
Five fixes, all in drivers. The most extensive is the target change to
fix the hang in the login code, which involves changing timers from
per login to per connection.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Five fixes, all in drivers.
The most extensive is the target change to fix the hang in the login
code, which involves changing timers from per login to per connection"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: stex: Fix gcc 13 warnings
scsi: qla2xxx: Fix NULL pointer dereference in target mode
scsi: target: iscsi: Prevent login threads from racing between each other
scsi: target: iscsi: Remove unused transport_timer
scsi: target: iscsi: Fix hang in the iSCSI login code
Here's a fix for a regression in 6.4-rc1 which broke the backlight on
machines such as the Lenovo ThinkPad X13s.
Merge tag 'leds-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/linux
Pull LED fix from Johan Hovold:
"Here's a fix for a regression in 6.4-rc1 which broke the backlight on
machines such as the Lenovo ThinkPad X13s"
Acked-by: Lee Jones <lee@kernel.org>
Link: https://lore.kernel.org/lkml/20230602091928.GR449117@google.com/
* tag 'leds-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/linux:
leds: qcom-lpg: Fix PWM period limits
The introduction of high resolution PWM support changed the order of the
operations in the calculation of min and max period. The result in both
divisions is in most cases a truncation to 0, which limits the period to
the range of [0, 0].
Both numerators (and denominators) are within 64 bits, so the whole
expression can be put directly into the div64_u64, instead of doing it
partially.
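The truncation described above can be reproduced in plain userspace C.
This is only a sketch: the function names and the sample values are
illustrative, and it uses ordinary 64-bit division rather than the
kernel's div64_u64() or the driver's actual expression.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Divides first: (steps / clk_hz) truncates to 0 whenever steps < clk_hz. */
static uint64_t period_ns_divide_first(uint64_t steps, uint64_t clk_hz)
{
	return NSEC_PER_SEC * (steps / clk_hz);
}

/* Multiplies first, so the single division keeps the precision. */
static uint64_t period_ns_multiply_first(uint64_t steps, uint64_t clk_hz)
{
	return (NSEC_PER_SEC * steps) / clk_hz;
}
```

With 64 steps and a 1 MHz clock (both made-up values), the first variant
yields 0 while the second yields 64000 ns.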
Fixes: b00d2ed376 ("leds: rgb: leds-qcom-lpg: Add support for high resolution PWM")
Reviewed-by: Caleb Connolly <caleb.connolly@linaro.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Acked-by: Lee Jones <lee@kernel.org>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Link: https://lore.kernel.org/r/20230515162604.649203-1-quic_bjorande@quicinc.com
Signed-off-by: Johan Hovold <johan@kernel.org>
- Return NULL if the trace_probe list on trace_probe_event is empty.
- selftests/ftrace: Choose testing symbol name for filtering feature
from sample data instead of fixed symbol.
Merge tag 'probes-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- Return NULL if the trace_probe list on trace_probe_event is empty
- selftests/ftrace: Choose testing symbol name for filtering feature
from sample data instead of fixed symbol
* tag 'probes-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
selftests/ftrace: Choose target function for filter test from samples
tracing/probe: trace_probe_primary_from_call(): checked list_first_entry
Raju Lakkaraju reported that the below commit caused a regression
with Lan743x drivers and a 2.5G SFP. Sadly, this is because the commit
was utterly wrong. Let's fix this properly by not moving the
linkmode_and(), but instead copying the link ksettings and then
modifying the advertising mask before passing the modified link
ksettings to phylib.
Fixes: df0acdc59b ("net: phylink: fix ksettings_set() ethtool call")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1q4eLm-00Ayxk-GZ@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Save a bit of space; this could also help future sysctls
use the same pattern.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The skip_notify_on_dev_down ctl table expects this field
to be an int (4 bytes), not a bool (1 byte).
Because proc_dou8vec_minmax() was added in 5.13,
this patch converts skip_notify_on_dev_down to an int.
The following patch then converts the field to u8 and uses proc_dou8vec_minmax().
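A userspace sketch of why the width mismatch matters, under the
assumption that the handler always stores a full int at the field's
address. The helper name is hypothetical and this is not the sysctl
code itself.

```c
#include <assert.h>
#include <string.h>

/*
 * Mimics a handler that always stores sizeof(int) bytes at the address
 * it was given, regardless of how wide the destination field really is.
 */
static void toy_store_int(unsigned char *field, int value)
{
	memcpy(field, &value, sizeof(value));
}
```

If the destination is a 1-byte field, the remaining sizeof(int) - 1 bytes
of the store land in whatever follows it in the structure.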
Fixes: 7c6bb7d2fa ("net/ipv6: Add knob to skip DELROUTE message on device down")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Keep switching between LAPIC_MODE_X2APIC and LAPIC_MODE_DISABLED during
APIC map construction to hunt for TOCTOU bugs in KVM. KVM's optimized map
recalc makes multiple passes over the list of vCPUs, and the calculations
ignore vCPUs whose APIC is hardware-disabled, i.e. there's a window where
toggling LAPIC_MODE_DISABLED is quite interesting.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230602233250.1014316-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Bail from kvm_recalculate_phys_map() and disable the optimized map if the
target vCPU's x2APIC ID is out-of-bounds, i.e. if the vCPU was added
and/or enabled its local APIC after the map was allocated. This fixes an
out-of-bounds access bug in the !x2apic_format path where KVM would write
beyond the end of phys_map.
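The shape of such a bounds check, as a minimal sketch. The struct and
function below are hypothetical stand-ins, not KVM's phys_map code; only
the -E2BIG return value is taken from the text of this commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy map sized for IDs 0..max_id; hypothetical, not struct kvm_apic_map. */
struct toy_map {
	uint32_t max_id;
	int slots[4];
};

/* Refuse to write past the end of the table instead of corrupting memory. */
static int toy_map_set(struct toy_map *map, uint32_t id, int value)
{
	if (id > map->max_id)
		return -E2BIG;

	map->slots[id] = value;
	return 0;
}
```

The caller can then decide how to react to the error, for example by
falling back to a slower path.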
Check the x2APIC ID regardless of whether or not x2APIC is enabled,
as KVM hardcodes x2APIC ID to be the vCPU ID, i.e. it can't change, and
the map allocation in kvm_recalculate_apic_map() doesn't check for x2APIC
being enabled, i.e. the check won't get false positives.
Note, this also affects the x2apic_format path, which previously just
ignored the "x2apic_id > new->max_apic_id" case. That too is arguably a
bug fix, as ignoring the vCPU meant that KVM would not send interrupts to
the vCPU until the next map recalculation. In practice, that "bug" is
likely benign as a newly present vCPU/APIC would immediately trigger a
recalc. But, there's no functional downside to disabling the map, and
a future patch will gracefully handle the -E2BIG case by retrying instead
of simply disabling the optimized map.
Opportunistically add a sanity check on the xAPIC ID size, along with a
comment explaining why the xAPIC ID is guaranteed to be "good".
Reported-by: Michal Luczaj <mhal@rbox.co>
Fixes: 5b84b02917 ("KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230602233250.1014316-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Rhys Rustad-Elliott says:
====================
Commit d937bc3449 ("bpf: make uniform use of array->elem_size
everywhere in arraymap.c") changed array_map_gen_lookup to use
array->elem_size instead of round_up(map->value_size, 8) as the element
size when generating code to access a value in an array map.
array->elem_size, however, is not set by bpf_map_meta_alloc when
initializing a BPF_MAP_TYPE_ARRAY_OF_MAPS or BPF_MAP_TYPE_HASH_OF_MAPS.
This results in array_map_gen_lookup incorrectly outputting code that
always accesses index 0 in the array (as the index will be calculated
via a multiplication with the element size, which is incorrectly set to
0).
This patchset sets elem_size on the bpf_array object when allocating an
array or hash of maps to fix this and adds a selftest that accesses an
array map nested within a hash of maps at a nonzero index to prevent
regressions.
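The effect of a zero element size on the index calculation can be
sketched as follows. The struct is a hypothetical stand-in, not the
kernel's struct bpf_array, and the generated BPF code is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an array map's metadata. */
struct toy_array {
	uint32_t elem_size;	/* bytes per element; 0 models the unset field */
};

/* Byte offset of element 'index', computed as index * elem_size. */
static size_t toy_offset(const struct toy_array *arr, uint32_t index)
{
	return (size_t)index * arr->elem_size;
}
```

With elem_size left at 0, every index maps to offset 0, i.e. to the
first element.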
v1: https://lore.kernel.org/bpf/95b5da7c-ee52-3ecb-0a4e-f6a7a114f269@linux.dev/
Changelog:
v1 -> v2:
Address comments by Martin KaFai Lau:
- Directly use inner_array->elem_size instead of using round_up
- Move selftests to a new patch
- Use ASSERT_* macros instead of CHECK and remove duration
- Remove unnecessary usleep
- Shorten selftest name
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add a selftest that accesses a BPF_MAP_TYPE_ARRAY (at a nonzero index)
nested within a BPF_MAP_TYPE_HASH_OF_MAPS to flex a previously buggy
case.
Signed-off-by: Rhys Rustad-Elliott <me@rhysre.net>
Link: https://lore.kernel.org/r/20230602190110.47068-3-me@rhysre.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
The call to startup_64_setup_env() will install a new GDT but does not
actually switch to using the KERNEL_CS entry until returning from the
function call.
Commit bcce829083 ("x86/sev: Detect/setup SEV/SME features earlier in
boot") moved the call to sme_enable() earlier in the boot process and in
between the call to startup_64_setup_env() and the switch to KERNEL_CS.
An SEV-ES or an SEV-SNP guest will trigger #VC exceptions during the call
to sme_enable() and if the CS pushed on the stack as part of the exception
and used by IRETQ is not mapped by the new GDT, then problems occur.
Today, the current CS when entering startup_64 is the kernel CS value
because it was set up by the decompressor code, so no issue is seen.
However, a recent patchset that looked to avoid using the legacy
decompressor during an EFI boot exposed this bug. At entry to startup_64,
the CS value is that of EFI and is not mapped in the new kernel GDT. So
when a #VC exception occurs, the CS value used by IRETQ is not valid and
the guest boot crashes.
Fix this issue by moving the block that switches to the KERNEL_CS value to
be done immediately after returning from startup_64_setup_env().
Fixes: bcce829083 ("x86/sev: Detect/setup SEV/SME features earlier in boot")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/all/6ff1f28af2829cc9aea357ebee285825f90a431f.1684340801.git.thomas.lendacky%40amd.com
Increment vcpu->stat.exits when handling a fastpath VM-Exit without
going through any part of the "slow" path. Not bumping the exits stat
can result in wildly misleading exit counts, e.g. if the primary reason
the guest is exiting is to program the TSC deadline timer.
Fixes: 404d5d7bff ("KVM: X86: Introduce more exit_fastpath_completion enum values")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230602011920.787844-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
While testing Hyper-V enabled Windows Server 2019 guests on Zen4 hardware
I noticed that with vCPU count large enough (> 16) they sometimes froze at
boot.
With vCPU count of 64 they never booted successfully - suggesting some kind
of a race condition.
Since adding "vnmi=0" module parameter made these guests boot successfully
it was clear that the problem is most likely (v)NMI-related.
Running kvm-unit-tests quickly showed failing NMI-related test cases, like
"multiple nmi" and "pending nmi" from apic-split, x2apic and xapic tests
and the NMI parts of eventinj test.
The issue was that once one NMI was being serviced no other NMI was allowed
to be set pending (NMI limit = 0), which was traced to
svm_is_vnmi_pending() wrongly testing for the "NMI blocked" flag rather
than for the "NMI pending" flag.
Fix this by testing for the right flag in svm_is_vnmi_pending().
Once this is done, the NMI-related kvm-unit-tests pass successfully and
the Windows guest no longer freezes at boot.
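The wrong-flag test can be illustrated with a small sketch. The bit
positions and helper names below are made up for illustration; the real
masks are defined in the kernel's SVM headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only, not the hardware layout. */
#define TOY_NMI_PENDING		(1u << 0)
#define TOY_NMI_BLOCKING	(1u << 1)

/* Buggy variant: reports "pending" whenever an NMI is being serviced. */
static bool toy_is_pending_buggy(uint32_t ctl)
{
	return (ctl & TOY_NMI_BLOCKING) != 0;
}

/* Fixed variant: tests the flag that actually means "pending". */
static bool toy_is_pending_fixed(uint32_t ctl)
{
	return (ctl & TOY_NMI_PENDING) != 0;
}
```

With only the blocking bit set, the buggy variant claims an NMI is
already pending, so a second NMI would never be queued.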
Fixes: fa4c027a79 ("KVM: x86: Add support for SVM's Virtual NMI")
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/be4ca192eb0c1e69a210db3009ca984e6a54ae69.1684495380.git.maciej.szmigiero@oracle.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Factor in the address space (non-SMM vs. SMM) of the target shadow page
when recovering potential NX huge pages, otherwise KVM will retrieve the
wrong memslot when zapping shadow pages that were created for SMM. The
bug most visibly manifests as a WARN on the memslot being non-NULL, but
the worst case scenario is that KVM could unaccount the shadow page
without ensuring KVM won't install a huge page, i.e. if the non-SMM slot
is being dirty logged, but the SMM slot is not.
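A toy model of the lookup problem, assuming one slot per address space.
All types and helpers here are hypothetical and much simpler than KVM's
memslot code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_AS 2	/* in this sketch: 0 = non-SMM, 1 = SMM */

struct toy_slot { bool dirty_logged; };
struct toy_vm   { struct toy_slot slot[TOY_NR_AS]; };
struct toy_page { int as_id; };	/* address space the page was created for */

/* Buggy variant: always consults the non-SMM address space. */
static const struct toy_slot *toy_slot_buggy(const struct toy_vm *vm,
					     const struct toy_page *sp)
{
	(void)sp;
	return &vm->slot[0];
}

/* Fixed variant: uses the address space recorded in the page. */
static const struct toy_slot *toy_slot_fixed(const struct toy_vm *vm,
					     const struct toy_page *sp)
{
	return &vm->slot[sp->as_id];
}
```

When the non-SMM slot is dirty logged and the SMM slot is not, the buggy
lookup reports dirty logging for an SMM page that has none.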
------------[ cut here ]------------
WARNING: CPU: 1 PID: 3911 at arch/x86/kvm/mmu/mmu.c:7015
kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]
CPU: 1 PID: 3911 Comm: kvm-nx-lpage-re
RIP: 0010:kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]
RSP: 0018:ffff99b284f0be68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff99b284edd000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff9271397024e0 R08: 0000000000000000 R09: ffff927139702450
R10: 0000000000000000 R11: 0000000000000001 R12: ffff99b284f0be98
R13: 0000000000000000 R14: ffff9270991fcd80 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff927f9f640000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0aacad3ae0 CR3: 000000088fc2c005 CR4: 00000000003726e0
Call Trace:
<TASK>
__pfx_kvm_nx_huge_page_recovery_worker+0x10/0x10 [kvm]
kvm_vm_worker_thread+0x106/0x1c0 [kvm]
kthread+0xd9/0x100
ret_from_fork+0x2c/0x50
</TASK>
---[ end trace 0000000000000000 ]---
This bug was exposed by commit edbdb43fc9 ("KVM: x86: Preserve TDP MMU
roots until they are explicitly invalidated"), which allowed KVM to retain
SMM TDP MMU roots effectively indefinitely. Before commit edbdb43fc9,
KVM would zap all SMM TDP MMU roots and thus all SMM TDP MMU shadow pages
once all vCPUs exited SMM, which made the window where this bug (recovering
an SMM NX huge page) could be encountered quite tiny. To hit the bug, the
NX recovery thread would have to run while at least one vCPU was in SMM.
Most VMs typically only use SMM during boot, and so the problematic shadow
pages were gone by the time the NX recovery thread ran.
Now that KVM preserves TDP MMU roots until they are explicitly invalidated
(e.g. by a memslot deletion), the window to trigger the bug is effectively
never closed because most VMMs don't delete memslots after boot (except
for a handful of special scenarios).
Fixes: eb29860570 ("KVM: x86/mmu: Do not recover dirty-tracked NX Huge Pages")
Reported-by: Fabio Coatti <fabio.coatti@gmail.com>
Closes: https://lore.kernel.org/all/CADpTngX9LESCdHVu_2mQkNGena_Ng2CphWNwsRGSMxzDsTjU2A@mail.gmail.com
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230602010137.784664-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>