This is a stable-specific patch.
I botched the stable-specific rewrite of
commit b67fbebd4cf98 ("mmu_gather: Force tlb-flush VM_PFNMAP vmas"):
As Hugh pointed out, unmap_region() actually operates on a list of VMAs,
and the variable "vma" merely points to the first VMA in that list.
So if we want to check whether any of the VMAs we're operating on is
PFNMAP or MIXEDMAP, we have to iterate through the list and check each VMA.
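A minimal sketch of that list walk (a hypothetical helper, not the code from the stable patch; in the pre-maple-tree kernels this backport targets, the list is linked through vma->vm_next):

    #include <linux/mm.h>

    /* Hypothetical helper: report whether any VMA in the list passed to
     * unmap_region() is PFNMAP or MIXEDMAP, so the caller can force a
     * full TLB flush for the whole range. */
    static bool vma_list_has_pfnmap(struct vm_area_struct *vma)
    {
            struct vm_area_struct *cur;

            for (cur = vma; cur; cur = cur->vm_next)
                    if (cur->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
                            return true;

            return false;
    }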
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c61feaee68b9735be06f162bc046c7f1959efb0c ]
For the USB external storage device (0x0b05:1932), testing USB writes with
the gnome-disk-utility tools shows less than 30MB/s.
If the uas module is not loaded for this device, the write speed increases
from 20MB/s to more than 40MB/s.
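The usual way to keep uas from binding such a device is an US_FL_IGNORE_UAS
entry in drivers/usb/storage/unusual_uas.h; a sketch of what such an entry
looks like (the vendor/product strings and bcdDevice range below are
placeholders, only the VID:PID and the flag come from this description):

    UNUSUAL_DEV(0x0b05, 0x1932, 0x0000, 0x9999,
                    "ASUS",
                    "External HDD",
                    USB_SC_DEVICE, USB_PR_DEVICE, NULL,
                    US_FL_IGNORE_UAS),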
Suggested-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Hu Xiaoying <huxiaoying@kylinos.cn>
Link: https://lore.kernel.org/r/20220901045737.3438046-1-huxiaoying@kylinos.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c3b82d26bc85f5fc2fef5ec8cce17c89633a55a8 ]
2 keymap fixes for the Acer Aspire One AOD270 and the same hardware
rebranded as Packard Bell Dot SC:
1. The F2 key is marked with a big '?' symbol on the Packard Bell Dot SC;
it sends WMID_HOTKEY_EVENTs with a scancode of 0x27. Add a mapping
for this.
2. Scancode 0x61 is KEY_SWITCHVIDEOMODE. Usually this is a duplicate
input event with the "Video Bus" input device events. But on these devices
the "Video Bus" does not send events for this key. Map 0x61 to KEY_UNKNOWN
instead of using KE_IGNORE so that udev/hwdb can override it on these devs.
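For illustration, sparse-keymap entries for the two mappings could look like
the sketch below; the keycode used for 0x27 (KEY_HELP) is an assumption,
only the scancodes and the KEY_UNKNOWN mapping come from the description
above.

    #include <linux/input.h>
    #include <linux/input/sparse-keymap.h>

    static const struct key_entry example_wmi_keymap[] = {
            { KE_KEY, 0x27, { KEY_HELP } },    /* big '?' key on the Dot SC (keycode assumed) */
            { KE_KEY, 0x61, { KEY_UNKNOWN } }, /* let udev/hwdb override it per device */
            { KE_END, 0 },
    };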
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220829163544.5288-1-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 478814a5584197fa1fb18377653626e3416e7cd6 ]
TCP_FIN_WAIT2 and TCP_LAST_ACK were not handled. The connection is closing
in these states, so we can ignore them and avoid printing the
"unhandled state" warning message.
[ 1298.852386] nvmet_tcp: queue 2 unhandled state 5
[ 1298.879112] nvmet_tcp: queue 7 unhandled state 5
[ 1298.884253] nvmet_tcp: queue 8 unhandled state 5
[ 1298.889475] nvmet_tcp: queue 9 unhandled state 5
v2: Do not call nvmet_tcp_schedule_release_queue(), just ignore
the fin_wait2 and last_ack states.
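A sketch of the intended state handling in the nvmet-tcp state-change
callback (abbreviated; it follows the description above rather than the
exact upstream diff):

    switch (sk->sk_state) {
    case TCP_FIN_WAIT1:
    case TCP_CLOSE_WAIT:
    case TCP_CLOSE:
            /* peer closed or connection torn down: release the queue */
            nvmet_tcp_schedule_release_queue(queue);
            break;
    case TCP_FIN_WAIT2:
    case TCP_LAST_ACK:
            /* connection is already closing; nothing to do */
            break;
    default:
            pr_warn("queue %d unhandled state %d\n",
                    queue->idx, sk->sk_state);
    }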
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ffd7bdddaab193c38416fd5dd416d065517d266e ]
The rc code is 0 in the "status & CC2520_STATUS_TX_UNDERFLOW" error path.
Fix it by assigning rc the value '-EINVAL' in this path.
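A sketch of the resulting error path (the err_tx label and the dev_err
message stand in for the driver's existing error handling):

    if (status & CC2520_STATUS_TX_UNDERFLOW) {
            rc = -EINVAL;   /* previously rc stayed 0 on this path */
            dev_err(&priv->spi->dev, "cc2520 tx underflow exception\n");
            goto err_tx;
    }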
Signed-off-by: Li Qiong <liqiong@nfschina.com>
Link: https://lore.kernel.org/r/20220829071259.18330-1-liqiong@nfschina.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e1fa076706209cc447d7a2abd0843a18277e5ef7 ]
There is a timing issue captured during ishtp client send stress tests:
it was observed that the ISH firmware receives out-of-order messages.
This is a rare scenario, as the current set of ISH client drivers don't
send much data to the firmware, but that may not be the case going forward.
When the message size is bigger than the IPC MTU, ishtp splits the message
into fragments and uses a serialized asynchronous method to send the
fragments. The call stack:
ishtp_cl_send_msg_ipc->ipc_tx_callback(first fragment)->
ishtp_send_msg(with callback)->write_ipc_to_queue->
write_ipc_from_queue->callback->ipc_tx_callback(next fragment)......
When an IPC write-complete interrupt is received, the driver also calls
write_ipc_from_queue->ipc_tx_callback in the ISR to start sending the next
fragment.
Although ipc_tx_callback uses a spin_lock to protect the message splitting,
the actual sending is not done under the spin_lock, because the serialized
sending method calls back into ipc_tx_callback again. As a result, the
driver cannot guarantee that all fragments are sent in order.
Consider this scenario:
ipc_tx_callback has just finished splitting a fragment but has not called
ishtp_send_msg yet, when a write-complete interrupt arrives; then
ISR->write_ipc_from_queue->ipc_tx_callback->ishtp_send_msg->write_ipc_to_queue......
Because the ISR has higher execution priority than a normal thread, the new
fragment is sent out before the previous one. The resulting out-of-order
messages are invalid to the firmware.
The solution is to send fragments synchronously:
use ishtp_write_message to write fragments into the tx queue directly, one
by one, instead of ishtp_send_msg, which writes only one fragment and relies
on a completion callback. As no completion callback is used anymore, rename
ipc_tx_callback to ipc_tx_send.
Signed-off-by: Even Xu <even.xu@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 94553f8a218540d676efbf3f7827ed493d1057cf ]
The word `like' is duplicated in the comment; remove one occurrence.
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 174974d8463b77c2b4065e98513adb204e64de7d ]
If whatever previously cat'ed $debugfs/rd left the FIFO full, then a
subsequent open could deadlock in rd_write() (because the open is blocked,
never giving read() a chance to consume any data in the FIFO). Also,
it is generally a good idea to clear out stale data from the FIFO.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/496706/
Link: https://lore.kernel.org/r/20220807160901.2353471-2-robdclark@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 35508d2424097f9b6a1a17aac94f702767035616 ]
The RTCCTRL register of the LS1C is obsolete.
Writing this register will hang the system.
Fixes: 60219c563c9b6 ("MIPS: Add RTC support for Loongson1C board")
Signed-off-by: Yang Ling <gnaygnil@gmail.com>
Tested-by: Keguang Zhang <keguang.zhang@gmail.com>
Acked-by: Keguang Zhang <keguang.zhang@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 332924973725e8cdcc783c175f68cf7e162cb9e5 upstream.
Turns out that i386 doesn't unconditionally have LFENCE, as such the
loop in __FILL_RETURN_BUFFER isn't actually speculation safe on such
chips.
Fixes: ba6e31af2be9 ("x86/speculation: Add LFENCE to RSB fill sequence")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yv9tj9vbQ9nNlXoY@worktop.programming.kicks-ass.net
[bwh: Backported to 4.19/5.4:
- __FILL_RETURN_BUFFER takes an sp parameter
- Open-code __FILL_RETURN_SLOT]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2f09707d0c972120bf794cfe0f0c67e2c2ddb252 ]
Cong Wang noticed that the previous fix for sch_sfb accessing the queued
skb after enqueueing it to a child qdisc was incomplete: the SFB enqueue
function was also calling qdisc_qstats_backlog_inc() after enqueue, which
reads the pkt len from the skb cb field. Fix this by also storing the skb
len, and using the stored value to increment the backlog after enqueueing.
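A sketch of the pattern in the sch_sfb enqueue path, assuming the usual
child/to_free variables of a qdisc enqueue function: capture the length
before handing the skb to the child and only use the saved value afterwards.

    unsigned int len = qdisc_pkt_len(skb);

    ret = qdisc_enqueue(skb, child, to_free);
    if (ret == NET_XMIT_SUCCESS) {
            /* the child may have freed or segmented the skb; don't touch it */
            sch->qstats.backlog += len;
            sch->q.qlen++;
    }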
Fixes: 9efd23297cca ("sch_sfb: Don't assume the skb is still around after enqueueing to child")
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Acked-by: Cong Wang <cong.wang@bytedance.com>
Link: https://lore.kernel.org/r/20220905192137.965549-1-toke@toke.dk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 686dc2db2a0fdc1d34b424ec2c0a735becd8d62b ]
Fix a bug reported and analyzed by Nagaraj Arankal, where the handling
of a spurious non-SACK RTO could cause a connection to fail to clear
retrans_stamp, causing a later RTO to very prematurely time out the
connection with ETIMEDOUT.
Here is the buggy scenario, expanding upon Nagaraj Arankal's excellent
report:
(*1) Send one data packet on a non-SACK connection
(*2) Because no ACK packet is received, the packet is retransmitted
and we enter CA_Loss; but this retransmission is spurious.
(*3) The ACK for the original data is received. The transmitted packet
is acknowledged. The TCP timestamp is before the retrans_stamp,
so tcp_may_undo() returns true, and tcp_try_undo_loss() returns
true without changing state to Open (because tcp_is_sack() is
false), and tcp_process_loss() returns without calling
tcp_try_undo_recovery(). Normally after undoing a CA_Loss
episode, tcp_fastretrans_alert() would see that the connection
has returned to CA_Open and fall through and call
tcp_try_to_open(), which would set retrans_stamp to 0. However,
for non-SACK connections we hold the connection in CA_Loss, so do
not fall through to call tcp_try_to_open() and do not set
retrans_stamp to 0. So retrans_stamp is (erroneously) still
non-zero.
At this point the first "retransmission event" has passed and
been recovered from. Any future retransmission is a completely
new "event". However, retrans_stamp is erroneously still
set. (And we are still in CA_Loss, which is correct.)
(*4) After 16 minutes (to correspond with tcp_retries2=15), a new data
packet is sent. Note: No data is transmitted between (*3) and
(*4) and we disabled keep alives.
The socket's timeout SHOULD be calculated from this point in
time, but instead it's calculated from the prior "event" 16
minutes ago (step (*2)).
(*5) Because no ACK packet is received, the packet is retransmitted.
(*6) At the time of the 2nd retransmission, the socket returns
ETIMEDOUT, prematurely, because retrans_stamp is (erroneously)
too far in the past (set at the time of (*2)).
This commit fixes this bug by ensuring that we reuse in
tcp_try_undo_loss() the same careful logic for non-SACK connections
that we have in tcp_try_undo_recovery(). To avoid duplicating logic,
we factor out that logic into a new
tcp_is_non_sack_preventing_reopen() helper and call that helper from
both undo functions.
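A hedged sketch of such a helper, using the mainline TCP field and helper
names but not claiming to be the verbatim patch:

    static bool tcp_is_non_sack_preventing_reopen(struct sock *sk)
    {
            struct tcp_sock *tp = tcp_sk(sk);

            if (tp->snd_una == tp->high_seq && tcp_is_reno(tp)) {
                    /* Hold the old state until something *above* high_seq is
                     * ACKed; for non-SACK (Reno) connections this prevents
                     * false fast retransmits (RFC 2582). */
                    if (!tcp_any_retrans_done(sk))
                            tp->retrans_stamp = 0;
                    return true;
            }
            return false;
    }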
Fixes: da34ac7626b5 ("tcp: only undo on partial ACKs in CA_Loss")
Reported-by: Nagaraj Arankal <nagaraj.p.arankal@hpe.com>
Link: https://lore.kernel.org/all/SJ0PR84MB1847BE6C24D274C46A1B9B0EB27A9@SJ0PR84MB1847.NAMPRD84.PROD.OUTLOOK.COM/
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220903121023.866900-1-ncardwell.kernel@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 160f3549a907a50e51a8518678ba2dcf2541abea ]
We should also bail from the io_work loop when we set rd_enabled to true,
so we don't attempt to read data from the socket when the TCP stream is
already out-of-sync or corrupted.
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Reported-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 74b30b3ad5cec95d2647e796d10137438a098bc1 ]
When accessing the Ports Performance Counters Register (PPCNT), the
local port must be 1 if the device is a Function-Per-Port HCA, i.e.
HCA_CAP.num_ports is 1.
The offending patch could change the local port to other values when
accessing PPCNT after enabling switchdev mode. The following
syndrome will be printed:
# cat /sys/class/infiniband/rdmap4s0f0/ports/2/counters/*
# dmesg
mlx5_core 0000:04:00.0: mlx5_cmd_check:756:(pid 12450): ACCESS_REG(0x805) op_mod(0x1) failed, status bad parameter(0x3), syndrome (0x1e5585)
Fix it by setting local port to one for Function-Per-Port HCA.
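The intent boils down to a capability check along these lines (a sketch;
the exact call site in the mlx5 counter path may differ):

    /* A Function-Per-Port HCA reports num_ports == 1, and PPCNT then
     * only accepts local port 1. */
    if (MLX5_CAP_GEN(dev->mdev, num_ports) == 1)
            port_num = 1;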
Fixes: 210b1f78076f ("IB/mlx5: When not in dual port RoCE mode, use provided port as native")
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Chris Mi <cmi@nvidia.com>
Link: https://lore.kernel.org/r/6c5086c295c76211169e58dbd610fb0402360bab.1661763459.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 84a53580c5d2138c7361c7c3eea5b31827e63b35 ]
The SRv6 layer allows defining HMAC data that can later be used to sign IPv6
Segment Routing Headers. This configuration is realised via netlink through
four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and
SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual
length of the SECRET attribute, it is possible to provide invalid combinations
(e.g., secret = "", secretlen = 64). This case is not checked in the code and
with an appropriately crafted netlink message, an out-of-bounds read of up
to 64 bytes (max secret length) can occur past the skb end pointer and into
skb_shared_info:
Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
208 memcpy(hinfo->secret, secret, slen);
(gdb) bt
#0 seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
#1 0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600,
extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>,
family=<optimized out>) at net/netlink/genetlink.c:731
#2 0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00,
family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775
#3 genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792
#4 0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>)
at net/netlink/af_netlink.c:2501
#5 0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803
#6 0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000)
at net/netlink/af_netlink.c:1319
#7 netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>)
at net/netlink/af_netlink.c:1345
#8 0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921
...
(gdb) p/x ((struct sk_buff *)0xffff88800b1f9f00)->head + ((struct sk_buff *)0xffff88800b1f9f00)->end
$1 = 0xffff88800b1b76c0
(gdb) p/x secret
$2 = 0xffff88800b1b76c0
(gdb) p slen
$3 = 64 '@'
The OOB data can then be read back from userspace by dumping HMAC state. This
commit fixes this by ensuring SECRETLEN cannot exceed the actual length of
SECRET.
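A sketch of the added validation in seg6_genl_sethmac(); the attribute
names come from the description above, while the exact error path may
differ:

    if (slen > nla_len(info->attrs[SEG6_ATTR_SECRET]))
            return -EINVAL; /* SECRETLEN must not exceed the real SECRET payload */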
Reported-by: Lucas Leong <wmliang.tw@gmail.com>
Tested: verified that EINVAL is correctly returned when secretlen > len(secret)
Fixes: 4f4853dc1c9c1 ("ipv6: sr: implement API to control SR HMAC structure")
Signed-off-by: David Lebrun <dlebrun@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0d1b756acf60da5004c1e20ca4462f0c257bf6e1 ]
Functions that work on a pointer to virtual memory such as
virt_to_pfn() and users of that function such as
virt_to_page() are supposed to pass a pointer to virtual
memory, ideally a (void *) or other pointer. However since
many architectures implement virt_to_pfn() as a macro,
this function becomes polymorphic and accepts both a
(unsigned long) and a (void *).
If we instead implement a proper virt_to_pfn(void *addr)
function the following happens (occurred on arch/arm):
drivers/infiniband/sw/siw/siw_qp_tx.c:32:23: warning: incompatible
integer to pointer conversion passing 'dma_addr_t' (aka 'unsigned int')
to parameter of type 'const void *' [-Wint-conversion]
drivers/infiniband/sw/siw/siw_qp_tx.c:32:37: warning: passing argument
1 of 'virt_to_pfn' makes pointer from integer without a cast
[-Wint-conversion]
drivers/infiniband/sw/siw/siw_qp_tx.c:538:36: warning: incompatible
integer to pointer conversion passing 'unsigned long long'
to parameter of type 'const void *' [-Wint-conversion]
Fix this with an explicit cast. In one case where the SIW
SGE uses an unaligned u64 we need a double cast modifying the
virtual address (va) to a platform-specific uintptr_t before
casting to a (void *).
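A sketch of the two casts described above (variable names are illustrative,
per the siw transmit path):

    /* dma_addr_t used as a virtual address: cast explicitly */
    page = virt_to_page((void *)paddr);

    /* unaligned u64 SGE address: go through uintptr_t first */
    page = virt_to_page((void *)(uintptr_t)sge->laddr);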
Fixes: b9be6f18cf9e ("rdma/siw: transmit path")
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220902215918.603761-1-linus.walleij@linaro.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e2b224abd9bf45dcb55750479fc35970725a430b ]
There is a shift wrapping bug in this code, so anything above
31 will return false.
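A generic illustration of this bug class: "1 << i" is a 32-bit int, so for
i >= 32 the shift wraps (undefined behaviour) and the test never matches;
the constant has to be widened when the map is 64 bits wide.

    static bool map_get(u64 up_map, int i)
    {
            return (up_map & (1ULL << i)) != 0;
    }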
Fixes: 35c55c9877f8 ("tipc: add neighbor monitoring framework")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9efd23297cca530bb35e1848665805d3fcdd7889 ]
The sch_sfb enqueue() routine assumes the skb is still alive after it has
been enqueued into a child qdisc, using the data in the skb cb field in the
increment_qlen() routine after enqueue. However, the skb may in fact have
been freed, causing a use-after-free in this case. In particular, this
happens if sch_cake is used as a child of sfb, and the GSO splitting mode
of CAKE is enabled (in which case the skb will be split into segments and
the original skb freed).
Fix this by copying the sfb cb data to the stack before enqueueing the skb,
and using this stack copy in increment_qlen() instead of the skb pointer
itself.
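A sketch of the stack-copy approach (assuming increment_qlen() is changed
to take the copied cb rather than the skb):

    struct sfb_skb_cb cb;

    memcpy(&cb, sfb_skb_cb(skb), sizeof(cb));
    ret = qdisc_enqueue(skb, child, to_free);
    if (ret == NET_XMIT_SUCCESS)
            increment_qlen(&cb, q); /* the skb must not be dereferenced here */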
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18231
Fixes: e13e02a3c68d ("net_sched: SFB flow scheduler")
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7903192c4b4a82d792cb0dc5e2779a2efe60d45b ]
rxrpc and kafs between them try to use the receive timestamp on the first
data packet (ie. the one with sequence number 1) as a base from which to
calculate the time at which callback promise and lock expiration occurs.
However, we don't know how long it took for the server to send us the reply
from it having completed the basic part of the operation - it might then,
for instance, have to send a bunch of callback breaks, depending on the
particular operation.
Fix this by using the time at which the operation is issued on the client
as a base instead. That should never be longer than the server's idea of
the expiry time.
Fixes: 781070551c26 ("afs: Fix calculation of callback expiry time")
Fixes: 2070a3e44962 ("rxrpc: Allow the reply time to be obtained on a client call")
Suggested-by: Jeffrey E Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0d40f728e28393a8817d1fcae923dfa3409e488c ]
rxkad_verify_packet_2() has a small stack-allocated sglist of 4 elements,
but if that isn't sufficient for the number of fragments in the socket
buffer, we try to allocate an sglist large enough to hold all the
fragments.
However, for large packets with a lot of fragments, this isn't sufficient
and we need at least one additional fragment.
The problem manifests as skb_to_sgvec() returning -EMSGSIZE and this then
getting returned to userspace. Most of the time, this isn't a problem as
rxrpc sets a limit of 5692, big enough for 4 jumbo subpackets to be glued
together; occasionally, however, the server will ignore the reported limit
and give a packet that's a lot bigger - say 19852 bytes with ->nr_frags
being 7. skb_to_sgvec() then tries to return a "zeroth" fragment that
seems to occur before the fragments counted by ->nr_frags and we hit the
end of the sglist too early.
Note that __skb_to_sgvec() also has an skb_walk_frags() loop that is
recursive up to 24 deep. I'm not sure if I need to take account of that
too - or if there's an easy way of counting those frags too.
Fix this by counting an extra frag and allocating a larger sglist based on
that.
Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0efe125cfb99e6773a7434f3463f7c2fa28f3a43 ]
Ensure the match happens in the right direction. Previously the
destination used was the server, not the NAT host, as the comment
shows the code intended.
Additionally nf_nat_irc uses port 0 as a signal and there's no valid way
it can appear in a DCC message, so consider port 0 also forged.
Fixes: 869f37d8e48f ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port")
Signed-off-by: David Leadbeater <dgl@dgl.cx>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d047283a7034140ea5da759a494fd2274affdd46 ]
The IPv6 path already drops dst in the daddr changed case, but the IPv4
path does not. This change makes the two code paths consistent.
Further, it is possible that there is already a metadata_dst allocated from
ingress that might already be attached to skbuff->dst while following
the bridge path. If it is not released before setting a new
metadata_dst, it will be leaked. This is similar to what is done in
bpf_set_tunnel_key() or ip6_route_input().
It is important to note that the memory being leaked is not the dst
being set in the bridge code, but rather memory allocated from some
other code path that is not being freed correctly before the skb dst is
overwritten.
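A sketch of the drop-before-set pattern in the bridge IPv4 path (the
surrounding route lookup is omitted):

    skb_dst_drop(skb);                /* free any dst/metadata_dst attached at ingress */
    skb_dst_set_noref(skb, &rt->dst); /* then install the bridge's route */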
An example of the leakage fixed by this commit found using kmemleak:
unreferenced object 0xffff888010112b00 (size 256):
comm "softirq", pid 0, jiffies 4294762496 (age 32.012s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 80 16 f1 83 ff ff ff ff ................
e1 4e f6 82 ff ff ff ff 00 00 00 00 00 00 00 00 .N..............
backtrace:
[<00000000d79567ea>] metadata_dst_alloc+0x1b/0xe0
[<00000000be113e13>] udp_tun_rx_dst+0x174/0x1f0
[<00000000a36848f4>] geneve_udp_encap_recv+0x350/0x7b0
[<00000000d4afb476>] udp_queue_rcv_one_skb+0x380/0x560
[<00000000ac064aea>] udp_unicast_rcv_skb+0x75/0x90
[<000000009a8ee8c5>] ip_protocol_deliver_rcu+0xd8/0x230
[<00000000ef4980bb>] ip_local_deliver_finish+0x7a/0xa0
[<00000000d7533c8c>] __netif_receive_skb_one_core+0x89/0xa0
[<00000000a879497d>] process_backlog+0x93/0x190
[<00000000e41ade9f>] __napi_poll+0x28/0x170
[<00000000b4c0906b>] net_rx_action+0x14f/0x2a0
[<00000000b20dd5d4>] __do_softirq+0xf4/0x305
[<000000003a7d7e15>] __irq_exit_rcu+0xc3/0x140
[<00000000968d39a2>] sysvec_apic_timer_interrupt+0x9e/0xc0
[<000000009e920794>] asm_sysvec_apic_timer_interrupt+0x16/0x20
[<000000008942add0>] native_safe_halt+0x13/0x20
Florian Westphal says: "Original code was likely fine because nothing
ever did set a skb->dst entry earlier than bridge in those days."
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Harsh Modi <harshmodi@google.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1085f5080647f0c9f357c270a537869191f7f2a1 ]
In brcmstb_pm_probe(), there are two kinds of leak bugs:
(1) we need to add of_node_put() when for_each_matching_node() breaks out early (see the sketch below)
(2) we need to add iounmap() for each iomap in the failure path
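A sketch of cleanup (1); the match table name and the mapping helper are
illustrative, not copied from the driver:

    for_each_matching_node(dn, sram_dt_ids) {
            void __iomem *base = of_io_request_and_map(dn, 0, dn->full_name);

            if (IS_ERR(base)) {
                    of_node_put(dn); /* the iterator holds a reference; drop it when breaking out */
                    return PTR_ERR(base);
            }
            /* ... */
    }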
Fixes: 0b741b8234c8 ("soc: bcm: brcmstb: Add support for S2/S3/S5 suspend states (ARM)")
Signed-off-by: Liang He <windhl@126.com>
Link: https://lore.kernel.org/r/20220707015620.306468-1-windhl@126.com
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27cfde795a96aef1e859a5480489944b95421e46 ]
Fix the order of source and destination addresses when resolving the
route between server and client to validate use of correct net device.
The reverse order we had so far didn't actually validate the net device
as the server would try to resolve the route to itself, thus always
getting the server's net device.
The issue was discovered when running cm applications on a single host
between 2 interfaces with same subnet and source based routing rules.
When resolving the reverse route the source based route rules were
ignored.
Fixes: f887f2ac87c2 ("IB/cma: Validate routing of incoming requests")
Link: https://lore.kernel.org/r/1c1ec2277a131d277ebcceec987fd338d35b775f.1661251872.git.leonro@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c32f1ebfd26bece77141257864ed7b4720da1557 ]
If regulator_enable() fails, enable_count is still incremented.
A consumer, assuming no matching regulator_disable() is necessary on
failure, will then get this error message upon regulator_put()
since enable_count is non-zero:
[ 1.277418] WARNING: CPU: 3 PID: 1 at drivers/regulator/core.c:2304 _regulator_put.part.0+0x168/0x170
The consumer could try to fix this in their driver by cleaning up on
error from regulator_enable() (i.e. call regulator_disable()), but that
results in the following since regulator_enable() failed and didn't
increment user_count:
[ 1.258112] unbalanced disables for vreg_l17c
[ 1.262606] WARNING: CPU: 4 PID: 1 at drivers/regulator/core.c:2899 _regulator_disable+0xd4/0x190
Fix this by decrementing enable_count upon failure to enable.
With this in place, just the reason for failure to enable is printed
as expected and developers can focus on the root cause of their issue
instead of thinking their usage of the regulator consumer api is
incorrect. For example, in my case:
[ 1.240426] vreg_l17c: invalid input voltage found
Fixes: 5451781dadf8 ("regulator: core: Only count load for enabled consumers")
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20220819194336.382740-1-ahalaney@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ba0803050d610d5072666be727bca5e03e55b242 ]
smb3 fallocate punch hole was not grabbing the inode or filemap_invalidate
locks, so it could race with the pagemap reinstantiating the page.
Cc: stable@vger.kernel.org
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4f7e7236435ca0abe005c674ebd6892c6e83aeb3 ]
Bringing up a CPU may involve creating and destroying tasks which requires
read-locking threadgroup_rwsem, so threadgroup_rwsem nests inside
cpus_read_lock(). However, cpuset's ->attach(), which may be called with
threadgroup_rwsem write-locked, also wants to disable CPU hotplug and
acquires cpus_read_lock(), leading to a deadlock.
Fix it by guaranteeing that ->attach() is always called with CPU hotplug
disabled and removing cpus_read_lock() call from cpuset_attach().
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-and-tested-by: Imran Khan <imran.f.khan@oracle.com>
Reported-and-tested-by: Xuewen Yan <xuewen.yan@unisoc.com>
Fixes: 05c7b7a92cc8 ("cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug")
Cc: stable@vger.kernel.org # v5.17+
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 671c11f0619e5ccb380bcf0f062f69ba95fc974a ]
cgroup_update_dfl_csses() write-locks the threadgroup_rwsem as updating the
csses can trigger process migrations. However, if the subtree doesn't
contain any tasks, there aren't gonna be any cgroup migrations. This
condition can be trivially detected by testing whether
mgctx.preloaded_src_csets is empty. Elide write-locking threadgroup_rwsem if
the subtree is empty.
After this optimization, the usage pattern of creating a cgroup, enabling
the necessary controllers, and then seeding it with CLONE_INTO_CGROUP and
then removing the cgroup after it becomes empty doesn't need to write-lock
threadgroup_rwsem at all.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9a3284fad42f66bb43629c6716709ff791aaa457 ]
There are reports of users who use thread migrations between cgroups and
they report performance drop after d59cfc09c32a ("sched, cgroup: replace
signal_struct->group_rwsem with a global percpu_rwsem"). The effect is
pronounced on machines with more CPUs.
The migration is affected by forking noise happening in the background,
after the mentioned commit a migrating thread must wait for all
(forking) processes on the system, not only of its threadgroup.
There are several places that need to synchronize with migration:
a) do_exit,
b) de_thread,
c) copy_process,
d) cgroup_update_dfl_csses,
e) parallel migration (cgroup_{proc,thread}s_write).
In the case of a self-migrating thread, we relax the synchronization on
cgroup_threadgroup_rwsem to avoid the cost of waiting. d) and e) are
excluded with cgroup_mutex, c) does not matter in the case of single-thread
migration, and the executing thread cannot exec(2) or exit(2) while it is
writing into cgroup.threads. In the case of do_exit because of signal
delivery, we either exit before the migration or finish the migration
(of a not-yet-PF_EXITING thread) and die afterwards.
This patch handles only the case of self-migration by writing "0" into
cgroup.threads. For simplicity, we always take cgroup_threadgroup_rwsem
with numeric PIDs.
This change improves migration dependent workload performance similar
to per-signal_struct state.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit da6d507f5ff328f346b3c50e19e19993027b8ffd upstream.
Add the missing destroy_workqueue() before return from
lpfc_sli4_driver_resource_setup() in the error path.
Link: https://lore.kernel.org/r/20220823044237.285643-1-yangyingliang@huawei.com
Fixes: 3cee98db2610 ("scsi: lpfc: Fix crash on driver unload in wq free")
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 991df3dd5144f2e6b1c38b8d20ed3d4d21e20b34 upstream.
Fix the following use-after-free warning which is observed during
controller reset:
refcount_t: underflow; use-after-free.
WARNING: CPU: 23 PID: 5399 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
Link: https://lore.kernel.org/r/20220906134908.1039-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dec9b2f1e0455a151a7293c367da22ab973f713e upstream.
There is a very common pattern of using
debugfs_remove(debugfs_lookup(..)) which leaks the dentry that was
looked up. Instead of having to open-code the correct pattern of
calling dput() on the dentry, create
debugfs_lookup_and_remove() to handle this pattern automatically and
properly without any memory leaks.
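A minimal sketch of what such a helper does (close to, but not claimed to
be, the exact upstream implementation):

    void debugfs_lookup_and_remove(const char *name, struct dentry *parent)
    {
            struct dentry *dentry;

            dentry = debugfs_lookup(name, parent);
            if (!dentry)
                    return;

            debugfs_remove(dentry);
            dput(dentry);   /* drop the extra reference debugfs_lookup() took */
    }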
Cc: stable <stable@kernel.org>
Reported-by: Kuyo Chang <kuyo.chang@mediatek.com>
Tested-by: Kuyo Chang <kuyo.chang@mediatek.com>
Link: https://lore.kernel.org/r/YxIaQ8cSinDR881k@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e53f47f6c1a56d2af728909f1cb894da6b43d9bf upstream.
There may be a bad USB audio device with a USB ID of (0x04fa, 0x4201) and
fewer than 4 interfaces; an out-of-bounds read occurs when parsing the
interface descriptor for such a device.
Fix this by checking the number of interfaces.
Signed-off-by: Dongxiang Ke <kdx.glider@gmail.com>
Link: https://lore.kernel.org/r/20220906024928.10951-1-kdx.glider@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e48940abee88b8dbbeeaf8a07e7b2b6be1271b3 upstream.
In loopback_jiffies_timer_pos_update(), we are reading jiffies twice:
the first time for playback, the second time for capture. Jiffies can
advance between these two calls, and if the capture jiffies value is larger,
extra zeros will be filled into the capture buffer.
Change the code to read jiffies once and use that value for both playback and capture.
Signed-off-by: Pattara Teerapong <pteerapong@chromium.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220901144036.4049060-1-pteerapong@chromium.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d29f59051d3a07b81281b2df2b8c9dfe4716067f upstream.
The voice allocator sometimes begins allocating from near the end of the
array and then wraps around, however snd_emu10k1_pcm_channel_alloc()
accesses the newly allocated voices as if it never wrapped around.
This results in out of bounds access if the first voice has a high enough
index so that first_voice + requested_voice_count > NUM_G (64).
The more voices are requested, the more likely it is for this to occur.
This was initially discovered using PipeWire, however it can be reproduced
by calling aplay multiple times with 16 channels:
aplay -r 48000 -D plughw:CARD=Live,DEV=3 -c 16 /dev/zero
UBSAN: array-index-out-of-bounds in sound/pci/emu10k1/emupcm.c:127:40
index 65 is out of range for type 'snd_emu10k1_voice [64]'
CPU: 1 PID: 31977 Comm: aplay Tainted: G W IOE 6.0.0-rc2-emu10k1+ #7
Hardware name: ASUSTEK COMPUTER INC P5W DH Deluxe/P5W DH Deluxe, BIOS 3002 07/22/2010
Call Trace:
<TASK>
dump_stack_lvl+0x49/0x63
dump_stack+0x10/0x16
ubsan_epilogue+0x9/0x3f
__ubsan_handle_out_of_bounds.cold+0x44/0x49
snd_emu10k1_playback_hw_params+0x3bc/0x420 [snd_emu10k1]
snd_pcm_hw_params+0x29f/0x600 [snd_pcm]
snd_pcm_common_ioctl+0x188/0x1410 [snd_pcm]
? exit_to_user_mode_prepare+0x35/0x170
? do_syscall_64+0x69/0x90
? syscall_exit_to_user_mode+0x26/0x50
? do_syscall_64+0x69/0x90
? exit_to_user_mode_prepare+0x35/0x170
snd_pcm_ioctl+0x27/0x40 [snd_pcm]
__x64_sys_ioctl+0x95/0xd0
do_syscall_64+0x5c/0x90
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Signed-off-by: Tasos Sahanidis <tasos@tasossah.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/3707dcab-320a-62ff-63c0-73fc201ef756@tasossah.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b8983d42524f10ac6bf35bbce6a7cc8e45f61e04 ]
The mmVM_L2_CNTL3 register is not assigned an initial value.
Signed-off-by: Qu Huang <jinsdb@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e75d18cecbb3805895d8ed64da4f78575ec96043 ]
Though acpi_find_last_cache_level() always returned a signed value and the
document states it will return any errors caused by lack of a PPTT table,
it never returned negative values before.
Commit 0c80f9e165f8 ("ACPI: PPTT: Leave the table mapped for the runtime usage")
however changed that by returning -ENOENT if no PPTT was found. The value
returned from acpi_find_last_cache_level() is then assigned to the unsigned
fw_level.
This results in the number of cache leaves being calculated incorrectly as
a huge value, which then causes the following warning from __alloc_pages,
because the requested order is greater than MAX_ORDER.
| WARNING: CPU: 0 PID: 1 at mm/page_alloc.c:5407 __alloc_pages+0x74/0x314
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-10393-g7c2a8d3ac4c0 #73
| pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : __alloc_pages+0x74/0x314
| lr : alloc_pages+0xe8/0x318
| Call trace:
| __alloc_pages+0x74/0x314
| alloc_pages+0xe8/0x318
| kmalloc_order_trace+0x68/0x1dc
| __kmalloc+0x240/0x338
| detect_cache_attributes+0xe0/0x56c
| update_siblings_masks+0x38/0x284
| store_cpu_topology+0x78/0x84
| smp_prepare_cpus+0x48/0x134
| kernel_init_freeable+0xc4/0x14c
| kernel_init+0x2c/0x1b4
| ret_from_fork+0x10/0x20
Fix this by changing fw_level to be a signed integer and returning the
error from init_cache_level() early in case of error.
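A sketch of the early-return shape described above (the surrounding arm64
init_cache_level() is abbreviated):

    int init_cache_level(unsigned int cpu)
    {
            int fw_level; /* was unsigned, turning -ENOENT into a huge leaf count */

            /* ... DT path ... */
            fw_level = acpi_find_last_cache_level(cpu);
            if (fw_level < 0)
                    return fw_level;
            /* ... */
            return 0;
    }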
Reported-and-Tested-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220808084640.3165368-1-sudeep.holla@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 591d2108f3abc4db9f9073cae37cf3591fd250d6 ]
If a 32-bit kernel was compiled for PA2.0 CPUs, it won't be able to run
on machines with PA1.x CPUs. Add a check and bail out early if a PA1.x
machine is detected.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d46c742f827fa2326ab1f4faa1cccadb56912341 ]
Since kmalloc() may fail, handle this error path properly: check the
return value and return the '-ENOMEM' error code on failure.
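The generic form of the check being added (names are illustrative):

    p = kmalloc(sizeof(*p), GFP_KERNEL);
    if (!p)
            return -ENOMEM;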
Signed-off-by: Li Qiong <liqiong@nfschina.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f461950fdc374a3ada5a63c669d997de4600dffe ]
Although the radeon driver fences and waits for the GPU to finish processing
the current batch of rings, there is still a corner case where the radeon
lockup work queue may not be fully flushed while radeon_suspend_kms() has
already called pci_set_power_state() to put the device into the D3hot state.
Per PCI spec rev 4.0 on 5.3.1.4.1 D3hot State.
> Configuration and Message requests are the only TLPs accepted by a Function in
> the D3hot state. All other received Requests must be handled as Unsupported Requests,
> and all received Completions may optionally be handled as Unexpected Completions.
This issue shows up in the following logs:
Unable to handle kernel paging request at virtual address 00008800e0008010
CPU 0 kworker/0:3(131): Oops 0
pc = [<ffffffff811bea5c>] ra = [<ffffffff81240844>] ps = 0000 Tainted: G W
pc is at si_gpu_check_soft_reset+0x3c/0x240
ra is at si_dma_is_lockup+0x34/0xd0
v0 = 0000000000000000 t0 = fff08800e0008010 t1 = 0000000000010000
t2 = 0000000000008010 t3 = fff00007e3c00000 t4 = fff00007e3c00258
t5 = 000000000000ffff t6 = 0000000000000001 t7 = fff00007ef078000
s0 = fff00007e3c016e8 s1 = fff00007e3c00000 s2 = fff00007e3c00018
s3 = fff00007e3c00000 s4 = fff00007fff59d80 s5 = 0000000000000000
s6 = fff00007ef07bd98
a0 = fff00007e3c00000 a1 = fff00007e3c016e8 a2 = 0000000000000008
a3 = 0000000000000001 a4 = 8f5c28f5c28f5c29 a5 = ffffffff810f4338
t8 = 0000000000000275 t9 = ffffffff809b66f8 t10 = ff6769c5d964b800
t11= 000000000000b886 pv = ffffffff811bea20 at = 0000000000000000
gp = ffffffff81d89690 sp = 00000000aa814126
Disabling lock debugging due to kernel taint
Trace:
[<ffffffff81240844>] si_dma_is_lockup+0x34/0xd0
[<ffffffff81119610>] radeon_fence_check_lockup+0xd0/0x290
[<ffffffff80977010>] process_one_work+0x280/0x550
[<ffffffff80977350>] worker_thread+0x70/0x7c0
[<ffffffff80977410>] worker_thread+0x130/0x7c0
[<ffffffff80982040>] kthread+0x200/0x210
[<ffffffff809772e0>] worker_thread+0x0/0x7c0
[<ffffffff80981f8c>] kthread+0x14c/0x210
[<ffffffff80911658>] ret_from_kernel_thread+0x18/0x20
[<ffffffff80981e40>] kthread+0x0/0x210
Code: ad3e0008 43f0074a ad7e0018 ad9e0020 8c3001e8 40230101
<88210000> 4821ed21
So force a lockup work queue flush to fix this problem.
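A sketch of forcing that flush before the device can enter D3hot; the
fence_drv/lockup_work names follow the radeon fence code, but the placement
is illustrative:

    int i;

    for (i = 0; i < RADEON_NUM_RINGS; i++)
            flush_delayed_work(&rdev->fence_drv[i].lockup_work);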
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zhenneng Li <lizhenneng@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>