IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
[ Upstream commit 2e60546368 ]
Avoid the situation where an IPV6 only flag is applied to an IPv4 address:
# ip addr add 192.0.2.1/24 dev dummy0 nodad home mngtmpaddr noprefixroute
# ip -4 addr show dev dummy0
2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet 192.0.2.1/24 scope global noprefixroute dummy0
valid_lft forever preferred_lft forever
Or worse, by sending a malicious netlink command:
# ip -4 addr show dev dummy0
2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet 192.0.2.1/24 scope global nodad optimistic dadfailed home tentative mngtmpaddr noprefixroute stable-privacy dummy0
valid_lft forever preferred_lft forever
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d87b88ba2 upstream.
Microsoft Surface Precision Mouse provides bogus identity address when
pairing. It connects with Static Random address but provides Public
Address in SMP Identity Address Information PDU. Address has same
value but type is different. Workaround this by dropping IRK if ID
address discrepancy is detected.
> HCI Event: LE Meta Event (0x3e) plen 19
LE Connection Complete (0x01)
Status: Success (0x00)
Handle: 75
Role: Master (0x00)
Peer address type: Random (0x01)
Peer address: E0:52:33:93:3B:21 (Static)
Connection interval: 50.00 msec (0x0028)
Connection latency: 0 (0x0000)
Supervision timeout: 420 msec (0x002a)
Master clock accuracy: 0x00
....
> ACL Data RX: Handle 75 flags 0x02 dlen 12
SMP: Identity Address Information (0x09) len 7
Address type: Public (0x00)
Address: E0:52:33:93:3B:21
Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Tested-by: Maarten Fonville <maarten.fonville@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199461
Cc: stable@vger.kernel.org
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c49a8682fc ]
Problem: The Linux Bluetooth stack yields complete control over the BLE
connection interval to the remote device.
The Linux Bluetooth stack provides access to the BLE connection interval
min and max values through /sys/kernel/debug/bluetooth/hci0/
conn_min_interval and /sys/kernel/debug/bluetooth/hci0/conn_max_interval.
These values are used for initial BLE connections, but the remote device
has the ability to request a connection parameter update. In the event
that the remote side requests to change the connection interval, the Linux
kernel currently only validates that the desired value is within the
acceptable range in the Bluetooth specification (6 - 3200, corresponding to
7.5ms - 4000ms). There is currently no validation that the desired value
requested by the remote device is within the min/max limits specified in
the conn_min_interval/conn_max_interval configurations. This essentially
leads to Linux yielding complete control over the connection interval to
the remote device.
The proposed patch adds a verification step to the connection parameter
update mechanism, ensuring that the desired value is within the min/max
bounds of the current connection. If the desired value is outside of the
current connection min/max values, then the connection parameter update
request is rejected and the negative response is returned to the remote
device. Recall that the initial connection is established using the local
conn_min_interval/conn_max_interval values, so this allows the Linux
administrator to retain control over the BLE connection interval.
The one downside that I see is that the current default Linux values for
conn_min_interval and conn_max_interval typically correspond to 30ms and
50ms respectively. If this change were accepted, then it is feasible that
some devices would no longer be able to negotiate to their desired
connection interval values. This might be remedied by setting the default
Linux conn_min_interval and conn_max_interval values to the widest
supported range (6 - 3200 / 7.5ms - 4000ms). This could lead to the same
behavior as the current implementation, where the remote device could
request to change the connection interval value to any value that is
permitted by the Bluetooth specification, and Linux would accept the
desired value.
Signed-off-by: Carey Sonsino <csonsino@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b188b03270 ]
Handle overlooked case where the target address is assigned to a peer
and neither route nor gateway exist.
For one peer, no checks are performed to see if it is meant to receive
packets for a given address.
As soon as there is a second peer however, checks are performed
to deal with routes and gateways for handling complex setups with
multiple hops to a target address.
This logic assumed that no route and no gateway imply that the
destination address can not be reached, which is false in case of a
direct peer.
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Tested-by: Michael Scott <mike@foundries.io>
Signed-off-by: Josua Mayer <josua.mayer@jm0.eu>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b8d6d00797 ]
After commit b38ff4075a, the following command does not work anymore:
$ ip xfrm state add src 10.125.0.2 dst 10.125.0.1 proto esp spi 34 reqid 1 \
mode tunnel enc 'cbc(aes)' 0xb0abdba8b782ad9d364ec81e3a7d82a1 auth-trunc \
'hmac(sha1)' 0xe26609ebd00acb6a4d51fca13e49ea78a72c73e6 96 flag align4
In fact, the selector is not mandatory, allow the user to provide an empty
selector.
Fixes: b38ff4075a ("xfrm: Fix xfrm sel prefix length validation")
CC: Anirudh Gupta <anirudh.gupta@sophos.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b38ff4075a ]
Family of src/dst can be different from family of selector src/dst.
Use xfrm selector family to validate address prefix length,
while verifying new sa from userspace.
Validated patch with this command:
ip xfrm state add src 1.1.6.1 dst 1.1.6.2 proto esp spi 4260196 \
reqid 20004 mode tunnel aead "rfc4106(gcm(aes))" \
0x1111016400000000000000000000000044440001 128 \
sel src 1011:1:4::2/128 sel dst 1021:1:4::2/128 dev Port5
Fixes: 07bf790895 ("xfrm: Validate address prefix lengths in the xfrm selector.")
Signed-off-by: Anirudh Gupta <anirudh.gupta@sophos.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 38c73529de ]
In commit 19e4e76806 ("ipv4: Fix raw socket lookup for local
traffic"), the dif argument to __raw_v4_lookup() is coming from the
returned value of inet_iif() but the change was done only for the first
lookup. Subsequent lookups in the while loop still use skb->dev->ifIndex.
Fixes: 19e4e76806 ("ipv4: Fix raw socket lookup for local traffic")
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4f07b80c97 ]
This patch is to fix an uninit-value issue, reported by syzbot:
BUG: KMSAN: uninit-value in memchr+0xce/0x110 lib/string.c:981
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x191/0x1f0 lib/dump_stack.c:113
kmsan_report+0x130/0x2a0 mm/kmsan/kmsan.c:622
__msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:310
memchr+0xce/0x110 lib/string.c:981
string_is_valid net/tipc/netlink_compat.c:176 [inline]
tipc_nl_compat_bearer_disable+0x2a1/0x480 net/tipc/netlink_compat.c:449
__tipc_nl_compat_doit net/tipc/netlink_compat.c:327 [inline]
tipc_nl_compat_doit+0x3ac/0xb00 net/tipc/netlink_compat.c:360
tipc_nl_compat_handle net/tipc/netlink_compat.c:1178 [inline]
tipc_nl_compat_recv+0x1b1b/0x27b0 net/tipc/netlink_compat.c:1281
TLV_GET_DATA_LEN() may return a negtive int value, which will be
used as size_t (becoming a big unsigned long) passed into memchr,
cause this issue.
Similar to what it does in tipc_nl_compat_bearer_enable(), this
fix is to return -EINVAL when TLV_GET_DATA_LEN() is negtive in
tipc_nl_compat_bearer_disable(), as well as in
tipc_nl_compat_link_stat_dump() and tipc_nl_compat_link_reset_stats().
v1->v2:
- add the missing Fixes tags per Eric's request.
Fixes: 0762216c0a ("tipc: fix uninit-value in tipc_nl_compat_bearer_enable")
Fixes: 8b66fee7f8 ("tipc: fix uninit-value in tipc_nl_compat_link_reset_stats")
Reported-by: syzbot+30eaa8bf392f7fafffaf@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c492d4c74d ]
This patch is to fix a dst defcnt leak, which can be reproduced by doing:
# ip net a c; ip net a s; modprobe tipc
# ip net e s ip l a n eth1 type veth peer n eth1 netns c
# ip net e c ip l s lo up; ip net e c ip l s eth1 up
# ip net e s ip l s lo up; ip net e s ip l s eth1 up
# ip net e c ip a a 1.1.1.2/8 dev eth1
# ip net e s ip a a 1.1.1.1/8 dev eth1
# ip net e c tipc b e m udp n u1 localip 1.1.1.2
# ip net e s tipc b e m udp n u1 localip 1.1.1.1
# ip net d c; ip net d s; rmmod tipc
and it will get stuck and keep logging the error:
unregister_netdevice: waiting for lo to become free. Usage count = 1
The cause is that a dst is held by the udp sock's sk_rx_dst set on udp rx
path with udp_early_demux == 1, and this dst (eventually holding lo dev)
can't be released as bearer's removal in tipc pernet .exit happens after
lo dev's removal, default_device pernet .exit.
"There are two distinct types of pernet_operations recognized: subsys and
device. At creation all subsys init functions are called before device
init functions, and at destruction all device exit functions are called
before subsys exit function."
So by calling register_pernet_device instead to register tipc_net_ops, the
pernet .exit() will be invoked earlier than loopback dev's removal when a
netns is being destroyed, as fou/gue does.
Note that vxlan and geneve udp tunnels don't have this issue, as the udp
sock is released in their device ndo_stop().
This fix is also necessary for tipc dst_cache, which will hold dsts on tx
path and I will introduce in my next patch.
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ef5305f1f7 ]
strcpy to dirent->d_name could overflow the buffer, use strscpy to check
the provided string length and error out if the size was too big.
While we are here, make the function return an error when the pdu
parsing failed, instead of returning the pdu offset as if it had been a
success...
Link: http://lkml.kernel.org/r/1536339057-21974-4-git-send-email-asmadeus@codewreck.org
Addresses-Coverity-ID: 139133 ("Copy into fixed size buffer")
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 588f7d39b3 upstream.
When receiving a robust management frame, drop it if we don't have
rx->sta since then we don't have a security association and thus
couldn't possibly validate the frame.
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 693cd8ce3f upstream.
When trying to align the minimum encryption key size requirement for
Bluetooth connections, it turns out doing this in a central location in
the HCI connection handling code is not possible.
Original Bluetooth version up to 2.0 used a security model where the
L2CAP service would enforce authentication and encryption. Starting
with Bluetooth 2.1 and Secure Simple Pairing that model has changed into
that the connection initiator is responsible for providing an encrypted
ACL link before any L2CAP communication can happen.
Now connecting Bluetooth 2.1 or later devices with Bluetooth 2.0 and
before devices are causing a regression. The encryption key size check
needs to be moved out of the HCI connection handling into the L2CAP
channel setup.
To achieve this, the current check inside hci_conn_security() has been
moved into l2cap_check_enc_key_size() helper function and then called
from four decisions point inside L2CAP to cover all combinations of
Secure Simple Pairing enabled devices and device using legacy pairing
and legacy service security model.
Fixes: d5bb334a8e ("Bluetooth: Align minimum encryption key size for LE and BR/EDR connections")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203643
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6653b3629 upstream.
tcp_fragment() might be called for skbs in the write queue.
Memory limits might have been exceeded because tcp_sendmsg() only
checks limits at full skb (64KB) boundaries.
Therefore, we need to make sure tcp_fragment() wont punish applications
that might have setup very low SO_SNDBUF values.
Fixes: f070ef2ac6 ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Christoph Paasch <cpaasch@apple.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6be8e297f9 ]
lapb_register calls lapb_create_cb, which initializes the control-
block's ref-count to one, and __lapb_insert_cb, which increments it when
adding the new block to the list of blocks.
lapb_unregister calls __lapb_remove_cb, which decrements the ref-count
when removing control-block from the list of blocks, and calls lapb_put
itself to decrement the ref-count before returning.
However, lapb_unregister also calls __lapb_devtostruct to look up the
right control-block for the given net_device, and __lapb_devtostruct
also bumps the ref-count, which means that when lapb_unregister returns
the ref-count is still 1 and the control-block is leaked.
Call lapb_put after __lapb_devtostruct to fix leak.
Reported-by: syzbot+afb980676c836b4a0afa@syzkaller.appspotmail.com
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 65a3c497c0 ]
Before taking a refcount, make sure the object is not already
scheduled for deletion.
Same fix is needed in ipv6_flowlabel_opt()
Fixes: 18367681a1 ("ipv6 flowlabel: Convert np->ipv6_fl_list to RCU.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit d016dc1bd2 which is
commit d5bb334a8e upstream.
Lots of people have reported issues with this patch, and as there does
not seem to be a fix going into Linus's kernel tree any time soon,
revert the commit in the stable trees so as to get people's machines
working properly again.
Reported-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reported-by: Hans de Goede <hdegoede@redhat.com>
Cc: Jeremy Cline <jeremy@jcline.org>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f3e2bf008 upstream.
Some TCP peers announce a very small MSS option in their SYN and/or
SYN/ACK messages.
This forces the stack to send packets with a very high network/cpu
overhead.
Linux has enforced a minimal value of 48. Since this value includes
the size of TCP options, and that the options can consume up to 40
bytes, this means that each segment can include only 8 bytes of payload.
In some cases, it can be useful to increase the minimal value
to a saner value.
We still let the default to 48 (TCP_MIN_SND_MSS), for compatibility
reasons.
Note that TCP_MAXSEG socket option enforces a minimal value
of (TCP_MIN_MSS). David Miller increased this minimal value
in commit c39508d6f1 ("tcp: Make TCP_MAXSEG minimum more correct.")
from 64 to 88.
We might in the future merge TCP_MIN_SND_MSS and TCP_MIN_MSS.
CVE-2019-11479 -- tcp mss hardcoded to 48
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Bruce Curtis <brucec@netflix.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f070ef2ac6 upstream.
Jonathan Looney reported that a malicious peer can force a sender
to fragment its retransmit queue into tiny skbs, inflating memory
usage and/or overflow 32bit counters.
TCP allows an application to queue up to sk_sndbuf bytes,
so we need to give some allowance for non malicious splitting
of retransmit queue.
A new SNMP counter is added to monitor how many times TCP
did not allow to split an skb if the allowance was exceeded.
Note that this counter might increase in the case applications
use SO_SNDBUF socket option to lower sk_sndbuf.
CVE-2019-11478 : tcp_fragment, prevent fragmenting a packet when the
socket is already using more than half the allowed space
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Cc: Bruce Curtis <brucec@netflix.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b4929f65b upstream.
Jonathan Looney reported that TCP can trigger the following crash
in tcp_shifted_skb() :
BUG_ON(tcp_skb_pcount(skb) < pcount);
This can happen if the remote peer has advertized the smallest
MSS that linux TCP accepts : 48
An skb can hold 17 fragments, and each fragment can hold 32KB
on x86, or 64KB on PowerPC.
This means that the 16bit witdh of TCP_SKB_CB(skb)->tcp_gso_segs
can overflow.
Note that tcp_sendmsg() builds skbs with less than 64KB
of payload, so this problem needs SACK to be enabled.
SACK blocks allow TCP to coalesce multiple skbs in the retransmit
queue, thus filling the 17 fragments to maximal capacity.
CVE-2019-11477 -- u16 overflow of TCP_SKB_CB(skb)->tcp_gso_segs
Backport notes, provided by Joao Martins <joao.m.martins@oracle.com>
v4.15 or since commit 737ff31456 ("tcp: use sequence distance to
detect reordering") had switched from the packet-based FACK tracking and
switched to sequence-based.
v4.14 and older still have the old logic and hence on
tcp_skb_shift_data() needs to retain its original logic and have
@fack_count in sync. In other words, we keep the increment of pcount with
tcp_skb_pcount(skb) to later used that to update fack_count. To make it
more explicit we track the new skb that gets incremented to pcount in
@next_pcount, and we get to avoid the constant invocation of
tcp_skb_pcount(skb) all together.
Fixes: 832d11c5cd ("tcp: Try to restore large SKBs while SACK processing")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Bruce Curtis <brucec@netflix.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9fc54d313 upstream.
The return type for get_regs_len in struct ethtool_ops is int,
the hns3 driver may return error when failing to get the regs
len by sending cmd to firmware.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 720f1de402 ]
Currently, the process issuing a "start" command on the pktgen procfs
interface, acquires the pktgen thread lock and never release it, until
all pktgen threads are completed. The above can blocks indefinitely any
other pktgen command and any (even unrelated) netdevice removal - as
the pktgen netdev notifier acquires the same lock.
The issue is demonstrated by the following script, reported by Matteo:
ip -b - <<'EOF'
link add type dummy
link add type veth
link set dummy0 up
EOF
modprobe pktgen
echo reset >/proc/net/pktgen/pgctrl
{
echo rem_device_all
echo add_device dummy0
} >/proc/net/pktgen/kpktgend_0
echo count 0 >/proc/net/pktgen/dummy0
echo start >/proc/net/pktgen/pgctrl &
sleep 1
rmmod veth
Fix the above releasing the thread lock around the sleep call.
Additionally we must prevent racing with forcefull rmmod - as the
thread lock no more protects from them. Instead, acquire a self-reference
before waiting for any thread. As a side effect, running
rmmod pktgen
while some thread is running now fails with "module in use" error,
before this patch such command hanged indefinitely.
Note: the issue predates the commit reported in the fixes tag, but
this fix can't be applied before the mentioned commit.
v1 -> v2:
- no need to check for thread existence after flipping the lock,
pktgen threads are freed only at net exit time
-
Fixes: 6146e6a43b ("[PKTGEN]: Removes thread_{un,}lock() macros.")
Reported-and-tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 85cb928787 ]
When the following tests last for several hours, the problem will occur.
Server:
rds-stress -r 1.1.1.16 -D 1M
Client:
rds-stress -r 1.1.1.14 -s 1.1.1.16 -D 1M -T 30
The following will occur.
"
Starting up....
tsks tx/s rx/s tx+rx K/s mbi K/s mbo K/s tx us/c rtt us cpu
%
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
"
>From vmcore, we can find that clean_list is NULL.
>From the source code, rds_mr_flushd calls rds_ib_mr_pool_flush_worker.
Then rds_ib_mr_pool_flush_worker calls
"
rds_ib_flush_mr_pool(pool, 0, NULL);
"
Then in function
"
int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *pool,
int free_all, struct rds_ib_mr **ibmr_ret)
"
ibmr_ret is NULL.
In the source code,
"
...
list_to_llist_nodes(pool, &unmap_list, &clean_nodes, &clean_tail);
if (ibmr_ret)
*ibmr_ret = llist_entry(clean_nodes, struct rds_ib_mr, llnode);
/* more than one entry in llist nodes */
if (clean_nodes->next)
llist_add_batch(clean_nodes->next, clean_tail, &pool->clean_list);
...
"
When ibmr_ret is NULL, llist_entry is not executed. clean_nodes->next
instead of clean_nodes is added in clean_list.
So clean_nodes is discarded. It can not be used again.
The workqueue is executed periodically. So more and more clean_nodes are
discarded. Finally the clean_list is NULL.
Then this problem will occur.
Fixes: 1bc144b625 ("net, rds, Replace xlist in net/rds/xlist.h with llist")
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4b2a2bfeb3 ]
Commit cd9ff4de01 changed the key for IFF_POINTOPOINT devices to
INADDR_ANY but neigh_xmit which is used for MPLS encapsulations was not
updated to use the altered key. The result is that every packet Tx does
a lookup on the gateway address which does not find an entry, a new one
is created only to find the existing one in the table right before the
insert since arp_constructor was updated to reset the primary key. This
is seen in the allocs and destroys counters:
ip -s -4 ntable show | head -10 | grep alloc
which increase for each packet showing the unnecessary overhread.
Fix by having neigh_xmit use __ipv4_neigh_lookup_noref for NEIGH_ARP_TABLE.
Fixes: cd9ff4de01 ("ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY")
Reported-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0ee4e76937 ]
ethtool_get_regs() allocates a buffer of size ops->get_regs_len(),
and pass it to the kernel driver via ops->get_regs() for filling.
There is no restriction about what the kernel drivers can or cannot do
with the open ethtool_regs structure. They usually set regs->version
and ignore regs->len or set it to the same size as ops->get_regs_len().
But if userspace allocates a smaller buffer for the registers dump,
we would cause a userspace buffer overflow in the final copy_to_user()
call, which uses the regs.len value potentially reset by the driver.
To fix this, make this case obvious and store regs.len before calling
ops->get_regs(), to only copy as much data as requested by userspace,
up to the value returned by ops->get_regs_len().
While at it, remove the redundant check for non-null regbuf.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b16f04872 upstream.
If you take a GSO skb, and split it into packets, will the MAC
length (L2 + L3 + L4 headers + payload) of those packets be small
enough to fit within a given length?
Move skb_gso_mac_seglen() to skbuff.h with other related functions
like skb_gso_network_seglen() so we can use it, and then create
skb_gso_validate_mac_len to do the full calculation.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 4.4: There is no GSO_BY_FRAGS case to handle, so
skb_gso_validate_mac_len() becomes a trivial comparison. Put it inline in
<linux/skbuff.h>.]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 526f5b851a upstream.
Error message printed:
modprobe: ERROR: could not insert 'tipc': Address family not
supported by protocol.
when modprobe tipc after the following patch: switch order of
device registration, commit 7e27e8d613
("tipc: switch order of device registration to fix a crash")
Because sock_create_kern(net, AF_TIPC, ...) called by
tipc_topsrv_create_listener() in the initialization process
of tipc_init_net(), so tipc_socket_init() must be execute before that.
Meanwhile, tipc_net_id need to be initialized when sock_create()
called, and tipc_socket_init() is no need to be called for each namespace.
I add a variable tipc_topsrv_net_ops, and split the
register_pernet_subsys() of tipc into two parts, and split
tipc_socket_init() with initialization of pernet params.
By the way, I fixed resources rollback error when tipc_bcast_init()
failed in tipc_init_net().
Fixes: 7e27e8d613 ("tipc: switch order of device registration to fix a crash")
Signed-off-by: Junwei Hu <hujunwei4@huawei.com>
Reported-by: Wang Wang <wangwang2@huawei.com>
Reported-by: syzbot+1e8114b61079bfe9cbc5@syzkaller.appspotmail.com
Reviewed-by: Kang Zhou <zhoukang7@huawei.com>
Reviewed-by: Suanming Mou <mousuanming@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a4270d6795 ]
If a network driver provides to napi_gro_frags() an
skb with a page fragment of exactly 14 bytes, the call
to gro_pull_from_frag0() will 'consume' the fragment
by calling skb_frag_unref(skb, 0), and the page might
be freed and reused.
Reading eth->h_proto at the end of napi_frags_skb() might
read mangled data, or crash under specific debugging features.
BUG: KASAN: use-after-free in napi_frags_skb net/core/dev.c:5833 [inline]
BUG: KASAN: use-after-free in napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841
Read of size 2 at addr ffff88809366840c by task syz-executor599/8957
CPU: 1 PID: 8957 Comm: syz-executor599 Not tainted 5.2.0-rc1+ #32
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188
__kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
kasan_report+0x12/0x20 mm/kasan/common.c:614
__asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:142
napi_frags_skb net/core/dev.c:5833 [inline]
napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841
tun_get_user+0x2f3c/0x3ff0 drivers/net/tun.c:1991
tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2037
call_write_iter include/linux/fs.h:1872 [inline]
do_iter_readv_writev+0x5f8/0x8f0 fs/read_write.c:693
do_iter_write fs/read_write.c:970 [inline]
do_iter_write+0x184/0x610 fs/read_write.c:951
vfs_writev+0x1b3/0x2f0 fs/read_write.c:1015
do_writev+0x15b/0x330 fs/read_write.c:1058
Fixes: a50e233c50 ("net-gro: restore frag0 optimization")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 72f7cfab6f ]
IPv6 does not consider if the socket is bound to a device when binding
to an address. The result is that a socket can be bound to eth0 and
then bound to the address of eth1. If the device is a VRF, the result
is that a socket can only be bound to an address in the default VRF.
Resolve by considering the device if sk_bound_dev_if is set.
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5dc8cdce1d ]
FullMAC STAs have no way to update bss channel after CSA channel switch
completion. As a result, user-space tools may provide inconsistent
channel info. For instance, consider the following two commands:
$ sudo iw dev wlan0 link
$ sudo iw dev wlan0 info
The latter command gets channel info from the hardware, so most probably
its output will be correct. However the former command gets channel info
from scan cache, so its output will contain outdated channel info.
In fact, current bss channel info will not be updated until the
next [re-]connect.
Note that mac80211 STAs have a workaround for this, but it requires
access to internal cfg80211 data, see ieee80211_chswitch_work:
/* XXX: shouldn't really modify cfg80211-owned data! */
ifmgd->associated->channel = sdata->csa_chandef.chan;
This patch suggests to convert mac80211 workaround into cfg80211 behavior
and to update current bss channel in cfg80211_ch_switch_notify.
Signed-off-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8742dc86d0 ]
We currently don't reload pointers pointing into skb header
after doing pskb_may_pull() in _decode_session4(). So in case
pskb_may_pull() changed the pointers, we read from random
memory. Fix this by putting all the needed infos on the
stack, so that we don't need to access the header pointers
after doing pskb_may_pull().
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5483844c3f ]
If tunnel registration failed during module initialization, the module
would fail to deregister the IPPROTO_COMP protocol and would attempt to
deregister the tunnel.
The tunnel was not deregistered during module-exit.
Fixes: dd9ee34440 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ee02a54ef ]
When unloading xfrm6_tunnel module, xfrm6_tunnel_fini directly
frees the xfrm6_tunnel_spi_kmem. Maybe someone has gotten the
xfrm6_tunnel_spi, so need to wait it.
Fixes: 91cc3bb0b04ff("xfrm6_tunnel: RCU conversion")
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>