54355 Commits

Author SHA1 Message Date
Florian Westphal
2294be0f11 net: use skb_sec_path helper in more places
skb_sec_path gains 'const' qualifier to avoid
xt_policy.c: 'skb_sec_path' discards 'const' qualifier from pointer target type

same reasoning as previous conversions: Won't need to touch these
spots anymore when skb->sp is removed.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:37 -08:00
Florian Westphal
7af8f4ca31 net: move secpath_exist helper to sk_buff.h
Future patch will remove skb->sp pointer.
To reduce noise in those patches, move existing helper to
sk_buff and use it in more places to ease skb->sp replacement later.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:37 -08:00
Florian Westphal
0ca64da128 xfrm: change secpath_set to return secpath struct, not error value
It can only return 0 (success) or -ENOMEM.
Change return value to a pointer to secpath struct.

This avoids direct access to skb->sp:

err = secpath_set(skb);
if (!err) ..
skb->sp-> ...

Becomes:
sp = secpath_set(skb)
if (!sp) ..
sp-> ..

This reduces noise in followup patch which is going to remove skb->sp.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:37 -08:00
Florian Westphal
de8bda1d22 net: convert bridge_nf to use skb extension infrastructure
This converts the bridge netfilter (calling iptables hooks from bridge)
facility to use the extension infrastructure.

The bridge_nf specific hooks in skb clone and free paths are removed, they
have been replaced by the skb_ext hooks that do the same as the bridge nf
allocations hooks did.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:37 -08:00
Florian Westphal
df5042f4c5 sk_buff: add skb extension infrastructure
This adds an optional extension infrastructure, with ispec (xfrm) and
bridge netfilter as first users.
objdiff shows no changes if kernel is built without xfrm and br_netfilter
support.

The third (planned future) user is Multipath TCP which is still
out-of-tree.
MPTCP needs to map logical mptcp sequence numbers to the tcp sequence
numbers used by individual subflows.

This DSS mapping is read/written from tcp option space on receive and
written to tcp option space on transmitted tcp packets that are part of
and MPTCP connection.

Extending skb_shared_info or adding a private data field to skb fclones
doesn't work for incoming skb, so a different DSS propagation method would
be required for the receive side.

mptcp has same requirements as secpath/bridge netfilter:

1. extension memory is released when the sk_buff is free'd.
2. data is shared after cloning an skb (clone inherits extension)
3. adding extension to an skb will COW the extension buffer if needed.

The "MPTCP upstreaming" effort adds SKB_EXT_MPTCP extension to store the
mapping for tx and rx processing.

Two new members are added to sk_buff:
1. 'active_extensions' byte (filling a hole), telling which extensions
   are available for this skb.
   This has two purposes.
   a) avoids the need to initialize the pointer.
   b) allows to "delete" an extension by clearing its bit
   value in ->active_extensions.

   While it would be possible to store the active_extensions byte
   in the extension struct instead of sk_buff, there is one problem
   with this:
    When an extension has to be disabled, we can always clear the
    bit in skb->active_extensions.  But in case it would be stored in the
    extension buffer itself, we might have to COW it first, if
    we are dealing with a cloned skb.  On kmalloc failure we would
    be unable to turn an extension off.

2. extension pointer, located at the end of the sk_buff.
   If the active_extensions byte is 0, the pointer is undefined,
   it is not initialized on skb allocation.

This adds extra code to skb clone and free paths (to deal with
refcount/free of extension area) but this replaces similar code that
manages skb->nf_bridge and skb->sp structs in the followup patches of
the series.

It is possible to add support for extensions that are not preseved on
clones/copies.

To do this, it would be needed to define a bitmask of all extensions that
need copy/cow semantics, and change __skb_ext_copy() to check
->active_extensions & SKB_EXT_PRESERVE_ON_CLONE, then just set
->active_extensions to 0 on the new clone.

This isn't done here because all extensions that get added here
need the copy/cow semantics.

v2:
Allocate entire extension space using kmem_cache.
Upside is that this allows better tracking of used memory,
downside is that we will allocate more space than strictly needed in
most cases (its unlikely that all extensions are active/needed at same
time for same skb).
The allocated memory (except the small extension header) is not cleared,
so no additonal overhead aside from memory usage.

Avoid atomic_dec_and_test operation on skb_ext_put()
by using similar trick as kfree_skbmem() does with fclone_ref:
If recount is 1, there is no concurrent user and we can free right away.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:37 -08:00
Florian Westphal
c4b0e771f9 netfilter: avoid using skb->nf_bridge directly
This pointer is going to be removed soon, so use the existing helpers in
more places to avoid noise when the removal happens.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:37 -08:00
NeilBrown
04d1532bd0 SUNRPC discard cr_uid from struct rpc_cred.
Just use ->cr_cred->fsuid directly.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:46 -05:00
NeilBrown
2edd8d746e SUNRPC: simplify auth_unix.
1/ discard 'struct unx_cred'.  We don't need any data that
   is not already in 'struct rpc_cred'.
2/ Don't keep these creds in a hash table.  When a credential
   is needed, simply allocate it.  When not needed, discard it.
   This can easily be faster than performing a lookup on
   a shared hash table.
   As the lookup can happen during write-out, use a mempool
   to ensure forward progress.
   This means that we cannot compare two credentials for
   equality by comparing the pointers, but we never do that anyway.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:46 -05:00
NeilBrown
d6efccd97e SUNRPC: remove crbind rpc_cred operation
This now always just does get_rpccred(), so we
don't need an operation pointer to know to do that.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:46 -05:00
NeilBrown
89a4f758d9 SUNRPC: remove generic cred code.
This is no longer used.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:46 -05:00
NeilBrown
a52458b48a NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.
SUNRPC has two sorts of credentials, both of which appear as
"struct rpc_cred".
There are "generic credentials" which are supplied by clients
such as NFS and passed in 'struct rpc_message' to indicate
which user should be used to authorize the request, and there
are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS
which describe the credential to be sent over the wires.

This patch replaces all the generic credentials by 'struct cred'
pointers - the credential structure used throughout Linux.

For machine credentials, there is a special 'struct cred *' pointer
which is statically allocated and recognized where needed as
having a special meaning.  A look-up of a low-level cred will
map this to a machine credential.

Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:46 -05:00
NeilBrown
354698b7d4 SUNRPC: remove RPCAUTH_AUTH_NO_CRKEY_TIMEOUT
This is no longer used.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:45 -05:00
NeilBrown
ddf529eeed NFS: move credential expiry tracking out of SUNRPC into NFS.
NFS needs to know when a credential is about to expire so that
it can modify write-back behaviour to finish the write inside the
expiry time.
It currently uses functions in SUNRPC code which make use of a
fairly complex callback scheme and flags in the generic credientials.

As I am working to discard the generic credentials, this has to change.

This patch moves the logic into NFS, in part by finding and caching
the low-level credential in the open_context.  We then make direct
cred-api calls on that.

This makes the code much simpler and removes a dependency on generic
rpc credentials.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:45 -05:00
NeilBrown
1de7eea929 SUNRPC: add side channel to use non-generic cred for rpc call.
The credential passed in rpc_message.rpc_cred is always a
generic credential except in one instance.
When gss_destroying_context() calls rpc_call_null(), it passes
a specific credential that it needs to destroy.
In this case the RPC acts *on* the credential rather than
being authorized by it.

This special case deserves explicit support and providing that will
mean that rpc_message.rpc_cred is *always* generic, allowing
some optimizations.

So add "tk_op_cred" to rpc_task and "rpc_op_cred" to the setup data.
Use this to pass the cred down from rpc_call_null(), and have
rpcauth_bindcred() notice it and bind it in place.

Credit to kernel test robot <fengguang.wu@intel.com> for finding
a bug in earlier version of this patch.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:45 -05:00
NeilBrown
a68a72e135 SUNRPC: introduce RPC_TASK_NULLCREDS to request auth_none
In almost all cases the credential stored in rpc_message.rpc_cred
is a "generic" credential.  One of the two expections is when an
AUTH_NULL credential is used such as for RPC ping requests.

To improve consistency, don't pass an explicit credential in
these cases, but instead pass NULL and set a task flag,
similar to RPC_TASK_ROOTCREDS, which requests that NULL credentials
be used by default.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:45 -05:00
NeilBrown
5e16923b43 NFS/SUNRPC: don't lookup machine credential until rpcauth_bindcred().
When NFS creates a machine credential, it is a "generic" credential,
not tied to any auth protocol, and is really just a container for
the princpal name.
This doesn't get linked to a genuine credential until rpcauth_bindcred()
is called.
The lookup always succeeds, so various places that test if the machine
credential is NULL, are pointless.

As a step towards getting rid of generic credentials, this patch gets
rid of generic machine credentials.  The nfs_client and rpc_client
just hold a pointer to a constant principal name.
When a machine credential is wanted, a special static 'struct rpc_cred'
pointer is used. rpcauth_bindcred() recognizes this, finds the
principal from the client, and binds the correct credential.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:45 -05:00
NeilBrown
1a80810fbf SUNRPC: remove machine_cred field from struct auth_cred
The cred is a machine_cred iff ->principal is set, so there is no
need for the extra flag.

There is one case which deserves some
explanation. nfs4_root_machine_cred() calls rpc_lookup_machine_cred()
with a NULL principal name which results in not getting a machine
credential, but getting a root credential instead.
This appears to be what is expected of the caller, and is
clearly the result provided by both auth_unix and auth_gss
which already ignore the flag.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:45 -05:00
NeilBrown
8276c902bb SUNRPC: remove uid and gid from struct auth_cred
Use cred->fsuid and cred->fsgid instead.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:45 -05:00
NeilBrown
fc0664fd9b SUNRPC: remove groupinfo from struct auth_cred.
We can use cred->groupinfo (from the 'struct cred') instead.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:45 -05:00
NeilBrown
97f68c6b02 SUNRPC: add 'struct cred *' to auth_cred and rpc_cred
The SUNRPC credential framework was put together before
Linux has 'struct cred'.  Now that we have it, it makes sense to
use it.
This first step just includes a suitable 'struct cred *' pointer
in every 'struct auth_cred' and almost every 'struct rpc_cred'.

The rpc_cred used for auth_null has a NULL 'struct cred *' as nothing
else really makes sense.

For rpc_cred, the pointer is reference counted.
For auth_cred it isn't.  struct auth_cred are either allocated on
the stack, in which case the thread owns a reference to the auth,
or are part of 'struct generic_cred' in which case gc_base owns the
reference, and "acred" shares it.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:44 -05:00
Ben Dooks
8e2e5b7c49 SUNRPC: allow /proc entries without CONFIG_SUNRPC_DEBUG
If we want /proc/sys/sunrpc the current kernel also drags in other debug
features which we don't really want. Instead, we should always show the
following entries:

/proc/sys/sunrpc/udp_slot_table_entries
/proc/sys/sunrpc/tcp_slot_table_entries
/proc/sys/sunrpc/tcp_max_slot_table_entries
/proc/sys/sunrpc/min_resvport
/proc/sys/sunrpc/max_resvport
/proc/sys/sunrpc/tcp_fin_timeout

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Thomas Preston <thomas.preston@codethink.co.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:44 -05:00
shamir rabinovitch
c75ab8a55a net/rds: remove user triggered WARN_ON in rds_sendmsg
per comment from Leon in rdma mailing list
https://lkml.org/lkml/2018/10/31/312 :

Please don't forget to remove user triggered WARN_ON.
https://lwn.net/Articles/769365/
"Greg Kroah-Hartman raised the problem of core kernel API code that will
use WARN_ON_ONCE() to complain about bad usage; that will not generate
the desired result if WARN_ON_ONCE() is configured to crash the machine.
He was told that the code should just call pr_warn() instead, and that
the called function should return an error in such situations. It was
generally agreed that any WARN_ON() or WARN_ON_ONCE() calls that can be
triggered from user space need to be fixed."

in addition harden rds_sendmsg to detect and overcome issues with
invalid sg count and fail the sendmsg.

Suggested-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 10:27:58 -08:00
shamir rabinovitch
ea010070d0 net/rds: fix warn in rds_message_alloc_sgs
redundant copy_from_user in rds_sendmsg system call expose rds
to issue where rds_rdma_extra_size walk the rds iovec and and
calculate the number pf pages (sgs) it need to add to the tail of
rds message and later rds_cmsg_rdma_args copy the rds iovec again
and re calculate the same number and get different result causing
WARN_ON in rds_message_alloc_sgs.

fix this by doing the copy_from_user only once per rds_sendmsg
system call.

When issue occur the below dump is seen:

WARNING: CPU: 0 PID: 19789 at net/rds/message.c:316 rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 19789 Comm: syz-executor827 Not tainted 4.19.0-next-20181030+ #101
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x244/0x39d lib/dump_stack.c:113
 panic+0x2ad/0x55c kernel/panic.c:188
 __warn.cold.8+0x20/0x45 kernel/panic.c:540
 report_bug+0x254/0x2d0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
 do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
RIP: 0010:rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
Code: c0 74 04 3c 03 7e 6c 44 01 ab 78 01 00 00 e8 2b 9e 35 fa 4c 89 e0 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 14 9e 35 fa <0f> 0b 31 ff 44 89 ee e8 18 9f 35 fa 45 85 ed 75 1b e8 fe 9d 35 fa
RSP: 0018:ffff8801c51b7460 EFLAGS: 00010293
RAX: ffff8801bc412080 RBX: ffff8801d7bf4040 RCX: ffffffff8749c9e6
RDX: 0000000000000000 RSI: ffffffff8749ca5c RDI: 0000000000000004
RBP: ffff8801c51b7490 R08: ffff8801bc412080 R09: ffffed003b5c5b67
R10: ffffed003b5c5b67 R11: ffff8801dae2db3b R12: 0000000000000000
R13: 000000000007165c R14: 000000000007165c R15: 0000000000000005
 rds_cmsg_rdma_args+0x82d/0x1510 net/rds/rdma.c:623
 rds_cmsg_send net/rds/send.c:971 [inline]
 rds_sendmsg+0x19a2/0x3180 net/rds/send.c:1273
 sock_sendmsg_nosec net/socket.c:622 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:632
 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2117
 __sys_sendmsg+0x11d/0x280 net/socket.c:2155
 __do_sys_sendmsg net/socket.c:2164 [inline]
 __se_sys_sendmsg net/socket.c:2162 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x44a859
Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 6b cb fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f1d4710ada8 EFLAGS: 00000297 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000006dcc28 RCX: 000000000044a859
RDX: 0000000000000000 RSI: 0000000020001600 RDI: 0000000000000003
RBP: 00000000006dcc20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000297 R12: 00000000006dcc2c
R13: 646e732f7665642f R14: 00007f1d4710b9c0 R15: 00000000006dcd2c
Kernel Offset: disabled
Rebooting in 86400 seconds..

Reported-by: syzbot+26de17458aeda9d305d8@syzkaller.appspotmail.com
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 10:27:58 -08:00
David S. Miller
29d3c047b7 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2018-12-19

Here's the main bluetooth-next pull request for 4.21:

 - Multiple fixes & improvements for Broadcom-based controllers
 - New USB ID for an Intel controller
 - Support for new Broadcom controller variants
 - Use DEFINE_SHOW_ATTRIBUTE to simplify debugfs code
 - Eliminate confusing "last event is not cmd complete" warning message
 - Added vendor suspend/resume support for H:5 (3-Wire UART) controllers
 - Various other smaller improvements & fixes

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 08:41:45 -08:00
David S. Miller
5a862f86b8 This time we have too many changes to list, highlights:
* virt_wifi - wireless control simulation on top of
    another network interface
  * hwsim configurability to test capabilities similar
    to real hardware
  * various mesh improvements
  * various radiotap vendor data fixes in mac80211
  * finally the nl_set_extack_cookie_u64() we talked
    about previously, used for
  * peer measurement APIs, right now only with FTM
    (flight time measurement) for location
  * made nl80211 radio/interface announcements more complete
  * various new HE (802.11ax) things:
    updates, TWT support, ...
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlwaCwkACgkQB8qZga/f
 l8S7mA/+I1CJmGC7Pvy+SBFkzoY5zEjjzgZYL6sGo16qMs89NPcURSe5j+uCsDP3
 nKEjsvhQMYDfGNLTJJfWbDpGwm9LnKp69AFITlvfzmP6Sm36QMZr7oIC4abi8cW4
 osaO3qfdaNoZ//x72jgjrFhUAnphvT2BsRVMNEjz7sXcDd7Jm9NnpRhV8zgXFvLF
 dS2Ng51LM/BLMz5jQpyJUDZeeL/iBYybCecyckmVqzXPh1icIZETSqZXiN4ngv2A
 6p9BSGNtP6wmjnbkvZz5RDq76VhTPZWsTgTpVb45Wf1k2fm1rB96UgpqvfQtjTgB
 +7Zx2WRpMXM5OjGkwaEs8nawFmt7MHCGnhLPLWPCbXc685fhp3OFShysMJdYS/GZ
 IIRJ7+IchAQX1yluftB+NkQM9sBDjyseMBwxHRYkj/rQVhoLY1sT+ke7lkuV10o6
 DQqfpUTZAsIz7zkuscn7hkNdI/Rjub6BZjbrs1Jt9zSt9WQUBao23XudOI0j5JDa
 ErnfC5PISXMQWik5B9M1Zhq3H9qCI2Swh19lMmtxtSDQ9yrLrJkEJ5SA+aHoxNHj
 wSxBc3XXSW47qPXGX/D5DNnbOcOrE7kVZuD8YqRsy8VedyjIgEw7oQ21flAD4FC4
 R4TgbNkqpfZQsU29gaMkDkYXnfQDB/G9FOk6ARGxjBPjT55Hz0E=
 =EpyK
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
This time we have too many changes to list, highlights:
 * virt_wifi - wireless control simulation on top of
   another network interface
 * hwsim configurability to test capabilities similar
   to real hardware
 * various mesh improvements
 * various radiotap vendor data fixes in mac80211
 * finally the nl_set_extack_cookie_u64() we talked
   about previously, used for
 * peer measurement APIs, right now only with FTM
   (flight time measurement) for location
 * made nl80211 radio/interface announcements more complete
 * various new HE (802.11ax) things:
   updates, TWT support, ...
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 08:36:18 -08:00
David S. Miller
49ce708be6 Just three fixes:
* fix a memory leak in an error path
  * fix TXQs in interface teardown
  * free fraglist if we used it internally
    before returning SKB
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlwaBJAACgkQB8qZga/f
 l8QvBg/9ESMGPa4F+AdXfvwlDWHkb3fBuv8E1HAkiDiV3G0eyziOwqIo4mSD49fT
 tSM+3AFBe9D97O0Y669qnKrcqSWVpGY31hI2MmxskcwwJ+BVgf9M33GjvHa548Bt
 c3DEKVLDIfOfQG9+LySeNNz8kRipqIe11WDjpoHZ1i2LtV/uEGOC+cSAxTRwqxpG
 HyjF2GBz24PVz69eGNODQNIWgIkkGCil8b8BCED94jUio015t7H+b3hzWDDC83m1
 pJwK/Uq4UkJzpjQykSf9805blB0h7PdXo8CSrNNw9+PnEVcvu+81IqD0q5VfZL1a
 CePmM43ud59kx1DtPQ0LB4XSIJX7PPiMskK9ZoPUduDvOsaJU32gFmK/Rvm45Xdt
 8mQrZM8EsBLCFq8M/Gk89QadXByItUD8B6jZcJgMEddUOa5QPnCaodTQqqO2I8ky
 T76FXRegTOMFeGhr/NrWAw3xxCY8TZqwB4P9F4juoKCa9Wz0b1dbIrD7nx1SajoA
 jhb795CTczMKCzFywNIh97fKa2yp0YQO0/EQDOFrbYxEyQozSr+cudU4EBN6B08i
 A4rVRAAPTlwfFz5TSr7gvT/JHlsku+kfjFAY1tmSWxLgWxFJb3cc0Uj1ak05bmv4
 fdANk9g81U4W/bUBKqmeIIrYW0icXHtGZBLX0Khpcx3KQSH11vw=
 =X7vr
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Just three fixes:
 * fix a memory leak in an error path
 * fix TXQs in interface teardown
 * free fraglist if we used it internally
   before returning SKB
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 08:34:46 -08:00
Johan Hedberg
1629db9c75 Bluetooth: Fix unnecessary error message for HCI request completion
In case a command which completes in Command Status was sent using the
hci_cmd_send-family of APIs there would be a misleading error in the
hci_get_cmd_complete function, since the code would be trying to fetch
the Command Complete parameters when there are none.

Avoid the misleading error and silently bail out from the function in
case the received event is a command status.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-12-19 14:37:03 +01:00
YueHaibing
fa89a4593b xfrm6_tunnel: Fix spi check in __xfrm6_tunnel_alloc_spi
gcc warn this:

net/ipv6/xfrm6_tunnel.c:143 __xfrm6_tunnel_alloc_spi() warn:
 always true condition '(spi <= 4294967295) => (0-u32max <= u32max)'

'spi' is u32, which always not greater than XFRM6_TUNNEL_SPI_MAX
because of wrap around. So the second forloop will never reach.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-12-19 12:33:17 +01:00
YueHaibing
cc4acb1b6a xfrm: policy: remove set but not used variable 'priority'
Fixes gcc '-Wunused-but-set-variable' warning:

net/xfrm/xfrm_policy.c: In function 'xfrm_policy_lookup_bytype':
net/xfrm/xfrm_policy.c:2079:6: warning:
 variable 'priority' set but not used [-Wunused-but-set-variable]

It not used since commit 6be3b0db6db8 ("xfrm: policy: add inexact policy
search tree infrastructure")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-12-19 12:24:43 +01:00
Ilan Peer
d359bbce06 mac80211: Properly access radiotap vendor data
The radiotap vendor data might be placed after some other
radiotap elements, and thus when accessing it, need to access
the correct offset in the skb data. Fix the code accordingly.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-12-19 09:41:10 +01:00
Johannes Berg
93bc8ac49e cfg80211: fix ieee80211_get_vht_max_nss()
Fix two bugs in ieee80211_get_vht_max_nss():
 * the spec says we should round down
   (reported by Nissim)
 * there's a double condition, the first one is wrong,
   supp_width == 0 / ext_nss_bw == 2 is valid in 80+80
   (found by smatch)

Fixes: b0aa75f0b1b2 ("ieee80211: add new VHT capability fields/parsing")
Reported-by: Nissim Bendanan <nissimx.bendanan@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-12-19 09:41:10 +01:00
Sara Sharon
34b1e0e9ef mac80211: free skb fraglist before freeing the skb
mac80211 uses the frag list to build AMSDU. When freeing
the skb, it may not be really freed, since someone is still
holding a reference to it.
In that case, when TCP skb is being retransmitted, the
pointer to the frag list is being reused, while the data
in there is no longer valid.
Since we will never get frag list from the network stack,
as mac80211 doesn't advertise the capability, we can safely
free and nullify it before releasing the SKB.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-12-19 09:40:17 +01:00
Johannes Berg
d350a0f431 nl80211: fix memory leak if validate_pae_over_nl80211() fails
If validate_pae_over_nl80211() were to fail in nl80211_crypto_settings(),
we might leak the 'connkeys' allocation. Fix this.

Fixes: 64bf3d4bc2b0 ("nl80211: Add CONTROL_PORT_OVER_NL80211 attribute")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-12-19 09:40:17 +01:00
Johannes Berg
efc38dd7d5 mac80211: fix radiotap vendor presence bitmap handling
Due to the alignment handling, it actually matters where in the code
we add the 4 bytes for the presence bitmap to the length; the first
field is the timestamp with 8 byte alignment so we need to add the
space for the extra vendor namespace presence bitmap *before* we do
any alignment for the fields.

Move the presence bitmap length accounting to the right place to fix
the alignment for the data properly.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-12-19 09:39:01 +01:00
Myungho Jung
78abe3d0df net/smc: fix TCP fallback socket release
clcsock can be released while kernel_accept() references it in TCP
listen worker. Also, clcsock needs to wake up before released if TCP
fallback is used and the clcsock is blocked by accept. Add a lock to
safely release clcsock and call kernel_sock_shutdown() to wake up
clcsock from accept in smc_release().

Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com
Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com
Signed-off-by: Myungho Jung <mhjungk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 22:02:51 -08:00
Zhenbo Gao
5679ee784c tipc: handle broadcast NAME_DISTRIBUTOR packet when receiving it
NAME_DISTRIBUTOR messages are transmitted through unicast link on TIPC
2.0, by contrast, the messages are delivered through broadcast link on
TIPC 1.7. But at present, NAME_DISTRIBUTOR messages received by
broadcast link cannot be handled in tipc_rcv() until an unicast message
arrives, which may lead to a significant delay to update name table.

To avoid this delay, we will also deal with broadcast NAME_DISTRIBUTOR
message on broadcast receive path.

Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:50:48 -08:00
YueHaibing
a26d94bff4 net: bridge: remove unneeded variable 'err'
function br_multicast_toggle now always return 0,
so the variable 'err' is unneeded.
Also cleanup dead branch in br_changelink.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:50:13 -08:00
Cong Wang
3c6306d440 tipc: check group dests after tipc_wait_for_cond()
Similar to commit 143ece654f9f ("tipc: check tsk->group in tipc_wait_for_cond()")
we have to reload grp->dests too after we re-take the sock lock.
This means we need to move the dsts check after tipc_wait_for_cond()
too.

Fixes: 75da2163dbb6 ("tipc: introduce communication groups")
Reported-and-tested-by: syzbot+99f20222fc5018d2b97a@syzkaller.appspotmail.com
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:44:23 -08:00
Colin Ian King
75edd1f2f9 Bluetooth: clean an indentation issue, remove extraneous space
Trivial fix to clean up an indentation issue

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-12-19 00:44:01 +01:00
Yangtao Li
8e2924e383 Bluetooth: Change to use DEFINE_SHOW_ATTRIBUTE macro
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-12-19 00:28:20 +01:00
Yangtao Li
f79ba43002 6lowpan: convert to DEFINE_SHOW_ATTRIBUTE
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-12-19 00:28:05 +01:00
John Fastabend
3bdbd0228e bpf: sockmap, metadata support for reporting size of msg
This adds metadata to sk_msg_md for BPF programs to read the sk_msg
size.

When the SK_MSG program is running under an application that is using
sendfile the data is not copied into sk_msg buffers by default. Rather
the BPF program uses sk_msg_pull_data to read the bytes in. This
avoids doing the costly memcopy instructions when they are not in
fact needed. However, if we don't know the size of the sk_msg we
have to guess if needed bytes are available by doing a pull request
which may fail. By including the size of the sk_msg BPF programs can
check the size before issuing sk_msg_pull_data requests.

Additionally, the same applies for sendmsg calls when the application
provides multiple iovs. Here the BPF program needs to pull in data
to update data pointers but its not clear where the data ends without
a size parameter. In many cases "guessing" is not easy to do
and results in multiple calls to pull and without bounded loops
everything gets fairly tricky.

Clean this up by including a u32 size field. Note, all writes into
sk_msg_md are rejected already from sk_msg_is_valid_access so nothing
additional is needed there.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-19 00:27:23 +01:00
Jorgen Hansen
a915b982d8 VSOCK: Send reset control packet when socket is partially bound
If a server side socket is bound to an address, but not in the listening
state yet, incoming connection requests should receive a reset control
packet in response. However, the function used to send the reset
silently drops the reset packet if the sending socket isn't bound
to a remote address (as is the case for a bound socket not yet in
the listening state). This change fixes this by using the src
of the incoming packet as destination for the reset packet in
this case.

Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 11:53:42 -08:00
David S. Miller
fde9cd69a5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2018-12-18

1) Fix error return code in xfrm_output_one()
   when no dst_entry is attached to the skb.
   From Wei Yongjun.

2) The xfrm state hash bucket count reported to
   userspace is off by one. Fix from Benjamin Poirier.

3) Fix NULL pointer dereference in xfrm_input when
   skb_dst_force clears the dst_entry.

4) Fix freeing of xfrm states on acquire. We use a
   dedicated slab cache for the xfrm states now,
   so free it properly with kmem_cache_free.
   From Mathias Krause.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 11:43:26 -08:00
David S. Miller
77c7a7b3e7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2018-12-18

1) Add xfrm policy selftest scripts.
   From Florian Westphal.

2) Split inexact policies into four different search list
   classes and use the rbtree infrastructure to store/lookup
   the policies. This is to improve the policy lookup
   performance after the flowcache removal.
   Patches from Florian Westphal.

3) Various coding style fixes, from Colin Ian King.

4) Fix policy lookup logic after adding the inexact policy
   search tree infrastructure. From Florian Westphal.

5) Remove a useless remove BUG_ON from xfrm6_dst_ifdown.
   From Li RongQing.

6) Use the correct policy direction for lookups on hash
   rebuilding. From Florian Westphal.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 08:49:48 -08:00
Trond Myklebust
abc1327577 SUNRPC: Remove xprt_connect_status()
Over the years, xprt_connect_status() has been superseded by
call_connect_status(), which now handles all the errors that
xprt_connect_status() does and more. Since the latter converts
all errors that it doesn't recognise to EIO, then it is time
for it to be retired.

Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2018-12-18 11:04:10 -05:00
Trond Myklebust
cf76785d30 SUNRPC: Fix a race with XPRT_CONNECTING
Ensure that we clear XPRT_CONNECTING before releasing the XPRT_LOCK so that
we don't have races between the (asynchronous) socket setup code and
tasks in xprt_connect().

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2018-12-18 11:04:03 -05:00
Trond Myklebust
0445f92c5d SUNRPC: Fix disconnection races
When the socket is closed, we need to call xprt_disconnect_done() in order
to clean up the XPRT_WRITE_SPACE flag, and wake up the sleeping tasks.

However, we also want to ensure that we don't wake them up before the socket
is closed, since that would cause thundering herd issues with everyone
piling up to retransmit before the TCP shutdown dance has completed.
Only the task that holds XPRT_LOCKED needs to wake up early in order to
allow the close to complete.

Reported-by: Dave Wysochanski <dwysocha@redhat.com>
Reported-by: Scott Mayhew <smayhew@redhat.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
2018-12-18 11:03:57 -05:00
Arnd Bergmann
e11d4284e2 y2038: socket: Add compat_sys_recvmmsg_time64
recvmmsg() takes two arguments to pointers of structures that differ
between 32-bit and 64-bit architectures: mmsghdr and timespec.

For y2038 compatbility, we are changing the native system call from
timespec to __kernel_timespec with a 64-bit time_t (in another patch),
and use the existing compat system call on both 32-bit and 64-bit
architectures for compatibility with traditional 32-bit user space.

As we now have two variants of recvmmsg() for 32-bit tasks that are both
different from the variant that we use on 64-bit tasks, this means we
also require two compat system calls!

The solution I picked is to flip things around: The existing
compat_sys_recvmmsg() call gets moved from net/compat.c into net/socket.c
and now handles the case for old user space on all architectures that
have set CONFIG_COMPAT_32BIT_TIME.  A new compat_sys_recvmmsg_time64()
call gets added in the old place for 64-bit architectures only, this
one handles the case of a compat mmsghdr structure combined with
__kernel_timespec.

In the indirect sys_socketcall(), we now need to call either
do_sys_recvmmsg() or __compat_sys_recvmmsg(), depending on what kind of
architecture we are on. For compat_sys_socketcall(), no such change is
needed, we always call __compat_sys_recvmmsg().

I decided to not add a new SYS_RECVMMSG_TIME64 socketcall: Any libc
implementation for 64-bit time_t will need significant changes including
an updated asm/unistd.h, and it seems better to consistently use the
separate syscalls that configuration, leaving the socketcall only for
backward compatibility with 32-bit time_t based libc.

The naming is asymmetric for the moment, so both existing syscalls
entry points keep their names, while the new ones are recvmmsg_time32
and compat_recvmmsg_time64 respectively. I expect that we will rename
the compat syscalls later as we start using generated syscall tables
everywhere and add these entry points.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-12-18 16:13:04 +01:00
Shaul Triebitz
dc7eb0f2c2 mac80211: do not advertise HE cap IE if HE disabled
When disabling HE due to the lack of HT/VHT, do it
at an earlier stage to avoid advertising HE capabilities IE.
Also, at this point, no need to check if AP supports HE, since
it is already checked earlier (in ieee80211_prep_channel).

Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-12-18 14:19:52 +01:00