71888 Commits

Author SHA1 Message Date
Linus Torvalds
75f4d9af8b iov_iter work; most of that is about getting rid of
direction misannotations and (hopefully) preventing
 more of the same for the future.
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHQEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCY5ZzQAAKCRBZ7Krx/gZQ
 65RZAP4nTkvOn0NZLVFkuGOx8pgJelXAvrteyAuecVL8V6CR4AD40qCVY51PJp8N
 MzwiRTeqnGDxTTF7mgd//IB6hoatAA==
 =bcvF
 -----END PGP SIGNATURE-----

Merge tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull iov_iter updates from Al Viro:
 "iov_iter work; most of that is about getting rid of direction
  misannotations and (hopefully) preventing more of the same for the
  future"

* tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  use less confusing names for iov_iter direction initializers
  iov_iter: saner checks for attempt to copy to/from iterator
  [xen] fix "direction" argument of iov_iter_kvec()
  [vhost] fix 'direction' argument of iov_iter_{init,bvec}()
  [target] fix iov_iter_bvec() "direction" argument
  [s390] memcpy_real(): WRITE is "data source", not destination...
  [s390] zcore: WRITE is "data source", not destination...
  [infiniband] READ is "data destination", not source...
  [fsi] WRITE is "data source", not destination...
  [s390] copy_oldmem_kernel() - WRITE is "data source", not destination
  csum_and_copy_to_iter(): handle ITER_DISCARD
  get rid of unlikely() on page_copy_sane() calls
2022-12-12 18:29:54 -08:00
Linus Torvalds
e2ed78d5d9 linux-kselftest-kunit-next-6.2-rc1
This KUnit next update for Linux 6.2-rc1 consists of several enhancements,
 fixes, clean-ups, documentation updates, improvements to logging and KTAP
 compliance of KUnit test output:
 
 - log numbers in decimal and hex
 - parse KTAP compliant test output
 - allow conditionally exposing static symbols to tests
   when KUNIT is enabled
 - make static symbols visible during kunit testing
 - clean-ups to remove unused structure definition
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmOXnPYACgkQCwJExA0N
 Qxwf9RAAwdBKxgPZuKZ40v69Jm8YhaO3vyKUkyYRH59/HQGFUHMA2f2ONez4krEX
 iXPgBFQ+7pB63FdgQi2HSg2z/u3xY02AaGgZGXDuNJDmg2xYjNDfZ0GjN6tuavlN
 Liz01DGZkjZoVVXM6oV2xT8woBg/0BbdkKNL1OBO9RBZFHzwDryRzfXmQb8cKlNr
 S+tkeZTlCA/s7UW2LNj4VlTzn6wgni4Y9gSk4wbQmSGWn3OX3rHaqAb7GiZ/yPGb
 1WjbMeE8FwyydLU40aOZZ8V6AJRiw5VGPJyFzWJyWZ21xOgN9Z95b+I36z8RXraA
 i/wnazO/FJsrhzvKL83rQkrSW6bpmVY+jGvk+L6deFM6Ro/vEWHJ4DgyKsIdMiJy
 gUM1Q69szptq+ZRHGrZWPlVONBkBXMOL+fePbCbGcMzlaEAS/zsFYW9IBKcvLzwP
 uHzzMS/cMmSUq52ZIyl9jhHQFVSoErCpJwQjAaZBQpYXPmE7yLcZItxnCaSUQTay
 bRwyps5ph5md0oJTTFJKZ4Zx5FJ2ItjbC4y9BIexb9gYRDdRq723ivDoVENZl/Zk
 DFIV95AY+mSxadS5vFagwWwX0ZN0KFKxeM8Tw7VTimal/0Sbglqp+oflsuKFD6JQ
 b5HUixYifKMbWxkH5xrUb8NdjmBj561TYa8U4N+j3oOiaPYu5Ss=
 =UQNn
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-next-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:
 "Several enhancements, fixes, clean-ups, documentation updates,
  improvements to logging and KTAP compliance of KUnit test output:

   - log numbers in decimal and hex

   - parse KTAP compliant test output

   - allow conditionally exposing static symbols to tests when KUNIT is
     enabled

   - make static symbols visible during kunit testing

   - clean-ups to remove unused structure definition"

* tag 'linux-kselftest-kunit-next-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (29 commits)
  Documentation: dev-tools: Clarify requirements for result description
  apparmor: test: make static symbols visible during kunit testing
  kunit: add macro to allow conditionally exposing static symbols to tests
  kunit: tool: make parser preserve whitespace when printing test log
  Documentation: kunit: Fix "How Do I Use This" / "Next Steps" sections
  kunit: tool: don't include KTAP headers and the like in the test log
  kunit: improve KTAP compliance of KUnit test output
  kunit: tool: parse KTAP compliant test output
  mm: slub: test: Use the kunit_get_current_test() function
  kunit: Use the static key when retrieving the current test
  kunit: Provide a static key to check if KUnit is actively running tests
  kunit: tool: make --json do nothing if --raw_ouput is set
  kunit: tool: tweak error message when no KTAP found
  kunit: remove KUNIT_INIT_MEM_ASSERTION macro
  Documentation: kunit: Remove redundant 'tips.rst' page
  Documentation: KUnit: reword description of assertions
  Documentation: KUnit: make usage.rst a superset of tips.rst, remove duplication
  kunit: eliminate KUNIT_INIT_*_ASSERT_STRUCT macros
  kunit: tool: remove redundant file.close() call in unit test
  kunit: tool: unit tests all check parser errors, standardize formatting a bit
  ...
2022-12-12 16:42:57 -08:00
Linus Torvalds
268325bda5 Random number generator updates for Linux 6.2-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmOU+U8ACgkQSfxwEqXe
 A67NnQ//Y5DltmvibyPd7r1TFT2gUYv+Rx3sUV9ZE1NYptd/SWhhcL8c5FZ70Fuw
 bSKCa1uiWjOxosjXT1kGrWq3de7q7oUpAPSOGxgxzoaNURIt58N/ajItCX/4Au8I
 RlGAScHy5e5t41/26a498kB6qJ441fBEqCYKQpPLINMBAhe8TQ+NVp0rlpUwNHFX
 WrUGg4oKWxdBIW3HkDirQjJWDkkAiklRTifQh/Al4b6QDbOnRUGGCeckNOhixsvS
 waHWTld+Td8jRrA4b82tUb2uVZ2/b8dEvj/A8CuTv4yC0lywoyMgBWmJAGOC+UmT
 ZVNdGW02Jc2T+Iap8ZdsEmeLHNqbli4+IcbY5xNlov+tHJ2oz41H9TZoYKbudlr6
 /ReAUPSn7i50PhbQlEruj3eg+M2gjOeh8OF8UKwwRK8PghvyWQ1ScW0l3kUhPIhI
 PdIG6j4+D2mJc1FIj2rTVB+Bg933x6S+qx4zDxGlNp62AARUFYf6EgyD6aXFQVuX
 RxcKb6cjRuFkzFiKc8zkqg5edZH+IJcPNuIBmABqTGBOxbZWURXzIQvK/iULqZa4
 CdGAFIs6FuOh8pFHLI3R4YoHBopbHup/xKDEeAO9KZGyeVIuOSERDxxo5f/ITzcq
 APvT77DFOEuyvanr8RMqqh0yUjzcddXqw9+ieufsAyDwjD9DTuE=
 =QRhK
 -----END PGP SIGNATURE-----

Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:

 - Replace prandom_u32_max() and various open-coded variants of it,
   there is now a new family of functions that uses fast rejection
   sampling to choose properly uniformly random numbers within an
   interval:

       get_random_u32_below(ceil) - [0, ceil)
       get_random_u32_above(floor) - (floor, U32_MAX]
       get_random_u32_inclusive(floor, ceil) - [floor, ceil]

   Coccinelle was used to convert all current users of
   prandom_u32_max(), as well as many open-coded patterns, resulting in
   improvements throughout the tree.

   I'll have a "late" 6.1-rc1 pull for you that removes the now unused
   prandom_u32_max() function, just in case any other trees add a new
   use case of it that needs to converted. According to linux-next,
   there may be two trivial cases of prandom_u32_max() reintroductions
   that are fixable with a 's/.../.../'. So I'll have for you a final
   conversion patch doing that alongside the removal patch during the
   second week.

   This is a treewide change that touches many files throughout.

 - More consistent use of get_random_canary().

 - Updates to comments, documentation, tests, headers, and
   simplification in configuration.

 - The arch_get_random*_early() abstraction was only used by arm64 and
   wasn't entirely useful, so this has been replaced by code that works
   in all relevant contexts.

 - The kernel will use and manage random seeds in non-volatile EFI
   variables, refreshing a variable with a fresh seed when the RNG is
   initialized. The RNG GUID namespace is then hidden from efivarfs to
   prevent accidental leakage.

   These changes are split into random.c infrastructure code used in the
   EFI subsystem, in this pull request, and related support inside of
   EFISTUB, in Ard's EFI tree. These are co-dependent for full
   functionality, but the order of merging doesn't matter.

 - Part of the infrastructure added for the EFI support is also used for
   an improvement to the way vsprintf initializes its siphash key,
   replacing an sleep loop wart.

 - The hardware RNG framework now always calls its correct random.c
   input function, add_hwgenerator_randomness(), rather than sometimes
   going through helpers better suited for other cases.

 - The add_latent_entropy() function has long been called from the fork
   handler, but is a no-op when the latent entropy gcc plugin isn't
   used, which is fine for the purposes of latent entropy.

   But it was missing out on the cycle counter that was also being mixed
   in beside the latent entropy variable. So now, if the latent entropy
   gcc plugin isn't enabled, add_latent_entropy() will expand to a call
   to add_device_randomness(NULL, 0), which adds a cycle counter,
   without the absent latent entropy variable.

 - The RNG is now reseeded from a delayed worker, rather than on demand
   when used. Always running from a worker allows it to make use of the
   CPU RNG on platforms like S390x, whose instructions are too slow to
   do so from interrupts. It also has the effect of adding in new inputs
   more frequently with more regularity, amounting to a long term
   transcript of random values. Plus, it helps a bit with the upcoming
   vDSO implementation (which isn't yet ready for 6.2).

 - The jitter entropy algorithm now tries to execute on many different
   CPUs, round-robining, in hopes of hitting even more memory latencies
   and other unpredictable effects. It also will mix in a cycle counter
   when the entropy timer fires, in addition to being mixed in from the
   main loop, to account more explicitly for fluctuations in that timer
   firing. And the state it touches is now kept within the same cache
   line, so that it's assured that the different execution contexts will
   cause latencies.

* tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits)
  random: include <linux/once.h> in the right header
  random: align entropy_timer_state to cache line
  random: mix in cycle counter when jitter timer fires
  random: spread out jitter callback to different CPUs
  random: remove extraneous period and add a missing one in comments
  efi: random: refresh non-volatile random seed when RNG is initialized
  vsprintf: initialize siphash key using notifier
  random: add back async readiness notifier
  random: reseed in delayed work rather than on-demand
  random: always mix cycle counter in add_latent_entropy()
  hw_random: use add_hwgenerator_randomness() for early entropy
  random: modernize documentation comment on get_random_bytes()
  random: adjust comment to account for removed function
  random: remove early archrandom abstraction
  random: use random.trust_{bootloader,cpu} command line option only
  stackprotector: actually use get_random_canary()
  stackprotector: move get_random_canary() into stackprotector.h
  treewide: use get_random_u32_inclusive() when possible
  treewide: use get_random_u32_{above,below}() instead of manual loop
  treewide: use get_random_u32_below() instead of deprecated function
  ...
2022-12-12 16:22:22 -08:00
Coco Li
89300468e2 IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
IPv6/TCP and GRO stacks can build big TCP packets with an added
temporary Hop By Hop header.

Is GSO is not involved, then the temporary header needs to be removed in
the driver. This patch provides a generic helper for drivers that need
to modify their headers in place.

Tested:
Compiled and ran with ethtool -K eth1 tso off
Could send Big TCP packets

Signed-off-by: Coco Li <lixiaoyan@google.com>
Link: https://lore.kernel.org/r/20221210041646.3587757-1-lixiaoyan@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:41:44 -08:00
Ido Schimmel
61f2183512 bridge: mcast: Support replacement of MDB port group entries
Now that user space can specify additional attributes of port group
entries such as filter mode and source list, it makes sense to allow
user space to atomically modify these attributes by replacing entries
instead of forcing user space to delete the entries and add them back.

Replace MDB port group entries when the 'NLM_F_REPLACE' flag is
specified in the netlink message header.

When a (*, G) entry is replaced, update the following attributes: Source
list, state, filter mode, protocol and flags. If the entry is temporary
and in EXCLUDE mode, reset the group timer to the group membership
interval. If the entry is temporary and in INCLUDE mode, reset the
source timers of associated sources to the group membership interval.

Examples:

 # bridge mdb replace dev br0 port dummy10 grp 239.1.1.1 permanent source_list 192.0.2.1,192.0.2.2 filter_mode include
 # bridge -d -s mdb show
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.2 permanent filter_mode include proto static     0.00
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent filter_mode include proto static     0.00
 dev br0 port dummy10 grp 239.1.1.1 permanent filter_mode include source_list 192.0.2.2/0.00,192.0.2.1/0.00 proto static     0.00

 # bridge mdb replace dev br0 port dummy10 grp 239.1.1.1 permanent source_list 192.0.2.1,192.0.2.3 filter_mode exclude proto zebra
 # bridge -d -s mdb show
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.3 permanent filter_mode include proto zebra  blocked    0.00
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent filter_mode include proto zebra  blocked    0.00
 dev br0 port dummy10 grp 239.1.1.1 permanent filter_mode exclude source_list 192.0.2.3/0.00,192.0.2.1/0.00 proto zebra     0.00

 # bridge mdb replace dev br0 port dummy10 grp 239.1.1.1 temp source_list 192.0.2.4,192.0.2.3 filter_mode include proto bgp
 # bridge -d -s mdb show
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.4 temp filter_mode include proto bgp     0.00
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.3 temp filter_mode include proto bgp     0.00
 dev br0 port dummy10 grp 239.1.1.1 temp filter_mode include source_list 192.0.2.4/259.44,192.0.2.3/259.44 proto bgp     0.00

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:37 -08:00
Ido Schimmel
1d7b66a7d9 bridge: mcast: Allow user space to specify MDB entry routing protocol
Add the 'MDBE_ATTR_RTPORT' attribute to allow user space to specify the
routing protocol of the MDB port group entry. Enforce a minimum value of
'RTPROT_STATIC' to prevent user space from using protocol values that
should only be set by the kernel (e.g., 'RTPROT_KERNEL'). Maintain
backward compatibility by defaulting to 'RTPROT_STATIC'.

The protocol is already visible to user space in RTM_NEWMDB responses
and notifications via the 'MDBA_MDB_EATTR_RTPROT' attribute.

The routing protocol allows a routing daemon to distinguish between
entries configured by it and those configured by the administrator. Once
MDB flush is supported, the protocol can be used as a criterion
according to which the flush is performed.

Examples:

 # bridge mdb add dev br0 port dummy10 grp 239.1.1.1 permanent proto kernel
 Error: integer out of range.

 # bridge mdb add dev br0 port dummy10 grp 239.1.1.1 permanent proto static

 # bridge mdb add dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent proto zebra

 # bridge mdb add dev br0 port dummy10 grp 239.1.1.2 permanent source_list 198.51.100.1,198.51.100.2 filter_mode include proto 250

 # bridge -d mdb show
 dev br0 port dummy10 grp 239.1.1.2 src 198.51.100.2 permanent filter_mode include proto 250
 dev br0 port dummy10 grp 239.1.1.2 src 198.51.100.1 permanent filter_mode include proto 250
 dev br0 port dummy10 grp 239.1.1.2 permanent filter_mode include source_list 198.51.100.2/0.00,198.51.100.1/0.00 proto 250
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent filter_mode include proto zebra
 dev br0 port dummy10 grp 239.1.1.1 permanent filter_mode exclude proto static

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:37 -08:00
Ido Schimmel
6afaae6d12 bridge: mcast: Allow user space to add (*, G) with a source list and filter mode
Add new netlink attributes to the RTM_NEWMDB request that allow user
space to add (*, G) with a source list and filter mode.

The RTM_NEWMDB message can already dump such entries (created by the
kernel) so there is no need to add dump support. However, the message
contains a different set of attributes depending if it is a request or a
response. The naming and structure of the new attributes try to follow
the existing ones used in the response.

Request:

[ struct nlmsghdr ]
[ struct br_port_msg ]
[ MDBA_SET_ENTRY ]
	struct br_mdb_entry
[ MDBA_SET_ENTRY_ATTRS ]
	[ MDBE_ATTR_SOURCE ]
		struct in_addr / struct in6_addr
	[ MDBE_ATTR_SRC_LIST ]		// new
		[ MDBE_SRC_LIST_ENTRY ]
			[ MDBE_SRCATTR_ADDRESS ]
				struct in_addr / struct in6_addr
		[ ...]
	[ MDBE_ATTR_GROUP_MODE ]	// new
		u8

Response:

[ struct nlmsghdr ]
[ struct br_port_msg ]
[ MDBA_MDB ]
	[ MDBA_MDB_ENTRY ]
		[ MDBA_MDB_ENTRY_INFO ]
			struct br_mdb_entry
		[ MDBA_MDB_EATTR_TIMER ]
			u32
		[ MDBA_MDB_EATTR_SOURCE ]
			struct in_addr / struct in6_addr
		[ MDBA_MDB_EATTR_RTPROT ]
			u8
		[ MDBA_MDB_EATTR_SRC_LIST ]
			[ MDBA_MDB_SRCLIST_ENTRY ]
				[ MDBA_MDB_SRCATTR_ADDRESS ]
					struct in_addr / struct in6_addr
				[ MDBA_MDB_SRCATTR_TIMER ]
					u8
			[...]
		[ MDBA_MDB_EATTR_GROUP_MODE ]
			u8

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:37 -08:00
Ido Schimmel
b1c8fec8d4 bridge: mcast: Add support for (*, G) with a source list and filter mode
In preparation for allowing user space to add (*, G) entries with a
source list and associated filter mode, add the necessary plumbing to
handle such requests.

Extend the MDB configuration structure with a currently empty source
array and filter mode that is currently hard coded to EXCLUDE.

Add the source entries and the corresponding (S, G) entries before
making the new (*, G) port group entry visible to the data path.

Handle the creation of each source entry in a similar fashion to how it
is created from the data path in response to received Membership
Reports: Create the source entry, arm the source timer (if needed), add
a corresponding (S, G) forwarding entry and finally mark the source
entry as installed (by user space).

Add the (S, G) entry by populating an MDB configuration structure and
calling br_mdb_add_group_sg() as if a new entry is created by user
space, with the sole difference that the 'src_entry' field is set to
make sure that the group timer of such entries is never armed.

Note that it is not currently possible to add more than 32 source
entries to a port group entry. If this proves to be a problem we can
either increase 'PG_SRC_ENT_LIMIT' or avoid forcing a limit on entries
created by user space.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:37 -08:00
Ido Schimmel
079afd6616 bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source
User space will soon be able to install a (*, G) with a source list,
prompting the creation of a (S, G) entry for each source.

In this case, the group timer of the (S, G) entry should never be set.

Solve this by adding a new field to the MDB configuration structure that
denotes whether the (S, G) corresponds to a source or not.

The field will be set in a subsequent patch where br_mdb_add_group_sg()
is called in order to create a (S, G) entry for each user provided
source.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:37 -08:00
Ido Schimmel
a01ecb1712 bridge: mcast: Add a flag for user installed source entries
There are a few places where the bridge driver differentiates between
(S, G) entries installed by the kernel (in response to Membership
Reports) and those installed by user space. One of them is when deleting
an (S, G) entry corresponding to a source entry that is being deleted.

While user space cannot currently add a source entry to a (*, G), it can
add an (S, G) entry that later corresponds to a source entry created by
the reception of a Membership Report. If this source entry is later
deleted because its source timer expired or because the (*, G) entry is
being deleted, the bridge driver will not delete the corresponding (S,
G) entry if it was added by user space as permanent.

This is going to be a problem when the ability to install a (*, G) with
a source list is exposed to user space. In this case, when user space
installs the (*, G) as permanent, then all the (S, G) entries
corresponding to its source list will also be installed as permanent.
When user space deletes the (*, G), all the source entries will be
deleted and the expectation is that the corresponding (S, G) entries
will be deleted as well.

Solve this by introducing a new source entry flag denoting that the
entry was installed by user space. When the entry is deleted, delete the
corresponding (S, G) entry even if it was installed by user space as
permanent, as the flag tells us that it was installed in response to the
source entry being created.

The flag will be set in a subsequent patch where source entries are
created in response to user requests.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:37 -08:00
Ido Schimmel
083e353482 bridge: mcast: Expose __br_multicast_del_group_src()
Expose __br_multicast_del_group_src() which is symmetric to
br_multicast_new_group_src() and does not remove the installed {S, G}
forwarding entry, unlike br_multicast_del_group_src().

The function will be used in the error path when user space was able to
add a new source entry, but failed to install a corresponding forwarding
entry.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:37 -08:00
Ido Schimmel
fd0c696164 bridge: mcast: Expose br_multicast_new_group_src()
Currently, new group source entries are only created in response to
received Membership Reports. Subsequent patches are going to allow user
space to install (*, G) entries with a source list.

As a preparatory step, expose br_multicast_new_group_src() so that it
could later be invoked from the MDB code (i.e., br_mdb.c) that handles
RTM_NEWMDB messages.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:37 -08:00
Ido Schimmel
160dd93114 bridge: mcast: Add a centralized error path
Subsequent patches will add memory allocations in br_mdb_config_init()
as the MDB configuration structure will include a linked list of source
entries. This memory will need to be freed regardless if br_mdb_add()
succeeded or failed.

As a preparation for this change, add a centralized error path where the
memory will be freed.

Note that br_mdb_del() already has one error path and therefore does not
require any changes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:37 -08:00
Ido Schimmel
1870a2d35a bridge: mcast: Place netlink policy before validation functions
Subsequent patches are going to add additional validation functions and
netlink policies. Some of these functions will need to perform parsing
using nla_parse_nested() and the new policies.

In order to keep all the policies next to each other, move the current
policy to before the validation functions.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:36 -08:00
Ido Schimmel
6ff1e68eb2 bridge: mcast: Split (*, G) and (S, G) addition into different functions
When the bridge is using IGMP version 3 or MLD version 2, it handles the
addition of (*, G) and (S, G) entries differently.

When a new (S, G) port group entry is added, all the (*, G) EXCLUDE
ports need to be added to the port group of the new entry. Similarly,
when a new (*, G) EXCLUDE port group entry is added, the port needs to
be added to the port group of all the matching (S, G) entries.

Subsequent patches will create more differences between both entry
types. Namely, filter mode and source list can only be specified for (*,
G) entries.

Given the current and future differences between both entry types,
handle the addition of each entry type in a different function, thereby
avoiding the creation of one complex function.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:36 -08:00
Ido Schimmel
b63e30651c bridge: mcast: Do not derive entry type from its filter mode
Currently, the filter mode (i.e., INCLUDE / EXCLUDE) of MDB entries
cannot be set from user space. Instead, it is set by the kernel
according to the entry type: (*, G) entries are treated as EXCLUDE and
(S, G) entries are treated as INCLUDE. This allows the kernel to derive
the entry type from its filter mode.

Subsequent patches will allow user space to set the filter mode of (*,
G) entries, making the current assumption incorrect.

As a preparation, remove the current assumption and instead determine
the entry type from its key, which is a more direct way.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:33:36 -08:00
Vladimir Oltean
e095493091 net: dsa: tag_8021q: avoid leaking ctx on dsa_tag_8021q_register() error path
If dsa_tag_8021q_setup() fails, for example due to the inability of the
device to install a VLAN, the tag_8021q context of the switch will leak.
Make sure it is freed on the error path.

Fixes: 328621f6131f ("net: dsa: tag_8021q: absorb dsa_8021q_setup into dsa_tag_8021q_{,un}register")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20221209235242.480344-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:27:52 -08:00
Xin Long
cb54d39227 net: failover: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf
Similar to Bonding and Team, to prevent ipv6 addrconf with
IFF_NO_ADDRCONF in slave_dev->priv_flags for slave ports
is also needed in net failover.

Note that dev_open(slave_dev) is called in .slave_register,
which is called after the IFF_NO_ADDRCONF flag is set in
failover_slave_register().

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:18:25 -08:00
Xin Long
8a321cf7be net: add IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconf
Currently, in bonding it reused the IFF_SLAVE flag and checked it
in ipv6 addrconf to prevent ipv6 addrconf.

However, it is not a proper flag to use for no ipv6 addrconf, for
bonding it has to move IFF_SLAVE flag setting ahead of dev_open()
in bond_enslave(). Also, IFF_MASTER/SLAVE are historical flags
used in bonding and eql, as Jiri mentioned, the new devices like
Team, Failover do not use this flag.

So as Jiri suggested, this patch adds IFF_NO_ADDRCONF in priv_flags
of the device to indicate no ipv6 addconf, and uses it in bonding
and moves IFF_SLAVE flag setting back to its original place.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:18:25 -08:00
Yunsheng Lin
d7b061b80e net: tso: inline tso_count_descs()
tso_count_descs() is a small function doing simple calculation,
and tso_count_descs() is used in fast path, so inline it to
reduce the overhead of calls.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Link: https://lore.kernel.org/r/20221212032426.16050-1-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:04:39 -08:00
Vladimir Oltean
8f18655c49 net: dsa: don't call ptp_classify_raw() if switch doesn't provide RX timestamping
ptp_classify_raw() is not exactly cheap, since it invokes a BPF program
for every skb in the receive path. For switches which do not provide
ds->ops->port_rxtstamp(), running ptp_classify_raw() provides precisely
nothing, so check for the presence of the function pointer first, since
that is much cheaper.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20221209175840.390707-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 15:03:21 -08:00
Jakub Kicinski
4cc58a087d bluetooth-next pull request for net-next:
- Add a new VID/PID 0489/e0f2 for MT7922
  - Add Realtek RTL8852BE support ID 0x0cb8:0xc559
  - Add a new PID/VID 13d3/3549 for RTL8822CU
  - Add support for broadcom BCM43430A0 & BCM43430A1
  - Add CONFIG_BT_HCIBTUSB_POLL_SYNC
  - Add CONFIG_BT_LE_L2CAP_ECRED
  - Add support for CYW4373A0
  - Add support for RTL8723DS
  - Add more device IDs for WCN6855
  - Add Broadcom BCM4377 family PCIe Bluetooth
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmOXqQYZHGx1aXoudm9u
 LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKa4LEACMpjtWb7IhfSt42FC+p7sf
 gXLZ1GOEZnP3x+rJUXfHRdVfRvkpABixnEFuqbzkCZLQMy6W01abtBJ0xC7mADIu
 CKvef5IjUdEQJWbbAE5ZwFPxdWbWjoBnO39crl9W1hlpxatLy0QEFfC0SqZRpiiZ
 Jo2BZ8g6eFP5l1GARhW+B1idjxaXI7JIg41s43PFFHOK/TkfpQxCVCeg4bv36wTD
 Pp14oixAmJKXsxQoozWJcY9zLy6/6KEedfBzj0mKrOG5/v0O8YyXKoe2T0yjx1d8
 t/VSV+XquYiyJ6FA9FsW8spS5GFxGjuEvKagGn11t+in9GIb8t1ve4oAL+y43nC8
 br/Sx8FVbFRWu5UA0Xv4issEMp/s/1Zty7U15CHiaFwv6VNZuyXDDLZVzvgUULLl
 3ikBrM1JG/AKdoHiIjwmv4qLpteZ7WWOqb3qT3cPMdnUaGBd9fMJ17qCRYs8roW7
 9QoGcHUfsPUUAiM6OyBz5L2HbfZuZTdVWvPVqoeaQ7LEmGeqKe4KnGeep0fVpVe8
 viKGtLbpi4+S/2IKQvEVvulmsXOved2uJef/x+leH8HShPDceQU9c+TG8RgxOygy
 f8dhAgn5wV+AgaKzLDJj2xT3xb+0j9jctTWX5SDjvHUPmu23bcNcsA0jv+dknMFh
 VhOZTE+Qze08ZVGfnRT4uw==
 =Oj2B
 -----END PGP SIGNATURE-----

Merge tag 'for-net-next-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Luiz Augusto von Dentz says:

====================
bluetooth-next pull request for net-next:

 - Add a new VID/PID 0489/e0f2 for MT7922
 - Add Realtek RTL8852BE support ID 0x0cb8:0xc559
 - Add a new PID/VID 13d3/3549 for RTL8822CU
 - Add support for broadcom BCM43430A0 & BCM43430A1
 - Add CONFIG_BT_HCIBTUSB_POLL_SYNC
 - Add CONFIG_BT_LE_L2CAP_ECRED
 - Add support for CYW4373A0
 - Add support for RTL8723DS
 - Add more device IDs for WCN6855
 - Add Broadcom BCM4377 family PCIe Bluetooth

* tag 'for-net-next-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (51 commits)
  Bluetooth: Wait for HCI_OP_WRITE_AUTH_PAYLOAD_TO to complete
  Bluetooth: ISO: Avoid circular locking dependency
  Bluetooth: RFCOMM: don't call kfree_skb() under spin_lock_irqsave()
  Bluetooth: hci_core: don't call kfree_skb() under spin_lock_irqsave()
  Bluetooth: hci_bcsp: don't call kfree_skb() under spin_lock_irqsave()
  Bluetooth: hci_h5: don't call kfree_skb() under spin_lock_irqsave()
  Bluetooth: hci_ll: don't call kfree_skb() under spin_lock_irqsave()
  Bluetooth: hci_qca: don't call kfree_skb() under spin_lock_irqsave()
  Bluetooth: btusb: don't call kfree_skb() under spin_lock_irqsave()
  Bluetooth: btintel: Fix missing free skb in btintel_setup_combined()
  Bluetooth: hci_conn: Fix crash on hci_create_cis_sync
  Bluetooth: btintel: Fix existing sparce warnings
  Bluetooth: btusb: Fix existing sparce warning
  Bluetooth: btusb: Fix new sparce warnings
  Bluetooth: btusb: Add a new PID/VID 13d3/3549 for RTL8822CU
  Bluetooth: btusb: Add Realtek RTL8852BE support ID 0x0cb8:0xc559
  dt-bindings: net: realtek-bluetooth: Add RTL8723DS
  Bluetooth: btusb: Add a new VID/PID 0489/e0f2 for MT7922
  dt-bindings: bluetooth: broadcom: add BCM43430A0 & BCM43430A1
  Bluetooth: hci_bcm4377: Fix missing pci_disable_device() on error in bcm4377_probe()
  ...
====================

Link: https://lore.kernel.org/r/20221212222322.1690780-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 14:51:29 -08:00
Jakub Kicinski
95d1815f09 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

1) Incorrect error check in nft_expr_inner_parse(), from Dan Carpenter.

2) Add DATA_SENT state to SCTP connection tracking helper, from
   Sriram Yagnaraman.

3) Consolidate nf_confirm for ipv4 and ipv6, from Florian Westphal.

4) Add bitmask support for ipset, from Vishwanath Pai.

5) Handle icmpv6 redirects as RELATED, from Florian Westphal.

6) Add WARN_ON_ONCE() to impossible case in flowtable datapath,
   from Li Qiong.

7) A large batch of IPVS updates to replace timer-based estimators by
   kthreads to scale up wrt. CPUs and workload (millions of estimators).

Julian Anastasov says:

	This patchset implements stats estimation in kthread context.
It replaces the code that runs on single CPU in timer context every 2
seconds and causing latency splats as shown in reports [1], [2], [3].
The solution targets setups with thousands of IPVS services,
destinations and multi-CPU boxes.

	Spread the estimation on multiple (configured) CPUs and multiple
time slots (timer ticks) by using multiple chains organized under RCU
rules.  When stats are not needed, it is recommended to use
run_estimation=0 as already implemented before this change.

RCU Locking:

- As stats are now RCU-locked, tot_stats, svc and dest which
hold estimator structures are now always freed from RCU
callback. This ensures RCU grace period after the
ip_vs_stop_estimator() call.

Kthread data:

- every kthread works over its own data structure and all
such structures are attached to array. For now we limit
kthreads depending on the number of CPUs.

- even while there can be a kthread structure, its task
may not be running, eg. before first service is added or
while the sysctl var is set to an empty cpulist or
when run_estimation is set to 0 to disable the estimation.

- the allocated kthread context may grow from 1 to 50
allocated structures for timer ticks which saves memory for
setups with small number of estimators

- a task and its structure may be released if all
estimators are unlinked from its chains, leaving the
slot in the array empty

- every kthread data structure allows limited number
of estimators. Kthread 0 is also used to initially
calculate the max number of estimators to allow in every
chain considering a sub-100 microsecond cond_resched
rate. This number can be from 1 to hundreds.

- kthread 0 has an additional job of optimizing the
adding of estimators: they are first added in
temp list (est_temp_list) and later kthread 0
distributes them to other kthreads. The optimization
is based on the fact that newly added estimator
should be estimated after 2 seconds, so we have the
time to offload the adding to chain from controlling
process to kthread 0.

- to add new estimators we use the last added kthread
context (est_add_ktid). The new estimators are linked to
the chains just before the estimated one, based on add_row.
This ensures their estimation will start after 2 seconds.
If estimators are added in bursts, common case if all
services and dests are initially configured, we may
spread the estimators to more chains and as result,
reducing the initial delay below 2 seconds.

Many thanks to Jiri Wiesner for his valuable comments
and for spending a lot of time reviewing and testing
the changes on different platforms with 48-256 CPUs and
1-8 NUMA nodes under different cpufreq governors.

The new IPVS estimators do not use workqueue infrastructure
because:

- The estimation can take long time when using multiple IPVS rules (eg.
  millions estimator structures) and especially when box has multiple
  CPUs due to the for_each_possible_cpu usage that expects packets from
  any CPU. With est_nice sysctl we have more control how to prioritize the
  estimation kthreads compared to other processes/kthreads that have
  latency requirements (such as servers). As a benefit, we can see these
  kthreads in top and decide if we will need some further control to limit
  their CPU usage (max number of structure to estimate per kthread).

- with kthreads we run code that is read-mostly, no write/lock
  operations to process the estimators in 2-second intervals.

- work items are one-shot: as estimators are processed every
  2 seconds, they need to be re-added every time. This again
  loads the timers (add_timer) if we use delayed works, as there are
  no kthreads to do the timings.

[1] Report from Yunhong Jiang:
    https://lore.kernel.org/netdev/D25792C1-1B89-45DE-9F10-EC350DC04ADC@gmail.com/
[2] https://marc.info/?l=linux-virtual-server&m=159679809118027&w=2
[3] Report from Dust:
    https://archive.linuxvirtualserver.org/html/lvs-devel/2020-12/msg00000.html

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  ipvs: run_estimation should control the kthread tasks
  ipvs: add est_cpulist and est_nice sysctl vars
  ipvs: use kthreads for stats estimation
  ipvs: use u64_stats_t for the per-cpu counters
  ipvs: use common functions for stats allocation
  ipvs: add rcu protection to stats
  netfilter: flowtable: add a 'default' case to flowtable datapath
  netfilter: conntrack: set icmpv6 redirects as RELATED
  netfilter: ipset: Add support for new bitmask parameter
  netfilter: conntrack: merge ipv4+ipv6 confirm functions
  netfilter: conntrack: add sctp DATA_SENT state
  netfilter: nft_inner: fix IS_ERR() vs NULL check
====================

Link: https://lore.kernel.org/r/20221211101204.1751-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 14:45:36 -08:00
Luiz Augusto von Dentz
7aca0ac479 Bluetooth: Wait for HCI_OP_WRITE_AUTH_PAYLOAD_TO to complete
This make sure HCI_OP_WRITE_AUTH_PAYLOAD_TO completes before notifying
the encryption change just as is done with HCI_OP_READ_ENC_KEY_SIZE.

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:26 -08:00
Luiz Augusto von Dentz
241f51931c Bluetooth: ISO: Avoid circular locking dependency
This attempts to avoid circular locking dependency between sock_lock
and hdev_lock:

WARNING: possible circular locking dependency detected
6.0.0-rc7-03728-g18dd8ab0a783 #3 Not tainted
------------------------------------------------------
kworker/u3:2/53 is trying to acquire lock:
ffff888000254130 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}, at:
iso_conn_del+0xbd/0x1d0
but task is already holding lock:
ffffffff9f39a080 (hci_cb_list_lock){+.+.}-{3:3}, at:
hci_le_cis_estabilished_evt+0x1b5/0x500
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (hci_cb_list_lock){+.+.}-{3:3}:
       __mutex_lock+0x10e/0xfe0
       hci_le_remote_feat_complete_evt+0x17f/0x320
       hci_event_packet+0x39c/0x7d0
       hci_rx_work+0x2bf/0x950
       process_one_work+0x569/0x980
       worker_thread+0x2a3/0x6f0
       kthread+0x153/0x180
       ret_from_fork+0x22/0x30
-> #1 (&hdev->lock){+.+.}-{3:3}:
       __mutex_lock+0x10e/0xfe0
       iso_connect_cis+0x6f/0x5a0
       iso_sock_connect+0x1af/0x710
       __sys_connect+0x17e/0x1b0
       __x64_sys_connect+0x37/0x50
       do_syscall_64+0x43/0x90
       entry_SYSCALL_64_after_hwframe+0x62/0xcc
-> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}:
       __lock_acquire+0x1b51/0x33d0
       lock_acquire+0x16f/0x3b0
       lock_sock_nested+0x32/0x80
       iso_conn_del+0xbd/0x1d0
       iso_connect_cfm+0x226/0x680
       hci_le_cis_estabilished_evt+0x1ed/0x500
       hci_event_packet+0x39c/0x7d0
       hci_rx_work+0x2bf/0x950
       process_one_work+0x569/0x980
       worker_thread+0x2a3/0x6f0
       kthread+0x153/0x180
       ret_from_fork+0x22/0x30
other info that might help us debug this:
Chain exists of:
  sk_lock-AF_BLUETOOTH-BTPROTO_ISO --> &hdev->lock --> hci_cb_list_lock
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(hci_cb_list_lock);
                               lock(&hdev->lock);
                               lock(hci_cb_list_lock);
  lock(sk_lock-AF_BLUETOOTH-BTPROTO_ISO);
 *** DEADLOCK ***
4 locks held by kworker/u3:2/53:
 #0: ffff8880021d9130 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
 process_one_work+0x4ad/0x980
 #1: ffff888002387de0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
 at: process_one_work+0x4ad/0x980
 #2: ffff888001ac0070 (&hdev->lock){+.+.}-{3:3}, at:
 hci_le_cis_estabilished_evt+0xc3/0x500
 #3: ffffffff9f39a080 (hci_cb_list_lock){+.+.}-{3:3}, at:
 hci_le_cis_estabilished_evt+0x1b5/0x500

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:26 -08:00
Yang Yingliang
0ba18967d4 Bluetooth: RFCOMM: don't call kfree_skb() under spin_lock_irqsave()
It is not allowed to call kfree_skb() from hardware interrupt
context or with interrupts being disabled. So replace kfree_skb()
with dev_kfree_skb_irq() under spin_lock_irqsave().

Fixes: 81be03e026dc ("Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:26 -08:00
Yang Yingliang
39c1eb6fcb Bluetooth: hci_core: don't call kfree_skb() under spin_lock_irqsave()
It is not allowed to call kfree_skb() from hardware interrupt
context or with interrupts being disabled. So replace kfree_skb()
with dev_kfree_skb_irq() under spin_lock_irqsave().

Fixes: 9238f36a5a50 ("Bluetooth: Add request cmd_complete and cmd_status functions")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:26 -08:00
Luiz Augusto von Dentz
50757a259b Bluetooth: hci_conn: Fix crash on hci_create_cis_sync
When attempting to connect multiple ISO sockets without using
DEFER_SETUP may result in the following crash:

BUG: KASAN: null-ptr-deref in hci_create_cis_sync+0x18b/0x2b0
Read of size 2 at addr 0000000000000036 by task kworker/u3:1/50

CPU: 0 PID: 50 Comm: kworker/u3:1 Not tainted
6.0.0-rc7-02243-gb84a13ff4eda #4373
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS 1.16.0-1.fc36 04/01/2014
Workqueue: hci0 hci_cmd_sync_work
Call Trace:
 <TASK>
 dump_stack_lvl+0x19/0x27
 kasan_report+0xbc/0xf0
 ? hci_create_cis_sync+0x18b/0x2b0
 hci_create_cis_sync+0x18b/0x2b0
 ? get_link_mode+0xd0/0xd0
 ? __ww_mutex_lock_slowpath+0x10/0x10
 ? mutex_lock+0xe0/0xe0
 ? get_link_mode+0xd0/0xd0
 hci_cmd_sync_work+0x111/0x190
 process_one_work+0x427/0x650
 worker_thread+0x87/0x750
 ? process_one_work+0x650/0x650
 kthread+0x14e/0x180
 ? kthread_exit+0x50/0x50
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:25 -08:00
Sven Peter
ffcb0a445e Bluetooth: Add quirk to disable MWS Transport Configuration
Broadcom 4378/4387 controllers found in Apple Silicon Macs claim to
support getting MWS Transport Layer Configuration,

< HCI Command: Read Local Supported... (0x04|0x0002) plen 0
> HCI Event: Command Complete (0x0e) plen 68
      Read Local Supported Commands (0x04|0x0002) ncmd 1
        Status: Success (0x00)
[...]
          Get MWS Transport Layer Configuration (Octet 30 - Bit 3)]
[...]

, but then don't actually allow the required command:

> HCI Event: Command Complete (0x0e) plen 15
      Get MWS Transport Layer Configuration (0x05|0x000c) ncmd 1
        Status: Command Disallowed (0x0c)
        Number of transports: 0
        Baud rate list: 0 entries
        00 00 00 00 00 00 00 00 00 00

Signed-off-by: Sven Peter <sven@svenpeter.dev>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:24 -08:00
Sven Peter
ad38e55e1c Bluetooth: hci_event: Ignore reserved bits in LE Extended Adv Report
Broadcom controllers present on Apple Silicon devices use the upper
8 bits of the event type in the LE Extended Advertising Report for
the channel on which the frame has been received.
These bits are reserved according to the Bluetooth spec anyway such that
we can just drop them to ensure that the advertising results are parsed
correctly.

The following excerpt from a btmon trace shows a report received on
channel 37 by these controllers:

> HCI Event: LE Meta Event (0x3e) plen 55
      LE Extended Advertising Report (0x0d)
        Num reports: 1
        Entry 0
          Event type: 0x2513
            Props: 0x0013
              Connectable
              Scannable
              Use legacy advertising PDUs
            Data status: Complete
            Reserved (0x2500)
          Legacy PDU Type: Reserved (0x2513)
          Address type: Public (0x00)
          Address: XX:XX:XX:XX:XX:XX (Shenzhen Jingxun Software [...])
          Primary PHY: LE 1M
          Secondary PHY: No packets
          SID: no ADI field (0xff)
          TX power: 127 dBm
          RSSI: -76 dBm (0xb4)
          Periodic advertising interval: 0.00 msec (0x0000)
          Direct address type: Public (0x00)
          Direct address: 00:00:00:00:00:00 (OUI 00-00-00)
          Data length: 0x1d
          [...]
        Flags: 0x18
          Simultaneous LE and BR/EDR (Controller)
          Simultaneous LE and BR/EDR (Host)
        Company: Harman International Industries, Inc. (87)
          Data: [...]
        Service Data (UUID 0xfddf):
        Name (complete): JBL Flip 5

Signed-off-by: Sven Peter <sven@svenpeter.dev>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:24 -08:00
Kang Minchul
3958e87783 Bluetooth: Use kzalloc instead of kmalloc/memset
Replace kmalloc+memset by kzalloc
for better readability and simplicity.

This addresses the cocci warning below:

WARNING: kzalloc should be used for d, instead of kmalloc/memset

Signed-off-by: Kang Minchul <tegongkang@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:24 -08:00
Christophe JAILLET
63db780a93 Bluetooth: Fix EALREADY and ELOOP cases in bt_status()
'err' is known to be <0 at this point.

So, some cases can not be reached because of a missing "-".
Add it.

Fixes: ca2045e059c3 ("Bluetooth: Add bt_status")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:24 -08:00
Luiz Augusto von Dentz
462fcd5392 Bluetooth: Add CONFIG_BT_LE_L2CAP_ECRED
This adds CONFIG_BT_LE_L2CAP_ECRED which can be used to enable L2CAP
Enhanced Credit Flow Control Mode by default, previously it was only
possible to set it via module parameter (e.g. bluetooth.enable_ecred=1).

Since L2CAP ECRED mode is required by the likes of EATT which is
recommended for LE Audio this enables it by default.

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-By: Tedd Ho-Jeong An <tedd.an@intel.com>
2022-12-12 14:19:24 -08:00
Inga Stotland
3b1c7c00b8 Bluetooth: MGMT: Fix error report for ADD_EXT_ADV_PARAMS
When validating the parameter length for MGMT_OP_ADD_EXT_ADV_PARAMS
command, use the correct op code in error status report:
was MGMT_OP_ADD_ADVERTISING, changed to MGMT_OP_ADD_EXT_ADV_PARAMS.

Fixes: 12410572833a2 ("Bluetooth: Break add adv into two mgmt commands")
Signed-off-by: Inga Stotland <inga.stotland@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:23 -08:00
Yang Yingliang
0d75da38e0 Bluetooth: hci_core: fix error handling in hci_register_dev()
If hci_register_suspend_notifier() returns error, the hdev and rfkill
are leaked. We could disregard the error and print a warning message
instead to avoid leaks, as it just means we won't be handing suspend
requests.

Fixes: 9952d90ea288 ("Bluetooth: Handle PM_SUSPEND_PREPARE and PM_POST_SUSPEND")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:23 -08:00
Jiapeng Chong
37224a2908 Bluetooth: Use kzalloc instead of kmalloc/memset
Use kzalloc rather than duplicating its implementation, which makes code
simple and easy to understand.

./net/bluetooth/hci_conn.c:2038:6-13: WARNING: kzalloc should be used for cp, instead of kmalloc/memset.

Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2406
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:23 -08:00
Pauli Virtanen
d11ab690c3 Bluetooth: hci_conn: use HCI dst_type values also for BIS
For ISO BIS related functions in hci_conn.c, make dst_type values be HCI
address type values, not ISO socket address type values.  This makes it
consistent with CIS functions.

Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:23 -08:00
Archie Pusaka
97dfaf073f Bluetooth: hci_sync: cancel cmd_timer if hci_open failed
If a command is already sent, we take care of freeing it, but we
also need to cancel the timeout as well.

Signed-off-by: Archie Pusaka <apusaka@chromium.org>
Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-12 14:19:23 -08:00
Luiz Augusto von Dentz
eeb1aafe97 Bluetooth: hci_sync: Fix not able to set force_static_address
force_static_address shall be writable while hdev is initing but is not
considered powered yet since the static address is written only when
powered.

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Brian Gix <brian.gix@intel.com>
2022-12-12 14:19:23 -08:00
Luiz Augusto von Dentz
e411443c32 Bluetooth: hci_sync: Fix not setting static address
This attempts to program the address stored in hdev->static_addr after
the init sequence has been complete:

@ MGMT Command: Set Static A.. (0x002b) plen 6
        Address: C0:55:44:33:22:11 (Static)
@ MGMT Event: Command Complete (0x0001) plen 7
      Set Static Address (0x002b) plen 4
        Status: Success (0x00)
        Current settings: 0x00008200
          Low Energy
          Static Address
@ MGMT Event: New Settings (0x0006) plen 4
        Current settings: 0x00008200
          Low Energy
          Static Address
< HCI Command: LE Set Random.. (0x08|0x0005) plen 6
        Address: C0:55:44:33:22:11 (Static)
> HCI Event: Command Complete (0x0e) plen 4
      LE Set Random Address (0x08|0x0005) ncmd 1
        Status: Success (0x00)
@ MGMT Event: Command Complete (0x0001) plen 7
      Set Powered (0x0005) plen 4
        Status: Success (0x00)
        Current settings: 0x00008201
          Powered
          Low Energy
          Static Address
@ MGMT Event: New Settings (0x0006) plen 4
        Current settings: 0x00008201
          Powered
          Low Energy
          Static Address

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Brian Gix <brian.gix@intel.com>
2022-12-12 14:19:23 -08:00
Matthieu Baerts
d3295fee3c mptcp: use proper req destructor for IPv6
Before, only the destructor from TCP request sock in IPv4 was called
even if the subflow was IPv6.

It is important to use the right destructor to avoid memory leaks with
some advanced IPv6 features, e.g. when the request socks contain
specific IPv6 options.

Fixes: 79c0949e9a09 ("mptcp: Add key generation and token tree")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 13:11:24 -08:00
Matthieu Baerts
34b21d1ddc mptcp: dedicated request sock for subflow in v6
tcp_request_sock_ops structure is specific to IPv4. It should then not
be used with MPTCP subflows on top of IPv6.

For example, it contains the 'family' field, initialised to AF_INET.
This 'family' field is used by TCP FastOpen code to generate the cookie
but also by TCP Metrics, SELinux and SYN Cookies. Using the wrong family
will not lead to crashes but displaying/using/checking wrong things.

Note that 'send_reset' callback from request_sock_ops structure is used
in some error paths. It is then also important to use the correct one
for IPv4 or IPv6.

The slab name can also be different in IPv4 and IPv6, it will be used
when printing some log messages. The slab pointer will anyway be the
same because the object size is the same for both v4 and v6. A
BUILD_BUG_ON() has also been added to make sure this size is the same.

Fixes: cec37a6e41aa ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 13:11:24 -08:00
Matthieu Baerts
3fff88186f mptcp: remove MPTCP 'ifdef' in TCP SYN cookies
To ease the maintenance, it is often recommended to avoid having #ifdef
preprocessor conditions.

Here the section related to CONFIG_MPTCP was quite short but the next
commit needs to add more code around. It is then cleaner to move
specific MPTCP code to functions located in net/mptcp directory.

Now that mptcp_subflow_request_sock_ops structure can be static, it can
also be marked as "read only after init".

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 13:11:24 -08:00
Wei Yongjun
e0fe1123ab mptcp: netlink: fix some error return code
Fix to return negative error code -EINVAL from some error handling
case instead of 0, as done elsewhere in those functions.

Fixes: 9ab4807c84a4 ("mptcp: netlink: Add MPTCP_PM_CMD_ANNOUNCE")
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Cc: stable@vger.kernel.org
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 13:11:23 -08:00
Firo Yang
da05cecc49 sctp: sysctl: make extra pointers netns aware
Recently, a customer reported that from their container whose
net namespace is different to the host's init_net, they can't set
the container's net.sctp.rto_max to any value smaller than
init_net.sctp.rto_min.

For instance,
Host:
sudo sysctl net.sctp.rto_min
net.sctp.rto_min = 1000

Container:
echo 100 > /mnt/proc-net/sctp/rto_min
echo 400 > /mnt/proc-net/sctp/rto_max
echo: write error: Invalid argument

This is caused by the check made from this'commit 4f3fdf3bc59c
("sctp: add check rto_min and rto_max in sysctl")'
When validating the input value, it's always referring the boundary
value set for the init_net namespace.

Having container's rto_max smaller than host's init_net.sctp.rto_min
does make sense. Consider that the rto between two containers on the
same host is very likely smaller than it for two hosts.

So to fix this problem, as suggested by Marcelo, this patch makes the
extra pointers of rto_min, rto_max, pf_retrans, and ps_retrans point
to the corresponding variables from the newly created net namespace while
the new net namespace is being registered in sctp_sysctl_net_register.

Fixes: 4f3fdf3bc59c ("sctp: add check rto_min and rto_max in sysctl")
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Firo Yang <firo.yang@suse.com>
Link: https://lore.kernel.org/r/20221209054854.23889-1-firo.yang@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 12:57:29 -08:00
Linus Torvalds
0a1d4434db Updates for timers, timekeeping and drivers:
- Core:
 
    - The timer_shutdown[_sync]() infrastructure:
 
      Tearing down timers can be tedious when there are circular
      dependencies to other things which need to be torn down. A prime
      example is timer and workqueue where the timer schedules work and the
      work arms the timer.
 
      What needs to prevented is that pending work which is drained via
      destroy_workqueue() does not rearm the previously shutdown
      timer. Nothing in that shutdown sequence relies on the timer being
      functional.
 
      The conclusion was that the semantics of timer_shutdown_sync() should
      be:
 
 	- timer is not enqueued
     	- timer callback is not running
     	- timer cannot be rearmed
 
      Preventing the rearming of shutdown timers is done by discarding rearm
      attempts silently. A warning for the case that a rearm attempt of a
      shutdown timer is detected would not be really helpful because it's
      entirely unclear how it should be acted upon. The only way to address
      such a case is to add 'if (in_shutdown)' conditionals all over the
      place. This is error prone and in most cases of teardown not required
      all.
 
    - The real fix for the bluetooth HCI teardown based on
      timer_shutdown_sync().
 
      A larger scale conversion to timer_shutdown_sync() is work in
      progress.
 
    - Consolidation of VDSO time namespace helper functions
 
    - Small fixes for timer and timerqueue
 
  - Drivers:
 
    - Prevent integer overflow on the XGene-1 TVAL register which causes
      an never ending interrupt storm.
 
    - The usual set of new device tree bindings
 
    - Small fixes and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmOUuC0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYodpZD/9kCDi009n65QFF1J4kE5aZuABbRMtO
 7sy66fJpDyB/MtcbPPH29uzQUEs1VMTQVB+ZM+7e1YGoxSWuSTzeoFH+yK1w4tEZ
 VPbOcvUEjG0esKUehwYFeOjSnIjy6M1Y41aOUaDnq00/azhfTrzLxQA1BbbFbkpw
 S7u2hllbyRJ8KdqQyV9cVpXmze6fcpdtNhdQeoA7qQCsSPnJ24MSpZ/PG9bAovq8
 75IRROT7CQRd6AMKAVpA9Ov8ak9nbY3EgQmoKcp5ZXfXz8kD3nHky9Lste7djgYB
 U085Vwcelt39V5iXevDFfzrBYRUqrMKOXIf2xnnoDNeF5Jlj5gChSNVZwTLO38wu
 RFEVCjCjuC41GQJWSck9LRSYdriW/htVbEE8JLc6uzUJGSyjshgJRn/PK4HjpiLY
 AvH2rd4rAap/rjDKvfWvBqClcfL7pyBvavgJeyJ8oXyQjHrHQwapPcsMFBm0Cky5
 soF0Lr3hIlQ9u+hwUuFdNZkY9mOg09g9ImEjW1AZTKY0DfJMc5JAGjjSCfuopVUN
 Uf/qqcUeQPSEaC+C9xiFs0T3svYFxBqpgPv4B6t8zAnozon9fyZs+lv5KdRg4X77
 qX395qc6PaOSQlA7gcxVw3vjCPd0+hljXX84BORP7z+uzcsomvIH1MxJepIHmgaJ
 JrYbSZ5qzY5TTA==
 =JlDe
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Updates for timers, timekeeping and drivers:

  Core:

   - The timer_shutdown[_sync]() infrastructure:

     Tearing down timers can be tedious when there are circular
     dependencies to other things which need to be torn down. A prime
     example is timer and workqueue where the timer schedules work and
     the work arms the timer.

     What needs to prevented is that pending work which is drained via
     destroy_workqueue() does not rearm the previously shutdown timer.
     Nothing in that shutdown sequence relies on the timer being
     functional.

     The conclusion was that the semantics of timer_shutdown_sync()
     should be:
	- timer is not enqueued
    	- timer callback is not running
    	- timer cannot be rearmed

     Preventing the rearming of shutdown timers is done by discarding
     rearm attempts silently.

     A warning for the case that a rearm attempt of a shutdown timer is
     detected would not be really helpful because it's entirely unclear
     how it should be acted upon. The only way to address such a case is
     to add 'if (in_shutdown)' conditionals all over the place. This is
     error prone and in most cases of teardown not required all.

   - The real fix for the bluetooth HCI teardown based on
     timer_shutdown_sync().

     A larger scale conversion to timer_shutdown_sync() is work in
     progress.

   - Consolidation of VDSO time namespace helper functions

   - Small fixes for timer and timerqueue

  Drivers:

   - Prevent integer overflow on the XGene-1 TVAL register which causes
     an never ending interrupt storm.

   - The usual set of new device tree bindings

   - Small fixes and improvements all over the place"

* tag 'timers-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  dt-bindings: timer: renesas,cmt: Add r8a779g0 CMT support
  dt-bindings: timer: renesas,tmu: Add r8a779g0 support
  clocksource/drivers/arm_arch_timer: Use kstrtobool() instead of strtobool()
  clocksource/drivers/timer-ti-dm: Fix missing clk_disable_unprepare in dmtimer_systimer_init_clock()
  clocksource/drivers/timer-ti-dm: Clear settings on probe and free
  clocksource/drivers/timer-ti-dm: Make timer_get_irq static
  clocksource/drivers/timer-ti-dm: Fix warning for omap_timer_match
  clocksource/drivers/arm_arch_timer: Fix XGene-1 TVAL register math error
  clocksource/drivers/timer-npcm7xx: Enable timer 1 clock before use
  dt-bindings: timer: nuvoton,npcm7xx-timer: Allow specifying all clocks
  dt-bindings: timer: rockchip: Add rockchip,rk3128-timer
  clockevents: Repair kernel-doc for clockevent_delta2ns()
  clocksource/drivers/ingenic-ost: Define pm functions properly in platform_driver struct
  clocksource/drivers/sh_cmt: Access registers according to spec
  vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copy
  Bluetooth: hci_qca: Fix the teardown problem for real
  timers: Update the documentation to reflect on the new timer_shutdown() API
  timers: Provide timer_shutdown[_sync]()
  timers: Add shutdown mechanism to the internal functions
  timers: Split [try_to_]del_timer[_sync]() to prepare for shutdown mode
  ...
2022-12-12 12:52:02 -08:00
Jakub Kicinski
26f708a284 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmOWgtsACgkQ6rmadz2v
 bTpT2g//WzQRsODtPVVmg87fEo1GSTXvoXq/fhg95OKNZrVKgx1N6EVlFSLSqEjL
 TAmOuv5cZT28ZpMPMNjnU/c/lFf/6/UWbbTusA+F3MtSCBSbP5DPsWDD0yvNT9DL
 EZbGoQDSyt1M+BakZLzwOV6HPn9oDhj5p/4lMw+gptTY+3IeYUbS50DinM8eLz+Q
 067aF01p3ROF6LNUx9Az0cLPdU05oHzL2MvRsj/F7h/sWoSW5B/1Kx/m1vsT9lwn
 T2vbm6r4Jo0m0ZvpEMeRyKNZgVKIc64C7NH9CV7V66giJaONmxvLwkc0zWFwbXJ2
 V9aPQbbBUx/CZXoC72LEsvVcoAFl7LAL1IALm2HVt1iQjpj1yDlWw3WV0PMQ9Rn7
 xRVDOfQNGZ6jnkv6LB2j7V1z7hVENWQQwM48dgO2pAnJwYmUW9wZaAGE5kadUrZf
 eCD4c1U+qcZkSk4vwvpr8ubJ0PWPMUZqI0FrHUxfPxqkdy78c1h3qNQufZvAHWff
 Ca9NZqraFACTx58ZBsN1V5Xzv7azoK8Zgr9+JwVNahpFxclrbL8xuceThkC4smBl
 fiZJC9fClD9ATquIdj177jNMVC8F4B5yrKF/ehJDcNQhcqUdWx9Sbj461enf+3HI
 nfTP+77ZzyIJ76iRXJBV/jr9wkaPWhAZVeBGxmw5clTvB9/RBbU=
 =fzwv
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2022-12-11

We've added 74 non-merge commits during the last 11 day(s) which contain
a total of 88 files changed, 3362 insertions(+), 789 deletions(-).

The main changes are:

1) Decouple prune and jump points handling in the verifier, from Andrii.

2) Do not rely on ALLOW_ERROR_INJECTION for fmod_ret, from Benjamin.
   Merged from hid tree.

3) Do not zero-extend kfunc return values. Necessary fix for 32-bit archs,
   from Björn.

4) Don't use rcu_users to refcount in task kfuncs, from David.

5) Three reg_state->id fixes in the verifier, from Eduard.

6) Optimize bpf_mem_alloc by reusing elements from free_by_rcu, from Hou.

7) Refactor dynptr handling in the verifier, from Kumar.

8) Remove the "/sys" mount and umount dance in {open,close}_netns
  in bpf selftests, from Martin.

9) Enable sleepable support for cgrp local storage, from Yonghong.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (74 commits)
  selftests/bpf: test case for relaxed prunning of active_lock.id
  selftests/bpf: Add pruning test case for bpf_spin_lock
  bpf: use check_ids() for active_lock comparison
  selftests/bpf: verify states_equal() maintains idmap across all frames
  bpf: states_equal() must build idmap for all function frames
  selftests/bpf: test cases for regsafe() bug skipping check_id()
  bpf: regsafe() must not skip check_ids()
  docs/bpf: Add documentation for BPF_MAP_TYPE_SK_STORAGE
  selftests/bpf: Add test for dynptr reinit in user_ringbuf callback
  bpf: Use memmove for bpf_dynptr_{read,write}
  bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  bpf: Rework check_func_arg_reg_off
  bpf: Rework process_dynptr_func
  bpf: Propagate errors from process_* checks in check_func_arg
  bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  bpf: Skip rcu_barrier() if rcu_trace_implies_rcu_gp() is true
  bpf: Reuse freed element in free_by_rcu during allocation
  selftests/bpf: Bring test_offload.py back to life
  bpf: Fix comment error in fixup_kfunc_call function
  bpf: Do not zero-extend kfunc return values
  ...
====================

Link: https://lore.kernel.org/r/20221212024701.73809-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 11:27:42 -08:00
Linus Torvalds
1fab45ab6e RCU pull request for v6.2
This pull request contains the following branches:
 
 doc.2022.10.20a: Documentation updates.  This is the second
 	in a series from an ongoing review of the RCU documentation.
 
 fixes.2022.10.21a: Miscellaneous fixes.
 
 lazy.2022.11.30a: Introduces a default-off Kconfig option that depends
 	on RCU_NOCB_CPU that, on CPUs mentioned in the nohz_full or
 	rcu_nocbs boot-argument CPU lists, causes call_rcu() to introduce
 	delays.  These delays result in significant power savings on
 	nearly idle Android and ChromeOS systems.  These savings range
 	from a few percent to more than ten percent.
 
 	This series also includes several commits that change call_rcu()
 	to a new call_rcu_hurry() function that avoids these delays in
 	a few cases, for example, where timely wakeups are required.
 	Several of these are outside of RCU and thus have acks and
 	reviews from the relevant maintainers.
 
 srcunmisafe.2022.11.09a: Creates an srcu_read_lock_nmisafe() and an
 	srcu_read_unlock_nmisafe() for architectures that support NMIs,
 	but which do not provide NMI-safe this_cpu_inc().  These NMI-safe
 	SRCU functions are required by the upcoming lockless printk()
 	work by John Ogness et al.
 
 	That printk() series depends on these commits, so if you pull
 	the printk() series before this one, you will have already
 	pulled in this branch, plus two more SRCU commits:
 
 	0cd7e350abc4 ("rcu: Make SRCU mandatory")
 	51f5f78a4f80 ("srcu: Make Tiny synchronize_srcu() check for readers")
 
 	These two commits appear to work well, but do not have
 	sufficient testing exposure over a long enough time for me to
 	feel comfortable pushing them unless something in mainline is
 	definitely going to use them immediately, and currently only
 	the new printk() work uses them.
 
 torture.2022.10.18c: Changes providing minor but important increases
 	in test coverage for the new RCU polled-grace-period APIs.
 
 torturescript.2022.10.20a: Changes that avoid redundant kernel builds,
 	thus providing about a 30% speedup for the torture.sh acceptance
 	test.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmOKnS8THHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jCMiD/4weraRjmcLhZ3tz2vgTI8ZsXdIiCfU
 vCln0AOKroVo37S4BhViVfryV2D4VFfEb1UY6EgxNFu7Jd3z0seQShZh/5r8bFMU
 p0E6TC8PwyKUpQstTOwOynkw6BWGW1qeL620PpBNRAy4MkxL8AGv40tHRIHEeAzc
 cCTax2+xW9ae0ZtAZHDDCUAzpYpcjScIf4OZ3tkSaFCcpWZijg+dN60dnsZ9l7h9
 DtqKH61rszXAtxkmN9Fs9OY5MPCXi9Es6LVYq6KN06jqxwJRqmYf+pai3apmNIOf
 P8isXOQG58tbhBLpNCG58UBSkjI2GG8Lcq6hYr6d/7Ukm7RF49q8eL7OQlVrJMuQ
 Zi2DVTEAu2U3pzdTC14gi3RvqP7dO+psBs+LpGXtj4RxYvAP99e9KSRcG14j/Wwa
 L52AetBzBXTCS5nhPOG8RP22d8HRZLxMe9x7T8iVCDuwH4M1zTF5cVzLeEdgPAD7
 tdX4eV16PLt1AvhCEuHU/2v520gc2K9oGXLI1A6kzquXh7FflcPWl5WS+sYUbB/p
 gBsblz7C3I5GgSoW4aAMnkukZiYgSvVql8ZyRwQuRzvLpYcofMpoanZbcufDjuw9
 N5QzAaMmzHnBu3hOJS2WaSZRZ73fed3NO8jo8q8EMfYeWK3NAHybBdaQqSTgsO8i
 s+aN+LZ4s5MnRw==
 =eMOr
 -----END PGP SIGNATURE-----

Merge tag 'rcu.2022.12.02a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU updates from Paul McKenney:

 - Documentation updates. This is the second in a series from an ongoing
   review of the RCU documentation.

 - Miscellaneous fixes.

 - Introduce a default-off Kconfig option that depends on RCU_NOCB_CPU
   that, on CPUs mentioned in the nohz_full or rcu_nocbs boot-argument
   CPU lists, causes call_rcu() to introduce delays.

   These delays result in significant power savings on nearly idle
   Android and ChromeOS systems. These savings range from a few percent
   to more than ten percent.

   This series also includes several commits that change call_rcu() to a
   new call_rcu_hurry() function that avoids these delays in a few
   cases, for example, where timely wakeups are required. Several of
   these are outside of RCU and thus have acks and reviews from the
   relevant maintainers.

 - Create an srcu_read_lock_nmisafe() and an srcu_read_unlock_nmisafe()
   for architectures that support NMIs, but which do not provide
   NMI-safe this_cpu_inc(). These NMI-safe SRCU functions are required
   by the upcoming lockless printk() work by John Ogness et al.

 - Changes providing minor but important increases in torture test
   coverage for the new RCU polled-grace-period APIs.

 - Changes to torturescript that avoid redundant kernel builds, thus
   providing about a 30% speedup for the torture.sh acceptance test.

* tag 'rcu.2022.12.02a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (49 commits)
  net: devinet: Reduce refcount before grace period
  net: Use call_rcu_hurry() for dst_release()
  workqueue: Make queue_rcu_work() use call_rcu_hurry()
  percpu-refcount: Use call_rcu_hurry() for atomic switch
  scsi/scsi_error: Use call_rcu_hurry() instead of call_rcu()
  rcu/rcutorture: Use call_rcu_hurry() where needed
  rcu/rcuscale: Use call_rcu_hurry() for async reader test
  rcu/sync: Use call_rcu_hurry() instead of call_rcu
  rcuscale: Add laziness and kfree tests
  rcu: Shrinker for lazy rcu
  rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()
  rcu: Make call_rcu() lazy to save power
  rcu: Implement lockdep_rcu_enabled for !CONFIG_DEBUG_LOCK_ALLOC
  srcu: Debug NMI safety even on archs that don't require it
  srcu: Explain the reason behind the read side critical section on GP start
  srcu: Warn when NMI-unsafe API is used in NMI
  arch/s390: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option
  arch/loongarch: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option
  rcu: Fix __this_cpu_read() lockdep warning in rcu_force_quiescent_state()
  rcu-tasks: Make grace-period-age message human-readable
  ...
2022-12-12 07:47:15 -08:00
David S. Miller
b2b509fb5a linux-can-next-for-6.2-20221212
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEBsvAIBsPu6mG7thcrX5LkNig010FAmOXCo0THG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCtfkuQ2KDTXUD/B/4m3F/FIpfQjrLR8P/87hkHtu1vLnPV
 uo7SGVmq8aDiMRqWSLBWkiP6ceGememckrplG+qprx9QVZhagbtFN1/kE9jXxYSU
 s+hh4ARmLfpdZmcNCFzFi2S68G4fsQ3rTI8g/itbpQYCGyHt90yA5+PqT+QQZOvo
 J4l4uim4nVyBNEW136Vf13K0iFCmeJr4zr1eSqqHhZVSbK2cQSSN93HlhmZ+ccjT
 c1vZdNL1nd9mCQkNFHC+JPehfcgaI0r6hx8RYECvOCy8EbK1ZXdseWlM/z7IoxZm
 DiT2KhTao9xa9qcnRjI6oqTwO0omlqND2YDBj+SRwfFAs3HogeOaQjRT
 =w/j3
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-6.2-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
linux-can-next-for-6.2-20221212

this is a pull request of 39 patches for net-next/master.

The first 2 patches are by me fix a warning and coding style in the
kvaser_usb driver.

Vivek Yadav's patch sorts the includes of the m_can driver.

Biju Das contributes 5 patches for the rcar_canfd driver improve the
support for different IP core variants.

Jean Delvare's patch for the ctucanfd drops the dependency on
COMPILE_TEST.

Vincent Mailhol's patch sorts the includes of the etas_es58x driver.

Haibo Chen's contributes 2 patches that add i.MX93 support to the
flexcan driver.

Lad Prabhakar's patch updates the dt-bindings documentation of the
rcar_canfd driver.

Minghao Chi's patch converts the c_can platform driver to
devm_platform_get_and_ioremap_resource().

In the next 7 patches Vincent Mailhol adds devlink support to the
etas_es58x driver to report firmware, bootloader and hardware version.

Xu Panda's patch converts a strncpy() -> strscpy() in the ucan driver.

Ye Bin's patch removes a useless parameter from the AF_CAN protocol.

The next 2 patches by Vincent Mailhol and remove unneeded or unused
pointers to struct usb_interface in device's priv struct in the ucan
and gs_usb driver.

Vivek Yadav's patch cleans up the usage of the RAM initialization in
the m_can driver.

A patch by me add support for SO_MARK to the AF_CAN protocol.

Geert Uytterhoeven's patch fixes the number of CAN channels in the
rcan_canfd bindings documentation.

In the last 11 patches Markus Schneider-Pargmann optimizes the
register access in the t_can driver and cleans up the tcan glue
driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-12 12:11:37 +00:00
Marc Kleine-Budde
0826e82b8a can: raw: add support for SO_MARK
Add support for SO_MARK to the CAN_RAW protocol. This makes it
possible to add traffic control filters based on the fwmark.

Link: https://lore.kernel.org/all/20221210113653.170346-1-mkl@pengutronix.de
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:43:16 +01:00