Commit Graph

180 Commits

Author SHA1 Message Date
Geliang Tang
af250c27ea mptcp: drop lookup_by_id in lookup_addr
When the lookup_by_id parameter of __lookup_addr() is true, it's the same
as __lookup_addr_by_id(), it can be replaced by __lookup_addr_by_id()
directly. So drop this parameter, let __lookup_addr() only looks up address
on the local address list by comparing addresses in it, not address ids.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240305-upstream-net-next-20240304-mptcp-misc-cleanup-v1-4-c436ba5e569b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-06 20:24:10 -08:00
Geliang Tang
a4d68b1602 mptcp: set error messages for set_flags
In addition to returning the error value, this patch also sets an error
messages with GENL_SET_ERR_MSG or NL_SET_ERR_MSG_ATTR both for pm_netlink.c
and pm_userspace.c. It will help the userspace to identify the issue.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240305-upstream-net-next-20240304-mptcp-misc-cleanup-v1-3-c436ba5e569b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-06 20:24:10 -08:00
Geliang Tang
6a42477fe4 mptcp: update set_flags interfaces
This patch updates set_flags interfaces, make it more similar to the
interfaces of dump_addr and get_addr:

 mptcp_pm_set_flags(struct sk_buff *skb, struct genl_info *info)
 mptcp_pm_nl_set_flags(struct sk_buff *skb, struct genl_info *info)
 mptcp_userspace_pm_set_flags(struct sk_buff *skb, struct genl_info *info)

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240305-upstream-net-next-20240304-mptcp-misc-cleanup-v1-2-c436ba5e569b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-06 20:24:10 -08:00
Geliang Tang
d5dfbfa2f8 mptcp: drop duplicate header inclusions
The headers net/tcp.h, net/genetlink.h and uapi/linux/mptcp.h are included
in protocol.h already, no need to include them again directly. This patch
removes these duplicate header inclusions.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240305-upstream-net-next-20240304-mptcp-misc-cleanup-v1-1-c436ba5e569b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-06 20:24:10 -08:00
Geliang Tang
564ae6794e mptcp: get addr in userspace pm list
This patch renames mptcp_pm_nl_get_addr_doit() as a dedicated in-kernel
netlink PM get addr function mptcp_pm_nl_get_addr(). and invoke a new
wrapper mptcp_pm_get_addr() in mptcp_pm_nl_get_addr_doit.

If a token is gotten in the wrapper, that means a userspace PM is used.
So invoke mptcp_userspace_pm_get_addr() to get addr in userspace PM list.
Otherwise, invoke mptcp_pm_nl_get_addr().

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04 13:07:46 +00:00
Geliang Tang
9ae7846c4b mptcp: dump addrs in userspace pm list
This patch renames mptcp_pm_nl_get_addr_dumpit() as a dedicated in-kernel
netlink PM dump addrs function mptcp_pm_nl_dump_addr(), and invoke a newly
added wrapper mptcp_pm_dump_addr() in mptcp_pm_nl_get_addr_dumpit().

Invoke in-kernel PM dump addrs function mptcp_pm_nl_dump_addr() or
userspace PM dump addrs function mptcp_userspace_pm_dump_addr() based on
whether the token parameter is passed in or not in the wrapper.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04 13:07:45 +00:00
Geliang Tang
34ca91e15e mptcp: export mptcp_genl_family & mptcp_nl_fill_addr
This patch exports struct mptcp_genl_family and mptcp_nl_fill_addr() helper
to allow them can be used in pm_userspace.c.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04 13:07:45 +00:00
Geliang Tang
e38b117d7f mptcp: make pm_remove_addrs_and_subflows static
mptcp_pm_remove_addrs_and_subflows() is only used in pm_netlink.c, it's
no longer used in pm_userspace.c any more since the commit 8b1c94da1e
("mptcp: only send RM_ADDR in nl_cmd_remove"). So this patch changes it
to a static function.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04 13:07:45 +00:00
Jakub Kicinski
fecc51559a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/ipv4/udp.c
  f796feabb9 ("udp: add local "peek offset enabled" flag")
  56667da739 ("net: implement lockless setsockopt(SO_PEEK_OFF)")

Adjacent changes:

net/unix/garbage.c
  aa82ac51d6 ("af_unix: Drop oob_skb ref before purging queue in GC.")
  11498715f2 ("af_unix: Remove io_uring code for GC.")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-22 15:29:26 -08:00
Paolo Abeni
045e9d8128 mptcp: fix duplicate subflow creation
Fullmesh endpoints could end-up unexpectedly generating duplicate
subflows - same local and remote addresses - when multiple incoming
ADD_ADDR are processed before the PM creates the subflow for the local
endpoints.

Address the issue explicitly checking for duplicates at subflow
creation time.

To avoid a quadratic computational complexity, track the unavailable
remote address ids in a temporary bitmap and initialize such bitmap
with the remote ids of all the existing subflows matching the local
address currently processed.

The above allows additionally replacing the existing code checking
for duplicate entry in the current set with a simple bit test
operation.

Fixes: 2843ff6f36 ("mptcp: remote addresses fullmesh")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/435
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-18 10:25:00 +00:00
Paolo Abeni
967d3c2712 mptcp: fix data races on remote_id
Similar to the previous patch, address the data race on
remote_id, adding the suitable ONCE annotations.

Fixes: bedee0b561 ("mptcp: address lookup improvements")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-18 10:25:00 +00:00
Paolo Abeni
a7cfe77663 mptcp: fix data races on local_id
The local address id is accessed lockless by the NL PM, add
all the required ONCE annotation. There is a caveat: the local
id can be initialized late in the subflow life-cycle, and its
validity is controlled by the local_id_valid flag.

Remove such flag and encode the validity in the local_id field
itself with negative value before initialization. That allows
accessing the field consistently with a single read operation.

Fixes: 0ee4261a36 ("mptcp: implement mptcp_pm_remove_subflow")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-18 10:25:00 +00:00
Geliang Tang
584f389426 mptcp: add needs_id for netlink appending addr
Just the same as userspace PM, a new parameter needs_id is added for
in-kernel PM mptcp_pm_nl_append_new_local_addr() too.

Add a new helper mptcp_pm_has_addr_attr_id() to check whether an address
ID is set from PM or not.

In mptcp_pm_nl_get_local_id(), needs_id is always true, but in
mptcp_pm_nl_add_addr_doit(), pass mptcp_pm_has_addr_attr_id() to
needs_it.

Fixes: efd5a4c04e ("mptcp: add the address ID assignment bitmap")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-18 10:25:00 +00:00
Paolo Abeni
b9f4554356 mptcp: annotate lockless access for token
The token field is manipulated under the msk socket lock
and accessed lockless in a few spots, add proper ONCE annotation

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-05 11:18:09 +00:00
Geliang Tang
c693a85164 mptcp: use mptcp_set_state
This patch replaces all the 'inet_sk_state_store()' calls under net/mptcp
with the new helper mptcp_set_state().

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/460
Signed-off-by: Geliang Tang <geliang.tang@linux.dev>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 13:33:33 +00:00
Ido Schimmel
cd4d7263d5 genetlink: Use internal flags for multicast groups
As explained in commit e03781879a ("drop_monitor: Require
'CAP_SYS_ADMIN' when joining "events" group"), the "flags" field in the
multicast group structure reuses uAPI flags despite the field not being
exposed to user space. This makes it impossible to extend its use
without adding new uAPI flags, which is inappropriate for internal
kernel checks.

Solve this by adding internal flags (i.e., "GENL_MCAST_*") and convert
the existing users to use them instead of the uAPI flags.

Tested using the reproducers in commit 44ec98ea5e ("psample: Require
'CAP_NET_ADMIN' when joining "packets" group") and commit e03781879a
("drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group").

No functional changes intended.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-29 08:43:59 +00:00
Geliang Tang
8df220b292 mptcp: add validity check for sending RM_ADDR
This patch adds the validity check for sending RM_ADDRs for userspace PM
in mptcp_pm_remove_addrs(), only send a RM_ADDR when the address is in the
anno_list or conn_list.

Fixes: 8b1c94da1e ("mptcp: only send RM_ADDR in nl_cmd_remove")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-3-7b9cd6a7b7f4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-14 20:10:20 -08:00
Davide Caratti
aab4d85649 net: mptcp: use policy generated by YAML spec
generated with:

 $ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
 > --spec Documentation/netlink/specs/mptcp.yaml --source \
 > -o net/mptcp/mptcp_pm_gen.c
 $ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
 > --spec Documentation/netlink/specs/mptcp.yaml --header \
 > -o net/mptcp/mptcp_pm_gen.h

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/340
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-7-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-24 13:00:32 -07:00
Davide Caratti
1e07938e29 net: mptcp: rename netlink handlers to mptcp_pm_nl_<blah>_{doit,dumpit}
so that they will match names generated from YAML spec.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/340
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-6-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-24 13:00:32 -07:00
Davide Caratti
1d0507f468 net: mptcp: convert netlink from small_ops to ops
in the current MPTCP control plane, all operations use a netlink
attribute of the same type "MPTCP_PM_ATTR". However, add/del/get/flush
operations only parse the first element in the message _ the one that
describes MPTCP endpoints (that was named MPTCP_PM_ATTR_ADDR and
mostly used in ADD_ADDR operations _ probably the similarity of "attr",
"addr" and "add" might cause some confusion to human readers).
Convert MPTCP from 'small_ops' to 'ops', thus allowing different attributes
for each single operation, hopefully makes all this clearer to human
readers.

- use a separate attribute set for add/del/get/flush address operation,
  binary compatible with the existing one, to store the endpoint address.
  MPTCP_PM_ENDPOINT_ADDR is added to the uAPI (with the same value as
  MPTCP_PM_ATTR_ADDR) for these operations.
- convert mptcp_pm_ops[] and add policy files accordingly.

this prepares MPTCP control plane to be described as YAML spec.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/340
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-3-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-24 13:00:31 -07:00
Geliang Tang
ebc1e08f01 mptcp: drop last_snd and MPTCP_RESET_SCHEDULER
Since the burst check conditions have moved out of the function
mptcp_subflow_get_send(), it makes all msk->last_snd useless.
This patch drops them as well as the macro MPTCP_RESET_SCHEDULER.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230821-upstream-net-next-20230818-v1-2-0c860fb256a8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-22 17:31:18 -07:00
Paolo Abeni
3f326a821b mptcp: change the mpc check helper to return a sk
After the previous patch the __mptcp_nmpc_socket helper is used
only to ensure that the MPTCP socket is a suitable status - that
is, the mptcp capable handshake is not started yet.

Change the return value to the relevant subflow sock, to finally
remove the last references to first subflow socket in the MPTCP stack.

As a bonus, we can get rid of a few local variables in different
functions.

No functional change intended.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 07:06:14 +01:00
Paolo Abeni
3aa3624941 mptcp: avoid ssock usage in mptcp_pm_nl_create_listen_socket()
This is one of the few remaining spots actually manipulating the
first subflow socket. We can leverage the recently introduced
inet helpers to get rid of ssock there.

No functional changes intended.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 07:06:14 +01:00
Jakub Kicinski
a7384f3918 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

tools/testing/selftests/net/fcnal-test.sh
  d7a2fc1437 ("selftests: net: fcnal-test: check if FIPS mode is enabled")
  dd017c72dd ("selftests: fcnal: Test SO_DONTROUTE on TCP sockets.")
https://lore.kernel.org/all/5007b52c-dd16-dbf6-8d64-b9701bfa498b@tessares.net/
https://lore.kernel.org/all/20230619105427.4a0df9b3@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-22 18:40:38 -07:00
Geliang Tang
528cb5f2a1 mptcp: pass addr to mptcp_pm_alloc_anno_list
Pass addr parameter to mptcp_pm_alloc_anno_list() instead of entry. We
can reduce the scope, e.g. in mptcp_pm_alloc_anno_list(), we only access
"entry->addr", we can then restrict to the pointer to "addr" then.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-21 22:45:58 -07:00
Paolo Abeni
57fc0f1cea mptcp: ensure listener is unhashed before updating the sk status
The MPTCP protocol access the listener subflow in a lockless
manner in a couple of places (poll, diag). That works only if
the msk itself leaves the listener status only after that the
subflow itself has been closed/disconnected. Otherwise we risk
deadlock in diag, as reported by Christoph.

Address the issue ensuring that the first subflow (the listener
one) is always disconnected before updating the msk socket status.

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/407
Fixes: b29fcfb54c ("mptcp: full disconnect implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-21 22:44:54 -07:00
Christophe JAILLET
92b0829085 mptcp: Reorder fields in 'struct mptcp_pm_add_entry'
Group some variables based on their sizes to reduce hole and avoid padding.
On x86_64, this shrinks the size of 'struct mptcp_pm_add_entry'
from 136 to 128 bytes.

It saves a few bytes of memory and is more cache-line friendly.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/e47b71de54fd3e580544be56fc1bb2985c77b0f4.1687081558.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-20 20:06:34 -07:00
Geliang Tang
6ba7ce8990 mptcp: unify pm set_flags interfaces
This patch unifies the three PM set_flags() interfaces:

mptcp_pm_nl_set_flags() in mptcp/pm_netlink.c for the in-kernel PM and
mptcp_userspace_pm_set_flags() in mptcp/pm_userspace.c for the
userspace PM.

They'll be switched in the common PM infterface mptcp_pm_set_flags() in
mptcp/pm.c based on whether token is NULL or not.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10 00:05:59 -07:00
Geliang Tang
f40be0db0b mptcp: unify pm get_flags_and_ifindex_by_id
This patch unifies the three PM get_flags_and_ifindex_by_id() interfaces:

mptcp_pm_nl_get_flags_and_ifindex_by_id() in mptcp/pm_netlink.c for the
in-kernel PM and mptcp_userspace_pm_get_flags_and_ifindex_by_id() in
mptcp/pm_userspace.c for the userspace PM.

They'll be switched in the common PM infterface
mptcp_pm_get_flags_and_ifindex_by_id() in mptcp/pm.c based on whether
mptcp_pm_is_userspace() or not.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10 00:05:59 -07:00
Geliang Tang
9bbec87ecf mptcp: unify pm get_local_id interfaces
This patch unifies the three PM get_local_id() interfaces:

mptcp_pm_nl_get_local_id() in mptcp/pm_netlink.c for the in-kernel PM and
mptcp_userspace_pm_get_local_id() in mptcp/pm_userspace.c for the
userspace PM.

They'll be switched in the common PM infterface mptcp_pm_get_local_id()
in mptcp/pm.c based on whether mptcp_pm_is_userspace() or not.

Also put together the declarations of these three functions in protocol.h.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10 00:05:59 -07:00
Geliang Tang
dc886bce75 mptcp: export local_address
Rename local_address() with "mptcp_" prefix and export it in protocol.h.

This function will be re-used in the common PM code (pm.c) in the
following commit.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10 00:05:59 -07:00
Geliang Tang
8b1c94da1e mptcp: only send RM_ADDR in nl_cmd_remove
The specifications from [1] about the "REMOVE" command say:

    Announce that an address has been lost to the peer

It was then only supposed to send a RM_ADDR and not trying to delete
associated subflows.

A new helper mptcp_pm_remove_addrs() is then introduced to do just
that, compared to mptcp_pm_remove_addrs_and_subflows() also removing
subflows.

To delete a subflow, the userspace daemon can use the "SUB_DESTROY"
command, see mptcp_nl_cmd_sf_destroy().

Fixes: d9a4594eda ("mptcp: netlink: Add MPTCP_PM_CMD_REMOVE")
Link: https://github.com/multipath-tcp/mptcp/blob/mptcp_v0.96/include/uapi/linux/mptcp.h [1]
Cc: stable@vger.kernel.org
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-05 15:15:57 +01:00
Paolo Abeni
ddb1a072f8 mptcp: move first subflow allocation at mpc access time
In the long run this will simplify the mptcp code and will
allow for more consistent behavior. Move the first subflow
allocation out of the sock->init ops into the __mptcp_nmpc_socket()
helper.

Since the first subflow creation can now happen after the first
setsockopt() we additionally need to invoke mptcp_sockopt_sync()
on it.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:18:34 +01:00
Jakub Kicinski
1118aa4c70 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/wireless/nl80211.c
  b27f07c50a ("wifi: nl80211: fix puncturing bitmap policy")
  cbbaf2bb82 ("wifi: nl80211: add a command to enable/disable HW timestamping")
https://lore.kernel.org/all/20230314105421.3608efae@canb.auug.org.au

tools/testing/selftests/net/Makefile
  62199e3f16 ("selftests: net: Add VXLAN MDB test")
  13715acf8a ("selftest: Add test for bind() conflicts.")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-17 16:29:25 -07:00
Eric Dumazet
9ae8e5ad99 mptcp: annotate lockless accesses to sk->sk_err
mptcp_poll() reads sk->sk_err without socket lock held/owned.

Add READ_ONCE() and WRITE_ONCE() to avoid load/store tearing.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17 08:25:05 +00:00
Paolo Abeni
cee4034a3d mptcp: fix lockdep false positive in mptcp_pm_nl_create_listen_socket()
Christoph reports a lockdep splat in the mptcp_subflow_create_socket()
error path, when such function is invoked by
mptcp_pm_nl_create_listen_socket().

Such code path acquires two separates, nested socket lock, with the
internal lock operation lacking the "nested" annotation. Adding that
in sock_release() for mptcp's sake only could be confusing.

Instead just add a new lockclass to the in-kernel msk socket,
re-initializing the lockdep infra after the socket creation.

Fixes: ad2171009d ("mptcp: fix locking for in-kernel listener creation")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/354
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10 21:42:56 -08:00
Jakub Kicinski
8697a258ae Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/devlink/leftover.c / net/core/devlink.c:
  565b4824c3 ("devlink: change port event netdev notifier from per-net to global")
  f05bd8ebeb ("devlink: move code to a dedicated directory")
  687125b579 ("devlink: split out core code")
https://lore.kernel.org/all/20230208094657.379f2b1a@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-09 12:25:40 -08:00
Paolo Abeni
ad2171009d mptcp: fix locking for in-kernel listener creation
For consistency, in mptcp_pm_nl_create_listen_socket(), we need to
call the __mptcp_nmpc_socket() under the msk socket lock.

Note that as a side effect, mptcp_subflow_create_socket() needs a
'nested' lockdep annotation, as it will acquire the subflow (kernel)
socket lock under the in-kernel listener msk socket lock.

The current lack of locking is almost harmless, because the relevant
socket is not exposed to the user space, but in future we will add
more complexity to the mentioned helper, let's play safe.

Fixes: 1729cf186d ("mptcp: create the listening socket for new port")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08 09:39:34 +00:00
Paolo Abeni
b9d69db87f mptcp: let the in-kernel PM use mixed IPv4 and IPv6 addresses
Currently the in-kernel PM arbitrary enforces that created subflow's
family must match the main MPTCP socket while the RFC allows mixing
IPv4 and IPv6 subflows.

This patch changes the in-kernel PM logic to create subflows matching
the currently selected source (or destination) address. IPv4 sockets
can pick only IPv4 addresses (and v4 mapped in v6), while IPv6 sockets
not restricted to V6ONLY can pick either IPv4 and IPv6 addresses as
long as the source and destination matches.

A helper, previously introduced is used to ease family matching checks,
taking care of IPv4 vs IPv4-mapped-IPv6 vs IPv6 only addresses.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/269
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Geliang Tang
a963853fd4 mptcp: use net instead of sock_net
Use the local variable 'net' instead of sock_net() in the functions where
the variable 'struct net *net' has been defined.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-09 07:30:49 +00:00
Matthieu Baerts
03e7d28cd2 mptcp: return 0 instead of 'err' var
When 'err' is 0, it looks clearer to return '0' instead of the variable
called 'err'.

The behaviour is then not modified, just a clearer code.

By doing this, we can also avoid false positive smatch warnings like
this one:

  net/mptcp/pm_netlink.c:1169 mptcp_pm_parse_pm_addr_attr() warn: missing error code? 'err'

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-09 19:46:11 -08:00
Geliang Tang
8b34b52c17 mptcp: use nlmsg_free instead of kfree_skb
Use nlmsg_free() instead of kfree_skb() in pm_netlink.c.

The SKB's have been created by nlmsg_new(). The proper cleaning way
should then be done with nlmsg_free().

For the moment, nlmsg_free() is simply calling kfree_skb() so we don't
change the behaviour here.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-09 19:46:11 -08:00
Geliang Tang
f8c9dfbd87 mptcp: add pm listener events
This patch adds two new MPTCP netlink event types for PM listening
socket create and close, named MPTCP_EVENT_LISTENER_CREATED and
MPTCP_EVENT_LISTENER_CLOSED.

Add a new function mptcp_event_pm_listener() to push the new events
with family, port and addr to userspace.

Invoke mptcp_event_pm_listener() with MPTCP_EVENT_LISTENER_CREATED in
mptcp_listen() and mptcp_pm_nl_create_listen_socket(), invoke it with
MPTCP_EVENT_LISTENER_CLOSED in __mptcp_close_ssk().

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01 20:06:06 -08:00
Paolo Abeni
a3400e8746 mptcp: more detailed error reporting on endpoint creation
Endpoint creation can fail for a number of reasons; in case of failure
append the error number to the extended ack message, using a newly
introduced generic helper.

Additionally let mptcp_pm_nl_append_new_local_addr() report different
error reasons.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 13:09:07 +00:00
Paolo Abeni
976d302fb6 mptcp: deduplicate error paths on endpoint creation
When endpoint creation fails, we need to free the newly allocated
entry and eventually destroy the paired mptcp listener socket.

Consolidate such action in a single point let all the errors path
reach it.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 13:09:07 +00:00
Thomas Haller
3eb9a6b650 mptcp: account memory allocation in mptcp_nl_cmd_add_addr() to user
Now that non-root users can configure MPTCP endpoints, account
the memory allocation to the user.

Signed-off-by: Thomas Haller <thaller@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-15 12:01:02 +02:00
Thomas Haller
d156971854 mptcp: allow privileged operations from user namespaces
GENL_ADMIN_PERM checks that the user has CAP_NET_ADMIN in the initial
namespace by calling netlink_capable(). Instead, use GENL_UNS_ADMIN_PERM
which uses netlink_ns_capable(). This checks that the caller has
CAP_NET_ADMIN in the current user namespace.

See also

  commit 4a92602aa1 ("openvswitch: allow management from inside user namespaces")

which introduced this mechanism. See also

  commit 5617c6cd6f ("nl80211: Allow privileged operations from user namespaces")

which introduced this for nl80211.

Signed-off-by: Thomas Haller <thaller@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-15 12:01:02 +02:00
Matthieu Baerts
5efbf6f7f0 mptcp: add mptcp_for_each_subflow_safe helper
Similar to mptcp_for_each_subflow(): this is clearer now that the _safe
version is used in multiple places.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-15 12:01:02 +02:00
Jakub Kicinski
9c5d03d362 genetlink: start to validate reserved header bytes
We had historically not checked that genlmsghdr.reserved
is 0 on input which prevents us from using those precious
bytes in the future.

One use case would be to extend the cmd field, which is
currently just 8 bits wide and 256 is not a lot of commands
for some core families.

To make sure that new families do the right thing by default
put the onus of opting out of validation on existing families.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel)
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-29 12:47:15 +01:00
Paolo Abeni
3ad14f54bd mptcp: more accurate MPC endpoint tracking
Currently the id accounting for the ID 0 subflow is not correct:
at creation time we mark (correctly) as unavailable the endpoint
id corresponding the MPC subflow source address, while at subflow
removal time set as available the id 0.

With this change we track explicitly the endpoint id corresponding
to the MPC subflow so that we can mark it as available at removal time.
Additionally this allow deleting the initial subflow via the NL PM
specifying the corresponding endpoint id.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-12 18:37:20 -07:00