iv/linux

Author	SHA1	Message	Date
Jakub Kicinski	ec4c20ca09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: net/mac80211/rx.c 91535613b609 ("wifi: mac80211: don't drop all unprotected public action frames") 6c02fab72429 ("wifi: mac80211: split ieee80211_drop_unencrypted_mgmt() return value") Adjacent changes: drivers/net/ethernet/apm/xgene/xgene_enet_main.c 61471264c018 ("net: ethernet: apm: Convert to platform remove callback returning void") d2ca43f30611 ("net: xgene: Fix unused xgene_enet_of_match warning for !CONFIG_OF") net/vmw_vsock/virtio_transport.c 64c99d2d6ada ("vsock/virtio: support to send non-linear skb") 53b08c498515 ("vsock/virtio: initialize the_virtio_vsock before using VQs") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-10-26 13:46:28 -07:00
Paolo Abeni	3967336126	netfilter pull request 23-10-25 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmU5gvgACgkQ1V2XiooU IOS9eA//QyIqcGRzr+tX1ZPwikkicmSb7w8vkrY7jXMNWNiye54tA5fSJsxqpIMy 9J9k8eB+fI6AJV44tOq4K8XsYCcI4ZEst2mftumvuq8igX27ulz46uLvoVwnqhwc NlPO06RwSHQHR3S5tKRRwwYfUwpPjDRCW15c14pHw4EkHAaL+dItLwATrrJhPv93 PUZNGbB+i+55QrLJdMMshvpoPAhLmo57cDvDcerhOWygaoxiaKIaR0bRQ40eM3Zl j9veG2oiuehv/RHqFJ5MBCiqrIYRHU8kTflqaVA+ODfgUSbijCcv/RQxaILnwnZd 57vdLSBTgVFh2uiPTYGAxfUwv3BpA7g9uzmeBgMbl/t71HhYoPe/QhjvoVhlzNl/ 1JrCFbrQVaCdhtZbDt7f359mUtXv9yh1+9pBytEkdbxRfsDmzRVsFrUeLo0P9Ho8 4jKpnaqPTdnzhfoQocZjL7+M22/zk6jZCu1Pcs318yqpTkhJwiTtByo4iaQ2f0F7 xgA/auZ/33mmirFXnLzMoA4b0TbJNm+Jjye3tdbu0FlY2sKb943RW2kXg9rdfyOL OUvSSux7Fezyq5y55+KV4FFrKgYhrZ5tWR8mVBg1KHrYs9r9p7SmXL3KzbSPxBvn QrPdCQsQ187QNIkwli/ChYQcrahwSiEqFSJbdDmzIA7AFNLHVaY= =ZG+P -----END PGP SIGNATURE----- Merge tag 'nf-next-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next. Mostly nf_tables updates with two patches for connlabel and br_netfilter. 1) Rename function name to perform on-demand GC for rbtree elements, and replace async GC in rbtree by sync GC. Patches from Florian Westphal. 2) Use commit_mutex for NFT_MSG_GETRULE_RESET to ensure that two concurrent threads invoking this command do not underrun stateful objects. Patches from Phil Sutter. 3) Use single hook to deal with IP and ARP packets in br_netfilter. Patch from Florian Westphal. 4) Use atomic_t in netns->connlabel use counter instead of using a spinlock, also patch from Florian. 5) Cleanups for stateful objects infrastructure in nf_tables. Patches from Phil Sutter. 6) Flush path uses opaque set element offered by the iterator, instead of calling pipapo_deactivate() which looks up for it again. 
7) Set backend .flush interface always succeeds, make it return void instead. 8) Add struct nft_elem_priv placeholder structure and use it by replacing void * to pass opaque set element representation from backend to frontend which defeats compiler type checks. 9) Shrink memory consumption of set element transactions, by reducing struct nft_trans_elem object size and reducing stack memory usage. 10) Use struct nft_elem_priv also for set backend .insert operation too. 11) Carry reset flag in nft_set_dump_ctx structure, instead of passing it as a function argument, from Phil Sutter. netfilter pull request 23-10-25 * tag 'nf-next-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: Carry reset boolean in nft_set_dump_ctx netfilter: nf_tables: set->ops->insert returns opaque set element in case of EEXIST netfilter: nf_tables: shrink memory consumption of set elements netfilter: nf_tables: expose opaque set element as struct nft_elem_priv netfilter: nf_tables: set backend .flush always succeeds netfilter: nft_set_pipapo: no need to call pipapo_deactivate() from flush netfilter: nf_tables: Carry reset boolean in nft_obj_dump_ctx netfilter: nf_tables: nft_obj_filter fits into cb->ctx netfilter: nf_tables: Carry s_idx in nft_obj_dump_ctx netfilter: nf_tables: A better name for nft_obj_filter netfilter: nf_tables: Unconditionally allocate nft_obj_filter netfilter: nf_tables: Drop pointless memset in nf_tables_dump_obj netfilter: conntrack: switch connlabels to atomic_t br_netfilter: use single forward hook for ip and arp netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests netfilter: nf_tables: Introduce nf_tables_getrule_single() netfilter: nf_tables: Open-code audit log call in nf_tables_getrule() netfilter: nft_set_rbtree: prefer sync gc to async worker netfilter: nft_set_rbtree: rename gc deactivate+erase function ==================== Link: 
https://lore.kernel.org/r/20231025212555.132775-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-10-26 12:20:35 +02:00
Pablo Neira Ayuso	735795f68b	netfilter: flowtable: GC pushes back packets to classic path Since 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple"), flowtable GC pushes back flows with IPS_SEEN_REPLY back to classic path in every run, ie. every second. This is because of a new check for NF_FLOW_HW_ESTABLISHED which is specific of sched/act_ct. In Netfilter's flowtable case, NF_FLOW_HW_ESTABLISHED never gets set on and IPS_SEEN_REPLY is unreliable since users decide when to offload the flow before, such bit might be set on at a later stage. Fix it by adding a custom .gc handler that sched/act_ct can use to deal with its NF_FLOW_HW_ESTABLISHED bit. Fixes: 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple") Reported-by: Vladimir Smelhaus <vl.sm@email.cz> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-25 11:35:46 +02:00
Phil Sutter	9cdee06347	netfilter: nf_tables: Carry reset boolean in nft_set_dump_ctx Relieve the dump callback from having to check nlmsg_type upon each call. Prep work for set element reset locking. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 15:48:30 +02:00
Pablo Neira Ayuso	078996fcd6	netfilter: nf_tables: set->ops->insert returns opaque set element in case of EEXIST Return struct nft_elem_priv instead of struct nft_set_ext for consistency with ("netfilter: nf_tables: expose opaque set element as struct nft_elem_priv") and to prepare the introduction of element timeout updates from control path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:37:46 +02:00
Pablo Neira Ayuso	0e1ea651c9	netfilter: nf_tables: shrink memory consumption of set elements Instead of copying struct nft_set_elem into struct nft_trans_elem, store the pointer to the opaque set element object in the transaction. Adapt set backend API (and set backend implementations) to take the pointer to opaque set element representation whenever required. This patch deconstifies .remove() and .activate() set backend API since these modify the set element opaque object. And it also constify nft_set_elem_ext() this provides access to the nft_set_ext struct without updating the object. According to pahole on x86_64, this patch shrinks struct nft_trans_elem size from 216 to 24 bytes. This patch also reduces stack memory consumption by removing the template struct nft_set_elem object, using the opaque set element object instead such as from the set iterator API, catchall elements and the get element command. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:37:42 +02:00
Pablo Neira Ayuso	9dad402b89	netfilter: nf_tables: expose opaque set element as struct nft_elem_priv Add placeholder structure and place it at the beginning of each struct nft_*_elem for each existing set backend, instead of exposing elements as void type to the frontend which defeats compiler type checks. Use this pointer to this new type to replace void *. This patch updates the following set backend API to use this new struct nft_elem_priv placeholder structure: - update - deactivate - flush - get as well as the following helper functions: - nft_set_elem_ext() - nft_set_elem_init() - nft_set_elem_destroy() - nf_tables_set_elem_destroy() This patch adds nft_elem_priv_cast() to cast struct nft_elem_priv to native element representation from the corresponding set backend. BUILD_BUG_ON() makes sure this .priv placeholder is always at the top of the opaque set element representation. Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:30 +02:00
Pablo Neira Ayuso	6509a2e410	netfilter: nf_tables: set backend .flush always succeeds .flush is always successful since this results from iterating over the set elements to toggle mark the element as inactive in the next generation. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:30 +02:00
Pablo Neira Ayuso	26cec9d414	netfilter: nft_set_pipapo: no need to call pipapo_deactivate() from flush Use the element object that is already offered instead. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:30 +02:00
Phil Sutter	a552339063	netfilter: nf_tables: Carry reset boolean in nft_obj_dump_ctx Relieve the dump callback from having to inspect nlmsg_type upon each call, just do it once at start of the dump. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:30 +02:00
Phil Sutter	5a893b9cdf	netfilter: nf_tables: nft_obj_filter fits into cb->ctx No need to allocate it if one may just use struct netlink_callback's scratch area for it. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:30 +02:00
Phil Sutter	2eda95cfa2	netfilter: nf_tables: Carry s_idx in nft_obj_dump_ctx Prep work for moving the context into struct netlink_callback scratch area. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:30 +02:00
Phil Sutter	ecf49cad80	netfilter: nf_tables: A better name for nft_obj_filter Name it for what it is supposed to become, a real nft_obj_dump_ctx. No functional change intended. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:30 +02:00
Phil Sutter	4279cc60b3	netfilter: nf_tables: Unconditionally allocate nft_obj_filter Prep work for moving the filter into struct netlink_callback's scratch area. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:30 +02:00
Phil Sutter	ff16111cc1	netfilter: nf_tables: Drop pointless memset in nf_tables_dump_obj The code does not make use of cb->args fields past the first one, no need to zero them. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:30 +02:00
Florian Westphal	643d126036	netfilter: conntrack: switch connlabels to atomic_t The spinlock dates back to the days when connlabels did not have a fixed size and reallocation had to be supported. Remove it. This change also allows calling the helpers from softirq or timers without deadlocks. Also add WARN()s to catch refcounting imbalances. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:30 +02:00
Phil Sutter	3cb03edb4d	netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests Rule reset is not concurrency-safe per-se, so multiple CPUs may reset the same rule at the same time. At least counter and quota expressions will suffer from value underruns in this case. Prevent this by introducing dedicated locking callbacks for nfnetlink and the asynchronous dump handling to serialize access. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:29 +02:00
Phil Sutter	1578c32877	netfilter: nf_tables: Introduce nf_tables_getrule_single() Outsource the reply skb preparation for non-dump getrule requests into a distinct function. Prep work for rule reset locking. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:29 +02:00
Phil Sutter	8877393029	netfilter: nf_tables: Open-code audit log call in nf_tables_getrule() The table lookup will be dropped from that function, so remove that dependency from audit logging code. Using whatever is in nla[NFTA_RULE_TABLE] is sufficient as long as the previous rule info filling succeeded. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:29 +02:00
Florian Westphal	7d259f021a	netfilter: nft_set_rbtree: prefer sync gc to async worker There is no need for asynchronous garbage collection, rbtree inserts can only happen from the netlink control plane. We already perform on-demand gc on insertion, in the area of the tree where the insertion takes place, but we don't do a full tree walk there for performance reasons. Do a full gc walk at the end of the transaction instead and remove the async worker. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:29 +02:00
Florian Westphal	8079fc30f7	netfilter: nft_set_rbtree: rename gc deactivate+erase function Next patch adds a caller that doesn't hold the priv->write lock and will need a similar function. Rename the existing function to make it clear that it can only be used for opportunistic gc during insertion. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-10-24 13:16:29 +02:00
Eric Dumazet	2a7c8d291f	tcp: introduce tcp_clock_ms() It delivers current TCP time stamp in ms unit, and is used in place of confusing tcp_time_stamp_raw(). It is in the same family as tcp_clock_ns() and tcp_clock_us(). tcp_time_stamp_raw() will be replaced later for TSval contexts with a more descriptive name. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-23 09:35:01 +01:00
Jakub Kicinski	041c3466f3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. net/mac80211/key.c 02e0e426a2fb ("wifi: mac80211: fix error path key leak") 2a8b665e6bcc ("wifi: mac80211: remove key_mtx") 7d6904bf26b9 ("Merge wireless into wireless-next") https://lore.kernel.org/all/20231012113648.46eea5ec@canb.auug.org.au/ Adjacent changes: drivers/net/ethernet/ti/Kconfig a602ee3176a8 ("net: ethernet: ti: Fix mixed module-builtin object") 98bdeae9502b ("net: cpmac: remove driver to prepare for platform removal") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-10-19 13:29:01 -07:00
Pablo Neira Ayuso	f86fb94011	netfilter: nf_tables: revert do not remove elements if set backend implements .abort nf_tables_abort_release() path calls nft_set_elem_destroy() for NFT_MSG_NEWSETELEM which releases the element, however, a reference to the element still remains in the working copy. Fixes: ebd032fa8818 ("netfilter: nf_tables: do not remove elements if set backend implements .abort") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-18 13:47:32 +02:00
Pablo Neira Ayuso	d111692a59	netfilter: nft_set_rbtree: .deactivate fails if element has expired This allows to remove an expired element which is not possible in other existing set backends, this is more noticeable if gc-interval is high so expired elements remain in the tree. On-demand gc also does not help in this case, because this is delete element path. Return NULL if element has expired. Fixes: 8d8540c4f5e0 ("netfilter: nft_set_rbtree: add timeout support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-18 13:47:32 +02:00
Phil Sutter	1baf0152f7	netfilter: nf_tables: audit log object reset once per table When resetting multiple objects at once (via dump request), emit a log message per table (or filled skb) and resurrect the 'entries' parameter to contain the number of objects being logged for. To test the skb exhaustion path, perform some bulk counter and quota adds in the kselftest. Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Paul Moore <paul@paul-moore.com> (Audit) Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-18 13:43:40 +02:00
Florian Westphal	2560016721	netfilter: nf_tables: de-constify set commit ops function argument The set backend using this already has to work around this via ugly cast, don't spread this pattern. Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-18 10:26:43 +02:00
Florian Westphal	e0d4593140	netfilter: make nftables drops visible in net dropmonitor net_dropmonitor blames core.c:nf_hook_slow. Add NF_DROP_REASON() helper and use it in nft_do_chain(). The helper releases the skb, so exact drop location becomes available. Calling code will observe the NF_STOLEN verdict instead. Adjust nf_hook_slow so we can embed an errno value with NF_STOLEN verdicts, just like we do for NF_DROP. After this, drop in nftables can be pinpointed to a drop due to a rule or the chain policy. Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-18 10:26:43 +02:00
Florian Westphal	35c038b0a4	netfilter: nf_nat: mask out non-verdict bits when checking return value Same as previous change: we need to mask out the non-verdict bits, as upcoming patches may embed an errno value in NF_STOLEN verdicts too. NF_DROP could already do this, but not all called functions do this. Checks that only test ret vs NF_ACCEPT are fine, the 'errno parts' are always 0 for those. Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-18 10:26:43 +02:00
Florian Westphal	6291b3a67a	netfilter: conntrack: convert nf_conntrack_update to netfilter verdicts This function calls helpers that can return nf-verdicts, but then those get converted to -1/0 as that's what the caller expects. Theoretically NF_DROP could have an errno number set in the upper 24 bits of the return value. Or any of those helpers could return NF_STOLEN, which would result in use-after-free. This is fine as-is, the called functions don't do this yet. But it's better to avoid possible future problems if the upcoming patchset to add NF_DROP_REASON() support gains further users, so remove the 0/-1 translation from the picture and pass the verdicts down to the caller. Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-18 10:26:43 +02:00
Florian Westphal	4d26ab0086	netfilter: nf_tables: mask out non-verdict bits when checking return value nftables trace infra must mask out the non-verdict bit parts of the return value, else followup changes that 'return errno << 8 | NF_STOLEN' will cause breakage. Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-18 10:26:43 +02:00
Florian Westphal	d351c1ea2d	netfilter: nft_payload: fix wrong mac header matching mcast packets get looped back to the local machine. Such packets have a 0-length mac header, we should treat this like "mac header not set" and abort rule evaluation. As-is, we just copy data from the network header instead. Fixes: 96518518cc41 ("netfilter: add nftables") Reported-by: Blažej Krajňák <krajnak@levonet.sk> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-12 10:28:45 +02:00
Xingyuan Mo	505ce0630a	nf_tables: fix NULL pointer dereference in nft_expr_inner_parse() We should check whether the NFTA_EXPR_NAME netlink attribute is present before accessing it, otherwise a null pointer dereference error will occur. Call Trace: <TASK> dump_stack_lvl+0x4f/0x90 print_report+0x3f0/0x620 kasan_report+0xcd/0x110 __asan_load2+0x7d/0xa0 nla_strcmp+0x2f/0x90 __nft_expr_type_get+0x41/0xb0 nft_expr_inner_parse+0xe3/0x200 nft_inner_init+0x1be/0x2e0 nf_tables_newrule+0x813/0x1230 nfnetlink_rcv_batch+0xec3/0x1170 nfnetlink_rcv+0x1e4/0x220 netlink_unicast+0x34e/0x4b0 netlink_sendmsg+0x45c/0x7e0 __sys_sendto+0x355/0x370 __x64_sys_sendto+0x84/0xa0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching") Signed-off-by: Xingyuan Mo <hdthky0@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-12 10:28:45 +02:00
Xingyuan Mo	52177bbf19	nf_tables: fix NULL pointer dereference in nft_inner_init() We should check whether the NFTA_INNER_NUM netlink attribute is present before accessing it, otherwise a null pointer dereference error will occur. Call Trace: dump_stack_lvl+0x4f/0x90 print_report+0x3f0/0x620 kasan_report+0xcd/0x110 __asan_load4+0x84/0xa0 nft_inner_init+0x128/0x2e0 nf_tables_newrule+0x813/0x1230 nfnetlink_rcv_batch+0xec3/0x1170 nfnetlink_rcv+0x1e4/0x220 netlink_unicast+0x34e/0x4b0 netlink_sendmsg+0x45c/0x7e0 __sys_sendto+0x355/0x370 __x64_sys_sendto+0x84/0xa0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching") Signed-off-by: Xingyuan Mo <hdthky0@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-12 10:28:45 +02:00
Pablo Neira Ayuso	4c90bba60c	netfilter: nf_tables: do not refresh timeout when resetting element The dump and reset command should not refresh the timeout, this command is intended to allow users to list existing stateful objects and reset them, element expiration should be refresh via transaction instead with a specific command to achieve this, otherwise this is entering combo semantics that will be hard to be undone later (eg. a user asking to retrieve counters but _not_ requiring to refresh expiration). Fixes: 079cd633219d ("netfilter: nf_tables: Introduce NFT_MSG_GETSETELEM_RESET") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-12 10:28:45 +02:00
Kees Cook	d51c42cdef	netfilter: nf_tables: Annotate struct nft_pipapo_match with __counted_by Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct nft_pipapo_match. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Cc: netdev@vger.kernel.org Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-12 10:28:45 +02:00
Florian Westphal	2e1d175410	netfilter: nfnetlink_log: silence bogus compiler warning net/netfilter/nfnetlink_log.c:800:18: warning: variable 'ctinfo' is uninitialized The warning is bogus, the variable is only used if ct is non-NULL and always initialised in that case. Init to 0 too to silence this. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202309100514.ndBFebXN-lkp@intel.com/ Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-12 10:28:45 +02:00
Pablo Neira Ayuso	ebd032fa88	netfilter: nf_tables: do not remove elements if set backend implements .abort pipapo set backend maintains two copies of the datastructure, removing the elements from the copy that is going to be discarded slows down the abort path significantly, from several minutes to few seconds after this patch. Fixes: 212ed75dc5fb ("netfilter: nf_tables: integrate pipapo into commit protocol") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-12 10:28:45 +02:00
Florian Westphal	6ac9c51eeb	netfilter: conntrack: prefer tcp_error_log to pr_debug pr_debug doesn't provide any information other than that a packet did not match existing state but also was found to not create a new connection. Replaces this with tcp_error_log, which will also dump packets' content so one can see if this is a stray FIN or RST. Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-10 16:34:28 +02:00
Florian Westphal	8a23f4ab92	netfilter: conntrack: simplify nf_conntrack_alter_reply nf_conntrack_alter_reply doesn't do helper reassignment anymore. Remove the comments that make this claim. Furthermore, remove dead code from the function and place it in nf_conntrack.h. Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-10 16:34:28 +02:00
Phil Sutter	99ab9f84b8	netfilter: nf_tables: Don't allocate nft_rule_dump_ctx Since struct netlink_callback::args is not used by rule dumpers anymore, use it to hold nft_rule_dump_ctx. Add a build-time check to make sure it won't ever exceed the available space. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-10 16:34:28 +02:00
Phil Sutter	8194d599bc	netfilter: nf_tables: Carry s_idx in nft_rule_dump_ctx In order to move the context into struct netlink_callback's scratch area, the latter must be unused first. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-10 16:34:28 +02:00
Phil Sutter	405c8fd62d	netfilter: nf_tables: Carry reset flag in nft_rule_dump_ctx This relieves the dump callback from having to check nlmsg_type upon each call and instead performs the check once in .start callback. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-10 16:34:27 +02:00
Phil Sutter	30fa41a0f6	netfilter: nf_tables: Drop pointless memset when dumping rules None of the dump callbacks uses netlink_callback::args beyond the first element, no need to zero the data. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-10 16:34:20 +02:00
Phil Sutter	afed2b54c5	netfilter: nf_tables: Always allocate nft_rule_dump_ctx It will move into struct netlink_callback's scratch area later, just put nf_tables_dump_rules_start in shape to reduce churn later. Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-10 16:01:42 +02:00
Jakub Kicinski	2606cf059c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. No conflicts (or adjacent changes of note). Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-10-05 13:16:47 -07:00
Jakub Kicinski	07cf7974a2	netfilter pull request 2023-09-28 -----BEGIN PGP SIGNATURE----- iQJBBAABCAArFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmUVjwUNHGZ3QHN0cmxl bi5kZQAKCRBwkajZrV/2AKneEACzrKtIC0j0DyhgVW4Kb57T8Y7cD5wQCv7oz1Cx 8A3UJ1pSLYhRnz94zY453GIenK+zx/KKIetDhyWnjA9gjk95HkUN+OwuuiKnUAgu 7KPGbIYat7hERwoZpR88nrbTYXcDZfcZGTqWA++3yL2vn4Lu4lsuowqXYKBf/axk 5gEwEtwn2mVsdo0qTVJcXkHqnf5CCdqd26ixF4yB1rz/P6kISi4I9q7ul43paFJW +/ifacdG+7raQkGlUlYiDNMVd0uO01HHaAcWfYa+FOMK+GSn+89zzTs906CU0g2O GRJSWjNTgfDtM2AHN7peUnf/G9XHSK2Y7Re8FzauKzwWSl5N9w5610nbQnT+ME5O uOZE1P/lhnidOwCEV8zU4yhs6fBrCMCHz+S5Yh8C8PCUhi12IEEYRHyGCoUVMOwY 1LINjdn4HddL57QUGumy0VqVBlxQru8VXnlzm0eIyhsbZ3/mVXQWIHX4u1G36UUQ zSkm4/qP4kna/tV86mETNX1MUcJsQ1vQ842abcUbxudKei/uT9av6YHlz/aBOcQZ NDMrGVO6mjh7/HnYUr7+zbQfhLZdg424SpGEoiuS7dDcTpGlcT3pnWBJDGEHsy+4 0VnLI8/GPT1/jQCCYTVLu+tn0XmfZF18j2bvGhz1hM9J/HXaRpuqjGF6thLgYl63 CZf5Yg== =ALU2 -----END PGP SIGNATURE----- Merge tag 'nf-next-23-09-28' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Florian Westphal says: ==================== netfilter updates for net-next First patch, from myself, is a bug fix. The issue (connect timeout) is ancient, so I think its safe to give this more soak time given the esoteric conditions needed to trigger this. Also updates the existing selftest to cover this. Add netlink extacks when an update references a non-existent table/chain/set. This allows userspace to provide much better errors to the user, from Pablo Neira Ayuso. Last patch adds more policy checks to nf_tables as a better alternative to the existing runtime checks, from Phil Sutter. 
* tag 'nf-next-23-09-28' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: Utilize NLA_POLICY_NESTED_ARRAY netfilter: nf_tables: missing extended netlink error in lookup functions selftests: netfilter: test nat source port clash resolution interaction with tcp early demux netfilter: nf_nat: undo erroneous tcp edemux lookup after port clash ==================== Link: https://lore.kernel.org/r/20230928144916.18339-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-10-04 14:25:37 -07:00
Florian Westphal	087388278e	netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure nft_rbtree_gc_elem() walks back and removes the end interval element that comes before the expired element. There is a small chance that we've cached this element as 'rbe_ge'. If this happens, we hold and test a pointer that has been queued for freeing. It also causes spurious insertion failures: $ cat test-testcases-sets-0044interval_overlap_0.1/testout.log Error: Could not process rule: File exists add element t s { 0 - 2 } ^^^^^^ Failed to insert 0 - 2 given: table ip t { set s { type inet_service flags interval,timeout timeout 2s gc-interval 2s } } The set (rbtree) is empty. The 'failure' doesn't happen on next attempt. Reason is that when we try to insert, the tree may hold an expired element that collides with the range we're adding. While we do evict/erase this element, we can trip over this check: if (rbe_ge && nft_rbtree_interval_end(rbe_ge) && nft_rbtree_interval_end(new)) return -ENOTEMPTY; rbe_ge was erased by the synchronous gc, we should not have done this check. Next attempt won't find it, so retry results in successful insertion. Restart in-kernel to avoid such spurious errors. Such restart are rare, unless userspace intentionally adds very large numbers of elements with very short timeouts while setting a huge gc interval. Even in this case, this cannot loop forever, on each retry an existing element has been removed. As the caller is holding the transaction mutex, its impossible for a second entity to add more expiring elements to the tree. After this it also becomes feasible to remove the async gc worker and perform all garbage collection from the commit path. Fixes: c9e6978e2725 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection") Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-04 15:57:28 +02:00
Phil Sutter	0d880dc6f0	netfilter: nf_tables: Deduplicate nft_register_obj audit logs When adding/updating an object, the transaction handler emits suitable audit log entries already, the one in nft_obj_notify() is redundant. To fix that (and retain the audit logging from objects' 'update' callback), introduce an "audit log free" variant for internal use. Fixes: c520292f29b8 ("audit: log nftables configuration change events once per table") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Paul Moore <paul@paul-moore.com> (Audit) Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-04 15:57:06 +02:00
Xin Long	8e56b063c8	netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp In Scenario A and B below, as the delayed INIT_ACK always changes the peer vtag, SCTP ct with the incorrect vtag may cause packet loss. Scenario A: INIT_ACK is delayed until the peer receives its own INIT_ACK 192.168.1.2 > 192.168.1.1: [INIT] [init tag: 1328086772] 192.168.1.1 > 192.168.1.2: [INIT] [init tag: 1414468151] 192.168.1.2 > 192.168.1.1: [INIT ACK] [init tag: 1328086772] 192.168.1.1 > 192.168.1.2: [INIT ACK] [init tag: 1650211246] * 192.168.1.2 > 192.168.1.1: [COOKIE ECHO] 192.168.1.1 > 192.168.1.2: [COOKIE ECHO] 192.168.1.2 > 192.168.1.1: [COOKIE ACK] Scenario B: INIT_ACK is delayed until the peer completes its own handshake 192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885] 192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO] 192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021] * This patch fixes it as below: In SCTP_CID_INIT processing: - clear ct->proto.sctp.init[!dir] if ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir]. (Scenario E) - set ct->proto.sctp.init[dir]. In SCTP_CID_INIT_ACK processing: - drop it if !ct->proto.sctp.init[!dir] && ct->proto.sctp.vtag[!dir] && ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario B, Scenario C) - drop it if ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir] && ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario A) In SCTP_CID_COOKIE_ACK processing: - clear ct->proto.sctp.init[dir] and ct->proto.sctp.init[!dir]. (Scenario D) Also, it's important to allow the ct state to move forward with cookie_echo and cookie_ack from the opposite dir for the collision scenarios. 
There are also other Scenarios where it should allow the packet through, addressed by the processing above: Scenario C: new CT is created by INIT_ACK. Scenario D: start INIT on the existing ESTABLISHED ct. Scenario E: start INIT after the old collision on the existing ESTABLISHED ct. 192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885] (both side are stopped, then start new connection again in hours) 192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 242308742] Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-10-04 14:12:01 +02:00
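
Several entries in this log ("mask out non-verdict bits when checking return value", "convert nf_conntrack_update to netfilter verdicts") depend on a return value that carries a verdict in its low bits and an errno above them. The following is a minimal userspace sketch of that encoding, not kernel code: the `V_*` names, the 8-bit mask and the shift of 8 are assumptions taken from the wording of the commit messages ('errno << 8 | NF_STOLEN', "upper 24 bits"), and the real kernel macros may use different names and shifts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the netfilter verdict codes. */
enum { V_DROP = 0, V_ACCEPT = 1, V_STOLEN = 2 };

#define V_MASK  0xffu /* assumed: low 8 bits hold the verdict */
#define V_SHIFT 8     /* assumed: the bits above hold a positive errno */

/* Pack a verdict together with a positive errno value. */
static unsigned int verdict_with_errno(unsigned int verdict, int err)
{
	assert(err >= 0 && verdict <= V_MASK);
	return ((unsigned int)err << V_SHIFT) | verdict;
}

/* Strip the errno part so the verdict can be compared safely. */
static unsigned int verdict_code(unsigned int ret)
{
	return ret & V_MASK;
}

/* Recover the embedded errno (0 if none was set). */
static int verdict_errno(unsigned int ret)
{
	return (int)(ret >> V_SHIFT);
}

/* Masked comparison: true even when an errno is embedded. */
static bool is_stolen(unsigned int ret)
{
	return verdict_code(ret) == V_STOLEN;
}
```

A caller that tests `ret == V_STOLEN` directly would miss a value such as `verdict_with_errno(V_STOLEN, EPERM)`, whereas `is_stolen()` still recognises it. That is the class of breakage the masking commits guard against.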

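
The commit "expose opaque set element as struct nft_elem_priv" describes a placeholder struct placed first in each backend element, with a build-time check on its position. A rough userspace sketch of that pattern follows. All names here (`elem_priv`, `rb_elem`, `rb_elem_cast`, `rb_elem_to_priv`) are hypothetical, `static_assert` stands in for the kernel's BUILD_BUG_ON(), and the placeholder has a dummy member only because ISO C does not allow empty structs.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque placeholder type that the frontend passes around (hypothetical). */
struct elem_priv {
	char unused;
};

/* One backend's native element; the placeholder must be the first member. */
struct rb_elem {
	struct elem_priv priv;
	int key;
};

/* Build-time check that the cast below is valid. */
static_assert(offsetof(struct rb_elem, priv) == 0,
	      "priv must sit at the start of the element");

/* Backend side: recover the native element from the opaque handle. */
static struct rb_elem *rb_elem_cast(struct elem_priv *priv)
{
	return (struct rb_elem *)priv;
}

/* Backend side: hand the opaque handle to the frontend. */
static struct elem_priv *rb_elem_to_priv(struct rb_elem *elem)
{
	return &elem->priv;
}
```

Compared with passing `void *`, a function that takes `struct elem_priv *` rejects unrelated pointer types at compile time, which is the type-checking benefit the commit message refers to.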
6564 Commits