linux/net/core
Joel Fernandes (Google) 483c26ff63 net: Use call_rcu_hurry() for dst_release()
In a networking test on ChromeOS, kernels built with the new
CONFIG_RCU_LAZY=y Kconfig option fail a networking test in the teardown
phase.

This failure may be reproduced as follows: ip netns del <name>

The CONFIG_RCU_LAZY=y Kconfig option was introduced by earlier commits
in this series for the benefit of certain battery-powered systems.
This Kconfig option causes call_rcu() to delay its callbacks in order
to batch them.  This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, which increases battery lifetime which
can be a very good thing.  This is not a subtle effect: In some important
use cases, the battery lifetime is increased by more than 10%.

This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.

Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory.  If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order.  Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.

However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked.  It would not be a good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of
call_rcu().  The arrival of a non-lazy call_rcu_hurry() callback on a
given CPU kicks any lazy callbacks that might be already queued on that
CPU.  After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.

Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_hurry() for the few places
where laziness is inappropriate.

Returning to the test failure, use of ftrace showed that this failure
cause caused by the aadded delays due to this new lazy behavior of
call_rcu() in kernels built with CONFIG_RCU_LAZY=y.

Therefore, make dst_release() use call_rcu_hurry() in order to revert
to the old test-failure-free behavior.

[ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ]

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: <netdev@vger.kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-11-30 13:17:25 -08:00
..
bpf_sk_storage.c net: Fix data-races around sysctl_optmem_max. 2022-08-24 13:46:57 +01:00
datagram.c tcp: TX zerocopy should not sense pfmemalloc status 2022-09-02 12:29:02 +01:00
dev_addr_lists_test.c net: kunit: add a test for dev_addr_lists 2021-11-20 12:25:57 +00:00
dev_addr_lists.c net: extract a few internals from netdevice.h 2022-04-07 20:32:09 -07:00
dev_ioctl.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
dev.c net: skb: introduce and use a single page frag cache 2022-09-29 18:48:15 -07:00
dev.h net: add skb_defer_max sysctl 2022-05-16 11:33:59 +01:00
devlink.c net: devlink: add port_init/fini() helpers to allow pre-register/post-unregister functions 2022-09-30 18:17:16 -07:00
drop_monitor.c genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
dst_cache.c wireguard: device: reset peer src endpoint when netns exits 2021-11-29 19:50:45 -08:00
dst.c net: Use call_rcu_hurry() for dst_release() 2022-11-30 13:17:25 -08:00
failover.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
fib_notifier.c
fib_rules.c fib: expand fib_rule_policy 2021-12-16 07:18:35 -08:00
filter.c Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2022-10-03 13:02:49 -07:00
flow_dissector.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-09-22 13:02:10 -07:00
flow_offload.c flow_offload: Introduce flow_match_l2tpv3 2022-09-20 09:13:38 +02:00
gen_estimator.c net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
gen_stats.c net: sched: fix misuse of qcpu->backlog in gnet_stats_add_queue_cpu 2022-08-16 19:38:20 -07:00
gro_cells.c net: drop the weight argument from netif_napi_add 2022-09-28 18:57:14 -07:00
gro.c gro: add support of (hw)gro packets to gro stack 2022-10-03 12:38:34 +01:00
hwbm.c
link_watch.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
lwt_bpf.c bpf, lwt: Fix crash when using bpf_skb_set_tunnel_key() from bpf_xmit lwt hook 2022-04-22 17:45:25 +02:00
lwtunnel.c xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md mode 2022-08-29 10:44:08 +02:00
Makefile net: skb: export skb drop reaons to user by TRACE_DEFINE_ENUM 2022-09-07 15:28:08 +01:00
neighbour.c treewide: use prandom_u32_max() when possible, part 1 2022-10-11 17:42:55 -06:00
net_namespace.c Revert "net: set proper memcg for net_init hooks allocations" 2022-09-28 09:20:37 -07:00
net-procfs.c net: extract a few internals from netdevice.h 2022-04-07 20:32:09 -07:00
net-sysfs.c net-sysfs: Convert to use sysfs_emit() APIs 2022-09-30 12:27:44 +01:00
net-sysfs.h
net-traces.c tcp: add tracepoint for checksum errors 2021-05-14 15:26:03 -07:00
netclassid_cgroup.c core: Variable type completion 2022-08-31 09:40:34 +01:00
netevent.c net: core: Correct function name netevent_unregister_notifier() in the kerneldoc 2021-03-28 17:56:56 -07:00
netpoll.c net: move from strlcpy with unused retval to strscpy 2022-08-22 18:06:18 -07:00
netprio_cgroup.c bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode 2021-09-13 16:35:58 -07:00
of_net.c Revert "of: net: support NVMEM cells with MAC in text format" 2022-01-12 14:14:36 +00:00
page_pool.c - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
pktgen.c treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
ptp_classifier.c ptp: Add generic PTP is_sync() function 2022-03-07 11:31:34 +00:00
request_sock.c
rtnetlink.c net: rtnetlink: Enslave device before bringing it up 2022-09-20 08:37:44 -07:00
scm.c memcg: enable accounting for scm_fp_list objects 2021-07-20 06:00:38 -07:00
secure_seq.c tcp: Fix data-races around sysctl knobs related to SYN option. 2022-07-20 10:14:49 +01:00
selftests.c net: core: constify mac addrs in selftests 2021-10-24 13:59:44 +01:00
skbuff.c net: skb: introduce and use a single page frag cache 2022-09-29 18:48:15 -07:00
skmsg.c skmsg: Schedule psock work if the cached skb exists on the psock 2022-09-26 17:48:05 +02:00
sock_destructor.h skb_expand_head() adjust skb->truesize incorrectly 2021-10-22 12:35:51 -07:00
sock_diag.c net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
sock_map.c Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2022-08-17 20:29:36 -07:00
sock_reuseport.c tcp: Fix data-races around sysctl_tcp_migrate_req. 2022-07-18 12:21:54 +01:00
sock.c ipv6: Fix data races around sk->sk_prot. 2022-10-12 17:50:37 -07:00
stream.c treewide: use prandom_u32_max() when possible, part 1 2022-10-11 17:42:55 -06:00
sysctl_net_core.c net: sysctl: remove unused variable long_max 2022-09-07 15:31:19 +01:00
timestamping.c
tso.c
utils.c net: core: Use csum_replace_by_diff() and csum_sub() instead of opencoding 2022-02-21 11:40:44 +00:00
xdp.c xdp: improve page_pool xdp_return performance 2022-09-26 11:28:19 -07:00