linux/include
Jesper Dangaard Brouer fb452a1aa3 Revert "net: use lib/percpu_counter API for fragmentation mem accounting"
This reverts commit 6d7b857d54.

There is a bug in fragmentation codes use of the percpu_counter API,
that can cause issues on systems with many CPUs.

The frag_mem_limit() just reads the global counter (fbc->count),
without considering other CPUs can have upto batch size (130K) that
haven't been subtracted yet.  Due to the 3MBytes lower thresh limit,
this become dangerous at >=24 CPUs (3*1024*1024/130000=24).

The correct API usage would be to use __percpu_counter_compare() which
does the right thing, and takes into account the number of (online)
CPUs and batch size, to account for this and call __percpu_counter_sum()
when needed.

We choose to revert the use of the lib/percpu_counter API for frag
memory accounting for several reasons:

1) On systems with CPUs > 24, the heavier fully locked
   __percpu_counter_sum() is always invoked, which will be more
   expensive than the atomic_t that is reverted to.

Given systems with more than 24 CPUs are becoming common this doesn't
seem like a good option.  To mitigate this, the batch size could be
decreased and thresh be increased.

2) The add_frag_mem_limit+sub_frag_mem_limit pairs happen on the RX
   CPU, before SKBs are pushed into sockets on remote CPUs.  Given
   NICs can only hash on L2 part of the IP-header, the NIC-RXq's will
   likely be limited.  Thus, a fair chance that atomic add+dec happen
   on the same CPU.

Revert note that commit 1d6119baf0 ("net: fix percpu memory leaks")
removed init_frag_mem_limit() and instead use inet_frags_init_net().
After this revert, inet_frags_uninit_net() becomes empty.

Fixes: 6d7b857d54 ("net: use lib/percpu_counter API for fragmentation mem accounting")
Fixes: 1d6119baf0 ("net: fix percpu memory leaks")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-03 11:01:05 -07:00
..
acpi ACPI: NUMA: add missing include in acpi_numa.h 2017-07-24 22:27:43 +02:00
asm-generic cpumask: fix spurious cpumask_of_node() on non-NUMA multi-node configs 2017-08-28 16:13:16 -07:00
clocksource
crypto
drm i915, amd and some core fixes + mediatek color support 2017-07-13 11:26:18 -07:00
dt-bindings Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2017-07-15 10:59:54 -07:00
keys
kvm KVM: arm/arm64: PMU: Fix overflow interrupt injection 2017-07-25 14:18:01 +01:00
linux Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-09-01 12:49:03 -07:00
math-emu
media media: platform: davinci: drop VPFE_CMD_S_CCDC_RAW_PARAMS 2017-07-26 06:14:33 -04:00
memory
misc
net Revert "net: use lib/percpu_counter API for fragmentation mem accounting" 2017-09-03 11:01:05 -07:00
pcmcia
ras
rdma IB/core: Avoid accessing non-allocated memory when inferring port type 2017-08-24 15:33:33 -04:00
rxrpc
scsi SCSI fixes on 20170823 2017-08-23 11:34:40 -07:00
soc
sound ASoC: fix pcm-creation regression 2017-07-17 15:50:32 +01:00
target iscsi-target: Fix iscsi_np reset hung task during parallel delete 2017-08-06 14:41:41 -07:00
trace ext4: remove unused metadata accounting variables 2017-07-30 22:30:11 -04:00
uapi libnvdimm: clean up command definitions 2017-08-28 08:33:20 -07:00
video
xen xen/balloon: don't online new memory initially 2017-07-23 08:13:18 +02:00