iv/linux

Kirill Tkhai 0a4465d340 mm, memcg: assign memcg-aware shrinkers bitmap to memcg

Imagine a big node with many cpus, memory cgroups and containers. Suppose we have 200 containers, and every container has 10 mounts and 10 cgroups. Container tasks do not touch other containers' mounts. If there is intensive page writing and global reclaim happens, a writing task has to iterate over all memcgs to shrink slab before it is able to go to shrink_page_list().

Iteration over all the memcg slabs is very expensive: the task has to visit 200 * 10 = 2000 shrinkers for every memcg, and since there are 2000 memcgs, the total number of calls is 2000 * 2000 = 4000000.

So, the shrinker makes 4 million do_shrink_slab() calls just to try to isolate SWAP_CLUSTER_MAX pages in one of the actively writing memcgs via shrink_page_list(). I've observed a node spending almost 100% of its time in the kernel, uselessly iterating over already shrunk slabs.

This patch adds a bitmap of memcg-aware shrinkers to memcg. The size of the bitmap depends on bitmap_nr_ids, and during the memcg's life it is maintained to be large enough to fit bitmap_nr_ids shrinkers. Every bit in the map corresponds to a shrinker id.

The next patches will keep a bit set only for memcgs that are actually charged. This will allow shrink_slab() to improve its performance significantly. See the last patch for the numbers.
[ktkhai@virtuozzo.com: v9]
Link: http://lkml.kernel.org/r/153112549031.4097.3576147070498769979.stgit@localhost.localdomain
[ktkhai@virtuozzo.com: add comment to mem_cgroup_css_online()]
Link: http://lkml.kernel.org/r/521f9e5f-c436-b388-fe83-4dc870bfb489@virtuozzo.com
Link: http://lkml.kernel.org/r/153063056619.1818.12550500883688681076.stgit@localhost.localdomain
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Tested-by: Shakeel Butt <shakeelb@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Sahitya Tummala <stummala@codeaurora.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2018-08-17 16:20:30 -07:00
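
The call-count arithmetic in the commit message above can be checked with a small user-space toy model. This is only a sketch, not kernel code: the names `Memcg`, `build_memcgs`, `calls_naive` and `calls_with_bitmap` are hypothetical. It also assumes the end state the commit message describes for later patches, where each memcg keeps a bit set only for the shrinkers of its own container's mounts, and that every cgroup has objects charged on all 10 of those mounts.

```python
# Toy model of the scenario in the commit message; NOT kernel code.
# All class and function names are hypothetical and exist only for illustration.

CONTAINERS = 200
MOUNTS_PER_CONTAINER = 10    # assumed: one memcg-aware shrinker per mount
CGROUPS_PER_CONTAINER = 10

NR_SHRINKERS = CONTAINERS * MOUNTS_PER_CONTAINER   # 2000
NR_MEMCGS = CONTAINERS * CGROUPS_PER_CONTAINER     # 2000


class Memcg:
    """A memory cgroup holding a bitmap indexed by shrinker id."""

    def __init__(self, container):
        self.container = container
        self.shrinker_map = 0  # bit i set: shrinker i may hold objects charged here

    def set_bit(self, shrinker_id):
        self.shrinker_map |= 1 << shrinker_id

    def set_bits(self):
        """Yield the ids of all set bits, lowest first."""
        bits = self.shrinker_map
        while bits:
            low = bits & -bits          # isolate the lowest set bit
            yield low.bit_length() - 1
            bits ^= low                 # clear it and continue


def build_memcgs():
    """Create 2000 memcgs; each marks only its own container's 10 shrinkers."""
    memcgs = []
    for c in range(CONTAINERS):
        first = c * MOUNTS_PER_CONTAINER
        own_shrinkers = range(first, first + MOUNTS_PER_CONTAINER)
        for _ in range(CGROUPS_PER_CONTAINER):
            memcg = Memcg(c)
            for shrinker_id in own_shrinkers:
                memcg.set_bit(shrinker_id)
            memcgs.append(memcg)
    return memcgs


def calls_naive(memcgs):
    """Every memcg visits every shrinker, as described before the patch series."""
    return len(memcgs) * NR_SHRINKERS


def calls_with_bitmap(memcgs):
    """Every memcg visits only the shrinkers whose bit is set in its map."""
    return sum(sum(1 for _ in memcg.set_bits()) for memcg in memcgs)


if __name__ == "__main__":
    memcgs = build_memcgs()
    print("without bitmap:", calls_naive(memcgs))
    print("with bitmap:   ", calls_with_bitmap(memcgs))
```

Under these assumptions the naive walk makes 2000 * 2000 = 4000000 calls, matching the figure in the commit message, while the bitmap-guided walk makes 2000 * 10 = 20000. The real savings depend on how many memcgs actually have charged objects per shrinker; the commit message defers the measured numbers to the last patch of the series.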
..
kasan	kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN	2018-08-17 16:20:30 -07:00
backing-dev.c	bdi: Fix another oops in wb_workfn()	2018-06-22 12:08:07 -06:00
balloon_compaction.c	virtio_balloon: fix deadlock on OOM	2017-11-14 23:57:38 +02:00
bootmem.c	docs/mm: bootmem: add overview documentation	2018-08-02 12:17:27 -06:00
cleancache.c	mm: use octal not symbolic permissions	2018-06-15 07:55:25 +09:00
cma_debug.c	mm: use octal not symbolic permissions	2018-06-15 07:55:25 +09:00
cma.c	Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE"	2018-05-24 10:07:50 -07:00
cma.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
compaction.c	mm: use octal not symbolic permissions	2018-06-15 07:55:25 +09:00
debug_page_ref.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
debug.c	mm: teach dump_page() to correctly output poisoned struct pages	2018-07-03 17:32:19 -07:00
dmapool.c	mm: use octal not symbolic permissions	2018-06-15 07:55:25 +09:00
early_ioremap.c	mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep	2017-12-11 14:54:44 +01:00
fadvise.c	mm/fadvise.c: fix signed overflow UBSAN complaint	2018-08-17 16:20:30 -07:00
failslab.c	mm: use octal not symbolic permissions	2018-06-15 07:55:25 +09:00
filemap.c	mm: use new return type vm_fault_t	2018-06-07 17:34:36 -07:00
frame_vector.c	mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()'	2017-12-14 16:00:48 -08:00
frontswap.c	mm: use octal not symbolic permissions	2018-06-15 07:55:25 +09:00
gup_benchmark.c	treewide: kvzalloc() -> kvcalloc()	2018-06-12 16:19:22 -07:00
gup.c	mm: do not bug_on on incorrect length in __mm_populate()	2018-07-14 11:11:10 -07:00
highmem.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
hmm.c	mm: convert return type of handle_mm_fault() caller to vm_fault_t	2018-08-17 16:20:28 -07:00
huge_memory.c	mm, huge page: copy target sub-page last when copy huge page	2018-08-17 16:20:29 -07:00
hugetlb_cgroup.c	mm: rename page_counter's count/limit into usage/max	2018-06-07 17:34:35 -07:00
hugetlb.c	mm, hugetlbfs: pass fault address to cow handler	2018-08-17 16:20:29 -07:00
hwpoison-inject.c	mm/memory_failure: Remove unused trapno from memory_failure	2018-01-23 12:17:42 -06:00
init-mm.c	mm: Allocate the mm_cpumask (mm->cpu_bitmap[]) dynamically based on nr_cpu_ids	2018-07-17 09:35:30 +02:00
internal.h	Changes for 4.18:	2018-06-05 13:24:20 -07:00
interval_tree.c	mm/interval_tree.c: use vma_pages() helper	2018-01-31 17:18:37 -08:00
Kconfig	mm: make DEFERRED_STRUCT_PAGE_INIT explicitly depend on SPARSEMEM	2018-08-17 16:20:30 -07:00
Kconfig.debug	kmemcheck: rip it out	2017-11-15 18:21:05 -08:00
khugepaged.c	mm: thp: pass correct vm_flags to hugepage_vma_check()	2018-08-17 16:20:30 -07:00
kmemleak-test.c
kmemleak.c	mm: kernel-doc: add missing parameter descriptions	2018-04-05 21:36:27 -07:00
ksm.c	mm: convert return type of handle_mm_fault() caller to vm_fault_t	2018-08-17 16:20:28 -07:00
list_lru.c	mm: introduce CONFIG_MEMCG_KMEM as combination of CONFIG_MEMCG && !CONFIG_SLOB	2018-08-17 16:20:30 -07:00
maccess.c	mm: docs: fix parameter names mismatch	2018-02-06 18:32:48 -08:00
madvise.c	mm/memory_failure: Remove unused trapno from memory_failure	2018-01-23 12:17:42 -06:00
Makefile	mm: restructure memfd code	2018-06-07 17:34:35 -07:00
memblock.c	mm/memblock.c: replace u64 with phys_addr_t where appropriate	2018-08-17 16:20:30 -07:00
memcontrol.c	mm, memcg: assign memcg-aware shrinkers bitmap to memcg	2018-08-17 16:20:30 -07:00
memfd.c	alloc_file(): switch to passing O_... flags instead of FMODE_... mode	2018-07-12 10:02:57 -04:00
memory_hotplug.c	mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()	2018-08-17 16:20:29 -07:00
memory-failure.c	mm, migrate: remove reason argument from new_page_t	2018-04-11 10:28:32 -07:00
memory.c	memcg, oom: move out_of_memory back to the charge path	2018-08-17 16:20:30 -07:00
mempolicy.c	mm: use vma_init() to initialize VMAs on stack and data segments	2018-07-26 19:38:03 -07:00
mempool.c	mm/mempool.c: remove unused argument in kasan_unpoison_element() and remove_element()	2018-08-17 16:20:28 -07:00
memtest.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
migrate.c	dax: remove VM_MIXEDMAP for fsdax and device dax	2018-08-17 16:20:27 -07:00
mincore.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
mlock.c	dax: remove VM_MIXEDMAP for fsdax and device dax	2018-08-17 16:20:27 -07:00
mm_init.c
mmap.c	dax: remove VM_MIXEDMAP for fsdax and device dax	2018-08-17 16:20:27 -07:00
mmu_context.c	sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h>	2017-03-02 08:42:38 +01:00
mmu_notifier.c	mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacks	2018-01-31 17:18:38 -08:00
mmzone.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
mprotect.c	x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings	2018-06-20 19:10:01 +02:00
mremap.c	mremap: remove LATENCY_LIMIT from mremap to reduce the number of TLB shootdowns	2018-06-15 07:55:24 +09:00
msync.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
nobootmem.c	mm/memblock: add a name for memblock flags enumeration	2018-08-02 12:17:27 -06:00
nommu.c	mm: provide a fallback for PAGE_KERNEL_EXEC for architectures	2018-08-17 16:20:29 -07:00
oom_kill.c	mm: fix oom_kill event handling	2018-06-15 07:55:25 +09:00
page_alloc.c	mm: drop VM_BUG_ON from __get_free_pages	2018-08-17 16:20:29 -07:00
page_counter.c	memcg: introduce memory.min	2018-06-07 17:34:36 -07:00
page_ext.c	mm/page_ext.c: constify lookup_page_ext() argument	2018-08-17 16:20:28 -07:00
page_idle.c	mm: use octal not symbolic permissions	2018-06-15 07:55:25 +09:00
page_io.c	swap,blkcg: issue swap io with the appropriate context	2018-07-09 09:07:54 -06:00
page_isolation.c	mm, migrate: remove reason argument from new_page_t	2018-04-11 10:28:32 -07:00
page_owner.c	mm: use octal not symbolic permissions	2018-06-15 07:55:25 +09:00
page_poison.c	mm/page_poison.c: make early_page_poison_param() __init	2018-04-05 21:36:26 -07:00
page_vma_mapped.c	mm, page_vma_mapped: Introduce pfn_in_hpage()	2018-01-22 12:15:57 -08:00
page-writeback.c	mm/page-writeback.c: update stale account_page_redirty() comment	2018-08-17 16:20:30 -07:00
pagewalk.c	mm: kernel-doc: add missing parameter descriptions	2018-04-05 21:36:27 -07:00
percpu-internal.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
percpu-km.c	percpu: allow select gfp to be passed to underlying allocators	2018-02-18 05:33:01 -08:00
percpu-stats.c	treewide: Use array_size() in vmalloc()	2018-06-12 16:19:22 -07:00
percpu-vm.c	percpu: allow select gfp to be passed to underlying allocators	2018-02-18 05:33:01 -08:00
percpu.c	arch: remove obsolete architecture ports	2018-04-02 20:20:12 -07:00
pgtable-generic.c	mm: do not lose dirty and accessed bits in pmdp_invalidate()	2018-01-31 17:18:38 -08:00
process_vm_access.c	mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors	2018-02-06 18:32:48 -08:00
quicklist.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
readahead.c	readahead: stricter check for bdi io_pages	2018-07-27 09:09:53 -06:00
rmap.c	mm: do not drop unused pages when userfaultd is running	2018-07-14 11:11:09 -07:00
rodata_test.c	mm: fix RODATA_TEST failure "rodata_test: test data was not read only"	2017-10-03 17:54:24 -07:00
shmem.c	shmem: use monotonic time for i_generation	2018-08-17 16:20:28 -07:00
slab_common.c	mm: introduce CONFIG_MEMCG_KMEM as combination of CONFIG_MEMCG && !CONFIG_SLOB	2018-08-17 16:20:30 -07:00
slab.c	treewide: kzalloc() -> kcalloc()	2018-06-12 16:19:22 -07:00
slab.h	mm: introduce CONFIG_MEMCG_KMEM as combination of CONFIG_MEMCG && !CONFIG_SLOB	2018-08-17 16:20:30 -07:00
slob.c	slab: __GFP_ZERO is incompatible with a constructor	2018-06-07 17:34:34 -07:00
slub.c	mm, slub: restore the original intention of prefetch_freepointer()	2018-08-17 16:20:28 -07:00
sparse-vmemmap.c	mm: merge vmem_altmap_alloc into altmap_alloc_block_buf	2018-01-08 11:46:23 -08:00
sparse.c	mm/sparse.c: make sparse_init_one_section void and remove check	2018-08-17 16:20:30 -07:00
swap_cgroup.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
swap_slots.c	mm/swap_slots.c: make swap_slots_cache_mutex and swap_slots_cache_enable_mutex static	2018-08-17 16:20:30 -07:00
swap_state.c	treewide: kvzalloc() -> kvcalloc()	2018-06-12 16:19:22 -07:00
swap.c	mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS	2018-05-22 06:59:39 -07:00
swapfile.c	for-4.19/block-20180812	2018-08-14 10:23:25 -07:00
truncate.c	page cache: use xa_lock	2018-04-11 10:28:39 -07:00
usercopy.c	usercopy: Allow boot cmdline disabling of hardening	2018-07-04 08:04:52 -07:00
userfaultfd.c	userfaultfd: prevent non-cooperative events vs mcopy_atomic races	2018-06-07 17:34:38 -07:00
util.c	mm: kvmalloc does not fallback to vmalloc for incompatible gfp flags	2018-06-07 17:34:38 -07:00
vmacache.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
vmalloc.c	mm: provide a fallback for PAGE_KERNEL_EXEC for architectures	2018-08-17 16:20:29 -07:00
vmpressure.c	mm/vmpressure.c: convert to use match_string() helper	2018-06-07 17:34:36 -07:00
vmscan.c	mm, memcg: assign memcg-aware shrinkers bitmap to memcg	2018-08-17 16:20:30 -07:00
vmstat.c	Revert mm/vmstat.c: fix vmstat_update() preemption BUG	2018-06-28 11:16:44 -07:00
workingset.c	mm: workingset: make shadow_lru_isolate() use locking suffix	2018-08-17 16:20:29 -07:00
z3fold.c	z3fold: fix reclaim lock-ups	2018-05-11 17:28:45 -07:00
zbud.c	mm: docs: fix parameter names mismatch	2018-02-06 18:32:48 -08:00
zpool.c	mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc	2018-02-21 15:35:43 -08:00
zsmalloc.c	mm/zsmalloc.c: make several functions and a struct static	2018-08-17 16:20:30 -07:00
zswap.c	zswap: re-check zswap_is_full() after do zswap_shrink()	2018-07-26 19:38:03 -07:00