linux/mm
Johannes Weiner 4c6355b25e mm: memcontrol: charge swapin pages on instantiation
Right now, users that are otherwise memory controlled can easily escape
their containment and allocate significant amounts of memory that they're
not being charged for.  That's because swap readahead pages are not being
charged until somebody actually faults them into their page table.  This
can be exploited with MADV_WILLNEED, which triggers arbitrary readahead
allocations without charging the pages.

There are additional problems with the delayed charging of swap pages:

1. To implement refault/workingset detection for anonymous pages, we
   need to have a target LRU available at swapin time, but the LRU is not
   determinable until the page has been charged.

2. To implement per-cgroup LRU locking, we need page->mem_cgroup to be
   stable when the page is isolated from the LRU; otherwise, the locks
   change under us.  But swapcache gets charged after it's already on the
   LRU, and the charge proceeds even if we cannot isolate the page
   ourselves (since charging is not exactly optional).

The previous patch ensured we always maintain cgroup ownership records for
swap pages.  This patch moves the swapcache charging point from the fault
handler to swapin time to fix all of the above problems.
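
As a rough illustration of why the charging point matters, here is a toy userspace model in C. It is not kernel code: `toy_cgroup`, `toy_charge()`, and `toy_readahead()` are hypothetical names invented for this sketch, and it assumes a limit that simply refuses further charges, whereas the real memory controller would attempt reclaim first. It only contrasts charging when a page is instantiated with charging when it is later faulted in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy cgroup: tracks only a charged page count and a limit. */
struct toy_cgroup {
	size_t charged;	/* pages currently charged to the group */
	size_t limit;	/* maximum pages the group may be charged for */
};

/* When the toy model charges a swapped-in page. */
enum charge_point {
	CHARGE_AT_FAULT,	/* delayed: only when the page is faulted in */
	CHARGE_AT_SWAPIN,	/* immediate: when the page is instantiated */
};

/* Charge one page; returns 0 on success, -1 if the limit is reached. */
static int toy_charge(struct toy_cgroup *cg)
{
	if (cg->charged >= cg->limit)
		return -1;
	cg->charged++;
	return 0;
}

/*
 * Simulate a readahead request for nr_pages swap pages, of which at most
 * nr_faulted are later faulted into a page table.  Returns the number of
 * pages that were actually allocated.
 */
static size_t toy_readahead(struct toy_cgroup *cg, enum charge_point when,
			    size_t nr_pages, size_t nr_faulted)
{
	size_t allocated = 0;
	size_t i;

	assert(cg != NULL);

	for (i = 0; i < nr_pages; i++) {
		/* Charging at swapin enforces the limit before the page exists. */
		if (when == CHARGE_AT_SWAPIN && toy_charge(cg) != 0)
			break;
		allocated++;
	}

	/* With delayed charging, only the pages that get faulted are charged. */
	if (when == CHARGE_AT_FAULT) {
		for (i = 0; i < nr_faulted && i < allocated; i++) {
			if (toy_charge(cg) != 0)
				break;
		}
	}

	return allocated;
}
```

In this model, a group limited to 4 pages that requests 100 readahead pages and faults none of them ends up with 100 allocated pages and 0 charged under `CHARGE_AT_FAULT`, but only 4 allocated and 4 charged under `CHARGE_AT_SWAPIN`.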

v2: simplify swapin error checking (Joonsoo)

[hughd@google.com: fix livelock in __read_swap_cache_async()]
  Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2005212246080.8458@eggly.anvils
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Link: http://lkml.kernel.org/r/20200508183105.225460-17-hannes@cmpxchg.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:48 -07:00
kasan kasan: move kasan_report() into report.c 2020-06-02 10:59:12 -07:00
backing-dev.c bdi: remove the name field in struct backing_dev_info 2020-05-09 16:15:13 -06:00
balloon_compaction.c mm/balloon_compaction: suppress allocation warnings 2019-09-04 07:42:01 -04:00
cleancache.c Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
cma_debug.c mm/cma_debug.c: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops 2019-12-01 12:59:09 -08:00
cma.c mm: cma: NUMA node interface 2020-04-10 15:36:21 -07:00
cma.h
compaction.c mm: rename gfpflags_to_migratetype to gfp_migratetype for same convention 2020-06-03 20:09:45 -07:00
debug_page_ref.c
debug.c mm, dump_page(): do not crash with invalid mapping pointer 2020-06-02 10:59:06 -07:00
dmapool.c mm/dmapool.c: micro-optimisation remove unnecessary branch 2020-04-07 10:43:42 -07:00
early_ioremap.c mm/early_ioremap.c: use %pa to print resource_size_t variables 2020-01-31 10:30:38 -08:00
fadvise.c mm: return void from various readahead functions 2020-06-02 10:59:06 -07:00
failslab.c mm/failslab.c: by default, do not fail allocations with direct reclaim only 2019-07-12 11:05:43 -07:00
filemap.c mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API 2020-06-03 20:09:48 -07:00
frame_vector.c mm: untag user pointers in get_vaddr_frames 2019-09-25 17:51:41 -07:00
frontswap.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 482 2019-06-19 17:09:52 +02:00
gup_benchmark.c mm/gup_benchmark: support pin_user_pages() and related calls 2020-04-02 09:35:27 -07:00
gup.c mm/gup: might_lock_read(mmap_sem) in get_user_pages_fast() 2020-06-03 20:09:42 -07:00
highmem.c mm, x86/mm: Untangle address space layout definitions from basic pgtable type definitions 2019-12-10 10:12:55 +01:00
hmm.c mm/hmm: remove the customizable pfn format from hmm_range_fault 2020-05-11 10:47:29 -03:00
huge_memory.c mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API 2020-06-03 20:09:48 -07:00
hugetlb_cgroup.c mm: use fallthrough; 2020-04-07 10:43:41 -07:00
hugetlb.c mm/hugetlb: avoid unnecessary check on pud and pmd entry in huge_pte_offset 2020-06-03 20:09:46 -07:00
hwpoison-inject.c mm/hwpoison-inject: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops 2019-12-01 12:59:09 -08:00
init-mm.c mm/init-mm.c: include <linux/mman.h> for vm_committed_as_batch 2019-10-19 06:32:32 -04:00
internal.h mm/vmscan.c: change prototype for shrink_page_list 2020-06-03 20:09:47 -07:00
interval_tree.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 248 2019-06-19 17:09:08 +02:00
Kconfig mm: parallelize deferred_init_memmap() 2020-06-03 20:09:45 -07:00
Kconfig.debug mm: add generic ptdump 2020-02-04 03:05:25 +00:00
khugepaged.c mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API 2020-06-03 20:09:48 -07:00
kmemleak-test.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
kmemleak.c mm/kmemleak.c: use address-of operator on section symbols 2020-04-02 09:35:26 -07:00
ksm.c mm/ksm: fix NULL pointer dereference when KSM zero page is enabled 2020-04-21 11:11:55 -07:00
list_lru.c mm: use fallthrough; 2020-04-07 10:43:41 -07:00
maccess.c uaccess: Add strict non-pagefault kernel-space read function 2019-11-02 12:39:12 -07:00
madvise.c mm: check that mm is still valid in madvise() 2020-04-24 13:28:03 -07:00
Makefile mm: introduce Reported pages 2020-04-07 10:43:38 -07:00
mapping_dirty_helpers.c mm/mapping_dirty_helpers: update huge page-table entry callbacks 2020-04-02 09:35:29 -07:00
memblock.c mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option 2020-06-03 20:09:43 -07:00
memcontrol.c mm: memcontrol: make swap tracking an integral part of memory control 2020-06-03 20:09:48 -07:00
memfd.c mm: page cache: store only head pages in i_pages 2019-09-24 15:54:08 -07:00
memory_hotplug.c mm/page_alloc: integrate classzone_idx and high_zoneidx 2020-06-03 20:09:44 -07:00
memory-failure.c ACPI updates for 5.8-rc1 2020-06-02 13:25:52 -07:00
memory.c mm: memcontrol: charge swapin pages on instantiation 2020-06-03 20:09:48 -07:00
mempolicy.c libnvdimm for 5.7 2020-04-08 21:03:40 -07:00
mempool.c
memremap.c mm/memremap: set caching mode for PCI P2PDMA memory to WC 2020-04-10 15:36:21 -07:00
memtest.c
migrate.c mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API 2020-06-03 20:09:48 -07:00
mincore.c mm: pagewalk: add 'depth' parameter to pte_hole 2020-02-04 03:05:25 +00:00
mlock.c mm: untag user pointers passed to memory syscalls 2019-09-25 17:51:41 -07:00
mm_init.c mm/mm_init.c: report kasan-tag information stored in page->flags 2020-06-02 10:59:12 -07:00
mmap.c mm/vma: introduce VM_ACCESS_FLAGS 2020-04-10 15:36:21 -07:00
mmu_context.c
mmu_gather.c asm-generic/tlb: provide MMU_GATHER_TABLE_FREE 2020-02-04 03:05:26 +00:00
mmu_notifier.c mm/mmu_notifier: silence PROVE_RCU_LIST warnings 2020-03-21 18:56:06 -07:00
mmzone.c
mprotect.c mm/vma: introduce VM_ACCESS_FLAGS 2020-04-10 15:36:21 -07:00
mremap.c userfaultfd: fix remap event with MREMAP_DONTUNMAP 2020-05-14 10:00:35 -07:00
msync.c mm: untag user pointers passed to memory syscalls 2019-09-25 17:51:41 -07:00
nommu.c mm: remove vmalloc_sync_(un)mappings() 2020-06-02 10:59:12 -07:00
oom_kill.c mm/page_alloc: integrate classzone_idx and high_zoneidx 2020-06-03 20:09:44 -07:00
page_alloc.c mm/vmscan.c: change prototype for shrink_page_list 2020-06-03 20:09:47 -07:00
page_counter.c mm, memcg: prevent memory.min load/store tearing 2020-04-02 09:35:29 -07:00
page_ext.c mm/page_ext.c: drop pfn_present() check when onlining 2020-04-07 10:43:40 -07:00
page_idle.c mm/page_idle.c: fix oops because end_pfn is larger than max_pfn 2019-06-29 16:43:45 +08:00
page_io.c fs: Enable bmap() function to properly return errors 2020-02-03 08:05:37 -05:00
page_isolation.c mm: add function __putback_isolated_page 2020-04-07 10:43:38 -07:00
page_owner.c mm: rename gfpflags_to_migratetype to gfp_migratetype for same convention 2020-06-03 20:09:45 -07:00
page_poison.c mm/page_poison.c: fix a typo in a comment 2019-09-24 15:54:08 -07:00
page_reporting.c mm/page_reporting: add budget limit on how many pages can be reported per pass 2020-04-07 10:43:39 -07:00
page_reporting.h mm: introduce Reported pages 2020-04-07 10:43:38 -07:00
page_vma_mapped.c mm/page_vma_mapped.c: explicitly compare pfn for normal, hugetlbfs and THP page 2020-01-31 10:30:38 -08:00
page-writeback.c mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead 2020-06-02 10:59:08 -07:00
pagewalk.c x86: mm: avoid allocating struct mm_struct on the stack 2020-02-04 03:05:25 +00:00
percpu-internal.h percpu: convert chunk hints to be based on pcpu_block_md 2019-03-13 12:25:31 -07:00
percpu-km.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
percpu-stats.c percpu: update copyright emails to dennis@kernel.org 2020-04-01 10:09:12 -07:00
percpu-vm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
percpu.c mm: remove the pgprot argument to __vmalloc 2020-06-02 10:59:11 -07:00
pgtable-generic.c asm-generic/mm: stub out p{4,u}d_clear_bad() if __PAGETABLE_P{4,U}D_FOLDED 2019-12-01 06:29:19 -08:00
process_vm_access.c mm: docs: Fix a comment in process_vm_rw_core 2020-03-25 10:04:01 -05:00
ptdump.c x86: mm: ptdump: calculate effective permissions correctly 2020-06-02 10:59:09 -07:00
readahead.c mm: use memalloc_nofs_save in readahead path 2020-06-02 10:59:07 -07:00
rmap.c mm: memcontrol: switch to native NR_ANON_THPS counter 2020-06-03 20:09:47 -07:00
rodata_test.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
shmem.c mm: memcontrol: charge swapin pages on instantiation 2020-06-03 20:09:48 -07:00
shuffle.c mm: adjust shuffle code to allow for future coalescing 2020-04-07 10:43:38 -07:00
shuffle.h mm: adjust shuffle code to allow for future coalescing 2020-04-07 10:43:38 -07:00
slab_common.c usercopy: mark dma-kmalloc caches as usercopy caches 2020-06-02 10:59:06 -07:00
slab.c mm/page_alloc: integrate classzone_idx and high_zoneidx 2020-06-03 20:09:44 -07:00
slab.h mm: kmem: rename (__)memcg_kmem_(un)charge_memcg() to __memcg_kmem_(un)charge() 2020-04-02 09:35:28 -07:00
slob.c mm/sl[uo]b: export __kmalloc_track(_node)_caller 2020-03-26 14:45:51 +01:00
slub.c mm/page_alloc: integrate classzone_idx and high_zoneidx 2020-06-03 20:09:44 -07:00
sparse-vmemmap.c mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap() 2019-07-18 17:08:07 -07:00
sparse.c mm/sparse.c: move subsection_map related functions together 2020-04-07 10:43:40 -07:00
swap_cgroup.c mm: memcontrol: make swap tracking an integral part of memory control 2020-06-03 20:09:48 -07:00
swap_slots.c mm/swap_slots.c: assign|reset cache slot by value directly 2020-04-02 09:35:27 -07:00
swap_state.c mm: memcontrol: charge swapin pages on instantiation 2020-06-03 20:09:48 -07:00
swap.c mm: simplify calling a compound page destructor 2020-06-03 20:09:47 -07:00
swapfile.c mm: memcontrol: charge swapin pages on instantiation 2020-06-03 20:09:48 -07:00
truncate.c mm/thp: allow dropping THP from page cache 2019-10-19 06:32:33 -04:00
usercopy.c usercopy: Avoid HIGHMEM pfn warning 2019-09-17 15:20:17 -07:00
userfaultfd.c mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API 2020-06-03 20:09:48 -07:00
util.c mm: remove __vmalloc_node_flags_caller 2020-06-02 10:59:11 -07:00
vmacache.c
vmalloc.c mm: remove vmalloc_sync_(un)mappings() 2020-06-02 10:59:12 -07:00
vmpressure.c mm: vmpressure: use mem_cgroup_is_root API 2020-04-02 09:35:31 -07:00
vmscan.c mm/vmscan: update the comment of should_continue_reclaim() 2020-06-03 20:09:47 -07:00
vmstat.c mm/vmstat.c: do not show lowmem reserve protection information of empty zone 2020-06-03 20:09:44 -07:00
workingset.c mm: vmscan: detect file thrashing at the reclaim root 2019-12-01 12:59:07 -08:00
z3fold.c mm/z3fold: silence kmemleak false positives of slots 2020-05-28 11:35:40 -07:00
zbud.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
zpool.c zpool: add malloc_support_movable to zpool_driver 2019-09-24 15:54:12 -07:00
zsmalloc.c mm: remove map_vm_range 2020-06-02 10:59:11 -07:00
zswap.c mm/zswap: allow setting default status, compressor and allocator in Kconfig 2020-04-07 10:43:41 -07:00