Go to file
Johannes Weiner b91ac37434 mm: vmscan: enforce inactive:active ratio at the reclaim root
We split the LRU lists into inactive and an active parts to maximize
workingset protection while allowing just enough inactive cache space to
faciltate readahead and writeback for one-off file accesses (e.g.  a
linear scan through a file, or logging); or just enough inactive anon to
maintain recent reference information when reclaim needs to swap.

With cgroups and their nested LRU lists, we currently don't do this
correctly.  While recursive cgroup reclaim establishes a relative LRU
order among the pages of all involved cgroups, inactive:active size
decisions are done on a per-cgroup level.  As a result, we'll reclaim a
cgroup's workingset when it doesn't have cold pages, even when one of its
siblings has plenty of it that should be reclaimed first.

For example: workload A has 50M worth of hot cache but doesn't do any
one-off file accesses; meanwhile, parallel workload B scans files and
rarely accesses the same page twice.

If these workloads were to run in an uncgrouped system, A would be
protected from the high rate of cache faults from B.  But if they were put
in parallel cgroups for memory accounting purposes, B's fast cache fault
rate would push out the hot cache pages of A.  This is unexpected and
undesirable - the "scan resistance" of the page cache is broken.

This patch moves inactive:active size balancing decisions to the root of
reclaim - the same level where the LRU order is established.

It does this by looking at the recursive size of the inactive and the
active file sets of the cgroup subtree at the beginning of the reclaim
cycle, and then making a decision - scan or skip active pages - that
applies throughout the entire run and to every cgroup involved.

With that in place, in the test above, the VM will recognize that there
are plenty of inactive pages in the combined cache set of workloads A and
B and prefer the one-off cache in B over the hot pages in A.  The scan
resistance of the cache is restored.

[cai@lca.pw: fix some -Wenum-conversion warnings]
  Link: http://lkml.kernel.org/r/1573848697-29262-1-git-send-email-cai@lca.pw
Link: http://lkml.kernel.org/r/20191107205334.158354-4-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-01 12:59:07 -08:00
arch x86/kasan: support KASAN_VMALLOC 2019-12-01 12:59:06 -08:00
block for-5.5/disk-revalidate-20191122 2019-11-25 11:37:01 -08:00
certs PKCS#7: Refactor verify_pkcs7_signature() 2019-08-05 18:40:18 -04:00
crypto Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-11-25 19:49:58 -08:00
Documentation kasan: support backing vmalloc space with real shadow memory 2019-12-01 12:59:05 -08:00
drivers drivers/base/memory.c: drop the mem_sysfs_mutex 2019-12-01 12:59:04 -08:00
fs fs/direct-io.c: keep dio_warn_stale_pagecache() when CONFIG_BLOCK=n 2019-12-01 06:29:18 -08:00
include mm: vmscan: enforce inactive:active ratio at the reclaim root 2019-12-01 12:59:07 -08:00
init Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2019-11-25 20:02:57 -08:00
ipc ipc/sem.c: convert to use built-in RCU list checking 2019-09-25 17:51:41 -07:00
kernel fork: support VMAP_STACK with KASAN_VMALLOC 2019-12-01 12:59:05 -08:00
lib kasan: add test for vmalloc 2019-12-01 12:59:05 -08:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
mm mm: vmscan: enforce inactive:active ratio at the reclaim root 2019-12-01 12:59:07 -08:00
net for-5.5/io_uring-post-20191128 2019-11-28 10:43:39 -08:00
samples New tracing features: 2019-11-27 11:42:01 -08:00
scripts scripts/spelling.txt: add more spellings to spelling.txt 2019-12-01 06:29:17 -08:00
security drm main pull for 5.5-rc1 2019-11-27 17:45:48 -08:00
sound sound updates for 5.5-rc1 2019-11-26 20:04:35 -08:00
tools selftests: vm: add fragment CONFIG_TEST_VMALLOC 2019-12-01 12:59:05 -08:00
usr kbuild: update compile-test header list for v5.4-rc2 2019-10-05 15:29:49 +09:00
virt KVM: Fix jump label out_free_* in kvm_init() 2019-11-23 11:29:17 +01:00
.clang-format clang-format: Update with the latest for_each macro list 2019-08-31 10:00:51 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes
.gitignore Modules updates for v5.4 2019-09-22 10:34:46 -07:00
.mailmap Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-11-26 14:52:11 -08:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS: Remove Simon as Renesas SoC Co-Maintainer 2019-10-10 08:12:51 -07:00
Kbuild kbuild: do not descend to ./Kbuild when cleaning 2019-08-21 21:03:58 +09:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS \n 2019-11-30 11:34:33 -08:00
Makefile Linux 5.4 2019-11-24 16:32:01 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.