Go to file
Oscar Salvador ea2c3f6f55 mm,mremap: bail out earlier in mremap_to under map pressure
When using mremap() syscall in addition to MREMAP_FIXED flag, mremap()
calls mremap_to() which does the following:

1) unmaps the destination region where we are going to move the map
2) If the new region is going to be smaller, we unmap the last part
   of the old region

Then, we will eventually call move_vma() to do the actual move.

move_vma() checks whether we are at least 4 maps below max_map_count
before going further, otherwise it bails out with -ENOMEM.  The problem
is that we might have already unmapped the vma's in steps 1) and 2), so
it is not possible for userspace to figure out the state of the vmas
after it gets -ENOMEM, and it gets tricky for userspace to clean up
properly on error path.

While it is true that we can return -ENOMEM for more reasons (e.g: see
may_expand_vm() or move_page_tables()), I think that we can avoid this
scenario if we check early in mremap_to() if the operation has high
chances to succeed map-wise.

Should that not be the case, we can bail out before we even try to unmap
anything, so we make sure the vma's are left untouched in case we are
likely to be short of maps.

The thumb-rule now is to rely on the worst-scenario case we can have.
That is when both vma's (old region and new region) are going to be
split in 3, so we get two more maps to the ones we already hold (one per
each).  If current map count + 2 maps still leads us to 4 maps below the
threshold, we are going to pass the check in move_vma().

Of course, this is not free, as it might generate false positives when
it is true that we are tight map-wise, but the unmap operation can
release several vma's leading us to a good state.

Another approach was also investigated [1], but it may be too much
hassle for what it brings.

[1] https://lore.kernel.org/lkml/20190219155320.tkfkwvqk53tfdojt@d104.suse.de/

Link: http://lkml.kernel.org/r/20190226091314.18446-1-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:21 -08:00
arch docs/core-api/mm: fix user memory accessors formatting 2019-03-05 21:07:20 -08:00
block for-linus-20190215 2019-02-15 09:12:28 -08:00
certs kbuild: remove redundant target cleaning on failure 2019-01-06 09:46:51 +09:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-03-05 09:09:55 -08:00
Documentation mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
drivers agp: efficeon: no need to set PG_reserved on GATT tables 2019-03-05 21:07:18 -08:00
firmware kbuild: change filechk to surround the given command with { } 2019-01-06 09:46:51 +09:00
fs mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd 2019-03-05 21:07:19 -08:00
include writeback: fix inode cgroup switching comment 2019-03-05 21:07:21 -08:00
init mm: replace all open encodings for NUMA_NO_NODE 2019-03-05 21:07:14 -08:00
ipc ipc: introduce ksys_ipc()/compat_ksys_ipc() for s390 2019-01-18 09:33:18 +01:00
kernel kernel: cgroup: add poll file operation 2019-03-05 21:07:17 -08:00
lib mm/page_owner: move config option to mm/Kconfig.debug 2019-03-05 21:07:18 -08:00
LICENSES This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
mm mm,mremap: bail out earlier in mremap_to under map pressure 2019-03-05 21:07:21 -08:00
net mm: replace all open encodings for NUMA_NO_NODE 2019-03-05 21:07:14 -08:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2019-03-05 08:26:13 -08:00
scripts scripts/decode_stacktrace.sh: handle RIP address with segment 2019-03-05 21:07:13 -08:00
security get rid of legacy 'get_ds()' function 2019-03-04 10:50:14 -08:00
sound sound fixes for 5.0 2019-02-20 09:42:52 -08:00
tools tmpfs: test link accounting with O_TMPFILE 2019-03-05 21:07:20 -08:00
usr user/Makefile: Fix typo and capitalization in comment section 2018-12-11 00:18:03 +09:00
virt kvm: properly check debugfs dentry before using it 2019-02-28 08:57:32 -08:00
.clang-format clang-format: Update .clang-format with the latest for_each macro list 2019-01-19 19:26:06 +01:00
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore kbuild: Add support for DT binding schema checks 2018-12-13 09:41:32 -06:00
.mailmap .mailmap: Add Mathieu Othacehe 2019-02-21 11:41:19 +00:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS CREDITS/MAINTAINERS: Retire parisc-linux.org email domain 2019-02-21 20:16:10 +01:00
Kbuild kbuild: use assignment instead of define ... endef for filechk_* rules 2019-01-06 10:22:35 +09:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS MAINTAINERS: add entry for memblock 2019-03-05 21:07:20 -08:00
Makefile Linux 5.0 2019-03-03 15:21:29 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.