iv/linux

Mike Kravetz 4643d67e8c hugetlbfs: fix hugetlb page migration/fault race causing SIGBUS

Li Wang discovered that LTP/move_page12 V2 sometimes triggers SIGBUS in the kernel-v5.2.3 testing. This is caused by a race between hugetlb page migration and page fault.

If a hugetlb page can not be allocated to satisfy a page fault, the task is sent SIGBUS. This is normal hugetlbfs behavior. A hugetlb fault mutex exists to prevent two tasks from trying to instantiate the same page. This protects against the situation where there is only one hugetlb page, and both tasks would try to allocate. Without the mutex, one would fail and SIGBUS even though the other fault would be successful.

There is a similar race between hugetlb page migration and fault. Migration code will allocate a page for the target of the migration. It will then unmap the original page from all page tables. It does this unmap by first clearing the pte and then writing a migration entry. The page table lock is held for the duration of this clear and write operation.

However, the beginnings of the hugetlb page fault code optimistically checks the pte without taking the page table lock. If clear (as it can be during the migration unmap operation), a hugetlb page allocation is attempted to satisfy the fault. Note that the page which will eventually satisfy this fault was already allocated by the migration code. However, the allocation within the fault path could fail which would result in the task incorrectly being sent SIGBUS.

Ideally, we could take the hugetlb fault mutex in the migration code when modifying the page tables. However, locks must be taken in the order of hugetlb fault mutex, page lock, page table lock. This would require significant rework of the migration code. Instead, the issue is addressed in the hugetlb fault code. After failing to allocate a huge page, take the page table lock and check for huge_pte_none before returning an error. This is the same check that must be made further in the code even if page allocation is successful.

Link: http://lkml.kernel.org/r/20190808000533.7701-1-mike.kravetz@oracle.com
Fixes: `290408d4a2` ("hugetlb: hugepage migration core")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Li Wang <liwang@redhat.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Cyril Hrubis <chrubis@suse.cz>
Cc: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2019-08-13 16:06:53 -07:00
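
The recheck described in the commit message can be illustrated with a small single-threaded userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct model`, `ptl_lock()`, `migration_unmap()`, `handle_no_page()` and the `FAULT_*` values are hypothetical names invented for illustration, and the page table lock is modelled as a plain flag. The `racer` callback stands in for migration finishing its clear-and-write after the fault path's lockless check has already seen an empty pte.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum pte_state { PTE_NONE, PTE_MIGRATION, PTE_PRESENT };
enum fault_result { FAULT_OK, FAULT_RETRY, FAULT_SIGBUS };

struct model {
    enum pte_state pte;   /* the single page table entry being faulted on */
    int free_huge_pages;  /* remaining pages in the huge page pool */
    bool ptl_held;        /* stand-in for the page table lock */
};

static void ptl_lock(struct model *m)   { assert(!m->ptl_held); m->ptl_held = true; }
static void ptl_unlock(struct model *m) { assert(m->ptl_held);  m->ptl_held = false; }

/* Take one page from the pool; fails when the pool is empty. */
static bool alloc_huge_page(struct model *m)
{
    if (m->free_huge_pages == 0)
        return false;
    m->free_huge_pages--;
    return true;
}

/* Migration unmap: under the lock, clear the pte, then write a migration entry. */
static void migration_unmap(struct model *m)
{
    ptl_lock(m);
    m->pte = PTE_NONE;
    m->pte = PTE_MIGRATION;
    ptl_unlock(m);
}

/*
 * Fault path, entered after a lockless check saw an empty pte.
 * 'racer' (may be NULL) models another actor changing the pte
 * between that lockless check and the allocation attempt.
 */
static enum fault_result handle_no_page(struct model *m,
                                        void (*racer)(struct model *))
{
    enum fault_result ret;

    if (racer)
        racer(m);

    if (!alloc_huge_page(m)) {
        /* The fix: recheck the pte under the lock before reporting an error. */
        ptl_lock(m);
        ret = (m->pte != PTE_NONE) ? FAULT_RETRY : FAULT_SIGBUS;
        ptl_unlock(m);
        return ret;
    }

    /* The same check is needed even when allocation succeeded. */
    ptl_lock(m);
    if (m->pte != PTE_NONE) {
        m->free_huge_pages++;  /* give the unused page back */
        ret = FAULT_RETRY;
    } else {
        m->pte = PTE_PRESENT;
        ret = FAULT_OK;
    }
    ptl_unlock(m);
    return ret;
}
```

In this model, calling `handle_no_page(&m, migration_unmap)` with an empty pool yields `FAULT_RETRY` rather than `FAULT_SIGBUS`, because the pte is no longer empty once the lock is taken. With no racer and an empty pool, the error is still reported, which matches the normal hugetlbfs behavior the message describes.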
..
kasan	mm/kasan: change kasan_check_{read,write} to return boolean	2019-07-12 11:05:42 -07:00
backing-dev.c	backing-dev: no need to check return value of debugfs_create functions	2019-06-03 15:49:07 +02:00
balloon_compaction.c	balloon: fix up comments	2019-07-22 11:19:26 -04:00
cleancache.c	Driver Core and debugfs changes for 5.3-rc1	2019-07-12 12:24:03 -07:00
cma_debug.c	mm/cma_debug.c: fix the break condition in cma_maxchunk_get()	2019-05-14 09:47:45 -07:00
cma.c	mm/cma.c: fail if fixed declaration can't be honored	2019-07-16 19:23:21 -07:00
cma.h
compaction.c	mm: compaction: avoid 100% CPU usage during compaction when a task is killed	2019-08-03 07:02:00 -07:00
debug_page_ref.c
debug.c	mm: update references to page _refcount	2019-05-14 19:52:47 -07:00
dmapool.c	mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options	2019-07-12 11:05:46 -07:00
early_ioremap.c	mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep	2017-12-11 14:54:44 +01:00
fadvise.c	vfs: implement readahead(2) using POSIX_FADV_WILLNEED	2018-08-30 20:01:32 +02:00
failslab.c	mm/failslab.c: by default, do not fail allocations with direct reclaim only	2019-07-12 11:05:43 -07:00
filemap.c	mm/filemap.c: correct the comment about VM_FAULT_RETRY	2019-07-12 11:05:43 -07:00
frame_vector.c	mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()'	2017-12-14 16:00:48 -08:00
frontswap.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 482	2019-06-19 17:09:52 +02:00
gup_benchmark.c	mm/gup: replace get_user_pages_longterm() with FOLL_LONGTERM	2019-05-14 09:47:45 -07:00
gup.c	mm: introduce ARCH_HAS_PTE_DEVMAP	2019-07-16 19:23:25 -07:00
highmem.c	mm: convert totalram_pages and totalhigh_pages variables to atomic	2018-12-28 12:11:47 -08:00
hmm.c	mm/hmm: always return EBUSY for invalid ranges in hmm_range_{fault,snapshot}	2019-07-25 16:14:39 -03:00
huge_memory.c	Revert "mm, thp: restore node-local hugepage allocations"	2019-08-13 16:06:52 -07:00
hugetlb_cgroup.c	mm: rename page_counter's count/limit into usage/max	2018-06-07 17:34:35 -07:00
hugetlb.c	hugetlbfs: fix hugetlb page migration/fault race causing SIGBUS	2019-08-13 16:06:53 -07:00
hwpoison-inject.c	hwpoison-inject: no need to check return value of debugfs_create functions	2019-06-03 15:39:40 +02:00
init-mm.c	mm: Allocate the mm_cpumask (mm->cpu_bitmap[]) dynamically based on nr_cpu_ids	2018-07-17 09:35:30 +02:00
internal.h	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152	2019-05-30 11:26:32 -07:00
interval_tree.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 248	2019-06-19 17:09:08 +02:00
Kconfig	mm: introduce ARCH_HAS_PTE_DEVMAP	2019-07-16 19:23:25 -07:00
Kconfig.debug	mm, debug_pagealloc: use a page type instead of page_ext flag	2019-07-12 11:05:43 -07:00
khugepaged.c	Revert "mm: page cache: store only head pages in i_pages"	2019-07-05 19:55:18 -07:00
kmemleak-test.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333	2019-06-05 17:37:06 +02:00
kmemleak.c	mm: kmemleak: disable early logging in case of error	2019-08-13 16:06:52 -07:00
ksm.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 482	2019-06-19 17:09:52 +02:00
list_lru.c	mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages	2019-07-12 11:05:44 -07:00
maccess.c	The main changes in this release include:	2019-07-18 11:51:00 -07:00
madvise.c	mm: remove MEMORY_DEVICE_PUBLIC support	2019-07-02 14:32:43 -03:00
Makefile	memremap: move from kernel/ to mm/	2019-08-03 07:02:01 -07:00
memblock.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152	2019-05-30 11:26:32 -07:00
memcontrol.c	mm: workingset: fix vmstat counters for shadow nodes	2019-08-13 16:06:52 -07:00
memfd.c	Revert "mm: page cache: store only head pages in i_pages"	2019-07-05 19:55:18 -07:00
memory_hotplug.c	mm/memory_hotplug.c: remove unneeded return for void function	2019-08-03 07:02:01 -07:00
memory-failure.c	HMM patches for 5.3	2019-07-14 19:42:11 -07:00
memory.c	mm: thp: make transhuge_vma_suitable available for anonymous THP	2019-07-18 17:08:06 -07:00
mempolicy.c	Revert "mm, thp: restore node-local hugepage allocations"	2019-08-13 16:06:52 -07:00
mempool.c	docs/core-api/mm: fix return value descriptions in mm/	2019-03-05 21:07:20 -08:00
memremap.c	mm/hmm: fix ZONE_DEVICE anon page mapping reuse	2019-08-13 16:06:52 -07:00
memtest.c
migrate.c	mm/migrate.c: initialize pud_entry in migrate_vma()	2019-08-03 07:02:01 -07:00
mincore.c	mm/mincore.c: fix race between swapoff and mincore	2019-07-12 11:05:43 -07:00
mlock.c	mm/mlock.c: change count_mm_mlocked_page_nr return type	2019-06-13 17:34:56 -10:00
mm_init.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
mmap.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
mmu_context.c
mmu_gather.c	mm: mmu_gather: remove __tlb_reset_range() for force flush	2019-06-13 17:34:56 -10:00
mmu_notifier.c	mm/mmu_notifier: use hlist_add_head_rcu()	2019-07-12 11:05:46 -07:00
mmzone.c
mprotect.c	mm/mprotect.c: fix compilation warning because of unused 'mm' variable	2019-05-14 09:47:51 -07:00
mremap.c	mm/mmu_notifier: contextual information for event triggering invalidation	2019-05-14 09:47:49 -07:00
msync.c
nommu.c	mm: fix the MAP_UNINITIALIZED flag	2019-07-16 19:23:21 -07:00
oom_kill.c	mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()	2019-07-12 11:05:47 -07:00
page_alloc.c	mm/sparsemem: support sub-section hotplug	2019-07-18 17:08:07 -07:00
page_counter.c	memcg: introduce memory.min	2018-06-07 17:34:36 -07:00
page_ext.c	mm, debug_pagealloc: use a page type instead of page_ext flag	2019-07-12 11:05:43 -07:00
page_idle.c	mm/page_idle.c: fix oops because end_pfn is larger than max_pfn	2019-06-29 16:43:45 +08:00
page_io.c	mm, swap: use rbtree for swap_extent	2019-07-12 11:05:43 -07:00
page_isolation.c	mm/page_isolation.c: change the prototype of undo_isolate_page_range()	2019-07-12 11:05:43 -07:00
page_owner.c	mm/page_owner: Simplify stack trace handling	2019-04-29 12:37:50 +02:00
page_poison.c	page_poison: play nicely with KASAN	2019-03-05 21:07:13 -08:00
page_vma_mapped.c	mm/rmap: map_pte() was not handling private ZONE_DEVICE page properly	2018-10-31 08:54:11 -07:00
page-writeback.c	mm: remove the account_page_dirtied export	2019-07-12 11:05:42 -07:00
pagewalk.c	mm: kernel-doc: add missing parameter descriptions	2018-04-05 21:36:27 -07:00
percpu-internal.h	percpu: convert chunk hints to be based on pcpu_block_md	2019-03-13 12:25:31 -07:00
percpu-km.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428	2019-06-05 17:37:16 +02:00
percpu-stats.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428	2019-06-05 17:37:16 +02:00
percpu-vm.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428	2019-06-05 17:37:16 +02:00
percpu.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428	2019-06-05 17:37:16 +02:00
pgtable-generic.c	x86/mm: Page size aware flush_tlb_mm_range()	2018-10-09 16:51:11 +02:00
process_vm_access.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152	2019-05-30 11:26:32 -07:00
quicklist.c
readahead.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
rmap.c	mm/hmm: fix bad subpage pointer in try_to_unmap_one	2019-08-13 16:06:52 -07:00
rodata_test.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441	2019-06-05 17:37:17 +02:00
shmem.c	Revert "Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask""	2019-08-13 16:06:52 -07:00
shuffle.c	mm: maintain randomization of page free lists	2019-05-14 19:52:48 -07:00
shuffle.h	mm: maintain randomization of page free lists	2019-05-14 19:52:48 -07:00
slab_common.c	mm/slab_common.c: work around clang bug #42570	2019-07-16 19:23:21 -07:00
slab.c	mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options	2019-07-12 11:05:46 -07:00
slab.h	mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options	2019-07-12 11:05:46 -07:00
slob.c	mm/slab: refactor common ksize KASAN logic into slab_common.c	2019-07-12 11:05:42 -07:00
slub.c	mm: slub: Fix slab walking for init_on_free	2019-07-31 13:16:06 -07:00
sparse-vmemmap.c	mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap()	2019-07-18 17:08:07 -07:00
sparse.c	mm/sparsemem: cleanup 'section number' data types	2019-07-18 17:08:07 -07:00
swap_cgroup.c
swap_slots.c	mm, swap, get_swap_pages: use entry_size instead of cluster in parameter	2018-08-22 10:52:44 -07:00
swap_state.c	mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()	2019-07-12 11:05:43 -07:00
swap.c	docs: admin-guide: move sysctl directory to it	2019-07-15 11:03:01 -03:00
swapfile.c	mm, swap: use rbtree for swap_extent	2019-07-12 11:05:43 -07:00
truncate.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
usercopy.c	mm/usercopy: use memory range to be accessed for wraparound check	2019-08-13 16:06:52 -07:00
userfaultfd.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499	2019-06-19 17:09:53 +02:00
util.c	mm: add account_locked_vm utility function	2019-07-16 19:23:25 -07:00
vmacache.c	mm: get rid of vmacache_flush_all() entirely	2018-09-13 15:18:04 -10:00
vmalloc.c	mm/vmalloc.c: fix percpu free VM area search criteria	2019-08-13 16:06:52 -07:00
vmpressure.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500	2019-06-19 17:09:55 +02:00
vmscan.c	mm, vmscan: do not special-case slab reclaim when watermarks are boosted	2019-08-13 16:06:53 -07:00
vmstat.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
workingset.c	mm: workingset: fix vmstat counters for shadow nodes	2019-08-13 16:06:52 -07:00
z3fold.c	mm/z3fold.c: fix z3fold_destroy_pool() race condition	2019-08-13 16:06:52 -07:00
zbud.c	treewide: Add SPDX license identifier for more missed files	2019-05-21 10:50:45 +02:00
zpool.c	treewide: Add SPDX license identifier for more missed files	2019-05-21 10:50:45 +02:00
zsmalloc.c	Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2019-07-19 10:42:02 -07:00
zswap.c	zswap: ignore debugfs_create_dir() return value	2019-06-03 15:39:39 +02:00