linux/mm
Linus Torvalds 9dcb8b685f mm: remove per-zone hashtable of bitlock waitqueues
The per-zone waitqueues exist because of a scalability issue with the
page waitqueues on some NUMA machines, but it turns out that they hurt
normal loads, and now with the vmalloced stacks they also end up
breaking gfs2 that uses a bit_wait on a stack object:

     wait_on_bit(&gh->gh_iflags, HIF_WAIT, TASK_UNINTERRUPTIBLE)

where 'gh' can be a reference to the local variable 'mount_gh' on the
stack of fill_super().

The reason the per-zone hash table breaks for this case is that there is
no "zone" for virtual allocations, and trying to look up the physical
page to get at it will fail (with a BUG_ON()).

It turns out that I actually complained to the mm people about the
per-zone hash table for another reason just a month ago: the zone lookup
also hurts the regular use of "unlock_page()" a lot, because the zone
lookup ends up forcing several unnecessary cache misses and generates
horrible code.

As part of that earlier discussion, we had a much better solution for
the NUMA scalability issue - by just making the page lock have a
separate contention bit, the waitqueue doesn't even have to be looked at
for the normal case.

Peter Zijlstra already has a patch for that, but let's see if anybody
even notices.  In the meantime, let's fix the actual gfs2 breakage by
simplifying the bitlock waitqueues and removing the per-zone issue.

Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 09:27:57 -07:00
..
kasan kprobes: Unpoison stack in jprobe_return() for KASAN 2016-10-16 11:02:31 +02:00
backing-dev.c block: fix bdi vs gendisk lifetime mismatch 2016-08-04 14:19:16 -06:00
balloon_compaction.c
bootmem.c mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping 2016-10-11 15:06:33 -07:00
cleancache.c
cma_debug.c
cma.c mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping 2016-10-11 15:06:33 -07:00
cma.h
compaction.c mm, compaction: restrict fragindex to costly orders 2016-10-07 18:46:29 -07:00
debug_page_ref.c
debug.c mm: clarify why we avoid page_mapcount() for slab pages in dump_page() 2016-10-07 18:46:29 -07:00
dmapool.c
early_ioremap.c
fadvise.c
failslab.c
filemap.c mm: remove per-zone hashtable of bitlock waitqueues 2016-10-27 09:27:57 -07:00
frame_vector.c mm: replace get_vaddr_frames() write/force parameters with gup_flags 2016-10-19 08:11:24 -07:00
frontswap.c
gup.c mm: unexport __get_user_pages() 2016-10-24 19:13:20 -07:00
highmem.c
huge_memory.c mm: vm_page_prot: update with WRITE_ONCE/READ_ONCE 2016-10-07 18:46:29 -07:00
hugetlb_cgroup.c
hugetlb.c mm: remove unnecessary condition in remove_inode_hugepages 2016-10-07 18:46:29 -07:00
hwpoison-inject.c
init-mm.c
internal.h mm, compaction: make full priority ignore pageblock suitability 2016-10-07 18:46:29 -07:00
interval_tree.c
Kconfig mm: clarify COMPACTION Kconfig text 2016-08-26 17:39:35 -07:00
Kconfig.debug PM / Hibernate: allow hibernation with PAGE_POISONING_ZERO 2016-09-13 02:35:27 +02:00
khugepaged.c mm, thp: fix leaking mapped pte in __collapse_huge_page_swapin() 2016-09-19 15:36:16 -07:00
kmemcheck.c
kmemleak-test.c
kmemleak.c mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping 2016-10-11 15:06:33 -07:00
ksm.c mm,ksm: add __GFP_HIGH to the allocation in alloc_stable_node() 2016-10-07 18:46:29 -07:00
list_lru.c
maccess.c
madvise.c
Makefile Disable the __builtin_return_address() warning globally after all 2016-10-12 10:23:41 -07:00
memblock.c mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping 2016-10-11 15:06:33 -07:00
memcontrol.c mm: memcontrol: consolidate cgroup socket tracking 2016-10-07 18:46:29 -07:00
memory_hotplug.c mm: remove per-zone hashtable of bitlock waitqueues 2016-10-27 09:27:57 -07:00
memory-failure.c mm: hwpoison: remove incorrect comments 2016-07-28 16:07:41 -07:00
memory.c mm: replace access_process_vm() write parameter with gup_flags 2016-10-19 08:31:25 -07:00
mempolicy.c mm: replace get_user_pages() write/force parameters with gup_flags 2016-10-19 08:11:43 -07:00
mempool.c Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements" 2016-07-28 16:07:41 -07:00
memtest.c
migrate.c mm: vm_page_prot: update with WRITE_ONCE/READ_ONCE 2016-10-07 18:46:29 -07:00
mincore.c mm, swap: use offset of swap entry as key of swap cache 2016-10-07 18:46:28 -07:00
mlock.c mm: mlock: avoid increase mm->locked_vm on mlock() when already mlock2(,MLOCK_ONFAULT) 2016-10-07 18:46:28 -07:00
mm_init.c
mmap.c mm: vma_merge: correct false positive from __vma_unlink->validate_mm_rb 2016-10-07 18:46:29 -07:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c mm/numa: Remove duplicated include from mprotect.c 2016-10-19 17:28:48 +02:00
mremap.c
msync.c
nobootmem.c mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping 2016-10-11 15:06:33 -07:00
nommu.c mm: unexport __get_user_pages() 2016-10-24 19:13:20 -07:00
oom_kill.c oom: print nodemask in the oom report 2016-10-07 18:46:29 -07:00
page_alloc.c mm: remove per-zone hashtable of bitlock waitqueues 2016-10-27 09:27:57 -07:00
page_counter.c
page_ext.c mm/page_ext: support extra space allocation by page_ext user 2016-10-07 18:46:27 -07:00
page_idle.c mm, vmscan: move lru_lock to the node 2016-07-28 16:07:41 -07:00
page_io.c mm/page_io.c: replace some BUG_ON()s with VM_BUG_ON_PAGE() 2016-10-07 18:46:29 -07:00
page_isolation.c mm/page_isolation: fix typo: "paes" -> "pages" 2016-10-07 18:46:29 -07:00
page_owner.c mm/page_owner: don't define fields on struct page_ext by hard-coding 2016-10-07 18:46:27 -07:00
page_poison.c
page-writeback.c mm: don't use radix tree writeback tags for pages in swap cache 2016-10-07 18:46:28 -07:00
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c mm/percpu.c: fix potential memory leakage for pcpu_embed_first_chunk() 2016-10-05 11:52:55 -04:00
pgtable-generic.c
process_vm_access.c mm: remove write/force parameters from __get_user_pages_unlocked() 2016-10-18 14:13:37 -07:00
quicklist.c
readahead.c mm: silently skip readahead for DAX inodes 2016-08-26 17:39:35 -07:00
rmap.c rmap: fix compound check logic in page_remove_file_rmap 2016-08-10 16:40:56 -07:00
shmem.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
slab_common.c
slab.c slab: Convert to hotplug state machine 2016-09-06 18:30:20 +02:00
slab.h mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB 2016-07-28 16:07:41 -07:00
slob.c
slub.c slub: Convert to hotplug state machine 2016-09-06 18:30:20 +02:00
sparse-vmemmap.c treewide: replace obsolete _refok by __ref 2016-08-02 17:31:41 -04:00
sparse.c treewide: replace obsolete _refok by __ref 2016-08-02 17:31:41 -04:00
swap_cgroup.c
swap_state.c mm, swap: use offset of swap entry as key of swap cache 2016-10-07 18:46:28 -07:00
swap.c thp: reduce usage of huge zero page's atomic counter 2016-10-07 18:46:28 -07:00
swapfile.c mm, swap: use offset of swap entry as key of swap cache 2016-10-07 18:46:28 -07:00
truncate.c
usercopy.c mm: usercopy: Check for module addresses 2016-09-20 16:07:39 -07:00
userfaultfd.c
util.c Merge branch 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-10-22 09:39:10 -07:00
vmacache.c mm: unrig VMA cache hit ratio 2016-10-07 18:46:27 -07:00
vmalloc.c mm: consolidate warn_alloc_failed users 2016-10-07 18:46:29 -07:00
vmpressure.c
vmscan.c mm: use zonelist name instead of using hardcoded index 2016-10-07 18:46:28 -07:00
vmstat.c seq/proc: modify seq_put_decimal_[u]ll to take a const char *, not char 2016-10-07 18:46:30 -07:00
workingset.c mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page() 2016-09-30 15:26:52 -07:00
z3fold.c
zbud.c
zpool.c
zsmalloc.c zsmalloc: Delete an unnecessary check before the function call "iput" 2016-07-28 16:07:41 -07:00
zswap.c