linux/mm
Peter Zijlstra 7cd5a02f54 mm: fix mm_take_all_locks() locking order
Lockdep spotted:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.27-rc1 #270
-------------------------------------------------------
qemu-kvm/2033 is trying to acquire lock:
 (&inode->i_data.i_mmap_lock){----}, at: [<ffffffff802996cc>] mm_take_all_locks+0xc2/0xea

but task is already holding lock:
 (&anon_vma->lock){----}, at: [<ffffffff8029967a>] mm_take_all_locks+0x70/0xea

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&anon_vma->lock){----}:
       [<ffffffff8025cd37>] __lock_acquire+0x11be/0x14d2
       [<ffffffff8025d0a9>] lock_acquire+0x5e/0x7a
       [<ffffffff804c655b>] _spin_lock+0x3b/0x47
       [<ffffffff8029a2ef>] vma_adjust+0x200/0x444
       [<ffffffff8029a662>] split_vma+0x12f/0x146
       [<ffffffff8029bc60>] mprotect_fixup+0x13c/0x536
       [<ffffffff8029c203>] sys_mprotect+0x1a9/0x21e
       [<ffffffff8020c0db>] system_call_fastpath+0x16/0x1b
       [<ffffffffffffffff>] 0xffffffffffffffff

-> #0 (&inode->i_data.i_mmap_lock){----}:
       [<ffffffff8025ca54>] __lock_acquire+0xedb/0x14d2
       [<ffffffff8025d397>] lock_release_non_nested+0x1c2/0x219
       [<ffffffff8025d515>] lock_release+0x127/0x14a
       [<ffffffff804c6403>] _spin_unlock+0x1e/0x50
       [<ffffffff802995d9>] mm_drop_all_locks+0x7f/0xb0
       [<ffffffff802a965d>] do_mmu_notifier_register+0xe2/0x112
       [<ffffffff802a96a8>] mmu_notifier_register+0xe/0x10
       [<ffffffffa0043b6b>] kvm_dev_ioctl+0x11e/0x287 [kvm]
       [<ffffffff802bd0ca>] vfs_ioctl+0x2a/0x78
       [<ffffffff802bd36f>] do_vfs_ioctl+0x257/0x274
       [<ffffffff802bd3e1>] sys_ioctl+0x55/0x78
       [<ffffffff8020c0db>] system_call_fastpath+0x16/0x1b
       [<ffffffffffffffff>] 0xffffffffffffffff

other info that might help us debug this:

5 locks held by qemu-kvm/2033:
 #0:  (&mm->mmap_sem){----}, at: [<ffffffff802a95d0>] do_mmu_notifier_register+0x55/0x112
 #1:  (mm_all_locks_mutex){--..}, at: [<ffffffff8029963e>] mm_take_all_locks+0x34/0xea
 #2:  (&anon_vma->lock){----}, at: [<ffffffff8029967a>] mm_take_all_locks+0x70/0xea
 #3:  (&anon_vma->lock){----}, at: [<ffffffff8029967a>] mm_take_all_locks+0x70/0xea
 #4:  (&anon_vma->lock){----}, at: [<ffffffff8029967a>] mm_take_all_locks+0x70/0xea

stack backtrace:
Pid: 2033, comm: qemu-kvm Not tainted 2.6.27-rc1 #270

Call Trace:
 [<ffffffff8025b7c7>] print_circular_bug_tail+0xb8/0xc3
 [<ffffffff8025ca54>] __lock_acquire+0xedb/0x14d2
 [<ffffffff80259bb1>] ? add_lock_to_list+0x7e/0xad
 [<ffffffff8029967a>] ? mm_take_all_locks+0x70/0xea
 [<ffffffff8029967a>] ? mm_take_all_locks+0x70/0xea
 [<ffffffff8025d397>] lock_release_non_nested+0x1c2/0x219
 [<ffffffff802996cc>] ? mm_take_all_locks+0xc2/0xea
 [<ffffffff802996cc>] ? mm_take_all_locks+0xc2/0xea
 [<ffffffff8025b202>] ? trace_hardirqs_on_caller+0x4d/0x115
 [<ffffffff802995d9>] ? mm_drop_all_locks+0x7f/0xb0
 [<ffffffff8025d515>] lock_release+0x127/0x14a
 [<ffffffff804c6403>] _spin_unlock+0x1e/0x50
 [<ffffffff802995d9>] mm_drop_all_locks+0x7f/0xb0
 [<ffffffff802a965d>] do_mmu_notifier_register+0xe2/0x112
 [<ffffffff802a96a8>] mmu_notifier_register+0xe/0x10
 [<ffffffffa0043b6b>] kvm_dev_ioctl+0x11e/0x287 [kvm]
 [<ffffffff8033f9f2>] ? file_has_perm+0x83/0x8e
 [<ffffffff802bd0ca>] vfs_ioctl+0x2a/0x78
 [<ffffffff802bd36f>] do_vfs_ioctl+0x257/0x274
 [<ffffffff802bd3e1>] sys_ioctl+0x55/0x78
 [<ffffffff8020c0db>] system_call_fastpath+0x16/0x1b

Which the locking hierarchy in mm/rmap.c confirms as valid.

Fix this by first taking all the mapping->i_mmap_lock instances and then
take all anon_vma->lock instances.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 09:30:25 +02:00
..
allocpercpu.c mm/allocpercpu.c: make 4 functions static 2008-07-26 12:00:12 -07:00
backing-dev.c mm: bdi: fix race in bdi_class device creation 2008-05-20 13:31:53 -07:00
bootmem.c bootmem: replace node_boot_start in struct bootmem_data 2008-07-24 10:47:20 -07:00
bounce.c block: Initial support for data-less (or empty) barrier support 2007-10-16 11:03:56 +02:00
dmapool.c dmapool: enable debugging for CONFIG_SLUB_DEBUG_ON too 2008-04-28 08:58:20 -07:00
fadvise.c xip: support non-struct page backed memory 2008-04-28 08:58:23 -07:00
filemap_xip.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
filemap.c vfs: pagecache usage optimization for pagesize!=blocksize 2008-07-28 16:30:21 -07:00
fremap.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
highmem.c highmem: Export totalhigh_pages. 2008-07-19 22:39:46 -07:00
hugetlb.c mm/hugetlb.c must #include <asm/io.h> 2008-07-28 16:30:21 -07:00
internal.h mm: export prep_compound_page to mm 2008-07-24 10:47:17 -07:00
Kconfig mmu-notifiers: core 2008-07-28 16:30:21 -07:00
maccess.c kgdb: fix optional arch functions and probe_kernel_* 2008-04-17 20:05:39 +02:00
madvise.c xip: support non-struct page backed memory 2008-04-28 08:58:23 -07:00
Makefile mmu-notifiers: core 2008-07-28 16:30:21 -07:00
memcontrol.c memcg: limit change shrink usage 2008-07-25 10:53:37 -07:00
memory_hotplug.c memory-hotplug: add sysfs removable attribute for hotplug memory remove 2008-07-24 10:47:21 -07:00
memory.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
mempolicy.c hugetlb: modular state for hugetlb page size 2008-07-24 10:47:17 -07:00
mempool.c spelling fixes: mm/ 2007-10-20 01:27:18 +02:00
migrate.c mm: spinlock tree_lock 2008-07-26 12:00:06 -07:00
mincore.c mm: remove nopage 2008-04-28 08:58:18 -07:00
mlock.c
mm_init.c mm: create /sys/kernel/mm 2008-07-24 10:47:17 -07:00
mmap.c mm: fix mm_take_all_locks() locking order 2008-08-11 09:30:25 +02:00
mmu_notifier.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
mmzone.c mm: filter based on a nodemask as well as a gfp_mask 2008-04-28 08:58:19 -07:00
mprotect.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
mremap.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
msync.c
nommu.c tracehook: tracehook_expect_breakpoints 2008-07-26 12:00:09 -07:00
oom_kill.c oom_kill: remove unused parameter in badness() 2008-04-28 08:58:26 -07:00
page_alloc.c stop_machine: Wean existing callers off stop_machine_run() 2008-07-28 12:16:31 +10:00
page_io.c mm: fix PageUptodate data race 2008-02-05 09:44:19 -08:00
page_isolation.c memory hotremove: unset migrate type "ISOLATE" after removal 2007-11-14 18:45:38 -08:00
page-writeback.c mm: spinlock tree_lock 2008-07-26 12:00:06 -07:00
pagewalk.c pagemap: pass mm into pagewalkers 2008-06-12 18:05:41 -07:00
pdflush.c pdflush: use time_after() instead of open-coding it 2008-07-25 10:53:28 -07:00
prio_tree.c spelling fixes: mm/ 2007-10-20 01:27:18 +02:00
quicklist.c quicklists: Only consider memory that can be used with GFP_KERNEL 2008-01-14 08:52:22 -08:00
readahead.c mm: readahead scan lockless 2008-07-26 12:00:06 -07:00
rmap.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
shmem_acl.c [PATCH] sanitize ->permission() prototype 2008-07-26 20:53:14 -04:00
shmem.c tmpfs: fix kernel BUG in shmem_delete_inode 2008-07-28 16:30:20 -07:00
slab.c SL*B: drop kmem cache argument from constructor 2008-07-26 12:00:07 -07:00
slob.c SL*B: drop kmem cache argument from constructor 2008-07-26 12:00:07 -07:00
slub.c SL*B: drop kmem cache argument from constructor 2008-07-26 12:00:07 -07:00
sparse-vmemmap.c Christoph has moved 2008-07-04 10:40:04 -07:00
sparse.c make mm/sparse.c: make a function static 2008-07-26 12:00:12 -07:00
swap_state.c mm: print swapcache page count in show_swap_cache_info() 2008-07-26 12:00:10 -07:00
swap.c mm: remove initialization of static per-cpu variables 2008-07-24 10:47:21 -07:00
swapfile.c mm/swapfile.c: make code static 2008-07-26 12:00:12 -07:00
thrash.c
tiny-shmem.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2008-03-25 08:57:47 -07:00
truncate.c mm: spinlock tree_lock 2008-07-26 12:00:06 -07:00
util.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-07-26 20:17:56 -07:00
vmalloc.c Use WARN() in mm/vmalloc.c 2008-07-26 12:00:07 -07:00
vmscan.c mm: spinlock tree_lock 2008-07-26 12:00:06 -07:00
vmstat.c mm/vmstat.c: proper externs 2008-07-24 10:47:14 -07:00