Files
linux/kernel
Peter Zijlstra 4d82a1debb lockdep: fix oops in processing workqueue
Under memory load, on x86_64, with lockdep enabled, the workqueue's
process_one_work() has been seen to oops in __lock_acquire(), barfing
on a 0xffffffff00000000 pointer in the lockdep_map's class_cache[].

Because it's permissible to free a work_struct from its callout function,
the map used is an onstack copy of the map given in the work_struct: and
that copy is made without any locking.

Surprisingly, gcc (4.5.1 in Hugh's case) uses "rep movsl" rather than
"rep movsq" for that structure copy: which might race with a workqueue
user's wait_on_work() doing lock_map_acquire() on the source of the
copy, putting a pointer into the class_cache[], but only in time for
the top half of that pointer to be copied to the destination map.

Boom when process_one_work() subsequently does lock_map_acquire()
on its onstack copy of the lockdep_map.

Fix this, and a similar instance in call_timer_fn(), with a
lockdep_copy_map() function which additionally NULLs the class_cache[].

Note: this oops was actually seen on 3.4-next, where flush_work() newly
does the racing lock_map_acquire(); but Tejun points out that 3.4 and
earlier are already vulnerable to the same through wait_on_work().

* Patch orginally from Peter.  Hugh modified it a bit and wrote the
  description.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Hugh Dickins <hughd@google.com>
LKML-Reference: <alpine.LSU.2.00.1205070951170.1544@eggly.anvils>
Signed-off-by: Tejun Heo <tj@kernel.org>
2012-05-15 08:08:31 -07:00
..
2011-07-26 16:49:45 -07:00
2011-03-14 09:15:23 -04:00
2011-09-23 12:05:29 +05:30
2012-02-14 10:45:42 +11:00
2011-07-14 12:59:14 +03:00
2012-03-28 18:30:03 +01:00
2012-03-26 12:50:53 +10:30
2012-03-29 19:52:46 +08:00
2012-03-23 16:58:41 -07:00
2012-03-28 18:30:03 +01:00
2011-10-31 17:30:44 -07:00
2011-09-19 17:04:37 -07:00
2012-03-15 18:17:55 -07:00