Miles Chen
c4de112347
mm/memcontrol.c: fix use after free in mem_cgroup_iter()
commit 54a83d6bcbf8f4700013766b974bf9190d40b689 upstream.
This patch reports a use-after-free in mem_cgroup_iter() that can still
occur after merging commit be2657752e9e ("mm: memcg: fix use after free in
mem_cgroup_iter()").
I work with the Android kernel trees (4.9 and 4.14), and commit be2657752e9e
("mm: memcg: fix use after free in mem_cgroup_iter()") has been merged
into both. However, I can still observe the use-after-free issue that
commit be2657752e9e was meant to address (on low-end devices, a few times
this month).
backtrace:
css_tryget <- crash here
mem_cgroup_iter
shrink_node
shrink_zones
do_try_to_free_pages
try_to_free_pages
__perform_reclaim
__alloc_pages_direct_reclaim
__alloc_pages_slowpath
__alloc_pages_nodemask
To debug, I poisoned mem_cgroup before freeing it:
 static void __mem_cgroup_free(struct mem_cgroup *memcg)
 {
         for_each_node(node)
                 free_mem_cgroup_per_node_info(memcg, node);
         free_percpu(memcg->stat);
+        /* poison memcg before freeing it */
+        memset(memcg, 0x78, sizeof(struct mem_cgroup));
         kfree(memcg);
 }
The coredump shows the position=0xdbbc2a00 is freed.
(gdb) p/x ((struct mem_cgroup_per_node *)0xe5009e00)->iter[8]
$13 = {position = 0xdbbc2a00, generation = 0x2efd}
0xdbbc2a00: 0xdbbc2e00 0x00000000 0xdbbc2800 0x00000100
0xdbbc2a10: 0x00000200 0x78787878 0x00026218 0x00000000
0xdbbc2a20: 0xdcad6000 0x00000001 0x78787800 0x00000000
0xdbbc2a30: 0x78780000 0x00000000 0x0068fb84 0x78787878
0xdbbc2a40: 0x78787878 0x78787878 0x78787878 0xe3fa5cc0
0xdbbc2a50: 0x78787878 0x78787878 0x00000000 0x00000000
0xdbbc2a60: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a70: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a80: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a90: 0x00000001 0x00000000 0x00000000 0x00100000
0xdbbc2aa0: 0x00000001 0xdbbc2ac8 0x00000000 0x00000000
0xdbbc2ab0: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2ac0: 0x00000000 0x00000000 0xe5b02618 0x00001000
0xdbbc2ad0: 0x00000000 0x78787878 0x78787878 0x78787878
0xdbbc2ae0: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2af0: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b00: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b10: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b20: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b30: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b40: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b50: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b60: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b70: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b80: 0x78787878 0x78787878 0x00000000 0x78787878
0xdbbc2b90: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2ba0: 0x78787878 0x78787878 0x78787878 0x78787878
In the reclaim path, try_to_free_pages() does not set up
sc.target_mem_cgroup, and sc is passed down through
do_try_to_free_pages(), ..., shrink_node().
In mem_cgroup_iter(), root is set to root_mem_cgroup because
sc->target_mem_cgroup is NULL. As a result, mem_cgroup_iter() can store
a memcg pointer in root_mem_cgroup.nodeinfo.iter.
try_to_free_pages
  struct scan_control sc = {...}, target_mem_cgroup is 0x0;
  do_try_to_free_pages
    shrink_zones
      shrink_node
        mem_cgroup *root = sc->target_mem_cgroup;
        memcg = mem_cgroup_iter(root, NULL, &reclaim);
          mem_cgroup_iter()
            if (!root)
              root = root_mem_cgroup;
            ...
            css = css_next_descendant_pre(css, &root->css);
            memcg = mem_cgroup_from_css(css);
            cmpxchg(&iter->position, pos, memcg);
My device uses memcg non-hierarchical mode. When a memcg is released,
invalidate_reclaim_iterators() reaches only dead_memcg and its parents.
In non-hierarchical mode, that walk never reaches root_mem_cgroup.
static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
{
        struct mem_cgroup *memcg = dead_memcg;
        for (; memcg; memcg = parent_mem_cgroup(memcg))
        ...
}
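The effect of that parent walk can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct toy_memcg, toy_invalidate() and toy_root_keeps_stale_pointer() are hypothetical names, one iter_pos field stands in for the whole per-node iter[] array, and the model assumes that in non-hierarchical mode the walk from the dead group finds no parent at all.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct mem_cgroup; NOT the kernel's definition. */
struct toy_memcg {
	struct toy_memcg *parent;   /* NULL when the walk has nowhere to go */
	struct toy_memcg *iter_pos; /* models one nodeinfo iter[].position slot */
};

/* Models the parent walk: clear cached pointers to 'dead' in 'dead'
 * itself and in every group reachable through ->parent. */
static void toy_invalidate(struct toy_memcg *dead)
{
	struct toy_memcg *m;

	for (m = dead; m; m = m->parent)
		if (m->iter_pos == dead)
			m->iter_pos = NULL;
}

/*
 * Returns 1 if the root still caches a pointer to the released group
 * after invalidation, 0 otherwise. 'hierarchical' selects whether the
 * child is linked to the root (an assumption of this model).
 */
static int toy_root_keeps_stale_pointer(int hierarchical)
{
	struct toy_memcg root = { NULL, NULL };
	struct toy_memcg child = { hierarchical ? &root : NULL, NULL };

	/* Global reclaim cached the child in the root's iterator. */
	root.iter_pos = &child;
	assert(root.iter_pos == &child);

	/* The child is released; only its ancestors are visited. */
	toy_invalidate(&child);

	return root.iter_pos == &child;
}
```

In this model, toy_root_keeps_stale_pointer(0) reports that the root's slot still points at the released group, while toy_root_keeps_stale_pointer(1) reports that it was cleared.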
So the use after free scenario looks like:
CPU1                                                  CPU2
try_to_free_pages
 do_try_to_free_pages
  shrink_zones
   shrink_node
    mem_cgroup_iter()
     if (!root)
      root = root_mem_cgroup;
     ...
     css = css_next_descendant_pre(css, &root->css);
     memcg = mem_cgroup_from_css(css);
     cmpxchg(&iter->position, pos, memcg);
                                                      invalidate_reclaim_iterators(memcg);
                                                      ...
                                                      __mem_cgroup_free()
                                                       kfree(memcg);
try_to_free_pages
 do_try_to_free_pages
  shrink_zones
   shrink_node
    mem_cgroup_iter()
     if (!root)
      root = root_mem_cgroup;
     ...
     mz = mem_cgroup_nodeinfo(root, reclaim->pgdat->node_id);
     iter = &mz->iter[reclaim->priority];
     pos = READ_ONCE(iter->position);
     css_tryget(&pos->css) <- use after free
To avoid this, we should also invalidate root_mem_cgroup.nodeinfo.iter
in invalidate_reclaim_iterators().
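One way to picture that idea is the userspace sketch below. It is an illustration under stated assumptions and does not claim to reproduce the actual patch: struct sketch_memcg, SKETCH_NR_ITERS and the sketch_* functions are hypothetical, the flat iter_pos[] array stands in for the per-node, per-priority iterators, and the single-threaded model uses plain stores where the kernel uses cmpxchg().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_ITERS 4 /* arbitrary; stands in for nodes x priorities */

/* Toy stand-in for struct mem_cgroup; NOT the kernel's definition. */
struct sketch_memcg {
	struct sketch_memcg *parent;
	struct sketch_memcg *iter_pos[SKETCH_NR_ITERS];
};

/* Clear every slot of 'from' that still points at 'dead'. */
static void sketch_invalidate_one(struct sketch_memcg *from,
				  struct sketch_memcg *dead)
{
	int i;

	for (i = 0; i < SKETCH_NR_ITERS; i++)
		if (from->iter_pos[i] == dead)
			from->iter_pos[i] = NULL;
}

/* Walk 'dead' and its parents, then handle 'root' separately if the
 * walk stopped before reaching it. */
static void sketch_invalidate(struct sketch_memcg *root,
			      struct sketch_memcg *dead)
{
	struct sketch_memcg *m = dead;
	struct sketch_memcg *last;

	do {
		sketch_invalidate_one(m, dead);
		last = m;
	} while ((m = m->parent)); /* extra parentheses: assignment is intended */

	if (last != root)
		sketch_invalidate_one(root, dead);
}

/* Returns how many root slots still point at the released child. */
static int sketch_stale_slots_after_release(void)
{
	struct sketch_memcg root = { NULL, { NULL } };
	struct sketch_memcg child = { NULL, { NULL } }; /* no link to root */
	int i, stale = 0;

	root.iter_pos[1] = &child;
	root.iter_pos[3] = &child;

	sketch_invalidate(&root, &child);

	for (i = 0; i < SKETCH_NR_ITERS; i++)
		stale += (root.iter_pos[i] == &child);
	return stale;
}
```

With the extra root pass, sketch_stale_slots_after_release() finds no remaining slot that points at the released group, even though the child has no parent link to the root.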
[cai@lca.pw: fix -Wparentheses compilation warning]
Link: http://lkml.kernel.org/r/1564580753-17531-1-git-send-email-cai@lca.pw
Link: http://lkml.kernel.org/r/20190730015729.4406-1-miles.chen@mediatek.com
Fixes: 5ac8fb31ad2e ("mm: memcontrol: convert reclaim iterator to simple css refcounting")
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-25 10:50:02 +02:00