Vlastimil Babka ecf9a253ce mm/slub: optimize free fast path code layout
Inspection of kmem_cache_free() disassembly showed we could make the
fast path smaller by providing few more hints to the compiler, and
splitting the memcg_slab_free_hook() into an inline part that only
checks if there's work to do, and an out of line part doing the actual
uncharge.

bloat-o-meter results:
add/remove: 2/0 grow/shrink: 0/3 up/down: 286/-554 (-268)
Function                                     old     new   delta
__memcg_slab_free_hook                         -     270    +270
__pfx___memcg_slab_free_hook                   -      16     +16
kfree                                        828     665    -163
kmem_cache_free                             1116     948    -168
kmem_cache_free_bulk.part                   1701    1478    -223

Checking kmem_cache_free() disassembly now shows the non-fastpath
cases are handled out of line, which should reduce instruction cache
usage.

Acked-by: David Rientjes <rientjes@google.com>
Tested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06 11:57:22 +01:00
..
2023-07-24 18:04:30 -04:00
2023-06-19 16:19:25 -07:00
2023-10-16 15:44:39 -07:00
2023-04-12 17:36:23 -07:00
2023-08-31 12:20:12 -07:00
2023-10-25 16:47:13 -07:00
2023-10-04 10:32:19 -07:00
2023-06-23 16:59:30 -07:00
2023-04-12 17:36:23 -07:00
2023-08-31 12:20:12 -07:00
2023-10-25 16:47:10 -07:00