Dominique Martinet 91a76be37f 9p: add a per-client fcall kmem_cache
Having a specific cache for the fcall allocations helps speed up
end-to-end latency.

The caches will automatically be merged if there are multiple caches
of items with the same size so we do not need to try to share a cache
between different clients of the same size.

Since the msize is negotiated with the server, only allocate the cache
after that negotiation has happened - previous allocations or
allocations of different sizes (e.g. zero-copy fcall) are made with
kmalloc directly.

Some figures on two beefy VMs with Connect-IB (sriov) / trans=rdma,
with ior running 32 processes in parallel doing small 32 bytes IOs:
 - no alloc (4.18-rc7 request cache): 65.4k req/s
 - non-power of two alloc, no patch: 61.6k req/s
 - power of two alloc, no patch: 62.2k req/s
 - non-power of two alloc, with patch: 64.7k req/s
 - power of two alloc, with patch: 65.1k req/s

Link: http://lkml.kernel.org/r/1532943263-24378-2-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Greg Kurz <groug@kaod.org>
2018-09-08 01:39:47 +09:00
..
2018-08-14 23:49:13 +02:00
2018-08-15 17:39:07 -07:00
2018-08-25 14:12:36 -07:00
2018-08-20 15:38:44 -07:00
2018-09-08 01:39:47 +09:00
2018-08-23 15:34:48 -07:00
2018-08-15 22:06:26 -07:00
2018-07-26 00:12:56 -07:00
2018-08-13 12:12:31 +02:00
2018-08-23 15:44:58 -07:00