IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Originally, after submitting request into virtio crypto control
queue, the guest side polls the result from the virt queue. This
works like following:
CPU0 CPU1 ... CPUx CPUy
| | | |
\ \ / /
\--------spin_lock(&vcrypto->ctrl_lock)-------/
|
virtqueue add & kick
|
busy poll virtqueue
|
spin_unlock(&vcrypto->ctrl_lock)
...
There are two problems:
1, The queue depth is always 1, the performance of a virtio crypto
device gets limited. Multi user processes share a single control
queue, and hit spin lock race from control queue. Test on Intel
Platinum 8260, a single worker gets ~35K/s create/close session
operations, and 8 workers get ~40K/s operations with 800% CPU
utilization.
2, The control request is supposed to get handled immediately, but
in the current implementation of QEMU(v6.2), the vCPU thread kicks
another thread to do this work, the latency also gets unstable.
Tracking latency of virtio_crypto_alg_akcipher_close_session in 5s:
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 7 | |
4 -> 7 : 72 | |
8 -> 15 : 186485 |************************|
16 -> 31 : 687 | |
32 -> 63 : 5 | |
64 -> 127 : 3 | |
128 -> 255 : 1 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 0 | |
8192 -> 16383 : 2 | |
This means that a CPU may hold vcrypto->ctrl_lock as long as 8192~16383us.
To improve the performance of control queue, a request on control queue
waits completion instead of busy polling to reduce lock racing, and gets
completed by control queue callback.
CPU0 CPU1 ... CPUx CPUy
| | | |
\ \ / /
\--------spin_lock(&vcrypto->ctrl_lock)-------/
|
virtqueue add & kick
|
---------spin_unlock(&vcrypto->ctrl_lock)------
/ / \ \
| | | |
wait wait wait wait
Test this patch, the guest side get ~200K/s operations with 300% CPU
utilization.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20220506131627.180784-4-pizhenwei@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Originally, all of the control requests share a single buffer(
ctrl & input & ctrl_status fields in struct virtio_crypto), this
allows queue depth 1 only, the performance of control queue gets
limited by this design.
In this patch, each request allocates request buffer dynamically, and
free buffer after request, so the scope protected by ctrl_lock also
get optimized here.
It's possible to optimize control queue depth in the next step.
A necessary comment is already in code, still describe it again:
/*
* Note: there are padding fields in request, clear them to zero before
* sending to host to avoid to divulge any information.
* Ex, virtio_crypto_ctrl_request::ctrl::u::destroy_session::padding[48]
*/
So use kzalloc to allocate buffer of struct virtio_crypto_ctrl_request.
Potentially dereferencing uninitialized variables:
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20220506131627.180784-3-pizhenwei@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use temporary variable to make code easy to read and maintain.
/* Pad cipher's parameters */
vcrypto->ctrl.u.sym_create_session.op_type =
cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
vcrypto->ctrl.u.sym_create_session.u.cipher.para.algo =
vcrypto->ctrl.header.algo;
vcrypto->ctrl.u.sym_create_session.u.cipher.para.keylen =
cpu_to_le32(keylen);
vcrypto->ctrl.u.sym_create_session.u.cipher.para.op =
cpu_to_le32(op);
-->
sym_create_session = &ctrl->u.sym_create_session;
sym_create_session->op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
sym_create_session->u.cipher.para.algo = ctrl->header.algo;
sym_create_session->u.cipher.para.keylen = cpu_to_le32(keylen);
sym_create_session->u.cipher.para.op = cpu_to_le32(op);
The new style shows more obviously:
- the variable we want to operate.
- an assignment statement in a single line.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20220506131627.180784-2-pizhenwei@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Suggested by Gonglei, rename virtio_crypto_algs.c to
virtio_crypto_skcipher_algs.c. Also minor changes for function name.
Thus the function of source files get clear: skcipher services in
virtio_crypto_skcipher_algs.c and akcipher services in
virtio_crypto_akcipher_algs.c.
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Link: https://lore.kernel.org/r/20220302033917.1295334-5-pizhenwei@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>