linux/io_uring
Jens Axboe ac5f71a3d9 io_uring/net: add provided buffer support for IORING_OP_SEND
It's pretty trivial to wire up provided buffer support for the send
side, just like how it's done the receive side. This enables setting up
a buffer ring that an application can use to push pending sends to,
and then have a send pick a buffer from that ring.

One of the challenges with async IO and networking sends is that you
can get into reordering conditions if you have more than one inflight
at the same time. Consider the following scenario where everything is
fine:

1) App queues sendA for socket1
2) App queues sendB for socket1
3) App does io_uring_submit()
4) sendA is issued, completes successfully, posts CQE
5) sendB is issued, completes successfully, posts CQE

All is fine. Requests are always issued in-order, and both complete
inline as most sends do.

However, if we're flooding socket1 with sends, the following could
also result from the same sequence:

1) App queues sendA for socket1
2) App queues sendB for socket1
3) App does io_uring_submit()
4) sendA is issued, socket1 is full, poll is armed for retry
5) Space frees up in socket1, this triggers sendA retry via task_work
6) sendB is issued, completes successfully, posts CQE
7) sendA is retried, completes successfully, posts CQE

Now we've sent sendB before sendA, which can make things unhappy. If
both sendA and sendB had been using provided buffers, then it would look
as follows instead:

1) App queues dataA for sendA, queues sendA for socket1
2) App queues dataB for sendB queues sendB for socket1
3) App does io_uring_submit()
4) sendA is issued, socket1 is full, poll is armed for retry
5) Space frees up in socket1, this triggers sendA retry via task_work
6) sendB is issued, picks first buffer (dataA), completes successfully,
   posts CQE (which says "I sent dataA")
7) sendA is retried, picks first buffer (dataB), completes successfully,
   posts CQE (which says "I sent dataB")

Now we've sent the data in order, and everybody is happy.

It's worth noting that this also opens the door for supporting multishot
sends, as provided buffers would be a prerequisite for that. Those can
trigger either when new buffers are added to the outgoing ring, or (if
stalled due to lack of space) when space frees up in the socket.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-22 11:25:56 -06:00
..
advise.c io_uring: always go async for unsupported fadvise flags 2023-01-29 15:18:26 -07:00
advise.h
alloc_cache.h io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
cancel.c io_uring: fix warnings on shadow variables 2024-04-15 08:10:26 -06:00
cancel.h io_uring/cancel: don't default to setting req->work.cancel_seq 2024-02-08 13:27:06 -07:00
epoll.c io_uring: undeprecate epoll_ctl support 2023-05-26 20:22:41 -06:00
epoll.h
fdinfo.c io_uring: fix warnings on shadow variables 2024-04-15 08:10:26 -06:00
fdinfo.h
filetable.c io_uring: drop any code related to SCM_RIGHTS 2023-12-19 12:36:34 -07:00
filetable.h io_uring: expand main struct io_kiocb flags to 64-bits 2024-02-08 13:27:03 -07:00
fs.c io_uring/fs: consider link->flags when getting path for LINKAT 2023-11-20 09:01:42 -07:00
fs.h
futex.c io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
futex.h io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
io_uring.c io_uring/rw: ensure retry condition isn't lost 2024-04-17 09:23:55 -06:00
io_uring.h io_uring/rw: ensure retry condition isn't lost 2024-04-17 09:23:55 -06:00
io-wq.c io-wq: Drop intermediate step between pending list and active work 2024-04-17 08:20:32 -06:00
io-wq.h io_uring: break out of iowq iopoll on teardown 2023-09-07 09:02:27 -06:00
kbuf.c io_uring/kbuf: remove dead define 2024-04-15 08:10:26 -06:00
kbuf.h io_uring: return void from io_put_kbuf_comp() 2024-04-15 08:10:26 -06:00
Makefile io_uring: move mapping/allocation helpers to a separate file 2024-04-15 08:10:26 -06:00
memmap.c io_uring: move mapping/allocation helpers to a separate file 2024-04-15 08:10:26 -06:00
memmap.h io_uring: move mapping/allocation helpers to a separate file 2024-04-15 08:10:26 -06:00
msg_ring.c io_uring: use io_file_from_index in io_msg_grab_file 2023-06-20 09:36:22 -06:00
msg_ring.h io_uring: get rid of double locking 2022-12-07 06:47:13 -07:00
napi.c io_uring/napi: enable even with a timeout of 0 2024-02-15 15:37:28 -07:00
napi.h io_uring: add register/unregister napi function 2024-02-09 11:54:32 -07:00
net.c io_uring/net: add provided buffer support for IORING_OP_SEND 2024-04-22 11:25:56 -06:00
net.h io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
nop.c
nop.h
notif.c io_uring/notif: remove ctx var from io_notif_tw_complete 2024-04-15 08:10:49 -06:00
notif.h io_uring/notif: shrink account_pages to u32 2024-04-15 08:10:49 -06:00
opdef.c io_uring/net: add provided buffer support for IORING_OP_SEND 2024-04-22 11:25:56 -06:00
opdef.h io_uring: drop ->prep_async() 2024-04-15 08:10:25 -06:00
openclose.c io_uring: enable audit and restrict cred override for IORING_OP_FIXED_FD_INSTALL 2024-01-23 15:25:14 -07:00
openclose.h io_uring/openclose: add support for IORING_OP_FIXED_FD_INSTALL 2023-12-12 07:42:57 -07:00
poll.c io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
poll.h io_uring/poll: shrink alloc cache size to 32 2024-04-15 08:10:25 -06:00
refs.h io_uring: kill dead code in io_req_complete_post 2024-04-15 08:10:26 -06:00
register.c io_uring: fix warnings on shadow variables 2024-04-15 08:10:26 -06:00
register.h io_uring/register: move io_uring_register(2) related code to register.c 2023-12-19 08:54:20 -07:00
rsrc.c io_uring: move mapping/allocation helpers to a separate file 2024-04-15 08:10:26 -06:00
rsrc.h io_uring: remove io_req_put_rsrc_locked() 2024-04-15 08:10:26 -06:00
rw.c io_uring/rw: ensure retry condition isn't lost 2024-04-17 09:23:55 -06:00
rw.h io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
slist.h io_uring: silence variable ‘prev’ set but not used warning 2023-03-09 10:10:58 -07:00
splice.c splice: return type ssize_t from all helpers 2023-12-12 16:19:59 +01:00
splice.h
sqpoll.c io_uring/sqpoll: work around a potential audit memory leak 2024-04-15 13:06:19 -06:00
sqpoll.h io_uring/sqpoll: statistics of the true utilization of sq threads 2024-03-01 06:28:19 -07:00
statx.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
statx.h
sync.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
sync.h
tctx.c io_uring: Add io_uring_setup flag to pre-register ring fd and never install it 2023-05-16 08:06:00 -06:00
tctx.h io_uring: simplify __io_uring_add_tctx_node 2022-10-07 12:25:30 -06:00
timeout.c io_uring/timeout: remove duplicate initialization of the io_timeout list. 2024-04-15 08:10:27 -06:00
timeout.h io_uring: remove unused return from io_disarm_next 2022-09-21 13:15:01 -06:00
truncate.c io_uring: add support for ftruncate 2024-02-09 09:04:39 -07:00
truncate.h io_uring: add support for ftruncate 2024-02-09 09:04:39 -07:00
uring_cmd.c io_uring: separate header for exported net bits 2024-04-15 08:10:26 -06:00
uring_cmd.h io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
waitid.c io_uring: remove struct io_tw_state::locked 2024-04-15 08:10:24 -06:00
waitid.h io_uring: add IORING_OP_WAITID support 2023-09-21 12:04:45 -06:00
xattr.c io_uring: use file_mnt_idmap helper 2024-02-06 19:55:14 -07:00
xattr.h