linux

iv/linux

Author	SHA1	Message	Date
Pavel Begunkov	48dcd38d73	io_uring: optimise iowq refcounting If a requests is forwarded into io-wq, there is a good chance it hasn't been refcounted yet and we can save one req_ref_get() by setting the refcount number to the right value directly. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2d53f4449faaf73b4a4c5de667fc3c176d974860.1628981736.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:37 -06:00
Jens Axboe	a141dd896f	io_uring: correct __must_hold annotation io_req_free_batch() has a __must_hold annotation referencing a request being passed in, but we're passing in the context. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:37 -06:00
Hao Xu	41a5169c23	io_uring: code clean for completion_lock in io_arm_poll_handler() We can merge two spin_unlock() operations to one since we removed some code not long ago. Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:37 -06:00
Hao Xu	f552a27afe	io_uring: remove files pointer in cancellation functions When doing cancellation, we use a parameter to indicate where it's from do_exit or exec. So a boolean value is good enough for this, remove the struct files* as it is not necessary. Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> [axboe: fixup io_uring_files_cancel for !CONFIG_IO_URING] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:37 -06:00
Hao Xu	a4aadd11ea	io_uring: extract io_uring_files_cancel() in io_uring_task_cancel() Extract io_uring_files_cancel() call in io_uring_task_cancel() to make io_uring_files_cancel() and io_uring_task_cancel() coherent and easy to read. Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:37 -06:00
Pavel Begunkov	fd08e5309b	io_uring: optimise hot path of ltimeout prep io_prep_linked_timeout() grew too heavy and compiler now refuse to inline the function. Help it by splitting in two and annotating with inline. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/560636717a32e9513724f09b9ecaace942dde4d4.1628705069.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:37 -06:00
Pavel Begunkov	20e60a3832	io_uring: skip request refcounting As submission references are gone, there is only one initial reference left. Instead of actually doing atomic refcounting, add a flag indicating whether we're going to take more refs or doing any other sync magic. The flag should be set before the request may get used in parallel. Together with the previous patch it saves 2 refcount atomics per request for IOPOLL and IRQ completions, and 1 atomic per req for inline completions, with some exceptions. In particular, currently, there are three cases, when the refcounting have to be enabled: - Polling, including apoll. Because double poll entries takes a ref. Might get relaxed in the near future. - Link timeouts, enabled for both, the timeout and the request it's bound to, because they work in-parallel and we need to synchronise to cancel one of them on completion. - When a request gets in io-wq, because it doesn't hold uring_lock and we need guarantees of submission references. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8b204b6c5f6643062270a1913d6d3a7f8f795fd9.1628705069.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:32 -06:00
Pavel Begunkov	5d5901a343	io_uring: remove submission references Requests are by default given with two references, submission and completion. Completion references are straightforward, they represent request ownership and are put when a request is completed or so. Submission references are a bit more trickier. They're needed when io_issue_sqe() followed deep into the submission stack (e.g. in fs, block, drivers, etc.), request may have given away for concurrent execution or already completed, and the code unwinding back to io_issue_sqe() may be accessing some pieces of our requests, e.g. file or iov. Now, we prevent such async/in-depth completions by pushing requests through task_work. Punting to io-wq is also done through task_works, apart from a couple of cases with a pretty well known context. So, there're two cases: 1) io_issue_sqe() from the task context and protected by ->uring_lock. Either requests return back to io_uring or handed to task_work, which won't be executed because we're currently controlling that task. So, we can be sure that requests are staying alive all the time and we don't need submission references to pin them. 2) io_issue_sqe() from io-wq, which doesn't hold the mutex. The role of submission reference is played by io-wq reference, which is put by io_wq_submit_work(). Hence, it should be fine. Considering that, we can carefully kill the submission reference. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6b68f1c763229a590f2a27148aee77767a8d7750.1628705069.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:32 -06:00
Pavel Begunkov	91c2f69783	io_uring: remove req_ref_sub_and_test() Soon, we won't need to put several references at once, remove req_ref_sub_and_test() and @nr argument from io_put_req_deferred(), and put the rest of the references by hand. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1868c7554108bff9194fb5757e77be23fadf7fc0.1628705069.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:32 -06:00
Pavel Begunkov	21c843d582	io_uring: move req_ref_get() and friends Move all request refcount helpers to avoid forward declarations in the future. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/89fd36f6f3fe5b733dfe4546c24725eee40df605.1628705069.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:32 -06:00
Jens Axboe	79ebeaee8a	io_uring: remove IRQ aspect of io_ring_ctx completion lock We have no hard/soft IRQ users of this lock left, remove any IRQ disabling/saving and restoring when grabbing this lock. This is straight forward with no users entering with IRQs disabled anymore, the only thing to look out for is the waitqueue poll head lock which nests inside the completion lock. That needs IRQs disabled, and hence we have to do that now instead of relying on the outer lock doing so. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:32 -06:00
Jens Axboe	8ef12efe26	io_uring: run regular file completions from task_work This is in preparation to making the completion lock work outside of hard/soft IRQ context. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:32 -06:00
Jens Axboe	89b263f6d5	io_uring: run linked timeouts from task_work This is in preparation to making the completion lock work outside of hard/soft IRQ context. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:32 -06:00
Jens Axboe	89850fce16	io_uring: run timeouts from task_work This is in preparation to making the completion lock work outside of hard/soft IRQ context. Add a timeout_lock to handle the ordering of timeout completions or cancelations with the timeouts actually triggering. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:32 -06:00
Pavel Begunkov	62906e89e6	io_uring: remove file batch-get optimisation For requests with non-fixed files, instead of grabbing just one reference, we get by the number of left requests, so the following requests using the same file can take it without atomics. However, it's not all win. If there is one request in the middle not using files or having a fixed file, we'll need to put back the left references. Even worse if an application submits requests dealing with different files, it will do a put for each new request, so doubling the number of atomics needed. Also, even if not used, it's still takes some cycles in the submission path. If a file used many times, it rather makes sense to pre-register it, if not, we may fall in the described pitfall. So, this optimisation is a matter of use case. Go with the simpliest code-wise way, remove it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:32 -06:00
Pavel Begunkov	6294f3686b	io_uring: clean up tctx_task_work() After recent fixes, tctx_task_work() always does proper spinlocking before looking into ->task_list, so now we don't need atomics for ->task_state, replace it with non-atomic task_running using the critical section. Tide it up, combine two separate block with spinlocking, and always try to splice in there, so we do less locking when new requests are arriving during the function execution. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> [axboe: fix missing ->task_running reset on task_work_add() failure] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:32 -06:00
Pavel Begunkov	5d70904367	io_uring: inline io_poll_remove_waitqs Inline io_poll_remove_waitqs() into its only user and clean it up. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2f1a91a19ffcd591531dc4c61e2f11c64a2d6a6d.1628536684.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:26 -06:00
Pavel Begunkov	90f67366cb	io_uring: remove extra argument for overflow flush Unlike __io_cqring_overflow_flush(), nobody does forced flushing with io_cqring_overflow_flush(), so removed the argument from it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7594f869ca41b7cfb5a35a3c7c2d402242834e9e.1628536684.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:10:19 -06:00
Pavel Begunkov	cd0ca2e048	io_uring: inline struct io_comp_state Inline struct io_comp_state into struct io_submit_state. They are already coupled tightly, together with mixed responsibilities it only brings confusion having them separately. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e55bba77426b399e3a2e54e3c6c267c6a0fc4b57.1628536684.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:09:48 -06:00
Pavel Begunkov	bb943b8265	io_uring: use inflight_entry instead of compl.list req->compl.list is used to cache freed requests, and so can't overlap in time with req->inflight_entry. So, use inflight_entry to link requests and remove compl.list. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e430e79d22d70a190d718831bda7bfed1daf8976.1628536684.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:09:43 -06:00
Pavel Begunkov	7255834ed6	io_uring: remove redundant args from cache_free We don't use @tsk argument of io_req_cache_free(), remove it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6a28b4a58ee0aaf0db98e2179b9c9f06f9b0cca1.1628536684.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:09:43 -06:00
Pavel Begunkov	c34b025f2d	io_uring: cache __io_free_req()'d requests Don't kfree requests in __io_free_req() but put them back into the internal request cache. That makes allocations more sustainable and will be used for refcounting optimisations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9f4950fbe7771c8d41799366d0a3a08ac3040236.1628536684.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:09:43 -06:00
Pavel Begunkov	f56165e62f	io_uring: move io_fallback_req_func() Move io_fallback_req_func() to kill yet another forward declaration. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d0a8f9d9a0057ed761d6237167d51c9378798d2d.1628536684.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:09:22 -06:00
Pavel Begunkov	e9dbe221f5	io_uring: optimise putting task struct We cache all the reference to task + tctx, so if io_put_task() is called by the corresponding task itself, we can save on atomics and return the refs right back into the cache. It's beneficial for all inline completions, and also iopolling, when polling and submissions are done by the same task, including SQPOLL\|IOPOLL. Note: io_uring_cancel_generic() can return refs to the cache as well, so those should be flushed in the loop for tctx_inflight() to work right. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6fe9646b3cb70e46aca1f58426776e368c8926b3.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:08:06 -06:00
Pavel Begunkov	af066f31eb	io_uring: drop exec checks from io_req_task_submit In case of on-exec io_uring cancellations, tasks already wait for all submitted requests to get completed/cancelled, so we don't need to check for ->in_execve separately. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/be8707049f10df9d20ca03dc4ca3316239b5e8e0.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:08:06 -06:00
Pavel Begunkov	bbbca09489	io_uring: kill unused IO_IOPOLL_BATCH IO_IOPOLL_BATCH is not used, delete it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b2bdf19dbee2c9fc8865bbab9412135a14e24a64.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:08:06 -06:00
Pavel Begunkov	58d3be2c60	io_uring: improve ctx hang handling If io_ring_exit_work() can't get it done in 5 minutes, something is going very wrong, don't keep spinning at HZ / 20 rate, it doesn't help and it may take much of CPU time if there is a lot of workers stuck as such. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9e2d1ca81d569f6bc628af1a42ff6663bff7ce9c.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:59 -06:00
Pavel Begunkov	d3fddf6ddd	io_uring: deduplicate open iopoll check Move IORING_SETUP_IOPOLL check into __io_openat_prep(), so both openat and openat2 reuse it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9a73ce83e4ee60d011180ef177eecef8e87ff2a2.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:59 -06:00
Pavel Begunkov	543af3a13d	io_uring: inline io_free_req_deferred Inline io_free_req_deferred(), there is no reason to keep it separated. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ce04b7180d4eac0d69dd00677b227eefe80c2cc5.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:59 -06:00
Pavel Begunkov	b9bd2bea0f	io_uring: move io_rsrc_node_alloc() definition Move the function together with io_rsrc_node_ref_zero() in the source file as it is to get rid of forward declarations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4d81f6f833e7d017860b24463a9a68b14a8a5ed2.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:59 -06:00
Pavel Begunkov	6a290a1442	io_uring: move io_put_task() definition Move the function in the source file as it is to get rid of forward declarations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/33d917d69e4206557c75a5b98fe22bcdf77ce47d.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:59 -06:00
Pavel Begunkov	e73c5c7cd3	io_uring: extract a helper for ctx quiesce Refactor __io_uring_register() by extracting a helper responsible for ctx queisce. Looks better and will make it easier to add more optimisations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0339e0027504176be09237eefa7945bf9a6f153d.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:59 -06:00
Pavel Begunkov	90291099f2	io_uring: optimise io_cqring_wait() hot path Turns out we always init struct io_wait_queue in io_cqring_wait(), even if it's not used after, i.e. there are already enough of CQEs. And often it's exactly what happens, for instance, requests may have been completed inline, or in case of io_uring_enter(submit=N, wait=1). It shows up in my profiler, so optimise it by delaying the struct init. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6f1b81c60b947d165583dc333947869c3d85d037.1628471125.git.asml.silence@gmail.com [axboe: fixed up for new cqring wait] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:59 -06:00
Pavel Begunkov	282cdc8693	io_uring: add more locking annotations for submit Add more annotations for submission path functions holding ->uring_lock. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/128ec4185e26fbd661dd3a424aa66108ee8ff951.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:59 -06:00
Pavel Begunkov	a2416e1ec2	io_uring: don't halt iopoll too early IOPOLL users should care more about getting completions for requests they submitted, but not in "device did/completed something". Currently, io_do_iopoll() may return a positive number, which will instruct io_iopoll_check() to break the loop and end the syscall, even if there is not enough CQEs or none at all. Don't return positive numbers, so io_iopoll_check() exits only when it gets an actual error, need reschedule or got enough CQEs. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/641a88f751623b6758303b3171f0a4141f06726e.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:59 -06:00
Pavel Begunkov	864ea921b0	io_uring: refactor io_alloc_req Replace the main if of io_flush_cached_reqs() with inverted condition + goto, so all the cases are handled in the same way. And also extract io_preinit_req() to make it cleaner and easier to refer to. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1abcba1f7b55dc53bf1dbe95036e345ffb1d5b01.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:59 -06:00
Pavel Begunkov	8724dd8c83	io-wq: improve wq_list_add_tail() Prepare nodes that we're going to add before actually linking them, it's always safer and costs us nothing. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f7e53f0c84c02ed6748c488ed0789b98f8cc6185.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:56 -06:00
Pavel Begunkov	2215bed924	io_uring: remove unnecessary PF_EXITING check We prefer nornal task_works even if it would fail requests inside. Kill a PF_EXITING check in io_req_task_work_add(), task_work_add() handles well dying tasks, i.e. return error when can't enqueue due to late stages of do_exit(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/fc14297e8441cd8f5d1743a2488cf0df09bf48ac.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:56 -06:00
Pavel Begunkov	ebc11b6c6b	io_uring: clean io-wq callbacks Move io-wq callbacks closer to each other, so it's easier to work with them, and rename io_free_work() into io_wq_free_work() for consistency. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/851bbc7f0f86f206d8c1333efee8bcb9c26e419f.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:56 -06:00
Pavel Begunkov	c97d8a0f68	io_uring: avoid touching inode in rw prep If we use fixed files, we can be sure (almost) that REQ_F_ISREG is set. However, for non-reg files io_prep_rw() still will look into inode to double check, and that's expensive and can be avoided. The only caveat is that it only currently works with 64+ bit architectures, see FFS_ISREG, so we should consider that. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0a62780c491ca2522cd52db4ae3f16e03aafed0f.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:56 -06:00
Pavel Begunkov	b191e2dfe5	io_uring: rename io_file_supports_async() io_file_supports_async() checks whether a file supports nowait operations, so "async" in the name is misleading. Rename it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/33d55b5ce43aa1884c637c1957f1e30d30dc3bec.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:56 -06:00
Pavel Begunkov	ac177053bb	io_uring: inline fixed part of io_file_get() Optimise io_file_get() with registered files, which is in a hot path, by inlining parts of the function. Saves a function call, and inefficiencies of passing arguments, e.g. evaluating (sqe_flags & IOSQE_FIXED_FILE). It couldn't have been done before as compilers were refusing to inline it because of the function size. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/52115cd6ce28f33bd0923149c0e6cb611084a0b1.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:56 -06:00
Pavel Begunkov	042b0d85ea	io_uring: use kvmalloc for fixed files Instead of hand-coded two-level tables for registered files, allocate them with kvmalloc(). In many cases small enough tables are enough, and so can be kmalloc()'ed removing an extra memory load and a bunch of bit logic instructions from the hot path. If the table is larger, we trade off all the pros with a TLB-assisted memory lookup. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/280421d3b48775dabab773006bb5588c7b2dabc0.1628471125.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:56 -06:00
Jens Axboe	5fd4617840	io_uring: be smarter about waking multiple CQ ring waiters Currently we only wake the first waiter, even if we have enough entries posted to satisfy multiple waiters. Improve that situation so that every waiter knows how much the CQ tail has to advance before they can be safely woken up. With this change, if we have N waiters each asking for 1 event and we get 4 completions, then we wake up 4 waiters. If we have N waiters asking for 2 completions and we get 4 completions, then we wake up the first two. Previously, only the first waiter would've been woken up. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:56 -06:00
Jens Axboe	d3e9f732c4	io-wq: remove GFP_ATOMIC allocation off schedule out path Daniel reports that the v5.14-rc4-rt4 kernel throws a BUG when running stress-ng: \| [ 90.202543] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:35 \| [ 90.202549] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2047, name: iou-wrk-2041 \| [ 90.202555] CPU: 5 PID: 2047 Comm: iou-wrk-2041 Tainted: G W 5.14.0-rc4-rt4+ #89 \| [ 90.202559] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 \| [ 90.202561] Call Trace: \| [ 90.202577] dump_stack_lvl+0x34/0x44 \| [ 90.202584] ___might_sleep.cold+0x87/0x94 \| [ 90.202588] rt_spin_lock+0x19/0x70 \| [ 90.202593] ___slab_alloc+0xcb/0x7d0 \| [ 90.202598] ? newidle_balance.constprop.0+0xf5/0x3b0 \| [ 90.202603] ? dequeue_entity+0xc3/0x290 \| [ 90.202605] ? io_wqe_dec_running.isra.0+0x98/0xe0 \| [ 90.202610] ? pick_next_task_fair+0xb9/0x330 \| [ 90.202612] ? __schedule+0x670/0x1410 \| [ 90.202615] ? io_wqe_dec_running.isra.0+0x98/0xe0 \| [ 90.202618] kmem_cache_alloc_trace+0x79/0x1f0 \| [ 90.202621] io_wqe_dec_running.isra.0+0x98/0xe0 \| [ 90.202625] io_wq_worker_sleeping+0x37/0x50 \| [ 90.202628] schedule+0x30/0xd0 \| [ 90.202630] schedule_timeout+0x8f/0x1a0 \| [ 90.202634] ? __bpf_trace_tick_stop+0x10/0x10 \| [ 90.202637] io_wqe_worker+0xfd/0x320 \| [ 90.202641] ? finish_task_switch.isra.0+0xd3/0x290 \| [ 90.202644] ? io_worker_handle_work+0x670/0x670 \| [ 90.202646] ? io_worker_handle_work+0x670/0x670 \| [ 90.202649] ret_from_fork+0x22/0x30 which is due to the RT kernel not liking a GFP_ATOMIC allocation inside a raw spinlock. Besides that not working on RT, doing any kind of allocation from inside schedule() is kind of nasty and should be avoided if at all possible. This particular path happens when an io-wq worker goes to sleep, and we need a new worker to handle pending work. We currently allocate a small data item to hold the information we need to create a new worker, but we can instead include this data in the io_worker struct itself and just protect it with a single bit lock. We only really need one per worker anyway, as we will have run pending work between to sleep cycles. https://lore.kernel.org/lkml/20210804082418.fbibprcwtzyt5qax@beryllium.lan/ Reported-by: Daniel Wagner <dwagner@suse.de> Tested-by: Daniel Wagner <dwagner@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 13:07:56 -06:00
Linus Torvalds	e22ce8eb63	Linux 5.14-rc7	2021-08-22 14:24:56 -07:00
Linus Torvalds	1bdc3d5be7	powerpc fixes for 5.14 #6 - Fix random crashes on some 32-bit CPUs by adding isync() after locking/unlocking KUEP - Fix intermittent crashes when loading modules with strict module RWX - Fix a section mismatch introduce by a previous fix. Thanks to: Christophe Leroy, Fabiano Rosas, Laurent Vivier, Murilo Opsfelder Araújo, Nathan Chancellor, Stan Johnson. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmEhjVUTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgM/VD/4o5D9d2Xppt+T0zdnKomq+3ffkC33/ zBK4vqOVOXbnlRpChqIqsHB3LNxMNMvTVaoLvxgy3ZQ57+rnirSDaFOaj4Nbazdx STwWmyxW9xPshqvj8tz8uHadSkvbrCClFy59FXtJf4H/iztTnQORKnXI9r3wxXS+ wBhw8Nhquuqg4O5h4q6yLLRIAaskus7uymDzYHVZkHO0RhfPLEZJwfCxydc29ukK wIRB6qojFBbWm/UscY1w6FiYrBn4Y5F3DzoTzJ7xlO6l1NYaE+58aun/oTGU7922 /8fXYs34TnkF6sA9qGhOtOc1MXfH8meFoH9s/fY3Z3O88xTe8k15wo2Ujlk/u0X1 1Gzv9FZI0RnpPtSLPiiu72/zS/vFOxAVCFMTvcodlte9RN90fW5Qwt/O1ya22vWt Ea3O9iNmYgQ+lV7ZZYDtKQ22WHIublg6cY5d3NDyj5HrzN/vGyp3QJFb2dnWoEpx k/KkK16oiIlduLGiFoYjn1ELyHUBTvp483y7zmspA4fCb0ue6W8b2zt8FszH0hI7 N4uroGXuk9OyhNsLWR8UHUR0s6Gi0XSaQ0O4XgWfoDAAvdev4oZiCqw0q5552OvX eE/Ogxc7INCiaoeLwOhYhCKjr+jBP8fhqyQzquyqMgUqEbxLtcFZCJ09bpXHjjiH OlAvZwlzOhwcKg== =K2B0 -----END PGP SIGNATURE----- Merge tag 'powerpc-5.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix random crashes on some 32-bit CPUs by adding isync() after locking/unlocking KUEP - Fix intermittent crashes when loading modules with strict module RWX - Fix a section mismatch introduce by a previous fix. Thanks to Christophe Leroy, Fabiano Rosas, Laurent Vivier, Murilo Opsfelder Araújo, Nathan Chancellor, and Stan Johnson. h# -----BEGIN PGP SIGNATURE----- * tag 'powerpc-5.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fix set_memory_*() against concurrent accesses powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP powerpc/xive: Do not mark xive_request_ipi() as __init	2021-08-22 09:49:31 -07:00
Linus Torvalds	9ff50bf2f2	Two clk driver fixes - Make the regulator state match the GDSC power domain state at boot on Qualcomm SoCs so that the regulator isn't turned off inadvertently. - Fix earlycon on i.MX6Q SoCs -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmEhQMcRHHNib3lkQGtl cm5lbC5vcmcACgkQrQKIl8bklSX+8RAA1oxdmw6UzJPh0QgH89oV+B8/SAPF4+so h8froa/uvy/xDzhtd326umxEQmu4JVnwtXepSQEvSbBiKl+Llz9ODCjQkMI3RoXd 1QbpCtb4TFWKk0pYZeV5XqhWHWXgzkT3dsFU6buWD2SscGbenWAXm3dDScanYulb 7XPXVXvNoTc+ekChzexUsvMnDVlwIBcDTMkiQ0N4penOMJWKFS40zfPbZwUpJZC1 M/V2Ua7cvDLVWKqnRBm2xlnhn1j23xKVWLtElJYq75Ea86zihZbiq8bri22fgnNY /K0H3IQKtoMkRq7JhYlWtqX9tVFUP+ia2KBlW/lXk0W4XWvTmu6zWp8BBPGWJXDx G+Os+Dzj351RiKmIRRVMFS8mUCw5Sjs5T6YHwR4yItxp4syUTcPCHiPAJ/en6Y9U alC+6F2EEe8rVgQT3O0UTIGwHSXs8nssl2A56IGoIgB8eO/GkjnqtuESBGsPJXZP SfOKSFKwr9sSjw5lhyvO6K/k+VfpIbxI3eFU1K9MbOTVW/1R7anbZ2REtO/ncLu7 nazLlsWYos/XodRNjdKRuDmgQKwr63Dq/AHaKeF/nJxLRLG8kf3b2U3ij9A9LvfN WUP5gi8UsFuzbpwUVe2MCT5ro9M607kDydatfnvfLtllOov6JgWMSJhQPs/NIWrd 3rGb3K1Nmw0= =EGjz -----END PGP SIGNATURE----- Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk driver fixes from Stephen Boyd: - Make the regulator state match the GDSC power domain state at boot on Qualcomm SoCs so that the regulator isn't turned off inadvertently. - Fix earlycon on i.MX6Q SoCs * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: qcom: gdsc: Ensure regulator init state matches GDSC state clk: imx6q: fix uart earlycon unwork	2021-08-21 11:27:16 -07:00
Linus Torvalds	9085423f0e	Char/Misc driver fixes for 5.14-rc7 Here are some small driver fixes for 5.14-rc7. They consist of: - revert for an interconnect patch that was found to have problems. - ipack tpci200 driver fixes for reported problems - slimbus messaging and ngd fixes for reported problems. All are small and have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYSEfhw8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ylxAQCfYvG2nCMkIvBi7MXNwRS0go5YJyAAn3RprTCU V7dduATGH8zJ4u6AOdtg =Uq3v -----END PGP SIGNATURE----- Merge tag 'char-misc-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small driver fixes for 5.14-rc7. They consist of: - revert for an interconnect patch that was found to have problems - ipack tpci200 driver fixes for reported problems - slimbus messaging and ngd fixes for reported problems All are small and have been in linux-next for a while with no reported issues" * tag 'char-misc-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: ipack: tpci200: fix memory leak in the tpci200_register ipack: tpci200: fix many double free issues in tpci200_pci_probe slimbus: ngd: reset dma setup during runtime pm slimbus: ngd: set correct device for pm slimbus: messaging: check for valid transaction id slimbus: messaging: start transaction ids from 1 instead of zero Revert "interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate"	2021-08-21 11:22:10 -07:00
Linus Torvalds	f4ff9e6b01	USB fix for 5.14-rc7 Here is a single USB typec tcpm fix for a reported problem for 5.14-rc7. It showed up in 5.13 and resolves an issue that Hans found. It has been in linux-next this week with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYSEgIQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymZpQCfeD9XR0ExAJgn7E0sNuLaOi2BxmUAn3Eu2IFu dFcSEpQtg23mY6aBFWfH =7kfD -----END PGP SIGNATURE----- Merge tag 'usb-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fix from Greg KH: "Here is a single USB typec tcpm fix for a reported problem for 5.14-rc7. It showed up in 5.13 and resolves an issue that Hans found. It has been in linux-next this week with no reported problems" * tag 'usb-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: typec: tcpm: Fix VDMs sometimes not being forwarded to alt-mode drivers	2021-08-21 11:10:06 -07:00

1 2 3 4 5 ...

1031339 Commits