linux

iv/linux

Author	SHA1	Message	Date
Pavel Begunkov	553deffd09	io_uring: don't pass state to io_submit_state_end Submission state and ctx and coupled together, no need to passs Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e22d77a5786ef77e0c49b933ad74bae55cfb6ca6.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	1cce17aca6	io_uring: don't pass tail into io_free_batch_list io_free_batch_list() iterates all requests in the passed in list, so we don't really need to know the tail but can keep iterating until meet NULL. Just pass the first node into it and it will be enough. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4a12c84b6d887d980e05f417ba4172d04c64acae.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	d4b7a5ef2b	io_uring: inline completion batching helpers We now have a single function for batched put of requests, just inline struct req_batch and all related helpers into it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/595a2917f80dd94288cd7203052c7934f5446580.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	f5ed3bcd5b	io_uring: optimise batch completion First, convert rest of iopoll bits to single linked lists, and also replace per-request list_add_tail() with splicing a part of slist. With that, use io_free_batch_list() to put/free requests. The main advantage of it is that it's now the only user of struct req_batch and friends, and so they can be inlined. The main overhead there was per-request call to not-inlined io_req_free_batch(), which is expensive enough. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b37fc6d5954b241e025eead7ab92c6f44a42f229.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	b3fa03fd1b	io_uring: convert iopoll_completed to store_release Convert explicit barrier around iopoll_completed to smp_load_acquire() and smp_store_release(). Similar on the callback side, but replaces a single smp_rmb() with per-request smp_load_acquire(), neither imply any extra CPU ordering for x86. Use READ_ONCE as usual where it doesn't matter. Use it to move filling CQEs by iopoll earlier, that will be necessary to avoid traversing the list one extra time in the future. Suggested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8bd663cb15efdc72d6247c38ee810964e744a450.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	3aa83bfb6e	io_uring: add a helper for batch free Add a helper io_free_batch_list(), which takes a single linked list and puts/frees all requests from it in an efficient manner. Will be reused later. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4fc8306b542c6b1dd1d08e8021ef3bdb0ad15010.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	5eef4e87eb	io_uring: use single linked list for iopoll Use single linked lists for keeping iopoll requests, takes less space, may be faster, but mostly will be of benefit for further patches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/314033676b100cd485518c3bc55e1b95a0dcd71f.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	e3f721e6f6	io_uring: split iopoll loop The main loop of io_do_iopoll() iterates and does ->iopoll() until it meets a first completed request, then it continues from that position and splices requests to pass them through io_iopoll_complete(). Split the loop in two for clearness, iopolling and reaping completed requests from the list. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a7f6fd27a94845e5dc925a47a4a9765a92e514fb.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	c2b6c6bc4e	io_uring: replace list with stack for req caches Replace struct list_head free_list serving for caching requests with singly linked stack, which is faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1bc942b82422fb2624b8353bd93aca183a022846.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	0d9521b9b5	io-wq: add io_wq_work_node based stack Apart from just using lists (i.e. io_wq_work_list), we also want to have stacks, which are a bit faster, and have some interoperability between them. Add a stack implementation based on io_wq_work_node and some helpers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5d3a412a5ac0d47e0f0499d70d2207d70a68925e.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	3ab665b74e	io_uring: remove allocation cache array We have several of request allocation layers, remove the last one, which is the submit->reqs array, and always use submit->free_reqs instead. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8547095c35f7a87bab14f6447ecd30a273ed7500.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	6f33b0bc4e	io_uring: use slist for completion batching Currently we collect requests for completion batching in an array. Replace them with a singly linked list. It's as fast as arrays but doesn't take some much space in ctx, and will be used in future patches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a666826f2854d17e9fb9417fb302edfeb750f425.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	5ba3c874eb	io_uring: make io_do_iopoll return number of reqs Don't pass nr_events pointer around but return directly, it's less expensive than pointer increments. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f771a8153a86f16f12ff4272524e9e549c5de40b.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	87a115fb71	io_uring: force_nonspin We don't really need to pass the number of requests to complete into io_do_iopoll(), a flag whether to enforce non-spin mode is enough. Should be straightforward, maybe except io_iopoll_check(). We pass !min there, because we do never enter with the number of already reaped requests is larger than the specified @min, apart from the first iteration, where nr_events is 0 and so the final check should be identical. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/782b39d1d8ec584eae15bca0a1feb6f0571fe5b8.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	6878b40e7b	io_uring: mark having different creds unlikely Hint the compiler that it's not as likely to have creds different from current attached to a request. The current code generation is far from ideal, hopefully it can help to some compilers to remove duplicated jump tables and so. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e7815251ac4bf5a4a23d298c752f029ae19f3837.1632516769.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Hao Xu	8d4af6857c	io_uring: return boolean value for io_alloc_async_data boolean value is good enough for io_alloc_async_data. Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> Link: https://lore.kernel.org/r/20210922101522.9179-1-haoxu@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	68fe256aad	io_uring: optimise io_req_init() sqe flags checks IOSQE_IO_DRAIN is quite marginal and we don't care too much about IOSQE_BUFFER_SELECT. Save to ifs and hide both of them under SQE_VALID_FLAGS check. Now we first check whether it uses a "safe" subset, i.e. without DRAIN and BUFFER_SELECT, and only if it's not true we test the rest of the flags. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/dccfb9ab2ab0969a2d8dc59af88fa0ce44eeb1d5.1631703764.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Pavel Begunkov	a3f349071e	io_uring: remove ctx referencing from complete_post Now completions are done from task context, that means that it's either the task itself, task_work or io-wq worker. In all those cases the ctx will be staying alive by mutexing, explicit referencing or req references by iowq. Remove extra ctx pinning from io_req_complete_post(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/60a0e96434c16ab4fe587651448290d61ec9a113.1631703756.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:53 -06:00
Hao Xu	83f84356bc	io_uring: add more uring info to fdinfo for debug Developers may need some uring info to help themselves debug and address issues in production. This includes sqring/cqring head/tail and the detailed sqe/cqe info, which is very useful when an application is hung on a ring. Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> Link: https://lore.kernel.org/r/20210913130854.38542-1-haoxu@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:52 -06:00
Pavel Begunkov	d97ec6239a	io_uring: kill extra wake_up_process in tw add TWA_SIGNAL already wakes the thread, no need in wake_up_process() after it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7e90cf643f633e857443e0c9e72471b221735c50.1631115443.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:52 -06:00
Pavel Begunkov	c450178d9b	io_uring: dedup CQE flushing non-empty checks We don't do io_submit_flush_completions() when there is no requests enqueued, and every single caller checks for it. Hide that check into the function not forgetting about inlining. That will make it much easier for changing the empty check condition in the future. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d7ff8cef5da1b38e8ea648f5aad9a315ddfc7b57.1631115443.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:52 -06:00
Pavel Begunkov	d81499bfcd	io_uring: inline linked part of io_req_find_next Inline part of __io_req_find_next() that returns a request but doesn't need io_disarm_next(). It's just two places, but makes links a bit faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4126d13f23d0e91b39b3558e16bd86cafa7fcef2.1631115443.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:52 -06:00
Pavel Begunkov	6b639522f6	io_uring: inline io_dismantle_req io_dismantle_req() is hot, and not _too_ huge. Inline it, there are 3 call sites, which hopefully will turn into 2 in the future. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/bdd2dc30716cac270c2403e99bccd6286e4ae201.1631115443.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:52 -06:00
Pavel Begunkov	4b628aeb69	io_uring: kill off ios_left ->ios_left is only used to decide whether to plug or not, kill it to avoid this extra accounting, just use the initial submission number. There is no much difference in regards of enabling plugging, where this one does it in a few more cases, but all major ones should be covered well. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f13993bcf5b477f9a7d52881fc49f9457ea9870a.1631115443.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:52 -06:00
Bixuan Cui	71e1cef2d7	io-wq: Remove duplicate code in io_workqueue_create() While task_work_add() in io_workqueue_create() is true, then duplicate code is executed: -> clear_bit_unlock(0, &worker->create_state); -> io_worker_release(worker); -> atomic_dec(&acct->nr_running); -> io_worker_ref_put(wq); -> return false; -> clear_bit_unlock(0, &worker->create_state); // back to io_workqueue_create() -> io_worker_release(worker); -> kfree(worker); The io_worker_release() and clear_bit_unlock() are executed twice. Fixes: `3146cba99a` ("io-wq: make worker creation resilient against signals") Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Link: https://lore.kernel.org/r/20210911085847.34849-1-cuibixuan@huawei.com Reviwed-by: Hao Xu <haoxu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:52 -06:00
Jens Axboe	a87acfde94	io_uring: dump sqe contents if issue fails I recently had to look at a production problem where a request ended up getting the dreaded -EINVAL error on submit. The most used and hence useless of error codes, as it just tells you that something was wrong with your request, but not more than that. Let's dump the full sqe contents if we run into an issue failure, that'll allow easier diagnosing of a wide variety of issues. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:49:52 -06:00
Jens Axboe	e0d78afeb8	block: fix too broad elevator check in blk_mq_free_request() We added RQF_ELV to tell whether there's an IO scheduler attached, and RQF_ELVPRIV tells us whether there's an IO scheduler with private data attached. Don't check RQF_ELV in blk_mq_free_request(), what we care about here is just if we have scheduler private data attached. This fixes a boot crash Fixes: `2ff0682da6` ("block: store elevator state in request") Reported-by: Yi Zhang <yi.zhang@redhat.com> Reported-by: syzbot+eb8104072aeab6cc1195@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-19 05:48:15 -06:00
Jens Axboe	4f5022453a	nvme: wire up completion batching for the IRQ path Trivial to do now, just need our own io_comp_batch on the stack and pass that in to the usual command completion handling. I pondered making this dependent on how many entries we had to process, but even for a single entry there's no discernable difference in performance or latency. Running a sync workload over io_uring: t/io_uring -b512 -d1 -s1 -c1 -p0 -F1 -B1 -n2 /dev/nvme1n1 /dev/nvme2n1 yields the below performance before the patch: IOPS=254820, BW=124MiB/s, IOS/call=1/1, inflight=(1 1) IOPS=251174, BW=122MiB/s, IOS/call=1/1, inflight=(1 1) IOPS=250806, BW=122MiB/s, IOS/call=1/1, inflight=(1 1) and the following after: IOPS=255972, BW=124MiB/s, IOS/call=1/1, inflight=(1 1) IOPS=251920, BW=123MiB/s, IOS/call=1/1, inflight=(1 1) IOPS=251794, BW=122MiB/s, IOS/call=1/1, inflight=(1 1) which definitely isn't slower, about the same if you factor in a bit of variance. For peak performance workloads, benchmarking shows a 2% improvement. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:40:47 -06:00
Jens Axboe	b688f11e86	io_uring: utilize the io batching infrastructure for more efficient polled IO Wire up using an io_comp_batch for f_op->iopoll(). If the lower stack supports it, we can handle high rates of polled IO more efficiently. This raises the single core efficiency on my system from ~6.1M IOPS to ~6.6M IOPS running a random read workload at depth 128 on two gen2 Optane drives. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:40:46 -06:00
Jens Axboe	c234a65392	nvme: add support for batched completion of polled IO Take advantage of struct io_comp_batch, if passed in to the nvme poll handler. If it's set, rather than complete each request individually inline, store them in the io_comp_batch list. We only do so for requests that will complete successfully, anything else will be completed inline as before. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:40:45 -06:00
Jens Axboe	f794f3351f	block: add support for blk_mq_end_request_batch() Instead of calling blk_mq_end_request() on a single request, add a helper that takes the new struct io_comp_batch and completes any request stored in there. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:40:43 -06:00
Jens Axboe	1aec5e4a29	sbitmap: add helper to clear a batch of tags sbitmap currently only supports clearing tags one-by-one, add a helper that allows the caller to pass in an array of tags to clear. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:40:42 -06:00
Jens Axboe	5a72e899ce	block: add a struct io_comp_batch argument to fops->iopoll() struct io_comp_batch contains a list head and a completion handler, which will allow completions to more effciently completed batches of IO. For now, no functional changes in this patch, we just define the io_comp_batch structure and add the argument to the file_operations iopoll handler. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:40:40 -06:00
Jens Axboe	013a7f9543	block: provide helpers for rq_list manipulation Instead of open-coding the list additions, traversal, and removal, provide a basic set of helpers. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:40:39 -06:00
Jens Axboe	afd7de03c5	block: remove some blk_mq_hw_ctx debugfs entries Just like the blk_mq_ctx counterparts, we've got a bunch of counters in here that are only for debugfs and are of questionnable value. They are: - dispatched, index of how many requests were dispatched in one go - poll_{considered,invoked,success}, which track poll sucess rates. We're confident in the iopoll implementation at this point, don't bother tracking these. As a bonus, this shrinks each hardware queue from 576 bytes to 512 bytes, dropping a whole cacheline. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:40:33 -06:00
Jens Axboe	9a14d6ce41	block: remove debugfs blk_mq_ctx dispatched/merged/completed attributes These were added as part of early days debugging for blk-mq, and they are not really useful anymore. Rather than spend cycles updating them, just get rid of them. As a bonus, this shrinks the per-cpu software queue size from 256b to 192b. That's a whole cacheline less. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:39:48 -06:00
Pavel Begunkov	128459062b	block: cache rq_flags inside blk_mq_rq_ctx_init() Add a local variable for rq_flags, it helps to compile out some of rq_flags reloads. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:38:45 -06:00
Pavel Begunkov	605f784e4f	block: blk_mq_rq_ctx_init cache ctx/q/hctx We should have enough of registers in blk_mq_rq_ctx_init(), store them in local vars, so we don't keep reloading them. note: keeping q->elevator may look unnecessary, but it's also used inside inlined blk_mq_tags_from_data(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:38:43 -06:00
Pavel Begunkov	4f266f2be8	block: skip elevator fields init for non-elv queue Don't init rq->hash and rq->rb_node in blk_mq_rq_ctx_init() if there is no elevator. Also, move some other initialisers that imply barriers to the end, so the compiler is free to rearrange and optimise other the rest of them. note: fold in a change from Jens leaving queue_list unconditional, as it might lead to problems otherwise. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:38:42 -06:00
Jens Axboe	2ff0682da6	block: store elevator state in request Add an rq private RQF_ELV flag, which tells the block layer that this request was initialized on a queue that has an IO scheduler attached. This allows for faster checking in the fast path, rather than having to deference rq->q later on. Elevator switching does full quiesce of the queue before detaching an IO scheduler, so it's safe to cache this in the request itself. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 08:51:52 -06:00
Jens Axboe	90b8faa0e8	block: only mark bio as tracked if it really is tracked We set BIO_TRACKED unconditionally when rq_qos_throttle() is called, even though we may not even have an rq_qos handler. Only mark it as TRACKED if it really is potentially tracked. This saves considerable time for the case where the bio isn't tracked: 2.64% -1.65% [kernel.vmlinux] [k] bio_endio Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 08:50:47 -06:00
Jens Axboe	b608762968	block: improve layout of struct request It's been a while since this was analyzed, move some members around to better flow with the use case. Initial state up top, and queued state after that. This improves my peak case by about 1.5%, from 7750K to 7900K IOPS. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 08:50:42 -06:00
Jens Axboe	9be3e06fb7	block: move update request helpers into blk-mq.c For some reason we still have them in blk-core, with the rest of the request completion being in blk-mq. That causes and out-of-line call for each completion. Move them into blk-mq.c instead, where they belong. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 08:50:28 -06:00
Jens Axboe	c477b79778	block: remove useless caller argument to print_req_error() We have exactly one caller of this, just get rid of adding the useless function name to the output. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 08:50:22 -06:00
Jens Axboe	d4aa57a1ca	block: don't bother iter advancing a fully done bio If we're completing nbytes and nbytes is the size of the bio, don't bother with calling into the iterator increment helpers. Just clear the bio size and we're done. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 08:50:16 -06:00
Pavel Begunkov	ed6cddefdf	block: convert the rest of block to bdev_get_queue Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/addf6ea988c04213697ba3684c853e4ed7642a39.1634219547.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 06:17:37 -06:00
Pavel Begunkov	eab4e02733	block: use bdev_get_queue() in blk-core.c Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/efc41f880262517c8dc32f932f1b23112f21b255.1634219547.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 06:17:36 -06:00
Pavel Begunkov	3caee4634b	block: use bdev_get_queue() in bio.c Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/85c36ea784d285a5075baa10049e6b59e15fb484.1634219547.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 06:17:36 -06:00
Pavel Begunkov	025a38651b	block: use bdev_get_queue() in bdev.c Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a352936ce5d9ac719645b1e29b173d931ebcdc02.1634219547.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 06:17:36 -06:00
Pavel Begunkov	17220ca5ce	block: cache request queue in bdev There are tons of places where we need to get a request_queue only having bdev, which turns into bdev->bd_disk->queue. There are probably a hundred of such places considering inline helpers, and enough of them are in hot paths. Cache queue pointer in struct block_device and make use of it in bdev_get_queue(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a3bfaecdd28956f03629d0ca5c63ebc096e1c809.1634219547.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 06:17:36 -06:00

1 2 3 4 5 ...

1044556 Commits