iv/linux

Author	SHA1	Message	Date
Kent Overstreet	2c91ab7262	bcachefs: bch2_dev_get_ioref() checks for device not present Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-09 16:23:36 -04:00
Kent Overstreet	6212ea2497	bcachefs: bch2_dev_get_ioref2(); journal_io.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-09 16:23:35 -04:00
Kent Overstreet	b6d29b5869	bcachefs: kill bch2_dev_bkey_exists() in journal_ptrs_to_text() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-08 17:29:24 -04:00
Kent Overstreet	d8585a79be	bcachefs: bch2_dev_have_ref() bch2_dev_bkey_exists() is going away; bch2_dev_have_ref() documents that we're looking up a device without checking if it's present because we have a reference to it already. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-08 17:29:24 -04:00
Kent Overstreet	b895c70326	bcachefs: x-macroize journal flags enums Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-08 17:29:22 -04:00
Kent Overstreet	c8bda9f20a	bcachefs: Simplify resuming of journal position Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-08 17:29:21 -04:00
Kent Overstreet	45150765d3	bcachefs: bch_member.last_journal_bucket On recovery from clean shutdown we don't typically read the journal, but we still want to avoid overwriting existing entries in the journal for list_journal debugging. Thus, add some fields to the member info section so we can remember where we left off. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-08 17:29:21 -04:00
Kent Overstreet	f04158290d	bcachefs: journal seq blacklist gc no longer has to walk btree Since btree_ptr_v2, we no longer require the journal seq blacklist table for skipping blacklisted bsets (btree node entries); the pointer to a given node indicates how much data is present. Therefore there's no longer any need for journal seq blacklist gc to walk the btree - we can prune entries older than journal last_seq. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-08 17:29:20 -04:00
Kent Overstreet	2f724563fc	bcachefs: member helper cleanups Some renaming for better consistency bch2_member_exists -> bch2_member_alive bch2_dev_exists -> bch2_member_exists bch2_dev_exists2 -> bch2_dev_exists bch_dev_locked -> bch2_dev_locked bch_dev_bkey_exists -> bch2_dev_bkey_exists new helper - bch2_dev_safe Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-08 17:29:19 -04:00
Kent Overstreet	19391b9294	bcachefs: allow for custom action in fsck error messages Be more explicit to the user about what we're doing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-08 17:29:18 -04:00
Kent Overstreet	85ab365f7c	bcachefs: Fix deadlock in journal write path bch2_journal_write() was incorrectly waiting on earlier journal writes synchronously; this usually worked because most of the time we'd be running in the context of a thread that did a journal_buf_put(), but sometimes we'd be running out of the same workqueue that completes those prior journal writes. Additionally, this makes sure to punt to a workqueue before submitting preflushes - we really don't want to be calling submit_bio() in the main transaction commit path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-20 23:00:59 -04:00
Kent Overstreet	9abb6dd7ce	bcachefs: Standardize helpers for printing enum strs with bounds checks Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:17 -04:00
Kent Overstreet	3ed94062e3	bcachefs: Improve bch2_fatal_error() error messages should always include __func__ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-18 00:24:24 -04:00
Kent Overstreet	62f35024b2	bcachefs: Change "accounting overran journal reservation" to a warning This doesn't need to be a BUG_ON(); the actual serious "things break" condition is if the whole journal write overruns the available space, and that has a fatal error, not a BUG_ON(). This check indicates we screwed something up, but it should be a warning. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-17 20:53:11 -04:00
Kent Overstreet	f1ca1abfb0	bcachefs: pull out time_stats.[ch] prep work for lifting out of fs/bcachefs/ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-13 21:30:35 -04:00
Brian Foster	ada02c207c	bcachefs: fix lost journal buf wakeup due to improved pipelining The journal_write_done() handler was reworked into a loop in commit 746a33c96b7a ("bcachefs: better journal pipelining"). As part of this, the journal buffer wake was factored into a post-loop branch that executes if at least one journal buffer has completed. The journal buffer processing loop iterates on the journal buffer pointer, however. This means that w refers to the last buffer processed by the loop, which may or may not be done. This also means that if multiple buffers are processed by the loop, only the last is awoken. This lost wakeup behavior has led to stalling problems in various CI and fstests, such as generic/703. Lift the wake into the loop so each done buffer sees a wake call as it is processed. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-13 21:22:26 -04:00
Kent Overstreet	7efa287526	bcachefs: Fix bch2_journal_noflush_seq() Improved journal pipelining broke journal_noflush_seq(); it implicitly assumed only the oldest outstanding journal buf could be in flight, but that's no longer true. Make this more straightforward by just setting buf->must_flush whenever we know a journal buf is going to be a flush. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-13 21:22:25 -04:00
Kent Overstreet	2cce3752ce	bcachefs: split out ignore_blacklisted, ignore_not_dirty prep work for replaying the journal backwards Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-13 21:22:25 -04:00
Kent Overstreet	0b5961b0d8	bcachefs: jset_entry for loops declare loop iter Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-13 21:22:25 -04:00
Kent Overstreet	d9290c9931	bcachefs: Fix journal_buf bitfield accesses All journal_buf bitfield updates must happen under the journal lock - perhaps we should just switch these to atomic bit flags. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-13 21:22:25 -04:00
Kent Overstreet	cb6fc943b6	bcachefs: kill kvpmalloc() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-13 18:39:12 -04:00
Kent Overstreet	916abefd43	bcachefs: better journal pipelining Recently a severe performance regression was discovered, which bisected to `a6548c8b5e` bcachefs: Avoid flushing the journal in the discard path It turns out the old behaviour, which issued excessive journal flushes, worked around a performance issue where queueing delays would cause the journal to not be able to write quickly enough and stall. The journal flushes masked the issue because they periodically flushed the device write cache, reducing write latency for non flushes. This patch reworks the journalling code to allow more than one (non-flush) write to be in flight at a time. With this patch, doing 4k random writes and an iodepth of 128, we are now able to hit 560k iops to a Samsung 970 EVO Plus - previously, we were stuck in the ~200k range. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-10 15:34:08 -04:00
Kent Overstreet	38789c2508	bcachefs: closure per journal buf Prep work for having multiple journal writes in flight. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-10 15:34:08 -04:00
Kent Overstreet	5165400275	bcachefs: bio per journal buf Prep work for having multiple journal writes in flight. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-10 15:34:08 -04:00
Kent Overstreet	52f7d75e7d	bcachefs: jset_entry_datetime This gives us a way to record the date and time every journal entry was written - useful for debugging. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-10 15:34:08 -04:00
Kent Overstreet	3d3d23b341	bcachefs: improve journal entry read fsck error messages Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-10 15:34:08 -04:00
Kent Overstreet	a555bcf4fa	bcachefs: convert journal replay ptrs to darray Eliminates some error paths - no longer have a hardcoded BCH_REPLICAS_MAX limit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-10 15:34:08 -04:00
Kent Overstreet	bdec47f57f	bcachefs: Journal writes should be REQ_SYNC|REQ_META Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-10 15:34:08 -04:00
Kent Overstreet	656f05d8bd	bcachefs: Split out journal workqueue We don't want journal write completions to be blocked behind btree transactions - io_complete_wq is used for btree updates after data and metadata writes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-10 15:34:08 -04:00
Kent Overstreet	4e07447503	bcachefs: Clamp replicas_required to replicas This prevents going emergency read only when the user has specified replicas_required > replicas. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-02-13 20:33:38 -05:00
Christoph Hellwig	3e44f325f6	bcachefs: fix incorrect usage of REQ_OP_FLUSH REQ_OP_FLUSH is only for internal use in the blk-mq and request based drivers. File systems and other block layer consumers must use REQ_OP_WRITE | REQ_PREFLUSH as documented in Documentation/block/writeback_cache_control.rst. While REQ_OP_FLUSH appears to work for blk-mq drivers it does not get the proper flush state machine handling, and completely fails for any bio based drivers, including all the stacking drivers. The block layer will also get a check in 6.8 to reject this use case entirely. [Note: completely untested, but as this never got fixed since the original bug report in November: https://bugzilla.kernel.org/show_bug.cgi?id=218184 and the discussion in December: https://lore.kernel.org/all/20231221053016.72cqcfg46vxwohcj@moria.home.lan/T/ this seems to be the best way to force it] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-22 12:37:51 -05:00
Kent Overstreet	e58f963cec	bcachefs: helpers for printing data types We need bounds checking since new versions may introduce new data types. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-21 06:01:45 -05:00
Kent Overstreet	4819b66e29	bcachefs: improve checksum error messages new helpers: - bch2_csum_to_text() - bch2_csum_err_msg() standardize our checksum error messages a bit, and print out the checksums a bit more nicely. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:21 -05:00
Kent Overstreet	0beebd9245	bcachefs: bkey_for_each_ptr() now declares loop iter Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:43 -05:00
Kent Overstreet	cea07a7b6a	bcachefs: vstruct_for_each() now declares loop iter Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:42 -05:00
Kent Overstreet	9fea2274f7	bcachefs: for_each_member_device() now declares loop iter Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:42 -05:00
Kent Overstreet	73ffa53056	bcachefs: Drop journal entry compaction Previously, we dropped empty journal entries and coalesced entries that could be - but it's not worth the overhead; we very rarely leave unused journal entries after getting a journal reservation. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:41 -05:00
Kent Overstreet	09caeabe1a	bcachefs: btree write buffer now slurps keys from journal Previously, the transaction commit path would have to add keys to the btree write buffer as a separate operation, requiring additional global synchronization. This patch introduces a new journal entry type, which indicates that the keys need to be copied into the btree write buffer prior to being written out. We switch the journal entry type back to JSET_ENTRY_btree_keys prior to write, so this is not an on disk format change. Flushing the btree write buffer may require pulling keys out of journal entries yet to be written, and quiescing outstanding journal reservations; we previously added journal->buf_lock for synchronization with the journal write path. We also can't put strict bounds on the number of keys in the journal destined for the write buffer, which means we might overflow the size of the preallocated buffer and have to reallocate - this introduces a potentially fatal memory allocation failure. This is something we'll have to watch for, if it becomes an issue in practice we can do additional mitigation. The transaction commit path no longer has to explicitly check if the write buffer is full and wait on flushing; this is another performance optimization. Instead, when the btree write buffer is close to full we change the journal watermark, so that only reservations for journal reclaim are allowed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:41 -05:00
Kent Overstreet	b05c0e9370	bcachefs: journal->buf_lock Add a new lock for synchronizing between journal IO path and btree write buffer flush. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:41 -05:00
Kent Overstreet	9b34f02cdc	bcachefs: Kill dev_usage->buckets_ec This counter is redundant; it's simply the sum of BCH_DATA_stripe and BCH_DATA_parity buckets. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:38 -05:00
Yang Li	225879f403	bcachefs: clean up one inconsistent indenting fs/bcachefs/journal_io.c:1843 bch2_journal_write_pick_flush() warn: inconsistent indenting Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7585 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:38 -05:00
Kent Overstreet	066a26460b	bcachefs: track_event_change() This introduces a new helper for connecting time_stats to state changes, i.e. when taking journal reservations is blocked for some reason. We use this to track separately the different reasons the journal might be blocked - i.e. space in the journal full, or the journal pin fifo full. Also do some cleanup and improvements on the time stats code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:37 -05:00
Kent Overstreet	fa5df9e7d5	bcachefs: Include average write size in sysfs journal_debug Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:36 -05:00
Kent Overstreet	c8296d730f	bcachefs: Fix leakage of internal error code Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-12-21 23:46:52 -05:00
Kent Overstreet	a66ff26b0f	bcachefs: Close journal entry if necessary when flushing all pins Since outstanding journal buffers hold a journal pin, when flushing all pins we need to close the current journal entry if necessary so its pin can be released. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-12-10 16:53:46 -05:00
Kent Overstreet	d5bd37872a	bcachefs: Add missing validation for jset_entry_data_usage Validation was completely missing for replicas entries in the journal (not the superblock replicas section) - we can't have replicas entries pointing to invalid devices. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-28 17:18:24 -05:00
Kent Overstreet	d4e3b928ab	closures: CLOSURE_CALLBACK() to fix type punning Control flow integrity is now checking that type signatures match on indirect function calls. That breaks closures, which embed a work_struct in a closure in such a way that a closure_fn may also be used as a workqueue fn by the underlying closure code. So we have to change closure fns to take a work_struct as their argument - but that results in a loss of clarity, as closure fns have different semantics from normal workqueue functions (they run owning a ref on the closure, which must be released with continue_at() or closure_return()). Thus, this patch introduces CLOSURE_CALLBACK() and closure_type() macros as suggested by Kees, to smooth things over a bit. Suggested-by: Kees Cook <keescook@chromium.org> Cc: Coly Li <colyli@suse.de> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-24 00:29:58 -05:00
Kent Overstreet	497c57a303	bcachefs: Disable debug log statements The journal read path had some informational log statements preparatory for ZNS support - they're not of interest to users, so we can turn them off. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-14 23:44:44 -05:00
Kent Overstreet	769b360049	bcachefs: Don't iterate over journal entries just for btree roots Small performance optimization, and a bit of a code cleanup too. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-05 13:12:18 -05:00
Kent Overstreet	80396a4749	bcachefs: Break up bch2_journal_write() Split up bch2_journal_write() to simplify locking: - bch2_journal_write_pick_flush(), which needs j->lock - bch2_journal_write_prep, which operates on the journal buffer to be written and will need the upcoming buf_lock for synchronization with the btree write buffer flush path Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-05 13:12:18 -05:00

210 Commits