IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Add a field to struct bset for the sector offset within the btree node
where it was written.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This closes a significant hole (and last known hole) in our ability to
verify metadata. Previously, since btree nodes are log structured, we
couldn't detect lost btree writes that weren't the first write to a
given node. Additionally, this seems to have lead to some significant
metadata corruption on multi device filesystems with metadata
replication: since a write may have made it to one device and not
another, if we read that btree node back from the replica that did have
that write and started appending after that point, the other replica
would have a gap in the bset entries and reading from that replica
wouldn't find the rest of the bsets.
But, since updates to interior btree nodes are now journalled, we can
close this hole by updating pointers to btree nodes after every write
with the currently written number of sectors, without negatively
affecting performance. This means we will always detect lost or corrupt
metadata - it also means that our btree is now a curious hybrid of COW
and non COW btrees, with all the benefits of both (excluding
complexity).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The unit tests hadn't been updated for various recent btree changes -
this patch makes them work again.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The fsck code handles transaction restarts in a very ad hoc way, and not
always correctly. This patch makes some improvements to check_dirents(),
but more work needs to be done to figure out how this kind of code
should be structured.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We weren't correctly verifying that we had interior node intent locks -
this patch also fixes bugs uncovered by the new assertions.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We probably don't ever want to flip this off in production, but it may
be useful for certain kinds of testing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
On fstest generic/388, we were seeing sporadic deadlocks in the
emergency shutdown, where we'd get stuck shutting down the allocator
because bch2_btree_update_start() -> bch2_btree_reserve_get() allocated
and then deallocated some btree nodes, putting them back on the
btree_reserve_cache, after the allocator shutdown code had already
cleared out that cache.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This adds safe versions of bch2_varint_(encode|decode) that don't read
or write past the end of the buffer, or varint being encoded.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is to help debug a rare shutdown deadlock in the allocator code -
the btree code is leaking open_buckets.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is a performance improvement by removing the need to wait for the
in flight btree write to complete before kicking one off, which is going
to be needed to avoid a performance regression with the upcoming patch
to update btree ptrs after every btree write.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Compat features should be cleared if the filesystem was touched by a
version that doesn't support them.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is something we've attempted to stick to for quite some time, as it
helps guarantee filesystem latency - but there's a few remaining paths
that this patch fixes.
This is also necessary for an upcoming patch to update btree pointers
after every btree write - since the btree write completion path will now
be doing btree operations.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
btree_trans should always be passed when we have one - iter->trans is
disfavoured. This mainly updates old code in btree_update_interior.c,
some of which predates btree_trans.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Add basic kernel docs for bch2_trans_reset and bch2_trans_begin.
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
A new device state that is not a valid state should return -EINVAL
in the disk set state ioctl.
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Add a new flag to control assertions about updating to internal snapshot
nodes, that normally should not be written to - to be used in an
upcoming patch.
Also do some renaming - trigger_flags is now update_flags.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This assertion is checking that what the iterator points to is
consistent with iter->real_pos, and since it's an internal btree
ordering property it should be using bpos_cmp.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Internal btree code really wants a POS_MAX with all fields ~0; external
code more likely wants the snapshot field to be 0, because when we're
passing it to bch2_trans_get_iter() it's used for the snapshot we're
operating in, which should be 0 for most btrees that don't use
snapshots.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
xxhash is a much faster algorithm compared to crc32.
could be used to speed up checksum calculation.
xxhash 64-bit only, as it is much faster on 64-bit CPUs compared to xxh32.
Signed-off-by: jpsollie <janpieter.sollie@edpnet.be>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Perform abstraction of hash calculation for advanced checksum algorithms.
Algorithms like xxhash do not store their state as a u64 int.
Signed-off-by: jpsollie <janpieter.sollie@edpnet.be>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
bch2_fs_ioctl() didn't distinguish between unsupported ioctls and those
which the current user is unauthorised to perform. That kept the code
simple but meant that, for example, an unprivileged TIOCGWINSZ ioctl on
a bcachefs file would return -EPERM instead of the expected -ENOTTY.
The same call made by a privileged user would correctly return -ENOTTY.
Fix this discrepancy by moving the check for CAP_SYS_ADMIN into each
privileged ioctl function.
Signed-off-by: Tobias Geerinckx-Rice <me@tobias.gr>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
In !BTREE_ITER_IS_EXTENTS mode, we shouldn't be looking at k->size, i.e.
we shouldn't use bkey_start_pos().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Avoid calling kfree on the returned error pointer if
bch2_acl_from_disk fails.
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The value of f_bfree and f_bavail should be the same. The value of
f_bfree is not currently scaled by the availability factor.
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We need to ensure that packed formats can't represent fields larger than
the unpacked format, which is a bit tricky since the calculations can
also overflow a u64. This patch fixes a shift and simplifies the overall
calculations.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Do not attempt to shortcut a truncate when the given new size is
the same as the current size. There may be blocks allocated to the
file that extend beyond the i_size. The ctime and mtime should
not be updated in this case.
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The implementation of prefetch_four_cachelines should use ifdef
CONFIG_X86_64 to conditionally compile x86_64 asm.
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Ensure that iter->should_be_locked is set to true before we
call bch2_trans_update in __bch2_dev_usrdata_drop.
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
It's unhelpful if we see "Halting mark and sweep to start topology
repair" but we don't see the error that triggered it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Ensure that iter->should_be_locked value is set to true before we
call bch2_trans_update in ec_stripe_update_ptrs.
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
- We no longer mark subsets of extents, they're marked like regular
keys now - which means we can drop the offset & sectors arguments
to trigger functions
- Drop other arguments that are no longer needed anymore in various
places - fs_usage
- Drop the logic for handling extents in bch2_mark_update() that isn't
needed anymore, to match bch2_trans_mark_update()
- Better logic for hanlding the BTREE_ITER_CACHED_NOFILL case, where we
don't have an old key to mark
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
After the v5.12 rebase, we started oopsing when truncate was passed
ATTR_MODE, due to not passing mnt_userns to setattr_copy(). This
refactors things so that truncate/extend finish by using
bch2_setattr_nonsize(), which solves the problem.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Adding iter->should_be_locked introduced a regression where it ended up
not being set on the iterator passed to bch2_btree_update_start(), which
is definitely not what we want.
This patch requires it to be set when calling bch2_trans_update(), and
adds various fixups to make that happen.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
With trans->updates2 gone, we can now drop this helper and use
bch2_btree_delete_at() instead.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Now that bch2_btree_iter_peek_with_updates() has been removed in favor
of BTREE_ITER_WITH_UPDATES, we need to make sure it's not used where we
don't want it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Commit c42bca92be928ce7dece5fc04cf68d0e37ee6718 "bio: don't copy bvec
for direct IO" changed bio_iov_iter_get_pages() to point bio->bi_iovec
at the incoming biovec, meaning if we already allocated one, it'll be
leaked.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This fixes some rare cases where the metadata checksum option specified
may map to the wrong actual checksum type.
Signed-off-by: Janpieter Sollie <janpieter.sollie@edpnet.be>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>