IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Back when we relied on the journal sequence number blacklist machinery
for consistency between btree and the journal, we needed to ensure a new
journal entry was written before any btree writes were done. But, this
had the side effect of consuming some space in the journal prior to
doing journal replay - which could lead to a very wedged filesystem,
since we don't yet have a way to grow the journal prior to going RW.
Fortunately, the journal sequence number blacklist machinery isn't
needed anymore, as btree node pointers now record the numer of sectors
currently written to that node - that code should all be ripped out.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Snapshot deletion needs to become a multi step process, where we unlink,
then tear down the page cache, then delete the subvolume - the deleting
flag is equivalent to an inode with i_nlink = 0.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
It seems some users have reflink pointers which span many indirect
extents, from a short window in time when merging of reflink pointers
was allowed.
Now, we're seeing transaction path overflows in fix_reflink_p(), the
code path to clear out the reflink_p fields now used for front/back pad
- but, we don't actually need to be running triggers in that path, which
is an easy partial fix.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
for_each_btree_key() now calls bch2_trans_begin() as needed; that means,
we can also call it when we're in danger of overflowing transaction
paths.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The way __bch2_mark_reflink_p returns errors was clashing with returning
the number of sectors processed - we weren't returning FSCK_ERR_EXIT
correctly.
Fix this by only using the return code for errors, which actually ends
up simplifying the overall logic.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This fixes a possible race where we fail to remove a device because of
btree nodes still on it, that are being deleted by in flight btree
updates.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We have been getting away from handling transaction restarts locally -
convert bch2_btree_node_rewrite() to the newer style.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
But we don't need to call it from outside the btree iterator code
anymore, since it's called by bch2_trans_begin() and
bch2_btree_path_traverse().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is a hacky but effective fix to device usage stats for superblock
and journal being wrong on a newly added device (following the comment
that already told us how it needed to be done!)
Reported-by: Chris Webb <chris@arachsys.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
readdir() in a directory with many subvolumes could overflow transaction
paths - this is a simple hack around the issue.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This changes the on disk format for dirents that point to subvols so
that they also record the subvolid of the parent subvol, so that we can
filter them out in other subvolumes.
This also updates the dirent code to do that filtering, and in
particular tweaks the rename code - we need to ensure that there's only
ever one dirent (counting multiplicities in different snapshots) that
point to a subvolume.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Code that uses for_each_btree_key often wants transaction restarts to be
handled locally and not returned. Originally, we wouldn't return
transaction restarts if there was a single iterator in the transaction -
the reasoning being if there weren't other iterators being invalidated,
and the current iterator was being advanced/retraversed, there weren't
any locks or iterators we were required to preserve.
But with the btree_path conversion that approach doesn't work anymore -
even when we're using for_each_btree_key() with a single iterator there
will still be two paths in the transaction, since we now always preserve
the path at the pos the iterator was initialized at - the reason being
that on restart we often restart from the same place.
And it turns out there's now a lot of for_each_btree_key() uses that _do
not_ want transaction restarts handled locally, and should be returning
them.
This patch splits out for_each_btree_key_norestart() and
for_each_btree_key_continue_norestart(), and converts existing users as
appropriate. for_each_btree_key(), for_each_btree_key_continue(), and
for_each_btree_node() now handle transaction restarts themselves by
calling bch2_trans_begin() when necessary - and the old hack to not
return transaction restarts when there's a single path in the
transaction has been deleted.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
It's not an error if we don't have cached data - skip BCH_DATA_cached in
bch2_have_enough_devs().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
check_path() wasn't checking the snapshot ID when checking for directory
structure loops - so, snapshots would cause us to detect a loop that
wasn't actually a loop.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
When a reflink pointer points to only part of an indirect extent, and
then that indirect extent is fragmented (e.g. by copygc), if the reflink
pointer only points to one of the fragments we leak a reference.
Fix this by storing front/back pad values in reflink pointers - when
inserting reflink pointesr, we initialize them to cover the full range
of the indirect extents we reference.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We had a bug where reflink_p pointers weren't being initialized to 0,
and when we started using the second word, things broke badly.
This patch revs the on disk format version and adds cleanup code to zero
out the second word of reflink_p pointers before we start using it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
It shouldn't be necessary when we're only using a single iterator and
not doing updates, but that's harder to debug at the moment.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Now that peek_node()/next_node() are converted to return errors
directly, we don't need bch2_trans_exit() to return errors - it's
cleaner this way and wasn't used much anymore.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This changes for_each_btree_node() to work like for_each_btree_key(),
and to that end bch2_btree_iter_peek_node() and next_node() also return
error ptrs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
When a reflink pointer points to an indirect extent that doesn't exist,
we need to replace it with a KEY_TYPE_error key.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Checking of directory structure across subvolumes was broken - we need
to look up the snapshot ID of the parent subvolume when crossing subvol
boundaries.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Subvolume deletion doesn't flush & evict the btree key cache - ideally
it would, but that's tricky, so instead bch2_subvolume_create() needs to
make sure the slot doesn't exist in the key cache to avoid triggering
assertions.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Type size_t is architecture-specific. Fix warnings for some non-amd64
arches.
Signed-off-by: Brett Holman <bholman.devel@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This bug was only discovered when we started using the 2nd word in the
val, which should have been zeroed out as those fields had never been
used before - ouch.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
When force-removing a device, we were silently dropping extents that we
no longer had pointers for - we should have been switching them to
KEY_TYPE_error, so that reads for data that was lost return errors.
This patch adds the logic for switching a key to KEY_TYPE_error to
bch2_bkey_drop_ptr(), and improves the logic somewhat.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
With snapshots, __bch2_dev_usr_data_drop() now uses an ALL_SNAPSHOTS
iterator, which isn't an extent iterator - meaning we shouldn't be
inserting whiteouts with nonzero size to delete. This fixes a bug where
we go RO because we tried to insert an invalid key in the device remove
path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
It was switching off of the key type incorrectly - this code must've
been quite old, and not rereplicating anything that wasn't a
btree_ptr_v1 or a plain old extent.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
When we delete a snapshot, we unlink the inode but we don't want to run
the inode_rm path - the unlink path deletes the subvolume directly,
which does everything we need. Also allowing the inode_rm path to run
was getting us "missing subvolume" errors.
There's still another bug with snapshot deletion: we need to make
snapshot deletion a multi stage process, where we unlink the root
dentry, then tear down the page cache, then delete the snapshot.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
this_cpu_ptr() emits a warning when used without preemption disabled -
harmless in this case, as we have other locking where
bch2_acc_percpu_u64s() is used.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
- check for getting to the end of the btree in bch2_path_verify_locks
and __btree_path_traverse_all(), this fixes an infinite loop in
__btree_path_traverse_all().
- relax requirement in bch2_btree_node_upgrade() that we must want an
intent lock, this fixes bugs with paths that point to interior nodes
(nonzero level).
- bch2_btree_node_update_key(): fix it to upgrade the path to an intent
lock, if necessary
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Btree node iterators want the interior btree_path to point to the same
pos as the returned btree node - this fixes a regression from the
introduction of btree_path, where rewriting/updating keys of btree nodes
(e.g. in bch2_dev_metadata_drop()) via btree node iterators.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
It was missing a lockrestart_do(), to call bch2_trans_begin() and also
handle transaction restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We return 1 to indicate kthread_should_stop() returned true - we
shouldn't be printing an error.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We were getting spurious "multiple types of data in same bucket" errors
in fsck, because the check was running for (cached) stale pointers -
oops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We were incorrectly using bch2_inode_write(), which gets the snapshot ID
from the iterator, with a BTREE_ITER_ALL_SNAPSHOTS iterator -
fortunately caught by an assertion in the update path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This will cause the compat code to be run that creates entries in the
subvolumes and snapshots btrees.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We can end up in a strange situation where a btree_path points to a node
being freed even after pointers to it should have been replaced by
pointers to the new node - if the btree node has been reused since the
pointer to it was created.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is the final patch in the patch series implementing snapshots.
This patch implements two new ioctls that work like creation and
deletion of directories, but fancier.
- BCH_IOCTL_SUBVOLUME_CREATE, for creating new subvolumes and snaphots
- BCH_IOCTL_SUBVOLUME_DESTROY, for deleting subvolumes and snapshots
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Now that all the existing code has been converted for snapshots, this
patch changes the code for initializing a btree iterator to require a
snapshot to be specified, and also change bkey_invalid() to allow for
non U32_MAX snapshot IDs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>