IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
by combining the compression call from both encrypted and unencrypted
paths and deciding on the header magic at one site.
No functional changes intended, besides reusing the same buffer for
compression.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Increase the zstd compression throughput by not using the
`zstd::stream::copy_encode` method, because it seems it uses an
internal buffer size of 32 KiB [0], copies at least once extra in the
target buffer and might have some additional (allocation and/or
syscall) overhead. Due to the amount of wrappers and indirections it's
a bit hard to tell for sure. In anyway, there can be a reduced
throughput observed if all, the target and source storage and the
network are so fast that the operations from creating chunks, like
compressions, can become the bottleneck.
Instead use the lower-level `zstd_safe::compress` which avoids (big)
allocations, since we provide the target buffer.
In case of a compression error just return the uncompressed data,
there's nothing we can do and saving uncompressed data is better than
having none. Additionally, log any such error besides the one for the
target buffer being too small.
Some benchmarks on my machine (Intel i7-12700K with DDR5-4800 memory
using a ASUS Prime Z690-A motherboard) from a tmpfs to a datastore on
tmpfs:
Type without patches (MiB/s) with patches (MiB/s)
.img file ~614 ~767
pxar one big file ~657 ~807
pxar small files ~576 ~627
The new approach is faster by a factor of 1.19.
Note that the new approach should not have a measurable negative
impact, e.g. (peak) memory usage wise. That is because we always
reserved a vector with max-data-size (data length + header length) and
thus did not have to add a new buffer, rather we actually removed the
buffer that the high-level zstd wrapper crate used internally.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
We want to check the error code of zstd not to be 'Destination buffer
to small' (dstSize_tooSmall), but currently there is no practical API
that is also public. So we introduce a helper that uses the internal
logic of zstd to determine the error.
Since this is not guaranteed to be a stable api, add a test for that
so we catch that error early on build. This should be fine, as long as
the zstd behavior only changes with e.g. major debian upgrades, which
is normally the only time where the zstd version is updated.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[ TL: re-order fn, rename test and reword comments ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This is leftover code that is not currently used outside of its own
tests.
Should we need it again, we can just revert this commit.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Commit ea584a75 "move more api types for the client" deprecated
the `archive_type` function in favor of the associated function
`ArchiveType::from_path`.
Replace all remaining callers of the deprecated function with its
replacement.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
[WB: and remove the deprecated function]
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
When getting the `full_path` of a snapshot we did not use the cached
time string. By using it we avoid a call to the super-slow libc strftime.
This has some minor performance improvements of circa 7%. That is ~100ms
on my datastore with ~5000 snapshots.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
... if `.chunks/` is not available(deleted/moved) ChunkStore::open
fails, but that would happen after updating the active operations on the
datastore, so no reference that could be dropped is returned. Leading to
the operations counter to always increase. This only updates the counter
when a reference is returned, not before.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Implement the Chunker trait for a dedicated payload stream chunker,
which extends the regular chunker by the option to suggest boundaries
to be used over the hast based boundaries whenever possible.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Add the Chunker trait and move the current Chunker to ChunkerImpl to
implement the trait instead. This allows to use different chunker
implementations by dynamic dispatch and is in preparation for
implementing a dedicated payload chunker.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
When forcing a boundary, the internal chunker state is not in sync
with the chunk stream anymore. The reset method therefore allows
to reset the internal state.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
In preparation for injecting reused payload chunks in payload streams
for regular files with unchanged metaddata. Allows to get the digest
of a dynamic index entry to construct a reusable dynamic entry from
it.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Maintenance mode Delete locks the datastore. It must not be possible to go
back to normal modes, because the datastore may be in undefined state.
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
No functional change intended: In preparation for including the
removed vanished groups and snapshots statistics in a sync jobs task
log output.
Instead of returning a boolean value showing whether all of the
snapshots of the group have been removed, return an instance of
`BackupGroupDeleteStats`, containing the count of deleted and
protected snapshots, the latter not having been removed from the
group.
The `removed_all` method is introduced as replacement for the previous
boolean return value and can be used to check if all snapshots have
been removed. If there are no protected snapshots, the group is
considered to be deleted.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We keep a DataStore cache, so ChunkStore's and lock files are kept by
the proxy process and don't have to be reopened every time. However,
for specific maintenance modes, e.g. 'offline', our process should not
keep file in that datastore open. This clears the cache entry of a
datastore if it is in a specific maintanance mode and the last task
finished, which also drops any files still open by the process.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Reviewed-by: Gabriel Goller <g.goller@proxmox.com>
Tested-by: Gabriel Goller <g.goller@proxmox.com>
By this it becomes clear that the error stems from a parsing error when
getting the backup group owner.
See also: https://forum.proxmox.com/threads/139482/
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
... making the pull logic independent from the actual source
using two traits.
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
Tested-by: Lukas Wagner <l.wagner@proxmox.com>
Tested-by: Tested-by: Gabriel Goller <g.goller@proxmox.com>
Fixed a few rustdoc warnings. Converted some 'html'-links to
intra-doc-links and surrounded paths with '`'.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
Added lifetime to `find` function. We need this lifetime
because of the `impl MatchList` and 'anonymous lifetimes in
`impl Trait` are unstable'.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
When walking through a datastore on a GC run, it can
happen that the snapshot is deleted, and then walked over.
For example:
- read dir entry for group
- walk entries (snapshots)
- snapshot X is removed/pruned
- walking reaches snapshot X, but ENOENT
Previously we bailed here, now we just ignore it.
Backups that are just created (and a atomic rename from
tmpdir happens, which might triggers a ENOENT error) are
not a problem here, the GC handles them separately.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
this sounded like we need to skip lost+found to avoid pruning too many chunks,
while the opposite is true - it's safe to skip lost+found on EPERM without
pruning too many chunks, but this is not the case for all EPERM situations..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Simply pull out the inner IO error and the affected path first.
Clean up style-wise a bit while touching this anyway, but no semantic
change intended.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Passed a closure with the `stat()` function call to `matches()`. This
will traverse through all patterns and try to match using the path only, if a
`file_mode` is needed, it will run the closure. This means that if we exclude
a file with the `MatchType::ANY_FILE_TYPE`, we will skip it without running
`stat()` on it. As we updated the `matches()` function, we also updated all the
invocations of it.
Added `pathpatterns` crate to local overrides in cargo.toml.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
the wrong info here was rather misleading, especially when encountering errors
just talking about "blobs" when the actual problem is with a chunk.
chunks did originally have their own magic values, but that got removed in
4ee8f53d07bf0becf19a22cf4f4674558a4d128c "remove DataChunk file format - use DataBlob instead"
back in 2019, before the 0.1.1 release(!)
Reported-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
very unexpected and unreachable is probably fine here, but it's not
really winning us anything, so avoid the panic-potential and just
bail out.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
previously when marking used chunks the namespace wasn't taken into
account and valid snapshots were marked as "strange paths". this lead
to a line in the log of a gc job such as this:
found (and marked) 2 index files outside of expected directory scheme
which some users perceived as an error. parse the namespace too and
only mark the path as strange if parsing the namespace and/or backup
dir fails.
Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
these were previously called out in a comment, but should now be handled (as
much as they can be).
the performance impact shouldn't be too bad, since we only look at the magic 8
bytes at the start of the existing chunk (we already did a stat on it, so that
might even be prefetched already by storage), and only if there is a size
mismatch and encryption is enabled.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
[ T: fold in "just to be sure" touch_chunk calls ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
instead of
Error: unable to open snapshot directory "/full/path/to/snapshot" for locking - ENOENT: No such file or directory
this will now print
Error: Snapshot vm/800/2023-01-16T12:28:11Z does not exist.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>