213 Commits

Author SHA1 Message Date
Thomas Lamprecht
a3e87f5a03 client: progress log: small opinionated code clean-up
It was fine as is, but IMO saving a few lines is nice, albeit it makes
the atomic fetch expressions look slightly complexer by wrapping them
directly with the HumanByte and TimeSpan from-constructors.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-10-17 16:51:41 +02:00
Christian Ebner
1d746a2c02 partial fix #5560: client: periodically show backup progress
Spawn a new tokio task which about every minute displays the
cumulative progress of the backup for pxar, ppxar or img archive
streams. Catalog and metadata archive streams are excluded from the
output for better readability, and because the catalog upload lives
for the whole upload time, leading to possible temporal
misalignments in the output. The actual payload data is written via
the other streams anyway.

Add accounting for uploaded chunks, to distinguish from chunks queued
for upload, but not actually uploaded yet.

Example output in the backup task log:
```
...
INFO:  processed 2.471 GiB in 1m, uploaded 2.439 GiB
INFO:  processed 4.963 GiB in 2m, uploaded 4.929 GiB
INFO:  processed 7.349 GiB in 3m, uploaded 7.284 GiB
...
```

This partially fixes issue 5560:
https://bugzilla.proxmox.com/show_bug.cgi?id=5560

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-10-17 16:32:32 +02:00
Wolfgang Bumiller
e9152ee951 tests: replace static mut with a mutex
rustc warns about creating references to them (although it does allow
using `.as_ref()` on them for some reason), and this will become a
hard error with edition 2024.

Previously we could not use Mutex there as its ::new() was not a
`const fn` , but not we can, so let's drop the `mut`.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2024-08-29 11:40:49 +02:00
Maximiliano Sandoval
2443b3f8d0 client: remove unused deps
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
2024-08-14 12:13:50 +02:00
Maximiliano Sandoval
03412aaa5b client: remove unused lazy_static dependency
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
2024-08-14 12:08:01 +02:00
Maximiliano Sandoval
1198253b20 fix typos in rust documentation blocks
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
2024-08-07 16:49:31 +02:00
Maximiliano Sandoval
fcccc3dfa5 fix #3699: client: prefer xdg cache directory for tmp files
Adds a helper to create temporal files in XDG_CACHE_HOME. If we cannot
create a file there, we fallback to /tmp as before.

Note that the temporary files stored by the client might grow
arbitrarily in size, making XDG_RUNTIME_DIR a less desirable option.
Citing the Arch wiki [1]:

> Should not store large files as it may be mounted as a tmpfs.

While the cache directory is most often not backed up by an ephemeral
FS, using the `O_TMPFILE` flag avoids the need for potential cleanup,
e.g. on interruption of a command. As with this flag set the data will
be discarded when the last file descriptor is closed.

[1] https://wiki.archlinux.org/title/XDG_Base_Directory

Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
 [ TL: mention TMPFILE flag for clarity ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-07-17 13:27:09 +02:00
Hannes Laimer
c3e6770104 http_client: keep renewal future running on failed re-auth
The re-authentication request can also fail due to network instability,
and not necesarrily only due to an invalid ticket. In that case it makes
sense to retry refreshing the ticket in 15 minutes. Also, the future does
not depend on a failed re-authentication to be clean up properly, so that
happens already somewhere else, therefore we don't rely on this return
anyway. If the ticket is actually invalid or timed out, the main job
will fail and also terminate the renewal future, same applies if the
network is not just unstable but straight up not working.

Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2024-07-02 11:13:42 +02:00
Christian Ebner
de00745354 fix 5304: client: set process uid/gid for .pxarexclude-cli
The .pxarexclude-cli encodes the exclude patterns the client was
invoked with in the pxar archive as regular file entry. The current
behaviour of setting the uid and gid to default 0 (root) causes
however issues when trying to backup and restore the backup as
non-root user.

Opt for using the uid/gid of the user the executable was called as,
allowing the restore for this user to succeed. Root will succeed
to restore anyways.

Link to issue in bugtracker:
https://bugzilla.proxmox.com/show_bug.cgi?id=5304

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Tested-by: Gabriel Goller <g.goller@proxmox.com>
2024-07-02 10:51:21 +02:00
Maximiliano Sandoval
f619f3e0e7 tools: write multiplication by 01 succinctly
Fixes the clippy warning:

warning: this multiplication by -1 can be written more succinctly
   --> pbs-client/src/tools/mod.rs:700:58
    |
700 |                         SignedDuration::Negative(val) => -1 * i64::try_from(val.as_secs())?,
    |                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider using: `-i64::try_from(val.as_secs())?`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#neg_multiply
    = note: `#[warn(clippy::neg_multiply)]` on by default

Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
2024-06-28 09:21:04 +02:00
Fabian Grünbichler
8aa244641d trivial clippy fixes
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2024-06-24 09:59:27 +02:00
Wolfgang Bumiller
d9e9ed845d bump bitflags to 2.4
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2024-06-20 13:38:34 +02:00
Wolfgang Bumiller
da72994faf use XATTR_* constants instead of calling functions
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2024-06-20 11:08:58 +02:00
Wolfgang Bumiller
6359e6d4d4 replace c_str! macro with c"literals"
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2024-06-20 11:07:11 +02:00
Wolfgang Bumiller
8e924a7bc0 client: add 'remove_repository_from_value' helper
'extract_repository_from_value' takes an immutable reference and
doesn't remove the parsed parameter (whereas in contrast in our PVE
codebase, the 'extract_param' method does remove it).

This adds a variant that explicitly removes it called
'remove_repository_from_value'.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2024-06-19 11:31:46 +02:00
Gabriel Goller
00c88a42a2 pxar: use anyhow::Error in PxarBackupStream
Instead of storing the error as a string in the PxarBackupStream, we
store it as an anyhow::Error. As we can't clone an anyhow::Error, we take
it out from the mutex and return it. This won't change anything as
the consumation of the stream will stop if it gets a Some(Err(..)).

Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
2024-06-19 10:30:13 +02:00
Gabriel Goller
230527a360 pxar: add UniqueContext helper
To create a pxar archive, we recursively traverse the target folder.
If there is an error further down and we add a context using anyhow,
the context will be duplicated and we get an output like:

> Error: error at "xattr/xattr.txt": error at "xattr/xattr.txt": E2BIG [skip]

This is obviously not optimal, so in recursive contexts we can use the
UniqueContext, which quickly checks the context from the last item in
the error chain and only adds it if it is unique.

Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
2024-06-19 10:30:12 +02:00
Gabriel Goller
095ddcad48 pxar: remove ArchiveError
The sole purpose of the ArchiveError was to add the file-path to the
error. Using anyhow::Error we can add this information using the context
and don't need this struct anymore.

Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
2024-06-19 10:30:10 +02:00
Christian Ebner
dbfba9db89 client: pxar: fix fuse mount performance for split archives
Adapt to the decoder/accessor method changes introduced in the pxar
library, which were introduced in order to move the consistency check
for metadata and payload data archives.

The new location of the checks allows to access the pxar archive via
a `Split` variant reader instance, without penalization when just
accessing the metadata, not reading any payload data.

This greatly improves performance when accessing fuse mounted
archives.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>

bumped dependency after pxar version bump

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2024-06-17 10:17:42 +02:00
Fabian Grünbichler
481a9515b1 extract: don't interpret prelude as OsStr
that would drop the final byte, and the corresponding code has been removed
from pxar now as well.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2024-06-10 13:38:10 +02:00
Christian Ebner
afdc0f0e42 client: pxar: encode prelude based on writer variant
Currently, whether to encode the exlcude patterns passed via cli as
prelude or via the `.pxar-exclude-cli` is based on the presence of
a previous metadata accessor.
That leaves however to the encoding of the file entry instead of the
prelude for split archives in `data` mode and for the first snapshot
in a backup, creating undesired padding in the first payload chunk.

Therefore, use the pxar writer variant to make the decision instead.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-10 13:11:38 +02:00
Christian Ebner
903ab2e938 client: pxar: json encode cli exclude pattern in prelude
The current encoding is not extensible, so encode the cli exclude
patterns as json instead. By this, the prelude is easily seralized
and deserialized, while remaining human readable.

Originally-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-10 13:11:29 +02:00
Christian Ebner
04bb256abd client: backup spec: rename change detection mode default
The currently default variant is named `Default`, which is not future
prove since the default might change in the future. So rename it to
`Legacy` instead.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-10 10:30:05 +02:00
Christian Ebner
c0302805c4 client: backup: conditionally write catalog for file level backups
Only write the catalog when using the regular backup mode, do not write
it when using the split archive mode.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-07 13:56:24 +02:00
Christian Ebner
8d8142955f client: tools: add helper to lookup ArchiveEntrys via pxar
In preparation to lookup entries via the pxar metadata archive
instead of the catalog, in order to drop encoding the catalog
for snapshots using split pxar archives altogehter.

This helper allows to lookup the directory entries via the provided
accessor instance and formats them to be compatible with the output
as produced by lookups via the catalog.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-07 13:56:24 +02:00
Christian Ebner
6b1110badf client: pxar: fix minor formatting issue
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-07 12:09:21 +02:00
Christian Ebner
1259234488 client: pxar: conditionally skip metadata reference test
The test will fail for all users not having euid/egid set to
1000/1000, as the reference test folder structure cannot be created
with the expected ownership.
Therefore, skip over the test if either euid or egid do not match
this condition.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-06 10:46:05 +02:00
Christian Ebner
bab0645cc6 client: pxar: do not attempt to set uid/gid in test
Setting the uid/gid for the files and folders of the test directory
structure will not work when lacking the permissions.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-06 10:46:05 +02:00
Christian Ebner
cf75bc0db5 client: pxar: set cache limit based on nofile rlimit
The lookahead cache size requires the resource limit for open file
handles to be high in order to allow for efficient reuse of unchanged
file payloads.

Increase the nofile soft limit to the hard limit and dynamically adapt
the cache size to the new soft limit minus the half of the previous
soft limit.

The `PxarCreateOptions` and the `Archiver` are therefore extended by
an additional field to store the maximum cache size, with fallback to
a default size of 512 entries.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:42 +02:00
Christian Ebner
7b352cc0cf client: tools: add helper to raise nofile rlimit
The default soft limit for open file handles is rather low, as some
apis (e.g. the POSIX `select(2)` syscall) do not work [0].

The lookahead cache use during the backup clients metadata comparison
to reuse unchanged files however requires much higher limits to work
effectively.

This helper function allows to raise the soft limit to the hard
limit, as provided by the `getrlimit(2)` syscall.

[0] https://0pointer.net/blog/file-descriptor-limits.html

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:42 +02:00
Christian Ebner
992487929a client: pxar: add archive creation with reference test
Add a basic regression test for archive creation with reference
metadata archive and index.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:42 +02:00
Christian Ebner
5a5d454083 client: chunk stream: switch payload stream chunker
Use the dedicated chunker with boundary suggestions for the payload
stream, by attaching the channel sender to the archiver and the
channel receiver to the payload stream chunker.

The archiver sends the file boundaries for the chunker to consume.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:42 +02:00
Christian Ebner
589f510e7d chunk stream: tests: add regression tests for payload chunker
Regression tests to cover suggested and forced boundaries as well as
chunk injection.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:42 +02:00
Christian Ebner
e321815635 datastore: chunker: add Chunker trait
Add the Chunker trait and move the current Chunker to ChunkerImpl to
implement the trait instead. This allows to use different chunker
implementations by dynamic dispatch and is in preparation for
implementing a dedicated payload chunker.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:42 +02:00
Christian Ebner
ee478ef1dc client: pxar: allow to restore prelude to optional path
Pxar archives allow to store additional information in a prelude
entry since pxar format version 2.

Add an optional parameter to `pxar` and `proxmox-backup-client` to
specify the path to restore the prelude to and pass this to the
archive extraction by extending the `PxarExtractOptions` by a
corresponding field. If none is given, the prelude is simply skipped
during restore.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
126fe1365d client: pxar: opt encode cli exclude patterns as Prelude
Instead of encoding the pxar cli exclude patterns as regular file
within the root directory of an archive, store this information
directly after the pxar format version entry in the entry of kind
Prelude.

This behavior is however currently exclusive to the archives written
with format version 2 in a split metadata and payload case.

This is a breaking change for the encoding of new cli exclude
parameters. Any new exclude parameter will not be added to an already
present .pxar-cliexclude file, and it will not be created if not
present.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
0cbfaf64b5 client: pxar: add helper to handle optional preludes
Pxar archives with format version 2 allows to store optional
information file format version and prelude entries.

Cover the case for these entries, the file format version entry being
introduced to distinguish between different file formats used for
encoding as well as the prelude entry used to store optional metadata
such as the pxar cli exlude parameters.

Add the logic to accept and decode these prelude entries when
accessing the archive via a decoder instance.

For now simply ignore them.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
7401be7e96 client: backup writer: make backup info output more concise
With the additional output in case of split pxar archives, the upload
statistics logged by the backup writer following a backup are crowded
and hard to read.

Make the output more concise by merging the currenlty 2 lines per
upload stream, shown as e.g.:

```
data.ppxar: had to backup 4 MiB of 10.943 GiB (compressed 159 B) in 49.30s
data.ppxar: average backup speed: 83.09 KiB/s
```

into a single line, shown as e.g.:

```
data.ppxar: had to back up 4 MiB of 10.943 GiB (159 B compressed) in 49.30 s (average 83.09 KiB/s)
```

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
5b91d85150 pxar: create: show chunk injection stats info output
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
64eddfffbe pxar: create: keep track of reused chunks and files
Track and log reused or reencoded files as well as the reused chunks
and their paddings.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
51bb7bc6b0 client: backup writer: add injected chunk count to stats
Track the number of injected chunks and show them in the debug output

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
7dcbe69a87 fix #3174: client: pxar: enable caching and meta comparison
When walking the file system tree, check for each entry if it is
reusable, meaning that the metadata did not change and the payload
chunks can be reindexed instead of reencoding the whole data.

If the metadata matched, the range of the dynamic index entries for
that file are looked up in the previous payload data index.
Use the range and possible padding introduced by partial reuse of
chunks to decide whether to reuse the dynamic entries and encode
the file payloads as payload reference right away or cache the entry
for now and keep looking ahead.

If however a non-reusable (because changed) entry is encountered
before the padding threshold is reached, the entries on the cache are
flushed to the archive by reencoding them, resetting the cached state.

Reusable chunk digests and size as well as reference offsets to the
start of regular files payloads within the payload stream are injected
into the backup stream by sending them to the chunker via a dedicated
channel, forcing a chunk boundary and inserting the chunks.

If the threshold value for reuse is reached, the chunks are injected
in the payload stream and the references with the corresponding
offsets encoded in the metadata stream.

Since multiple files might be contained within a single chunk, it is
assured that the deduplication of chunks is performed, by keeping back
the last chunk, so following files might as well reuse that same
chunk without double indexing it.  It is assured that this chunk is
injected in the stream also in case that the following lookups lead to
a cache clear and reencoding.

Directory boundaries are cached as well, and written as part of the
encoding when flushing.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
3149454511 client: pxar: refactor catalog encoding for directories
Move the catalog directory start and end encoding from `add_entry`
to the `add_directory`, the latter being called by the previous.

By this, the `add_entry` method can be reused to walk the filesystem
tree in the context of an enabled lookahead cache without encoding
anything.

No functional change intended.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
6f23976247 pxar: caching: add look-ahead cache
Add a lookahead cache and the neccessary types to store the required
data and keep track of directory boundaries while traversing the
filesystem tree, in order to postpone a decision if to reuse or
reencode a given regular file with unchanged metadata.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
d0f7d86c9e client: pxar: add method for metadata comparison
Add method to compare metadata of current file entry against metadata
of the entry looked up in the previous backup snapshot. If the
metadata matched, the start offset pointing to the files payload
header in the payload steam is returned.

This is in preparation for reusing payload chunks for unchanged files.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
fdea4e5327 client: implement prepare reference method
Implement a method that prepares the decoder instance to access a
previous snapshots metadata index and payload index in order to
pass it to the pxar archiver. The archiver than can utilize these
to compare the metadata for files to the previous state and gather
reusable chunks.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
7c00ec904d specs: add backup detection mode specification
Adds the specification for switching the detection mode used to
identify regular files which changed since a reference backup run.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
7de35dc243 client: streams: add channels for dynamic entry injection
To reuse dynamic entries of a previous backup run and index them for
the new snapshot. Adds a non-blocking channel between the pxar
archiver and the chunk stream, as well as the chunk stream and the
backup writer.

The archiver sends forced boundary positions and the dynamic
entries to inject into the chunk stream following this boundary.

The chunk stream consumes this channel inputs as receiver whenever a
new chunk is requested by the upload stream, forcing a non-regular
chunk boundary in the pxar stream at the requested positions.

The dynamic entries to inject and the boundary are then send via the
second asynchronous channel to the backup writer's upload stream,
indexing them by inserting the dynamic entries as known chunks into
the upload stream.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
717b9b4c88 client: chunk stream: add struct to hold injection state
Adds a dedicated structure to hold the optional sender and receiver
instances and state for injection of reused dynamic entries in the
payload stream for split stream pxar archives.

The asynchronous channels must only be attached to the payload
archive, leaving the current behavior for the metadata archive and
current default encoding without reusing payload chunks of previous
snapshots.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00
Christian Ebner
e8f3abb88f upload stream: implement reused chunk injector
In order to be included in the backups index file, reused payload
chunks have to be injected into the payload upload stream at a
forced boundary. The chunker forces a chunk boundary and sends the
list of reusable dynamic entries to be uploaded.

This implements the logic to receive these dynamic entries via the
corresponding communication channel from the chunker and inject the
entries into the backup upload stream by looking for the matching
chunk boundary, already forced by the chunker.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
2024-06-05 16:39:41 +02:00