proxmox-backup

src	fix #3174: client: pxar: enable caching and meta comparison	2024-06-05 16:39:41 +02:00
Cargo.toml	api-types: client: datastore: tools: use proxmox-human-bytes crate	2023-06-26 13:56:45 +02:00

Christian Ebner 7dcbe69a87 fix #3174: client: pxar: enable caching and meta comparison

When walking the file system tree, check for each entry if it is
reusable, meaning that the metadata did not change and the payload
chunks can be reindexed instead of reencoding the whole data.
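
The reusability check can be pictured with a minimal sketch. The
`EntryMeta` struct and `is_reusable` function below are hypothetical
names chosen for illustration, and the field set is an assumption; the
metadata compared by the actual client may include more (for example
extended attributes and ACLs).

```rust
/// Hypothetical subset of per-entry metadata (assumption: the real
/// pxar metadata carries more fields than shown here).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EntryMeta {
    pub mode: u32,
    pub uid: u32,
    pub gid: u32,
    pub mtime_secs: i64,
    pub mtime_nanos: u32,
    pub size: u64,
}

/// An entry is a reuse candidate only if it was present in the
/// previous snapshot and none of the compared fields changed.
pub fn is_reusable(previous: Option<&EntryMeta>, current: &EntryMeta) -> bool {
    previous.map_or(false, |prev| prev == current)
}
```

A new file (`previous` is `None`) or any differing field makes the
entry non-reusable, so its payload would have to be reencoded.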

If the metadata matches, the range of dynamic index entries for that
file is looked up in the previous payload data index. The range, and
any padding introduced by partial reuse of chunks, are then used to
decide whether to reuse the dynamic entries and encode the file
payload as a payload reference right away, or to cache the entry for
now and keep looking ahead.
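
As a rough illustration of the range lookup, the sketch below assumes
an index that stores one exclusive end offset per chunk, sorted in
ascending order. `IndexEntry` and `lookup_chunks` are hypothetical
names, not the client's API. The function returns the covering chunk
range and the padding, i.e. the bytes in those chunks that do not
belong to the file.

```rust
use std::ops::Range;

/// Hypothetical index entry: chunk digest plus the exclusive end
/// offset of that chunk within the previous payload stream.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IndexEntry {
    pub end: u64,
    pub digest: [u8; 32],
}

/// Find the chunks covering `payload` and the resulting padding.
/// Returns `None` for an empty range or one not covered by the index.
pub fn lookup_chunks(
    index: &[IndexEntry],
    payload: Range<u64>,
) -> Option<(Range<usize>, u64)> {
    if payload.is_empty() {
        return None;
    }
    // First chunk that ends after the payload start.
    let first = index.partition_point(|e| e.end <= payload.start);
    // First chunk that ends at or after the payload end.
    let last = index.partition_point(|e| e.end < payload.end);
    if last >= index.len() {
        return None; // payload extends past the indexed data
    }
    let chunk_start = if first == 0 { 0 } else { index[first - 1].end };
    let chunk_end = index[last].end;
    let padding = (chunk_end - chunk_start) - (payload.end - payload.start);
    Some((first..last + 1, padding))
}
```

With chunk ends at 100, 250 and 400, a payload at `120..300` is
covered by chunks `1..3` (bytes 100 to 400), which gives 120 bytes of
padding for 180 bytes of file payload.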

If, however, a non-reusable (i.e. changed) entry is encountered
before the padding threshold is reached, the cached entries are
flushed to the archive by reencoding them, and the cached state is
reset.
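
The look-ahead behaviour can be sketched as a small state machine.
Everything here is an assumption made for illustration: the names
`LookaheadCache` and `Action`, and the policy that the threshold counts
as reached once the ratio of padding to chunk bytes drops to or below
a configured maximum. The commit message does not spell out the exact
rule.

```rust
/// What to do with the cached entries (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Encode the entries as payload references to existing chunks.
    Reference(Vec<String>),
    /// Reencode the entries from the file system.
    Reencode(Vec<String>),
}

/// Hypothetical look-ahead cache of reusable entries.
pub struct LookaheadCache {
    entries: Vec<String>,
    payload_bytes: u64,
    chunk_bytes: u64,
    max_padding_ratio: f64,
}

impl LookaheadCache {
    pub fn new(max_padding_ratio: f64) -> Self {
        Self { entries: Vec::new(), payload_bytes: 0, chunk_bytes: 0, max_padding_ratio }
    }

    /// Cache a reusable entry. `new_chunk_bytes` counts only chunk
    /// bytes not already accounted for by earlier cached entries.
    pub fn push_reusable(
        &mut self,
        name: &str,
        payload_bytes: u64,
        new_chunk_bytes: u64,
    ) -> Option<Action> {
        self.entries.push(name.to_string());
        self.payload_bytes += payload_bytes;
        self.chunk_bytes += new_chunk_bytes;
        if self.chunk_bytes == 0 {
            return None;
        }
        let padding = self.chunk_bytes.saturating_sub(self.payload_bytes);
        let ratio = padding as f64 / self.chunk_bytes as f64;
        if ratio <= self.max_padding_ratio {
            Some(Action::Reference(self.reset()))
        } else {
            None // too much padding so far, keep looking ahead
        }
    }

    /// A changed entry was encountered: flush the cache by reencoding.
    pub fn flush_changed(&mut self) -> Option<Action> {
        if self.entries.is_empty() {
            return None;
        }
        Some(Action::Reencode(self.reset()))
    }

    fn reset(&mut self) -> Vec<String> {
        self.payload_bytes = 0;
        self.chunk_bytes = 0;
        std::mem::take(&mut self.entries)
    }
}
```

In this model, a small file inside a large chunk stays cached until
its neighbours fill up enough of that chunk; if a changed file shows
up first, `flush_changed` hands the cached entries back for
reencoding.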

Reusable chunk digests and sizes, as well as reference offsets to the
start of regular file payloads within the payload stream, are injected
into the backup stream by sending them to the chunker via a dedicated
channel, forcing a chunk boundary and inserting the chunks.
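
A simplified model of the injection path, using a standard library
channel. This is a sketch under stated assumptions: the chunker here
cuts fixed-size chunks rather than content-defined ones, it only counts
freshly encoded bytes, and it polls the channel with `try_recv`, so an
injection must already be queued when the chunker looks for it. All
type and function names are hypothetical.

```rust
use std::sync::mpsc::Receiver;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ReusedChunk {
    pub digest: [u8; 32],
    pub size: u64,
}

/// Request to cut at `boundary` (offset in freshly encoded bytes)
/// and insert already known chunks there.
pub struct Injection {
    pub boundary: u64,
    pub chunks: Vec<ReusedChunk>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Emitted {
    New { size: u64 },
    Reused(ReusedChunk),
}

/// Chunk `total_len` bytes into `chunk_size` pieces, forcing a chunk
/// boundary wherever an injection is pending.
pub fn chunk_with_injections(
    total_len: u64,
    chunk_size: u64,
    injections: &Receiver<Injection>,
) -> Vec<Emitted> {
    assert!(chunk_size > 0);
    let mut out = Vec::new();
    let mut pos = 0u64;
    let mut pending = injections.try_recv().ok();
    while pos < total_len || pending.is_some() {
        let next_cut = (pos + chunk_size).min(total_len);
        match pending.take() {
            // Inject if the boundary falls inside the current chunk,
            // or if the stream ended and injections are still queued.
            Some(inj) if inj.boundary <= next_cut || pos == total_len => {
                let boundary = inj.boundary.clamp(pos, next_cut);
                if boundary > pos {
                    out.push(Emitted::New { size: boundary - pos });
                }
                pos = boundary;
                out.extend(inj.chunks.into_iter().map(Emitted::Reused));
                pending = injections.try_recv().ok();
            }
            other => {
                pending = other;
                out.push(Emitted::New { size: next_cut - pos });
                pos = next_cut;
            }
        }
    }
    out
}
```

The point of the forced boundary is that reused chunks are inserted
between complete new chunks and never split one.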

If the threshold value for reuse is reached, the chunks are injected
into the payload stream and the references, with their corresponding
offsets, are encoded in the metadata stream.

Since multiple files might be contained within a single chunk,
deduplication of chunks is ensured by holding back the last chunk, so
that following files can reuse that same chunk without indexing it
twice. This chunk is guaranteed to be injected into the stream even
if the following lookups lead to a cache clear and reencoding.
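
The hold-back idea can be sketched as follows. `ReuseQueue` and its
methods are hypothetical names, and comparing only against the single
held-back chunk is an assumption based on the description above, not a
statement about the actual implementation.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Chunk {
    pub digest: [u8; 32],
    pub size: u64,
}

/// Hypothetical queue of reused chunks that keeps the most recent
/// chunk back so the next file can share it.
#[derive(Default)]
pub struct ReuseQueue {
    ready: Vec<Chunk>,
    held_back: Option<Chunk>,
}

impl ReuseQueue {
    /// Queue the chunks covering one file, skipping a leading chunk
    /// that is already held back from the previous file.
    pub fn push_file_chunks(&mut self, chunks: &[Chunk]) {
        for chunk in chunks {
            let duplicate = self
                .held_back
                .as_ref()
                .map_or(false, |last| last.digest == chunk.digest);
            if duplicate {
                continue; // shared with the previous file, index once
            }
            if let Some(prev) = self.held_back.replace(chunk.clone()) {
                self.ready.push(prev);
            }
        }
    }

    /// Chunks that can be injected now; the last one stays held back.
    pub fn take_ready(&mut self) -> Vec<Chunk> {
        std::mem::take(&mut self.ready)
    }

    /// Emit everything including the held-back chunk, as needed when
    /// the cache is cleared and the following entries are reencoded.
    pub fn flush(&mut self) -> Vec<Chunk> {
        let mut out = std::mem::take(&mut self.ready);
        out.extend(self.held_back.take());
        out
    }
}
```

If file A spans chunks 1 and 2 and file B spans chunks 2 and 3, the
queue yields chunks 1, 2, 3 once each, and `flush` makes sure chunk 3
is not lost when reuse stops.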

Directory boundaries are cached as well, and written as part of the
encoding when flushing.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>

2024-06-05 16:39:41 +02:00
