doc: Add VDO stacking document

2024-12-21 13:34:40 +03:00 · 2018-01-25 11:12:38 +01:00 · 2018-01-25 11:12:38 +01:00 · edb209776f
commit edb209776f
parent a1cfef9f26
1 changed files with 85 additions and 0 deletions
--- a/doc/vdo.md
+++ b/doc/vdo.md
@@ -0,0 +1,85 @@
 # VDO - Compression and deduplication.
 Currently device stacking looks like this:
    Physical x [multipath] x [partition] x [mdadm] x [LUKS] x [LVs] x [LUKS] x [FS|Database|...]
 Adding VDO:
    Physical x [multipath] x [partition] x [mdadm] x [LUKS] x [LVs] x [LUKS] x VDO x [LVs] x [FS|Database|...]
 ## Where VDO fits (and where it does not):
 ### Backing devices for VDO volumes:
 1. Physical x [multipath] x [partition] x [mdadm].
 2. LUKS over (1) - full disk encryption.
 3. LVs (raids|mirror|stripe|linear) x [cache] over (1).
 4. LUKS over (3) - especially when using raids.
 Usual limitations apply:
 - Never layer LUKS over another LUKS - it makes no sense.
 - LUKS works better over the raids than under them.
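
The two limitations above can be expressed as a small check. This is only a sketch: `check_backing_stack`, the layer names and the `RAID_LAYERS` set are hypothetical and not part of LVM or VDO. Layers are listed bottom-up, in the same order as the stacking diagrams.

```python
# Hypothetical helper, not part of LVM: checks the two LUKS rules listed above.
# Layers are listed bottom-up, e.g. ["physical", "mdadm", "luks", "vdo"].
RAID_LAYERS = {"mdadm", "lv-raid"}  # assumed names for RAID layers


def check_backing_stack(layers):
    """Return a list of human-readable problems found in the stack."""
    problems = []
    luks_positions = [i for i, name in enumerate(layers) if name == "luks"]

    # Rule 1: never layer LUKS over another LUKS.
    if len(luks_positions) > 1:
        problems.append("LUKS is layered over another LUKS")

    # Rule 2: LUKS should sit over RAID, so no RAID layer may appear above it.
    for pos in luks_positions:
        if any(name in RAID_LAYERS for name in layers[pos + 1:]):
            problems.append("LUKS sits under RAID; prefer LUKS over RAID")

    return problems


good = check_backing_stack(["physical", "mdadm", "luks", "vdo"])
bad = check_backing_stack(["physical", "luks", "lv-raid", "luks", "vdo"])
print(good)
print(bad)
```

The first stack follows both rules; the second one breaks both of them.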
 ### Using VDO as a PV:
 1. under tpool
    - The best fit - it will deduplicate additional redundancies among all
      snapshots and will reduce the footprint.
    - Risk: resizing. dmeventd is not able to handle resizing of the tpool at the moment.
 2. under corig
    - Cache fits better under a VDO device - VDO will reduce the amount of data
      and deduplicate it, so there should be more hits.
    - This is useful to keep the most frequently used data in cache
      uncompressed (if that happens to be a bottleneck).
 3. under (multiple) linear LVs - e.g. used for VMs.
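
To illustrate why a VDO PV can reduce the footprint of a pool or of several linear LVs, here is a toy model of block-level deduplication. It assumes a 4 KiB block size and uses SHA-256 only as a stand-in for a block fingerprint; `unique_blocks` is a hypothetical helper and says nothing about how VDO is actually implemented. The three "volumes" model data that was written independently (for example by similar VMs), so the layers above cannot share it on their own.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this toy model


def unique_blocks(volumes):
    """Count total and distinct blocks across several volume images."""
    seen = set()
    total = 0
    for data in volumes:
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            seen.add(hashlib.sha256(block).digest())
            total += 1
    return total, len(seen)


# Three volumes with mostly identical content: 8 distinct blocks in the first,
# and one block changed in each of the other two.
vol_a = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
vol_b = bytes([100]) * BLOCK_SIZE + vol_a[BLOCK_SIZE:]
vol_c = vol_a[:-BLOCK_SIZE] + bytes([101]) * BLOCK_SIZE

total, distinct = unique_blocks([vol_a, vol_b, vol_c])
print(total, distinct)
```

In this model only the distinct blocks would need physical space below the deduplicating layer.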
 ### And where VDO does not fit:
 - *never* use VDO under LUKS volumes
    - LUKS data looks random and neither compresses nor deduplicates well,
 - *never* use VDO under cmeta and tmeta LVs
    - these are random data and neither compress nor deduplicate well,
 - under raids
    - raid{4,5,6} scrambles data, so it does not deduplicate well,
    - raid{1,4,5,6,10} also increases the amount of data, so more work (duplicated
      work in case of raid{1,10}) has to be done in order to find fewer duplicates.
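
The point about random-looking data can be checked with a short experiment. This is a sketch under stated assumptions: `zlib` is only a stand-in for whatever compressor sits in the stack, and the pseudo-random bytes are a stand-in for ciphertext, not real LUKS output.

```python
import random
import zlib

SIZE = 64 * 1024

# Stand-in for ciphertext: deterministic pseudo-random bytes (an assumption;
# real encrypted data is produced differently but is similarly incompressible).
rng = random.Random(0)
random_like = bytes(rng.getrandbits(8) for _ in range(SIZE))

# Stand-in for plaintext with a lot of repetition.
repetitive = b"the same log line repeated over and over\n" * (SIZE // 41)

ratio_random = len(zlib.compress(random_like)) / len(random_like)
ratio_text = len(zlib.compress(repetitive)) / len(repetitive)
print(f"random-like: {ratio_random:.2f}, repetitive: {ratio_text:.2f}")
```

A ratio close to 1.0 means the compressor gained nothing; the same data would also not deduplicate, since equal plaintext blocks generally do not produce equal ciphertext blocks.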
 ### And where it could be useful:
 - under snapshot CoW devices - when there are several of them, VDO could deduplicate across them
 ### Things to decide
 - under integrity devices - it should work - mostly for data
    - hashes are unique and not compressible, so it makes sense to have separate imeta and idata volumes for integrity devices
 ### Future Integration of VDO into LVM:
 One issue is using both LUKS and RAID under VDO. We have two options:
 - use mdadm x LUKS x VDO+LV
 - use LV RAID x LUKS x VDO+LV - still requiring recursive LVs.
 Another issue is the duality of VDO - it is a top level LV but it can be seen as a "pool" for multiple devices.
 - This is one use case which cannot be handled by LVM at the moment.
 - A VDO volume has both a physical size and a virtual size - just like tpool.
      - same problems with virtual vs physical size - it can get full without exposing that to a FS
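
The virtual vs physical size problem can be sketched with a toy model. `ToyPool` and its fields are hypothetical and not related to any LVM or VDO interface; the model only shows how physical space can run out while the user of the device still sees plenty of free virtual space.

```python
class ToyPool:
    """Toy model of a device with separate virtual and physical sizes."""

    def __init__(self, virtual_blocks, physical_blocks):
        self.virtual_blocks = virtual_blocks
        self.physical_blocks = physical_blocks
        self.mapping = {}  # virtual block number -> content
        self.refs = {}     # content -> reference count (deduplicated store)

    def write(self, lba, content):
        if not 0 <= lba < self.virtual_blocks:
            raise IndexError("write beyond virtual size")
        old = self.mapping.get(lba)
        if old == content:
            return
        frees_old = old is not None and self.refs[old] == 1
        needs_new = content not in self.refs
        if len(self.refs) + needs_new - frees_old > self.physical_blocks:
            # The filesystem above still sees free virtual blocks at this point.
            raise OSError("physical space exhausted")
        if old is not None:
            self.refs[old] -= 1
            if self.refs[old] == 0:
                del self.refs[old]
        self.refs[content] = self.refs.get(content, 0) + 1
        self.mapping[lba] = content


pool = ToyPool(virtual_blocks=100, physical_blocks=4)
for lba in range(10):
    pool.write(lba, b"same")            # deduplicates to one physical block
for i in range(3):
    pool.write(10 + i, b"unique-%d" % i)

try:
    pool.write(20, b"one more")
    full = False
except OSError:
    full = True
print(len(pool.mapping), len(pool.refs), full)
```

Only 13 of the 100 virtual blocks are in use when the write fails, which is the situation a filesystem on top cannot see by itself.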
 Another possible RFE is to split data and metadata:
 - e.g. keep data on HDD and metadata on SSD
 ## Issues / Testing
 - fstrim/discard pass down - does it work with VDO?
 - VDO can run in synchronous vs. asynchronous mode
    - synchronous for devices where a write is safe once it is confirmed. Some devices lie about this.
    - asynchronous for devices requiring flush
 - multiple devices under VDO - need to find common options
 - pvmove - changing characteristics of underlying device
 - autoactivation during boot
    - Q: can we use VDO for RootFS?