diff --git a/Makefile b/Makefile
index 85f0eff..4905eb5 100644
--- a/Makefile
+++ b/Makefile
@@ -103,6 +103,7 @@ SYSADMIN_SOURCES=	\
 	pve-package-repos.adoc		\
 	pve-installation.adoc		\
 	system-software-updates.adoc	\
+	local-zfs.adoc			\
 	sysadmin.adoc
 
 API_VIEWER_SOURCES=	\
@@ -170,6 +171,8 @@ all: index.html
 %-nwdiag.svg: %.nwdiag
 	nwdiag -T svg $*.nwdiag -o $@;
 
+chapter-sysadmin.html chapter-sysadmin-plain.html: ${SYSADMIN_SOURCES}
+
 chapter-%.html: %.adoc ${PVE_COMMON_DOC_SOURCES}
 	asciidoc ${ADOC_STDARG} -a toc -o $@ $*.adoc
diff --git a/local-zfs.adoc b/local-zfs.adoc
new file mode 100644
index 0000000..563e0c5
--- /dev/null
+++ b/local-zfs.adoc
@@ -0,0 +1,325 @@
+ZFS on Linux
+------------
+include::attributes.txt[]
+
+ZFS is a combined file system and logical volume manager designed by
+Sun Microsystems. Starting with {pve} 3.4, the native Linux kernel port
+of the ZFS file system is available as an optional file system and as
+an additional choice for the root file system. There is no need to
+compile ZFS modules manually - all packages are included.
+
+With ZFS, it is possible to get enterprise-grade features on low-budget
+hardware, as well as high-performance systems by leveraging SSD caching
+or even SSD-only setups. ZFS can replace expensive hardware RAID cards
+at the cost of moderate CPU and memory load, combined with easy
+management.
+
+.General ZFS advantages
+
+* Easy configuration and management with {pve} GUI and CLI.
+
+* Reliable
+
+* Protection against data corruption
+
+* Data compression on file system level
+
+* Snapshots
+
+* Copy-on-write clone
+
+* Various raid levels: RAID0, RAID1, RAID10, RAIDZ-1, RAIDZ-2 and RAIDZ-3
+
+* Can use SSD for cache
+
+* Self healing
+
+* Continuous integrity checking
+
+* Designed for high storage capacities
+
+* Asynchronous replication over network
+
+* Open Source
+
+* Encryption
+
+* ...
+
+
+Hardware
+~~~~~~~~
+
+ZFS depends heavily on memory, so you need at least 8GB to start. In
+practice, use as much as you can get for your hardware/budget. To
+prevent data corruption, we recommend the use of high quality ECC RAM.
+
+If you use a dedicated cache and/or log disk, you should use an
+enterprise class SSD (e.g. Intel SSD DC S3700 Series). This can
+increase the overall performance significantly.
+
+IMPORTANT: Do not use ZFS on top of a hardware RAID controller which
+has its own cache management. ZFS needs to communicate directly with
+the disks. An HBA adapter, or something like an LSI controller flashed
+in 'IT' mode, is the way to go.
+
+If you are experimenting with an installation of {pve} inside a VM
+(nested virtualization), don't use 'virtio' block devices for the disks
+of that VM, since they are not supported by ZFS. Use IDE or SCSI
+instead (this also works with the 'virtio' SCSI controller type).
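+
+For example, on the host that runs the outer {pve} instance, such a
+test VM could be given SCSI disks on a 'virtio' SCSI controller using
+'qm'. This is only a sketch - the VM ID '9000', the storage name
+'local' and the disk sizes are assumptions and need to be adapted to
+your environment:
+
+----
+# qm set 9000 --scsihw virtio-scsi-pci
+# qm set 9000 --scsi0 local:32
+# qm set 9000 --scsi1 local:32
+----
+
+The first command selects the 'virtio-scsi-pci' controller type, the
+other two allocate two 32GB volumes on the storage 'local' and attach
+them as SCSI disks, which the installer inside the test VM can then
+use for ZFS.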
+
+
+Installation as root file system
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When you install using the {pve} installer, you can choose ZFS for the
+root file system. You need to select the RAID type at installation
+time:
+
+[horizontal]
+RAID0:: Also called 'striping'. The capacity of such a volume is the
+sum of the capacities of all disks. But RAID0 does not add any
+redundancy, so the failure of a single drive makes the volume unusable.
+
+RAID1:: Also called 'mirroring'. Data is written identically to all
+disks. This mode requires at least 2 disks of the same size. The
+resulting capacity is that of a single disk.
+
+RAID10:: A combination of RAID0 and RAID1. Requires at least 4 disks.
+
+RAIDZ-1:: A variation on RAID-5, single parity. Requires at least 3 disks.
+
+RAIDZ-2:: A variation on RAID-5, double parity. Requires at least 4 disks.
+
+RAIDZ-3:: A variation on RAID-5, triple parity. Requires at least 5 disks.
+
+The installer automatically partitions the disks, creates a ZFS pool
+called 'rpool', and installs the root file system on the ZFS subvolume
+'rpool/ROOT/pve-1'.
+
+Another subvolume called 'rpool/data' is created to store VM
+images. In order to use that with the {pve} tools, the installer
+creates the following configuration entry in '/etc/pve/storage.cfg':
+
+----
+zfspool: local-zfs
+        pool rpool/data
+        sparse
+        content images,rootdir
+----
+
+After installation, you can view your ZFS pool status using the
+'zpool' command:
+
+----
+# zpool status
+  pool: rpool
+ state: ONLINE
+  scan: none requested
+config:
+
+        NAME        STATE     READ WRITE CKSUM
+        rpool       ONLINE       0     0     0
+          mirror-0  ONLINE       0     0     0
+            sda2    ONLINE       0     0     0
+            sdb2    ONLINE       0     0     0
+          mirror-1  ONLINE       0     0     0
+            sdc     ONLINE       0     0     0
+            sdd     ONLINE       0     0     0
+
+errors: No known data errors
+----
+
+The 'zfs' command is used to configure and manage your ZFS file
+systems. The following command lists all file systems after
+installation:
+
+----
+# zfs list
+NAME               USED  AVAIL  REFER  MOUNTPOINT
+rpool             4.94G  7.68T    96K  /rpool
+rpool/ROOT         702M  7.68T    96K  /rpool/ROOT
+rpool/ROOT/pve-1   702M  7.68T   702M  /
+rpool/data          96K  7.68T    96K  /rpool/data
+rpool/swap        4.25G  7.69T    64K  -
+----
+
+
+Bootloader
+~~~~~~~~~~
+
+The default ZFS disk partitioning scheme does not use the first 2048
+sectors. This gives enough room to install a GRUB boot partition. The
+{pve} installer automatically allocates that space, and installs the
+GRUB boot loader there. If you use a redundant RAID setup, it installs
+the boot loader on all disks required for booting. So you can boot
+even if some disks fail.
+
+NOTE: It is not possible to use ZFS as the root partition with UEFI
+boot.
+
+
+ZFS Administration
+~~~~~~~~~~~~~~~~~~
+
+This section gives you some usage examples for common tasks. ZFS
+itself is really powerful and provides many options. The main commands
+to manage ZFS are 'zfs' and 'zpool'. Both commands come with great
+manual pages, which are worth reading:
+
+----
+# man zpool
+# man zfs
+----
+
+.Create a new ZPool
+
+To create a new pool, at least one disk is needed. The 'ashift' value
+should be chosen so that 2 to the power of 'ashift' is at least the
+sector size of the underlying disks (for example, ashift=12 corresponds
+to 4096 byte sectors). A complete example with concrete device names
+follows at the end of this section.
+
+ zpool create -f -o ashift=12 <pool> <device>
+
+To activate compression:
+
+ zfs set compression=lz4 <pool>
+
+.Create a new pool with RAID-0
+
+Minimum 1 Disk
+
+ zpool create -f -o ashift=12 <pool> <device1> <device2>
+
+.Create a new pool with RAID-1
+
+Minimum 2 Disks
+
+ zpool create -f -o ashift=12 <pool> mirror <device1> <device2>
+
+.Create a new pool with RAID-10
+
+Minimum 4 Disks
+
+ zpool create -f -o ashift=12 <pool> mirror <device1> <device2> mirror <device3> <device4>
+
+.Create a new pool with RAIDZ-1
+
+Minimum 3 Disks
+
+ zpool create -f -o ashift=12 <pool> raidz1 <device1> <device2> <device3>
+
+.Create a new pool with RAIDZ-2
+
+Minimum 4 Disks
+
+ zpool create -f -o ashift=12 <pool> raidz2 <device1> <device2> <device3> <device4>
+
+.Create a new pool with Cache (L2ARC)
+
+It is possible to use a dedicated cache drive partition to increase
+the performance (use SSD).
+
+For '<device>' it is possible to use more devices, as shown in
+"Create a new pool with RAID*".
+
+ zpool create -f -o ashift=12 <pool> <device> cache <cache_device>
+
+.Create a new pool with Log (ZIL)
+
+It is possible to use a dedicated log drive partition to increase
+the performance (use SSD).
+
+For '<device>' it is possible to use more devices, as shown in
+"Create a new pool with RAID*".
+
+ zpool create -f -o ashift=12 <pool> <device> log <log_device>
+
+.Add Cache and Log to an existing pool
+
+If you have a pool without cache and log, first partition the SSD into
+2 partitions with parted or gdisk.
+
+IMPORTANT: Always use GPT partition tables (gdisk or parted).
+
+The maximum size of a log device should be about half the size of
+physical memory, so it is usually quite small. The rest of the SSD
+can be used as cache.
+
+ zpool add -f <pool> log <device-part1> cache <device-part2>
+
+.Changing a failed Device
+
+ zpool replace -f <pool> <old device> <new-device>
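+
+To put the commands above together, here is a minimal example. The
+pool name 'tank' and the disks '/dev/sdb' and '/dev/sdc' are only
+placeholders and need to be adapted to your setup:
+
+----
+# zpool create -f -o ashift=12 tank mirror /dev/sdb /dev/sdc
+# zfs set compression=lz4 tank
+# zpool status tank
+# zfs get compression,compressratio tank
+----
+
+The first command creates a mirrored (RAID-1) pool named 'tank' on the
+two disks, the second enables lz4 compression on it, and the last two
+commands verify the pool layout and the compression settings.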
+
+
+Activate E-Mail Notification
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ZFS comes with an event daemon, which monitors events generated by the
+ZFS kernel module. The daemon can also send e-mails on ZFS events like
+pool errors.
+
+To activate the daemon, it is necessary to edit '/etc/zfs/zed.d/zed.rc'
+with your favorite editor, and uncomment the 'ZED_EMAIL_ADDR' setting:
+
+ ZED_EMAIL_ADDR="root"
+
+Please note that {pve} forwards mails addressed to 'root' to the email
+address configured for the root user.
+
+IMPORTANT: The only setting that is required is 'ZED_EMAIL_ADDR'. All
+other settings are optional.
+
+
+Limit ZFS memory usage
+~~~~~~~~~~~~~~~~~~~~~~
+
+It is good to use at most 50 percent of the system memory for the ZFS
+ARC, to prevent performance degradation of the host. Use your preferred
+editor to change the configuration in '/etc/modprobe.d/zfs.conf' and
+insert:
+
+ options zfs zfs_arc_max=8589934592
+
+This example setting limits the usage to 8GB (8 * 1024^3 = 8589934592
+bytes).
+
+[IMPORTANT]
+====
+If your root file system is ZFS, you must update your initramfs every
+time this value changes.
+
+ update-initramfs -u
+====
+
+
+.SWAP on ZFS
+
+SWAP on ZFS on Linux may cause some trouble, like blocking the server
+or generating a high IO load, often seen when starting a backup to an
+external storage.
+
+We strongly recommend using enough memory, so that you normally do not
+run into low memory situations. Additionally, you can lower the
+'swappiness' value. A good value for servers is 10:
+
+ sysctl -w vm.swappiness=10
+
+To make the swappiness setting persistent, open '/etc/sysctl.conf' with
+an editor of your choice and add the following line:
+
+ vm.swappiness = 10
+
+.Linux Kernel 'swappiness' parameter values
+[width="100%",cols="