add ZFS management docs
partly imported from current wiki page.
commit 9ee943233e (parent f039505c1b)

Makefile
@@ -103,6 +103,7 @@ SYSADMIN_SOURCES= \
 	pve-package-repos.adoc \
 	pve-installation.adoc \
 	system-software-updates.adoc \
+	local-zfs.adoc \
 	sysadmin.adoc
 
 API_VIEWER_SOURCES= \
@@ -170,6 +171,8 @@ all: index.html
 %-nwdiag.svg: %.nwdiag
 	nwdiag -T svg $*.nwdiag -o $@;
 
+chapter-sysadmin.html chapter-sysadmin-plain.html: ${SYSADMIN_SOURCES}
+
 chapter-%.html: %.adoc ${PVE_COMMON_DOC_SOURCES}
 	asciidoc ${ADOC_STDARG} -a toc -o $@ $*.adoc
 

local-zfs.adoc (new file, 325 lines)

@@ -0,0 +1,325 @@

ZFS on Linux
------------
include::attributes.txt[]

ZFS is a combined file system and logical volume manager designed by
Sun Microsystems. Starting with {pve} 3.4, the native Linux kernel
port of the ZFS file system is introduced as an optional file system,
and also as an additional selection for the root file system. There
is no need to manually compile ZFS modules - all packages are
included.

By using ZFS, it is possible to achieve enterprise-class features
with low-budget hardware, but also high-performance systems by
leveraging SSD caching or even SSD-only setups. ZFS can replace
expensive hardware RAID cards with moderate CPU and memory load,
combined with easy management.

.General ZFS advantages

* Easy configuration and management with {pve} GUI and CLI.
* Reliable
* Protection against data corruption
* Data compression on file-system level
* Snapshots
* Copy-on-write clone
* Various RAID levels: RAID0, RAID1, RAID10, RAIDZ-1, RAIDZ-2 and RAIDZ-3
* Can use SSD for cache
* Self healing
* Continuous integrity checking
* Designed for high storage capacities
* Asynchronous replication over network
* Open Source
* Encryption
* ...

Hardware
~~~~~~~~

ZFS depends heavily on memory, so you need at least 8GB to start. In
practice, use as much as you can get for your hardware/budget. To
prevent data corruption, we recommend the use of high quality ECC RAM.

If you use a dedicated cache and/or log disk, you should use an
enterprise-class SSD (e.g. Intel SSD DC S3700 Series). This can
increase the overall performance significantly.

IMPORTANT: Do not use ZFS on top of a hardware controller which has
its own cache management. ZFS needs to communicate directly with the
disks. An HBA adapter, or something like an LSI controller flashed in
'IT' mode, is the way to go.

If you are experimenting with an installation of {pve} inside a VM
(Nested Virtualization), don't use 'virtio' for the disks of that VM,
since virtio disks are not supported by ZFS. Use IDE or SCSI instead
(this also works with the 'virtio' SCSI controller type).


Installation as root file system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When you install using the {pve} installer, you can choose ZFS for
the root file system. You need to select the RAID type at
installation time:

[horizontal]
RAID0:: Also called 'striping'. The capacity of such a volume is the
sum of the capacities of all disks. But RAID0 does not add any
redundancy, so the failure of a single drive makes the volume
unusable.

RAID1:: Also called 'mirroring'. Data is written identically to all
disks. This mode requires at least 2 disks of the same size. The
resulting capacity is that of a single disk.

RAID10:: A combination of RAID0 and RAID1. Requires at least 4 disks.

RAIDZ-1:: A variation on RAID-5, single parity. Requires at least 3 disks.

RAIDZ-2:: A variation on RAID-5, double parity. Requires at least 4 disks.

RAIDZ-3:: A variation on RAID-5, triple parity. Requires at least 5 disks.

The installer automatically partitions the disks, creates a ZFS pool
called 'rpool', and installs the root file system on the ZFS
subvolume 'rpool/ROOT/pve-1'.

Another subvolume called 'rpool/data' is created to store VM
images. In order to use that with the {pve} tools, the installer
creates the following configuration entry in '/etc/pve/storage.cfg':

----
zfspool: local-zfs
	pool rpool/data
	sparse
	content images,rootdir
----
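
To verify that the new storage is active, you can query it with the
'pvesm' storage management tool - an illustrative example, the exact
output depends on your setup:

----
# pvesm status
# pvesm list local-zfs
----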

After installation, you can view your ZFS pool status using the
'zpool' command:

----
# zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	rpool       ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sda2    ONLINE       0     0     0
	    sdb2    ONLINE       0     0     0
	  mirror-1  ONLINE       0     0     0
	    sdc     ONLINE       0     0     0
	    sdd     ONLINE       0     0     0

errors: No known data errors
----

The 'zfs' command is used to configure and manage your ZFS file
systems. The following command lists all file systems after
installation:

----
# zfs list
NAME               USED  AVAIL  REFER  MOUNTPOINT
rpool             4.94G  7.68T    96K  /rpool
rpool/ROOT         702M  7.68T    96K  /rpool/ROOT
rpool/ROOT/pve-1   702M  7.68T   702M  /
rpool/data          96K  7.68T    96K  /rpool/data
rpool/swap        4.25G  7.69T    64K  -
----


Bootloader
~~~~~~~~~~

The default ZFS disk partitioning scheme does not use the first 2048
sectors. This gives enough room to install a GRUB boot partition. The
{pve} installer automatically allocates that space, and installs the
GRUB boot loader there. If you use a redundant RAID setup, it
installs the boot loader on all disks required for booting, so you
can boot even if some disks fail.

NOTE: It is not possible to use ZFS as the root partition with UEFI
boot.


ZFS Administration
~~~~~~~~~~~~~~~~~~

This section gives you some usage examples for common tasks. ZFS
itself is really powerful and provides many options. The main
commands to manage ZFS are 'zfs' and 'zpool'. Both commands come with
great manual pages, which are worth reading:

----
# man zpool
# man zfs
----

.Create a new ZPool

To create a new pool, at least one disk is needed. The 'ashift' value
should match the sector size of the underlying disk (2 to the power
of 'ashift' bytes), or be larger:

 zpool create -f -o ashift=12 <pool> <device>
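
To pick a suitable 'ashift', you can first check the sector sizes the
disk reports - a hedged example; a physical sector size of 4096 bytes
corresponds to ashift=12 (2^12 = 4096):

----
# print physical and logical sector size of the disk
lsblk -o NAME,PHY-SEC,LOG-SEC <device>
----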

To activate compression:

 zfs set compression=lz4 <pool>
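
You can read the property back to confirm the change - for example,
the following prints the NAME, PROPERTY, VALUE and SOURCE columns for
the pool:

----
# zfs get compression <pool>
----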

.Create a new pool with RAID-0

Minimum 1 disk:

 zpool create -f -o ashift=12 <pool> <device1> <device2>

.Create a new pool with RAID-1

Minimum 2 disks:

 zpool create -f -o ashift=12 <pool> mirror <device1> <device2>

.Create a new pool with RAID-10

Minimum 4 disks:

 zpool create -f -o ashift=12 <pool> mirror <device1> <device2> mirror <device3> <device4>

.Create a new pool with RAIDZ-1

Minimum 3 disks:

 zpool create -f -o ashift=12 <pool> raidz1 <device1> <device2> <device3>

.Create a new pool with RAIDZ-2

Minimum 4 disks:

 zpool create -f -o ashift=12 <pool> raidz2 <device1> <device2> <device3> <device4>

.Create a new pool with Cache (L2ARC)

It is possible to use a dedicated cache drive partition to increase
the performance (use an SSD).

As '<device>', it is possible to use more devices, as shown in
"Create a new pool with RAID*".

 zpool create -f -o ashift=12 <pool> <device> cache <cache_device>

.Create a new pool with Log (ZIL)

It is possible to use a dedicated log drive partition to increase
the performance (use an SSD).

As '<device>', it is possible to use more devices, as shown in
"Create a new pool with RAID*".

 zpool create -f -o ashift=12 <pool> <device> log <log_device>

.Add Cache and Log to an existing pool

If you have a pool without cache and log, first partition the SSD
into 2 partitions with parted or gdisk, as shown in the sketch below.

IMPORTANT: Always use GPT partition tables (gdisk or parted).

The maximum size of a log device should be about half the size of
physical memory, so it is usually quite small. The rest of the SSD
can be used as cache.

 zpool add -f <pool> log <device-part1> cache <device-part2>
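
For example, a minimal partitioning sketch with parted - the device
name '/dev/sdf' and the sizes are assumptions, adapt them to your
hardware (here, an 8GiB log for a host with 16GB of memory):

----
# create a GPT label and two partitions on the (assumed) SSD /dev/sdf
parted -s /dev/sdf mklabel gpt
parted -s /dev/sdf mkpart zfs-log 1MiB 8GiB
parted -s /dev/sdf mkpart zfs-cache 8GiB 100%
# attach them to the pool as log and cache
zpool add -f <pool> log /dev/sdf1 cache /dev/sdf2
----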

.Changing a failed Device

 zpool replace -f <pool> <old-device> <new-device>
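
After the replace, the pool resilvers data onto the new device; you
can watch the progress with 'zpool status'. If the replaced disk
belongs to the boot pool 'rpool', the new disk also needs a boot
loader - a hedged sketch, assuming the legacy GRUB setup of the {pve}
installer:

----
# watch the resilver progress
zpool status <pool>
# reinstall GRUB on the new disk (only needed for boot pool members)
grub-install <new-device>
----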


Activate E-Mail Notification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ZFS comes with an event daemon, which monitors events generated by
the ZFS kernel module. The daemon can also send e-mails on ZFS events
like pool errors.

To activate the daemon, it is necessary to edit
'/etc/zfs/zed.d/zed.rc' with your favorite editor, and uncomment the
'ZED_EMAIL_ADDR' setting:

 ZED_EMAIL_ADDR="root"
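
For the change to take effect, restart the event daemon afterwards -
assuming it runs as the usual 'zfs-zed' systemd service:

----
# restart the ZFS event daemon so it re-reads zed.rc
systemctl restart zfs-zed
----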

Please note that {pve} forwards mails addressed to 'root' to the
email address configured for the root user.

IMPORTANT: The only setting that is required is 'ZED_EMAIL_ADDR'. All
other settings are optional.


Limit ZFS memory usage
~~~~~~~~~~~~~~~~~~~~~~

It is good to use at most 50 percent of the system memory for the ZFS
ARC, to prevent performance degradation of the host. Use your
preferred editor to change the configuration in
'/etc/modprobe.d/zfs.conf' and insert:

 options zfs zfs_arc_max=8589934592

This example setting limits the usage to 8GB.
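
The value is given in bytes: 8GB = 8 * 1024^3 = 8589934592. If the
ZFS module is already loaded, the limit can also be changed at
runtime through the module parameter - a hedged sketch, assuming the
usual ZFS on Linux sysfs interface:

----
# 8 * 1024^3 = 8589934592 bytes
echo "$((8 * 1024**3))" > /sys/module/zfs/parameters/zfs_arc_max
----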

[IMPORTANT]
====
If your root file system is ZFS, you must update your initramfs every
time this value changes:

 update-initramfs -u
====


.SWAP on ZFS

SWAP on ZFS on Linux may cause some trouble, like blocking the server
or generating a high IO load, often seen when starting a backup to an
external storage.

We strongly recommend using enough memory, so that you normally do
not run into low memory situations. Additionally, you can lower the
'swappiness' value. A good value for servers is 10:

 sysctl -w vm.swappiness=10
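
To confirm the change, read the value back:

----
# sysctl vm.swappiness
----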

To make the swappiness setting persistent, open '/etc/sysctl.conf'
with an editor of your choice and add the following line:

 vm.swappiness = 10

.Linux Kernel 'swappiness' parameter values
[width="100%",cols="<m,2d",options="header"]
|===========================================================
| Value               | Strategy
| vm.swappiness = 0   | The kernel will swap only to avoid an 'out of memory' condition
| vm.swappiness = 1   | Minimum amount of swapping without disabling it entirely.
| vm.swappiness = 10  | This value is sometimes recommended to improve performance when sufficient memory exists in a system.
| vm.swappiness = 60  | The default value.
| vm.swappiness = 100 | The kernel will swap aggressively.
|===========================================================

@@ -230,24 +230,12 @@ TODO: explan OVS
 ////
 
 
+include::local-zfs.adoc[]
+
+
 ////
 TODO:
 
-Local Storage
--------------
-
-Logical Volume Manager (LVM)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-TODO: info about LVM.
-
-
-ZFS on Linux
-~~~~~~~~~~~~
-
-TODO: info about ZFS.
-
-
 Working with 'systemd'
 ----------------------

@@ -256,10 +244,4 @@ Journal and syslog
 
 TODO: explain persistent journal...
 
-
-
 ////
-
-
-