ifdef::manvolnum[]
PVE({manvolnum})
================
include::attributes.txt[]
NAME
----
pvecm - {pve} Cluster Manager
SYNOPSIS
--------
include::pvecm.1-synopsis.adoc[]
DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Manager
===============
include::attributes.txt[]
endif::manvolnum[]
The {PVE} cluster manager `pvecm` is a tool to create a group of
physical servers. Such a group is called a *cluster*. We use the
http://www.corosync.org[Corosync Cluster Engine] for reliable group
communication, and such clusters can consist of up to 32 physical nodes
(probably more, depending on network latency).
`pvecm` can be used to create a new cluster, join nodes to a cluster,
leave the cluster, get status information and do various other cluster
related tasks. The **P**rox**m**o**x** **C**luster **F**ile **S**ystem (``pmxcfs'')
is used to transparently distribute the cluster configuration to all cluster
nodes.
Grouping nodes into a cluster has the following advantages:
* Centralized, web based management
* Multi-master clusters: each node can do all management tasks
* `pmxcfs`: database-driven file system for storing configuration files,
replicated in real-time on all nodes using `corosync`.
* Easy migration of virtual machines and containers between physical
hosts
* Fast deployment
* Cluster-wide services like firewall and HA
Requirements
------------
* All nodes must be in the same network, as `corosync` uses IP multicast
to communicate between nodes (also see
http://www.corosync.org[Corosync Cluster Engine]). Corosync uses UDP
ports 5404 and 5405 for cluster communication. A way to verify that
multicast actually works is sketched at the end of this section.
+
NOTE: Some switches do not support IP multicast by default and must be
manually enabled first.
* Date and time must be synchronized on all nodes.
* An SSH tunnel on TCP port 22 is used between the nodes.
* If you are interested in High Availability, you need at least three
nodes for reliable quorum. All nodes should run the same {pve}
version.
* We recommend a dedicated NIC for the cluster traffic, especially if
you use shared storage.
NOTE: It is not possible to mix Proxmox VE 3.x and earlier with
Proxmox VE 4.0 cluster nodes.
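If you are not sure whether multicast works between your nodes, you can
test it before creating the cluster. The following is only an
illustrative sketch, assuming the `omping` utility is installed on all
nodes and reusing the example host names `hp1`, `hp2` and `hp3`:

----
# run on every prospective cluster node; all nodes should report low loss
hp1# omping -c 600 -i 1 -q hp1 hp2 hp3
# verify that the system clock is synchronized
hp1# timedatectl status
----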
Preparing Nodes
---------------
First, install {PVE} on all nodes. Make sure that each node is
installed with the final hostname and IP configuration. Changing the
hostname and IP is not possible after cluster creation.
Currently, cluster creation has to be done on the console, so you
need to log in via `ssh`.
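Since the hostname and IP address cannot be changed later, it can be
worth double-checking that each node resolves its own name to the
address you intend to use. This is just an illustrative sanity check,
not part of the `pvecm` workflow:

----
# the reported address should be the one the cluster is supposed to use
hp1# hostname --ip-address
# the node's own entry in /etc/hosts should point to the same address
hp1# grep hp1 /etc/hosts
----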
Create the Cluster
------------------
Log in via `ssh` to the first {pve} node. Use a unique name for your cluster.
This name cannot be changed later.
hp1# pvecm create YOUR-CLUSTER-NAME
CAUTION: The cluster name is used to compute the default multicast
address. Please use unique cluster names if you run more than one
cluster inside your network.
To check the state of your cluster use:
hp1# pvecm status
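The configuration written by `pvecm create` is kept on the cluster file
system, so you can simply view it if you want to see, for example, the
cluster name that was set (path as used by {pve} 4.x):

----
hp1# cat /etc/pve/corosync.conf
----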
Adding Nodes to the Cluster
---------------------------
Log in via `ssh` to the node you want to add.
hp2# pvecm add IP-ADDRESS-CLUSTER
For `IP-ADDRESS-CLUSTER`, use the IP address of an existing cluster node.
CAUTION: A new node cannot hold any VMs, because you would get
conflicts about identical VM IDs. Also, all existing configuration in
`/etc/pve` is overwritten when you join a new node to the cluster. As a
workaround, use `vzdump` to back up each guest and restore it to a
different VMID after adding the node to the cluster, as sketched below.
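A minimal sketch of that workaround for a single VM, assuming it
currently has the VM ID 100, that `/mnt/backup` is a directory with
enough free space (both chosen purely for illustration), and that the
VM ID 200 is free in the cluster; the exact archive file name will
differ:

----
# before joining: back up the guest that lives on the new node
hp2# vzdump 100 --dumpdir /mnt/backup
# after joining the cluster: restore it under a free VM ID
hp2# qmrestore /mnt/backup/vzdump-qemu-100-<timestamp>.vma 200
----

For containers, `pct restore` is used instead of `qmrestore`.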
To check the state of the cluster:
# pvecm status
.Cluster status after adding 4 nodes
----
hp2# pvecm status
Quorum information
~~~~~~~~~~~~~~~~~~
Date:             Mon Apr 20 12:30:13 2015
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000001
Ring ID:          1928
Quorate:          Yes

Votequorum information
~~~~~~~~~~~~~~~~~~~~~~
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           2
Flags:            Quorate

Membership information
~~~~~~~~~~~~~~~~~~~~~~
    Nodeid      Votes Name
0x00000001          1 192.168.15.91
0x00000002          1 192.168.15.92 (local)
0x00000003          1 192.168.15.93
0x00000004          1 192.168.15.94
----
If you only want the list of all nodes, use:
# pvecm nodes
.List nodes in a cluster
----
hp2# pvecm nodes
Membership information
~~~~~~~~~~~~~~~~~~~~~~
    Nodeid      Votes Name
         1          1 hp1
         2          1 hp2 (local)
         3          1 hp3
         4          1 hp4
----
Remove a Cluster Node
---------------------
CAUTION: Read the procedure carefully before proceeding, as it may not
be what you want or need.
Move all virtual machines off the node. Make sure you have no local
data or backups that you want to keep, or save them accordingly.
Log in to one remaining node via ssh. Check the cluster state with
`pvecm status` and identify the ID of the node to remove with
`pvecm nodes`:
----
hp1# pvecm status
Quorum information
~~~~~~~~~~~~~~~~~~
Date:             Mon Apr 20 12:30:13 2015
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000001
Ring ID:          1928
Quorate:          Yes

Votequorum information
~~~~~~~~~~~~~~~~~~~~~~
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           2
Flags:            Quorate

Membership information
~~~~~~~~~~~~~~~~~~~~~~
    Nodeid      Votes Name
0x00000001          1 192.168.15.91 (local)
0x00000002          1 192.168.15.92
0x00000003          1 192.168.15.93
0x00000004          1 192.168.15.94
----
IMPORTANT: At this point you must power off the node to be removed and
make sure that it will not power on again (in the existing cluster
network) as it is.
----
hp1# pvecm nodes
Membership information
~~~~~~~~~~~~~~~~~~~~~~
    Nodeid      Votes Name
         1          1 hp1 (local)
         2          1 hp2
         3          1 hp3
         4          1 hp4
----
Log in to one remaining node via ssh. Issue the delete command (here
deleting node `hp4`):
hp1# pvecm delnode hp4
If the operation succeeds, no output is returned. Just check the node
list again with `pvecm nodes` or `pvecm status`. You should see
something like:
----
hp1# pvecm status
Quorum information
~~~~~~~~~~~~~~~~~~
Date:             Mon Apr 20 12:44:28 2015
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000001
Ring ID:          1992
Quorate:          Yes

Votequorum information
~~~~~~~~~~~~~~~~~~~~~~
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           3
Flags:            Quorate

Membership information
~~~~~~~~~~~~~~~~~~~~~~
    Nodeid      Votes Name
0x00000001          1 192.168.15.90 (local)
0x00000002          1 192.168.15.91
0x00000003          1 192.168.15.92
----
IMPORTANT: As mentioned above, it is critical to power off the node
*before* removal, and to make sure that it will *never* power on again
(in the existing cluster network) as it is.
If you power on the node as it is, your cluster will end up in an
inconsistent state, and it can be difficult to restore a clean cluster
state.
If, for whatever reason, you want this server to join the same
cluster again, you have to
* reinstall {pve} on it from scratch
* then join it, as explained in the previous section.
Quorum
------
{pve} uses a quorum-based technique to provide a consistent state among
all cluster nodes.
[quote, from Wikipedia, Quorum (distributed computing)]
____
A quorum is the minimum number of votes that a distributed transaction
has to obtain in order to be allowed to perform an operation in a
distributed system.
____
In case of network partitioning, state changes require that a
majority of nodes are online. The cluster switches to read-only mode
if it loses quorum.
NOTE: {pve} assigns a single vote to each node by default.
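In practice, a lost quorum is easy to recognize: the cluster file
system mounted at `/etc/pve` refuses writes. A rough illustration (the
exact error you see may vary):

----
# on a node that is not part of a quorate partition, any attempt to
# change the cluster configuration is rejected, for example:
hp1# touch /etc/pve/test
----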
Cluster Cold Start
------------------
It is obvious that a cluster is not quorate when all nodes are
offline. This is a common case after a power failure.
NOTE: It is always a good idea to use an uninterruptible power supply
(``UPS'', also called ``battery backup'') to avoid this state, especially if
you want HA.
On node startup, the `pve-manager` service is started and waits for
quorum. Once the cluster is quorate, it starts all guests that have the
`onboot` flag set, as in the example below.
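The flag is set per guest, for example for a VM with ID 100 and a
container with ID 101 (IDs chosen purely for illustration):

----
hp1# qm set 100 --onboot 1
hp1# pct set 101 --onboot 1
----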
When you turn on nodes, or when power comes back after a power failure,
it is likely that some nodes will boot faster than others. Please keep
in mind that guest startup is delayed until you reach quorum.
ifdef::manvolnum[]
include::pve-copyright.adoc[]
endif::manvolnum[]