186 Commits

Author SHA1 Message Date
Andrey Smirnov
e0650218a6 feat: support etcd recovery from snapshot on bootstrap
When a Talos `controlplane` node is waiting for bootstrap, `etcd`
contents can be recovered from a snapshot created with
`talosctl etcd snapshot` on a healthy cluster.

The bootstrap process works the same way as before, but the etcd data directory
is recovered from the snapshot.

This flow enables disaster recovery for the control plane: given that
periodic backups are available, destroy control plane nodes, re-create
them with the same config, and bootstrap one node with the saved
snapshot to recover etcd state at the time of the snapshot.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-08 10:15:37 -07:00
Artem Chernyshev
247bd50e05 docs: describe steps to install and boot Talos from the SSD on rockpi4
Describe that gross flow while I still remember it.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-04-07 13:06:58 -07:00
Alexey Palazhchenko
aca63b8829 docs: fix "DigitalOcean" spelling
Refs #3427.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-04-07 09:13:24 -07:00
Andrey Smirnov
fbfd1eb2b1 refactor: pull new version of os-runtime, update code
This is mostly refactoring to adapt to the new APIs.

There are some small changes which are not user-visible immediately (but
visible when using `talosctl get` to inspect low-level details):

* `extras` namespace is removed, it was a hack to distinguish extra and
system manifests
* `Manifests` are managed by two controllers as shared outputs, stored
in the `controlplane` namespace now
* `talosctl inspect dependencies` output got slightly changed
* resources now have `md.owner` set to the controller name which manages
the resource

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-07 06:55:09 -07:00
Alexey Palazhchenko
8737ea716a feat: allow external cloud provider configuration
Closes #3312.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-04-06 22:54:24 -07:00
Artem Chernyshev
39c6dbcc7a feat: add --config-patch parameter to talosctl gen config
Fixes: https://github.com/talos-systems/talos/issues/3410

Works the same as in `talosctl cluster create`: if specified, an RFC 6902
JSON patch is applied during config generation.
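
As a rough illustration of what an RFC 6902 `add` operation does to a config document, here is a minimal Python sketch. It is not the Talos implementation: `apply_add` is a hypothetical helper, only the `add` op with a non-empty path is handled, and the `config` dictionary is a heavily trimmed stand-in for a real machine config.

```python
import copy


def apply_add(document, path, value):
    """Apply a single RFC 6902 "add" operation and return a patched copy.

    Sketch only: assumes a non-empty JSON Pointer whose parent already exists.
    """
    patched = copy.deepcopy(document)
    # Split the JSON Pointer into reference tokens, unescaping ~1 and ~0.
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    parent = patched
    for token in tokens[:-1]:
        parent = parent[int(token)] if isinstance(parent, list) else parent[token]
    last = tokens[-1]
    if isinstance(parent, list):
        # "-" appends to the array; otherwise insert before the given index.
        index = len(parent) if last == "-" else int(last)
        parent.insert(index, value)
    else:
        # For objects, "add" creates the member or replaces an existing one.
        parent[last] = value
    return patched


# Hypothetical, heavily trimmed config document.
config = {"machine": {"kubelet": {}}}
patch = [
    {
        "op": "add",
        "path": "/machine/kubelet/extraMounts",
        "value": [{"destination": "/var/log/containers", "type": "bind"}],
    }
]

for op in patch:
    if op["op"] != "add":
        raise ValueError("only the 'add' operation is sketched here")
    config = apply_add(config, op["path"], op["value"])
```

The helper works on a deep copy, so the original document is left untouched if a later operation in the patch fails.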

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-04-02 10:56:41 -07:00
Andrey Smirnov
e664362cec feat: add API and command to save etcd snapshot (backup)
This adds a simple API and a `talosctl etcd snapshot` command to stream
an etcd snapshot from one of the control plane nodes to a local file.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-02 09:20:16 -07:00
Branden Cash
7bcb91a433 docs: fix typo for stage flag
The docs mentioned a `--staged` flag, but the correct flag is `--stage`.

Signed-off-by: Branden Cash <ammmze@gmail.com>
2021-04-01 10:44:46 -07:00
Andrey Smirnov
e2bb5973da release(v0.10.0-alpha.1): prepare release
This is the official v0.10.0-alpha.1 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-31 23:17:31 +03:00
Alexey Palazhchenko
a9451f5712 feat: update Kubernetes to 1.21.0-beta.1
See CHANGELOG:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md

Refs #3329.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-30 03:07:03 -07:00
Artem Chernyshev
4b42ced4c2 feat: add ability to disable comments in talosctl gen config
Fixes: https://github.com/talos-systems/talos/issues/3384

Instead of a simple `--no-comments` flag, this uses a more granular
approach which allows disabling examples, docstrings, or both.

Thus the command looks like this:

```bash
talosctl gen config --with-docs=false --with-examples=false <...>
```

Both are enabled by default to provide better UX for users learning
Talos.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-29 10:52:14 -07:00
Andrey Smirnov
2ea20f598a feat: replace timed with time sync controller
This is a complete rewrite of the time sync process.

Now the time sync process starts early at boot time, and it adapts to
configuration changes:

* before config is available, `pool.ntp.org` is used
* once config is available, configured time servers are used

The controller updates the same time sync resource that other controllers
already depend on, so they can wait for the time sync event.

Talos services which depend on time now wait on the same resource instead
of waiting on timed health.

New features:

* time sync now sticks to a particular time server and switches to another
server only when that server returns an error; this improves time sync
accuracy

* time sync acts on config changes immediately, so it's possible to
reconfigure time sync at any time

* there's a new 'epoch' field in time sync resources which allows
time-dependent controllers to regenerate certs when there's a big enough
jump in time
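
The behaviour described in the list above can be modelled roughly as follows. This is a toy Python sketch, not the controller code: the class name, method names, and the 15-minute jump threshold are all assumptions made for illustration, and no real NTP queries are performed.

```python
class StickyTimeSync:
    """Toy model: stick to one server, rotate on error, bump epoch on big jumps."""

    def __init__(self, servers, jump_threshold=15 * 60):
        self.servers = list(servers)
        self.current = 0  # index of the server we are sticking to
        self.epoch = 0  # incremented on large clock adjustments
        self.jump_threshold = jump_threshold  # seconds; the value is an assumption

    @property
    def server(self):
        return self.servers[self.current]

    def on_error(self):
        # Only a failure moves us on to the next server in the list.
        self.current = (self.current + 1) % len(self.servers)

    def on_offset(self, offset_seconds):
        # A successful query keeps the current server;
        # a large enough offset bumps the epoch.
        if abs(offset_seconds) >= self.jump_threshold:
            self.epoch += 1

    def reconfigure(self, servers):
        # Config changes take effect immediately.
        self.servers = list(servers)
        self.current = 0


sync = StickyTimeSync(["pool.ntp.org"])  # default before config is available
sync.on_offset(0.02)  # small offset: same server, same epoch
sync.reconfigure(["a.example", "b.example"])  # hypothetical configured servers
sync.on_error()  # a.example failed, switch to b.example
sync.on_offset(3600)  # one-hour jump: epoch is bumped
```

A consumer such as a certificate controller could compare the epoch it last saw with the current one and regenerate its certificates when they differ.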

Features to implement later:

* apid shouldn't depend on timed, it should be started early and it
should regenerate certs on time jump

* trustd should be updated in the same way

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-29 09:29:43 -07:00
Spencer Smith
74b2b5578c docs: update AWS docs to ensure instances are tagged
This PR updates our AWS docs so that we specify a tag when creating
instances. This makes it easier to know which VMs were created as part
of this process, as well as quickly spot the init node.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2021-03-25 11:55:19 -04:00
Spencer Smith
946e74f047 docs: update path for kernel downloads in qemu docs
This PR fixes a docs bug where the name of the kernel and init to
download were incorrect for qemu.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2021-03-24 09:48:12 -07:00
Alexey Palazhchenko
ed272e604e feat: update Kubernetes to 1.21.0-beta.0
See CHANGELOG:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md

Refs #3329.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-24 07:36:54 -07:00
Andrey Smirnov
b0209fd29d refactor: move networkd, timed APIs to machined, remove routerd
This moves implementation of the user-facing APIs to the machined, and
as now all the APIs are implemented by machined, remove routerd and
adjust apid to proxy to machined.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-24 00:00:28 -07:00
Artem Chernyshev
6ffabe5169 feat: add ability to find disk by disk properties
Fixes: https://github.com/talos-systems/talos/issues/3323

This does not exactly match the udevd-generated `by-<id>` symlinks, but it
should provide enough property selectors to pick a specific disk of any
kind: SD card, HDD, SSD, NVMe.
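
Selecting a disk by properties amounts to matching every requested property against each candidate. The Python sketch below illustrates the idea only: `find_disk`, the property names, and the sample disks are hypothetical and do not reflect the actual selector fields.

```python
def find_disk(disks, **selector):
    """Return the first disk whose properties match every selector entry."""
    for disk in disks:
        if all(disk.get(key) == wanted for key, wanted in selector.items()):
            return disk
    return None


# Hypothetical inventory; real property names and values may differ.
disks = [
    {"name": "/dev/sda", "type": "hdd", "model": "ExampleDisk 1T"},
    {"name": "/dev/nvme0n1", "type": "nvme", "model": "ExampleNVMe 512G"},
]

nvme = find_disk(disks, type="nvme")
missing = find_disk(disks, type="ssd")
```

Returning `None` when nothing matches lets the caller decide whether a missing disk is an error or a reason to fall back to an explicit device path.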

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-23 14:23:02 -07:00
Andrey Smirnov
2b1641a3b5 docs: add AMIs for Talos 0.9.0
Not all regions were able to process the request, so the list is a bit
shorter than usual.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-22 11:58:28 -07:00
Andrew Rynhard
79ceb428d4 docs: make v0.9 the default docs
This makes the v0.9 release the default documentation.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2021-03-22 11:20:20 -07:00
Andrey Smirnov
a5b62f4dc2 docs: add documentation for Talos 0.10
Move default docs generation to 0.10 folder.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-22 06:24:39 -07:00
Alexey Palazhchenko
f7d276b854 chore: remove old osctl reference
One place was missed.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-19 08:08:58 -07:00
Andrey Smirnov
f0512dfce9 feat: update Kubernetes to 1.20.5
See CHANGELOG:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#changelog-since-v1204

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-19 03:14:46 -07:00
Andrey Smirnov
8810440744 docs: add control plane in-depth guide
Add FAQ on initial time sync.

Add 0.9 new videos.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-17 11:23:59 -07:00
Andrey Smirnov
cbc38418d8 release(v0.10.0-alpha.0): prepare release
This is the official v0.10.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-17 08:40:09 -07:00
Seán C McCord
2e22f20bd8 docs: minor fixes to getting started
Fixes a few minor errors in the Getting Started doc.

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2021-03-12 13:06:47 -08:00
Artem Chernyshev
83b4e7f744 feat: add Rock pi 4 support
Another nice addition to the list of supported SBCs.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-12 05:08:29 -08:00
Seán C McCord
1362966ff5 docs: rewrite getting-started for ISO
Update the Getting Started documentation to reflect the new ISO-based
installation method.

Fixes #3016

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2021-03-12 04:44:10 -08:00
Andrey Smirnov
6f7df3da1e fix: update output of convert-k8s command
This includes Sean's comments from #3278 and introduces a new flag which
is referenced in the manual conversion process document.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-12 02:21:01 -08:00
Seán C McCord
dce6118c29 docs: add guide for VIP
Add documentation for using VIP, or shared IP addresses, for the
controlplane.

Fixes #3289

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2021-03-11 19:01:38 -08:00
Andrey Smirnov
7c529e1cbd docs: fix links in the documentation
Gridsome forces folders to be lower-case.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-11 12:26:50 -08:00
Spencer Smith
f596c7f6be docs: add video for raspberry pi install
This PR adds a quick how-to on installing talos on rpi.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2021-03-11 11:11:58 -05:00
Andrey Smirnov
47324dcaea docs: add guide on editing machine configuration
This covers new configuration update modes and new commands in 0.9.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-10 14:02:16 -08:00
Andrey Smirnov
99d5f894e1 chore: update website npm dependencies
This fixes some security vulnerabilities in the libraries.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-10 13:55:40 -08:00
Andrey Smirnov
11056a8034 docs: add highlights for 0.9 release
This describes high-level new features in Talos 0.9.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-10 07:21:13 -08:00
Andrey Smirnov
ae8bedb9a0 docs: add control plane conversion guide and 0.9 upgrade notes
These docs are critical to get 0.9.0-beta released.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-10 07:20:44 -08:00
Andrey Smirnov
ed9673e50a docs: add troubleshooting control plane documentation
Describe common failures and debugging approach.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Co-authored-by: Spencer Smith <rsmitty@users.noreply.github.com>
2021-03-09 13:31:08 -08:00
Andrey Smirnov
485cb1262f docs: update Kubernetes upgrade guide
CLI tool usage is the same, but the manual process is quite different.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-09 13:23:58 -08:00
Andrey Smirnov
d3798cd7a8 docs: document controller runtime, resources and talosctl get
This is more of an in-depth guide explaining internals.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Co-authored-by: Spencer Smith <rsmitty@users.noreply.github.com>
2021-03-09 11:27:48 -08:00
Andrey Smirnov
49853fc2ec fix: mkdir source of the extra mounts for the kubelet
This makes sure source directory exists before performing mount
operation.

Also adds the ability to patch the config bundle configs with a JSON patch,
which is exposed in `talosctl cluster create`; this allowed me to easily
test this fix:

```bash
talosctl cluster create ... --config-patch='[{"op": "add", "path": "/machine/kubelet/extraMounts", "value": [{"destination": "/var/log/containers", "type": "bind", "source": "/var/log/containers", "options": ["rshared", "rbind", "rw"]}]}]'
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-05 11:47:55 -08:00
Andrey Smirnov
ec72ae892b release(v0.9.0-alpha.5): prepare release
This is the official v0.9.0-alpha.5 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-03 12:04:05 -08:00
Andrey Smirnov
60b7f79fd8 feat: add --on-reboot flag to talosctl edit/patch machineConfig
This allows applying config even if the sequencer is locked, which makes
it possible to recover from configuration mistakes.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-03 08:48:29 -08:00
Andrey Smirnov
60aa011c7a feat: rename namespaces, resources, types etc
See https://github.com/talos-systems/os-runtime/pull/12 for new naming
conventions.

No functional changes.

Additionally implements printing extra columns in `talosctl get xyz`.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-02 13:34:15 -08:00
Andrey Smirnov
3a2caca781 release(v0.9.0-alpha.4): prepare release
This is the official v0.9.0-alpha.4 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-02 12:50:20 -08:00
Artem Chernyshev
02c0c25bad docs: bump v0.8 release version in the SBCs guides
Makes sense to update these guides to point to the v0.8.4 as it contains
many good fixes.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-02 07:09:33 -08:00
Artem Chernyshev
9333e2a600 docs: add disk encryption guide
Describe usage tips, caveats, flow.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-02 06:44:40 -08:00
Andrey Smirnov
a12a5dd255 release(v0.9.0-alpha.3): prepare release
This is the official v0.9.0-alpha.3 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-01 12:55:08 -08:00
Artem Chernyshev
376fdcf6cb feat: implement etcd remove-member cli command
Fixes: https://github.com/talos-systems/talos/issues/3219

We already have `etcd leave`, which makes a node remove itself from the
etcd members.
But if the node can't remove itself because it has no connection to etcd,
we need this `etcd remove-member` CLI command, which removes that node
from the member list via a different node.

No unit tests for that as it's going to destroy the test cluster.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-03-01 07:55:08 -08:00
Andrey Smirnov
d173fd4c01 feat: update etcd to 3.4.15
See https://github.com/etcd-io/etcd/releases/tag/v3.4.15

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-01 06:16:40 -08:00
Andrey Smirnov
c7ee239087 fix: show stopped/exited containers via CRI inspector
This fixes output of `talosctl containers` to show failed/exited
containers so that it's possible to see e.g. `kube-apiserver` container
when it fails to start. This also enables using ID from the container
list to see logs of failing containers, so it's easy to debug issues
when control plane pods don't start because of wrong configuration.

Also remove option to use either CRI or containerd inspector, default to
containerd for system namespace and to CRI for kubernetes namespace.

The only side effect is that we can't see `kubelet` container in the
output of `talosctl containers -k`, but `kubelet` itself is available in
`talosctl services` and `talosctl logs kubelet`.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-26 14:45:13 -08:00
Andrey Smirnov
d7cdc8cc15 feat: implement simple layer 2 shared IP for CP
This adds a VIP (virtual IP) option to the network configuration of an
interface, which will allow a set of nodes to share a floating IP
address among them.  For now, this is restricted to control plane use
and only a single shared IP is supported.

Fixes #3111

Signed-off-by: Seán C McCord <ulexus@gmail.com>
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-26 14:14:34 -08:00