91 Commits

Author SHA1 Message Date
Dmitriy Matrenichev
e06e1473b0
feat: update golangci-lint to 1.45.0 and gofumpt to 0.3.0
- Update golangci-lint to 1.45.0
- Update gofumpt to 0.3.0
- Fix gofumpt errors
- Add goimports and format imports since gofumports is removed
- Update Dockerfile
- Fix .golangci.yml configuration
- Fix linting errors

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-03-24 08:14:04 +04:00
Andrey Smirnov
883d401f9f
chore: rename github organization to siderolabs
Go module import paths still use talos-systems, packages use new
siderolabs name.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-23 21:07:46 +03:00
Andrey Smirnov
09efa62f68
chore: re-enable kexec and default to UEFI booting in tests
Fixes #4947

It turns out there's something related to boot process in BIOS mode
which leads to initramfs corruption on later `kexec`.

Booting via GRUB is always successful.

Problem with kexec was confirmed with:

* direct boot via QEMU
* QEMU boot via iPXE (bundled with QEMU)

The root cause is not known, but the only visible difference is the
placement of RAMDISK with UEFI and BIOS boots:

```
[    0.005508] RAMDISK: [mem 0x312dd000-0x34965fff]
```

or:

```
[    0.003821] RAMDISK: [mem 0x711aa000-0x747a7fff]
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-02 21:52:18 +03:00
Andrey Smirnov
85782faa24
feat: update Kubernetes to 1.23.3
Also bumps some dependencies and updates Talos version we use in the
upgrade tests.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-01-26 17:59:21 +03:00
Andrey Smirnov
97ffa7a645
feat: upgrade kubelet version in talosctl upgrade-k8s
Fixes #4656

As now changes to kubelet configuration can be applied without a reboot,
`talosctl upgrade-k8s` can handle the kubelet upgrades as well.

The gist is simply modifying machine config and waiting for `Node`
version to be updated, rest of the code is required for reliability of
the process.

Also fixed a bug in the API while watching deleted items with
tombstones.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-08 21:12:17 +03:00
Andrey Smirnov
64a4f6e77c
test: bump Talos versions in upgrade tests
In preparation for going 0.14-beta.0, bump versions in upgrade tests.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-06 18:07:24 +03:00
Andrey Smirnov
d4b0ca21a1
test: retry upgrade mutex lock failures
With recent changes and kexec, Talos upgrades much faster in the tests
and mutex is not released properly (#4525).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-11-12 17:49:46 +03:00
Andrey Smirnov
38516a5499
test: update Talos versions in upgrade tests
Now 0.13.0 is the past release and 0.12.3 is the one before it.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-10-21 17:36:30 +03:00
Andrey Smirnov
d943bb0e28
feat: update Kubernetes to 1.22.2
See https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-09-16 13:59:51 +03:00
Andrey Smirnov
a059454045
chore: build using Go 1.17
`initramfs` size for amd64 shrinks by 1.3 MiB.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-09-13 22:33:47 +03:00
Andrey Smirnov
950f122c95
chore: update versions in upgrade tests
In preparation for 0.13, start testing upgrades to 0.12.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-08-25 18:02:47 +03:00
Alexey Palazhchenko
09d70b7eaf feat: update Kubernetes to v1.22.0
Closes #3967.
Closes #3997.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-08-06 09:06:32 -07:00
Alexey Palazhchenko
eea750de2c chore: rename "join" type to "worker"
Closes #3413.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-07-09 07:10:45 -07:00
Andrey Smirnov
84817f7334 chore: bump Talos version in upgrade tests
Preparing for 0.11 to be stable release soon.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-29 07:24:48 -07:00
Alexey Palazhchenko
42c16f67f4 chore: bump dependencies
Update k8s to 1.21.2.

See #3787 #3788 #3789 #3790 #3791 #3792 #3793 #3794 #3795 #3796 #3798.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-21 07:05:41 -07:00
Andrey Smirnov
5811f4dda1 feat: implement link (interface) controllers
The structure of the controllers is really similar to addresses and
routes:

* `LinkSpec` resource describes desired link state
* `LinkConfig` controller generates `LinkSpecs` based on machine
configuration and kernel cmdline
* `LinkMerge` controller merges multiple configuration sources into a
single `LinkSpec` paying attention to the config layer priority
* `LinkSpec` controller applies the specs to the kernel state

Controller `LinkStatus` (which was implemented before) watches the
kernel state and publishes current link status.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-01 09:36:25 -07:00
Andrey Smirnov
76e38b7b82 feat: update Kubernetes to 1.21.1
See https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-13 08:05:08 -07:00
Andrey Smirnov
daf2208749 test: update upgrade tests to 0.10 release
In preparation for going 0.10 beta, start testing upgrades to 0.10, drop
0.8 and self-hosted control plane handling in the tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-09 12:57:04 -07:00
Alexey Palazhchenko
1fcf38f9d6 feat: add support for "none" CNI type
Closes #3411.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-04-09 12:53:00 -07:00
Alexey Palazhchenko
37a5edf04a feat: update Kubernetes to 1.21.0 release
See CHANGELOG:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md

Closes #3329.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-04-09 20:08:20 +03:00
Andrey Smirnov
abc2e17ebb test: update 0.9.x version in upgrade tests to 0.9.1
Version 0.9.1 contains a fix for concurrent map write on unmount which
was frequently breaking our upgrade tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-02 03:59:36 -07:00
Alexey Palazhchenko
a9451f5712 feat: update Kubernetes to 1.21.0-beta.1
See CHANGELOG:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md

Refs #3329.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-30 03:07:03 -07:00
Alexey Palazhchenko
ed272e604e feat: update Kubernetes to 1.21.0-beta.0
See CHANGELOG:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md

Refs #3329.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-24 07:36:54 -07:00
Andrey Smirnov
125b86f4ef fix: upgrade-k8s bug with empty config values and provision script
First, if the config for some component image (e.g. `apiServer`) is empty,
Talos pushes default image which is unknown to the script, so verify
that change is not no-op, as otherwise script will hang forvever waiting
for k8s control plane config update.

Second, with bootkube bootstrap it was fine to omit explicit kubernetes
version in upgrade test, but with Talos-managed that means that after
Talos upgrade Kubernetes gets upgraded as well (as Talos config doesn't
contain K8s version, and defaults are used). This is not what we want to
test actually.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-19 12:05:31 -07:00
Andrey Smirnov
f0512dfce9 feat: update Kubernetes to 1.20.5
See CHANGELOG:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#changelog-since-v1204

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-19 03:14:46 -07:00
Andrey Smirnov
ca8a5596c7 chore: fix provision tests after changes to build-container
CNI was removed from build-container which works fine for
`talosctl cluster create` clusters as it installs its own CNI, but fails
for upgrade tests as they were never updated for the CNI bundle.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-12 09:59:15 -08:00
Alexey Palazhchenko
df52c13581 chore: fix //nolint directives
That's the recommended syntax:
https://golangci-lint.run/usage/false-positives/

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-03-05 05:58:33 -08:00
Andrey Smirnov
7e8f13652c chore: fix upgrade tests by bumping 0.9 to alpha.5
Resources/types were renamed after alpha.4, so we need Talos API to
match expectations of the upgrade test built against master.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-03 13:53:06 -08:00
Andrey Smirnov
1d8ed9b5cd chore: update provision/upgrade tests to 0.9.0-alpha.3
This drops support for 0.7.x in upgrade tests, and bumps tests to use
version 0.9.0-alpha.3 as the next stable (it will eventually graduate to
0.9.0).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-03-02 07:11:16 -08:00
Artem Chernyshev
7108bb3f5b test: upgrade master to master tests
Verify upgrade flow using the same version of the installer.
Run that with disk encryption enabled.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-24 07:56:44 -08:00
Andrey Smirnov
e2f1fbcfdb feat: support control plane upgrades with Talos managed control plane
Upgrade is performed by updating node configuration (node by node, service
by service), watching internal resource state to get new configuration
version and verifying that pod with matching version successfully
propagated to the API server state and pod is ready.

Process is similar to the rolling update of the DaemonSet.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-20 11:57:32 -08:00
Andrey Smirnov
e9fc54f6e3 feat: update Kubernetes to 1.20.3
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#changelog-since-v1202

Also updater pkgs for:

* talos-systems/pkgs#238 (raspberrypi-firmware update)
* talos-systems/pkgs#242 (Linux 5.10.17 + init_on_free=0)

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-19 05:22:34 -08:00
Andrey Smirnov
7751920dba feat: add a tool and package to convert self-hosted CP to static pods
This is required to upgrade from Talos 0.8.x to 0.9.x. After the cluster
is fully upgraded, control plane is still self-hosted (as it was
bootstrapped with bootkube).

Tool `talosctl convert-k8s` (and library behind it) performs the upgrade
to self-hosted version.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-17 23:26:57 -08:00
Artem Chernyshev
02b3719df9 feat: skip filesystem for state and ephemeral partitions in the installer
Filesystem creation step is moved on the later stage: when Talos mounts
the partition for the first time.
Now it checks if the partition doesn't have any filesystem and formats
it right before mounting.

Additionally refactored mount options a bit:
- replaced separate options with a set of binary flags.
- implemented pre-mount and post-unmount hooks.

And fixed typos in couple of places and increased timeout for `apid ready`.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-17 09:37:21 -08:00
Andrey Smirnov
daea9d3811 feat: support version contract for Talos config generation
This allows to generating current version Talos configs (by default) or
backwards compatible configuration (e.g. for Talos 0.8).

`talosctl gen config` defaults to current version, but explicit version
can be passed to the command via flags.

`talosctl cluster create` defaults to install/container image version,
but that can be overridden. This makes `talosctl cluster create` now
compatible with 0.8.1 images out of the box.

Upgrade tests use contract based on source version in the test.

When used as a library, `VersionContract` can be omitted (defaults to
current version) or passed explicitly. `VersionContract` can be
convienietly parsed from Talos version string or specified as one of the
constants.

Fixes #3130

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-10 13:02:52 -08:00
Andrey Smirnov
7f3dca8e4c test: add support for IPv6 in talosctl cluster create
Modify provision library to support multiple IPs, CIDRs, gateways, which
can be IPv4/IPv6. Based on IP types, enable services in the cluster to
run DHCPv4/DHCPv6 in the test environment.

There's outstanding bug left with routes not being properly set up in
the cluster so, IPs are not properly routable, but DHCPv6 works and IPs
are allocated (validates DHCPv6 client).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-09 13:28:53 -08:00
Andrey Smirnov
edf5777222 feat: add an option to force upgrade without checks
Our upgrades are safe by default - we check etcd health, take locks,
etc. But sometimes upgrades might be a way to recover broken (or
semi-broken) cluster, in that case we need upgrade to run even if the
checks are not passing. This is not a safe way to do upgrades, but it
might be a way to recover a cluster.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-09 10:20:03 -08:00
Andrey Smirnov
2277ce8abe feat: move to ECDSA keys for all Kubernetes/etcd certs and keys
ECDSA keys are smaller which decreases Talos config size, they are more
efficient in terms of key generation, signing, etc., so it makes boot
performance better (and config generation as well).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-02 13:25:00 -08:00
Andrey Smirnov
e0a0f58801 feat: use multi-arch images for k8s and Flannel CNI
Flannel got updated to 0.13 version which has multi-arch image.

Kubernetes images are multi-arch.

Fixes #3049

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-28 08:26:02 -08:00
Andrey Smirnov
0aaf8fa968 feat: replace bootkube with Talos-managed control plane
Control plane components are running as static pods managed by the
kubelets.

Whole subsystem is managed via resources/controllers from os-runtime.

Many supporting changes/refactoring to enable new code paths.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-26 14:22:35 -08:00
Andrey Smirnov
d71ac4c4ff feat: update Kubernetes to 1.20.2
Minor point release, official changelog:

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-01-15 09:06:18 -08:00
Andrey Smirnov
f2c029a07d chore: update upgrade test version used
Now with official 0.8.0 release.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-24 18:49:29 +03:00
Andrey Smirnov
b1d4814308 feat: update Kubernetes to 1.20.1
See https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-21 23:52:29 +03:00
Andrey Smirnov
3dae6df27b test: stabilize upgrade test by running health check several times
For single node clusters, control plane is unstable after reboot, run
health check several times to let it settle down to avoid failures in
subsequent checks.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-11 08:31:01 -08:00
Andrey Smirnov
872e792dbc feat: update Kubernetes to 1.20.0
Official K8s release matching Talos 0.8.0.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-09 06:11:48 -08:00
Andrey Smirnov
350280eb59 feat: implement "staged" (failsafe/backup) upgrades
Regular upgrade path takes just one reboot, but it requires all the
processes to be stopped on the node before upgrade might proceed. Under
some circumstances and with potential Talos bugs it might not work
rendering Talos upgrades almost impossible.

Staged upgrades build upon regular install flow to run the upgrade on
the node reboot. Such upgrades require two reboots of the node, and it
requires two pulls of the installer image, but they should be much less
suspicious to the failure. Once the upgrade is staged, node can be
rebooted in any possible way, including hard reset and upgrade is
performed on the next boot.

New ADV format was implemented as well to allow to store install image
ref/options across reboots. New format allows for bigger values and
takes 50% of the `META` partition. Old ADV is still kept for
compatibility reasons.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-08 08:34:26 -08:00
Andrey Smirnov
1cf6b98fb8 test: bump Talos release version for upgrade test to 0.7.1
We should always use latest releases.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-08 18:41:28 +03:00
Andrey Smirnov
621968977e feat: update kubernetes to 1.20.0-rc.0
Talos 0.8 is going to ship with K8s 1.20.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-02 10:50:58 -08:00
Andrey Smirnov
28ba6e416e feat: update Kubernetes to v1.20.0-beta.2
Talos 0.8 is going to ship with K8s 1.20.x.

Changes to support new `control-plane` label,
upgrade-k8s supports automated fixups for 1.20.

See also: https://github.com/talos-systems/bootkube-plugin/pull/22

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-11-25 06:39:14 -08:00
Artem Chernyshev
b6874ee82a feat: add TUI based talos interactive installer
This is initial commit of the installer.
What's done:
- verifying node availability before starting any operations.
- gathering information about disks on the machine.
- allows setting: install disk, hostname, machine type, installer image,
  kubernetes version, dns domain, cluster-name.
- dumps/merges talosconfig to a file after applying configuration.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-11-18 12:34:15 -08:00