48 Commits

Author SHA1 Message Date
Dmitriy Matrenichev
949ad11a2d
chore: import siderolink as siderolink-launch subcommand
This PR ensures that we can test our siderolink communication using embedded siderolink-agent.
If `--with-siderolink` provided during `talos cluster create` talosctl will embed proper kernel string and setup `siderolink-agent` as a separate process. It should be used with combination of `--skip-injecting-config` and `--with-apply-config` (the latter will use newly generated IPv6 siderolink addresses which talosctl passes to the agent as a "pre-bind").

Fixes #8392

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-03-23 16:08:56 +03:00
Andrey Smirnov
36c8ddb5e1
feat: implement ingress firewall rules
Fixes #4421

See documentation for details on how to use the feature.

With `talosctl cluster create`, firewall can be easily test with
`--with-firewall=accept|block` (default mode).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-11-30 22:58:16 +04:00
Andrey Smirnov
2b548ad0d9
feat: update containerd to 1.7.x
Also update Linux and other pkgs.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-09-28 16:33:57 +04:00
Andrey Smirnov
390137447f
feat: enable KubePrism by default
Fixes #7787

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-09-25 23:12:33 +04:00
Noel Georgi
6b0373ebef
chore: move bash tests to integration
move extensions and secureboot tests to integration.
Makes it easier to test.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-08-17 19:58:35 +05:30
Andrey Smirnov
87fe8f1a2a
feat: implement image generation profiles
Support full configuration for image generation, including image
outputs, support most features (where applicable) for all image output
types, unify image generation process.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-08-02 19:13:44 +04:00
Noel Georgi
209c34801e
chore: drop with-secureboot talosctl flag
The code picks up firmware files in the order it's defined. The
secureboot QEMU firmware files are defined first, so this flag is a
no-op. This was leftover from when `ovmfctl` was used.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-31 17:33:12 +04:00
Dmitriy Matrenichev
5f34f5b41f
chore: rename api load balancer to KubePrism
Closes #7432

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-07-14 15:23:53 +03:00
Noel Georgi
79365d9bac
feat: tpm2 based disk encryption
Support disk encryption using tpm2 and pre-calculated signed PCR values.

Fixes: #7266

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-12 20:41:28 +05:30
Artem Chernyshev
ce63abb219
feat: add KMS assisted encryption key handler
Talos now supports new type of encryption keys which rely on Sealing/Unsealing randomly generated bytes with a KMS server:

```
systemDiskEncryption:
  ephemeral:
    keys:
      - kms:
          endpoint: https://1.2.3.4:443
        slot: 0
```
gRPC API definitions and a simple reference implementation of the KMS server can be found in this
[repository](https://github.com/siderolabs/kms-client/blob/main/cmd/kms-server/main.go).

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-07-07 19:02:39 +03:00
Noel Georgi
e5306ef263
chore: format and cleanup test scripts
This formats and cleanups the test scripts.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-27 16:53:40 +05:30
Andrey Smirnov
fe0f46980f
feat: implement secure boot from disk
This includes sd-boot handling, EFI variables, etc.

There are some TODOs which need to be addressed to make things smooth.

Install to disk, upgrades work.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-16 20:15:16 +05:30
Dmitriy Matrenichev
445f5ad542
feat: support API server load balancer
This commit adds support for API load balancer. Quick way to enable it is during cluster creation using new `api-server-balancer-port` flag (0 by default - disabled). When enabled all API request will be routed across
cluster control plane endpoints.

Closes #7191

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-06-16 10:09:20 -04:00
Dmitriy Matrenichev
665702ddd3
chore: fix cilium e2e tests
`WITH_CONFIG_PATCH_WORKER` check result was overriding any value set in `CONFIG_PATCH_FLAG` variable.
Move it to the different variable.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-14 15:08:31 +04:00
Christian Rolland
e6dde8ffc5
feat: add network chaos to qemu development environment
Add flags for configuring the qemu bridge interface with chaos options:
- network-chaos-enabled
- network-jitter
- network-latency
- network-packet-loss
- network-packet-reorder
- network-packet-corrupt
- network-bandwidth

These flags are used in /pkg/provision/providers/vm/network.go at the end of the CreateNetwork function to first see if the network-chaos-enabled flag is set, and then check if bandwidth is set.  This will allow developers to simulate clusters having a degraded WAN connection in the development environment and testing pipelines.

If bandwidth is not set, it will then enable the other options.
- Note that if bandwidth is set, the other options such as jitter, latency, packet loss, reordering and corruption will not be used.  This is for two reasons:
	- Restriction the bandwidth can often intoduce many of the other issues being set by the other options.
	- Setting the bandwidth uses a separate queuing discipline (Token Bucket Filter) from the other options (Network Emulator) and requires a much more complex configuration using a Heirarchial Token Bucket Filter which cannot be configured at a granular enough level using the vishvananda/netlink library.

Adding both queuing disciplines to the same interface may be an option to look into in the future, but would take more extensive testing and control over many more variables which I believe is out of the scope of this PR.  It is also possible to add custom profiles, but will also take more research to develop common scenarios which combine different options in a realistic manner.

Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-06 20:15:26 +04:00
Noel Georgi
3b36993b99
fix: rlimit nofile test
The test was added at the wrong place.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-05-12 16:20:52 +05:30
Andrey Smirnov
d30cf9c86e
test: fix misprint in e2e scripts
This bug breaks `e2e-extensions`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-24 15:28:18 +04:00
Noel Georgi
a78281214d
feat: add cilium e2e tests
Add cilium e2e tests. The existing cilium check was very old, update to
latest cilium version and also add a test for KPR strict mode.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-03-03 20:03:25 +05:30
Noel Georgi
5a01d5fd47
chore: run extension build as downstream
Run extensions build as downstream

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-27 20:11:10 +05:30
Noel Georgi
2d01480180
feat: automatically load modules based on hw info
Fixes: #6802

Automatically load kernel modules based on hardware info and modules
alias info. udevd would automatically load modules based on HW
information present.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-14 19:57:13 +05:30
Andrey Smirnov
2f2d97b6b5
fix: don't wait for the hostname in maintenance mode
Fixes #6119

With new stable default hostname feature, any default hostname is
disabled until the machine config is available.

Talos enters maintenance mode when the default config source is empty,
so it doesn't have any machine config available at the moment
maintenance service is started.

Hostname might be set via different sources, e.g. kernel args or via
DHCP before the machine config is available, but if all these sources
are not available, hostname won't be set at all.

This stops waiting for the hostname, and skips setting any DNS names in
the maintenance mode certificate SANs if the hostname is not available.

Also adds a regression test via new `--disable-dhcp-hostname` flag to
`talosctl cluster create`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-23 17:52:20 +04:00
Noel Georgi
b62b18a972
feat: bump k8s to v1.25.0-beta.0
Bump k8s to v1.25.0-beta.0

Update most kubernetes `master` references to `controlplane`

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-08-10 22:17:53 +05:30
Andrey Smirnov
da2985fe1b
fix: respect local API server port
It wasn't used when building an endpoint to the local API server, so
Talos couldn't talk to the local API server when port was changed from
the default one.

Fixes #5706

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-09 00:33:49 +04:00
Andrey Smirnov
09efa62f68
chore: re-enable kexec and default to UEFI booting in tests
Fixes #4947

It turns out there's something related to boot process in BIOS mode
which leads to initramfs corruption on later `kexec`.

Booting via GRUB is always successful.

Problem with kexec was confirmed with:

* direct boot via QEMU
* QEMU boot via iPXE (bundled with QEMU)

The root cause is not known, but the only visible difference is the
placement of RAMDISK with UEFI and BIOS boots:

```
[    0.005508] RAMDISK: [mem 0x312dd000-0x34965fff]
```

or:

```
[    0.003821] RAMDISK: [mem 0x711aa000-0x747a7fff]
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-02 21:52:18 +03:00
Noel Georgi
151c9df091
chore: add CSI tests for e2e-qemu
Add tests for using rook as CSI for e2e-qemu
Allow specifying cpu/memory for workers

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-01-27 20:06:10 +05:30
Andrey Smirnov
0bf161dffb
test: add integration test for system extensions
This verifies system extensions via the gVisor system extension.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-01-26 23:29:15 +03:00
Florian Klink
a50c42980f
fix: use #!/usr/bin/env bash as shebang instead of #!/bin/bash
This will fix running these scripts on distros without /bin/bash, but
where bash is in $PATH, such as NixOS.

Currently, `make fmt` otherwise fails to run:

```
make[3]: Leaving directory '/home/flokli/dev/numtide/manifoldfinance/talos'
sh: ./hack/fix-artifacts.sh: /bin/bash: bad interpreter: No such file or directory
make[2]: *** [Makefile:163: local-fmt-protobuf] Error 126
make[2]: Leaving directory '/home/flokli/dev/numtide/manifoldfinance/talos'
make[1]: *** [Makefile:274: fmt-protobuf] Error 2
make[1]: Leaving directory '/home/flokli/dev/numtide/manifoldfinance/talos'
make: *** [Makefile:277: fmt] Error 2
```

Signed-off-by: Florian Klink <flokli@flokli.de>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-01-25 23:11:39 +03:00
Andrey Smirnov
6dcce20e6f
test: set proper pod CIDR for Cilium tests
This fixes the issue with kubelet picking up wrong IP on restart, as
Talos doesn't know pod IPs (Cilium is using its own pod CIDR, it doesn't
look up Kubernetes settings).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-11-15 23:50:00 +03:00
Spencer Smith
c8e404e356
test: update vars for AWS cluster
This PR updates to use the newest var setup from our capi templates.

Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
2021-10-19 16:33:52 -04:00
Andrey Smirnov
68c420e3c9
feat: enable cluster discovery by default
This enables cluster discovery by default for Talos 0.14. KubeSpan is
not enabled by default.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-10-15 14:46:32 +03:00
Alexey Palazhchenko
e030b2e8bb chore: use k8s 1.21.3 in CAPI tests for now
Refs #4046.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-08-10 13:28:37 -07:00
Andrey Smirnov
e7a9164b1e test: implement talosctl conformance command to run e2e tests
Command implements two modes:

* `fast`: conformance suite is run at maximum speed
* `certified`: conformance suite is run in serial mode, results
  are capture to produce artifacts ready for CNCF submission process

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-16 09:17:51 -07:00
Andrey Smirnov
d7cdc8cc15 feat: implement simple layer 2 shared IP for CP
This adds a VIP (virtual IP) option to the network configuration of an
interface, which will allow a set of nodes to share a floating IP
address among them.  For now, this is restricted to control plane use
and only a single shared IP is supported.

Fixes #3111

Signed-off-by: Seán C McCord <ulexus@gmail.com>
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-26 14:14:34 -08:00
Andrey Smirnov
84ad6cbb1a chore: switch CI to stop embedding local registry into the builds
This adds new `IMAGE_REGISTRY` variable (similar to `IMAGE_TAG`) which
affects only the registry image gets pushed to, but it's not built into
the binaries and images as a default registry.

This fixes a problem when release builds reference our CI local
registry.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-24 18:05:37 +03:00
Artem Chernyshev
54d6a45217 feat: add state encryption support
State partition encryption support adds a new section to the machine config.
And a new step to the sequencer flow which saves encryption
configuration object as json serialized value in the META partition.

Everything else is the same as is for the ephemeral partition.
Additionally enabled state partition encryption in the disk encryption
integration tests.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-18 06:55:22 -08:00
Artem Chernyshev
58ff2c9808 feat: implement ephemeral partition encryption
This PR introduces the first part of disk encryption support.
New config section `systemDiskEncryption` was added into MachineConfig.
For now it contains only Ephemeral partition encryption.

Encryption itself supports two kinds of keys for now:
- node id deterministic key.
- static key which is hardcoded in the config and mainly used for test
purposes.

Talosctl cluster create can now be told to encrypt ephemeral partition
by using `--encrypt-ephemeral` flag.

Additionally:
- updated pkgs library version.
- changed Dockefile to copy cryptsetup deps from pkgs.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-17 13:39:04 -08:00
Artem Chernyshev
73c81c501e fix: pass disk image flags to e2e-qemu cluster create command
Forgot to add it in the original PR.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-12-22 23:57:31 -08:00
Artem Chernyshev
6540e9bf70 feat: support disk image in talosctl cluster create
Fixes: https://github.com/talos-systems/talos/issues/2973

Can now supply disk image using `--disk-image-path` flag.
May need to enable `--with-apply-config` if it's necessary to bootstrap
nodes properly.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-12-22 17:06:00 +03:00
Andrey Smirnov
9d1ac81be5 chore: lower MTU to 1450 for the tests in the CI
This should help with the CNI encapsulation in the cluster.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-17 17:14:07 +03:00
Andrey Smirnov
350d75eb46 feat: build talosctl-cni-bundle, use it in talosctl for QEMU
This builds a bundle with CNI plugins for talosctl which is
automatically downloaded by `talosctl` if CNI plugins are missing.

CNI directories are moved by default to the `~/.talos/cni` path.

Also add a bunch of pre-flight checks to the QEMU provisioner to make it
easier to bootstrap the Talos QEMU cluster.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-10-30 16:30:37 -07:00
Artem Chernyshev
061b296530 feat: allow specifying user-disks in talosctl cluster create
User-disks are supported by QEMU and Firecracker providers.
Can be defined by using the following parameters:
```
--user-disk /mount/path:1GB
```

Can get more than 1 user disk.
Same set of user disks will be created for all master and worker nodes.

Additionally enable user-disks in qemu e2e test.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-10-30 08:44:08 -07:00
Andrey Smirnov
ff0d4b305a feat: build Talos images/artifacts for amd64/arm64
By default, build outside of Drone works the same and builds only amd64
version, loads images back into dockerd, etc.

If multiple platforms are used, multi-arch images are built which can't
be exported to docker or to `.tar` image, they're always pushed to the
registry (even for PR builds to our internal CI registry).

Artifacts as files (initramfs, kernel) now have `-arch` suffix:
`vmlinuz-amd64`, `initramfs-amd64.xz`. "Magic" script normalizes output
paths depending on whether single platform or multiple platforms were
given.

VM provisioners accept magic `${ARCH}` in initramfs/kernel paths which
gets replaced by cluster architecture.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-09-27 10:32:07 -07:00
Andrew Rynhard
7d2741fc4b chore: migrate to ghcr.io
Move to GHCR.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-09-23 15:06:30 -07:00
Andrey Smirnov
59adf7315d feat: provide option to run Talos under UEFI in QEMU
This also adds integration pipeline tests for UEFI.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-28 12:51:10 -07:00
Andrey Smirnov
55f3249783 test: use registry mirrors in CI
This relies on registry caching mirrors running in the CI.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-31 16:30:41 +03:00
Andrey Smirnov
58aa2b75bb test: destroy clusters in e2e tests (qemu/firecracker)
As the build runs inside containers which are part of a single pod, we
need to clean up networking bits (bridge interface, etc.), so that it
doesn't cause problems for other steps.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-31 06:21:09 -07:00
Andrey Smirnov
a5d64d97c1 test: update qemu/firecracker provisioners
Fixes #2363 #2364 #2370 #2371

Several changes packed together:

* use compressed `vmlinuz` everywhere, firecracker provisioner
uncompresses it before first use, drop `vmlinux`

* handle reboots in qemu launcher to support reset API case, update
empty disk check to handle reset behavior (erasing partition table)

* make bootloader support default in provisioners, and flag to disable
that

* early support for target architecture for qemu provisioner

This should allow us to use `qemu` in CI/CD (not included into this PR):
integration test passes with qemu.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-30 21:17:25 +03:00
Artem Chernyshev
c6eb18eed5 feat: qemu provisioner
Starts and stops qemu VMs, has some initial configuration subset.
Sets up networking through CNI tools, sets up DHCP server which gives IP
addresses to nodes.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-07-28 14:55:35 -07:00