56 Commits

Author SHA1 Message Date
Noel Georgi
8fe39eacba
chore: move csi tests as go test
Move rook-ceph CSI tests as go tests.
This allows us to add more CSI tests in the future.

Fixes: 

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-26 18:18:09 +05:30
Noel Georgi
36f83eea9f
chore: make qemu check flag consistent with code
Restructure code as per changes from .

This makes the flag name to be in sync with what it actually does.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-20 20:33:56 +05:30
Noel Georgi
2ac8d2274f
chore: support unsupported flag for mkfs
Support `unsupported` flag for mkfs, so that `STATE` partition with size
less than 300M can be created by `mkfs.xfs`.

This allows to bring in newer `xfsprogs` that can repair corrupted FS
better.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-08 20:21:02 +05:30
Noel Georgi
50e5f37efb
chore: add test for apparmor
Add a test that verifies pods can be scheduled with `RuntimeDefault`
apparmor profile.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-07-30 20:24:57 +05:30
Noel Georgi
8ee0872683
chore(ci): drop crashdump, save logs as artifacts
Drop `--crashdump` and save talos cluster logs as artifacts.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-06-07 10:52:05 +08:00
Noel Georgi
9c3ebad9fd
chore(ci): kresify gh actions
Kresify, only handle gh workflows.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-05-22 00:17:09 +05:30
Andrey Smirnov
84cd7dbec4
feat: update Linux to 6.6.29
Pull in fixes for cloud-image-uploader from #8667.:w

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-05-01 15:59:04 +04:00
Dmitriy Matrenichev
8dc4910c48
chore: enable "WG over GRPC" testing in siderolink agent tests
Fixes https://github.com/siderolabs/talos/issues/8514
For https://github.com/siderolabs/talos/issues/8392

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-04-01 18:24:57 +03:00
Dmitriy Matrenichev
949ad11a2d
chore: import siderolink as siderolink-launch subcommand
This PR ensures that we can test our siderolink communication using embedded siderolink-agent.
If `--with-siderolink` provided during `talos cluster create` talosctl will embed proper kernel string and setup `siderolink-agent` as a separate process. It should be used with combination of `--skip-injecting-config` and `--with-apply-config` (the latter will use newly generated IPv6 siderolink addresses which talosctl passes to the agent as a "pre-bind").

Fixes 

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-03-23 16:08:56 +03:00
Andrey Smirnov
36c8ddb5e1
feat: implement ingress firewall rules
Fixes 

See documentation for details on how to use the feature.

With `talosctl cluster create`, firewall can be easily test with
`--with-firewall=accept|block` (default mode).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-11-30 22:58:16 +04:00
Andrey Smirnov
2b548ad0d9
feat: update containerd to 1.7.x
Also update Linux and other pkgs.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-09-28 16:33:57 +04:00
Andrey Smirnov
390137447f
feat: enable KubePrism by default
Fixes 

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-09-25 23:12:33 +04:00
Noel Georgi
6b0373ebef
chore: move bash tests to integration
move extensions and secureboot tests to integration.
Makes it easier to test.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-08-17 19:58:35 +05:30
Andrey Smirnov
87fe8f1a2a
feat: implement image generation profiles
Support full configuration for image generation, including image
outputs, support most features (where applicable) for all image output
types, unify image generation process.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-08-02 19:13:44 +04:00
Noel Georgi
209c34801e
chore: drop with-secureboot talosctl flag
The code picks up firmware files in the order it's defined. The
secureboot QEMU firmware files are defined first, so this flag is a
no-op. This was leftover from when `ovmfctl` was used.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-31 17:33:12 +04:00
Dmitriy Matrenichev
5f34f5b41f
chore: rename api load balancer to KubePrism
Closes 

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-07-14 15:23:53 +03:00
Noel Georgi
79365d9bac
feat: tpm2 based disk encryption
Support disk encryption using tpm2 and pre-calculated signed PCR values.

Fixes: 

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-12 20:41:28 +05:30
Artem Chernyshev
ce63abb219
feat: add KMS assisted encryption key handler
Talos now supports new type of encryption keys which rely on Sealing/Unsealing randomly generated bytes with a KMS server:

```
systemDiskEncryption:
  ephemeral:
    keys:
      - kms:
          endpoint: https://1.2.3.4:443
        slot: 0
```
gRPC API definitions and a simple reference implementation of the KMS server can be found in this
[repository](https://github.com/siderolabs/kms-client/blob/main/cmd/kms-server/main.go).

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-07-07 19:02:39 +03:00
Noel Georgi
e5306ef263
chore: format and cleanup test scripts
This formats and cleanups the test scripts.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-06-27 16:53:40 +05:30
Andrey Smirnov
fe0f46980f
feat: implement secure boot from disk
This includes sd-boot handling, EFI variables, etc.

There are some TODOs which need to be addressed to make things smooth.

Install to disk, upgrades work.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-16 20:15:16 +05:30
Dmitriy Matrenichev
445f5ad542
feat: support API server load balancer
This commit adds support for API load balancer. Quick way to enable it is during cluster creation using new `api-server-balancer-port` flag (0 by default - disabled). When enabled all API request will be routed across
cluster control plane endpoints.

Closes 

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-06-16 10:09:20 -04:00
Dmitriy Matrenichev
665702ddd3
chore: fix cilium e2e tests
`WITH_CONFIG_PATCH_WORKER` check result was overriding any value set in `CONFIG_PATCH_FLAG` variable.
Move it to the different variable.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-14 15:08:31 +04:00
Christian Rolland
e6dde8ffc5
feat: add network chaos to qemu development environment
Add flags for configuring the qemu bridge interface with chaos options:
- network-chaos-enabled
- network-jitter
- network-latency
- network-packet-loss
- network-packet-reorder
- network-packet-corrupt
- network-bandwidth

These flags are used in /pkg/provision/providers/vm/network.go at the end of the CreateNetwork function to first see if the network-chaos-enabled flag is set, and then check if bandwidth is set.  This will allow developers to simulate clusters having a degraded WAN connection in the development environment and testing pipelines.

If bandwidth is not set, it will then enable the other options.
- Note that if bandwidth is set, the other options such as jitter, latency, packet loss, reordering and corruption will not be used.  This is for two reasons:
	- Restriction the bandwidth can often intoduce many of the other issues being set by the other options.
	- Setting the bandwidth uses a separate queuing discipline (Token Bucket Filter) from the other options (Network Emulator) and requires a much more complex configuration using a Heirarchial Token Bucket Filter which cannot be configured at a granular enough level using the vishvananda/netlink library.

Adding both queuing disciplines to the same interface may be an option to look into in the future, but would take more extensive testing and control over many more variables which I believe is out of the scope of this PR.  It is also possible to add custom profiles, but will also take more research to develop common scenarios which combine different options in a realistic manner.

Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-06-06 20:15:26 +04:00
Noel Georgi
3b36993b99
fix: rlimit nofile test
The test was added at the wrong place.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-05-12 16:20:52 +05:30
Andrey Smirnov
d30cf9c86e
test: fix misprint in e2e scripts
This bug breaks `e2e-extensions`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-24 15:28:18 +04:00
Noel Georgi
a78281214d
feat: add cilium e2e tests
Add cilium e2e tests. The existing cilium check was very old, update to
latest cilium version and also add a test for KPR strict mode.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-03-03 20:03:25 +05:30
Noel Georgi
5a01d5fd47
chore: run extension build as downstream
Run extensions build as downstream

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-27 20:11:10 +05:30
Noel Georgi
2d01480180
feat: automatically load modules based on hw info
Fixes: 

Automatically load kernel modules based on hardware info and modules
alias info. udevd would automatically load modules based on HW
information present.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-14 19:57:13 +05:30
Andrey Smirnov
2f2d97b6b5
fix: don't wait for the hostname in maintenance mode
Fixes 

With new stable default hostname feature, any default hostname is
disabled until the machine config is available.

Talos enters maintenance mode when the default config source is empty,
so it doesn't have any machine config available at the moment
maintenance service is started.

Hostname might be set via different sources, e.g. kernel args or via
DHCP before the machine config is available, but if all these sources
are not available, hostname won't be set at all.

This stops waiting for the hostname, and skips setting any DNS names in
the maintenance mode certificate SANs if the hostname is not available.

Also adds a regression test via new `--disable-dhcp-hostname` flag to
`talosctl cluster create`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-23 17:52:20 +04:00
Noel Georgi
b62b18a972
feat: bump k8s to v1.25.0-beta.0
Bump k8s to v1.25.0-beta.0

Update most kubernetes `master` references to `controlplane`

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-08-10 22:17:53 +05:30
Andrey Smirnov
da2985fe1b
fix: respect local API server port
It wasn't used when building an endpoint to the local API server, so
Talos couldn't talk to the local API server when port was changed from
the default one.

Fixes 

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-09 00:33:49 +04:00
Andrey Smirnov
09efa62f68
chore: re-enable kexec and default to UEFI booting in tests
Fixes 

It turns out there's something related to boot process in BIOS mode
which leads to initramfs corruption on later `kexec`.

Booting via GRUB is always successful.

Problem with kexec was confirmed with:

* direct boot via QEMU
* QEMU boot via iPXE (bundled with QEMU)

The root cause is not known, but the only visible difference is the
placement of RAMDISK with UEFI and BIOS boots:

```
[    0.005508] RAMDISK: [mem 0x312dd000-0x34965fff]
```

or:

```
[    0.003821] RAMDISK: [mem 0x711aa000-0x747a7fff]
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-02 21:52:18 +03:00
Noel Georgi
151c9df091
chore: add CSI tests for e2e-qemu
Add tests for using rook as CSI for e2e-qemu
Allow specifying cpu/memory for workers

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-01-27 20:06:10 +05:30
Andrey Smirnov
0bf161dffb
test: add integration test for system extensions
This verifies system extensions via the gVisor system extension.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-01-26 23:29:15 +03:00
Florian Klink
a50c42980f
fix: use #!/usr/bin/env bash as shebang instead of #!/bin/bash
This will fix running these scripts on distros without /bin/bash, but
where bash is in $PATH, such as NixOS.

Currently, `make fmt` otherwise fails to run:

```
make[3]: Leaving directory '/home/flokli/dev/numtide/manifoldfinance/talos'
sh: ./hack/fix-artifacts.sh: /bin/bash: bad interpreter: No such file or directory
make[2]: *** [Makefile:163: local-fmt-protobuf] Error 126
make[2]: Leaving directory '/home/flokli/dev/numtide/manifoldfinance/talos'
make[1]: *** [Makefile:274: fmt-protobuf] Error 2
make[1]: Leaving directory '/home/flokli/dev/numtide/manifoldfinance/talos'
make: *** [Makefile:277: fmt] Error 2
```

Signed-off-by: Florian Klink <flokli@flokli.de>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-01-25 23:11:39 +03:00
Andrey Smirnov
6dcce20e6f
test: set proper pod CIDR for Cilium tests
This fixes the issue with kubelet picking up wrong IP on restart, as
Talos doesn't know pod IPs (Cilium is using its own pod CIDR, it doesn't
look up Kubernetes settings).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-11-15 23:50:00 +03:00
Spencer Smith
c8e404e356
test: update vars for AWS cluster
This PR updates to use the newest var setup from our capi templates.

Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
2021-10-19 16:33:52 -04:00
Andrey Smirnov
68c420e3c9
feat: enable cluster discovery by default
This enables cluster discovery by default for Talos 0.14. KubeSpan is
not enabled by default.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-10-15 14:46:32 +03:00
Alexey Palazhchenko
e030b2e8bb chore: use k8s 1.21.3 in CAPI tests for now
Refs .

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-08-10 13:28:37 -07:00
Andrey Smirnov
e7a9164b1e test: implement talosctl conformance command to run e2e tests
Command implements two modes:

* `fast`: conformance suite is run at maximum speed
* `certified`: conformance suite is run in serial mode, results
  are capture to produce artifacts ready for CNCF submission process

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-04-16 09:17:51 -07:00
Andrey Smirnov
d7cdc8cc15 feat: implement simple layer 2 shared IP for CP
This adds a VIP (virtual IP) option to the network configuration of an
interface, which will allow a set of nodes to share a floating IP
address among them.  For now, this is restricted to control plane use
and only a single shared IP is supported.

Fixes 

Signed-off-by: Seán C McCord <ulexus@gmail.com>
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-26 14:14:34 -08:00
Andrey Smirnov
84ad6cbb1a chore: switch CI to stop embedding local registry into the builds
This adds new `IMAGE_REGISTRY` variable (similar to `IMAGE_TAG`) which
affects only the registry image gets pushed to, but it's not built into
the binaries and images as a default registry.

This fixes a problem when release builds reference our CI local
registry.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-02-24 18:05:37 +03:00
Artem Chernyshev
54d6a45217 feat: add state encryption support
State partition encryption support adds a new section to the machine config.
And a new step to the sequencer flow which saves encryption
configuration object as json serialized value in the META partition.

Everything else is the same as is for the ephemeral partition.
Additionally enabled state partition encryption in the disk encryption
integration tests.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-18 06:55:22 -08:00
Artem Chernyshev
58ff2c9808 feat: implement ephemeral partition encryption
This PR introduces the first part of disk encryption support.
New config section `systemDiskEncryption` was added into MachineConfig.
For now it contains only Ephemeral partition encryption.

Encryption itself supports two kinds of keys for now:
- node id deterministic key.
- static key which is hardcoded in the config and mainly used for test
purposes.

Talosctl cluster create can now be told to encrypt ephemeral partition
by using `--encrypt-ephemeral` flag.

Additionally:
- updated pkgs library version.
- changed Dockefile to copy cryptsetup deps from pkgs.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-02-17 13:39:04 -08:00
Artem Chernyshev
73c81c501e fix: pass disk image flags to e2e-qemu cluster create command
Forgot to add it in the original PR.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-12-22 23:57:31 -08:00
Artem Chernyshev
6540e9bf70 feat: support disk image in talosctl cluster create
Fixes: https://github.com/talos-systems/talos/issues/2973

Can now supply disk image using `--disk-image-path` flag.
May need to enable `--with-apply-config` if it's necessary to bootstrap
nodes properly.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-12-22 17:06:00 +03:00
Andrey Smirnov
9d1ac81be5 chore: lower MTU to 1450 for the tests in the CI
This should help with the CNI encapsulation in the cluster.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-12-17 17:14:07 +03:00
Andrey Smirnov
350d75eb46 feat: build talosctl-cni-bundle, use it in talosctl for QEMU
This builds a bundle with CNI plugins for talosctl which is
automatically downloaded by `talosctl` if CNI plugins are missing.

CNI directories are moved by default to the `~/.talos/cni` path.

Also add a bunch of pre-flight checks to the QEMU provisioner to make it
easier to bootstrap the Talos QEMU cluster.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-10-30 16:30:37 -07:00
Artem Chernyshev
061b296530 feat: allow specifying user-disks in talosctl cluster create
User-disks are supported by QEMU and Firecracker providers.
Can be defined by using the following parameters:
```
--user-disk /mount/path:1GB
```

Can get more than 1 user disk.
Same set of user disks will be created for all master and worker nodes.

Additionally enable user-disks in qemu e2e test.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-10-30 08:44:08 -07:00
Andrey Smirnov
ff0d4b305a feat: build Talos images/artifacts for amd64/arm64
By default, build outside of Drone works the same and builds only amd64
version, loads images back into dockerd, etc.

If multiple platforms are used, multi-arch images are built which can't
be exported to docker or to `.tar` image, they're always pushed to the
registry (even for PR builds to our internal CI registry).

Artifacts as files (initramfs, kernel) now have `-arch` suffix:
`vmlinuz-amd64`, `initramfs-amd64.xz`. "Magic" script normalizes output
paths depending on whether single platform or multiple platforms were
given.

VM provisioners accept magic `${ARCH}` in initramfs/kernel paths which
gets replaced by cluster architecture.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-09-27 10:32:07 -07:00