2137 Commits

Author SHA1 Message Date
Andrey Smirnov
545f75fd7a
feat: acquire machine config inline from kernel cmdline
Fixes #9175

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-06 19:41:47 +04:00
Noel Georgi
361283401e
chore: version specific kube-scheduler health checks
Use K8s version specific kube-scheduler health checks.

Ref: https://github.com/siderolabs/go-kubernetes/pull/17

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-09-06 19:47:47 +05:30
Andrey Smirnov
dd4185b144
feat: add KubeSpan extra endpoint configuration
Fixes #9174

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-06 14:50:12 +04:00
Andrey Smirnov
3038ccfa88
feat: add configuration for EPHEMERAL volume
Fixes #9261

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-06 14:11:35 +04:00
Andrey Smirnov
07b91797ca
fix: report internally service as unhealthy if not running
Otherwise the internal code might assume that the service is still
running and healthy, never issuing a health change event.

Fixes #9271

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-09-04 22:43:31 +04:00
Andrey Smirnov
6f7c3a8e5c
fix: build of talosctl on non-Linux arches
Move META constants out to machinery, and fix up imports. The internal
`pkg/meta` package shold not be consumed in public-facing commands.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-30 22:17:38 +04:00
Andrey Smirnov
b453385bd9
feat: support volume configuration, provisioning, etc
This implements the first round of changes, replacing the volume backend
with the new implementation, while keeping most of the external
interfaces intact.

See #8367

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-30 18:32:34 +04:00
Noel Georgi
b6b16b35fb
chore: pause sequencer when talos installed and iso booted
Pause sequencer till the boot timeout if talos is booted from ISO/PXE, but
an existing talos is installed to disk and
`talos.iso.boot.halt_if_installed` kernel argument is set.

Fixes: #9232

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-30 18:11:13 +05:30
Andrey Smirnov
be2ebf6b4d
chore: bump dependencies
Update tools, pkgs, extras, Go dependencies, Go tools, etc.

Linux 6.6.47 and containerd 2.0.0-rc.4.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-29 20:44:37 +04:00
Noel Georgi
88601bff4e
chore: drop calico from interactive installer
Drop calico from interactive installer.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-28 19:57:22 +05:30
Noel Georgi
19a44c2b0b
chore: drop console ttyS0 argument
Drop `console=ttyS0` argument for metal images/installer.

`console=ttyS0` causes lot of issues with bare metal hardware when
trying to use a physical serial port.

Ref:

* https://bugzilla.redhat.com/show_bug.cgi?id=1839923
* https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763601;msg=17
* https://www.kernel.org/doc/html/latest/admin-guide/serial-console.html
* https://github.com/coreos/fedora-coreos-tracker/issues/567

Fixes: #8695
Fixes: #8657
Fixes: #8127

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-27 22:24:59 +05:30
Claus Albøge
75cecb4210
feat: add Apache Cloudstack support
Add support for new platform.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: Claus Albøge <ca@netic.dk>
2024-08-27 18:18:03 +04:00
Andrey Smirnov
a9551b7caa
fix: host DNS access with firewall enabled
Explicitly enable access to host DNS from pod/service IPs.

Also fix the Kubernetes health checks to assert number of ready pods to
match expectation, otherwise the check might skip a pod (e.g.
`kube-proxy` one) which is not ready, allowing the test to proceed too
early.

Update DNS test to print more logs on error.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-27 15:44:14 +04:00
Dmitry Sharshakov
4834a61a8e
feat: report SELinux labels
This will be useful for debugging SELinux implementation. Make API report other xattrs for further development like IMA/EVM

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-08-26 16:19:38 +03:00
Noel Georgi
8fe39eacba
chore: move csi tests as go test
Move rook-ceph CSI tests as go tests.
This allows us to add more CSI tests in the future.

Fixes: #9135

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-26 18:18:09 +05:30
Andrey Smirnov
5ff6cf82ca
fix: drop /opt mount for containers/tink
The `/opt/cni/bin` in the rootfs contains CNI binaries, which get
overwritten by the volume mount.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-22 20:39:52 +04:00
Utku Ozdemir
3041d90751
fix: always handle PermissionDenied in dashboard resource watches
A single resource not being there (i.e., the type does not exist on an older version of Talos) or not allowed to be read for whatever reason should not interrupt the refresh cycle of the other resources' status.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2024-08-20 22:25:19 +02:00
Andrey Smirnov
ee4290f684
fix: bind HostDNS to 169.254.x link-local address
This is an attempt to fix many issues related with trying to use Service
IP for host DNS.

Fixes #9196

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-19 18:44:35 +04:00
Dmitriy Matrenichev
45cc8688a1
chore: replace if blocks with min/max functions
Simplify code where possible.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-08-16 10:40:44 +03:00
Dmitriy Matrenichev
a5bd770bf9
fix: retry with another upstream if the previous failed
Do not return response to the client if we got SERVFAIL or REFUSED,
until we run out of upstreams.

Fixes #9143

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-08-14 22:19:10 +03:00
Andrey Smirnov
3c36c41a91
feat: provide device extra settle timeout
Fixes #9092

This is a workaround for broken hardware drivers (e.g. RAID
controllers), which report settled event too early.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-14 17:36:45 +04:00
Andrey Smirnov
61a1c946bf
feat: bundle (some) CNI plugins with Talos core
Fixes https://github.com/siderolabs/extensions/issues/448

Bundle some CNI standard plugins plus Flannel CNI plugin (as Flannel is
the default CNI in Talos) in the Talos `initramfs`.

With this change, no plugin install is required, so the `install-cni`
step is dropped from the Flannel default manifest.

The bundled plugins:

```
$ talosctl -n 172.20.0.2 ls -lH /opt/cni/bin/
NODE         MODE         UID   GID   SIZE(B)   LASTMOD       NAME
172.20.0.2   drwxr-xr-x   0     0     109 B     7 hours ago   .
172.20.0.2   -rwxr-xr-x   0     0     3.2 MB    7 hours ago   bridge
172.20.0.2   -rwxr-xr-x   0     0     3.3 MB    7 hours ago   firewall
172.20.0.2   -rwxr-xr-x   0     0     2.4 MB    7 hours ago   flannel
172.20.0.2   -rwxr-xr-x   0     0     2.4 MB    7 hours ago   host-local
172.20.0.2   -rwxr-xr-x   0     0     2.4 MB    7 hours ago   loopback
172.20.0.2   -rwxr-xr-x   0     0     2.8 MB    7 hours ago   portmap
```

The `initramfs` for amd64 grows 67 -> 73 MiB with this change.

The path `/opt/cni/bin` is still an overlay mount, so extra plugins can
be dropped to this directory (no change here).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-14 14:33:18 +04:00
Noel Georgi
091da163b7
chore: support arm64 kexec from zboot kernel images
When using kernel images that are using ZBOOT for arm64 we need to
extract the vmlinux from the vmlinuz EFI file and pass it on the the
kexec call.

Ref: https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/tree/kexec/kexec-pe-zboot.c

Fixes: #8907

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-13 20:56:00 +05:30
Serge Logvinov
ee67da14c5
feat: scaleway routed ip
Support new network feature "routed ip".
IPv4 now attached to the VM directly.

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-12 22:42:34 +04:00
Noel Georgi
f9f5e0ef55
chore: fix k8s tests
The check for k8s suite added in #9085 causes issues with applying k8s resources
which are global like `Namespace` or `StorageClass`.

Instead of failing just log.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-09 13:28:02 +05:30
Noel Georgi
2ac8d2274f
chore: support unsupported flag for mkfs
Support `unsupported` flag for mkfs, so that `STATE` partition with size
less than 300M can be created by `mkfs.xfs`.

This allows to bring in newer `xfsprogs` that can repair corrupted FS
better.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-08 20:21:02 +05:30
Utku Ozdemir
9d34158500
fix: fix graph diffs in dashboard when node aliases are used
When `talosctl dashboard` is used with node "aliases" (e.g., node names or machine IDs in Omni) passed via `-n` flag, the graphs in the monitor tab were not rendered correctly: The matching of the old and current data were done incorrectly.

Fix this by pushing node alias->IP resolution down to the (api & log) data sources of the dashboard, by passing a resolver to them.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2024-08-07 14:54:32 +02:00
Noel Georgi
64914b086c
chore: add test for crun extension
Add a test to verify the `crun` runtimeclass container-runtime extension
works as expected.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-08-02 20:15:01 +05:30
Andrey Smirnov
7a1c62b8bc
feat: publish installed extensions as node labels/annotations
Extensions are posted the following way:

`extensions.talos.dev/<name>=<version>`

The name should be valid as a label (annotation) key.

If the value is valid as a label value, use labels, otherwise use
annotations.

Also implements node annotations in the machine config as a side-effect.

Fixes #9089

Fixes #8971

See #9070

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-08-01 17:32:09 +04:00
Andrey Smirnov
3f2058aba2
fix: update containerd configuration and settings
Provide `XDG_RUNTIME_DIR` environment variable, this specifically fixes
the `kubectl exec` action when `/tmp` is filled up.

Update containerd configuration to version 3 and fix it up.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-07-31 19:15:19 +04:00
Noel Georgi
50e5f37efb
chore: add test for apparmor
Add a test that verifies pods can be scheduled with `RuntimeDefault`
apparmor profile.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-07-30 20:24:57 +05:30
Noel Georgi
3ce5492f85
feat: runc memfd-bind service
Add a `runc-memfd-bind` service so that runc binary is not copied for
every `runc` invocation.

Fixes: #9007.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-07-29 19:02:59 +05:30
Noel Georgi
117628aa60
chore: add test for gvisor extension with platform kvm
Add test for Gvisor extensions when kvm platform is used.

The test is marked as skipped until pod termination issue is resolved.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-07-25 19:15:27 +05:30
EricMa
0872901783
feat: use ethtool ioctl to get link status when netlink api not available
when kernel not support ethtool-netlink,we will use ethtool-ioctl to get link status

Signed-off-by: EricMa <307748790@qq.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-23 19:07:55 +04:00
Jean-Francois Roy
fd54dc191d
feat(talosctl): append microsoft secure boot certs
This patch adds a flag to `secureboot.database.Generate` to append the
Microsoft UEFI secure boot DB and KEK certificates to the appropriate
ESLs, in addition to complimentary command line flags.

This patch also includes a copy of said Microsoft certificates. The
certificates are downloaded from an official Microsoft repo.

Signed-off-by: Jean-Francois Roy <jf@devklog.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-22 14:15:42 +04:00
Andrey Smirnov
fd6ddd11ef
feat: provide POD_IP env var to scheduler and controller-manager
Fixes #9031

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-17 21:41:15 +04:00
Andrey Smirnov
c288ace7b1
fix: be more smart when merging DNS resolver config
Fixes #8690

Consider the following scenario (e.g. OpenStack): platform issues a
correct list of DNS servers, which includes both IPv4 and IPv6
resolvers, and configures DHCPv4 on the interface.

DHCPv4 returns a set of IPv4 resolvers (as it can't return IPv6 ones),
and this list completely overrides the list from the platform, wiping
out the IPv6 resolvers completely.

With this change, the merge process is more smart, as it tries to
preserve IPv6 resolvers for example if the next layer provides no
resolvers for IPv6.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-16 21:20:27 +04:00
Andrey Smirnov
d983e44308
fix: panic on shutdown
Fixes #9017

Don't assume the config is there before trying to access it.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-16 17:24:03 +04:00
Andrey Smirnov
b07338f547
feat: provide machine config document to update trusted CA roots
Fixes #8867

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-12 19:28:31 +04:00
Andrey Smirnov
f14c4795e5
fix: sort ports and merge adjacent ones in the nft rule
Fixes #9009

When building a port interval set, sort the ports and merge adjacent
ranges to prevent mismatch on the nftables side.

With address sets, this was already the case due to the way IPRange
builder works, but ports need a manual implementation.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-12 14:53:23 +04:00
Andrey Smirnov
cf5effabb2
feat: provide an option to enforce SecureBoot for TPM enrollment
Fixes #8995

There is no security impact, as the actual SecureBoot
state/configuration is measured into the PCR 7 and the disk encryption
key unsealing is tied to this value.

This is more to provide a way to avoid accidentally encrypting to the
TPM while SecureBoot is not enabled.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-11 22:21:47 +04:00
Andrey Smirnov
736c1485e2
fix: change the UEFI firmware search path order
Ensure that SecureBoot enabled images come before regular ones.

With Ubuntu 24.04 `ovmf` package, due to the ordering of the search
paths `talosctl` might pick up a wrong image and disable SecureBoot.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-11 21:56:33 +04:00
Andrey Smirnov
398151e64f
fix: remove host bind mount for /tmp for trustd
Not sure why this mount was needed, but it was added long time ago, and
I believe it's no longer needed.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-10 19:56:21 +04:00
Dmitriy Matrenichev
fbde9c556f
chore: bump deps
Bump github.com/siderolabs/grpc-proxy to v0.4.1 and replace deprecated calls to `grpc.CustomCodec`.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-07-09 20:01:13 +03:00
Andrey Smirnov
3bab15214d
feat: update Kubernetes to 1.31.0-alpha.3
Fixes #8911

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-09 17:49:06 +04:00
Dmitriy Matrenichev
dad9c40c73
chore: simplify code
- replace `interface{}` with `any` using `gofmt -r 'interface{} -> any -w'`
- replace `a = []T{}` with `var a []T` where possible.
- replace `a = []T{}` with `a = make([]T, 0, len(b))` where possible.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-07-08 18:14:00 +03:00
Andrey Smirnov
2512ef435f
test: fix the integrtion tests for apply-config
They got broken after refactoring.

Also use this PR to test things before the release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-08 14:06:45 +04:00
Dmitriy Matrenichev
076f3c4f20
chore: improve link spec controller code
`SortBonds` function bothered me since the last time I refactored this part.

We always know that it only accepts `network.LinkSpec`s, but we accepted the slice of untyped Resources because
this is what `List` method returns. Now we can do better, since `safe.List` now supports `Swap` method.

We can utilize `sort.Interface` and pass `safe.List` directly to `SortBonds`.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-07-05 16:39:27 +03:00
Andrey Smirnov
0454130ad9
feat: suppress controller runtime first N failures on the console
As the controllers might fail with transient errors on machine startup,
but errors are always retried, persisten errors will anyway show up in
the console.

The full `talosctl logs controller-runtime` are not suppressed.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-05 15:36:54 +04:00
Andrey Smirnov
be35f380cc
chore: update pkgs/tools/extras
This brings in Go 1.22.5 and new Flannel CNI plugin.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-03 20:38:55 +04:00