4598 Commits

Author SHA1 Message Date
Grzegorz Rożniecki
7ba18555b0
docs: fix typos in Akamai and AWS platform docs
Fix typos in Akamai Connected Cloud (Linode) and AWS platform docs.

Signed-off-by: Grzegorz Rozniecki <grozniec@akamai.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-16 14:34:23 +04:00
Artem Chernyshev
3dd1f4e88c
chore: extract pkg/imager/quirks to pkg/machinery
To make it possible to use it without pulling the whole Talos.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2024-04-15 21:37:47 +03:00
Bernard Gütermann
78bc3a433e
docs: update Cilium docs
Update the Cilium CNI documentation.

Signed-off-by: Bernard Gütermann <bernard.gutermann@sekops.ch>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-12 17:09:44 +04:00
Andrey Smirnov
831f3d39e9
feat: update Flannel to v0.25.1
See https://github.com/flannel-io/flannel/releases/tag/v0.25.1

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-12 16:19:45 +04:00
Andrey Smirnov
ea5b3ff0c2
feat: update Kubernetes to v1.30.0-rc.2
See https://github.com/kubernetes/kubernetes/releases/tag/v1.30.0-rc.2

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-12 14:05:39 +04:00
Andrey Smirnov
54dac5ed40
feat: update Linux 6.6.24, containerd 1.7.15
Updates to match 1.7.0-beta.1 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-11 16:23:42 +04:00
Evan Johnson
c51f146daf
docs: update Akamai platform docs
Update install docs for the Akamai platform.

Signed-off-by: Evan Johnson <ejohnson@akamai.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-11 14:13:02 +04:00
looklose
9550f5ff7a
docs: fix getAuthenticationMethod and completePathFromNode docs
Both of those contained incorrect comments.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-04-10 20:23:52 +03:00
Andrey Smirnov
bfbd02abfb
fix: assign different priority to IPv6 default gateway on OpenStack
Fixes #8558

Similar fix is done for other platforms, but not OpenStack.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-10 21:02:13 +04:00
Andrey Smirnov
c8f674bd3d
test: add a test for 'spin' container runtime
See https://github.com/siderolabs/extensions/pull/355

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-10 20:42:16 +04:00
Dmitriy Matrenichev
5390ccd48c
chore: replace []byte with string and use go:embed for templates
Optimize code a bit.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-04-10 17:47:43 +03:00
Dmitriy Matrenichev
ba7cdc8c8b
chore: optimize DNSResolveCacheController
Optimize the `DNSResolveCacheController` type, including a `dns.Server` optimization for easy start/stop. This PR ensures that we
delete the server from runners on stop (even an unexpected one) and restart it properly. It also fixes an incorrect assumption in unit tests.

Fixes #8563

This PR also does the following:
- Removes `utils.Runner`
- Removes `ctxutil.MonitorFn`
- Removes `dns.Runner`
- Removes `network.dnsRunner`

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-04-10 17:24:19 +03:00
Andrey Smirnov
145f240630
fix: don't modify a global map of profiles
This shows up in image-factory tests, where multiple images are
generated at once, and the global map write access panics.

It was a bad idea in general to mutate global state on image
generation.
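
One way to avoid mutating shared state is to clone the map per call and modify only the copy. The sketch below illustrates that idea; `Profile`, `defaultProfiles`, and `profilesForArch` are hypothetical names for illustration, not the actual Talos types.

```go
package main

import (
	"fmt"
	"maps"
)

// Profile is a hypothetical stand-in for an image profile.
type Profile struct {
	Arch   string
	Output string
}

// defaultProfiles is a hypothetical global that must stay read-only.
var defaultProfiles = map[string]Profile{
	"metal": {Arch: "amd64", Output: "iso"},
}

// profilesForArch returns a per-call copy with the architecture overridden,
// leaving the global map untouched so concurrent callers never write to it.
func profilesForArch(arch string) map[string]Profile {
	local := maps.Clone(defaultProfiles)
	for name, p := range local {
		p.Arch = arch
		local[name] = p
	}
	return local
}

func main() {
	fmt.Println(profilesForArch("arm64")["metal"].Arch)
	fmt.Println(defaultProfiles["metal"].Arch)
}
```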

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-10 17:52:48 +04:00
Andrey Smirnov
6fe91ad9cf
feat: provide Kubernetes/Talos version compatibility for 1.8
Fixes #8572

This allows using 1.7 machinery with future 1.8 (e.g. alpha) versions.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-10 16:55:42 +04:00
Andrey Smirnov
909a5800e4
fix: generate secureboot ISO .der certificate correctly
Previous approach relied on a field which is _only_ present if
file-based PKI is passed in, and fails for e.g. Azure KMS.

See https://github.com/siderolabs/image-factory/issues/104

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-10 16:04:16 +04:00
Andrey Smirnov
b0fdc3c8ca
fix: make static pods check output consistent
Sort the pod names, so the check output doesn't re-print itself on no
change to the list of pods.
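
A minimal sketch of the idea, assuming the check output is built from a list of pod names (`describePods` is a hypothetical helper, not the Talos function): sorting a copy of the names makes the rendered string independent of the order in which the pods were listed.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// describePods renders pod names in sorted order, so the same set of pods
// always produces the same string regardless of input order.
func describePods(names []string) string {
	sorted := slices.Clone(names) // do not reorder the caller's slice
	slices.Sort(sorted)
	return strings.Join(sorted, ", ")
}

func main() {
	fmt.Println(describePods([]string{"kube-scheduler", "kube-apiserver"}))
	fmt.Println(describePods([]string{"kube-apiserver", "kube-scheduler"}))
}
```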

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-10 15:30:24 +04:00
Andrey Smirnov
c6ad0fcceb
fix: validate that workers don't get cluster CA key
Only the cert should be present on worker nodes, enforce this via
validation.
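
The rule can be sketched as a small validation function. The `CA` type, the machine type strings, and `validateClusterCA` below are assumptions made for illustration and do not reflect the actual Talos configuration types.

```go
package main

import (
	"errors"
	"fmt"
)

// CA is a hypothetical certificate/key pair as it might appear in a config.
type CA struct {
	Crt []byte
	Key []byte
}

// validateClusterCA rejects worker configs that carry the cluster CA key.
// Only the certificate is expected on workers.
func validateClusterCA(machineType string, ca CA) error {
	if len(ca.Crt) == 0 {
		return errors.New("cluster CA certificate is required")
	}
	if machineType == "worker" && len(ca.Key) > 0 {
		return errors.New("cluster CA key must not be present on worker nodes")
	}
	return nil
}

func main() {
	err := validateClusterCA("worker", CA{Crt: []byte("crt"), Key: []byte("key")})
	fmt.Println(err)
}
```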

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-10 14:24:46 +04:00
Utku Ozdemir
3735add87c
fix: reconnect to the logs stream in dashboard after reboot
The log stream displayed in the dashboard stopped working when a node was rebooted.
Rework the log data source to establish a per-node connection and use a retry loop to always reconnect until the dashboard is terminated.

Print the connection errors in the log stream in red color.

Closes siderolabs/talos#8388.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2024-04-10 10:43:45 +02:00
Andrey Smirnov
9aa1e1b79b
fix: present all accepted CAs to the kube-apiserver
This fixes an issue with a single controlplane cluster.

Properly present all accepted CAs to the apiserver; in the test, let the
cluster fully recover between the two CA rotations performed.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-08 23:33:22 +04:00
Andrey Smirnov
336e611746
fix: close the apid connection to other machines gracefully
Fixes #8552

When `apid` notices an update in the PKI, it flushes its client connections
to other machines (used for proxying), as it might need to use a new
client certificate.

While flushing, just calling `Close` might abort already running
connections.

So instead, try to close gracefully with a timeout when the connection
is idle.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-08 19:47:04 +04:00
Andrey Smirnov
ff2c427b04
fix: pre-create nftables chain to make kubelet use nftables
In Talos, kubelet (and kube-proxy) images use `iptables-wrapper` script
to detect which version of `iptables` (legacy or NFT) to use.

The script assumes that `kubelet` runs on the host, and uses whatever
version of `iptables` is being used by the host. In Talos,
`kubelet` runs in a container which has the same `iptables-wrapper` script,
and it defaults to `legacy` mode in our case.

We can't check the `kubelet` image, as it would affect all Talos
versions, so instead pre-create the chains/tables in `nftables` so that
kubelet will pick up the `nft` version of `iptables`, and `kube-proxy` will
do the same.

Without this fix, the problem arises from the mix of `nft` used by Talos
for the firewall and Kubernetes world relying on `legacy` (`xtables`).

Fixes https://github.com/siderolabs/kubelet/issues/77

See e139a11535/iptables-wrapper-installer.sh (L102-L130)

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-08 16:24:42 +04:00
Dmitriy Matrenichev
5622f0e450
docs: change localDNS to hostDNS in release notes yaml section
Also add a note about how-to enable dns caching for k8s pods.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-04-05 20:08:46 +03:00
Dmitriy Matrenichev
01d8b897c4
fix: make safeReset truly safe to call multiple times
Reading documentation is important, because `timer.Stop()` explicitly says that it will return false if the timer has
already expired *OR* has already been stopped. The previous version of the code would block forever, and because of
that code the tunnel relay never started.

Take that into account with new version.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-04-05 00:34:17 +03:00
Dmitry Sharshakov
653f838b09
feat: support multiple Docker clusters in talosctl cluster create
Dynamically map Kubernetes and Talos API ports to an available port on
the host, so every cluster gets its own unique set of ports.

As part of the changes, refactor the provision library and interfaces,
dropping old weird interfaces and replacing them with (hopefully) much more
descriptive names.
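
One common way to find an available host port is to listen on port 0 and let the kernel choose. The sketch below shows that general technique; it is an assumption for illustration, not necessarily how the provisioner allocates ports, and `freePort` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the kernel for an unused TCP port on the loopback interface.
// The port is released before returning, so another process could still
// claim it before the caller binds it.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()

	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("failed to find a free port:", err)
		return
	}
	fmt.Println("available host port:", port)
}
```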

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-04 21:21:39 +04:00
Andrey Smirnov
951904554e
chore: bump dependencies (go 1.22.2)
Update Go to 1.22.2, update Go modules to resolve
[HTTP/2 issue](https://www.kb.cert.org/vuls/id/421644).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-04 14:59:24 +04:00
Andrey Smirnov
862c76001b
feat: add support for CoreDNS forwarding to host DNS
This PR adds support for CoreDNS forwarding to host DNS. We try to bind to the 9th address of the first element of
`serviceSubnets` and create a simple service so k8s will not attempt to rebind it.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Co-authored-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-04-03 23:36:17 +03:00
Evan Johnson
e8ae5ef63a
feat: add akamai platform support
Add support for the Akamai (Linode) platform.

Signed-off-by: Evan Johnson <ejohnson@akamai.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-03 19:50:42 +04:00
Andrey Smirnov
5c0f74b377
fix: don't announce the VIP on acquire failure
I noticed that while looking at #8493, but I don't know if this problem
actually happened in real life.

If acquiring a VIP fails (which can only fail for Equinix/HCloud, not L2
ARP announce), we should not set the leader flag, as it would make the
controller announce the IP, while it shouldn't do that.

If this call fails, there's no matching call to de-announce on failure.

The bug would show up as two nodes having the same VIP assigned on the host.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-03 18:04:44 +04:00
Noel Georgi
2f0fe10d55
chore: update sbc docs
Update SBC docs to reflect change in schematic ID.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-04-03 18:53:55 +05:30
Andrey Smirnov
1b17008e9d
fix: handle more OpenStack link types
Fixes #8481

The issue was that the link 'bridge' was skipped, so Talos default was
applied to run DHCP and use the DHCP hostname (instead of using
platform's hostname).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-03 16:54:36 +04:00
Andrey Smirnov
e7d8041404
fix: always update firewall rules (kubespan)
Fixes #8498

Before KubeSpan was reimplemented to use resources for firewall rules,
the update was happening always, but it got moved to a wrong section of
the controller which gets executed on resource updates, but ignores
updates of the peer statuses.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-03 16:33:16 +04:00
Andrey Smirnov
78b9bd9273
fix: report unsupported x86_64 microarchitecture level
Fixes #8361

Talos requires v2 (circa 2008), but VMs are often configured to limit
the exposed features to the baseline (v1).

```
[    0.779218] [talos] [initramfs] booting Talos v1.7.0-alpha.1-35-gef5bbe728-dirty
[    0.779806] [talos] [initramfs] CPU: QEMU Virtual CPU version 2.5+, 4 core(s), 1 thread(s) per core
[    0.780529] [talos] [initramfs] x86_64 microarchitecture level: 1
[    0.781018] [talos] [initramfs] it might be that the VM is configured with an older CPU model, please check the VM configuration
[    0.782346] [talos] [initramfs] x86_64 microarchitecture level 2 or higher is required, halting
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-03 16:09:57 +04:00
Dmitriy Matrenichev
71d90ba5f3
fix: retry in the fixed amount of time if grpc relay failed
Before this commit, if the tunnel failed with an error, it would never restart until a `siderolink.TunnelType` event happened.
Most of the time that is a good idea, because a failure might mean that the destination has changed.

But the tunnel can also fail because the allowed peer list is not yet loaded on a newly started Omni instance.

Because of that, we want to try again and not be tied to the runtime event channel.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-04-03 14:03:42 +03:00
Noel Georgi
d320498a44
chore: bump dependencies
Bump dependencies, bring in v1.30.0-rc.1 of k8s.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-04-03 12:25:10 +05:30
Andrey Smirnov
3195e5d15c
fix: force Flannel CNI to use KubePrism Kubernetes API endpoint
Fixes #8501

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-02 22:01:05 +04:00
Noel Georgi
917043fb55
chore: bump tools, pkgs and extra to stable
Bump tools, pkgs and extras to stable release.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-04-02 22:15:50 +05:30
Noel Georgi
f515741b52
chore: add equinix e2e-tests
Add equinix e2e-tests.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-04-02 17:16:59 +05:30
Andrey Smirnov
117e60583d
feat: add support for static extra fields for JSON logs
Fixes #7356

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-02 15:15:14 +04:00
Andrey Smirnov
090143b030
fix: allow platform cmdline args to be platform-specific
Fix Equinix Metal (where proper arm64 args are known) and metal platform
(using generic arm64 console arg).

Other platforms might need to be updated, but correct settings are not
known at the moment.

Fixes #8529

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-02 14:41:39 +04:00
Andrey Smirnov
7a68504b6b
feat: support rotating Kubernetes CA
Fixes #8440

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-01 22:08:02 +04:00
Andrey Smirnov
fac3dd0430
fix: don't set default endpoints on gen config
Fixes #8500

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-01 21:21:58 +04:00
Dmitriy Matrenichev
8dc4910c48
chore: enable "WG over GRPC" testing in siderolink agent tests
Fixes https://github.com/siderolabs/talos/issues/8514
For https://github.com/siderolabs/talos/issues/8392

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-04-01 18:24:57 +03:00
Noel Georgi
bac366e43e
chore: add ExtraInfo field for extensions
Add an extra field to extensions to store arbitrary info.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-04-01 19:30:29 +05:30
Niklas Wik
0fc24eeb09
feat: provide insecure flag to imager
Provide a flag for imager to pull images insecurely from private registries.

Signed-off-by: Niklas Wik <niklas.wik@nokia.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-01 15:53:56 +04:00
Andrey Smirnov
a6b2f54564
feat: update Kubernetes to 1.30.0-rc.0, etcd to 3.5.13
See:

* https://github.com/etcd-io/etcd/releases/tag/v3.5.13
* https://github.com/kubernetes/kubernetes/releases/tag/v1.30.0-rc.0

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-04-01 14:50:52 +04:00
Justin Garrison
0361ff8956
docs: quickstart video and brew install
Change the quickstart guide to use brew install instructions. Updated
command formatting and added warning for macOS Docker Desktop users.

Signed-off-by: Justin Garrison <justin.garrison@siderolabs.com>
2024-03-28 09:56:13 -07:00
Dmitry Sharshakov
b752a86183
chore: talosctl: add openSUSE OVMF paths
Tested both secureboot and non-secureboot code. SB is not enabled by default.

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
2024-03-25 18:49:08 +03:00
Dmitry Sharshakov
9456489147
feat: support hardware watchdog timers
Only enabled when activated by config, disabled on shutdown/reboot

Fixes #8284

Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com>
Signed-off-by: Dmitry Sharshakov <d3dx12.xx@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-25 18:19:39 +03:00
Dmitriy Matrenichev
949ad11a2d
chore: import siderolink as siderolink-launch subcommand
This PR ensures that we can test our siderolink communication using embedded siderolink-agent.
If `--with-siderolink` is provided during `talosctl cluster create`, talosctl will embed the proper kernel string and set up `siderolink-agent` as a separate process. It should be used in combination with `--skip-injecting-config` and `--with-apply-config` (the latter will use newly generated IPv6 siderolink addresses which talosctl passes to the agent as a "pre-bind").

Fixes #8392

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-03-23 16:08:56 +03:00
Noel Georgi
ee51f04af3
chore: azure e2e
Add code to support azure e2e

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-03-23 17:30:36 +05:30