talos

Author	SHA1	Message	Date
Andrey Smirnov	1b17008e9d	fix: handle more OpenStack link types Fixes #8481 The issue was that the link 'bridge' was skipped, so Talos default was applied to run DHCP and use the DHCP hostname (instead of using platform's hostname). Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-03 16:54:36 +04:00
Andrey Smirnov	e7d8041404	fix: always update firewall rules (kubespan) Fixes #8498 Before KubeSpan was reimplemented to use resources for firewall rules, the update was happening always, but it got moved to a wrong section of the controller which gets executed on resource updates, but ignores updates of the peer statuses. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-03 16:33:16 +04:00
Andrey Smirnov	78b9bd9273	fix: report unsupported x86_64 microarchitecture level Fixes #8361 Talos requires v2 (circa 2008), but VMs are often configured to limit the exposed features to the baseline (v1). ``` [ 0.779218] [talos] [initramfs] booting Talos v1.7.0-alpha.1-35-gef5bbe728-dirty [ 0.779806] [talos] [initramfs] CPU: QEMU Virtual CPU version 2.5+, 4 core(s), 1 thread(s) per core [ 0.780529] [talos] [initramfs] x86_64 microarchitecture level: 1 [ 0.781018] [talos] [initramfs] it might be that the VM is configured with an older CPU model, please check the VM configuration [ 0.782346] [talos] [initramfs] x86_64 microarchitecture level 2 or higher is required, halting ``` Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-03 16:09:57 +04:00
Dmitriy Matrenichev	71d90ba5f3	fix: retry in the fixed amount of time if grpc relay failed Before this commit, if tunnel failed with error, it would never restart again until `siderolink.TunnelType` event happens. For most of the time it's a good idea, because it might mean that destination has changed. But tunnel can also fail because allowed peer list is not yet loaded on newly started Omni instance. Because of that, we want to try again and not be tied to the runtime event channel. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-04-03 14:03:42 +03:00
Andrey Smirnov	3195e5d15c	fix: force Flannel CNI to use KubePrism Kubernetes API endpoint Fixes #8501 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-02 22:01:05 +04:00
Noel Georgi	f515741b52	chore: add equinix e2e-tests Add equinix e2e-tests. Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-04-02 17:16:59 +05:30
Andrey Smirnov	117e60583d	feat: add support for static extra fields for JSON logs Fixes #7356 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-02 15:15:14 +04:00
Andrey Smirnov	090143b030	fix: allow platform cmdline args to be platform-specific Fix Equinix Metal (where proper arm64 args are known) and metal platform (using generic arm64 console arg). Other platforms might need to be updated, but correct settings are not known at the moment. Fixes #8529 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-02 14:41:39 +04:00
Andrey Smirnov	7a68504b6b	feat: support rotating Kubernetes CA Fixes #8440 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-04-01 22:08:02 +04:00
Dmitriy Matrenichev	8dc4910c48	chore: enable "WG over GRPC" testing in siderolink agent tests Fixes https://github.com/siderolabs/talos/issues/8514 For https://github.com/siderolabs/talos/issues/8392 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-04-01 18:24:57 +03:00
Dmitry Sharshakov	9456489147	feat: support hardware watchdog timers Only enabled when activated by config, disabled on shutdown/reboot Fixes #8284 Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com> Signed-off-by: Dmitry Sharshakov <d3dx12.xx@gmail.com> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-25 18:19:39 +03:00
Dmitriy Matrenichev	949ad11a2d	chore: import siderolink as `siderolink-launch` subcommand This PR ensures that we can test our siderolink communication using embedded siderolink-agent. If `--with-siderolink` is provided during `talos cluster create`, talosctl will embed the proper kernel string and set up `siderolink-agent` as a separate process. It should be used in combination with `--skip-injecting-config` and `--with-apply-config` (the latter will use newly generated IPv6 siderolink addresses which talosctl passes to the agent as a "pre-bind"). Fixes #8392 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-03-23 16:08:56 +03:00
Andrey Smirnov	8eacc4ba80	feat: support rotation of Talos API CA This allows to roll all nodes to use a new CA, to refresh it, or e.g. when the `talosconfig` was exposed accidentally. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-22 12:16:47 +04:00
Dmitry Sharshakov	84ec8c16f3	feat: support syncing to PTP clocks Also abstract away from NTP types. Signed-off-by: Dmitry Sharshakov <dmitry.sharshakov@siderolabs.com> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-21 17:20:26 +04:00
Andrey Smirnov	7d43c9aa6b	chore: annotate installer errors I want to catch a spurious error `ENODEV`, where exactly it comes from. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-21 16:58:34 +04:00
Andrey Smirnov	f737e6495c	fix: populate routes to BGP neighbors (Equinix Metal) Fixes #8267 Also refactor the code so that we don't fail hard on multiple bonds, but it's not clear still how to attach addresses, as they don't have an interface name field, so for now attaching to the first bond. Fixes #8411 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-21 15:44:21 +04:00
Dmitriy Matrenichev	19f15a840c	chore: bump golangci-lint to 1.57.0 Fix all discovered issues. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-03-21 01:06:53 +03:00
Artem Chernyshev	113fb646ec	chore: use `go-talos-support` library The code for collecting Talos `support.zip` was extracted there. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2024-03-19 18:28:46 +03:00
Andrey Smirnov	89fc68b459	fix: service lifecycle issues The core change is moving the context out of the `ServiceRunner` struct to be a local variable, and using a channel to notify about shutdown events. Add more synchronization between Run and the moment service started to avoid mis-identifying not running (yet) service as successfully finished. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com> Co-authored-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-03-19 18:11:13 +04:00
Andrey Smirnov	ead37abf09	test: disable volume tests They're flaky, disable until the root cause is known. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-19 16:40:42 +04:00
Andrey Smirnov	15beb14780	feat: implement blockdevice watch controller This controller combines kobject events, and scan of `/sys/block` to build a consistent list of available block devices, updating resources as the blockdevice changes. Based on these resources the next step can run probe on the blockdevices as they change to present a consistent view of filesystems/partitions. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-18 18:28:40 +04:00
Dmitriy Matrenichev	06e3bc0cbd	feat: implement Siderolink wireguard over GRPC For #8064 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-03-18 15:38:13 +03:00
Andrey Smirnov	9afa70baf3	fix: patch correctly config in `talosctl upgrade-k8s` The current code was stripping non-`v1alpha1.Config` documents. Provide a proper method in the config provider, and update places using it. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-15 20:42:44 +04:00
Andrey Smirnov	3130caf954	chore: re-enable DRBD extension See https://github.com/siderolabs/extensions/pull/343 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-03-15 15:55:18 +04:00
Utku Ozdemir	7376f34e82	fix: remove maintenance config when maintenance service is shut down We now remove the machine config with the id `maintenance` when we are done with it - when the maintenance service is shut down. Closes siderolabs/talos#8424, where in some configurations there would be machine configs with both `v1alpha1` and `maintenance` IDs present, causing the `talosctl edit machineconfig` to loop twice and causing confusion. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2024-03-14 12:51:59 +01:00
Noel Georgi	d118a852b9	feat: implement `Install` for imager overlays Implement `Install` for imager overlays. Also add support for generating installers. Depends on: #8377 Fixes: #8350 Fixes: #8351 Fixes: #8350 Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-03-12 22:46:29 +05:30
Dmitriy Matrenichev	32e0877607	chore: print all available logs containers in `logs` command completions This is a small quality of life improvement that allows `logs` subcommand to suggest all available logs. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-03-11 17:48:01 +03:00
Utku Ozdemir	1bb6027ccd	fix: fix nil panic on maintenance upgrade with partial config Fix the nil dereferences when a Talos node is attempted to be upgraded while in maintenance mode and having a partial machine config. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2024-03-08 12:52:21 +03:00
Noel Georgi	1ec6683e0c	chore: use go-copy Use go-copy and drop `pkg/copy`. Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-03-07 19:51:28 +05:30
Artem Chernyshev	3c8f51d707	chore: move cli formatters and version modules to machinery To be used in the `go-talos-support` module without importing the whole Talos repo. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2024-03-07 16:29:15 +03:00
Noel Georgi	f23bd81448	fix: syslog parser Fixes a condition when the timestamp contains a single digit day. This started failing when the month started :sweat_smile:. Also handle a case when `tag` and `hostname` are both missing. Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-03-04 11:08:46 +05:30
Andrey Smirnov	bbed07e03a	feat: update Linux to 6.6.18 ZFS extension got re-enabled for 1.7. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-29 20:08:59 +04:00
Andrey Smirnov	0b9b4da12a	feat: update Kubernetes to 1.30.0-alpha.3 See https://github.com/kubernetes/kubernetes/releases/tag/v1.30.0-alpha.3 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-29 14:36:09 +04:00
Dmitriy Matrenichev	f8c556a1ce	chore: listen for dns requests on 127.0.0.53 Turns out there is actually no black magic in systemd, they simply listen on 127.0.0.53 and forward dns requests there in resolv.conf. Reason is the same as ours — to preserve compatibility with other applications. So we do the same in our code. This PR also does two things: - Adds `::1` into resolv.conf for IPv6 only resolvers. - Drops `SO_REUSEPORT` from control options (it works without them). Closes #8328 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-02-26 20:59:12 +03:00
Andrey Smirnov	8872a7a210	fix: ignore 'no such device' in addition to 'no such file' This error pops up when `udevd` rescans the partition table with Talos trying to mount a device concurrently. This feels to be something new with Linux 6.6 probably. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-26 20:00:05 +04:00
Andrey Smirnov	67ac6933d3	fix: handle errors to watch apid/trustd certs Fixes #8345 Both `apid` and `trustd` services use a gRPC connection back to `machined` to watch changes to the certificates (new certificates being issued). This refactors the code to follow regular conventions, so that a failure to watch will crash the process, and they have a way to restart and re-establish the watch. Use the context and errgroup consistently. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-23 17:38:56 +04:00
Christian WALDBILLIG	c79d69c2e2	fix: only set gateway if set in context (opennebula) Fix the network config setup. Signed-off-by: Christian WALDBILLIG <christian@waldbillig.io> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-23 17:05:33 +04:00
Fabiano Fidêncio	64e9703f86	chore: add tests for the Kata Containers extension Let's add a very basic test for the Kata Containers extension, mimicking what's already in place for gVisor. This depends on the work being done in: https://github.com/siderolabs/extensions/pull/279 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-02-20 18:49:47 +05:30
Andrey Smirnov	66f3ffdd4a	fix: ensure that Talos runs in a pod (container) Drop the Kubernetes manifests as static files clean up (this is only needed for upgrades from 1.2.x). Fix Talos handling of cgroup hierarchy: if started in container in a non-root cgroup hierarchy, use that to handle proper cgroup paths. Add a test for a simple TinK mode (Talos-in-Kubernetes). Update the docs. Fixes #8274 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-20 15:06:48 +04:00
Noel Georgi	9dbc33972a	feat: add basic syslog implementation Add a basic syslog listening on `/dev/log`. Fixes: #8087 Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-02-20 15:02:06 +05:30
Utku Ozdemir	0b7a27e6a1	feat: allow access to all resources over siderolink in maintenance mode SideroLink is a secure channel, so we can allow read access to the resources. This will give us more control of the node via Omni and/or other systems using SideroLink. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2024-02-16 16:39:11 +01:00
Andrey Smirnov	7ee999f8a3	fix: disable KubeSpan endpoint harvesting by default This disables by default (if not specified in the machine config) the endpoint harvesting for KubeSpan peers. The idea was to observe Wireguard endpoints as seen by other peers in the cluster, and add them to the list of endpoints for the node. This might be helpful only in case of some special type of NATs which are almost never seen in the wild today. So disable by default, but keep an option to enable it. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-16 18:18:33 +04:00
Utku Ozdemir	493bb60f81	fix: correctly handle partial configs in `DNSUpstreamController` Prevent `DNSUpstreamController` from panicking by checking if the `machine` section in the config is `nil`. This is the case when a machine has partial configuration, e.g., when the machine has only a `SideroLinkConfig` in its config. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2024-02-16 10:31:54 +01:00
Andrey Smirnov	1366ce14a8	feat: update Kubernetes to v1.30.0-alpha.2 Talos Linux 1.7.0 will ship with Kubernetes v1.30.0. Drop some compatibility for Kubernetes < 1.25, as 1.25 is the minimum supported version now. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-15 21:56:56 +04:00
Noel Georgi	15e8bca2b2	feat: support environment in `ExtensionServicesConfig` Support setting extension services environment variables in `ExtensionServiceConfig` document. Refactor `ExtensionServicesConfig` -> `ExtensionServiceConfig` and move extensions config under `runtime` pkg. Fixes: #8271 Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-02-15 20:16:29 +05:30
Matthieu S	3fe82ec461	feat: custom image settings for k8s upgrade Allows to use custom registry/images. Fixes: #8275 Co-authored-by: @g3offrey Signed-off-by: Matthieu STROHL <mstrohl@dive-in-it.com> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-15 17:54:01 +04:00
Dmitriy Matrenichev	fa3b933705	chore: replace fmt.Errorf with errors.New where possible This time use `eg` from `x/tools` repo tool to do this. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-02-14 17:39:30 +03:00
Noel Georgi	2f0421b406	fix: run xfs_repair on invalid argument error Run `xfs_repair` for invalid argument error. Part of: #8292 Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-02-13 23:01:33 +05:30
Dmitriy Matrenichev	fa2d34dd88	chore: enable v6 support on the same port Replace `SO_REUSEPORT` with `SO_REUSEPORT`. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-02-13 01:02:27 +03:00
Dmitriy Matrenichev	83e0b0c19a	chore: adjust dns sockets settings Enable some TCP optimization, set minimal TTL, set socket reuse. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-02-12 17:13:03 +03:00
Dmitriy Matrenichev	5324d39167	chore: bump stuff Also fix .golangci.yml file. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-02-09 19:19:25 +03:00
Dmitriy Matrenichev	afa71d6b02	chore: use "handle-like" resource in `DNSResolveCacheController` Rework (and simplify) `DNSResolveCacheController` to use `DNSUpstream` "handle-like" resources. Depends on https://github.com/cosi-project/runtime/pull/400 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-02-08 21:40:57 +03:00
Andrey Smirnov	3f8a85f1b3	fix: unlock the upgrade mutex properly Fixes #4525 The previous implementation had several issues: * etcd concurrency session never closed * Unlock() with potentially closed context * unlocking when upgrade sequence finishes, but this overlaps with the machine reboot, so a chance that it never got unlocked Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-08 15:50:02 +04:00
Noel Georgi	1e6c8c4dec	feat: extensions services config Support config files for extension services. Fixes: #7791 Co-authored-by: Andrey Smirnov <andrey.smirnov@siderolabs.com> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com> Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-02-06 17:12:01 +05:30
shurkys	989ca3ade1	feat: add OpenNebula platform support Initial support without documentation. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com> Signed-off-by: shurkys <no@mail.com>	2024-02-05 20:43:47 +04:00
Henno Schooljan	a04cc80154	fix: pass TTL when generating client certificate Pass the TTL to the talosconfig generation function. Signed-off-by: Henno Schooljan <github@sfynx.nl> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-05 18:54:16 +04:00
Dmitriy Matrenichev	3fe8c12ca6	fix: add log line about controller runtime failing While we decide what to do with #8263 and #8256 this quickfix at least allows us to see what went wrong Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-02-05 17:22:02 +03:00
Andrey Smirnov	ddbabc7e58	fix: use a separate cgroup for each extension service Fixes #8229 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-05 17:37:55 +04:00
Saiyam Pathak	4184e617ab	chore: add test for wasmedge runtime extension Add tests for WasmEdge container runtime system extension. Signed-off-by: Saiyam Pathak <saiyam911@gmail.com> Signed-off-by: Noel Georgi <git@frezbo.dev>	2024-02-05 18:18:13 +05:30
Andrey Smirnov	95ea3a6c65	chore: bump timeout in acquire tests With switching to RSA service account, machine config generation time is considerably higher now, so the test might not make it in time. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-05 15:18:22 +04:00
Andrey Smirnov	2ff81c06bc	feat: update runc 1.1.12, containerd 1.7.13 Also: * Linux 6.6.14 + XDP enablement * etcd 3.5.12 Various other bumps for the tools, utilities, and Go modules. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-01 17:01:04 +04:00
Andrey Smirnov	9d8cd4d058	chore: drop deprecated method EtcdRemoveMember It was deprecated 16 months ago, time to cleanup. (This is to prepare for the first v1.7 release) Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-01 15:54:29 +04:00
Andrey Smirnov	17567f19be	fix: take into account the moment seen when cleaning up CRI images Fixes #8069 The image age from the CRI is the moment the image was pulled, so if it was pulled long time ago, the previous version would nuke the image as soon as it is unreferenced. The new version would allow the image to stay for the full grace period in case the rollback is requested. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-02-01 14:44:22 +04:00
Andrey Smirnov	593afeea38	fix: run the interactive installer loop to report errors In the previous implementation, even though `installer.err` was set, it was never checked 🤦. The run loop was stolen from the dashboard code. Fixes #8205 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-01-31 19:20:46 +04:00
Andrey Smirnov	87be76b878	fix: be more tolerant to error handling in Mounts API Fixes #8202 If some mountpoint can't be queried successfully for 'diskfree' information, don't treat that as an error, and report zero values for disk usage/size instead. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-01-31 18:24:38 +04:00
Dmitriy Matrenichev	ebeef28525	feat: implement local caching dns server This PR adds a new controller - `DNSServerController` that starts tcp and udp dns servers locally. Just like `EtcFileController` it monitors `ResolverStatusType` and updates the list of destinations from there. Most of the caching logic is in our "lobotomized" `CoreDNS` fork. We need this fork because default `CoreDNS` carries full Caddy server and various other modules that we don't need in Talos. On our side we implement random selection of the actual dns and request forwarding. Closes #7693 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-01-29 20:26:38 +03:00
Andrey Smirnov	b44551ccdb	feat: update Linux to 6.6.13 See https://github.com/siderolabs/pkgs/pull/873 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-01-29 16:50:33 +04:00
Andrey Smirnov	d677901b67	feat: implement device selector for 'physical' Closes #8090 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-01-23 15:05:51 +04:00
Andrey Smirnov	c1e45071f0	refactor: use etcd configuration from the EtcdSpec resource This is currently no-op, just noticed that while looking into another bug. This should make the intention more clean. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-01-22 16:06:16 +04:00
Andrey Smirnov	474eccdc4c	fix: watch buffer overrun for RouteStatus Fixes #8157 This PR contains two fixes, both related to the same problem. Several routes for different links but same IPv6 destination might exist at the same time, so route resource ID should handle that. The problem was that these routes were mis-reported causing internally updates for the same resources multiple times (equal to the number of the links). Don't trigger controllers more often than 10 times/second (with burst of 5) for kernel notifications. This ensures Talos doesn't try to reflect current state of the network subsystem too often as resources, which causes excessive CPU usage and might potentially lead to the buffer overrun under high rate of changes. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-01-17 19:28:25 +04:00
Andrey Smirnov	9782319c31	fix: support KubePrism settings in Kubernetes Discovery Fixes #8143 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-01-16 20:41:13 +04:00
Dmitriy Matrenichev	f70b47dddc	fix: force KubePrism to connect using IPv4 Before this change KubePrism used hardcoded "localhost" as destination which Go could resolve to IPv6 destination and then fail to connect to. This change forces KubePrism to connect using IPv4 and uses hardcoded "127.0.0.1" destination so it will always use IPv4. For #8112 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2024-01-15 21:25:05 +03:00
Utku Ozdemir	7fa7362ddc	fix: fix nodes on dashboard footer when node names are used in `--nodes` When the dashboard is used via the CLI through a proxy, e.g., through Omni, node names or IDs can be used in the `--nodes` flag instead of the IPs. This caused rendering inconsistencies in the dashboard, as some parts of it used the IPs and some used the names passed in the context. Fix this by collecting all node IPs on dashboard start, and map these IPs to the respective nodes passed as the `--nodes` flag. On the dashboard footer, we always display the node names as they are passed in the `--nodes` flag. As part of it, remove the node list change reactivity from the dashboard, so it will always take the passed nodes as the truth. The IP to node mapping collection at dashboard startup also solves another issue where the first API call by the dashboard triggered the interactive API authentication (e.g., the OIDC flow). Previously, because the terminal was already switched to the raw mode, it was not possible to authenticate properly. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2024-01-12 12:00:08 +01:00
Jonomir	dea9bda2d0	fix: disk UUID & WWID always empty in `talosctl disks` Add missing attributes to conversion of go-blockdevice disk to protobuf disk. Signed-off-by: Jonomir <68125495+Jonomir@users.noreply.github.com> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2024-01-11 14:37:39 +04:00
Serge Logvinov	f6926faab5	fix: default priority for ipv6 We will use the default IPv6 gateway priority as 2048. The RA default is 1024, which leads to verbose messages such as 'error adding route: netlink receive: file exists.' Azure uses DHCPv6 and RA for configuring IPv6 on the node. The platform sets the default gateway as a fallback in case 'accept_ra' is not set to 2. Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-29 18:42:23 +04:00
Andrey Smirnov	0a30ef7845	fix: imager should support different Talos versions Add some quirks to make images generated with newer Talos compatible with images generated by older Talos. Specifically, reset options were added in Talos 1.4, so we shouldn't add them for older versions. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-22 16:13:34 +04:00
Andrey Smirnov	e6e422b92a	chore: bump dependencies Go modules, tools, etc. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-21 19:01:16 +04:00
Dmitriy Matrenichev	59b62398f6	chore: modernize machined/pkg/controllers/k8s This is going to be multipart effort to finally use safe.* wrappers in the production code. Quick regexp search shows that there are around 150 direct type assertions on resources (excluding the ones in this commit). Also - migrate from `interface{}` to `any` and use `slices.Sort` instead of `sort.` where possible. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2023-12-15 19:33:06 +03:00
Andrey Smirnov	760f793d55	fix: use correct prefix when installing SBC files When creating an image under non-default mount prefix, it should be used explicitly when copying SBC files. See https://github.com/siderolabs/image-factory/issues/65 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-15 19:46:10 +04:00
Noel Georgi	0b94550c42	chore: fix the gvisor test The gvisor test was not using the correct runtimeclass and would have always passed regardless. Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-12-15 20:48:44 +05:30
Andrey Smirnov	d803e40ef2	docs: provide documentation for Talos 1.6 Updated lots of documentation with new/updated flows. Provide What's New for Talos 1.6.0. Update Troubleshooting guide to cover more steps. Make Talos 1.6 docs the default. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-15 16:36:57 +04:00
Andrey Smirnov	10c59a6b90	fix: leave discovery service later in the reset sequence Fixes #8057 I went back and forth on the way to fix it exactly, and ended up with a pretty simple version of a fix. The problem was that discovery service was removing the member at the initial phase of reset, which actually still requires KubeSpan to be up: * leaving `etcd` (need to talk to other members) * stopping pods (might need to talk to Kubernetes API with some CNIs) Now leaving discovery service happens way later, when network interactions are no longer required. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-13 19:16:12 +04:00
Andrey Smirnov	131a1b1671	fix: add a KubeSpan option to disable extra endpoint harvesting It works well for small clusters, but with bigger clusters it puts too much load on the discovery service, as it has quadratic complexity in number of endpoints discovered/reported from each member. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-12 14:07:31 +04:00
Artem Chernyshev	4547ad9afa	feat: send `actor id` to the SideroLink events sink This might come handy to distinguish sequences, tasks initiated by a particular API request. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2023-12-11 21:59:02 +03:00
Dmitriy Matrenichev	6bb1e99aa3	chore: optimize pcap dump Reimplement `gopacket.PacketSource.PacketsCtx` as `forEachPacket`. - Use `ZeroCopyPacketDataSource` instead of `PacketDataSource`. I didn't find any specific reason why `PacketDataSource` exists at all, since `NewPacket` is doing copy inside if you don't explicitly tell it not to. - Use `WillPool` to pool packet buffers. It doesn't fully remove allocations, but it's a safe start. Send packets back into the pool after we are done with them. - Pass `Packet` directly to the closure instead of waiting for it on the channel. We don't store this packet anywhere so there is no reason to async this part. - Drop `time.Sleep` code in `forEachPacket` body. - Drop `SnapLen` support in client and server since it didn't work anyway (details in the PR). Closes #7994 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2023-12-11 15:44:42 +03:00
Andrey Smirnov	46121c9fec	docs: rework machine config documentation generation Generate a structured table of contents following the structure of the config. Make high-level examples follow the full structure of the config. Document new multi-doc machine config. Fixes #8023 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-08 14:16:40 +04:00
Andrey Smirnov	270604bead	fix: support user disks via symlinks The core blockdevice library already supported resolving symlinks, we just need to get the raw block device name from it, and use it afterwards. In QEMU provisioner, leave the first (system) disk as virtio (for performance), and mount user disks as 'ata', which allows `udevd` to pick up the disk IDs (not available for `virtio`), and use the symlink path in the tests. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-05 22:02:56 +04:00
Andrey Smirnov	474fa0480d	fix: store and execute desired action on emergency action Fixes #7854 Talos runs an emergency handler if the sequence experiences an unrecoverable failure. The emergency handler was unconditionally executing "reboot" action if no other action was received (which only gets received if the sequence completes successfully), so the Shutdown request might result in a Reboot behavior on error during shutdown phase. This is not a pretty fix, but it's hard to deliver the intent from one part of the code to another right now, so instead use a global variable which stores default emergency intention, and gets overridden early in the Shutdown sequence. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-04 19:51:48 +04:00
Andrey Smirnov	dbf274ddf7	fix: skip writing the file if the contents haven't changed As the controller reconciles every /etc file present, it might be called multiple times for the same file, even if the actual contents haven't changed. Rewriting the file might lead to some concurrent process seeing incomplete file contents more often than needed. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-04 15:58:03 +04:00
Andrey Smirnov	d8a435f0e4	fix: initialize boot assets with defaults early The problem was that bootloaders were correctly picking up defaults for `installer` mode (vs. `imager` mode), but DTB and other SBC stuff wasn't properly initialized, so installing on SBC fails. Now all options are properly initialized with defaults early in the process. Fixes #8009 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-01 17:47:05 +04:00
Andrey Smirnov	c6835de17a	fix: pick etcd advertised addresses from 'current' addresses Fixes #7947 This way etcd advertised address can be picked from the `external IPs` of the machine. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-01 17:26:28 +04:00
Andrey Smirnov	e71e3e4161	feat: support extra arguments for `flanneld` Fixes #7754 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-12-01 16:18:02 +04:00
Andrey Smirnov	36c8ddb5e1	feat: implement ingress firewall rules Fixes #4421 See documentation for details on how to use the feature. With `talosctl cluster create`, firewall can be easily tested with `--with-firewall=accept|block` (default mode). Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-30 22:58:16 +04:00
Dmitriy Matrenichev	0b111ecb81	fix: support slices of enums and fix NfTablesConntrackStateMatch We already have the code which supports custom enums, so let's extend it to support custom enums in slices and fix the NfTablesConntrackStateMatch proto definition. Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2023-11-30 00:23:16 +03:00
Andrey Smirnov	9a85217412	feat: improve nftables backend Many changes to the nftables backend which will be used in the follow-up PR with #4421. 1. Add support for chain policy: drop/accept. 2. Properly handle match on all IPs in the set (`0.0.0.0/0` like). 3. Implement conntrack state matching. 4. Implement multiple ifname matching in a single rule. 5. Implement anonymous counters. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-29 21:22:47 +04:00
Noel Georgi	f041b26299	chore: add tests for mdadm extension Add tests for mdadm extension. See: https://github.com/siderolabs/extensions/pull/271 Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-11-27 23:18:35 +05:30
Andrey Smirnov	e46e6a312f	feat: implement nftables backend Implement initial set of backend controllers/resources to handle nftables chains/rules etc. Replace the KubeSpan nftables operations with controller-based. See #4421 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-27 21:14:15 +04:00
Dmitriy Matrenichev	ba827bf8b8	chore: support getting multiple endpoints from the `Provision` rpc call The code will rotate through the endpoints, until it reaches the end, and only then it will try to do the provisioning again. Closes #7973 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2023-11-25 21:38:44 +03:00
Dmitriy Matrenichev	dd45dd06cf	chore: add custom node taints This PR adds support for custom node taints. Refer to `nodeTaints` in the `configuration` for more information. Closes #7581 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2023-11-25 18:33:18 +03:00
Dmitriy Matrenichev	70d53ee13c	chore: deprecate .persist and .extensions This commit deprecates those things: - Removes the support of `.persist` flag. From now, it should always be enabled or not defined in the config. - Removes the documentation for `.bootloader`. It never worked anyway. - Adds a warning for `.machine.install.extensions`, suggests to use boot-assets. Closes #7972 Closes #7507 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2023-11-22 20:35:38 +03:00
Noel Georgi	aca8b5e179	fix: ignore kernel command line in container mode Ignore kernel command line for `SideroLink` and `EventsSink` config when running in container mode. Otherwise when running Talos as a docker container in Talos it picks up the host kernel cmdline and tries to configure SideroLink/EventsSink. Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-11-21 18:55:37 +05:30
Andrey Smirnov	27d208c26b	feat: implement OAuth2 device flow for machine config Fixes #7939 See documentation in the PR for the description of the feature. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-20 14:31:43 +04:00
Noel Georgi	5c8fa2a803	chore: start containerd early in boot Start containerd early in the boot process so system extension services start in maintenance mode. Fixes: #7083 Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-11-16 23:19:33 +05:30
Noel Georgi	0d3c3ed716	feat: support kube scheduler config Support kube-scheduler config. Fixes: #7905 Partially fixes: #7911 Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-11-15 10:15:23 +05:30
Andrey Smirnov	06941b7e5c	fix: allow rootfs propagation configuration for extension services Fixes #7873 Some services which perform mounts inside the container which require mounts to propagate back to the host (e.g. `stargz-snapshotter`) require this configuration setting. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-13 21:58:22 +04:00
Noel Georgi	4f1ad16c76	feat: support kubelet credentialprovider config Support configuring kubelet credential provider config. Partially fixes: #7911 Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-11-13 19:40:43 +05:30
Andrey Smirnov	f38eaaab87	feat: rework secureboot and PCR signing key Support different providers, not only static file paths. Drop `pcr-signing-key-public.pem` file, as we generate it on the fly now. See https://github.com/siderolabs/image-factory/issues/19 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-10 21:14:21 +04:00
Dmitriy Matrenichev	6eade3d5ef	chore: add ability to rewrite uuids and set unique tokens for Talos This PR does those things: - It allows API calls `MetaWrite` and `MetaRead` in maintenance mode. - SystemInformation resource now waits for available META - SystemInformation resource now overwrites UUID from META if there is an override - META now supports "UUID override" and "unique token" keys - ProvisionRequest now includes unique token and Talos version For #7694 Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>	2023-11-10 18:17:54 +03:00
Andrey Smirnov	e9c7ac17a9	fix: set max msg recv size when proxying Previously a fix was deployed in the Talos API client, but when the request passes through `apid`, we need to make sure that proxy doesn't reject large responses. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-09 21:08:54 +04:00
Andrey Smirnov	e22ab440d7	feat: update Linux 6.1.61, containerd 1.7.8, runc 1.1.10 Bump tools/pkgs/extras. Update Go dependencies. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-09 20:17:28 +04:00
Noel Georgi	75d3987c05	chore: drop sha1 from generated pcr json Drop `sha1` algorithm from expected PCR json calculation. Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-11-07 22:11:33 +05:30
Andrey Smirnov	87c40da6cc	fix: proper logging in machined on startup Move `setupLogging` inside the controller, so that logger is set up correctly before Talos starts printing first messages. This fixes an inconsistency that first messages are printed using "default" logger, while after that the proper logger is set up, and format of the messages matches kernel log. Also move `waitForUSBDelay` into the sequencer after `udevd` was started (this is when blockdevices including USB ones are discovered). Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-07 17:09:18 +04:00
Andrey Smirnov	a54da5f641	fix: image build for nanopi_4s Path was missing a slash. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-07 16:50:22 +04:00
Andrey Smirnov	6f3cd05935	refactor: update packet capture to use 'afpacket' interface First of all, this interface is way more performant than `pcap` interface. It is Linux-specific, but we don't care in Talos Linux :) Second, this drops the dependency of `machined` on `gopacket/layers` package, which has huge issues with memory allocations and startup time. This cuts around 20MiB of process RSS for all Talos processes. (`talosctl` still requires this `gopacket/layers` library for decoding packets). Fixes #7880 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-07 15:52:04 +04:00
Andrey Smirnov	813442dd7a	fix: don't validate machine.install if installed As Talos doesn't consume `.machine.install` if already installed, there is no point in validating it once already installed. This fixes a problem users often run into: after a reboot/upgrade the system disk blockdevice name changes, due to the kernel upgrade, or just unpredictable behavior of device discovery. Talos fails to boot as it can't validate the machine config, while it's already installed, so actual blockdevice name doesn't matter. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-03 15:08:42 +04:00
Andrey Smirnov	807a9950ac	fix: use custom Talos/kernel version when generating UKI See https://github.com/siderolabs/image-factory/issues/44 Instead of using constants, use proper Talos version and kernel version discovered from the image. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-03 11:14:02 +04:00
Andrey Smirnov	2e78513e16	refactor: drop the dependency link platform -> network ctrl This leads to lots of unnecessary imports, as the chain from network controllers is pretty long. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-01 21:56:49 +04:00
Andrey Smirnov	6dc776b8aa	fix: when writing to META in the installer/imager, use fixed name Use fixed partition name instead of trying to auto-discover by label. Auto-discovery by label might hit completely wrong blockdevice. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-11-01 20:34:41 +04:00
Andrey Smirnov	cbe6e7622d	fix: generate images for SBCs using imager See https://github.com/siderolabs/image-factory/issues/43 Two fixes: * pass path to the dtb, uboot and rpi-firmware explicitly * include dtb, uboot and rpi-firmware into arm64 installer image when generated via imager (regular arm64 installer was fine) (The generation of SBC images was not broken for Talos itself, but only when used via Image Factory). Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-30 13:46:58 +04:00
Utku Ozdemir	5dff164f1c	fix: fix error output of cli action tracker Before we started a reboot/shutdown/reset/upgrade action with the action tracker (`--wait`), we were setting a flag to prevent cobra from printing the returned error from the command. This was to prevent the error from being printed twice, as the reporter of the action tracker already prints any errors occurred during the action execution. But if the error happens too early - i.e. before we even started the status printer goroutine, then that error wouldn't be printed at all, as we have suppressed the errors. This PR moves the suppression flag to be set after the status printer is started - so we still do not double-print the errors, but neither do we suppress any early-stage error from being printed. Closes siderolabs/talos#7900. Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>	2023-10-27 21:16:54 +02:00
Artem Chernyshev	ffa5e05cb9	fix: make Talos work on Rockpi 4c boards again Suppress `efivars` `ENODEV` errors: skip mount and proceed with boot sequence. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2023-10-25 13:51:38 +03:00
Andrey Smirnov	8eba4c5999	feat: generate secrets bundle from the machine config This allows to "recover" secrets if the machine config was generated first without explicitly saving secrets bundle. Fixes #7895 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-25 13:44:14 +04:00
Nico Berlee	a009f5c60c	fix: accept sysctl paths with dots Fixes #7878 Signed-off-by: Nico Berlee <nico.berlee@on2it.net> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-20 21:16:15 +04:00
Nico Berlee	4919f6ee22	feat: add GOMEMLIMIT to shipped manifests with memory limits This commit integrates the GOMEMLIMIT environment variable into shipped K8S manifests when resources.limits.memory is defined. It is set to 95% of the memory limit to optimize the performance of the Go garbage collector, mitigating the risk of OOMKills in containerized environments. When configuring the controller-manager or scheduler custom resources in machine config, they were accepted, but ignored. This commit adds Resources to NewControlPlaneSchedulerController and NewControlPlaneControllerManagerController so machine config resources Fixes #7874 Signed-off-by: Nico Berlee <nico.berlee@on2it.net> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-20 20:41:40 +04:00
Andrey Smirnov	9dfae8467d	chore: update dependencies Containerd 1.7.7, Linux 6.1.58. Fixes #7859 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-17 17:41:38 +04:00
Serge Logvinov	38ce3c827a	feat: nocloud prefer mac address Use MAC address over network interface name. Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-16 17:35:58 +04:00
Andrey Smirnov	c3e4182000	refactor: use COSI runtime with new controller runtime DB See https://github.com/cosi-project/runtime/pull/336 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-12 19:44:44 +04:00
Serge Logvinov	0ff7350abe	fix: oracle integration fixes * Set static gateway IPv6 if possible. Some CNIs do not work properly with IPv6, so we will fix it. * Disable talos dashboard. Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-12 17:51:50 +04:00
Andrey Smirnov	f9639fb531	test: fix 'talosctl gen' tests There were weird hacks put into the tests, while each test already runs in a temporary directory as 'working directory', so no hacks are needed. Moreover, using fixed `/tmp/...` paths leads to test failures, as CI runs docker & QEMU tests in parallel conflicting with each other. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-12 16:24:02 +04:00
Andrei Kvapil	6142d87a0f	feat: hostname configuration improvements on the NoCloud platform * support for local-hostname parameter * support for hostnames passed via user-data (for Proxmox VE) Signed-off-by: Andrei Kvapil <kvapss@gmail.com> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-12 15:45:25 +04:00
Andrey Smirnov	7bb205ebe2	fix: don't use runtime-specs Mount struct in machine config First of all, it breaks our backwards compatibility promises and breaks documentation generation. Upstream `specs.Mount` might change at any time. The issue was that containerd 1.7.x brings in new `specs.Mount` which contains extra fields which don't have `omitempty` for YAML, so machinery always generates them which confuses old Talos versions. Use a copy of the upstream struct with proper YAML tags, and also provide a special trick to make sure if the upstream struct changes, we have a chance to update our copy of the struct. Also this fixes docs and JSON schema. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-11 23:06:19 +04:00
Thomas Way	b87092ab69	fix: handle secure boot state policy pcr digest error This does not fix the underlying digest mismatch issue, but does handle the error and should provide further insight into issues (if present). Refs: #7828 Signed-off-by: Thomas Way <thomas@6f.io> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-09 18:24:56 +04:00
Thomas Way	336aee0fdb	fix: use tpm2 hash algorithm constants and allow non-SHA-256 PCRs The conversion from TPM 2 hash algorithm to Go crypto algorithm will fail for uncommon algorithms like SM3256. This can be avoided by checking the constants directly, rather than converting them. It should also be fine to allow some non SHA-256 PCRs. Fixes: #7810 Signed-off-by: Thomas Way <thomas@6f.io> Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-10-04 01:02:20 +05:30
Noel Georgi	69d8054c9e	chore: drop UpdateEndpointSuite drop `UpdateEndpointSuite` suite since KubePrism is enabled by default starting Talos 1.6 and the test never passes since K8s node is always ready since it can connect to api server over KubePrism. Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-10-04 00:26:59 +05:30
Andrey Smirnov	ef7be16c80	fix: clear the encryption config in META when STATE is reset When STATE is reset, we need to make sure we wipe the META keys containing encryption config as well. Fixes #7819 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-10-03 22:00:21 +04:00
Noel Georgi	9b5cfdd0bc	chore: add tests for iscsi Add tests for iscsi to make sure it works. Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-10-02 22:12:42 +05:30
Andrey Smirnov	10ed130679	fix: the node IP for kubelet shouldn't change if nothing matches This was a fix some time ago, but it was incorrect (missing `continue`), which was failing the unit-tests. Also fix a data race in another unit-test (which is unit-test only, not affecting production). Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-29 14:22:52 +04:00
Andrey Smirnov	e7575ecaae	feat: support n-5 latest Kubernetes versions For Talos 1.6 this means 1.24-1.29 Kubernetes. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-29 13:41:56 +04:00
Andrey Smirnov	2b548ad0d9	feat: update containerd to 1.7.x Also update Linux and other pkgs. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-28 16:33:57 +04:00
Andrey Smirnov	a52d3cda3b	chore: update gen and COSI runtime No actual changes, adapting to use new APIs. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-22 12:13:13 +04:00
Noel Georgi	9c2ba7c6fa	chore: add tests for chelsio drivers Add tests for Chelsio drivers and firmware. Ref: https://github.com/siderolabs/extensions/pull/232 Signed-off-by: Noel Georgi <git@frezbo.dev>	2023-09-20 20:07:25 +05:30
Andrey Smirnov	5ca4d58dc9	fix: generation of modules.dep when on the machine When running on the machine, the extensionTreePath is not writeable, so create and clean up a temporary directory to host `modules.dep` extension. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-20 15:51:22 +04:00
Andrey Smirnov	96f2a62eaf	test: update upgrade tests versions Use the 1.4/1.5 releases. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-19 14:53:25 +04:00
Andrey Smirnov	e3b4940588	fix: build CPU ucode correctly for early loader Closes #7729 This follows the steps described in https://www.kernel.org/doc/html/v6.1/x86/microcode.html#early-load-microcode Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-18 14:03:41 +04:00
Andrey Smirnov	c5bd0ac5cf	refactor: reimplement the depmod extension rebuilder Drop loop device/mounts completely, use userspace utilities to extract and lay over module trees in the tmpfs. Discover kernel version automatically instead of hardcoding it to be current one (required for Image Service). Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-15 21:51:42 +04:00
Andrey Smirnov	a7edd0523f	fix: set default route priority for hcloud platform Otherwise route gets created with priority '0' and it seems to get into conflict with what Cilium tries to add. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-15 14:47:45 +04:00
Andrey Smirnov	9698e45479	fix: handle correctly change of listen address for maintenance service Fixes #7738 If the SideroLink address changes, maintenance service should listen on new address. Previously it worked "sometimes", as there was a race between the maintenance config being removed/recreated and just being updated. In case of an update the listen address was not updated properly, but recreate case worked correctly. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-14 19:07:22 +04:00
Andrey Smirnov	a096f05a56	chore: update gRPC library and enable shared write buffers Fixes #7576 See https://github.com/grpc/grpc-go/pull/6309 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-13 21:27:46 +04:00
Artem Chernyshev	2960f93baa	feat: add readonly information to the disks API response Forward device readonly info from `go-blockdevice` library. Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>	2023-09-12 18:09:59 +03:00
Serge Logvinov	3f52320752	feat: upgrade-k8s without comments This feature allows us to remove any comments from the machineconfig after upgrading Kubernetes. Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev> Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>	2023-09-12 14:50:56 +04:00

2110 Commits