4544 Commits

Author SHA1 Message Date
Andrey Smirnov
7d43c9aa6b
chore: annotate installer errors
I want to find out exactly where a spurious `ENODEV` error comes from.
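
A minimal sketch of what annotating errors can look like in Go, assuming the usual `fmt.Errorf` wrapping with `%w`. The step and function names (`mountStep`, `install`, `isNoDevice`) are hypothetical and are not taken from the installer code:

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// mountStep is a hypothetical installer step that fails with ENODEV.
func mountStep() error {
	return syscall.ENODEV
}

// install wraps the error of each step with a short description of the
// step, so the final message shows where the error originated.
func install() error {
	if err := mountStep(); err != nil {
		return fmt.Errorf("mounting boot partition: %w", err)
	}

	return nil
}

// isNoDevice reports whether ENODEV appears anywhere in the error chain.
func isNoDevice(err error) bool {
	return errors.Is(err, syscall.ENODEV)
}
```

Because `%w` keeps the original error in the chain, callers can still match on `ENODEV` after the annotation is added.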

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-21 16:58:34 +04:00
Andrey Smirnov
f737e6495c
fix: populate routes to BGP neighbors (Equinix Metal)
Fixes #8267

Also refactor the code so that we don't fail hard on multiple bonds. It's
still not clear how to attach addresses, as they don't have an
interface name field, so for now they are attached to the first bond.

Fixes #8411

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-21 15:44:21 +04:00
Dmitriy Matrenichev
19f15a840c
chore: bump golangci-lint to 1.57.0
Fix all discovered issues.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-03-21 01:06:53 +03:00
Noel Georgi
6840119632
docs: add docs for overlays
Add docs for overlays.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-03-20 19:19:43 +05:30
Noel Georgi
9b6ec5929a
chore: bump kernel
Bump PKGS to bring in kernel with new config options and more KSPP
fixes.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-03-20 17:54:24 +05:30
goodmost
69f0466cd8
docs: remove repetitive words
Documentation fixes.

Signed-off-by: goodmost <zhaohaiyang@outlook.com>
2024-03-19 20:58:09 +04:00
Artem Chernyshev
113fb646ec
chore: use go-talos-support library
The code for collecting Talos `support.zip` was extracted there.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2024-03-19 18:28:46 +03:00
Andrey Smirnov
89fc68b459
fix: service lifecycle issues
The core change is moving the context out of the `ServiceRunner` struct
to be a local variable, and using a channel to notify about shutdown
events.

Add more synchronization between `Run` and the moment the service starts, to
avoid misidentifying a service that is not running yet as successfully finished.
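
The pattern described above could be sketched roughly as follows. This is a simplified illustration, not the actual `ServiceRunner`: the `runner` type, its fields, and `blockUntilCanceled` are hypothetical names, and the sketch assumes `Run` is called at most once per runner.

```go
package main

import (
	"context"
	"errors"
	"sync"
)

// runner is a hypothetical, simplified stand-in for a service runner.
// It stores no context; shutdown is requested through a channel.
type runner struct {
	stopCh   chan struct{} // closed to request shutdown
	stopOnce sync.Once
	started  chan struct{} // closed once the service goroutine is running
}

func newRunner() *runner {
	return &runner{stopCh: make(chan struct{}), started: make(chan struct{})}
}

// Stop notifies Run about shutdown; it is safe to call more than once.
func (r *runner) Stop() { r.stopOnce.Do(func() { close(r.stopCh) }) }

// Started lets callers tell "not running yet" apart from "finished".
func (r *runner) Started() <-chan struct{} { return r.started }

// Run keeps the context as a local variable and cancels it on shutdown.
func (r *runner) Run(service func(ctx context.Context) error) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	errCh := make(chan error, 1)

	go func() {
		close(r.started)
		errCh <- service(ctx)
	}()

	select {
	case err := <-errCh: // the service finished on its own
		return err
	case <-r.stopCh: // shutdown requested: cancel and wait for the service
		cancel()

		if err := <-errCh; err != nil && !errors.Is(err, context.Canceled) {
			return err
		}

		return nil
	}
}

// blockUntilCanceled is a sample service that runs until it is canceled.
func blockUntilCanceled(ctx context.Context) error {
	<-ctx.Done()

	return ctx.Err()
}
```

A caller would start `Run` in a goroutine, wait on `Started()`, and only then interpret a return from `Run` as the service having finished.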

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Co-authored-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-03-19 18:11:13 +04:00
Andrey Smirnov
ead37abf09
test: disable volume tests
They're flaky, disable until the root cause is known.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-19 16:40:42 +04:00
Andrey Smirnov
c64523a7a1
feat: update Flannel to v0.24.4
See https://github.com/flannel-io/flannel/releases/tag/v0.24.4

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-18 18:55:14 +04:00
Andrey Smirnov
15beb14780
feat: implement blockdevice watch controller
This controller combines kobject events with a scan of `/sys/block` to
build a consistent list of available block devices, updating resources
as the block devices change.

Based on these resources, a later step can probe the block devices
as they change to present a consistent view of filesystems/partitions.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-18 18:28:40 +04:00
Dmitriy Matrenichev
06e3bc0cbd
feat: implement Siderolink wireguard over GRPC
For #8064

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-03-18 15:38:13 +03:00
Andrey Smirnov
9afa70baf3
fix: patch correctly config in talosctl upgrade-k8s
The current code was stripping non-`v1alpha1.Config` documents. Provide a
proper method in the config provider, and update places using it.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-15 20:42:44 +04:00
Andrey Smirnov
3130caf954
chore: re-enable DRBD extension
See https://github.com/siderolabs/extensions/pull/343

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-15 15:55:18 +04:00
Andrey Smirnov
3ba180d07d
release(v1.7.0-alpha.1): prepare release
This is the official v1.7.0-alpha.1 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-14 19:14:09 +04:00
Andrey Smirnov
403ad93c35
feat: update dependencies
containerd 1.7.14
Linux 6.6.21

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-14 16:17:24 +04:00
Utku Ozdemir
7376f34e82
fix: remove maintenance config when maintenance service is shut down
We now remove the machine config with the id `maintenance` when we are done with it - when the maintenance service is shut down.

Closes siderolabs/talos#8424, where in some configurations there would be machine configs with both `v1alpha1` and `maintenance` IDs present, causing the `talosctl edit machineconfig` to loop twice and causing confusion.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2024-03-14 12:51:59 +01:00
Noel Georgi
952801d8b2
fix: handle overlay partition options
Handling of Overlay PartitionOpts was missed in the previous code.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-03-14 15:39:59 +05:30
Andrey Smirnov
465b9a4e6c
fix: update discovery client with the fix for keepalive interval
See https://github.com/siderolabs/discovery-client/pull/9

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-13 16:25:57 +04:00
Andrey Smirnov
1e9f866aca
feat: update Kubernetes to v1.30.0-beta.0
See https://github.com/kubernetes/kubernetes/releases/tag/v1.30.0-beta.0

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-13 15:35:44 +04:00
Noel Georgi
d118a852b9
feat: implement Install for imager overlays
Implement `Install` for imager overlays.
Also add support for generating installers.

Depends on: #8377

Fixes: #8350
Fixes: #8351

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-03-12 22:46:29 +05:30
Andrey Smirnov
cd5a5a4474
chore: migrate to go-grpc-middleware/v2
See https://github.com/grpc-ecosystem/go-grpc-middleware

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-12 16:10:04 +04:00
Andrey Smirnov
e3c2a63981
feat: set default NTP server to time.cloudflare.com
Fixes #8396

Pros:

* IPv6
* good CDN, small RTT

Cons:

* not community-run

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-12 14:43:14 +04:00
Dmitriy Matrenichev
32e0877607
chore: print all available logs containers in logs command completions
This is a small quality-of-life improvement that allows the `logs` subcommand to suggest all available logs.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-03-11 17:48:01 +03:00
Noel Georgi
e89d755c52
fix: etcd config validation for worker
Fixes an ambiguous error when etcd config is supplied to a worker as a
patch.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-03-11 17:23:29 +05:30
james-dreebot
1aa3c91821
docs: add DreeBot to ADOPTERS.md
Explain how DreeBot leverages Talos

Signed-off-by: James Sevener (DreeBot) <128485016+james-dreebot@users.noreply.github.com>
2024-03-08 09:20:18 -05:00
Utku Ozdemir
1bb6027ccd
fix: fix nil panic on maintenance upgrade with partial config
Fix the nil dereferences that occur when an upgrade is attempted on a Talos node that is in maintenance mode and has a partial machine config.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2024-03-08 12:52:21 +03:00
Pip Oomen
aa70bfb9dc
docs: add Redpill Linpro to adopters list
Explain how Redpill Linpro uses Talos.

Signed-off-by: Pip Oomen <pepijn@redpill-linpro.com>
Signed-off-by: Justin Garrison <justin.garrison@siderolabs.com>
2024-03-07 13:11:48 -08:00
Utku Ozdemir
f02aeec922
fix: do not fail cluster create when input dir does not contain talosconfig
As `--input-dir` flag now supports partial configs, it should not fail when there is no talosconfig in the directory.

This was the missing part in siderolabs/talos#8333.

Additionally, allow the `--cidr` flag when `--input-dir` is used - it is used even when the input configs are provided.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2024-03-07 23:13:10 +03:00
Noel Georgi
1ec6683e0c
chore: use go-copy
Use go-copy and drop `pkg/copy`.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-03-07 19:51:28 +05:30
Artem Chernyshev
3c8f51d707
chore: move cli formatters and version modules to machinery
To be used in the `go-talos-support` module without importing the whole
Talos repo.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2024-03-07 16:29:15 +03:00
Andrey Smirnov
8152a6dd6b
feat: update Go to 1.22.1
Update Go and other dependencies as well.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-07 15:53:29 +04:00
Sebastiaan Gerritsen
8c79539914
docs: update replicated-local-storage-with-openebs-jiva.md
Change the path.

Signed-off-by: Sebastiaan Gerritsen <50165934+sebastiaan-dev@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-03-04 14:34:21 +04:00
Noel Georgi
f23bd81448
fix: syslog parser
Fixes a condition when the timestamp contains a single-digit day.
This started failing when the month started :sweat_smile:.

Also handle a case when `tag` and `hostname` are both missing.
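
For illustration, classic syslog timestamps pad a single-digit day with a space (`Mar  4 11:08:46`). The sketch below shows one way to accept that form in Go using the standard `time.Stamp` layout. The function name is hypothetical, and this is not necessarily how the parser in this commit handles it.

```go
package main

import "time"

// parseSyslogTime parses a syslog-style timestamp without a year, such as
// "Mar  4 11:08:46" or "Mar 14 11:08:46". The "_2" element in time.Stamp
// ("Jan _2 15:04:05") accepts a space-padded day of the month.
func parseSyslogTime(s string) (time.Time, error) {
	return time.Parse(time.Stamp, s)
}
```

The parsed value has no year, so a real parser would still need to fill it in from the current date.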

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-03-04 11:08:46 +05:30
Andrey Smirnov
bbed07e03a
feat: update Linux to 6.6.18
ZFS extension got re-enabled for 1.7.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-29 20:08:59 +04:00
Noel Georgi
8125e754b8
feat: imager overlay
Support overlays for imager.
The `Install` interface is not wired yet, it will be done as a different
PR.

This should be a no-op for existing imager.

Part of: #8350

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-02-29 20:44:31 +05:30
Andrey Smirnov
0b9b4da12a
feat: update Kubernetes to 1.30.0-alpha.3
See https://github.com/kubernetes/kubernetes/releases/tag/v1.30.0-alpha.3

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-29 14:36:09 +04:00
ebcrypto
3a764029ea
docs: fix typo in word governor
Docs typo.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-28 17:23:45 +04:00
Andrey Smirnov
d81d490003
chore: update CoreDNS renovate source
As we're using a mirrored image from `registry.k8s.io`, use that as a
source instead of GitHub. The mirrored image appears with some delay after
an official CoreDNS release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-27 17:12:25 +04:00
Andrey Smirnov
b2ad5dc5f8
fix: workaround a race in CNI setup (talosctl cluster create)
When provisioning VMs, each launch process sets up CNI network, and from
time to time CNI setup fails with something like:

```
error provisioning CNI network: plugin type="firewall" failed (add): running [/sbin/iptables -t filter -N CNI-ADMIN --wait]: exit status 4: iptables v1.8.10 (nf_tables)
```

This is a race condition in the CNI plugins, and it looks like there is no
fix for it (see e.g. https://github.com/hashicorp/nomad/issues/8838).

As a workaround, take a mutex around CNI operations to serialize them.
CNI setup happens in different processes, so use a file-based mutex.
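
A file-based mutex across processes can be sketched with an advisory `flock(2)` lock, as below. This is an assumption about the general technique, not the code from this commit: the helper names and the lock path are hypothetical, and `syscall.Flock` is only available on Unix-like systems.

```go
package main

import (
	"os"
	"syscall"
)

// withFileLock runs fn while holding an exclusive advisory lock on path.
// Other processes calling withFileLock on the same path block until the
// lock is released, which serializes the protected operation.
func withFileLock(path string, fn func() error) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()

	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		return err
	}
	defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN) //nolint:errcheck

	return fn()
}

// tryLock reports whether the lock on path is currently free. It takes
// the lock without blocking and releases it again right away.
func tryLock(path string) (bool, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return false, err
	}
	defer f.Close()

	err = syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB)
	if err == syscall.EWOULDBLOCK {
		return false, nil
	}

	if err != nil {
		return false, err
	}

	return true, syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
}
```

Each launch process would wrap its CNI setup call in `withFileLock`, using a lock file path shared by all of them.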

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-27 15:28:12 +04:00
Andrey Smirnov
457507803d
fix: provide auth when pulling images in the imager
Use standard Docker/Podman auth methods plus `GITHUB_TOKEN`.

See #8363

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-27 11:50:48 +04:00
Spencer Smith
e707175ab5
docs: update config patch in cilium docs
We missed the `cluster` key in the config patch. Fixed to avoid user confusion.

Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
2024-02-26 14:35:08 -05:00
Dmitriy Matrenichev
f8c556a1ce
chore: listen for dns requests on 127.0.0.53
Turns out there is actually no black magic in systemd: it simply listens on 127.0.0.53 and points resolv.conf there so that DNS requests are forwarded to it.
The reason is the same as ours: to preserve compatibility with other applications. So we do the same in our code.

This PR also does two things:
- Adds `::1` into resolv.conf for IPv6 only resolvers.
- Drops `SO_REUSEPORT` from control options (it works without it).

Closes #8328

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-02-26 20:59:12 +03:00
Andrey Smirnov
8872a7a210
fix: ignore 'no such device' in addition to 'no such file'
This error pops up when `udevd` rescans the partition table while Talos is
trying to mount a device concurrently.

This is probably something new with Linux 6.6.
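
As a rough sketch, treating both errors as ignorable could look like the check below. The function name and the sample errors are hypothetical, and the actual mount code may structure this differently.

```go
package main

import (
	"errors"
	"os"
	"syscall"
)

// isTransientMountError reports whether err is "no such file" (ENOENT)
// or "no such device" (ENODEV), anywhere in the error chain.
func isTransientMountError(err error) bool {
	return errors.Is(err, os.ErrNotExist) || errors.Is(err, syscall.ENODEV)
}

// Sample errors shaped like what a mount attempt might return; the
// device path is made up for the example.
var (
	errNoDevice error = &os.PathError{Op: "mount", Path: "/dev/sda6", Err: syscall.ENODEV}
	errNoFile   error = &os.PathError{Op: "mount", Path: "/dev/sda6", Err: syscall.ENOENT}
	errBusy     error = &os.PathError{Op: "mount", Path: "/dev/sda6", Err: syscall.EBUSY}
)
```

Using `errors.Is` rather than comparing message strings keeps the check working when the error is wrapped with extra context.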

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-26 20:00:05 +04:00
Noel Georgi
1cb5443530
chore: uki der certs in iso
Add the UKI signing cert to the ISO.

Fixes: #8131

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-02-26 19:35:14 +05:30
Andrey Smirnov
67ac6933d3
fix: handle errors to watch apid/trustd certs
Fixes #8345

Both `apid` and `trustd` services use a gRPC connection back to
`machined` to watch changes to the certificates (new certificates being
issued).

This refactors the code to follow regular conventions, so that a failure
to watch will crash the process, and they have a way to restart and
re-establish the watch.

Use the context and errgroup consistently.
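
The convention described above, where the first failing task cancels the shared context and its error ends the process, can be sketched with the standard library alone. `runAll` below is a hypothetical stand-in for what `errgroup.WithContext` from `golang.org/x/sync` provides, and `watchCerts` and `serve` are made-up tasks rather than the real `apid`/`trustd` code.

```go
package main

import (
	"context"
	"errors"
	"sync"
)

// runAll runs every task with a shared context and returns the first
// error. The first failure cancels the context so the other tasks stop.
func runAll(ctx context.Context, tasks ...func(context.Context) error) error {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)

	for _, task := range tasks {
		wg.Add(1)

		go func(task func(context.Context) error) {
			defer wg.Done()

			if err := task(ctx); err != nil {
				once.Do(func() {
					firstErr = err

					cancel()
				})
			}
		}(task)
	}

	wg.Wait()

	return firstErr
}

var errWatchFailed = errors.New("certificate watch failed")

// watchCerts simulates a watch that fails immediately.
func watchCerts(context.Context) error { return errWatchFailed }

// serve simulates a server that runs until the context is canceled.
func serve(ctx context.Context) error {
	<-ctx.Done()

	return ctx.Err()
}

// run returns the watch error; a main function would exit non-zero on
// it so that the supervisor restarts the process.
func run() error {
	return runAll(context.Background(), watchCerts, serve)
}
```

Letting the process exit on a failed watch leaves the restart and the re-established watch to whatever supervises the service.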

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-23 17:38:56 +04:00
Christian WALDBILLIG
c79d69c2e2
fix: only set gateway if set in context (opennebula)
Fix the network config setup.

Signed-off-by: Christian WALDBILLIG <christian@waldbillig.io>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-23 17:05:33 +04:00
Dmitry Sharshakov
4575dd8e74
chore: allow not preallocated disks for QEMU cluster
Preallocation is still done by default for correct max usage estimates, but
in a development environment it could be beneficial not to use up that
space, so I added a flag to disable preallocation.

Signed-off-by: Dmitry Sharshakov <d3dx12.xx@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-23 16:45:44 +04:00
Kai Hanssen
0bddfea818
chore: add oceanbox.io to adopters
Add [oceanbox.io](oceanbox.io) to adopters list.

Signed-off-by: Kai Hanssen <hanssen.a.kai@outlook.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-02-23 09:26:52 +05:30
Noel Georgi
1364275926
chore: use proper talos_version_contract for TF tests
Use proper `talos_version_contract` for TF tests.

Depends on: https://github.com/siderolabs/contrib/pull/36

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-02-22 22:35:10 +05:30