4604 Commits

Author SHA1 Message Date
Andrey Smirnov
e8758dcbad
chore: support http downloads for assets in talosctl cluster create
This allows passing direct URLs to Image Factory assets (disk image, ISO,
vmlinuz, initramfs), so that we can test Image Factory with Talos.

Also add an integration test for Image Factory.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-25 18:58:25 +04:00
Andrey Smirnov
265f21be09
fix: replace the filemap implementation to not buffer in memory
This filemap is used to generate installer image layer with artifacts.

The previous naive implementation buffered everything in memory, which
led to excessive memory usage.

See https://github.com/siderolabs/image-factory/issues/77

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-22 19:07:46 +04:00
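For commit 265f21be09 above, a minimal sketch of the streaming idea, assuming the layer is a tar archive built from files on disk. The helper `addFile`, the `demo` function, and the archive path `artifacts/vmlinuz` are hypothetical names for illustration, not the filemap API.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// addFile streams one file from disk into the tar writer (hypothetical helper).
// io.Copy moves the data in chunks, so memory use does not grow with file size.
func addFile(tw *tar.Writer, srcPath, archivePath string) error {
	f, err := os.Open(srcPath)
	if err != nil {
		return err
	}
	defer f.Close()

	st, err := f.Stat()
	if err != nil {
		return err
	}

	hdr := &tar.Header{Name: archivePath, Mode: 0o644, Size: st.Size()}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}

	_, err = io.Copy(tw, f)

	return err
}

// demo archives a temporary file and returns the entry name and size read back.
// The bytes.Buffer destination is only for this demo; a real layer would be
// streamed on to its consumer.
func demo() (string, int64, error) {
	dir, err := os.MkdirTemp("", "filemap")
	if err != nil {
		return "", 0, err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "vmlinuz")
	if err := os.WriteFile(src, bytes.Repeat([]byte("x"), 4096), 0o644); err != nil {
		return "", 0, err
	}

	var out bytes.Buffer

	tw := tar.NewWriter(&out)
	if err := addFile(tw, src, "artifacts/vmlinuz"); err != nil {
		return "", 0, err
	}
	if err := tw.Close(); err != nil {
		return "", 0, err
	}

	hdr, err := tar.NewReader(&out).Next()
	if err != nil {
		return "", 0, err
	}

	return hdr.Name, hdr.Size, nil
}

func main() {
	name, size, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(name, size)
}
```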
Andrey Smirnov
8db3c5b3c6
fix: correctly pick base installer image layers
Only Talos 1.5+ provides a properly optimized image.
Talos 1.4 provided a single-layer image (which worked in this case),
while Talos 1.2-1.3 have multi-layered images which can't be replaced
easily.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-22 17:09:05 +04:00
Andrey Smirnov
0a30ef7845
fix: imager should support different Talos versions
Add some quirks to make images generated with newer Talos compatible
with images generated by older Talos.

Specifically, reset options were added in Talos 1.4, so we shouldn't
add them for older versions.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-22 16:13:34 +04:00
Andrey Smirnov
d6342cda53
docs: update latest version to v1.6.1
Also port a fix from #8103

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-22 14:42:03 +04:00
Andrey Smirnov
e6e422b92a
chore: bump dependencies
Go modules, tools, etc.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-21 19:01:16 +04:00
Andrey Smirnov
5a19d078ad
fix: properly overwrite files on install
Without truncate the file was not overwritten properly if the file with
the same name already exists and has smaller size.

Fixes #8097

Also add a 10 second timeout on UEFI ISO boot, so that boot menu can be
seen without pressing `Esc` many times.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-20 19:41:30 +04:00
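For commit 5a19d078ad above, a minimal sketch of the truncate behaviour using only the Go standard library. In general, opening an existing file for writing without `O_TRUNC` leaves any bytes beyond the newly written length in place. The `overwrite` and `demo` functions are hypothetical names, not the installer's code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// overwrite replaces the contents of path with data (hypothetical helper).
// O_TRUNC discards the previous contents, so no stale tail survives when the
// new data is shorter than what was there before.
func overwrite(path string, data []byte) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o644)
	if err != nil {
		return err
	}

	if _, err := f.Write(data); err != nil {
		f.Close()

		return err
	}

	return f.Close()
}

// demo writes a long payload, then a short one, and returns what is on disk.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "overwrite")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "file")

	if err := overwrite(path, []byte("a long original payload")); err != nil {
		return "", err
	}
	if err := overwrite(path, []byte("short")); err != nil {
		return "", err
	}

	got, err := os.ReadFile(path)

	return string(got), err
}

func main() {
	got, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", got)
}
```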
Tim Jones
9eb6cea789
docs: secureboot sd-boot menu clarification
Add note to try spamming Esc to bring up the sd-boot menu option if keys
don't automatically enroll in UEFI firmware.

Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
2023-12-19 18:19:31 +01:00
Andrey Smirnov
01f0cbe61c
feat: support iPXE direct booting in talosctl cluster create
This embeds a tiny TFTP server which serves UEFI iPXE which embeds a
script that chainloads a given iPXE script.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-19 17:56:08 +04:00
Andrey Smirnov
3ba84701d9
feat: pull in kernel modules for mlx Infiniband and VFIO
See:

* https://github.com/siderolabs/pkgs/pull/854
* https://github.com/siderolabs/pkgs/pull/855

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-19 13:55:42 +04:00
Andrey Smirnov
ba993e0edd
docs: announce that SecureBoot is available
Restructure the docs a bit to start with the easiest option (via Image
Factory).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-18 20:43:08 +04:00
Andrey Smirnov
241bc9312e
fix: update the way secureboot signer fetches certificate (azure)
The previous code was a mistake: the public part of the certificate is
more easily available.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-18 17:54:51 +04:00
Dmitriy Matrenichev
59b62398f6
chore: modernize machined/pkg/controllers/k8s
This is going to be a multipart effort to finally use safe.* wrappers in the production code.
Quick regexp search shows that there are around 150 direct type assertions on resources (excluding the ones in this commit).

Also - migrate from `interface{}` to `any` and use `slices.Sort*` instead of `sort.*` where possible.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-12-15 19:33:06 +03:00
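For commit 59b62398f6 above, a rough sketch of the three patterns the message names: a checked conversion instead of a direct type assertion, `any` instead of `interface{}`, and `slices.SortFunc` instead of `sort.Slice`. The helper `as`, the `node` type, and `sortedNames` are hypothetical; `as` only illustrates the idea and is not the `safe.*` API. The example assumes Go 1.21 or newer for the `slices` and `cmp` packages.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// as is a hypothetical helper: it turns an unchecked type assertion,
// which would panic on mismatch, into an ordinary error.
func as[T any](v any) (T, error) {
	typed, ok := v.(T)
	if !ok {
		var zero T

		return zero, fmt.Errorf("unexpected type %T", v)
	}

	return typed, nil
}

// node stands in for a typed resource (hypothetical type).
type node struct {
	name string
}

// sortedNames converts untyped items to nodes and returns their names sorted.
func sortedNames(items []any) ([]string, error) {
	nodes := make([]node, 0, len(items))

	for _, item := range items {
		n, err := as[node](item)
		if err != nil {
			return nil, err
		}

		nodes = append(nodes, n)
	}

	// slices.SortFunc takes a typed comparison, unlike the index-based sort.Slice.
	slices.SortFunc(nodes, func(a, b node) int { return cmp.Compare(a.name, b.name) })

	names := make([]string, 0, len(nodes))
	for _, n := range nodes {
		names = append(names, n.name)
	}

	return names, nil
}

func main() {
	names, err := sortedNames([]any{node{name: "worker-2"}, node{name: "controlplane-1"}})
	fmt.Println(names, err)

	_, err = sortedNames([]any{42})
	fmt.Println(err)
}
```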
Andrey Smirnov
760f793d55
fix: use correct prefix when installing SBC files
When creating an image under non-default mount prefix, it should be
used explicitly when copying SBC files.

See https://github.com/siderolabs/image-factory/issues/65

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-15 19:46:10 +04:00
Noel Georgi
0b94550c42
chore: fix the gvisor test
The gvisor test was not using the correct runtimeclass and would have
always passed regardless.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-12-15 20:48:44 +05:30
Andrey Smirnov
3a787c1d67
docs: update 1.6 docs with Noel's feedback
I merged docs PR before receiving those updates.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-15 18:48:17 +04:00
Andrey Smirnov
d803e40ef2
docs: provide documentation for Talos 1.6
Updated lots of documentation with new/updated flows.

Provide What's New for Talos 1.6.0.

Update Troubleshooting guide to cover more steps.

Make Talos 1.6 docs the default.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-15 16:36:57 +04:00
Andrey Smirnov
9a185a30f7
feat: update Kubernetes to v1.29.0
See https://github.com/kubernetes/kubernetes/releases/v1.29.0

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-13 22:59:17 +04:00
Andrey Smirnov
5934815d2f
chore: split more kernel modules on amd64
See https://github.com/siderolabs/pkgs/pull/844

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-13 21:26:32 +04:00
Andrey Smirnov
10c59a6b90
fix: leave discovery service later in the reset sequence
Fixes #8057

I went back and forth on the way to fix it exactly, and ended up with a
pretty simple version of a fix.

The problem was that discovery service was removing the member at the
initial phase of reset, which actually still requires KubeSpan to be up:

* leaving `etcd` (need to talk to other members)
* stopping pods (might need to talk to Kubernetes API with some CNIs)

Now leaving discovery service happens way later, when network
interactions are no longer required.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-13 19:16:12 +04:00
Noel Georgi
0c86ca1cc6
chore: enable kubespan+firewall for cilium tests
Enable kubespan and default block firewall with cilium tests.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-12-12 22:50:47 +05:30
Andrey Smirnov
98fd722d51
feat: provide compatibility for future Talos 1.7
Ensure that Talos 1.6 machinery can handle compatibility for Talos 1.7.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-12 15:10:11 +04:00
Andrey Smirnov
131a1b1671
fix: add a KubeSpan option to disable extra endpoint harvesting
It works well for small clusters, but with bigger clusters it puts too
much load on the discovery service, as it has quadratic complexity in
number of endpoints discovered/reported from each member.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-12 14:07:31 +04:00
Artem Chernyshev
4547ad9afa
feat: send actor id to the SideroLink events sink
This might come in handy to distinguish sequences and tasks initiated by a
particular API request.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-12-11 21:59:02 +03:00
Andrey Smirnov
04e7745471
docs: cap max heading level
Markdown/HTML can't have headings after level 6, so make sure the
maximum heading level is capped at 6.

We have just a single place with such deep nesting.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-11 18:39:18 +04:00
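For commit 04e7745471 above, the cap can be sketched as a simple clamp. The function name `headingPrefix` and the constant `maxHeadingLevel` are hypothetical and not taken from the docs generator.

```go
package main

import (
	"fmt"
	"strings"
)

// maxHeadingLevel is the deepest heading Markdown/HTML supports.
const maxHeadingLevel = 6

// headingPrefix returns the Markdown heading marker for a nesting level,
// clamped to the range 1..maxHeadingLevel (hypothetical helper).
func headingPrefix(level int) string {
	if level < 1 {
		level = 1
	}

	if level > maxHeadingLevel {
		level = maxHeadingLevel
	}

	return strings.Repeat("#", level)
}

func main() {
	for _, level := range []int{3, 6, 8} {
		fmt.Printf("level %d: %s title\n", level, headingPrefix(level))
	}
}
```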
Dmitriy Matrenichev
6bb1e99aa3
chore: optimize pcap dump
Reimplement `gopacket.PacketSource.PacketsCtx` as `forEachPacket`.

- Use `ZeroCopyPacketDataSource` instead of `PacketDataSource`. I didn't find any specific reason why `PacketDataSource` exists at all, since `NewPacket` is doing copy inside if you don't explicitly tell it not to.
- Use `WillPool` to pool packet buffers. It doesn't fully remove allocations, but it's a safe start.
  Send packets back into the pool after we are done with them.
- Pass `Packet` directly to the closure instead of waiting for it on the channel. We don't store this packet anywhere so there is no reason to async this part.
- Drop `time.Sleep` code in `forEachPacket` body.
- Drop `SnapLen` support in client and server since it didn't work anyway (details in the PR).

Closes #7994

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-12-11 15:44:42 +03:00
Andrey Smirnov
4f9d3b975f
feat: update Kubernetes to v1.29.0-rc.2
See https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.29.md

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-08 19:41:28 +04:00
Andrey Smirnov
46121c9fec
docs: rework machine config documentation generation
Generate a structured table of contents following the structure of the
config.

Make high-level examples follow the full structure of the config.

Document new multi-doc machine config.

Fixes #8023

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-08 14:16:40 +04:00
Andrey Smirnov
e128d3c827
fix: talosctl cluster create not to enforce kubeprism always
The command should be able to deploy old versions of Talos as well,
even before KubePrism.

The version contract correctly enables/disables KubePrism by default, so
take default flag value as "don't change defaults".

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-07 18:15:54 +04:00
Andrey Smirnov
320064c5a8
feat: update Go 1.21.5, Linux 6.1.65, etcd 3.5.11
For main version, cut the release notes to start the 1.7 process.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-07 16:52:28 +04:00
Andrey Smirnov
270604bead
fix: support user disks via symlinks
The core blockdevice library already supported resolving symlinks, we
just need to get the raw block device name from it, and use it
afterwards.

In QEMU provisioner, leave the first (system) disk as virtio (for
performance), and mount user disks as 'ata', which allows `udevd` to
pick up the disk IDs (not available for `virtio`), and use the symlink
path in the tests.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-05 22:02:56 +04:00
Andrey Smirnov
4f195dd271
chore: fix the release.toml
It was using `note` instead of `notes`, so some entries got dropped.

I blame CodePilot for that ;)

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-04 20:23:03 +04:00
Andrey Smirnov
474fa0480d
fix: store and execute desired action on emergency action
Fixes #7854

Talos runs an emergency handler if the sequence experiences an
unrecoverable failure. The emergency handler was unconditionally
executing "reboot" action if no other action was received (which only
gets received if the sequence completes successfully), so the Shutdown
request might result in a Reboot behavior on error during shutdown
phase.

This is not a pretty fix, but it's hard to deliver the intent from one
part of the code to another right now, so instead use a global variable
which stores default emergency intention, and gets overridden early in
the Shutdown sequence.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-04 19:51:48 +04:00
Sebastian Gaiser
515ae2a184
docs: extend hetzner-cloud docs for arm64
Added docs for arm64 and updated packer plugin.

Signed-off-by: Sebastian Gaiser <sebastiangaiser@users.noreply.github.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-12-04 20:49:25 +05:30
Dmitriy Matrenichev
eecc4dbd51
fix: trim leading spaces\newlines in inline manifest contents
In route `LoadPatches` -> `configpatcher.Apply` -> `configloader.NewFromBytes` any leading newlines will be transformed into `|4` yaml. We want to prevent that.

Closes #7993

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-12-04 17:12:20 +03:00
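For commit eecc4dbd51 above, a minimal sketch of the trimming step with the standard library. The function name `trimLeading` is hypothetical, and where exactly the real code applies the trim is not shown here.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimLeading drops leading spaces and newlines from inline manifest contents
// while leaving the rest of the document untouched (hypothetical helper).
func trimLeading(contents string) string {
	return strings.TrimLeftFunc(contents, unicode.IsSpace)
}

func main() {
	manifest := "\n\n  apiVersion: v1\nkind: Namespace\n"

	fmt.Printf("%q\n", trimLeading(manifest))
}
```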
Andrey Smirnov
dbf274ddf7
fix: skip writing the file if the contents haven't changed
As the controller reconciles every /etc file present, it might be called
multiple times for the same file, even if the actual contents haven't
changed.

Rewriting the file might lead to some concurrent process seeing
incomplete file contents more often than needed.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-04 15:58:03 +04:00
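For commit dbf274ddf7 above, a minimal sketch of the compare-before-write idea. The names `writeIfChanged` and `demo` are hypothetical and the real controller may structure this differently.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// writeIfChanged writes data to path unless the file already holds exactly
// those bytes. It reports whether a write actually happened.
func writeIfChanged(path string, data []byte, perm fs.FileMode) (bool, error) {
	current, err := os.ReadFile(path)

	switch {
	case err == nil && bytes.Equal(current, data):
		return false, nil // contents match, leave the file untouched
	case err != nil && !errors.Is(err, fs.ErrNotExist):
		return false, err
	}

	return true, os.WriteFile(path, data, perm)
}

// demo performs the same write twice and reports whether each call wrote.
func demo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "etcfile")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "hosts")
	data := []byte("127.0.0.1 localhost\n")

	first, err := writeIfChanged(path, data, 0o644)
	if err != nil {
		return false, false, err
	}

	second, err := writeIfChanged(path, data, 0o644)

	return first, second, err
}

func main() {
	first, second, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("first write:", first, "second write:", second)
}
```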
Dmitriy Matrenichev
6329222bdc
fix: do not panic in merge.Merge if map value is nil
Checking for `zeroValue` is not enough when accessing `map[string]any`.

Closes #8005

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-12-04 12:38:09 +03:00
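For commit 6329222bdc above, an illustration of the general pitfall with the standard `reflect` package: a map entry that is present but holds `nil` is not the zero `reflect.Value`, so a zero-value check alone lets it through. This is a sketch under that assumption, not the code from `merge.Merge`; the function `classify` is hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

// classify reports how a key looks through reflection (hypothetical helper).
func classify(m map[string]any, key string) string {
	v := reflect.ValueOf(m).MapIndex(reflect.ValueOf(key))

	switch {
	case !v.IsValid():
		// Key is absent: MapIndex returned the zero reflect.Value.
		return "missing"
	case v.IsNil():
		// Key is present, but the interface value it holds is nil.
		// v.Elem() would be invalid here, so it needs its own check.
		return "nil"
	default:
		return v.Elem().Kind().String()
	}
}

func main() {
	m := map[string]any{"count": 1, "empty": nil}

	for _, key := range []string{"count", "empty", "absent"} {
		fmt.Println(key, "->", classify(m, key))
	}
}
```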
Andrey Smirnov
d8a435f0e4
fix: initialize boot assets with defaults early
The problem was that bootloaders were correctly picking up defaults for
`installer` mode (vs. `imager` mode), but DTB and other SBC stuff wasn't
properly initialized, so installing on SBC failed.

Now all options are properly initialized with defaults early in the
process.

Fixes #8009

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-01 17:47:05 +04:00
Andrey Smirnov
c6835de17a
fix: pick etcd advertised addresses from 'current' addresses
Fixes #7947

This way etcd advertised address can be picked from the `external IPs`
of the machine.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-01 17:26:28 +04:00
Andrey Smirnov
6b5bc8b85b
feat: update Linux to 6.1.64
Bump pkgs/extras to the final 1.6.0 versions.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-01 16:59:54 +04:00
Andrey Smirnov
e71e3e4161
feat: support extra arguments for flanneld
Fixes #7754

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-12-01 16:18:02 +04:00
Andrey Smirnov
36c8ddb5e1
feat: implement ingress firewall rules
Fixes #4421

See documentation for details on how to use the feature.

With `talosctl cluster create`, firewall can be easily tested with
`--with-firewall=accept|block` (default mode).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-11-30 22:58:16 +04:00
Dmitriy Matrenichev
0b111ecb81
fix: support slices of enums and fix NfTablesConntrackStateMatch
We already have the code which supports custom enums, so let's extend it to support custom enums in slices and
fix the NfTablesConntrackStateMatch proto definition.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-11-30 00:23:16 +03:00
Andrey Smirnov
9a85217412
feat: improve nftables backend
Many changes to the nftables backend which will be used in the follow-up
PR with #4421.

1. Add support for chain policy: drop/accept.
2. Properly handle match on all IPs in the set (`0.0.0.0/0` like).
3. Implement conntrack state matching.
4. Implement multiple ifname matching in a single rule.
5. Implement anonymous counters.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-11-29 21:22:47 +04:00
Andrey Smirnov
db4e2539d4
feat: update Kubernetes 1.29.0-rc.1 and other bumps
Bump Go modules, final tools and semi-final pkgs.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-11-29 18:29:52 +04:00
Noel Georgi
7a4a92854f
feat: support sanitized kernel args
Support dropping kernel args that start with `-`.

Fixes: #7613

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-11-28 16:23:05 +05:30
Noel Georgi
f041b26299
chore: add tests for mdadm extension
Add tests for mdadm extension.

See: https://github.com/siderolabs/extensions/pull/271

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-11-27 23:18:35 +05:30
Andrey Smirnov
e46e6a312f
feat: implement nftables backend
Implement initial set of backend controllers/resources to handle
nftables chains/rules etc.

Replace the KubeSpan nftables operations with controller-based.

See #4421

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2023-11-27 21:14:15 +04:00
Dmitriy Matrenichev
ba827bf8b8
chore: support getting multiple endpoints from the Provision rpc call
The code will rotate through the endpoints, until it reaches the end, and only then it will try to do the provisioning again.

Closes #7973

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-11-25 21:38:44 +03:00
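For commit ba827bf8b8 above, the rotation described in the message could look roughly like this. The `endpointRotator` type and the addresses are hypothetical placeholders; the actual Provision client code is not reproduced here.

```go
package main

import "fmt"

// endpointRotator hands out endpoints one at a time (hypothetical type).
type endpointRotator struct {
	endpoints []string
	next      int
}

// Next returns the next untried endpoint. ok is false once the list is
// exhausted, which is the caller's cue to run provisioning again.
func (r *endpointRotator) Next() (endpoint string, ok bool) {
	if r.next >= len(r.endpoints) {
		return "", false
	}

	endpoint = r.endpoints[r.next]
	r.next++

	return endpoint, true
}

// Reset installs a fresh endpoint list, e.g. from a new Provision response.
func (r *endpointRotator) Reset(endpoints []string) {
	r.endpoints = endpoints
	r.next = 0
}

func main() {
	r := &endpointRotator{}
	r.Reset([]string{"10.0.0.1:51821", "10.0.0.2:51821"}) // placeholder addresses

	for {
		endpoint, ok := r.Next()
		if !ok {
			fmt.Println("endpoints exhausted, provisioning again")

			break
		}

		fmt.Println("trying", endpoint)
	}
}
```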
Dmitriy Matrenichev
dd45dd06cf
chore: add custom node taints
This PR adds support for custom node taints. Refer to `nodeTaints` in the `configuration` for more information.

Closes #7581

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-11-25 18:33:18 +03:00