4527 Commits

Author SHA1 Message Date
Matthieu S
3fe82ec461
feat: custom image settings for k8s upgrade
Allows using a custom registry and custom images.

Fixes: #8275

Co-authored-by: @g3offrey
Signed-off-by: Matthieu STROHL <mstrohl@dive-in-it.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-15 17:54:01 +04:00
Dmitriy Matrenichev
fa3b933705
chore: replace fmt.Errorf with errors.New where possible
This time, use the `eg` tool from the `x/tools` repo to do this.
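
A minimal illustration of the kind of rewrite this covers, assuming a constant message with no formatting verbs. The function names and messages below are hypothetical and are not taken from the change itself:

```go
package main

import (
	"errors"
	"fmt"
)

// openBefore uses fmt.Errorf with a constant message and no formatting verbs.
func openBefore() error {
	return fmt.Errorf("disk not found")
}

// openAfter produces the same message via errors.New, without the
// formatting machinery.
func openAfter() error {
	return errors.New("disk not found")
}

// openWrapped keeps fmt.Errorf, because arguments and %w wrapping are
// actually needed here.
func openWrapped(name string, err error) error {
	return fmt.Errorf("open %q: %w", name, err)
}

func main() {
	fmt.Println(openBefore().Error() == openAfter().Error())
	fmt.Println(openWrapped("sda", openAfter()))
}
```

Calls that format arguments or wrap another error with `%w` are not candidates for this replacement, which is why the title says "where possible".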

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-02-14 17:39:30 +03:00
Andrey Smirnov
d4521ee9c4
feat: update kernel with sfc driver and LSM updates
See:

* https://github.com/siderolabs/pkgs/pull/890
* https://github.com/siderolabs/pkgs/pull/891

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-14 14:48:45 +04:00
Noel Georgi
2f0421b406
fix: run xfs_repair on invalid argument error
Run `xfs_repair` for invalid argument error.

Part of: #8292

Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-02-13 23:01:33 +05:30
Michael Stephenson
f868fb8e8f
docs: update vmware tools url
Fixed URL to point to repository that exists.

Signed-off-by: Michael Stephenson <m.k.stephenson@outlook.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-13 14:35:11 +04:00
Dmitriy Matrenichev
fa2d34dd88
chore: enable v6 support on the same port
Replace `SO_REUSEPORT` with `SO_REUSEPORT`.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-02-13 01:02:27 +03:00
Dmitriy Matrenichev
83e0b0c19a
chore: adjust dns sockets settings
Enable some TCP optimization, set minimal TTL, set socket reuse.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-02-12 17:13:03 +03:00
Andrey Smirnov
a1ec1705bc
chore: update Go to 1.22.0
Finally!

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-12 14:33:38 +04:00
Andrei Kvapil
76b50fcd4a
chore: add Ænix to the Adopters list
Add Ænix to the Adopters list.

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-02-12 15:02:08 +05:30
Dmitriy Matrenichev
5324d39167
chore: bump stuff
Also fix .golangci.yml file.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-02-09 19:19:25 +03:00
Andrey Smirnov
087b50f429
feat: support systemd-boot ISO enroll keys option
Fixes #8196

Example (profile excerpt):

```yaml
output:
  kind: iso
  isoOptions:
    sdBootEnrollKeys: force
  outFormat: raw
```

Defaults are still the same (`if-safe` unless explicitly overridden).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-09 17:48:13 +04:00
Dmitriy Matrenichev
afa71d6b02
chore: use "handle-like" resource in DNSResolveCacheController
Rework (and simplify) `DNSResolveCacheController` to use `DNSUpstream` "handle-like" resources.

Depends on https://github.com/cosi-project/runtime/pull/400

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-02-08 21:40:57 +03:00
Andrey Smirnov
013e130702
fix: error with decoding config document with wrong apiVersion
Fixes #8270

The base bug was that the registry would return a `nil` `ConfigDocument` if
the version is not registered for a kind, which would result in weird
config decoding errors.

Add more unit tests and, while at it, more fuzzing samples.
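
A rough sketch of the failure mode and one way to avoid it, assuming a registry keyed by kind and then by apiVersion. The `Document` interface, the `registry` map, `New`, and `ErrUnknownVersion` are hypothetical names for illustration and do not reflect the actual machinery code:

```go
package main

import (
	"errors"
	"fmt"
)

// Document is a hypothetical stand-in for a config document.
type Document interface {
	Kind() string
}

type exampleDoc struct{}

func (exampleDoc) Kind() string { return "ExampleConfig" }

// ErrUnknownVersion is a hypothetical sentinel error.
var ErrUnknownVersion = errors.New("apiVersion not registered for kind")

// registry maps kind -> apiVersion -> constructor.
var registry = map[string]map[string]func() Document{
	"ExampleConfig": {
		"v1alpha1": func() Document { return exampleDoc{} },
	},
}

// New returns an explicit error for an unregistered version instead of
// handing a nil Document to the decoder.
func New(kind, version string) (Document, error) {
	versions, ok := registry[kind]
	if !ok {
		return nil, fmt.Errorf("unknown kind %q", kind)
	}

	ctor, ok := versions[version]
	if !ok {
		return nil, fmt.Errorf("%w: %s/%s", ErrUnknownVersion, kind, version)
	}

	return ctor(), nil
}

func main() {
	_, err := New("ExampleConfig", "v2")
	fmt.Println(err)
}
```

Returning the error at lookup time means the caller sees which kind and version were rejected, rather than a later failure while decoding into a `nil` value.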

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-08 18:39:21 +04:00
Louis SCHNEIDER
1e77bb1c3d
chore: allow custom pkgs to build talos
Allow overriding each package reference.

Signed-off-by: Louis SCHNEIDER <louis.schneider@bedrockstreaming.com>
Signed-off-by: Louis SCHNEIDER <louis@schne.id>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-08 17:07:31 +04:00
Andrey Smirnov
3f8a85f1b3
fix: unlock the upgrade mutex properly
Fixes #4525

The previous implementation had several issues:

* etcd concurrency session never closed
* Unlock() with potentially closed context
* unlocking when the upgrade sequence finishes, but this overlaps with the
  machine reboot, so there is a chance that it never got unlocked
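
The first two issues can be sketched with the standard library alone. This is not the actual fix: `locker`, `session`, `withLock`, and the 5-second unlock timeout are assumptions standing in for the etcd concurrency types, and `fake` exists only to make the example runnable:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// locker and session are hypothetical stand-ins for a distributed mutex
// and the session that backs it.
type locker interface {
	Lock(ctx context.Context) error
	Unlock(ctx context.Context) error
}

type session interface {
	Close() error
}

// withLock runs fn while holding the lock. On the way out it unlocks with a
// fresh context (the caller's may already be canceled) and closes the session.
func withLock(ctx context.Context, s session, l locker, fn func(context.Context) error) (err error) {
	// Registered first, so it runs last: always close the session.
	defer func() { err = errors.Join(err, s.Close()) }()

	if err = l.Lock(ctx); err != nil {
		return err
	}

	defer func() {
		unlockCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()

		err = errors.Join(err, l.Unlock(unlockCtx))
	}()

	return fn(ctx)
}

// fake is an in-memory implementation used only for the demonstration.
type fake struct{ locked, closed bool }

func (f *fake) Lock(context.Context) error { f.locked = true; return nil }

func (f *fake) Unlock(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return err // unlocking with a canceled context would fail here
	}

	f.locked = false

	return nil
}

func (f *fake) Close() error { f.closed = true; return nil }

// demo cancels the caller's context inside the critical section.
func demo() (stillLocked, sessionClosed bool, err error) {
	f := &fake{}
	ctx, cancel := context.WithCancel(context.Background())

	err = withLock(ctx, f, f, func(context.Context) error {
		cancel()
		return nil
	})

	return f.locked, f.closed, err
}

func main() {
	fmt.Println(demo())
}
```

The third issue (the unlock racing with the reboot) is about where in the sequence the unlock happens, which this sketch does not model.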

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-08 15:50:02 +04:00
AvnarJakob
61c3331b14
docs: update indentation in vip.md
Wrong YAML indentation.

Signed-off-by: AvnarJakob <75129695+AvnarJakob@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-08 15:16:40 +04:00
Andrey Smirnov
383e528df8
chore: allow uuid-based hostnames in talosctl cluster create
This is useful when the VMs are booted without machine config,
so default hostnames based on controlplanes/workers no longer make
sense.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-07 16:22:53 +04:00
Noel Georgi
1e6c8c4dec
feat: extensions services config
Support config files for extension services.

Fixes: #7791

Co-authored-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-02-06 17:12:01 +05:30
shurkys
989ca3ade1
feat: add OpenNebula platform support
Initial support without documentation.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: shurkys <no@mail.com>
2024-02-05 20:43:47 +04:00
bri
914f887788
docs: update nocloud.md Proxmox information
Proxmox _does_ support manually editing the configuration files, but a safer option is to use the CLI or API for the sake of option validation.

This PR updates the documentation that suggested reading and editing the VM configuration by hand, and replaces that with CLI commands to do the same. The `qm` command needs to be run from a root shell, but you need to be `root` to edit (or even read!) the configuration via something like SFTP, anyway.

I also updated the UUID to be a real UUID, and then tested these commands on my home Proxmox server.

Signed-off-by: bri <284789+b-@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-05 20:05:09 +04:00
Henno Schooljan
a04cc80154
fix: pass TTL when generating client certificate
Pass the TTL to the talosconfig generation function.

Signed-off-by: Henno Schooljan <github@sfynx.nl>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-05 18:54:16 +04:00
Dmitriy Matrenichev
3fe8c12ca6
fix: add log line about controller runtime failing
While we decide what to do with #8263 and #8256, this quickfix at least allows us to
see what went wrong.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-02-05 17:22:02 +03:00
Andrey Smirnov
ddbabc7e58
fix: use a separate cgroup for each extension service
Fixes #8229

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-05 17:37:55 +04:00
Andrey Smirnov
6ccdd2c09c
chore: fix markdown-lint call
Don't ask me why this weird syntax for flags.

Don't ask me why it fails with exit code zero (success) on invalid
flags.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-05 17:18:45 +04:00
Saiyam Pathak
4184e617ab
chore: add test for wasmedge runtime extension
Add tests for WasmEdge container runtime system extension.

Signed-off-by: Saiyam Pathak <saiyam911@gmail.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
2024-02-05 18:18:13 +05:30
Andrey Smirnov
95ea3a6c65
chore: bump timeout in acquire tests
With the switch to an RSA service account key, machine config generation time is
considerably higher now, so the test might not make it in time.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-05 15:18:22 +04:00
Andrey Smirnov
c19a505d8c
chore: bump docker dind image
We don't need the hacked one anymore.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-05 14:43:39 +04:00
fazledyn-or
d7d4154d5d
chore: remove channel blocking in qemu launch
The channel is never read from.

Signed-off-by: fazledyn-or <ataf@openrefactory.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-02 18:57:36 +04:00
Andrey Smirnov
029d7f7b9b
release(v1.7.0-alpha.0): prepare release
This is the official v1.7.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-01 22:10:27 +04:00
Andrey Smirnov
2ff81c06bc
feat: update runc 1.1.12, containerd 1.7.13
Also:

* Linux 6.6.14 + XDP enablement
* etcd 3.5.12

Various other bumps for the tools, utilities, and Go modules.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-01 17:01:04 +04:00
Andrey Smirnov
9d8cd4d058
chore: drop deprecated method EtcdRemoveMember
It was deprecated 16 months ago, time to clean up.

(This is to prepare for the first v1.7 release)

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-01 15:54:29 +04:00
Andrey Smirnov
17567f19be
fix: take into account the moment seen when cleaning up CRI images
Fixes #8069

The image age from the CRI is the moment the image was pulled, so if it
was pulled a long time ago, the previous version would nuke the image as
soon as it is unreferenced. The new version would allow the image to
stay for the full grace period in case a rollback is requested.
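
The difference between the two rules can be sketched as follows. The names `shouldRemove` and `firstSeenUnused`, and the 4-hour grace period, are assumptions chosen for illustration and are not the controller's actual code or settings:

```go
package main

import (
	"fmt"
	"time"
)

// shouldRemove counts the grace period from the later of two moments: when
// the image was pulled, and when it was first seen unreferenced.
func shouldRemove(pulledAt, firstSeenUnused, now time.Time, grace time.Duration) bool {
	ref := pulledAt
	if firstSeenUnused.After(ref) {
		ref = firstSeenUnused
	}

	return now.Sub(ref) >= grace
}

// demo compares the pull-time-only rule with the adjusted rule for an image
// pulled 30 days ago that became unreferenced one minute ago.
func demo() (oldRule, newRule, afterGrace bool) {
	now := time.Date(2024, 2, 1, 12, 0, 0, 0, time.UTC)
	pulledAt := now.Add(-30 * 24 * time.Hour)
	firstSeenUnused := now.Add(-time.Minute)
	grace := 4 * time.Hour // assumed value

	oldRule = now.Sub(pulledAt) >= grace
	newRule = shouldRemove(pulledAt, firstSeenUnused, now, grace)
	afterGrace = shouldRemove(pulledAt, firstSeenUnused, now.Add(grace), grace)

	return oldRule, newRule, afterGrace
}

func main() {
	fmt.Println(demo())
}
```

Under the pull-time-only rule the old image is removed immediately. Under the adjusted rule it survives until a full grace period has passed since it was first seen unreferenced.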

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-01 14:44:22 +04:00
Andrey Smirnov
aa03204b86
docs: document the process of building custom kernel packages
Fixes #7612

Drop the customizing rootfs docs, and point towards system extensions
documentation, as it is the right way.

Document building custom Talos Linux kernel.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-02-01 14:24:31 +04:00
Andrey Smirnov
7af48bd559
feat: use RSA key for kube-apiserver service account key
Fixes #8111

Starting with 1.7, use RSA instead of ECDSA.

RSA is way slower, but it has better support with other providers.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-31 23:05:50 +04:00
Andrey Smirnov
a5e13c696d
fix: retry blockdevice open in the installer
We had these retries in other places, but not here.

This seems to happen more frequently with the Linux 6.6 update; the tl;dr is
the same: `udevd` tries to rescan the partition table at the wrong moment,
preventing the Talos installer from opening the partition which was just created.

It's a race, so work around it by retrying the call.
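
A generic retry loop of this kind might look like the sketch below. The `retry` helper, the attempt count, the delay, and the simulated `errBusy` failure are all assumptions for illustration and are not the installer's actual retry code:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errBusy simulates the transient failure seen while the partition table
// is being rescanned.
var errBusy = errors.New("device or resource busy")

// retry calls fn up to attempts times, sleeping for delay between failures,
// and returns the last error if every attempt fails.
func retry(attempts int, delay time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}

	var err error

	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}

		if i < attempts-1 {
			time.Sleep(delay)
		}
	}

	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0

	// The simulated open fails twice, then succeeds.
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errBusy
		}

		return nil
	})

	fmt.Println(calls, err)
}
```

A real implementation would likely retry only on errors known to be transient and fail fast on everything else.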

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-31 22:17:20 +04:00
Andrey Smirnov
593afeea38
fix: run the interactive installer loop to report errors
In the previous implementation, even though `installer.err` was set, it
was never checked 🤦.

The run loop was stolen from the dashboard code.

Fixes #8205

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-31 19:20:46 +04:00
Andrey Smirnov
87be76b878
fix: be more tolerant to error handling in Mounts API
Fixes #8202

If some mountpoint can't be queried successfully for 'diskfree'
information, don't treat that as an error, and report zero values for
disk usage/size instead.
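
The behavior described above can be sketched like this. `MountStat`, `collect`, and the injected `statfs` function are hypothetical names and do not reflect the actual Mounts API implementation:

```go
package main

import (
	"errors"
	"fmt"
)

// MountStat is a hypothetical, simplified view of one mountpoint.
type MountStat struct {
	Path      string
	Size      uint64
	Available uint64
}

var errStat = errors.New("statfs failed")

// collect queries every mountpoint; a failed query yields zero values
// instead of failing the whole call.
func collect(paths []string, statfs func(string) (size, avail uint64, err error)) []MountStat {
	stats := make([]MountStat, 0, len(paths))

	for _, p := range paths {
		size, avail, err := statfs(p)
		if err != nil {
			size, avail = 0, 0
		}

		stats = append(stats, MountStat{Path: p, Size: size, Available: avail})
	}

	return stats
}

// fakeStatfs simulates one mountpoint that cannot be queried.
func fakeStatfs(path string) (uint64, uint64, error) {
	if path == "/broken" {
		return 0, 0, errStat
	}

	return 1000, 400, nil
}

func main() {
	for _, s := range collect([]string{"/var", "/broken"}, fakeStatfs) {
		fmt.Printf("%s size=%d available=%d\n", s.Path, s.Size, s.Available)
	}
}
```

The trade-off is that a caller can no longer tell an unreadable mountpoint from an empty one by the numbers alone.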

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-31 18:24:38 +04:00
stereobutter
03add75030
docs: add section on using imager with extensions from tarball
Add an example of using a custom extension via tarball.

Signed-off-by: stereobutter <sascha.desch@hotmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-30 15:56:59 +04:00
Steve Francis
ee0fb5effc
docs: consolidate certificate management articles
Move around some docs.

Signed-off-by: Steve Francis <steve.francis@talos-systems.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-30 15:22:04 +04:00
Dmitriy Matrenichev
9c14dea209
chore: bump coredns
Bump our CoreDNS fork.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-01-30 02:12:36 +03:00
Dmitriy Matrenichev
ebeef28525
feat: implement local caching dns server
This PR adds a new controller, `DNSServerController`, that starts TCP and UDP DNS servers locally. Just like `EtcFileController`, it monitors `ResolverStatusType` and updates the list of destinations from there.

Most of the caching logic is in our "lobotomized" `CoreDNS` fork. We need this fork because default `CoreDNS` carries
a full Caddy server and various other modules that we don't need in Talos. On our side we implement
random selection of the actual DNS upstream and request forwarding.

Closes #7693

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2024-01-29 20:26:38 +03:00
edwinavalos
4a3691a273
docs: fix broken links in metal-network-configuration.md
Fixed the same set of links in 1.4, 1.5, 1.6, and 1.7, with the exception
of a link in 1.4 that points to boot assets: the boot assets page, if
we were to place a copy in that version, would be missing a bunch of
supporting links. Opted to skip that update, as that documentation is
unsupported.

Signed-off-by: edwinavalos <edwin.a.avalos@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-29 18:44:21 +04:00
Spencer Smith
c4ed189a69
docs: provide sane defaults for each release series in vmware script
This PR sets proper defaults based on the Talos release series. Defaults to the last release in each series.

Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
2024-01-29 09:25:04 -05:00
Andrey Smirnov
8138d54c6c
docs: clarify node taints/labels for worker nodes
`NodeRestriction` admission plugin heavily restricts what worker nodes
can set.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-29 17:56:46 +04:00
Andrey Smirnov
b44551ccdb
feat: update Linux to 6.6.13
See https://github.com/siderolabs/pkgs/pull/873

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-29 16:50:33 +04:00
Christian Mohn
385707c5f3
docs: update vmware.sh
Add support for using the GOVC_NETWORK environment variable to determine which vSphere vSwitch PortGroup to use.

This checks whether the GOVC_NETWORK environment variable is set; if so, that value is used. If not, the default PortGroup (VM Network) is used as before.

Checks added for both control plane and worker nodes.

Signed-off-by: Christian Mohn <christian@drible.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-29 14:55:21 +04:00
Spencer Smith
d1a79b845f
docs: fix small typo in etcd maintenance guide
This PR fixes a little typo in these docs, because etcd is under the cluster
key.

Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
2024-01-29 14:22:04 +04:00
Utku Ozdemir
cf0603330a
docs: copy generated JSON schema to host
After the JSON schema is generated in a build container, copy it over to the host, so it becomes a part of the codebase.

This is required because the location of the schema recently changed from `pkg/machinery/config/types/` to `pkg/machinery/config/schemas/`.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2024-01-26 13:56:55 +01:00
Andrey Smirnov
f11139c229
docs: document local path provisioner install
Use kustomize (as the officially supported way for Local Path
Provisioner).

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-26 14:30:45 +04:00
Andrey Smirnov
e0dfbb8fba
fix: allow META encoded values to be compressed
Fixes #8186

This is planned to be backported to Talos 1.6.3.

This allows passing large META values (YAML for platform network
configuration) which might otherwise exceed the limit for kernel
command line params.
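
To show why compression helps with the size limit, here is a small round-trip sketch. The choice of gzip plus standard base64 is an assumption made for illustration; the encoding Talos actually uses is not stated here, and `encode`, `decode`, and `sample` are hypothetical names:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// sample imitates a repetitive YAML network configuration.
var sample = strings.Repeat("- interface: eth0\n  dhcp: true\n", 50)

// encode compresses the value and then base64-encodes it so that it stays
// printable on a kernel command line.
func encode(value string) (string, error) {
	var buf bytes.Buffer

	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(value)); err != nil {
		return "", err
	}

	// Close flushes the remaining compressed data and the gzip footer.
	if err := zw.Close(); err != nil {
		return "", err
	}

	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// decode reverses encode.
func decode(encoded string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}

	zr, err := gzip.NewReader(bytes.NewReader(raw))
	if err != nil {
		return "", err
	}
	defer zr.Close()

	out, err := io.ReadAll(zr)
	if err != nil {
		return "", err
	}

	return string(out), nil
}

func main() {
	enc, err := encode(sample)
	if err != nil {
		panic(err)
	}

	fmt.Printf("raw=%d bytes, encoded=%d bytes\n", len(sample), len(enc))
}
```

Repetitive YAML compresses well, so the encoded form is much shorter than the raw value even after base64 expansion.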

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-01-23 17:24:18 +04:00