3915 Commits

Author SHA1 Message Date
Noel Georgi
cf101e56fb
fix: add --force flag for talosctl gen
Error out if the file(s) already exist and warn the user to use
`--force` to overwrite.

Fixes: #6963

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-03-17 15:07:12 +05:30
Utku Ozdemir
ea2aa06116
fix: fix data race on network config read
Fix a data race caused by the metadata field of PlatformNetworkConfig being edited after it was sent to the channel. It caused test failures.

Fix it by setting a copy of the metadata instead.
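
The idea can be sketched as follows. `Metadata`, `NetworkConfig`, and `send` are hypothetical stand-ins rather than the Talos types; the sketch only shows why sending a copy isolates the receiver from later edits by the sender.

```go
package main

import "fmt"

// Metadata and NetworkConfig are hypothetical stand-ins, not the Talos types.
type Metadata struct {
	Hostname string
}

type NetworkConfig struct {
	Metadata *Metadata
}

// send publishes a config whose Metadata points at a private copy, so the
// caller can keep mutating its own Metadata without racing with the receiver.
func send(ch chan<- NetworkConfig, md *Metadata) {
	mdCopy := *md // shallow copy; sufficient here because Metadata holds only value fields
	ch <- NetworkConfig{Metadata: &mdCopy}
}

func main() {
	ch := make(chan NetworkConfig, 1)
	md := &Metadata{Hostname: "node-1"}

	send(ch, md)
	md.Hostname = "node-2" // later edit by the sender

	got := <-ch
	fmt.Println(got.Metadata.Hostname) // prints "node-1"
}
```

A struct with slice, map, or pointer fields would need a deep copy instead of the plain assignment used here.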

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2023-03-17 00:24:22 +01:00
Andrey Smirnov
64e3d24c6b
feat: provide platform network config for 'metal' in META
A special META key might contain optional platform network config for
the `METAL` platform.

It is completely optional, but if present, it works the same way as on
cloud platforms: it is applied with low priority (can be overridden with
machine config), but provides some initial defaults for the machine.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-15 23:54:39 +04:00
Andrey Smirnov
442cb9c1b0
feat: implement APIs to write to META
This allows writing keys to the META partition.

META contents can be viewed with `talosctl get metakeys`.

There is no real use case for it yet, but the next PRs will introduce
two special keys which can be written:

* platform network config for `metal`
* `${code}` variable

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-15 22:17:52 +04:00
Utku Ozdemir
9e07832db9
feat: implement summary dashboard
Implement the new summary dashboard with node info and logs.
Replace the previous metrics dashboard with the new dashboard which has multiple screens for node summary, metrics and editing network config.

Port the old metrics dashboard to the tview library and assign it to be a screen in the new dashboard, accessible by F2 key.

Add a new resource, infos.cluster.talos.dev which contains the cluster name and id of a node.

Disable the network config editor screen in the new dashboard until it is fully implemented with its backend.

Closes siderolabs/talos#4790.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2023-03-15 13:13:28 +01:00
Andrey Smirnov
1df841bb54
refactor: change the interface of META
Use a global instance, handle loading/saving META in global context.

Deprecate legacy syslinux ADV, provide an easier interface for
consumers.

Expose META as resources.

Fix the bootloader revert process (it was completely broken for quite a
while :sad:).

This is a first step which mostly does preparation work, real changes
will come in the next PRs:

* add APIs to write to META
* consume META keys for platform network config for `metal`
* custom key for URL `${code}`

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-15 15:43:16 +04:00
Spencer Smith
e9962bc3ea
chore: update CI to tag azure buckets
This PR updates CI to remove the immutability policy and tags the azure
"containers" (aka buckets) with a ci=true tag. This will allow us to
handle the deletion of buckets with the cloud-cleaner app.

Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
2023-03-13 14:09:06 -04:00
Andrey Smirnov
9f5f5cf9bf
feat: update Flannel to v0.21.3
See https://github.com/flannel-io/flannel/releases/tag/v0.21.3

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-13 20:32:26 +04:00
Andrey Smirnov
02b0ff35ee
feat: generate Flannel CNI manifest from upstream
Fixes #6730

`go generate`-based step downloads the upstream manifest, transforms it
to match our requirements, and it is compiled in as the Flannel
manifest.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-13 20:00:35 +04:00
Andrey Smirnov
6656d35eca
docs: fix Talos version to use template
Fixes #6944

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-13 15:28:27 +04:00
xyhhx
72a6d1d708
docs: update nocloud
Use the correct link to nocloud cloudinit docs.

Signed-off-by: xyhhx <xyhhx@disr.it>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-13 14:51:28 +04:00
Serge Logvinov
9948a646d2
feat: coredns node uninitialized toleration
Launch CoreDNS even if the node is not initialized.
The network is already ready, but the CCM has not finished its job yet.

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-13 14:29:14 +04:00
Andrey Smirnov
e03902b546
feat: update Go to 1.20.2
Also bump Linux to 6.1.15.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-10 16:41:17 +04:00
Steffen Windoffer
c8f8579f2d
fix: upgrade-k8s to flag should not be required since there is a default
Having a default and still requiring it confuses the user.

Signed-off-by: Steffen Windoffer <steffen@wind0r.de>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-09 17:33:21 +04:00
Erik Lund
230cfaf803
feat: use network information from guestinfo.metadata
Add VMware GuestInfo metadata to network configuration.

Fixes #6708

Signed-off-by: Erik Lund Jensen <info@erikjensen.it>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-09 16:51:08 +04:00
Nico Berlee
97048f7c37
feat: netstat in API and client
Implements netstat in Talos API and client (talosctl).

Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-09 15:48:30 +04:00
Andrey Smirnov
fda6da6929
fix: successful ACPI shutdown in maintenance mode
Fixes #6817

The original problem wasn't reproducible with `main`, but there was a
set of bugs in the shutdown sequence which was preventing it from
completing successfully, as in the maintenance mode nothing is running
and initialized yet.

Most of the bugs were `nil` pointer dereferences.

Fixed a small issue with final 'RebootError' printed as a failure in the
ACPI shutdown path.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-07 23:52:02 +04:00
Seán C McCord
b97e1abaa6
feat: set default image, validate empty image
Adds a default image URL and ensures that an empty image URL is not
sent when calling `talosctl upgrade`.

Fixes #6912

Signed-off-by: Seán C McCord <ulexus@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-07 18:21:54 +04:00
Artem Chernyshev
121220a3b3
chore: bump dependencies via renovate bot
Fixes: https://github.com/siderolabs/talos/pull/6914
Fixes: https://github.com/siderolabs/talos/pull/6915
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-03-07 15:58:25 +03:00
Dmitriy Matrenichev
ebc92f3c1d
chore: add container id to talosctl -k containers and talosctl -k logs
This PR takes the first 12 characters of the container ID and adds them to the output for each container in `talosctl -k containers`.
That way we can ensure that we get the logs from the proper container even if there is a newer one.

Closes #6886

Co-authored-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-03-07 13:20:44 +03:00
Dmitriy Matrenichev
22ef81c1e7
feat: add grub option to drop to maintenance mode
- [x] Support `talos.experimental.wipe=system:EPHEMERAL,STATE` boot kernel arg
- [x] GRUB option to wipe like above
- [x] update GRUB library to handle that

Closes #6842

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-03-07 12:37:59 +03:00
Andrey Smirnov
642fe0c90c
feat: update pkgs with framebuffer console
This brings in new kernel & containerd, and the kernel has support for
framebuffer console enabled.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-06 22:13:33 +04:00
Noel Georgi
69cb414f01
docs: update cilium install instructions
Update cilium install instructions.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-03-06 22:57:39 +05:30
Dmitriy Matrenichev
e71cc6619b
fix: redo assertHostnames in HostnameMergeSuite.TestMerge
Use `rtestutils.AssertResources` for hostnames test.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-03-06 15:09:50 +03:00
Andrey Smirnov
8ea4bfad8f
refactor: improve the kubernetes upgrade flow
Use new version of go-kubernetes, and move the `kube-proxy` DaemonSet
update to follow common logic of bootstrap manifests update.

This fixes a confusing behavior where, after `k8s-upgrade`, the version
of `kube-proxy` was not updated in the machine config.

See https://github.com/siderolabs/go-kubernetes/pull/3

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-06 15:01:29 +04:00
Steve Francis
81879fc0ca
docs: add how tos for workloads on control planes, and scaling up
First set of how-tos.

Signed-off-by: Steve Francis <steve.francis@talos-systems.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-06 14:02:45 +04:00
Spencer Smith
05b0b721c9
chore: move blob storage to azure for builds
This PR moves blob storage to azure.

Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
2023-03-04 15:50:04 -05:00
Noel Georgi
a78281214d
feat: add cilium e2e tests
Add cilium e2e tests. The existing cilium check was very old, so update
to the latest cilium version and also add a test for KPR strict mode.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-03-03 20:03:25 +05:30
Tim Jones
061640cccf
feat: add pod ip to kube-proxy spec
Exposes the pod IP as the `POD_IP` environment variable via the downward
API in the kube-proxy pod for use in e.g. metrics-bind-addr.

Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
2023-03-03 12:52:30 +01:00
Andrey Smirnov
dea17d7234
feat: update Kubernetes to v1.26.2
See https://github.com/kubernetes/kubernetes/releases/tag/v1.26.2

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-01 22:50:54 +04:00
Andrey Smirnov
337aaba7a7
feat: add 'os:operator' role
This introduces a new role for Talos API which fills the gap between
`os:reader` and `os:admin` roles.

Fixes #6898

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-01 16:12:25 +04:00
Andrey Smirnov
40e69af224
fix: improve etcd leave on reset process
When removing a member from `etcd`, the server does a pre-check to make
sure the member is connected to a quorum of other members, and the
remove request might fail. Add a retry to wait for the etcd to be fully
connected before giving up, as some parts of the reset flow already ran.

Also fix an issue which appears in the integration test, when `reset` is
called early in the boot sequence when local etcd hasn't started fully yet.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-03-01 14:51:49 +04:00
Dmitriy Matrenichev
638dc9128f
fix: fix "defer" leak in ResetUserDisks
Also, print error if we failed to close the device.
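
For context, a `defer` inside a loop body only runs when the enclosing function returns, so handles pile up across iterations. The sketch below shows one common remedy, wrapping each iteration in its own function. The `device` type and `resetDisks` are hypothetical and do not reproduce the Talos code.

```go
package main

import (
	"fmt"
	"log"
)

// device is a hypothetical stand-in for an opened block device.
// It records open/close events so the ordering is visible.
type device struct {
	name   string
	events *[]string
}

func open(name string, events *[]string) *device {
	*events = append(*events, "open "+name)
	return &device{name: name, events: events}
}

func (d *device) Close() error {
	*d.events = append(*d.events, "close "+d.name)
	return nil
}

// resetDisks handles each device inside its own function call, so the deferred
// Close runs at the end of every iteration rather than when resetDisks returns.
func resetDisks(names []string, events *[]string) {
	for _, name := range names {
		func() {
			dev := open(name, events)
			defer func() {
				if err := dev.Close(); err != nil {
					log.Printf("failed to close %s: %v", name, err)
				}
			}()
			// ... wipe the device here ...
		}()
	}
}

func main() {
	var events []string
	resetDisks([]string{"sda", "sdb"}, &events)
	fmt.Println(events) // prints "[open sda close sda open sdb close sdb]"
}
```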

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-02-28 21:51:37 +03:00
Dmitriy Matrenichev
bfba3677b0
chore: handle grub option - "wipe"
This PR ensures that we can handle the third GRUB option, "wipe". We will use it in 1.4.

For #6842

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-02-28 21:21:28 +03:00
Andrey Smirnov
594f27d878
release(v1.4.0-alpha.2): prepare release
This is the official v1.4.0-alpha.2 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-28 18:03:05 +04:00
Artem Chernyshev
b520710810
feat: introduce new flag in reset API that makes Talos reset user disks
Fixes: https://github.com/siderolabs/talos/issues/6815

Additionally, make it possible to run reset in maintenance mode, which
provides a way to reset the system disk and remove all traces of Talos
from it.

The new reset flow works in a separate sequence. The disk probe lookup
now checks the boot partition instead of the ephemeral one.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-02-28 15:10:41 +03:00
Utku Ozdemir
f55f5df739
feat: move dashboard package & run it in tty2
Move dashboard package into a common location where both Talos and talosctl can use it.

Add support for overriding stdin, stdout, stderr and ctty in process runner.

Create a dashboard service which runs the dashboard on /dev/tty2.

Redirect kernel messages to tty1 and switch to tty2 after starting the dashboard on it.

Related to siderolabs/talos#6841, siderolabs/talos#4791.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2023-02-28 12:00:25 +01:00
Dmitriy Matrenichev
36e077ead4
chore: bump deps
- github.com/aws/aws-sdk-go to v1.44.209
- github.com/stretchr/testify to v1.8.2
- github.com/jsimonetti/rtnetlink to v1.3.1
- google.golang.org/genproto to v0.0.0-20230223222841-637eb2293923
- github.com/emicklei/dot to v1.3.1
- github.com/gdamore/tcell/v2 to v2.6.0
- github.com/insomniacslk/dhcp to v0.0.0-20230220063916-5369909a5de7
- github.com/opencontainers/runtime-spec to v1.1.0-rc.1.0.20230215090456-58ec43f9fc39
- github.com/rivo/tview to v0.0.0-20230226195229-47e7db7885b4

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-02-28 00:14:59 +03:00
Noel Georgi
5a01d5fd47
chore: run extension build as downstream
Run the extensions build as a downstream pipeline.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-27 20:11:10 +05:30
Noel Georgi
426fe9687d
fix: extension base folder permission
The `modules.dep` kernel module dependency tree extension root path was
previously created with a permission of `0o700`, which means the Talos
root got a permission of `0o700` when the kernel module tree was rebuilt
with extensions providing kernel modules enabled. As a result, binaries
could not be executed when run as non-root and failed with an `EACCES`
error. Fix by making sure the temporary directory created for building
the kernel modules tree has `0o755` permission explicitly.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-27 19:49:06 +05:30
Andrey Smirnov
609d3a8a69
feat: support strategic merge patches on VLAN configuration
Fixes #6884

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-27 14:03:11 +04:00
Andrey Smirnov
7e19f32d76
chore: provide version compatibility data for Talos 1.2.x
This provides Kubernetes version compatibility for Talos 1.2.x, so that
we have a unified source of data for Talos >= 1.2.x.

Also bump supported Kubernetes version for Talos 1.4.x to be 1.25-1.27,
as Talos 1.4 is expected to ship with Kubernetes 1.27.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-23 20:48:11 +04:00
Andrey Smirnov
230e46e567
refactor: extract parts of kubernetes libraries
The shared code is going out to the
github.com/siderolabs/go-kubernetes library.

The code will be used in Talos and other projects using the same features.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-22 14:56:49 +04:00
Andrey Smirnov
f3d3f0f262
fix: update go-smbios library with Hyper-V data fix
See https://github.com/siderolabs/go-smbios/pull/15

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-21 18:32:27 +04:00
Dmitriy Matrenichev
8711eea962
fix: use passed --context in talosctl config cmd
Use context from command line flags. Also some minor fixes.

Closes #6846

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-02-21 15:00:04 +03:00
Utku Ozdemir
5ac9f43e45
feat: start machined earlier & in maintenance mode
Load & start machined earlier, in the initialize sequence, so that it is possible to use its API over its unix socket in maintenance mode.

Additionally, do not return features from the Version API if a config is not yet available.

Related to siderolabs/talos#4791.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2023-02-21 12:21:36 +01:00
Andrey Smirnov
36ab414a1d
docs: fix the endpoints in the libvirt guide
See #6864

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-21 15:00:05 +04:00
Dmitriy Matrenichev
3d55bd80f4
fix: add --force flag to talosctl gen config
Only overwrite existing files if explicitly demanded.

Closes #6847

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-02-20 23:44:00 +03:00
Serge Logvinov
660b8874da
feat: cmdline integer netmask
Allow the netmask on the kernel command line to be set as an integer.
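
A sketch of how both forms could be accepted, under the assumption that the integer form is an IPv4 prefix length (for example `24` instead of `255.255.255.0`). `parseNetmask` is a hypothetical helper, not the Talos parser.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// parseNetmask is a hypothetical helper that accepts either an IPv4 prefix
// length ("24") or a dotted-decimal netmask ("255.255.255.0").
func parseNetmask(s string) (net.IPMask, error) {
	if bits, err := strconv.Atoi(s); err == nil {
		if bits < 0 || bits > 32 {
			return nil, fmt.Errorf("prefix length %d out of range", bits)
		}
		return net.CIDRMask(bits, 32), nil
	}

	ip := net.ParseIP(s).To4()
	if ip == nil {
		return nil, fmt.Errorf("invalid netmask %q", s)
	}

	mask := net.IPMask(ip)
	// Size returns 0, 0 when the mask is not a contiguous run of ones.
	if ones, bits := mask.Size(); ones == 0 && bits == 0 {
		return nil, fmt.Errorf("non-contiguous netmask %q", s)
	}

	return mask, nil
}

func main() {
	fromInt, err := parseNetmask("24")
	if err != nil {
		panic(err)
	}

	fromDotted, err := parseNetmask("255.255.255.0")
	if err != nil {
		panic(err)
	}

	fmt.Println(fromInt.String(), fromDotted.String()) // prints "ffffff00 ffffff00"
}
```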

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-02-20 20:55:56 +04:00
Noel Georgi
1e3daacc48
docs: update nvidia component versions
Update NVIDIA component versions.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-02-17 20:03:17 +05:30