3925 Commits

Author SHA1 Message Date
Dmitriy Matrenichev
eb332cfcb7
feat: add health check for a minimal memory / disk size
This PR adds two additional checks which are performed during the boot sequence and in `talosctl health`. They ensure that nodes have enough memory and disk.

- Boot check will print a warning if memory / disk size is not sufficient.
- Health check will fail if memory / disk size is not sufficient.
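
The two behaviors above can be sketched as one shared check with two callers. This is a minimal illustration, not the actual implementation: the thresholds and all names (`NodeResources`, `find_problems`, `boot_check`, `health_check`) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical thresholds, for illustration only.
MIN_MEMORY_BYTES = 2 * 1024**3
MIN_DISK_BYTES = 10 * 1024**3


@dataclass
class NodeResources:
    memory_bytes: int
    disk_bytes: int


def find_problems(res: NodeResources) -> list[str]:
    """Return one message for each resource that is below its minimum."""
    problems = []
    if res.memory_bytes < MIN_MEMORY_BYTES:
        problems.append("memory is below the minimum")
    if res.disk_bytes < MIN_DISK_BYTES:
        problems.append("disk is below the minimum")
    return problems


def boot_check(res: NodeResources) -> list[str]:
    """Boot path: turn findings into warnings and never block the boot."""
    return [f"warning: {p}" for p in find_problems(res)]


def health_check(res: NodeResources) -> None:
    """Health path: the same findings are treated as a failure."""
    problems = find_problems(res)
    if problems:
        raise RuntimeError("; ".join(problems))
```

Keeping the detection in one function means the boot warning and the health failure cannot drift apart; only the reaction differs.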

Closes #6467

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-12-10 07:05:08 +03:00
Andrey Smirnov
d04970dfa9
fix: ignore k8s additional addresses if nil
This fixes a potential panic which I found in the unit-tests logs.

The 'not found' error is ignored, so an additional check is needed.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-09 19:25:07 +04:00
Andrey Smirnov
63c17104c5
feat: update Kubernetes to 1.26.0
See https://github.com/kubernetes/kubernetes/releases/tag/v1.26.0

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-09 18:13:35 +04:00
Andrey Smirnov
f7a9a90db2
chore: update pkgs/tools (Go 1.19.4, containerd 1.6.11)
Update to the latest pkgs/tools to fix the build due to vulncheck.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-09 17:25:47 +04:00
Utku Ozdemir
cf7adc51c9
feat: add RedactSecrets method to v1alpha1.Config
Add a way to strip away the secrets from a config.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-12-08 13:03:51 +01:00
Michael Vorburger
4c31b9b1a3
docs: clarify what the deal is with /var
Explain when EPHEMERAL gets wiped.

Signed-off-by: Michael Vorburger ⛑️ <mike@vorburger.ch>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-07 00:05:22 +04:00
Dmitriy Matrenichev
a8ebcca4a9
chore: remove watchErr from metal.getResource
It's only used to detect whether the resource is `nil` or of an incorrect type. Both errors are developer errors, so we should not collect them.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-12-06 22:04:28 +03:00
Dmitriy Matrenichev
1253513bd1
fix: fix nil pointer panic and incorrect error output
Currently the `.Error()` call panics if `watchErr` is nil. Besides, we want to wrap errors in a way that lets us unwrap them.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-12-06 21:03:25 +03:00
Andrey Smirnov
82e8c9e1f6
fix: workaround panic in the kubelet service controller
The traceback:

```
user: warning: [2022-12-02T17:31:09.496341098Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.KubeletServiceController", "error": "controller \x5c"k8s.KubeletServiceController\x5c" panicked: runtime error: invalid memory address or nil pointer dereference\x5cn\x5cngoroutine 308 [running]:\x5cnruntime/debug.Stack()\x5cn\x5ct/toolchain/go/src/runtime/debug/stack.go:24 +0x65\x5cngithub.com/cosi-project/runtime/pkg/controller/runtime.(*adapter).runOnce.func2()\x5cn\x5ct/.cache/mod/github.com/cosi-project/runtime@v0.1.1/pkg/controller/runtime/adapter.go:403 +0x5d\x5cnpanic({0x2b7b600, 0x536c7c0})\x5cn\x5ct/toolchain/go/src/runtime/panic.go:884 +0x212\x5cngithub.com/talos-systems/talos/internal/app/machined/pkg/controllers/k8s.updateKubeconfig(0xc0000d49b0?)\x5cn\x5ct/src/internal/app/machined/pkg/controllers/k8s/kubelet_service.go:302 +0xb8\x5cngithub.com/talos-systems/talos/internal/app/machined/pkg/controllers/k8s.(*KubeletServiceController).Run(0xc000956030, {0x389f7c0, 0xc000808040}, {0x38bce60, 0xc0000dfa80}, 0x0?)\x5cn\x5ct/s...
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-06 20:53:30 +04:00
Andrey Smirnov
a505b8909a
fix: update COSI and reset restart backoff on success
See https://github.com/cosi-project/runtime/pull/191

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-06 17:43:26 +04:00
Noel Georgi
e92fdcbad1
chore: bump kernel to 5.15.81
Bump kernel to [5.15.81](https://github.com/siderolabs/pkgs/pull/622)

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-12-05 20:07:49 +05:30
Andrey Smirnov
f0dddca2a3
docs: expand help for 'talosctl get'
Make it more obvious how to get a list of all resources.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-05 17:42:28 +04:00
Andrey Smirnov
fcffc88790
fix: add ext4 filesystem detection
Fixes #6483

See https://github.com/siderolabs/go-blockdevice/pull/66

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-05 14:42:18 +04:00
Andrey Smirnov
5b2960efff
fix: introduce 'overridePath' setting and fix Talos resolver
There was an inconsistency in the way `/v2` was appended to registry
endpoint path between containerd (CRI) and Talos:

* Talos only appended `/v2` to empty paths
* containerd appended `/v2` if it's not the suffix already

Fix Talos to act the same as containerd, and introduce a setting
`overridePath` which stops both Talos and `containerd` from appending
`/v2` (should be required with e.g. Harbor registry mirror).
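
The difference between the two behaviors can be sketched as follows. This is an illustration of the rules described in this message, not the code from the fix: the function names are hypothetical, and ignoring trailing slashes is an assumption of the sketch.

```python
def talos_path_before_fix(path: str) -> str:
    """Old Talos behavior per this message: append /v2 only to empty paths."""
    return "/v2" if path == "" else path


def endpoint_path(path: str, override_path: bool = False) -> str:
    """containerd-style behavior: append /v2 unless it is already the suffix.

    With override_path set, the path is used exactly as given.
    """
    if override_path:
        return path
    trimmed = path.rstrip("/")  # assumption: trailing slashes are ignored
    if trimmed.endswith("/v2"):
        return trimmed
    return trimmed + "/v2"
```

For a hypothetical mirror path `/mirror`, the old behavior returns `/mirror` while the containerd-style rule returns `/mirror/v2`; passing `override_path=True` leaves the path untouched in both cases.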

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-05 12:50:53 +04:00
Andrey Smirnov
0219d1124e
fix: use only kube-apiserver endpoints for Talos API access endpoints
Fixes #6566

This avoids putting all node addresses, which might not be routable
across Kubernetes.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-02 22:27:55 +04:00
Andrey Smirnov
dc5e0f4af0
fix: report errors to Equinix Metal event API
This provides a more detailed event for better error analysis.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-02 21:24:00 +04:00
Utku Ozdemir
7ab140a94a
feat: add talosctl machineconfig patch command
Add talosctl machineconfig patch command which accepts a machine config as input and a list of patches, applying the patches and writing the result to a file or to stdout.

Link `talosctl machineconfig gen` to `talosctl gen config`, so they work the same way.

Closes siderolabs/talos#6562.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-12-02 15:42:48 +01:00
Andrey Smirnov
d3cf061149
fix: ignore many more filesystems in IMA
Fixes #6553

Talos itself defaults to XFS, so IMA measurements weren't done for Talos'
own filesystems. But many other solutions create ext4 filesystems by
default, or it might be something mounted by other means.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-01 20:16:41 +04:00
Utku Ozdemir
44e2799b8c
feat: add stdout and single config type support to talosctl gen config
Add support to specify the types of outputs to be generated by talosctl gen config.

Add support for writing a single type of output to stdout instead of a file.

Related to siderolabs/talos#6562.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-12-01 16:55:22 +01:00
Noel Georgi
4452f0e179
docs: bump talos version
Bump last released Talos version.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-12-01 20:00:26 +05:30
Andrey Smirnov
38e57bd12b
feat: update Kubernetes to v1.26.0-rc.1
See https://github.com/kubernetes/kubernetes/releases/tag/v1.26.0-rc.1

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-01 14:53:36 +04:00
Andrey Smirnov
4cd125d499
fix: correctly handle new watch event types
This is a fix after upgrade to COSI v0.2.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-01 13:53:22 +04:00
Andrey Smirnov
881b841520
feat: update Flannel to 0.20.2
See https://github.com/flannel-io/flannel/releases/tag/v0.20.2

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-30 19:30:27 +04:00
Andrey Smirnov
2ebe410e93
feat: update COSI to v0.2.0
This brings many fixes, including a new Watch with support for
Bootstrapped and Errored event types.

`talosctl` from before this change is still compatible, as there's gRPC
API level backwards compatibility versioning.

New client doesn't yet depend on new event types, so it will work
against Talos 1.2.x.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-29 21:21:59 +04:00
Andrey Smirnov
00388651b2
chore: bump pkgs and Go dependencies
Update Linux to 5.15.80, final tagged versions of pkgs/tools/extras for
Talos 1.3.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-29 15:20:09 +04:00
Andrey Smirnov
bbb56840e4
chore: update protobuf API descriptors for 1.3.0
Set the API descriptors for v1.3.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-29 14:41:43 +04:00
Andrey Smirnov
fdbd380f60
feat: use 'registry.k8s.io' for Kubernetes images
See https://kubernetes.io/blog/2022/11/28/registry-k8s-io-faster-cheaper-ga/

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-28 14:13:54 +04:00
Andrey Smirnov
1103c5ad24
feat: implement pre-flight checks in the installer
Host Talos mounts machined socket for API access into the installer
container (for upgrades).

Installer runs any check it might need to verify compatibility.

At the moment, the following checks are implemented:

* Talos version (whether upgrade from version X to Y is supported)
* Kubernetes version (whether Kubernetes version X is supported with
  Talos Y).
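
The two checks listed above could look roughly like this. It is a sketch only: the compatibility tables are made-up placeholders, not data from Talos, and `minor` and `preflight` are hypothetical helper names.

```python
# Made-up placeholder data for illustration; not the real support matrix.
SUPPORTED_UPGRADE_FROM = {"1.3": {"1.2", "1.3"}}
SUPPORTED_KUBERNETES = {"1.3": {"1.25", "1.26"}}


def minor(version: str) -> str:
    """Reduce 'v1.3.0-alpha.2' or '1.26.0' to its 'major.minor' part."""
    parts = version.lstrip("v").split(".")
    return f"{parts[0]}.{parts[1]}"


def preflight(current_talos: str, target_talos: str, kubernetes: str) -> list[str]:
    """Return a list of compatibility errors; an empty list means the upgrade may proceed."""
    errors = []
    target = minor(target_talos)
    if minor(current_talos) not in SUPPORTED_UPGRADE_FROM.get(target, set()):
        errors.append(f"upgrade from Talos {current_talos} to {target_talos} is not supported")
    if minor(kubernetes) not in SUPPORTED_KUBERNETES.get(target, set()):
        errors.append(f"Kubernetes {kubernetes} is not supported with Talos {target_talos}")
    return errors
```

Collecting all errors instead of stopping at the first one lets the caller report every incompatibility in a single run.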

Fixes #6149

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-28 13:45:49 +04:00
Andrey Smirnov
4a052eadf3
fix: disable kexec on upgrades from pre-BTF kernel
Enabling BTF in the kernel breaks kexec from a pre-BTF kernel (e.g. when
upgrading from 1.2.x to 1.3.x).

As there's no way to detect Talos version in the installer at the
moment, use another way to detect whether BTF is enabled in the Talos
version which is running right now.

Fixes #6443

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-24 22:48:39 +04:00
Andrey Smirnov
732c459ecf
fix: parse and apply DHCP settings properly from cmdline
This allows multiple `ip=` parameters, and fixes setting DHCP for any
link on the cmdline.
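
As an illustration of handling several `ip=` parameters, the sketch below collects every link that requests DHCP. It is not the Talos parser: `dhcp_links` is a hypothetical name, the field layout follows the kernel's colon-separated `ip=` convention, and bracketed IPv6 addresses are ignored for brevity.

```python
def dhcp_links(cmdline: str) -> list[str]:
    """Collect the device of every ip= parameter that requests DHCP.

    Assumed layout:
    ip=<client>:<server>:<gw>:<netmask>:<hostname>:<device>:<autoconf>
    """
    links = []
    for arg in cmdline.split():
        if not arg.startswith("ip="):
            continue
        fields = arg[len("ip="):].split(":")
        # Field 5 is the device and field 6 the autoconf method.
        if len(fields) >= 7 and fields[6] == "dhcp" and fields[5]:
            links.append(fields[5])
    return links
```

The key point is that the loop appends to a list rather than keeping only the last `ip=` value seen, so each link named on the command line is handled.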

Fixes #6475

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-24 21:47:29 +04:00
Alexandre Mclean
a9e9d71b24
fix: parse correctly upgrade cmd force flag
It was using the value of a variable bound to another flag.

Signed-off-by: Alexandre Mclean <alexandre.mclean@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-24 20:07:23 +04:00
Andrey Smirnov
e85e64d6f8
docs: document metal-iso configuration method
This exists in the code, but it's not documented properly.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-24 19:48:20 +04:00
Steve Francis
c27adbe541
docs: update getting started
Fixed typos, added info about how to detect disks, simplified.

Signed-off-by: Steve Francis <steve.francis@talos-systems.com>
2022-11-24 14:09:41 +01:00
Andrey Smirnov
260684a930
chore: use build-container image for s3cmd
Looks like s3cmd image is broken now.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-24 16:32:08 +04:00
Andrey Smirnov
ee7a4777af
chore: bump dependencies
Linux 5.15.79, containerd 1.6.10

Other changes come from:

* https://github.com/siderolabs/toolchain/pull/57
* https://github.com/siderolabs/tools/pull/244
* https://github.com/siderolabs/pkgs/pull/619
* https://github.com/siderolabs/extras/pull/67

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-22 23:47:05 +04:00
Michael Vorburger ⛑️
49a4b14947
docs: clarify talosctl apply-config & talosctl get machineconfig
Fixes: #6522

Signed-off-by: Michael Vorburger ⛑️ <mike@vorburger.ch>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-22 23:25:23 +04:00
Serge Logvinov
a58c3d6699
feat: hcloud location properties
Receive region/zone from the metadata server.

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-22 18:31:21 +04:00
Andrey Smirnov
6bce06f622
feat: update etcd 3.5.6
See https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.5.md

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-21 20:35:52 +04:00
Andrey Smirnov
c54bea1283
fix: don't publish external IPs as affiliate addresses
Fixes #5937

This removes external IPs from a set of addresses published by the node
(we source addresses from 'routed' now which excludes external). This is
definitely the "right" thing to do, as those addresses are not on the node
itself and can't be routed to the node.

On the other hand, it also removes them from `talosctl get members`, but we
don't have to split this up right now.

For the KubeSpan endpoints, we still use 'all' addresses, as external
IPs are perfect as KubeSpan endpoints (Wireguard endpoints).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-21 15:02:53 +04:00
Andrey Smirnov
54d9032ce2
test: fix log streaming for conformance tests
The global timeout in kubeconfig was causing log streaming to abort.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-18 22:47:36 +04:00
Serge Logvinov
e432579d48
feat: kubespan node endpoints filter
This feature allows us to use only IPv4 or IPv6 stack to reach the peers.
It can also help avoid sharing node-specific IPs
that are not reachable at all.
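
A filter of this kind can be sketched with CIDR matching. This is an illustration, not the feature's implementation: `filter_endpoints` is a hypothetical name, and the `!` prefix for exclusions is an assumption of the sketch.

```python
import ipaddress


def filter_endpoints(addresses: list[str], filters: list[str]) -> list[str]:
    """Keep addresses that match a positive CIDR and no '!'-negated CIDR."""
    allow = [ipaddress.ip_network(f) for f in filters if not f.startswith("!")]
    deny = [ipaddress.ip_network(f[1:]) for f in filters if f.startswith("!")]
    kept = []
    for raw in addresses:
        addr = ipaddress.ip_address(raw)
        # Membership tests across IP versions are simply False, so an
        # IPv4-only filter drops every IPv6 address and vice versa.
        if any(addr in net for net in allow) and not any(addr in net for net in deny):
            kept.append(raw)
    return kept
```

With the filters `["0.0.0.0/0", "!192.168.0.0/16"]`, only IPv4 addresses outside `192.168.0.0/16` are kept; with `["::/0"]`, only IPv6 addresses are kept.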

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
2022-11-18 19:55:42 +04:00
Andrey Smirnov
6430ce1efc
fix: limit SideroLink Wireguard link MTU to 1280
See https://github.com/siderolabs/siderolink/pull/19

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-18 00:09:10 +04:00
Dmitriy Matrenichev
1f1128028a
chore: add flag to force talos cluster folder deletion
This is handy when the node with qemu went down, so you had to manually delete the folder after it restarted.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-11-17 20:15:50 +03:00
Andrey Smirnov
d9c2c6f0a5
chore: update Kubernetes Go modules to 0.26.0-rc.1
Follow up for Kubernetes version bump.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-17 15:37:58 +04:00
Utku Ozdemir
3d30ce6d7a
feat: add util function to extract GRPC status from error
Add a function to the machinery to extract GRPC status.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-11-16 23:24:31 +01:00
Andrey Smirnov
9e44341c44
release(v1.3.0-alpha.2): prepare release
This is the official v1.3.0-alpha.2 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-16 22:00:33 +04:00
Andrey Smirnov
aa56aed798
feat: publish discovered public IP as one of the KubeSpan endpoints
This resolves a case when a node is behind NAT, but the KubeSpan port is
forwarded back to the node. The Discovery Service returns the public IP of
the client as seen from the incoming request. That address is now
published to the KubeSpan endpoints.

Fixes #6508

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-16 17:36:38 +04:00
Andrey Smirnov
9382443baa
feat: update Kubernetes to v1.26.0-rc.0
Removed deprecated arg from the kubelet spec, as the arg is going to be
removed completely in v1.27 (kubelet defaults to remote CRI anyways).

Go modules not updated due to https://github.com/kubernetes/kubernetes/issues/113951

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-16 17:07:06 +04:00
Andrey Smirnov
6ffc381c59
feat: implement CRI configuration customization
This is tricky, as containerd itself doesn't merge plugin configuration
across multiple files. TOML can't load configuration correctly from
concatenated files.
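
One way around the concatenation problem is to parse each file separately and merge the resulting tables. The sketch below shows such a recursive merge on already-parsed dictionaries; it is not the approach confirmed by this commit, and `merge_config` is a hypothetical name.

```python
def merge_config(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base.

    Nested tables are merged key by key; any other value is replaced.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged
```

Merging parsed tables keeps settings from both files under the same plugin section, which plain text concatenation cannot do when the same table header appears twice.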

Fixes #6390

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-16 15:38:44 +04:00
Philipp Sauter
e1e340bdd9
feat: expose Talos node labels as a machine configuration field
We add the `nodeLabels` key to the machine config to allow users to add
node labels to the Kubernetes Node object. A controller
reads the nodeLabels from the machine config and applies them via the
Kubernetes API.
Older versions of talosctl will throw an unknown keys error if `edit mc`
is called on a node with this change.

Fixes #6301

Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-15 21:25:40 +04:00