412 Commits

Author SHA1 Message Date
Artem Chernyshev
5c6648e3d2
fix: make talosctl command return nonzero error codes if it had errors
Multinode requests were printing out the errors for each node to stderr,
but they didn't set the global error.

Refactor the code a bit to use a single function for handling that logic
to avoid rewriting it in many other places.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-08-12 14:19:45 +03:00
Artem Chernyshev
13499fc302
feat: support patching the machine config in the apply-config cmd
Fixes: https://github.com/siderolabs/talos/issues/6045

`talosctl apply-config` now supports `--config-patch` flag that takes
machine config patches as the input.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-08-11 13:56:23 +03:00
Noel Georgi
b62b18a972
feat: bump k8s to v1.25.0-beta.0
Bump k8s to v1.25.0-beta.0

Update most kubernetes `master` references to `controlplane`

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-08-10 22:17:53 +05:30
Utku Ozdemir
84e712a9f1
feat: introduce Talos API access from Kubernetes
We add a new CRD, `serviceaccounts.talos.dev` (with `tsa` as short name), and its controller which allows users to get a `Secret` containing a short-lived Talosconfig in their namespaces with the roles they need. Additionally, we introduce the `talosctl inject serviceaccount` command to accept a YAML file with Kubernetes manifests and inject them with Talos service accounts so that they can be directly applied to Kubernetes afterwards. If Talos API access feature is enabled on Talos side, the injected workloads will be able to talk to Talos API.

Closes siderolabs/talos#4422.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-08-08 18:27:26 +02:00
Andrey Smirnov
f9b664c947
fix: reload trusted CA list when client is recreated
Fixes #5652

This reworks and unifies HTTP client/transport management in Talos:

* cleanhttp is used everywhere consistently
* DefaultClient is using pooled client, other clients use regular
  transport
* like before, Proxy vars are inspected on each request (but now
  consistently)
* manifest download functions now recreate the client on each run to
  pick up latest changes
* system CA list is picked up from a fixed locations, and supports
  reloading on changes

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-04 20:01:35 +04:00
Andrey Smirnov
a6b010a8b4
chore: update Go to 1.19, Linux to 5.15.58
See https://go.dev/doc/go1.19

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-03 17:03:58 +04:00
Andrey Smirnov
fe2ee3b100
feat: implement MachineStatus resource
Fixes #5789

Example:

```yaml
spec:
    stage: running
    status:
        ready: false
        unmetConditions:
            - name: staticPods
              reason: kube-system/kube-controller-manager-talos-default-master-1 not ready, kube-system/kube-scheduler-talos-default-master-1 not ready
```

As events (CLI doesn't show full contents):

```
172.20.0.2   cbhf2l6f9lrs738hehfg   talos/runtime/machine.MachineStatusEvent   BOOTING   ready: false, unmet conditions: [time network services]
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-01 18:36:10 +04:00
Dmitriy Matrenichev
30f7851d2a
chore: bump golangci-lint from 1.45.2 to 1.47.2
Minor linter upgrade.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-07-22 17:49:44 +03:00
Noel Georgi
d650afb6cd
chore: fix typo in powercycle
Fix typo in `powercycle`

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-07-20 21:33:17 +05:30
Andrey Smirnov
80444a43d9
fix: remove data race in pcap capture
Capture handle should be closed in the same goroutine with packet
reading.

Fix a spurious error which might appear in `talosctl pcap`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-07-20 01:10:59 +04:00
Andrey Smirnov
065b59276c
feat: implement packet capture API
This uses the `go-packet` library with native bindings for the packet
capture (without `libpcap`). This is not the most performant way, but it
allows us to avoid CGo.

There is a problem with converting network filter expressions (like
`tcp port 3222`) into BPF instructions, it's only available in C
libraries, but there's a workaround with `tcpdump`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-07-19 01:23:09 +04:00
Utku Ozdemir
a75fe7600d
feat: gen secrets from kubernetes pki dir
This PR allows the ability to generate `secrets.yaml` (`talosctl gen secrets`) using a Kubernetes PKI directory path (e.g. `/etc/kubernetes/pki`) as input. Also introduces the flag `--kubernetes-bootstrap-token` to be able to set a static Kubernetes bootstrap token to the generated `secrets.yaml` file instead of a randomly-generated one. Closes siderolabs/talos#5894.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-07-16 13:06:32 +02:00
Andrey Smirnov
641f6a1e4e
feat: expose strategic merge config patches
The end result is that every Talos CLI accepts both JSON and strategic
patches to patch machine configuration.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-07-12 15:38:01 +04:00
Utku Ozdemir
d924901b79
feat: add cli subcommand to generate secrets
Adds a new command `talosctl gen secrets` to generate a `secrets.yaml` file with Talos and Kubenetes secrets. This file can later be used like `talosctl gen config ... --with-secrets secrets` to generate a config with these pre-generated secrets. Closes siderolabs/talos#5861.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-07-06 20:00:35 +02:00
Utku Ozdemir
ae3840dbc3
refactor: move kubeconfig package under public api
Move the kubeconfig package under pkg/ so that other projects can reuse parts of it.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-07-01 19:22:16 +02:00
Utku Ozdemir
6759fcd4ae
feat: use discovery service on cluster health checks
Query the discovery service to fetch the node list and use the results in health checks. Closes siderolabs#5554.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-06-15 16:01:38 +02:00
Utku Ozdemir
8d2be5e315
feat: extend node definition used in health checks
Introduce `cluster.NodeInfo` to represent the basic info about a node which can be used in the health checks. This information, where possible, will be populated by the discovery service in following PRs. Part of siderolabs#5554.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-06-13 14:13:42 +02:00
Philipp Sauter
7a11b4def7
fix: make talosctl bootstrap accept only single node
Previously talosctl would accept multiple nodes for the bootstrap
command which is a strictly single-node operation. Talosctl will abort
the bootstrap command if more than one node is specified either as a
command-line flag or in talosconfig.

Fixes #5636

Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
2022-06-10 22:27:15 +02:00
Andrey Smirnov
da2985fe1b
fix: respect local API server port
It wasn't used when building an endpoint to the local API server, so
Talos couldn't talk to the local API server when port was changed from
the default one.

Fixes #5706

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-09 00:33:49 +04:00
Andrey Smirnov
e03266667f
fix: correctly validate reboot mode in CLI
This fixes an issue when invalid `--mode` option was treated as a
default mode.

Fixes #5712

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-08 23:45:04 +04:00
Dmitriy Matrenichev
70fc424099
chore: add generic methods and use them
Things like ToSet, Keys etc...

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-06-09 02:59:23 +08:00
Andrey Smirnov
fe858041bd
feat: enable version API in maintenance mode
Version API is only available over SideroLink connection.

This is useful to find Talos version as it got booted (e.g. to generate
proper machine configuration).

There's a security concern that version API might return sensitive
information via public API. At the same time Talos version can be
guessed by looking at the output of other APIs, e.g. resource type list
(`talosctl get rd`), which changes with every minor version.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-05-26 21:47:10 +04:00
Andrey Smirnov
3f88030ca7
test: use use correct method to generate Wireguard private key
`GenerateKey` generates random 32 bytes vs. the key suitable for
Wireguard endpoint key.

This is the only place in code with this bug, and it is only used in
test code (`talosctl cluster create` with fixed Wireguard
configuration).

SideroLink and Kubespan are not affected.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-05-24 23:18:23 +04:00
Tim Jones
4551cbd7fc
fix: cluster creation error message formatting
Use "%w" to properly unwrap the error operand

Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
2022-05-24 18:12:33 +02:00
Tim Jones
bafa1f49d4
fix: improve error message when creating cluster
Add extra context to error message when unable to properly
open the talos config file when creating a cluster.

Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
2022-05-24 13:40:15 +02:00
Philipp Sauter
b52e0b9b9e
fix: talosctl throws error if gen option and --input-dir flags are combined
The user will get an error message and talosctl aborts if `talosctl cluster create` is called with gen options and the --input-dir flag.

Fixes #2275

Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
2022-05-04 16:18:03 +02:00
Artem Chernyshev
2b03057b91
feat: implement a new mode try in the config manipulation commands
The new mode allows changing the config for a period of time, which
allows trying the configuration and automatically rolling it back in case
if it doesn't work for example.

The mode can only be used with changes that can be applied without a
reboot.

When changed it doesn't write the configuration to disk, only changes it
in memory.
`--timeout` parameter can be used to customize the rollback delay.
The default timeout is 1 minute.

Any consequent configuration change will abort try mode and the last
applied configuration will be used.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-04-21 20:31:45 +03:00
Andrey Smirnov
68dfdd3311
fix: provide logger to the etcd snapshot restore
With update of the client library to 3.5.3, etcd library started using
the logger, so using `nil` isn't fine anymore.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-19 15:16:33 +03:00
Artem Chernyshev
2b9722d1f5
feat: add dry-run flag in apply-config and edit commands
Dry run prints out config diff, selected application mode without
changing the configuration.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-04-14 19:12:57 +03:00
Andrey Smirnov
8af50fcd27
fix: correct cri package import path
Containerd CRI plugin was merged into the main repo, but we were using
old import path, so our constants coming from the module were outdated.

This fixes the image version for the pause container.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-14 16:27:45 +03:00
Andrey Smirnov
19bf12af07
fix: enable IPv6 in Docker-based Talos clusters
Docker by default disable IPv6 completely in the containers which breaks
SideroLink on Docker-based clusters, as SideroLink is using IPv6
addresses for the Wiregurard tunnel.

This change might break `talosctl cluster create` on host systems which
have IPv6 disabled completely, so provide a flag to revert this
behavior.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-01 20:28:12 +03:00
Tim Jones
95d900de77
feat: use kubeconfig env var
When interating with the kubeconfig it can be
expected that a user may have the KUBECONFIG
environment variable set, so we need to use
it when appropriate.
Closes #5091

Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
2022-03-31 15:30:26 +02:00
Andrey Smirnov
12931dcedd
fix: align partitions on 1M boundary
Potentially fixes: #4985

See siderolabs/go-blockdevice#58 for details.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-31 14:36:13 +03:00
Dmitriy Matrenichev
b315ed9532
chore: use go:embed instead of ldflags
Generate separate file for each variable and assign them during go build using go:embed instead of using ldflags -X.

Resolves #5138

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-03-30 18:15:48 +04:00
Andrey Smirnov
7283efd568
chore: update the talosctl CNI download url
There was hardcoded org/username.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-24 15:08:05 +03:00
Dmitriy Matrenichev
e06e1473b0
feat: update golangci-lint to 1.45.0 and gofumpt to 0.3.0
- Update golangci-lint to 1.45.0
- Update gofumpt to 0.3.0
- Fix gofumpt errors
- Add goimports and format imports since gofumports is removed
- Update Dockerfile
- Fix .golangci.yml configuration
- Fix linting errors

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-03-24 08:14:04 +04:00
Nico Berlee
6bec084299
feat: add talosctl completions to copy, usage, logs, restart and service
talosctl copy: path completion of src node path, dst client path was done by default
talosctl usage: directory completion of node path
talosctl logs: service completion
talosctl restart: running container/pod completion
talosctl service: service and action completion

Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-16 13:33:06 +03:00
Andrey Smirnov
59681b8c9a
fix: backport fixes from release-1.0 branch
They were discovered as we tagged 1.0.0 version:

* wrong deprecated version
* incompatibility in extension compatibility checks

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-04 23:28:06 +03:00
Artem Chernyshev
a50747a64a
fix: align list and diskusage command flags with their Linux analogs
Fixes: https://github.com/talos-systems/talos/issues/3018

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-03-02 22:27:56 +03:00
Andrey Smirnov
09efa62f68
chore: re-enable kexec and default to UEFI booting in tests
Fixes #4947

It turns out there's something related to boot process in BIOS mode
which leads to initramfs corruption on later `kexec`.

Booting via GRUB is always successful.

Problem with kexec was confirmed with:

* direct boot via QEMU
* QEMU boot via iPXE (bundled with QEMU)

The root cause is not known, but the only visible difference is the
placement of RAMDISK with UEFI and BIOS boots:

```
[    0.005508] RAMDISK: [mem 0x312dd000-0x34965fff]
```

or:

```
[    0.003821] RAMDISK: [mem 0x711aa000-0x747a7fff]
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-02 21:52:18 +03:00
Andrey Smirnov
6ccfdbaf1b
fix: avoid replacing default gRPC codec in machinery
Fixes #4987

As machinery is supposed to be widely used project, and gRPC lacks
proper support to override default codec easily, it might come into
conflict with other projects.

Instead, move codec to core talos, and register it explicitly in the
server code (which covers machined, apid, trustd) and client code
(talosctl).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-02-18 00:39:41 +03:00
Charlie Haley
fef99892d5
chore: pin kubernetes version to talosctl gen config
Pin talos default k8s version to `talosctl gen config`

Signed-off-by: Charlie Haley <charlie.haley@hotmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-02-11 16:47:49 +03:00
Tim Jones
fe40e7b1b3
feat: drain node on shutdown
Cordon & drain a node when the Shutdown message is received.
Also adds a '--force' option to the shutdown command in case the control
plane is unresponsive.

Signed-off-by: Tim Jones <timniverse@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-02-01 00:06:32 +03:00
Bernard Sébastien
7f0b3aae0a
feat: add multiple config patches, patches from files, YAML support
Include filename content if value begins with @ (see curl for example).

Add multiple config-path option on cmdline to apply them in order.

ex:

```
talosctl-linux-amd64 gen config talos1 https://127.0.0.1:6443 --config-patch-control-plan @cidrs.json --config-patch-worker @sysctls-workders.json --config-path @cluster-name.json
```

Load JSON patch from YAML.

This applies to all commands handling config patches.

Closes: https://github.com/talos-systems/talos/issues/4764

Signed-off-by: Sébastien Bernard <sbernard@nerim.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-01-31 22:50:46 +03:00
Florian Klink
4245f72d3f
feat: add --extra-uefi-search-paths option
This allows specifying additional paths to look for UEFI firmware.

Signed-off-by: Florian Klink <flokli@flokli.de>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-01-27 19:55:36 +03:00
Noel Georgi
151c9df091
chore: add CSI tests for e2e-qemu
Add tests for using rook as CSI for e2e-qemu
Allow specifying cpu/memory for workers

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-01-27 20:06:10 +05:30
Artem Chernyshev
ebec5d4a0c
feat: support full disk path in the diskSelector
Fixes: https://github.com/talos-systems/talos/issues/4788

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-27 15:23:00 +03:00
Artem Chernyshev
2f2bdb26aa
feat: replace flags with --mode in apply, edit and patch commands
Fixes: https://github.com/talos-systems/talos/issues/4588

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-13 16:09:53 +03:00
Andrey Smirnov
2f4b9d8d6d
feat: make machine configuration read-only in Talos (almost)
Talos shouldn't try to re-encode the machine config it was provided
with.

So add a `ReadonlyWrapper` around `*v1alpha1.Config` which makes sure
that raw config object is not available anymore (it's a private field),
but config accessors are available for read-only access.

Another thing that `ReadonlyWrapper` does is that it preserves the
original `[]byte` encoding of the config keeping it exactly same way as
it was loaded from file or read over the network.

Improved `talosctl edit mc` to preserve the config as it was submitted,
and preserve the edits on error from Talos (previously edits were lost).

`ReadonlyWrapper` is not used on config generation path though - config
there is represented by `*v1alpha.Config` and can be freely modified.

Why almost? Some parts of Talos (platform code) patch the machine
configuration with new data. We need to fix platforms to provide
networking configuration in a different way, but this will come with
other PRs later.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-28 20:12:55 +03:00
Andrey Smirnov
f235cfbaed
fix: multiple usability fixes
Add `cat` alias for `talosctl read`, I always end up writing
`talosctl cat`.

Improve error messages for APIs available only on control plane nodes:

was:

```
$ talosctl -n 172.20.0.5 kubeconfig
rpc error: code = Unknown desc = error getting Kubernetes CA: failed to parse PEM block
error initializing gzip: EOF

$ talosctl -n 172.20.0.5 etcd members
error getting members: 1 error occurred:
	* 172.20.0.5: rpc error: code = Unknown desc = error building etcd client TLS config: open /system/secrets/etcd/admin.crt: no such file or directory
```

now:

```
$ talosctl -n 172.20.0.6 kubeconfig
rpc error: code = Unimplemented desc = kubeconfig is only available on control plane nodes
error initializing gzip: EOF

$ talosctl -n 172.20.0.6 etcd remove-member foo
1 error occurred:
	* 172.20.0.6: rpc error: code = Unimplemented desc = etcd remove member is only available on control plane nodes
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-27 17:41:33 +03:00