190 Commits

Author SHA1 Message Date
Andrey Smirnov
89dbb0ecf0
release(v1.4.0-alpha.0): prepare release
This is the official v1.4.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-12-23 22:32:09 +04:00
Andrey Smirnov
bbb56840e4
chore: update protobuf API descriptors for 1.3.0
Set the API descriptors for v1.3.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-29 14:41:43 +04:00
Serge Logvinov
e432579d48
feat: kubespan node endpoints filter
This feature allows us to use only IPv4 or IPv6 stack to reach the peers.
It also helps avoid advertising node-specific IPs
that are not reachable at all.
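
A minimal sketch of this kind of endpoint filtering, assuming the filter is expressed as a list of allowed CIDR prefixes. The helper `filterEndpoints` is hypothetical and is not the KubeSpan implementation; only the idea of restricting peers to one IP family comes from the commit.

```go
package main

import (
	"fmt"
	"net/netip"
)

// filterEndpoints keeps only the addresses contained in at least one of the
// allowed prefixes. Hypothetical helper, for illustration only.
func filterEndpoints(addrs, allowed []string) ([]string, error) {
	prefixes := make([]netip.Prefix, 0, len(allowed))
	for _, s := range allowed {
		p, err := netip.ParsePrefix(s)
		if err != nil {
			return nil, fmt.Errorf("parse filter %q: %w", s, err)
		}
		prefixes = append(prefixes, p)
	}

	var out []string
	for _, s := range addrs {
		a, err := netip.ParseAddr(s)
		if err != nil {
			return nil, fmt.Errorf("parse endpoint %q: %w", s, err)
		}
		for _, p := range prefixes {
			// An IPv4 prefix never contains an IPv6 address and vice versa.
			if p.Contains(a) {
				out = append(out, a.String())
				break
			}
		}
	}
	return out, nil
}

func main() {
	endpoints := []string{"192.0.2.10", "2001:db8::10"}
	// 0.0.0.0/0 matches every IPv4 address, so only IPv4 endpoints remain.
	v4only, err := filterEndpoints(endpoints, []string{"0.0.0.0/0"})
	fmt.Println(v4only, err)
}
```

Passing `::/0` instead would keep only the IPv6 endpoints; a narrower prefix could be used to drop addresses that are known to be unreachable.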

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
2022-11-18 19:55:42 +04:00
Philipp Sauter
e1e340bdd9
feat: expose Talos node labels as a machine configuration field
We add the `nodeLabels` key to the machine config to allow users to add
node labels to the kubernetes Node object. A controller
reads the nodeLabels from the machine config and applies them via the
kubernetes API.
Older versions of talosctl will throw an unknown keys error if `edit mc`
is called on a node with this change.

Fixes #6301

Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-15 21:25:40 +04:00
Philipp Sauter
4e114ca120
feat: use the etcd member id for etcd operations instead of hostname
We add a controller that provides the etcd member id as a resource
and change the etcd related commands to support member ids next to
hostnames.

Fixes: #6223

Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
2022-11-10 19:17:56 +04:00
Serge Logvinov
06fea24414
feat: expand platform metadata resources
* add IPv6 to the ExternalIPs resource.
* platformMetadata can define Spot instances.

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-07 18:57:17 +04:00
Andrey Smirnov
96aa9638f7
chore: rename talos-systems/talos to siderolabs/talos
There's a cyclic dependency on siderolink library which imports talos
machinery back. We will fix that after we get talos pushed under a new
name.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-03 16:50:32 +04:00
Andrey Smirnov
30bbf6463a
refactor: use siderolabs/net version with netip.Addr
Replace most of `net.IP` usage in Talos with `netip.Addr`, refactor code
accordingly.
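
As an illustration of what such a migration typically involves, the sketch below converts a legacy `net.IP` into a `netip.Addr` using only the standard library. The helper `toAddr` is hypothetical and is not taken from siderolabs/net.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
)

// toAddr converts a legacy net.IP into a netip.Addr. IPv4-mapped IPv6 forms
// are unmapped so that 4-byte and 16-byte IPv4 values compare equal.
// Hypothetical helper, for illustration only.
func toAddr(ip net.IP) (netip.Addr, bool) {
	addr, ok := netip.AddrFromSlice(ip)
	if !ok {
		return netip.Addr{}, false
	}
	return addr.Unmap(), true
}

func main() {
	// net.ParseIP returns the 16-byte form even for IPv4 addresses.
	a, _ := toAddr(net.ParseIP("10.0.0.1"))
	b, _ := toAddr(net.IPv4(10, 0, 0, 1).To4())

	// netip.Addr is a comparable value type, so == works directly.
	fmt.Println(a == b, a.Is4())
}
```

Unlike `net.IP`, a `netip.Addr` can be used as a map key and compared with `==`, which is one common motivation for this kind of refactoring.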

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-11-02 14:21:03 +04:00
Serge Logvinov
8bfa7ac1d6
feat: platform metadata resource
This resource stores common platform metadata information.
Such as:

* Hostname
* Region
* Zone
* InstanceType (SKU)
* InstanceID
* ProviderID (CCM cloud native magic string)

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-10-28 14:32:39 +04:00
Philipp Sauter
23842114f0
feat: support encryption with secretbox
We add support for encryption with secretbox. While AESCBC is still
supported, secretbox will take precedence if both are configured.
Secretbox is now the default encryption for new clusters.
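
The precedence rule can be sketched as follows. The type `encryptionSecrets` and its field names are assumptions made for this example and do not mirror the actual Talos machine config.

```go
package main

import (
	"errors"
	"fmt"
)

// encryptionSecrets is a hypothetical stand-in for the cluster secrets;
// the field names are illustrative only.
type encryptionSecrets struct {
	AESCBCSecret    string
	SecretboxSecret string
}

// chooseProvider returns the name of the encryption provider to use.
// Secretbox wins when both secrets are present.
func chooseProvider(s encryptionSecrets) (string, error) {
	switch {
	case s.SecretboxSecret != "":
		return "secretbox", nil
	case s.AESCBCSecret != "":
		return "aescbc", nil
	default:
		return "", errors.New("no encryption secret configured")
	}
}

func main() {
	provider, err := chooseProvider(encryptionSecrets{
		AESCBCSecret:    "legacy-secret",
		SecretboxSecret: "new-secret",
	})
	fmt.Println(provider, err)
}
```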

Fixes: #6362

Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
2022-10-26 19:06:53 +02:00
Philipp Sauter
c6e1702eca
feat: use URL-based manifests to present static pods to the kubelet
Previously static pod manifests were written to and read from a folder
on the disk. We add a controller that cleans up the default static pod
manifests on the disk and serves them as a PodList manifest via HTTP.
The URL to the manifest is injected into the kubelet. File-based static pod
manifests are still supported and may be enabled by setting the key
kubelet -> enableManifestsDirectory in the machine config.

Fixes #5494

Signed-off-by: Philipp Sauter <philipp.sauter@siderolabs.com>
2022-10-25 14:30:19 +02:00
Serge Logvinov
dc70d892a3
fix: support setting KubeSpan link MTU
KubeSpan creates packets larger than the MTU of the external interface.

This PR adds the ability to change the MTU through the machine config
and sets the MTU of the default KubeSpan route.

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-10-17 14:39:15 +04:00
Andrey Smirnov
993743f634
fix: skip hostname via DHCP on OpenStack platform
Introduce new DHCP operator option to skip hostname request/response,
and use that in OpenStack platform.

OpenStack configures interface with DHCP, while providing dummy hostname
over DHCP and proper hostname over metadata. As operators override
platform settings, DHCP hostname takes over OpenStack hostname. As a
fix, ignore DHCP hostname while on OpenStack.

Fixes #6350

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-10-10 14:18:46 +04:00
Noel Georgi
48dee48057
feat: support mtu for routes
Support setting MTU for routes.

Fixes: #6324

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-30 16:38:22 +05:30
Serge Logvinov
18c377a4d1
feat: customize audit policy
Add resource `AuditPolicyConfigs.kubernetes.talos.dev`.
It can be changed through machine config `cluster.apiServer.auditPolicy`

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-28 13:46:44 +04:00
Andrey Smirnov
0b2767c164
feat: implement 'permanent addr' in link statuses
Permanent address is only available for physical links, and it might be
different from the 'hardware address': when bonding, 'hardware address'
gets overridden from the bond master, while 'permanent address' still
shows MAC of the interface.

This is part of the fix for the incorrect bonding issue on Equinix Metal.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-26 14:45:46 +04:00
Andrey Smirnov
ce12c7b380
chore: update COSI runtime to v0.2.0-alpha.1
This adds metadata annotations and fixes some hanging watch loops.

There should be no functional changes for Talos.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-09-20 22:02:57 +04:00
Noel Georgi
5e21cca52d
feat: support setting kernel parameters
Support setting kernel parameters via machine config.

Fixes: #6206

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-09-05 23:45:51 +05:30
Dmitriy Matrenichev
bd56621cdf
feat: add structprotogen tool
This commit adds structprotogen tool which is used to generate proto file from Go structs.

Closes #6078.

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-09-05 16:54:00 +03:00
Andrey Smirnov
7e527777e8
chore: update API descriptors
Re-generate protobuf API descriptors in preparation for 1.2.0-beta.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-15 17:35:09 +04:00
Andrey Smirnov
9baca49662
refactor: implement COSI resource API for Talos
Overview: deprecate existing Talos resource API, and introduce new COSI
API.

Consequences:

* COSI API can only go via one-2-one proxy (`client.WithNode`)
* client-side API access is way easier with `state.State` wrappers
* lots of small changes on the client side to use new APIs

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-12 22:31:54 +04:00
Dmitriy Matrenichev
e422ea63d0
chore: add proto definitions for common types
This commit adds proto definitions for these types:
- *url.URL
- netaddr.IP
- netaddr.IPPort
- netaddr.IPPrefix
- *x509.PEMEncodedKey
- *x509.PEMEncodedCertificateAndKey

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-08-12 15:38:31 +03:00
Utku Ozdemir
b5da686a7b
feat: add actor ID to events & emit an initial empty event
Add a new field `actorID` to the events and populate it with a UUID for the lifecycle actions `reboot`, `reset`, `upgrade` and `shutdown`. This actor ID will be present on all events emitted by this triggered action. We can use this ID later on the client side to be able to track triggered actions.

We also emit an event with an empty payload on the events streaming GRPC endpoint when a client connects. The purpose of this event is to signal to the client that the event streaming has actually started.
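
A rough sketch of how a client might use the actor ID to follow one triggered action. The `Event` type here is a simplified, hypothetical stand-in for the real event message, and a random hex string stands in for the UUID mentioned above.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// Event is a simplified, hypothetical stand-in for an event message.
type Event struct {
	ActorID string
	Message string
}

// newActorID returns a random 128-bit identifier as 32 hex characters.
// The commit uses a UUID; this is only an approximation for the sketch.
func newActorID() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// eventsForActor returns the events emitted by a single triggered action,
// preserving their original order.
func eventsForActor(events []Event, actorID string) []Event {
	var out []Event
	for _, e := range events {
		if e.ActorID == actorID {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	id, err := newActorID()
	if err != nil {
		panic(err)
	}
	stream := []Event{
		{Message: ""}, // initial empty event: streaming has started
		{ActorID: id, Message: "reboot: started"},
		{ActorID: "some-other-action", Message: "upgrade: started"},
		{ActorID: id, Message: "reboot: finished"},
	}
	for _, e := range eventsForActor(stream, id) {
		fmt.Println(e.Message)
	}
}
```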

Server-side part of siderolabs/talos#5499.

Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
2022-08-11 15:14:11 +02:00
Noel Georgi
b62b18a972
feat: bump k8s to v1.25.0-beta.0
Bump k8s to v1.25.0-beta.0

Update most kubernetes `master` references to `controlplane`

Signed-off-by: Noel Georgi <git@frezbo.dev>
2022-08-10 22:17:53 +05:30
Dmitriy Matrenichev
7b80a747bc
feat: add protobuf encoding/decoding for Go structs
This commit adds support for encoding/decoding Go structs with `protobuf:<n>` tags.
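
To illustrate how such tags can be read, here is a small sketch based on the standard `reflect` package. It assumes the tag is written as `protobuf:"1"`; the `Example` struct and the `fieldNumbers` helper are hypothetical, and this is not the encoder added by the commit.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// Example is a hypothetical struct; only the tag key comes from the commit.
type Example struct {
	Hostname string `protobuf:"1"`
	Port     uint16 `protobuf:"2"`
	Note     string // untagged fields are skipped in this sketch
}

// fieldNumbers maps struct field names to the field numbers in their tags.
func fieldNumbers(v any) (map[string]int, error) {
	t := reflect.TypeOf(v)
	if t != nil && t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t == nil || t.Kind() != reflect.Struct {
		return nil, fmt.Errorf("expected a struct, got %T", v)
	}

	out := make(map[string]int)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag, ok := f.Tag.Lookup("protobuf")
		if !ok {
			continue
		}
		n, err := strconv.Atoi(tag)
		if err != nil {
			return nil, fmt.Errorf("field %s: invalid tag %q", f.Name, tag)
		}
		out[f.Name] = n
	}
	return out, nil
}

func main() {
	nums, err := fieldNumbers(Example{})
	fmt.Println(nums, err)
}
```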

Closes #5940

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2022-08-10 16:04:08 +03:00
Andrey Smirnov
92314e47bf
refactor: use controllers/resources to feed trustd with data
This is mostly the same as the way `apid` consumes certificates generated by
`machined` via COSI API connection.

Service `trustd` consumes two resources:

* `secrets.Trustd` which contains `trustd` server TLS certificates and
  it gets refreshed as e.g. node IP changes
* `secrets.OSRoot` which contains Talos API CA and join token

This PR fixes an issue with `trustd` certs not always including all IPs
of the node, as previously `trustd` certs would only capture the addresses of
the node at the moment of `trustd` startup.

The refactoring also allows the API CA and join token to be changed
dynamically. This needs more work, but `trustd` should now pick up
such changes without any further modification.

Fixes #5863

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-04 23:45:34 +04:00
Andrey Smirnov
fe2ee3b100
feat: implement MachineStatus resource
Fixes #5789

Example:

```yaml
spec:
    stage: running
    status:
        ready: false
        unmetConditions:
            - name: staticPods
              reason: kube-system/kube-controller-manager-talos-default-master-1 not ready, kube-system/kube-scheduler-talos-default-master-1 not ready
```

As events (CLI doesn't show full contents):

```
172.20.0.2   cbhf2l6f9lrs738hehfg   talos/runtime/machine.MachineStatusEvent   BOOTING   ready: false, unmet conditions: [time network services]
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-08-01 18:36:10 +04:00
Artem Chernyshev
ae1bec59e9
feat: allow running only one sequence at a time
Fix the `Talos` sequencer to run only a single sequence at a time.
Sequence priorities were updated to match the table:

| what is running (columns) what is requested (rows) | boot | reboot | reset | upgrade |
|----------------------------------------------------|------|--------|-------|---------|
| reboot                                             | Y    | Y      | Y     | N       |
| reset                                              | Y    | N      | N     | N       |
| upgrade                                            | Y    | N      | N     | N       |

`WithTakeover` is still supported: if it is set, priority is ignored.

This is mainly used when invoking the `Shutdown` sequence
and when applying config with reboot enabled.
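
The table can be encoded as a simple lookup. This sketch assumes that `Y` means the requested sequence is allowed to proceed while the other one is running; the function `canStart` and its `takeover` parameter are hypothetical names, not the sequencer's actual API.

```go
package main

import "fmt"

// allowed[requested][running] mirrors the table above, row by row.
// Reading "Y" as "the request may proceed" is an assumption of this sketch.
var allowed = map[string]map[string]bool{
	"reboot":  {"boot": true, "reboot": true, "reset": true, "upgrade": false},
	"reset":   {"boot": true, "reboot": false, "reset": false, "upgrade": false},
	"upgrade": {"boot": true, "reboot": false, "reset": false, "upgrade": false},
}

// canStart reports whether the requested sequence may start while another
// sequence is running. With takeover set, priority is ignored.
func canStart(requested, running string, takeover bool) bool {
	if takeover {
		return true
	}
	// Unknown combinations fall through to false.
	return allowed[requested][running]
}

func main() {
	fmt.Println(canStart("reboot", "boot", false))
	fmt.Println(canStart("upgrade", "reset", false))
	fmt.Println(canStart("upgrade", "reset", true))
}
```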

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-07-27 17:21:36 +03:00
Andrey Smirnov
065b59276c
feat: implement packet capture API
This uses the `go-packet` library with native bindings for the packet
capture (without `libpcap`). This is not the most performant way, but it
allows us to avoid CGo.

There is a problem with converting network filter expressions (like
`tcp port 3222`) into BPF instructions, it's only available in C
libraries, but there's a workaround with `tcpdump`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-07-19 01:23:09 +04:00
Andrey Smirnov
022581d809
release(v1.2.0-alpha.0): prepare release
This is the official v1.2.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-30 19:01:07 +04:00
Andrey Smirnov
f2997c0f22
chore: bump dependencies
dependabot + go-mod-outdated

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-06-06 23:27:17 +04:00
Artem Chernyshev
2b03057b91
feat: implement a new mode try in the config manipulation commands
The new mode allows changing the config for a period of time, which
allows trying the configuration and automatically rolling it back if
it doesn't work, for example.

The mode can only be used with changes that can be applied without a
reboot.

When changed it doesn't write the configuration to disk, only changes it
in memory.
`--timeout` parameter can be used to customize the rollback delay.
The default timeout is 1 minute.

Any subsequent configuration change will abort try mode and the last
applied configuration will be used.
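
The behaviour described above can be approximated with an in-memory value and a rollback timer. This is only a sketch under stated assumptions: the `tryConfig` type and its methods are hypothetical, the configuration is reduced to a string, and nothing here reflects how Talos implements the mode internally.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tryConfig holds a configuration in memory and supports a timed trial.
type tryConfig struct {
	mu      sync.Mutex
	current string
	gen     uint64        // bumped on every change so stale rollbacks are ignored
	timer   *time.Timer   // pending rollback, if any
	done    chan struct{} // closed when the pending trial ends
}

func newTryConfig(initial string) *tryConfig { return &tryConfig{current: initial} }

// Current returns the configuration currently in effect.
func (c *tryConfig) Current() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.current
}

// abortLocked cancels a pending rollback. The caller must hold c.mu.
func (c *tryConfig) abortLocked() {
	c.gen++
	if c.timer != nil {
		c.timer.Stop()
		close(c.done)
		c.timer, c.done = nil, nil
	}
}

// Apply sets the configuration and aborts any trial in progress.
func (c *tryConfig) Apply(cfg string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.abortLocked()
	c.current = cfg
}

// Try applies cfg in memory and rolls back to the previous value after timeout.
func (c *tryConfig) Try(cfg string, timeout time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.abortLocked()
	previous, gen, done := c.current, c.gen, make(chan struct{})
	c.current, c.done = cfg, done
	c.timer = time.AfterFunc(timeout, func() {
		c.mu.Lock()
		defer c.mu.Unlock()
		if c.gen != gen {
			return // the trial was aborted by a later change
		}
		c.current = previous
		c.timer, c.done = nil, nil
		close(done)
	})
}

// Wait blocks until the pending trial, if any, has been rolled back or aborted.
func (c *tryConfig) Wait() {
	c.mu.Lock()
	done := c.done
	c.mu.Unlock()
	if done != nil {
		<-done
	}
}

func main() {
	cfg := newTryConfig("persisted")
	cfg.Try("experimental", 10*time.Millisecond)
	fmt.Println("during trial:", cfg.Current())
	cfg.Wait()
	fmt.Println("after timeout:", cfg.Current())
}
```

The generation counter guards against a rollback callback that has already started when a later change arrives, so an aborted trial can never overwrite the newer configuration.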

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-04-21 20:31:45 +03:00
Artem Chernyshev
2b9722d1f5
feat: add dry-run flag in apply-config and edit commands
Dry run prints the config diff and the selected application mode without
changing the configuration.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-04-14 19:12:57 +03:00
Andrey Smirnov
25d19131d3
release(v1.1.0-alpha.0): prepare release
This is the official v1.1.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-04-01 18:23:19 +03:00
Tomasz Zurkowski
cc7719c9d0
docs: improve comments in security proto
The existing comments did not match the service definition (they look
like a copy-paste from another service). I also added a few more
comments for the fields in the request and response.

Signed-off-by: Tomasz Zurkowski <zurkowski@google.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-16 14:18:48 +03:00
Caleb Woodbine
d256b5c5e4
docs: fix spelling mistakes
Resolve spelling with `misspell -w .`

Signed-off-by: Caleb Woodbine <calebwoodbine.public@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-15 15:38:25 +03:00
Andrey Smirnov
59681b8c9a
fix: backport fixes from release-1.0 branch
They were discovered as we tagged 1.0.0 version:

* wrong deprecated version
* incompatibility in extension compatibility checks

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-03-04 23:28:06 +03:00
Tim Jones
fe40e7b1b3
feat: drain node on shutdown
Cordon & drain a node when the Shutdown message is received.
Also adds a '--force' option to the shutdown command in case the control
plane is unresponsive.

Signed-off-by: Tim Jones <timniverse@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2022-02-01 00:06:32 +03:00
Artem Chernyshev
ebec5d4a0c
feat: support full disk path in the diskSelector
Fixes: https://github.com/talos-systems/talos/issues/4788

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-27 15:23:00 +03:00
Artem Chernyshev
2f2bdb26aa
feat: replace flags with --mode in apply, edit and patch commands
Fixes: https://github.com/talos-systems/talos/issues/4588

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2022-01-13 16:09:53 +03:00
Andrey Smirnov
cb548a368a
release(v0.15.0-alpha.0): prepare release
This is the official v0.15.0-alpha.0 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-12-30 16:27:19 +03:00
Rohit Dandamudi
7f9922296a
feat: add powercycle mode in reboot
- Fixes #4569
- Updated reboot process sequence
- Updated api.descriptors to avoid proto type change linting error https://github.com/talos-systems/talos/pull/4612#discussion_r758599242

Signed-off-by: Rohit Dandamudi <rohit.dandamudi@siderolabs.com>
2021-12-02 22:40:04 +05:30
Alexey Palazhchenko
8d1cbeef9f
chore: add API breaking changes detector
Closes #4576.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-30 15:06:05 +00:00
Alexey Palazhchenko
0f169bf9b1
chore: add API deprecations mechanism
Refs #4576.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-30 06:31:55 +00:00
Alexey Palazhchenko
20d39c0b48
chore: format .proto files
Refs #2722.

Co-authored-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-11-23 15:05:25 +00:00
Artem Chernyshev
f730252579
feat: add new event types
Add config load + validation errors and address + hostnames events.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2021-11-18 18:48:35 +03:00
Andrey Smirnov
c97becdd95
chore: remove interfaces and routes APIs
Fixes #4279

These APIs were deprecated in 0.13, now it's time to drop them for 0.14.

They were not used anywhere in Talos, so no changes on Talos side.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-10-27 15:34:17 +03:00
Andrey Smirnov
b450b7cef0
chore: deprecate Interfaces and Routes APIs
Fixes #4094

Deprecate old networkd APIs, `talosctl interfaces` and `talosctl routes`
now suggest different commands to be used to achieve the same task.

TUI installer was updated to stop using Interfaces API.

Those APIs will be completely removed in 0.14.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-09-27 15:21:02 +03:00
Andrey Smirnov
dadaa65d54
feat: print uid/gid for the files in ls -l
This adds information about file ownership to the long listing, which is
sometimes crucial.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-08-13 00:10:49 +03:00
Andrey Smirnov
eefe1c21c3
feat: add new etcd members in learner mode
Fixes #3714

This provides a safer way to join new members to the etcd cluster.

See https://etcd.io/docs/v3.4/learning/design-learner/

With learner mode join there are a few differences:

* new nodes are joined one by one, because etcd enforces a single
  learner member in the cluster
* learner members are not counted in quorum calculations, so while
  learner catches up with the master node, quorum is not affected and
  cluster is still operational
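
The quorum point can be made concrete with a small calculation. The `member` type and `quorum` function are hypothetical and exist only for this sketch; the rule itself (a majority of voting members, learners excluded) follows from the description above.

```go
package main

import "fmt"

// member is a hypothetical, simplified view of an etcd cluster member.
type member struct {
	Name      string
	IsLearner bool
}

// quorum returns the number of voting members required for a majority.
// Learners are not counted, so adding one does not change the result.
func quorum(members []member) int {
	voters := 0
	for _, m := range members {
		if !m.IsLearner {
			voters++
		}
	}
	if voters == 0 {
		return 0
	}
	return voters/2 + 1
}

func main() {
	cluster := []member{{Name: "cp-1"}, {Name: "cp-2"}, {Name: "cp-3"}}
	fmt.Println("quorum before join:", quorum(cluster))

	// A new node joins as a learner; the quorum size is unchanged.
	cluster = append(cluster, member{Name: "cp-4", IsLearner: true})
	fmt.Println("quorum with learner:", quorum(cluster))

	// Once promoted to a voting member, it counts towards quorum.
	cluster[3].IsLearner = false
	fmt.Println("quorum after promotion:", quorum(cluster))
}
```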

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2021-08-12 17:56:57 +03:00