278 Commits

Author SHA1 Message Date
Alexey Palazhchenko
09d70b7eaf feat: update Kubernetes to v1.22.0
Closes #3967.
Closes #3997.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@talos-systems.com>
2021-08-06 09:06:32 -07:00
Andrey Smirnov
69ead37353 fix: preserve PMBR bootable flag correctly
See https://github.com/talos-systems/go-blockdevice/pull/41

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-08-04 22:58:15 -07:00
Andrey Smirnov
dee6305170 fix: align partitions with minimal I/O size
Also print discovered blockdevice properties before partitioning the
device.

See https://github.com/talos-systems/go-blockdevice/pull/40

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-08-04 11:51:00 -07:00
Serge Logvinov
b9d04928d9 feat: move system processes to cgroups
* use cgroup v2
* cgroups: /init, /system, /system/runtime
* kubelet cgroup metrics

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
2021-08-04 09:00:38 -07:00
Andrey Smirnov
79b8fa64b9 feat: update containerd to 1.5.5
* https://github.com/containerd/containerd/releases/tag/v1.5.5

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-08-03 10:26:21 -07:00
Andrey Smirnov
539f42090e chore: bump dependencies via dependabot
Fixes #3993

Fixes #3994

Fixes #3995

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-08-03 10:25:17 -07:00
Artem Chernyshev
5f027615ff feat: expose more encryption options to the machine config
Fixes: https://github.com/talos-systems/talos/issues/3606

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-07-27 11:19:26 -07:00
Alexey Palazhchenko
585152a0be chore: bump dependencies
Closes #3983.
Closes #3984.
Closes #3985.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-07-26 04:37:25 -07:00
Artem Chernyshev
55e17ccdd1 chore: bump dependencies
Fixes: https://github.com/talos-systems/talos/pull/3954 https://github.com/talos-systems/talos/pull/3955 https://github.com/talos-systems/talos/pull/3956 https://github.com/talos-systems/talos/pull/3957 https://github.com/talos-systems/talos/pull/3958 https://github.com/talos-systems/talos/pull/3959 https://github.com/talos-systems/talos/pull/3960 https://github.com/talos-systems/talos/pull/3961 https://github.com/talos-systems/talos/pull/3962 https://github.com/talos-systems/talos/pull/3963 https://github.com/talos-systems/talos/pull/3964

And update kubelet to 1.21.3.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-07-19 06:06:01 -07:00
dependabot[bot]
604434c43e chore: bump github.com/prometheus/procfs from 0.6.0 to 0.7.0
Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.6.0 to 0.7.0.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.6.0...v0.7.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-07-12 04:33:39 -07:00
Andrey Smirnov
d930a26502 chore: implement DeepCopy for machine configuration
Resources code extensively uses DeepCopy to prevent in-memory copy of
the resource to be mutated outside of the resource model.

Previous implementation relied on YAML serialization to copy the
machine configuration which was slow, potentially might lead to panics
and it generates pressure on garbage collection.

This implementation uses k8s code generator to generate DeepCopy methods
with some manual helpers when code generator can't handle it.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-07-08 07:21:24 -07:00
Andrey Smirnov
6b661114d0 fix: make COSI runtime history depth smaller
This reduces Talos memory usage.

See https://github.com/cosi-project/runtime/pull/51

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-07-07 10:32:54 -07:00
Alexey Palazhchenko
71c6f7004e chore: bump go.mod dependencies
Closes #3879, #3880, #3881, #3882, #3883, #3884, #3885, #3886.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-07-05 06:59:14 -07:00
Alexey Palazhchenko
f228af4061 chore: bump go.mod dependencies
Closes #3848, #3849, #3850, #3851.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-28 02:25:43 -07:00
Andrey Smirnov
829e54f1a4 fix: limit apid access to COSI runtime resources
This makes sure that apid can't access any resources than the one it
actually needs. This improves the security in case of a container
breach.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-24 14:10:27 -07:00
Andrey Smirnov
b5244bf182 chore: bump go.mod dependencies, fix netaddr API changes
Bump dependencies, clean up go.mod files, update for netaddr changes
(all around `netaddr.IPPrefix` being a private struct now).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-24 08:37:37 -07:00
Andrey Smirnov
3a34f1a51d chore: bump Talos Go modules to release versions
No actual changes, just tag updates.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-24 07:45:41 -07:00
Andrey Smirnov
d3f4e6006f fix: replace tabs with spaces in console output
See https://github.com/talos-systems/go-kmsg/pull/2

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-23 16:12:47 -07:00
Artem Chernyshev
1990ad2525 feat: add created and updated timestamps to the resource metadata
This will allow to keep track of when the resource was created and
updated.
Update is tied to the version bump.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-06-23 13:56:49 -07:00
Serge Logvinov
b52b206665 feat: split etcd certificates to peer/client
Changes:
* Etcd peer port key usage: ServerAuth,ClientAuth
* Etcd client port key usage: ServerAuth,ClientAuth
* Talos etcd client key usage: ClientAuth
* KubeAPI etcd client key usage: ClientAuth
* List of etcd allowed ciphers

Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-23 13:26:48 -07:00
Andrey Smirnov
d8c2bca1b5 feat: reimplement apid certificate generation on top of COSI
This PR can be split into two parts:

* controllers
* apid binding into COSI world

Controllers
-----------

* `k8s.EndpointController` provides control plane endpoints on worker
nodes (it isn't required for now on control plane nodes)
* `secrets.RootController` now provides OS top-level secrets (CA cert)
and secret configuration
* `secrets.APIController` generates API secrets (certificates) in a bit
different way for workers and control plane nodes: controlplane nodes
generate directly, while workers reach out to `trustd` on control plane
nodes via `k8s.Endpoint` resource

apid Binding
------------

Resource `secrets.API` provides binding to protobuf by converting
itself back and forth to protobuf spec.

apid no longer receives machine configuration, instead it receives
gRPC-backed socket to access Resource API. apid watches `secrets.API`
resource, fetches certs and CA from it and uses that in its TLS
configuration.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-23 13:07:00 -07:00
Alexey Palazhchenko
b6e02311a3 feat: use COSI RD's sensitivity for RBAC
Refs #3421.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-21 14:05:06 -07:00
Alexey Palazhchenko
42c16f67f4 chore: bump dependencies
Update k8s to 1.21.2.

See #3787 #3788 #3789 #3790 #3791 #3792 #3793 #3794 #3795 #3796 #3798.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-21 07:05:41 -07:00
Andrey Smirnov
60f78419e4 chore: bump etcd client libraries to final 3.5.0 release
There should be no actual changes to the client library, just tag
updates.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-21 05:59:39 -07:00
Alexey Palazhchenko
f63ab9dd9b feat: implement talosctl config new command
Refs #3421.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-17 09:06:43 -07:00
Andrey Smirnov
1a1378be16 fix: update retry package with a fix for errors.Is
See https://github.com/talos-systems/go-retry/pull/9

This fixes dropping into maintenance mode on Equinix Metal.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-16 11:44:05 -07:00
Andrey Smirnov
654dcad475 chore: bump dependencies via dependabot
See #3751 #3752 #3753 #3754 #3755 #3756 #3756 #3757

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-16 07:28:17 -07:00
Andrey Smirnov
f2ae9cd0c1 feat: replace networkd with new network implementation
This removes networkd, updates network ready condition, enables all the
controllers which were previously disabled.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-15 17:37:28 -07:00
Andrey Smirnov
6a35c8f110 feat: implement virtual IP (shared IP) network operator
Code mostly copy-pasted from the networkd implementation, but adapted to
the resource model:

* watch for etcd state to start/stop activities
* watch for kube-api-server static pod state to give up VIP when
api-server goes down

Gratuitous ARP code re-implemented to drop dependency on kube-vip (once
the old networkd code is removed).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-11 05:49:22 -07:00
Alexey Palazhchenko
5ad314fe7e feat: implement basic RBAC interceptors
It is not enforced yet.

Refs #3421.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-06-07 09:28:22 -07:00
Andrey Smirnov
8b0763f6a2 chore: bump dependencies via dependabot
PRs #3720 #3721 #3722 #3723 #3724 #3725 #3726
    #3727 #3728 #3729

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-07 06:28:51 -07:00
Artem Chernyshev
14e696d068 feat: update COSI runtime and add support for tail in the Talos gRPC
Updated protobufs to expose tail length option.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-06-03 11:46:39 -07:00
Andrey Smirnov
33db8857aa fix: use COSI runtime DestroyReady input type
See https://github.com/cosi-project/runtime/pull/35

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-01 12:30:52 -07:00
Andrey Smirnov
5811f4dda1 feat: implement link (interface) controllers
The structure of the controllers is really similar to addresses and
routes:

* `LinkSpec` resource describes desired link state
* `LinkConfig` controller generates `LinkSpecs` based on machine
configuration and kernel cmdline
* `LinkMerge` controller merges multiple configuration sources into a
single `LinkSpec` paying attention to the config layer priority
* `LinkSpec` controller applies the specs to the kernel state

Controller `LinkStatus` (which was implemented before) watches the
kernel state and publishes current link status.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-06-01 09:36:25 -07:00
Alexey Palazhchenko
c036b94948 chore: bump dependencies
Closes #3699, #3668, #3698, #3697, #3696, #3695, #3694, #3693, #3692.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-05-31 06:12:06 -07:00
Artem Chernyshev
76dbfb3699 feat: add ability to mark MBR partition bootable
Fixes: https://github.com/talos-systems/talos/issues/3532

Machine install section now has `markMBRBootable` option.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-05-27 12:44:50 -07:00
Alexey Palazhchenko
e0f5b1e20a chore: split mgmt/gen.go into several files
No functional changes in this PR, to make future PRs easier.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-05-26 12:54:48 -07:00
Artem Chernyshev
723597657a feat: enable GORACE=halt_on_panic=1 in machined binary
Fixes: https://github.com/talos-systems/talos/issues/3533

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-05-25 11:24:07 -07:00
Artem Chernyshev
1db301edf6 feat: switch controller-runtime to zap.Logger
Enable logging using default development config with some fine tuning.
Additionally, now `info` and below logs go to kmsg.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-05-25 02:15:31 -07:00
Andrey Smirnov
59cfd312c1 chore: bump dependencies via dependabot
There were some upstream code changes in etcd, some code got moved
around.

PRs #3651 #3652 #3653 #3654 #3655 #3655 #3656 #3657 #3658
    #3659 #3660 #3661 #3662 #3663

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-24 12:15:15 -07:00
Andrey Smirnov
04ddda962f feat: update containerd to 1.5.2, runc to 1.0.0-rc95
This also updates libseccomp and add support for `netxen` networkd card.

This addresses[CVE-2021-30465](https://github.com/opencontainers/runc/security/advisories/GHSA-c3xm-pvg7-gh7r).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-19 15:24:28 -07:00
Andrey Smirnov
6bc6658b51 feat: update containerd to 1.5.1
See https://github.com/containerd/containerd/releases/tag/v1.5.1

Also brings Talos kernel with Geneve encapsulation for Openvswitch (see
talos-systems/pkgs#278).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-17 10:33:49 -07:00
Andrey Smirnov
c6567fae9c chore: dependabot updates
PRs #3622 #3623 #3624 #3625 #3627 #3628

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-17 07:46:24 -07:00
Alexey Palazhchenko
c81cfb2167 chore: allow building with debug handlers
Refs #3534.

Signed-off-by: Alexey Palazhchenko <alexey.palazhchenko@gmail.com>
2021-05-13 02:20:15 -07:00
Spencer Smith
c9651673b9 feat: update go-smbios library
This pulls in a newer version of smbios so that we can detect lower
smbios version and handle endianness if necessary.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2021-05-12 10:49:59 -07:00
Andrey Smirnov
95c656fb72 feat: update containerd to 1.5.0, runc to 1.0.0-rc94
Fixes #3538

See also talos-systems/pkgs#276

As new containerd is now Go module-based, it pulls many more
dependencies if simply imported in `go.mod`, so I had to replace the
reference to the constant in `pkg/machinery/` to `containerd` volume
with simple value to avoid pulling Kubernetes dependencies into
`pkg/machinery`.

Also updates the kernel to include PR talos-systems/pkgs#275 for AES-NI
support.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-11 14:43:27 -07:00
Andrey Smirnov
db9c35b570 feat: implement AddressStatusController
This controller queries addresses of all the interfaces in the system
and presents them as resources. The idea is that can be a source for
many decisions - e.g. whether network is ready (physical interface has
scope global address assigned).

This is also good for debugging purposes.

Examples:

```
$ talosctl -n 172.20.0.2 get addresses
NODE         NAMESPACE   TYPE            ID                                          VERSION
172.20.0.2   network     AddressStatus   cni0/10.244.0.1/24                          1
172.20.0.2   network     AddressStatus   cni0/fe80::9c87:cdff:fe8e:5fdc/64           2
172.20.0.2   network     AddressStatus   eth0/172.20.0.2/24                          1
172.20.0.2   network     AddressStatus   eth0/fe80::ac1b:9cff:fe19:6b47/64           2
172.20.0.2   network     AddressStatus   flannel.1/10.244.0.0/32                     1
172.20.0.2   network     AddressStatus   flannel.1/fe80::440b:67ff:fe99:c18f/64      2
172.20.0.2   network     AddressStatus   lo/127.0.0.1/8                              1
172.20.0.2   network     AddressStatus   lo/::1/128                                  1
172.20.0.2   network     AddressStatus   veth178e9b31/fe80::6040:1dff:fe5b:ae1a/64   2
172.20.0.2   network     AddressStatus   vethb0b96a94/fe80::2473:86ff:fece:1954/64   2
```

```
$ talosctl -n 172.20.0.2 get addresses -o yaml eth0/172.20.0.2/24
node: 172.20.0.2
metadata:
    namespace: network
    type: AddressStatuses.net.talos.dev
    id: eth0/172.20.0.2/24
    version: 1
    owner: network.AddressStatusController
    phase: running
spec:
    address: 172.20.0.2/24
    local: 172.20.0.2
    broadcast: 172.20.0.255
    linkIndex: 4
    linkName: eth0
    family: inet4
    scope: global
    flags: permanent
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-11 13:32:17 -07:00
Andrey Smirnov
1cf011a809 chore: bump dependencies via dependabot
See PRs #3596 #3593 #3592

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2021-05-11 11:20:23 -07:00
Artem Chernyshev
e3f407a1df fix: properly pass disk type selector from config to matcher
Also updated go-blockdevice library that makes disk type string case
insensitive.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-05-11 09:35:40 -07:00
Artem Chernyshev
0e8de04698 fix: update go-blockdevice to fix disk type detection
Otherwise it never detected mmc and nvme devices as such.

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2021-05-07 07:33:11 -07:00