4250 Commits

Author SHA1 Message Date
Andrey Smirnov
ffa48ac803
chore: workaround AWS AMI failures, disable Azure uploader
Fixes #7513

AWS image uploads have recently been failing consistently in some
regions, which blocks the release process. Allow skipping AMIs that
fail to upload.

Disable Azure until #7512 is resolved.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-26 17:14:31 +04:00
Spencer Smith
4cd7623cf7
chore: add alx drivers
This PR adds the alx drivers from pkgs to talos

Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
2023-07-25 11:00:12 -04:00
Andrey Smirnov
663264c864
release(v1.5.0-alpha.3): prepare release
This is the official v1.5.0-alpha.3 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-25 17:26:08 +04:00
Andrey Smirnov
d2f64af863
chore: disable cloud-images, pull in new kernel and gre module
Disable cloud-images step due to the issue with AWS & Azure atm.

Pull in https://github.com/siderolabs/pkgs/pull/761

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-25 15:15:54 +04:00
Scott Cariss
8edce49063
docs: improve proxmox install guide
Improve proxmox install guide.

Fixes: #7402

Signed-off-by: Scott Cariss <scott@cariss.dev>
Signed-off-by: Noel Georgi <git@frezbo.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-24 17:59:39 +04:00
Sacha Trémoureux
c783458be0
docs: typo dhcp -> dhcp
Small typo in reference/kernel/

Signed-off-by: Sacha Trémoureux <sacha@tremoureux.fr>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-24 16:08:14 +04:00
Thomas Lemarchand
003cbd1611
docs: warn about secretboxEncryptionSecret in kubeadm migration guide
Migrating from kubeadm fix.

Signed-off-by: Thomas Lemarchand <tlemarchand@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-24 15:24:39 +04:00
Andrey Smirnov
786e86f5b8
refactor: rewrite the way Talos acquires the machine configuration
Fixes #7453

The goal is to make it possible to load some multi-doc configuration
from the platform source (or persisted in STATE) before the machine
acquires the full configuration.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-24 14:26:42 +04:00
Andrey Smirnov
5e13cafe5b
feat: enforce kernel lockdown for UKI
UKI is meant to be for UEFI Secure Boot, so it's expected to enforce
kernel lockdown. We might reconsider in the future to use a kernel patch
instead: b1a0314b08

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-22 13:52:46 +04:00
Andrey Smirnov
4d96d642fd
feat: update default Kubernetes version to 1.28.0-beta.0
See https://github.com/kubernetes/kubernetes/releases/tag/v1.28.0-beta.0

Go modules are not tagged yet, so skipped updating them.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-21 22:04:19 +04:00
Noel Georgi
170a73e161
chore: support creating qemu guest socket
Support creating a qemu guest agent socket so we can test
`qemu-guest-agent` extension in CI.

Ref: https://github.com/siderolabs/extensions/pull/173#issuecomment-1611911106

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-21 22:46:13 +05:30
Christian Rolland
59ac38a6bf
docs: add docs for installing azure ccm and csi
Add docs for installing Azure ccm and csi on Talos.

Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
2023-07-21 12:30:26 -04:00
Andrey Smirnov
6288cd970e
release(v1.5.0-alpha.2): prepare release
This is the official v1.5.0-alpha.2 release.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-20 20:57:01 +04:00
Andrey Smirnov
60c304126f
chore: bump dependencies
* go.mod dependencies
* Linux 6.1.39
* runc 1.1.8
* dm-raid kernel module

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-20 18:25:41 +04:00
Andrey Smirnov
9ef4e5efca
fix: log explicitly when kubelet has no nodeIP match
Fixes #7487

When `.kubelet.nodeIP` filters yield no match, Talos should not start
the kubelet, as using an empty address list results in an empty
`--node-ip=` kubelet arg, which makes kubelet pick up "the first" address.

Instead, skip updating (creating) the nodeIP and log an explicit
warning.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-20 00:41:47 +04:00
Andrey Smirnov
6b39c6a4d3
fix: enable compression and bump gRPC max msg size
Fixes #7482

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-19 22:46:37 +04:00
Noel Georgi
2f2eca8617
chore: basic support for shutdown/poweroff flags
This adds basic support for shutdown/poweroff flags.
It can distinguish between halt/shutdown/reboot.

In the case of Talos, halt and shutdown are the same operation.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-19 23:35:32 +05:30
Florian Klink
b84277d7dc
docs: fix wrong capability name
It's CAP_SYS_BOOT, not CAP_BOOT.

Signed-off-by: Florian Klink <flokli@flokli.de>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-19 21:23:05 +04:00
Noel Georgi
59d7d9344b
chore: use machined for shutdown, poweroff
Use the `machined` socket for `shutdown` and `poweroff` aliases. This
ensures that worker nodes do not have to wait for apid to start.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-19 21:48:15 +05:30
Dmitriy Matrenichev
2439bfb719
chore: explicitly add timestamps to machined logs
We can safely do it on `io.Writer` level, since `log.Logger.Output` (called by `Print|Printf`) pretty much promises
that every call to `Write` ends with `\n`.

Closes #7439

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-07-19 18:29:17 +03:00
Noel Georgi
14966e718a
fix: skip over tpm2 1.2 devices
For RNG seed and PCR extend, skip the device if it is not TPM 2.0
based. Seal/Unseal operations would still error out, since that is an
explicitly user-enabled feature.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-18 12:58:45 +05:30
Dmitriy Matrenichev
6716e7bc0b
docs: update cilium documentation about KubePrism usage
Closes #7400

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-07-17 19:25:09 +03:00
Noel Georgi
166d75fe88
fix: tpm2 encrypt/decrypt flow
The previous flow was using TPM PCR 11 values to bind the policy, which
means the TPM cannot unseal when the UKI changes. Now it's fixed to use
PCR 7, which is bound to the SecureBoot state (SecureBoot status and
Certificates). This provides a full chain of trust bound to SecureBoot
state and signed PCR signature.

Also the code has been refactored to use PolicyCalculator from the TPM
library.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-14 23:58:59 +05:30
Dmitriy Matrenichev
130518de71
chore: change missing renames of KubePrism
For #7432

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-07-14 17:18:25 +03:00
Dmitriy Matrenichev
5f34f5b41f
chore: rename api load balancer to KubePrism
Closes #7432

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-07-14 15:23:53 +03:00
Andrey Smirnov
c8b7095c01
refactor: use tpm2 library to calculate policy hash
No real change, just using library to do the work (should be more
readable).

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-14 15:07:47 +04:00
Dmitriy Matrenichev
078aac92ee
chore: bump deps
Bump:
- REVERT cilium/cilium-cli to v0.14.7
- github.com/Azure/azure-sdk-for-go/sdk/azcore to v1.7.0
- github.com/Azure/azure-sdk-for-go/sdk/storage/azblob to v1.1.0
- github.com/aws/aws-sdk-go to v1.44.300
- github.com/beevik/ntp to v1.2.0
- github.com/docker/docker to v24.0.4+incompatible
- github.com/gomarkdown/markdown to v0.0.0-20230711084535-11b03c0ae6d6
- github.com/hetznercloud/hcloud-go to v1.48.0
- github.com/iancoleman/orderedmap to v0.3.0
- github.com/jsimonetti/rtnetlink to v1.3.4
- github.com/siderolabs/go-debug to v0.2.3
- golang.org/x/net to v0.12.0
- golang.org/x/tools to v0.11.0
- google.golang.org/genproto/googleapis/rpc to v0.0.0-20230711160842-782d3b101e98
- google.golang.org/grpc to v1.56.2
- google.golang.org/protobuf to v1.31.0

Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
2023-07-14 12:44:58 +03:00
Andrey Smirnov
53873b8444
refactor: move ukify into Talos code
This is an intermediate step to move parts of `ukify` down to the main
Talos source tree, and call it from the `talosctl` binary.

The next step will be to integrate it into the imager and move `.uki`
build out of the Dockerfile.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-13 19:14:32 +04:00
Noel Georgi
d5f6fb9ff2
chore: add vendor info
Add extra vendor info in `os-release`.

Fixes: #7446

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-12 23:58:59 +05:30
Noel Georgi
79365d9bac
feat: tpm2 based disk encryption
Support disk encryption using tpm2 and pre-calculated signed PCR values.

Fixes: #7266

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-12 20:41:28 +05:30
Andrey Smirnov
06369e8195
fix: retry CRI pod removal, fix upgrade flow in the tests
It seems that CRI has a bit of eventual consistency, and it might fail
to remove a stopped pod, reporting that it's still running.

Rewrite the upgrade API call in the upgrade test to actually wait for
the upgrade to be successful, and fail immediately if it's not
successful. This should improve the test stability and it should make
it easier to find issues immediately.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-12 16:20:10 +04:00
Andrey Smirnov
d32dd3a820
chore: update Go to 1.20.6
See https://go.dev/doc/devel/release#go1.20.6

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-12 15:21:26 +04:00
Andrey Smirnov
8017afb107
feat: implement CRI image management and pre-pull on K8s upgrade
Fixes #6391

Implement a set of APIs and commands to manage images in the CRI, and
pre-pull images on Kubernetes upgrades.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-11 19:25:10 +04:00
Andrey Smirnov
1c2f19b367
feat: update Kubernetes to 1.28.0-alpha.4
The Go modules were not tagged for alpha.4, so using alpha.3 tag.

Talos 1.5 will ship with Kubernetes 1.28.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-11 15:40:24 +04:00
Noel Georgi
94e9891c1b
chore: bump sd-boot to v254-rc1
Bump sd-boot.
Fix parsing PE executable offsets.
Set the PE file alignment to be 512 bytes.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-11 15:52:57 +05:30
Artem Chernyshev
936111ce06
fix: properly set up tls for KMS endpoint
The condition was inverted 🤦

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-07-10 21:10:02 +03:00
Artem Chernyshev
cb226eec46
fix: rewrite encryption system information flow
Pass getter to the key handler instead of already fetched node uuid.

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-07-10 19:07:46 +03:00
Noel Georgi
3206db5289
feat: drop tpm simulator for ukify measure
We do not need a tpm simulator for ukify measure. We can pre-calculate
the values. This also means we can build ukify as a static binary.

Signed-off-by: Noel Georgi <git@frezbo.dev>
2023-07-10 19:21:32 +05:30
Andrey Smirnov
bd4f89f633
fix: disable dashboard on Azure, GCP and Scaleway
Fixes #7416

These platforms don't have video console access.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-10 17:05:56 +04:00
Andrey Smirnov
bdb96189fa
refactor: make maintenance service controller-based
Fixes #7430

Introduce a set of resources which look similar to other API
implementations: CA, certs, cert SANs, etc.

Introduce a controller which manages the service based on resource
state.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-10 15:41:52 +04:00
Andrey Smirnov
d23d04de2a
feat: seed the kernel random pool from the TPM
Use the TPM2 feature to provide high-quality random bytes.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-07 23:51:11 +04:00
LukasAuerbeck
c81ce8cfb0
feat: support controlplane resources configuration
Fixes #7379

Add the ability to configure the controlplane static pod resources via
APIServer, ControllerManager and Scheduler configs.

Signed-off-by: LukasAuerbeck <17929465+LukasAuerbeck@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-07 22:44:56 +04:00
Andrey Smirnov
74de562b29
fix: mount hugepages with nosuid + nodev
Fixes #7445

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-07 21:57:19 +04:00
Artem Chernyshev
ce63abb219
feat: add KMS assisted encryption key handler
Talos now supports a new type of encryption key, which relies on sealing/unsealing randomly generated bytes with a KMS server:

```
systemDiskEncryption:
  ephemeral:
    keys:
      - kms:
          endpoint: https://1.2.3.4:443
        slot: 0
```
gRPC API definitions and a simple reference implementation of the KMS server can be found in this
[repository](https://github.com/siderolabs/kms-client/blob/main/cmd/kms-server/main.go).

Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
2023-07-07 19:02:39 +03:00
Andrey Smirnov
dafbe9debd
chore: optimize dockerfile instructions
Use a shell here-doc to combine multiple commands into a single layer,
so that fewer layers are created.

Use `--link` to pull in pkgs.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-07 17:54:53 +04:00
Andrey Smirnov
a4289e8703
chore: fix CLI docs generation stability
The problem, first spotted by Artem, leads to spurious dirty checks.

The sort order was checking wrong (lowered) keys, so the order was
actually random.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-05 19:20:40 +04:00
Andrey Smirnov
2fec8388fc
chore: bump dependencies
Go modules, pkgs, Cilium CLI, CAPI base version.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-05 18:30:54 +04:00
Steve Francis
c1b4262dd6
docs: split simple and more complex getting started guides
Split the documents to provide an easier version as the starting
guide.

Signed-off-by: Steve Francis <steve.francis@talos-systems.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-04 21:27:42 +04:00
Andrey Smirnov
c9a9f95611
refactor: extract secure boot certificate generation
Fixes #7412

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-03 16:55:02 +04:00
Andrey Smirnov
6be5a13d5d
feat: implement machine config documents for event and log streaming
Fixes #7228

Add some changes to make Talos accept partial machine configuration
without main v1alpha1 config.

With this change, it's possible to connect a machine already running
with machine configuration (v1alpha1); the following patch will connect
it to a local SideroLink endpoint:

```yaml
apiVersion: v1alpha1
kind: SideroLinkConfig
apiUrl: grpc://172.20.0.1:4000/?jointoken=foo
---
apiVersion: v1alpha1
kind: KmsgLogConfig
name: apiSink
url: tcp://[fdae:41e4:649b:9303::1]:4001/
---
apiVersion: v1alpha1
kind: EventSinkConfig
endpoint: "[fdae:41e4:649b:9303::1]:8080"
```

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
2023-07-01 00:22:44 +04:00