Run `xfs_repair` on XFS filesystems that need repairing, as indicated by
the `unix.EUCLEAN` error when mounting.
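The mount-then-repair flow can be sketched as below. This is only an illustration, not the Talos implementation: `mountWithRepair`, `needsRepair`, and `errNeedsCleaning` are hypothetical names, the `mount` and `repair` callbacks stand in for the real mount call and the `xfs_repair` invocation, and the standard library `syscall.EUCLEAN` (Linux only) is used in place of `unix.EUCLEAN`.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// errNeedsCleaning is an example of the kind of error a mount call might
// return for a corrupted filesystem (hypothetical, for illustration only).
var errNeedsCleaning = fmt.Errorf("mount: %w", syscall.EUCLEAN)

// needsRepair reports whether a mount error carries EUCLEAN
// ("Structure needs cleaning").
func needsRepair(err error) bool {
	return errors.Is(err, syscall.EUCLEAN)
}

// mountWithRepair tries to mount, and if the failure indicates corruption,
// runs the repair callback once and retries the mount.
func mountWithRepair(mount, repair func() error) error {
	err := mount()
	if err == nil || !needsRepair(err) {
		return err
	}

	if err := repair(); err != nil {
		return fmt.Errorf("repair failed: %w", err)
	}

	return mount()
}
```

Any other mount error is returned unchanged, so the repair step only runs when the kernel explicitly reports that the structure needs cleaning.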
Fixes #5319
Fixes #5437
Signed-off-by: Noel Georgi <git@frezbo.dev>
This increases `initramfs` size by 356060 bytes (raw text database is
1.3 MiB).
In QEMU:
```
$ talosctl -n 172.20.0.2 get links eth0 -o yaml
spec:
    ...
    productID: "0x1000"
    vendorID: "0x1af4"
    product: Virtio network device
    vendor: Red Hat, Inc.
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Talos historically relied on the Kubernetes `Endpoints` resource (which
specifies `kube-apiserver` endpoints) to find other controlplane members
of the cluster and connect to their `etcd` nodes (when the node-local
etcd instance is not up, for example). This method works well, but it
relies on the Kubernetes endpoint being up. If the Kubernetes API is
down for whatever reason, or if the loadbalancer malfunctions, endpoints
are not available and join/leave operations don't work.
This PR changes the endpoints lookup to use the `Endpoints` COSI
resource, which is filled in using two methods:
* from the discovery data (if discovery is enabled, which is the default)
* from the Kubernetes `Endpoints` resource
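As a rough illustration of combining the two sources listed above, here is a minimal sketch that merges address lists and removes duplicates. The function name `mergeEndpoints` and the use of plain strings are assumptions made for the example; the actual resource types in Talos are not shown in this description.

```go
package main

import "sort"

// mergeEndpoints combines control plane addresses coming from several
// sources (for example discovery data and the Kubernetes Endpoints
// resource), dropping duplicates and returning a sorted list.
func mergeEndpoints(sources ...[]string) []string {
	seen := make(map[string]struct{})
	merged := []string{}

	for _, source := range sources {
		for _, addr := range source {
			if _, ok := seen[addr]; ok {
				continue
			}

			seen[addr] = struct{}{}
			merged = append(merged, addr)
		}
	}

	sort.Strings(merged)

	return merged
}
```

If one source is empty (for instance, discovery is disabled), the result is simply the content of the other source.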
If the discovery is disabled (or not available), this change does almost
nothing: Kubernetes is still used to discover control plane endpoints,
but as the data persists in memory, the cached copy will be used to
connect even if the Kubernetes control plane endpoint goes down.
If the discovery is enabled, Talos can join the etcd cluster immediately
on boot without waiting for Kubernetes to be up on the bootstrap node.
This means that the initial bootstrap of a Talos cluster runs in parallel
on all control plane nodes, while previously nodes waited for the first
node to get far enough through bootstrap to fill in the endpoints data.
As the `etcd` communication is protected with mutual TLS anyway,
there's no risk even if the discovery data is stale or poisoned, as etcd
operations would fail on TLS mismatch.
Most of the changes in this PR actually enable populating Talos
`Endpoints` resource based on the `Kubernetes` `endpoints` resource
using the watch API.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Dry run prints out the config diff and the selected application mode
without changing the configuration.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
When interacting with the kubeconfig it can be
expected that a user may have the KUBECONFIG
environment variable set, so we need to use
it when appropriate.
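A minimal sketch of that lookup is shown below. It is a simplification: `kubeconfigPath` is a hypothetical helper, it only picks the first entry when `KUBECONFIG` holds a list of paths, and it falls back to the conventional `.kube/config` under the given home directory. The environment value and home directory are passed in as arguments so the function stays easy to test.

```go
package main

import "path/filepath"

// kubeconfigPath returns the kubeconfig file to use: the first non-empty
// entry of the KUBECONFIG value if it is set, otherwise the default
// location under the user's home directory.
func kubeconfigPath(envValue, homeDir string) string {
	for _, candidate := range filepath.SplitList(envValue) {
		if candidate != "" {
			return candidate
		}
	}

	return filepath.Join(homeDir, ".kube", "config")
}
```

A caller would typically pass `os.Getenv("KUBECONFIG")` and the result of `os.UserHomeDir()`.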
Closes #5091
Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
They should cause no harm, as every extension is an image on its own, so
hardlinks are only possible between files within one image.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Add a mock D-Bus daemon and a mock logind implementation over D-Bus.
Kubelet gets a handle to the D-Bus socket, connects over it to our
logind mock and negotiates shutdown activities.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
With system extensions, size of the `initramfs` might increase
significantly. With 1000 MiB `/boot`, as we store `A` and `B` boot
directories, we have 500 MiB for each Talos boot (size of the kernel and
initramfs).
Fixes #5096
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
They were discovered as we tagged 1.0.0 version:
* wrong deprecated version
* incompatibility in extension compatibility checks
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4694
User services run alongside Talos system services.
Every user service container root filesystem should be already present
in the Talos root filesystem.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4816
This changes the way system extensions are packaged into the squashfs
images: `/lib/firmware` is now moved out of the future squashfs images
and becomes part of `initramfs` to make firmware available in the early
boot.
Talos will also bind-mount `/lib/firmware` into the rootfs, so the
firmware is available there as well.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
See #4816
Depending on the hardware and firmware type, firmware might be either
needed during initial boot (`initramfs`) or in the Talos running phase
(`rootfs`). As we don't want to have two copies of same firmware, share
the firmware by bind-mounting it from the `initramfs` down to `rootfs`
on switchroot.
This also cleans up `Dockerfile` to keep firmware only in `initramfs`.
Eventually we might get rid of some of the firmware and move it to the
system extensions.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4815
This implements the following steps:
* machine configuration updates
* pulling and unpacking system extension images
* validating, listing system extensions
* re-packing system extensions
* preserving installed extensions in `/etc/extensions.yaml`
Once an extension is enabled, raw information can be queried with:
```
$ talosctl -n 172.20.0.2 cat /etc/extensions.yaml
layers:
  - image: 000.ghcr.io-smira-gvisor-c927b54-dirty.sqsh
    metadata:
      name: gvisor
      version: 20220117.0-v1.0.0
      author: Andrew Rynhard
      description: |
        This system extension provides gVisor using containerd's runtime handler.
      compatibility:
        talos:
          version: '> v0.15.0-alpha.1'
```
This was tested with the `gvisor` system extension.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Containerd doesn't support merging plugin configuration from multiple
sources (see https://github.com/containerd/containerd/issues/5837), and
Talos has several pieces which configure CRI plugin:
* base config
* registry mirror config
* system extensions
* ...
So we implement our own simple way of merging config parts (by simply
concatenating text files) to build a final `cri.toml`.
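The concatenation approach could look roughly like the sketch below. It is an illustration under assumptions: the part names, the lexical ordering, and the `mergeParts` helper are hypothetical, and the parts are held in memory rather than read from files. Each part is prefixed with a TOML comment naming its origin.

```go
package main

import (
	"sort"
	"strings"
)

// mergeParts concatenates configuration parts in lexical order of their
// names, prefixing each part with a comment that records where it came
// from. The result is intended to be written out as a single file.
func mergeParts(parts map[string]string) string {
	names := make([]string, 0, len(parts))
	for name := range parts {
		names = append(names, name)
	}

	sort.Strings(names)

	var b strings.Builder

	for _, name := range names {
		b.WriteString("## " + name + "\n")
		b.WriteString(strings.TrimRight(parts[name], "\n"))
		b.WriteString("\n")
	}

	return b.String()
}
```

Plain concatenation only yields valid TOML when the parts do not redefine the same table or key, so the parts have to be written with that constraint in mind.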
At the same time, containerd migrated to a new format for specifying
registry mirror configuration, while the old way (via CRI config) is
going to be removed in 1.7.0. The new way also allows applying most of
the registry configuration (except for auth) on the fly.
Also, containerd was updated to 1.6.0-rc.0 and runc to 1.1.0.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
The list of layers should come from the `/extensions.yaml` configuration
file.
Closes: https://github.com/talos-systems/talos/issues/4814
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
As `talosctl time` relies on the default time server set in the config,
and our nodes start with `pool.ntp.org`, requests to the timeserver
sometimes fail, which fails the tests.
Retry such errors in the tests to avoid spurious failures.
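A retry helper of this kind might look like the following sketch. The names `retry`, `isTransient`, and `errTimeserver` are hypothetical and only illustrate the idea of retrying a specific class of errors while failing fast on everything else.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTimeserver is a hypothetical sentinel for a failed timeserver request.
var errTimeserver = errors.New("timeserver request failed")

// isTransient reports whether the error is worth retrying.
func isTransient(err error) bool {
	return errors.Is(err, errTimeserver)
}

// retry calls fn up to attempts times, sleeping for delay between tries.
// Errors that are not retryable are returned immediately.
func retry(attempts int, delay time.Duration, retryable func(error) bool, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}

	var err error

	for i := 1; i <= attempts; i++ {
		err = fn()
		if err == nil {
			return nil
		}

		if !retryable(err) {
			return err
		}

		if i < attempts {
			time.Sleep(delay)
		}
	}

	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}
```

Restricting retries to a known transient error keeps genuine test failures visible instead of hiding them behind repeated attempts.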
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4688
Instead of using a generic library, use handcrafted code that reuses
buffers and parses only the data we need for the processes API.
Benchmark (run with a significant number of processes on the host):
```
name time/op
PrometheusProcfs-16 3.42ms ± 8%
Processes-16 2.36ms ± 5%
name alloc/op
PrometheusProcfs-16 366kB ± 0%
Processes-16 255kB ± 0%
name allocs/op
PrometheusProcfs-16 6.76k ± 0%
Processes-16 3.83k ± 0%
```
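To illustrate what partial parsing can mean here, the sketch below extracts only a few leading fields from a `/proc/<pid>/stat` line and ignores the rest. It is not the code from this change: `procStat` and `parseStat` are hypothetical names, and the caller is assumed to read each file into a buffer it reuses between processes.

```go
package main

import (
	"bytes"
	"errors"
	"strconv"
)

var errMalformed = errors.New("malformed stat line")

// procStat holds the subset of /proc/<pid>/stat fields this sketch extracts.
type procStat struct {
	PID   int
	Comm  string
	State byte
	PPID  int
}

// parseStat parses only the first four fields of a stat line. The command
// name is enclosed in parentheses and may itself contain spaces and
// parentheses, so the last ')' is used as the delimiter.
func parseStat(line []byte) (procStat, error) {
	var st procStat

	open := bytes.IndexByte(line, '(')
	closing := bytes.LastIndexByte(line, ')')

	if open < 1 || closing < open || closing+2 >= len(line) {
		return st, errMalformed
	}

	pid, err := strconv.Atoi(string(bytes.TrimSpace(line[:open])))
	if err != nil {
		return st, errMalformed
	}

	rest := line[closing+2:] // skip ") "
	state := rest[0]
	rest = bytes.TrimLeft(rest[1:], " ")

	end := bytes.IndexByte(rest, ' ')
	if end < 0 {
		end = len(rest)
	}

	ppid, err := strconv.Atoi(string(bytes.TrimSpace(rest[:end])))
	if err != nil {
		return st, errMalformed
	}

	return procStat{PID: pid, Comm: string(line[open+1 : closing]), State: state, PPID: ppid}, nil
}
```

Stopping after the needed fields avoids splitting and converting the remaining columns, which is where a generic parser spends extra time and allocations.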
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Tmpfs uses shared memory, which is owned by the system cgroup.
Putting a big file on it can break the system.
* set mount options for the /tmp and /run folders, as many OSes do.
* limit /tmp size to 64 MB.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
See https://github.com/containerd/cri/pull/1543
Fixes #4274
Fix is applied on two levels:
* for Talos-initiated pulls, update API call
* for Kubernetes-initiated pulls, update CRI plugin config
Comparison of `/var` usage before/after, as reported by
`talosctl mounts` (in GiB):
| | before | after |
|--------------|:------:|------:|
| controlplane | 1.98 | 1.74 |
| worker | 1.17 | 1.01 |
It's hard to measure effect on pulls to system containerd, like
`installer` image, as it's ephemeral, but it should also reduce space
usage in `tmpfs`.
Also fixes output of `talosctl mounts`.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
As SideroLink addresses are ephemeral and point-to-point, filter them
out for node addresses, Kubelet, etcd, etc.
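The filtering step can be sketched as removing every address that falls inside a given prefix. The helper name `filterAddresses` is hypothetical, and the prefix is passed in by the caller; the actual SideroLink prefix is not stated in this description.

```go
package main

import (
	"fmt"
	"net/netip"
)

// filterAddresses returns the addresses that are NOT contained in the
// excluded prefix, preserving the input order.
func filterAddresses(addrs []string, excludedCIDR string) ([]string, error) {
	excluded, err := netip.ParsePrefix(excludedCIDR)
	if err != nil {
		return nil, fmt.Errorf("parsing prefix %q: %w", excludedCIDR, err)
	}

	kept := make([]string, 0, len(addrs))

	for _, s := range addrs {
		addr, err := netip.ParseAddr(s)
		if err != nil {
			return nil, fmt.Errorf("parsing address %q: %w", s, err)
		}

		if excluded.Contains(addr) {
			continue
		}

		kept = append(kept, s)
	}

	return kept, nil
}
```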
Fixes #4448
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Next blockdevice library release reads MBR along with GPT and raises
an error if GPT is not set.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Fixes #4425
* add more logging for responses and sync process
* adjust time sync constants
* change the way poll interval is chosen (increasing on good sync,
decreasing on variation)
* filter out spikes
Based on flow in https://github.com/systemd/systemd/blob/main/src/timesync/timesyncd-manager.c
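The poll interval adjustment and spike filtering can be pictured with the sketch below. The doubling and halving steps, the spike factor, and the function names are all assumptions chosen for illustration; the constants and exact rules used by Talos are not given in this description.

```go
package main

import "time"

// nextPollInterval lengthens the interval after a good sync and shortens
// it when the offset varies, clamping the result to the given bounds.
func nextPollInterval(current, minInterval, maxInterval time.Duration, goodSync bool) time.Duration {
	next := current / 2
	if goodSync {
		next = current * 2
	}

	if next < minInterval {
		next = minInterval
	}

	if next > maxInterval {
		next = maxInterval
	}

	return next
}

// isSpike reports whether an offset is far outside the recently observed
// jitter and should therefore be ignored rather than applied.
func isSpike(offset, jitter time.Duration, factor int) bool {
	if offset < 0 {
		offset = -offset
	}

	return offset > jitter*time.Duration(factor)
}
```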
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
If the mount is skipped, we shouldn't record it and create a matching
resource.
This fixes a problem discovered by cluster discovery tests: a node
establishes more than one identity on initial boot, and the first one
is lost but still exists in the discovery service.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Only `jump` syncs and sync errors are logged to the console.
Regular `slew` syncs are suppressed (only visible in
`talosctl logs controller-runtime`).
The very first sync is always reported to console.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4094
Deprecate old networkd APIs, `talosctl interfaces` and `talosctl routes`
now suggest different commands to be used to achieve same task.
TUI installer was updated to stop using Interfaces API.
Those APIs will be completely removed in 0.14.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4232
The result:
```
talosctl -n 172.20.0.2 get members
NODE NAMESPACE TYPE ID VERSION HOSTNAME MACHINE TYPE OS ADDRESSES
172.20.0.2 cluster Member talos-default-master-1 2 talos-default-master-1 controlplane Talos (v0.13.0-alpha.0-13-gfdd80a12-dirty) ["172.20.0.2","fdd1:f54:2697:3902:44f8:92ff:fe2e:1aea"]
172.20.0.2 cluster Member talos-default-worker-1 1 talos-default-worker-1 worker Talos (v0.13.0-alpha.0-13-gfdd80a12-dirty) ["172.20.0.3","fdd1:f54:2697:3902:d4ba:55ff:fe8a:f551"]
172.20.0.2 cluster Member talos-default-worker-2 1 talos-default-worker-2 worker Talos (v0.13.0-alpha.0-13-gfdd80a12-dirty) ["172.20.0.4","fdd1:f54:2697:3902:e00d:f4ff:fecf:51c8"]
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This PR makes sure that some capabilities (SYS_BOOT and SYS_MODULE) can
never be gained by any process running on Talos except for `machined`
itself.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This implements pushing to and pulling from Kubernetes cluster discovery
registry which is simply using extra Talos annotations on the Node
resources.
Note: cluster discovery is still disabled by default.
This means that each Talos node pushes data from its own local
`Affiliate` structure to the `Node` resource, and also watches the other
`Node`s to scrape the data needed to build an `Affiliate` for every
other cluster member.
Further down the pipeline, `Affiliate` is converted to a cluster
`Member` which is an easy way to see the cluster membership.
In its current form, `talosctl get members` is mostly equivalent to
`kubectl get nodes`, but as we add more registries, it will become more
powerful.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
We have multiple calls for `mountState` even when `STATE` is already
mounted, so we should handle it properly.
Example error:
```
[ 152.736427] [talos] apply configuration failed: error running phase 2 in applyConfiguration sequence: task 1/1: failed, error creating mount status resource: resource MountStatuses.runtime.talos.dev(runtime/STATE@1) already exists
```
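One way to make repeated calls safe is to treat "already exists" as success. The sketch below shows the idea with an in-memory store; `store`, `create`, `ensure`, and `errAlreadyExists` are hypothetical stand-ins and not the resource API used by Talos.

```go
package main

import "errors"

var errAlreadyExists = errors.New("resource already exists")

// store is a tiny in-memory stand-in for a resource state.
type store struct {
	items map[string]string
}

func newStore() *store {
	return &store{items: make(map[string]string)}
}

// create fails if a resource with the same ID is already present.
func (s *store) create(id, spec string) error {
	if _, ok := s.items[id]; ok {
		return errAlreadyExists
	}

	s.items[id] = spec

	return nil
}

// ensure is the idempotent variant: creating a resource that already
// exists is not an error, so the call can be repeated safely.
func (s *store) ensure(id, spec string) error {
	err := s.create(id, spec)
	if errors.Is(err, errAlreadyExists) {
		return nil
	}

	return err
}
```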
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #4133
This is a pretty limited resource, as it covers only system mounts, but
this is all we need for KubeSpan for now. A more complete solution should
probably involve COSIfying the whole mount subsystem.
Example:
```
$ talosctl -n 172.20.0.2 get mounts
NODE NAMESPACE TYPE ID VERSION SOURCE TARGET FILESYSTEM TYPE
172.20.0.2 runtime MountStatus EPHEMERAL 1 /dev/vda6 /var xfs
172.20.0.2 runtime MountStatus STATE 1 /dev/vda5 /system/state xfs
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fix mount option nsdelegate.
It makes delegation safe (more restrictions in the cgroup namespace).
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
When running with cgroupsv2 and the deeply nested nature of our CI, we
need to take extra steps to make sure tests are working fine.
Some tests were disabled under cgroupsv2 as I can't make them work.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
For the `trustd`, this change is simple as it doesn't access any files
on the host filesystem.
For the `apid`, there are more things involved:
* `apid.sock` used for internal API calls should be creatable by `apid`
* `runtime.sock` used for apid to COSI communication should be
accessible for `apid`
* `machined.sock` used for proxying calls to machined should be
made available to `apid` as well.
Plus fixes default permissions for `tmpfs` mountpoints.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Fixed: https://github.com/talos-systems/talos/issues/3686
Replaced the sequencer tasks for KSPP and Kubernetes required sysctl
props with ones set by controllers.
The KernelParam flow consists of 3 controllers and 2 resources:
- `KernelParamConfigController` - handles user sysctls coming from v1alpha1
config.
- `KernelParamDefaultsController` - handles our built-in KSPP and K8s
required sysctls.
- `KernelParamSpecController` - consumes `KernelParamSpec`s created by the
previous two controllers, applies them and updates the corresponding
`KernelParamStatus`.
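The apply step of such a flow can be sketched as follows. The types and function names here (`kernelParam`, `sysctlPath`, `applyParams`) are hypothetical; the only fixed fact relied on is that a sysctl key maps to a file under `/proc/sys` with dots replaced by slashes. The write function is injected so the sketch does not touch the real filesystem.

```go
package main

import (
	"path"
	"strings"
)

// kernelParam is a simplified stand-in for a kernel parameter spec.
type kernelParam struct {
	Key   string
	Value string
}

// sysctlPath converts a sysctl key such as "net.ipv4.ip_forward" into the
// corresponding path under /proc/sys.
func sysctlPath(key string) string {
	return path.Join("/proc/sys", strings.ReplaceAll(key, ".", "/"))
}

// applyParams writes every parameter using the supplied write function and
// returns a per-key status (nil means the parameter was applied).
func applyParams(params []kernelParam, write func(path, value string) error) map[string]error {
	status := make(map[string]error, len(params))

	for _, p := range params {
		status[p.Key] = write(sysctlPath(p.Key), p.Value)
	}

	return status
}
```

In a real controller the write function would write the value to the file, and the returned status would feed the corresponding status resource.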
Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>