Discovered in #6971. On 32-bit architectures the Go compiler cannot deduce a proper type for those constants
when they are passed to `fmt.Print(f)` functions. Since we only compare them with uint32 variables, it makes sense to give
them explicit types.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
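The typed-constant fix above can be illustrated with a minimal sketch. The constant names and the value `0xfeedface` are hypothetical and only chosen because the value does not fit into a signed 32-bit integer; the actual constants from the commit are not shown in this log.

```go
package main

import "fmt"

// untypedMagic has no explicit type. When it is passed as an interface{}
// argument (for example to fmt.Println), Go converts it to its default
// type, int. On 32-bit targets int is 32 bits wide and the value overflows.
const untypedMagic = 0xfeedface // hypothetical value, for illustration only

// typedMagic carries its type, so it is a uint32 on every architecture.
const typedMagic uint32 = 0xfeedface

// isMagic compares a uint32 variable against the typed constant.
func isMagic(v uint32) bool {
	return v == typedMagic
}

func main() {
	// fmt.Println(untypedMagic) // rejected on 32-bit targets: constant overflows int
	fmt.Println(typedMagic)
	fmt.Println(isMagic(0xfeedface))
}
```

Comparing `v == untypedMagic` would still compile everywhere, because the constant is converted to `uint32` in that context; only the implicit conversion to `int` is the problem.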
The problem showed up on 'reset' of the Talos node which had multiple
endpoints for other control plane nodes, many of which weren't actually
available.
When 'grpc.WithBlock()' is used, etcd will try to dial the first
endpoint and return an error if the dial fails.
Use noblock mode by default with multiple endpoints, and blocking mode
with a single endpoint.
Pass the context to etcd to properly abort dial operations if the
context gets canceled.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
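The endpoint-count rule described above could be sketched as follows. `dialConfig` and `newDialConfig` are hypothetical stand-ins, not the real etcd client options; in the actual client the blocking behavior is what `grpc.WithBlock()` controls.

```go
package main

import "fmt"

// dialConfig is a hypothetical, simplified stand-in for the client options.
type dialConfig struct {
	Endpoints []string
	Block     bool // true corresponds to dialing with grpc.WithBlock()
}

// newDialConfig enables blocking dial only when there is exactly one
// endpoint. With several endpoints, blocking on the first one would fail
// the whole dial if that single endpoint happens to be unavailable.
func newDialConfig(endpoints []string) dialConfig {
	return dialConfig{
		Endpoints: endpoints,
		Block:     len(endpoints) == 1,
	}
}

func main() {
	single := newDialConfig([]string{"10.0.0.1:2379"})
	multi := newDialConfig([]string{"10.0.0.1:2379", "10.0.0.2:2379"})

	fmt.Println(single.Block, multi.Block)
}
```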
Instead of doing excessive get/list requests, do a watch per node in an infinite retry.
Additionally, refactor the dashboard code to make the various data listener namings more consistent and reorganize the packages.
Closes siderolabs/talos#6960.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
If link has no `Info` field we can't do anything meaningful, so we'll just log and skip.
Also fix race in test.
For #6956
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Fix a data race caused by the metadata field of PlatformNetworkConfig being edited after it was sent to the channel. It caused test failures.
Fix it by setting a copy of the metadata instead.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
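The copy-before-send fix above can be sketched with simplified types. `metadata` and `networkConfig` are hypothetical stand-ins for the real resource types, which are not shown in this log.

```go
package main

import "fmt"

// metadata and networkConfig are hypothetical, simplified stand-ins for the
// real resource types.
type metadata struct {
	Hostname string
}

type networkConfig struct {
	Metadata *metadata
}

// send publishes a config whose Metadata points at a private copy, so later
// edits by the sender cannot race with the receiver reading it.
func send(ch chan<- networkConfig, md *metadata) {
	mdCopy := *md // shallow copy; enough here because metadata holds no pointers

	ch <- networkConfig{Metadata: &mdCopy}
}

func main() {
	ch := make(chan networkConfig, 1)
	md := &metadata{Hostname: "node-1"}

	send(ch, md)

	// This write no longer touches the value that was sent.
	md.Hostname = "changed-after-send"

	fmt.Println((<-ch).Metadata.Hostname)
}
```

A struct that contains maps, slices, or nested pointers would need a deep copy instead of the plain assignment used here.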
A special META key might contain optional platform network config for
the `METAL` platform.
It is completely optional, but if present, it works same way as in the
clouds: it is applied with low priority (can be overridden with machine
config), but provides some initial defaults for the machine.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This allows writing keys to the META partition.
META contents can be viewed with `talosctl get metakeys`.
There is no real use case for it yet, but the next PRs will introduce
two special keys which can be written:
* platform network config for `metal`
* `${code}` variable
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Implement the new summary dashboard with node info and logs.
Replace the previous metrics dashboard with the new dashboard which has multiple screens for node summary, metrics and editing network config.
Port the old metrics dashboard to the tview library and assign it to be a screen in the new dashboard, accessible by F2 key.
Add a new resource, infos.cluster.talos.dev which contains the cluster name and id of a node.
Disable the network config editor screen in the new dashboard until it is fully implemented with its backend.
Closes siderolabs/talos#4790.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Use a global instance, handle loading/saving META in global context.
Deprecate legacy syslinux ADV, provide an easier interface for
consumers.
Expose META as resources.
Fix the bootloader revert process (it was completely broken for quite a
while :sad:).
This is a first step which mostly does preparation work, real changes
will come in the next PRs:
* add APIs to write to META
* consume META keys for platform network config for `metal`
* custom key for URL `${code}`
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #6730
`go generate`-based step downloads the upstream manifest, transforms it
to match our requirements, and it is compiled in as the Flannel
manifest.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Launch CoreDNS even if the node is not initialized.
The network is already ready, but CCM has not finished its job yet.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #6817
The original problem wasn't reproducible with `main`, but there was a
set of bugs in the shutdown sequence which was preventing it from
completing successfully, as in the maintenance mode nothing is running
and initialized yet.
Most of the bugs were `nil` pointer dereferences.
Fixed a small issue with final 'RebootError' printed as a failure in the
ACPI shutdown path.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This PR takes the first 12 characters of the container ID and adds them to each container entry in the `talosctl -k containers` output.
That way we can ensure that we get the logs from the proper container even if there is a newer one.
Closes #6886
Co-authored-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
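Shortening a container ID to its first 12 characters, as described above, could be sketched like this. The function name `shortID` and the sample ID are hypothetical.

```go
package main

import "fmt"

// shortIDLen is the number of leading characters kept from the container ID.
const shortIDLen = 12

// shortID returns the first 12 characters of a container ID. IDs shorter
// than that are returned unchanged. Byte slicing is safe here under the
// assumption that container IDs are plain ASCII hex strings.
func shortID(id string) string {
	if len(id) <= shortIDLen {
		return id
	}

	return id[:shortIDLen]
}

func main() {
	// Hypothetical 64-character container ID.
	id := "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"

	fmt.Println(shortID(id))
}
```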
Use new version of go-kubernetes, and move the `kube-proxy` DaemonSet
update to follow common logic of bootstrap manifests update.
This fixes a confusing behavior where, after `k8s-upgrade`, the version of
`kube-proxy` is not updated in the machine config.
See https://github.com/siderolabs/go-kubernetes/pull/3
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Exposes the pod IP as the `POD_IP` environment variable via the downward
API in the kube-proxy pod for use in e.g. metrics-bind-addr.
Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
This introduces a new role for Talos API which fills the gap between
`os:reader` and `os:admin` roles.
Fixes #6898
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
When removing a member from `etcd`, the server does a pre-check to make
sure the member is connected to a quorum of other members, and the
remove request might fail. Add a retry to wait for etcd to be fully
connected before giving up, as some parts of the reset flow already ran.
Also fix an issue which appears in the integration test, when `reset` is
called early in the boot sequence when local etcd hasn't started fully yet.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This PR ensures that we can handle the third GRUB option, "wipe". We will use it in 1.4.
For #6842
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Fixes: https://github.com/siderolabs/talos/issues/6815
Additionally, make it possible to run reset in maintenance mode, which
provides a way to reset the system disk and remove all traces of Talos
from it.
The new reset flow works in a separate sequence; the disk probe lookup
was changed to check the boot partition instead of the ephemeral one.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Move dashboard package into a common location where both Talos and talosctl can use it.
Add support for overriding stdin, stdout, stderr and ctty in process runner.
Create a dashboard service which runs the dashboard on /dev/tty2.
Redirect kernel messages to tty1 and switch to tty2 after starting the dashboard on it.
Related to siderolabs/talos#6841, siderolabs/talos#4791.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
- github.com/aws/aws-sdk-go to v1.44.209
- github.com/stretchr/testify to v1.8.2
- github.com/jsimonetti/rtnetlink to v1.3.1
- google.golang.org/genproto to v0.0.0-20230223222841-637eb2293923
- github.com/emicklei/dot to v1.3.1
- github.com/gdamore/tcell/v2 to v2.6.0
- github.com/insomniacslk/dhcp to v0.0.0-20230220063916-5369909a5de7
- github.com/opencontainers/runtime-spec to v1.1.0-rc.1.0.20230215090456-58ec43f9fc39
- github.com/rivo/tview to v0.0.0-20230226195229-47e7db7885b4
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
The `modules.dep` kernel module dependency tree extension root path was
previously created with a permission of `0o700`, which means the Talos
root got a permission of `0o700` when the kernel module tree was rebuilt
because extensions providing kernel modules were enabled. As a result,
any binaries lost the executable permission when run as non-root,
producing an `EACCES` error. Fix this by making sure the temporary
directory created for building the kernel module tree explicitly has
`0o755` permission.
Signed-off-by: Noel Georgi <git@frezbo.dev>
The shared code is going out to the
github.com/siderolabs/go-kubernetes library.
The code will be used in Talos and other projects using same features.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Load & start machined earlier and in initialize sequence, so that it is possible to use its API over its unix socket in maintenance mode.
Additionally, do not return features from Version API if a config is not yet available.
Related to siderolabs/talos#4791.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
The netmask can be set as a number.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Talos always supported that, but CRI config lacked support for it.
Now with recent containerd the new `_default` host is used as a
fallback, so this re-enables the support and updates the docs.
See https://github.com/containerd/containerd/pull/8065
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes: #6802
Automatically load kernel modules based on hardware info and modules
alias info. udevd would automatically load modules based on HW
information present.
Signed-off-by: Noel Georgi <git@frezbo.dev>
This fixes the issue when the overlay mount target directory was used as
lowerdir for the mount, creating extra folders in the extension.
Fix the issue by adding support for normal overlay mounts to use a
source directory when specified.
Also fixes a small issue where a message was logged when the error was nil.
Signed-off-by: Noel Georgi <git@frezbo.dev>
Not sure how I missed it in the first PR, but that's the only character
which was not quoted properly.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Wait for the network before trying to access the metadata service.
Retry the calls when appropriate (most platforms use `download.Download`
function which does proper retries).
Co-authored-by: Noel Georgi <git@frezbo.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
One case was missing: when network section is present, but value is
omitted.
Fixes #6825
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Use process wrapper introduced in #6814 to drop capabilities. This change
also means the capabilities are dropped per process level and not for
PID 1 (machined), which allows us to drop capabilities per process.
Signed-off-by: Noel Georgi <git@frezbo.dev>
Use a wrapper for starting processes which can set up proper cgroups and
the OOM score, and also drop capabilities for the process, before it calls
`execve`.
The containerd test is also fixed to support cgroups when
running tests in buildkit. It used to pass previously as we did not
error if cgroup setup failed.
Signed-off-by: Noel Georgi <git@frezbo.dev>
This fixes multiple issues:
* `log.Fatalf` in the machined code leads to kernel panic
* return URL if some expansion fails
* correctly handle destroyed event (wait for the next one)
Fixes #6807
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
One of the fields in the GRUB config - boot arguments - contains
user-controlled input. Talos supports variable expansion in
`talos.config` parameter, and uses `${var}` syntax.
In GRUB config, `}` is a special character, and introduction of `}`
breaks config parsing both for GRUB and Talos.
Correctly escape and unescape special characters.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
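Escaping and unescaping special characters, as described above, could be sketched as a pair of functions. The function names and the exact set of escaped characters are assumptions; the log only states that `}` is special in GRUB config, and the real set used by Talos may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// specialChars is a hypothetical set of characters to escape. Only `}` is
// named in the commit message; the others are assumptions for illustration.
const specialChars = `\${}`

// quote prefixes every special character with a backslash.
func quote(s string) string {
	var b strings.Builder

	for _, r := range s {
		if strings.ContainsRune(specialChars, r) {
			b.WriteByte('\\')
		}

		b.WriteRune(r)
	}

	return b.String()
}

// unquote reverses quote: a backslash makes the next character literal.
func unquote(s string) string {
	var b strings.Builder

	escaped := false

	for _, r := range s {
		if !escaped && r == '\\' {
			escaped = true

			continue
		}

		escaped = false

		b.WriteRune(r)
	}

	return b.String()
}

func main() {
	arg := "talos.config=http://example.com/${code}"
	quoted := quote(arg)

	fmt.Println(quoted)
	fmt.Println(unquote(quoted) == arg)
}
```

Escaping the backslash itself is what makes the round trip lossless for inputs that already contain backslashes.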
The previous `udevd` healthcheck was incomplete: if `udevd` took more
time to start up, the initial `udevadm trigger` would fail silently and
the proper devices would not be set up, because `udevadm trigger` returns
an exit code of zero even if `udevd` is not running. This PR fixes that by
first checking whether the `udevd` control socket exists, which is a
faster check, and then making sure `udevd` is up by running the
`udevadm control` command. This ensures that `udevd` is properly
initialized before running any `udevadm trigger` commands, even if
`udevd` is restarted or killed.
Signed-off-by: Noel Georgi <git@frezbo.dev>
This excludes it out of the `NodeAddress`.
Needs extra testing to confirm that it actually still works as anchor
IP.
Fixes #6760
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
As the client returns wrapped errors, unwrap them using our own method
which does `errors.As` instead of gRPC one which doesn't do unwrapping.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>