IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
While we decide what to do with #8263 and #8256 this quickfix at least allows us to
see what went wrong
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
With switching to RSA service account, machine config generation time is
considerably higher now, so the test might not make it in time.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Also:
* Linux 6.6.14 + XDP enablement
* etcd 3.5.12
Various other bumps for the tools, utilities, and Go modules.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
It was deprecated 16 months ago, time to cleanup.
(This is to prepare for the first v1.7 release)
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8069
The image age from the CRI is the moment the image was pulled, so if it
was pulled long time ago, the previous version would nuke the image as
soon as it is unreferenced. The new version would allow the image to
stay for the full grace period in case the rollback is requested.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
In the previous implementation, even though `installer.err` was set, it
was never checked 🤦.
The run loop was stolen from the dashboard code.
Fixes#8205
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8202
If some mountpoint can't be queried successfully for 'diskfree'
information, don't treat that as an error, and report zero values for
disk usage/size instead.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This PR adds a new controller - `DNSServerController` that starts tcp and udp dns servers locally. Just like `EtcFileController` it monitors `ResolverStatusType` and updates the list of destinations from there.
Most of the caching logic is in our "lobotomized" "`CoreDNS` fork. We need this fork because default `CoreDNS` carries
full Caddy server and various other modules that we don't need in Talos. On our side we implement
random selection of the actual dns and request forwarding.
Closes#7693
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This is currently no-op, just noticed that while looking into another
bug. This should make the intention more clean.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8157
This PR contains two fixes, both related to the same problem.
Several routes for different links but same IPv6 destination might exist
at the same time, so route resource ID should handle that. The problem
was that these routes were mis-reported causing internally updates for
the same resources multiple times (equal to the number of the links).
Don't trigger controllers more often than 10 times/seconds (with burst of
5) for kernel notifications. This ensures Talos doesn't try to reflect
current state of the network subsystem too often as resources, which
causes excessive CPU usage and might potentially lead to the buffer
overrun under high rate of changes.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Before this change KubePrism used hardcoded "localhost" as destination which Go could resolve to IPv6 destination and
then fail to connect to. This change forces KubePrism to connect using IPv4 and uses hardcoded "127.0.0.1" destination so
it will always use IPv4.
For #8112
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
When the dashboard is used via the CLI through a proxy, e.g., through Omni, node names or IDs can be used in the `--nodes` flag instead of the IPs.
This caused rendering inconsistencies in the dashboard, as some parts of it used the IPs and some used the names passed in the context.
Fix this by collecting all node IPs on dashboard start, and map these IPs to the respective nodes passed as the `--nodes` flag.
On the dashboard footer, we always display the node names as they are passed in the `--nodes` flag.
As part of it, remove the node list change reactivity from the dashboard, so it will always take the passed nodes as the truth.
The IP to node mapping collection at dashboard startup also solves another issue where the first API call by the dashboard triggered the interactive API authentication (e.g., the OIDC flow). Previously, because the terminal was already switched to the raw mode, it was not possible to authenticate properly.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Add missing attributes to conversion of go-blockdevice disk
to protobuf disk.
Signed-off-by: Jonomir <68125495+Jonomir@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
We will use the default IPv6 gateway priority as 2048.
The RA default is 1024, which leads to verbose messages such as 'error adding route: netlink receive: file exists.'
Azure uses DHCPv6 and RA for configuring IPv6 on the node.
The platform sets the default gateway as a fallback in case 'accept_ra' is not set to 2.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Add some quirks to make images generated with newer Talos compatible
with images generated by older Talos.
Specifically, reset options were adding in Talos 1.4, so we shouldn't
add them for older versions.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This is going to be multipart effort to finally use safe.* wrappers in the production code.
Quick regexp search shows that there are around 150 direct type assertions on resources (excluding the ones in this commit).
Also - migrate from `interface{}` to `any` and use `slices.Sort*` instead of `sort.*` where possible.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
When creating an image under non-default mount prefix, it should be
used explicitly when copying SBC files.
See https://github.com/siderolabs/image-factory/issues/65
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Updated lots of documentation with new/updated flows.
Provide What's New for Talos 1.6.0.
Update Troubleshooting guide to cover more steps.
Make Talos 1.6 docs the default.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8057
I went back and forth on the way to fix it exactly, and ended up with a
pretty simple version of a fix.
The problem was that discovery service was removing the member at the
initial phase of reset, which actually still requires KubeSpan to be up:
* leaving `etcd` (need to talk to other members)
* stopping pods (might need to talk to Kubernetes API with some CNIs)
Now leaving discovery service happens way later, when network
interactions are no longer required.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
It works well for small clusters, but with bigger clusters it puts too
much load on the discovery service, as it has quadratic complexity in
number of endpoints discovered/reported from each member.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This might come handy to distinguish sequences, tasks initiated by a
particular API request.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Reimplement `gopacket.PacketSource.PacketsCtx` as `forEachPacket`.
- Use `ZeroCopyPacketDataSource` instead of `PacketDataSource`. I didn't find any specific reason why `PacketDataSource` exists at all, since `NewPacket` is doing copy inside if you don't explicitly tell it not to.
- Use `WillPool` to pool packet buffers. It doesn't fully remove allocations, but it's a safe start.
Send packets back into the pool after we are done with them.
- Pass `Packet` directly to the closure instead of waiting for it on the channel. We don't store this packet anywhere so there is no reason to async this part.
- Drop `time.Sleep` code in `forEachPacket` body.
- Drop `SnapLen` support in client and server since it didn't work anyway (details in the PR).
Closes#7994
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Generate a structured table of contents following the structure of the
config.
Make high-level examples follow the full structure of the config.
Document new multi-doc machine config.
Fixes#8023
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The core blockdevice library already supported resolving symlinks, we
just need to get the raw block device name from it, and use it
afterwards.
In QEMU provisioner, leave the first (system) disk as virtio (for
performance), and mount user disks as 'ata', which allows `udevd` to
pick up the disk IDs (not available for `virtio`), and use the symlink
path in the tests.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#7854
Talos runs an emergency handler if the sequence experience and
unrecoverable failure. The emergency handler was unconditionally
executing "reboot" action if no other action was received (which only
gets received if the sequence completes successfully), so the Shutdown
request might result in a Reboot behavior on error during shutdown
phase.
This is not a pretty fix, but it's hard to deliver the intent from one
part of the code to another right now, so instead use a global variable
which stores default emergency intention, and gets overridden early in
the Shutdown sequence.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
As the controller reconciles every /etc file present, it might be called
multiple times for the same file, even if the actual contents haven't
changed.
Rewriting the file might lead to some concurrent process seeing
incomplete file contents more often than needed.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The problem was that bootloaders were correctly picking up defaults for
`installer` mode (vs. `imager` mode), but DTB and other SBC stuff wasn't
properly initialized, so installing on SBC fails.
Now all options are properly initialized with defaults early in the
process.
Fixes#8009
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#7947
This way etcd advertised address can be picked from the `external IPs`
of the machine.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#4421
See documentation for details on how to use the feature.
With `talosctl cluster create`, firewall can be easily test with
`--with-firewall=accept|block` (default mode).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
We already have the code which supports custom enums, so let's extend it to support custom enums in slices and
fix the NfTablesConntrackStateMatch proto definition.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Many changes to the nftables backend which will be used in the follow-up
PR with #4421.
1. Add support for chain policy: drop/accept.
2. Properly handle match on all IPs in the set (`0.0.0.0/0` like).
3. Implement conntrack state matching.
4. Implement multiple ifname matching in a single rule.
5. Implement anonymous counters.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Implement initial set of backend controllers/resources to handle
nftables chains/rules etc.
Replace the KubeSpan nftables operations with controller-based.
See #4421
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The code will rotate through the endpoints, until it reaches the end, and only then it will try to do the provisioning again.
Closes#7973
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This PR adds support for custom node taints. Refer to `nodeTaints` in the `configuration` for more information.
Closes#7581
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This commit deprecates those things:
- Removes the support of `.persist` flag. From now, it should always be enabled or not defined in the config.
- Removes the documentation for `.bootloader`. It never worked anyway.
- Adds a warning for `.machine.install.extensions`, suggests to use boot-assets.
Closes#7972Closes#7507
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Ignore kernel command line for `SideroLink` and `EventsSink` config when
running in container mode. Otherwise when running Talos as a docker
container in Talos it picks up the host kernel cmdline and try to
configure SideroLink/EventsSink.
Signed-off-by: Noel Georgi <git@frezbo.dev>
Start container early in the boot process so system extension services
start in maintenance mode.
Fixes: #7083
Signed-off-by: Noel Georgi <git@frezbo.dev>
Fixes#7873
Some services which perform mounts inside the container which require
mounts to propagate back to the host (e.g. `stargz-snapshotter`) require
this configuration setting.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>