IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Fixes#8481
The issue was that the link 'bridge' was skipped, so Talos default was
applied to run DHCP and use the DHCP hostname (instead of using
platform's hostname).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8498
Before KubeSpan was reimplemented to use resources for firewall rules,
the update was happening always, but it got moved to a wrong section of
the controller which gets executed on resource updates, but ignores
updates of the peer statuses.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8361
Talos requires v2 (circa 2008), but VMs are often configured to limit
the exposed features to the baseline (v1).
```
[ 0.779218] [talos] [initramfs] booting Talos v1.7.0-alpha.1-35-gef5bbe728-dirty
[ 0.779806] [talos] [initramfs] CPU: QEMU Virtual CPU version 2.5+, 4 core(s), 1 thread(s) per core
[ 0.780529] [talos] [initramfs] x86_64 microarchitecture level: 1
[ 0.781018] [talos] [initramfs] it might be that the VM is configured with an older CPU model, please check the VM configuration
[ 0.782346] [talos] [initramfs] x86_64 microarchitecture level 2 or higher is required, halting
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Before this commit, if tunnel failed with error, it would never restart again until `siderolink.TunnelType` event happen.
For most of the time it's a good idea, because it might mean that destination has changed.
But tunnel can also fail because allowed peer list is not yet loaded on newly started Omni instance.
Because of that, we want to try again and not be tied to the runtime event channel.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Fix Equnix Metal (where proper arm64 args are known) and metal platform
(using generic arm64 console arg).
Other platforms might need to be updated, but correct settings are not
known at the moment.
Fixes#8529
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This PR ensures that we can test our siderolink communication using embedded siderolink-agent.
If `--with-siderolink` provided during `talos cluster create` talosctl will embed proper kernel string and setup `siderolink-agent` as a separate process. It should be used with combination of `--skip-injecting-config` and `--with-apply-config` (the latter will use newly generated IPv6 siderolink addresses which talosctl passes to the agent as a "pre-bind").
Fixes#8392
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This allows to roll all nodes to use a new CA, to refresh it, or e.g.
when the `talosconfig` was exposed accidentally.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8267
Also refactor the code so that we don't fail hard on mutiple bonds, but
it's not clear still how to attach addresses, as they don't have a
interface name field, so for now attaching to the first bond.
Fixes#8411
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The core change is moving the context out of the `ServiceRunner` struct
to be a local variable, and using a channel to notify about shutdown
events.
Add more synchronization between Run and the moment service started to
avoid mis-identifying not running (yet) service as successfully finished.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Co-authored-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This controller combines kobject events, and scan of `/sys/block` to
build a consistent list of available block devices, updating resources
as the blockdevice changes.
Based on these resources the next step can run probe on the blockdevices
as they change to present a consistent view of filesystems/partitions.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The current code was stipping non-`v1alpha1.Config` documents. Provide a
proper method in the config provider, and update places using it.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
We now remove the machine config with the id `maintenance` when we are done with it - when the maintenance service is shut down.
Closessiderolabs/talos#8424, where in some configurations there would be machine configs with both `v1alpha1` and `maintenance` IDs present, causing the `talosctl edit machineconfig` to loop twice and causing confusion.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Implement `Install` for imager overlays.
Also add support for generating installers.
Depends on: #8377Fixes: #8350Fixes: #8351Fixes: #8350
Signed-off-by: Noel Georgi <git@frezbo.dev>
This is a small quality of life improvement that allows `logs` subcommand to suggest all available logs.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Fix the nil dereferences when a Talos node is attempted to be upgraded while in maintenance mode and having a partial machine config.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
To be used in the `go-talos-support` module without importing the whole
Talos repo.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Fixes a condition when the timestamp contains a single digit day.
This started failing when the month started :sweat_smile.
Also handle a case when `tag` and `hostname` are both missing.
Signed-off-by: Noel Georgi <git@frezbo.dev>
Turns out there is actually no black magic in systemd, they simply listen on 127.0.0.53 and forward dns requests there in resolv.conf.
Reason is the same as ours — to preserve compatibility with other applications. So we do the same in our code.
This PR also does two things:
- Adds `::1` into resolv.conf for IPv6 only resolvers.
- Drops `SO_REUSEPORT` from control options (it works without them).
Closes#8328
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This errors pops up when `udevd` rescans the partition table with Talos
trying to mount a device concurrently.
This feels to be something new with Linux 6.6 probably.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8345
Both `apid` and `trustd` services use a gRPC connection back to
`machined` to watch changes to the certificates (new certificates being
issued).
This refactors the code to follow regular conventions, so that a failure
to watch will crash the process, and they have a way to restart and
re-establish the watch.
Use the context and errgroup consistently.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Let's add a very basic test for the Kata Containers extension, mimicing
what's already in place for gVisor.
This depends on the work being done in:
https://github.com/siderolabs/extensions/pull/279
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
Drop the Kubernetes manifests as static files clean up (this is only
needed for upgrades from 1.2.x).
Fix Talos handling of cgroup hierarchy: if started in container in a
non-root cgroup hiearachy, use that to handle proper cgroup paths.
Add a test for a simple TinK mode (Talos-in-Kubernetes).
Update the docs.
Fixes#8274
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
SideroLink is a secure channel, so we can allow read access to the resources. This will give us more control of the node via Omni and/or other systems using SideroLink.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
This disables by default (if not specified in the machine config) the
endpoint harvesting for KubeSpan peers.
The idea was to observe Wireguard endpoints as seen by other peers in
the cluster, and add them to the list of endpoints for the node. This
might be helpful only in case of some special type of NATs which are
almost never seen in the wild today.
So disable by default, but keep an option to enable it.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Prevent `DNSUpstreamController` from panicking by checking if the `machine` section in the config is `nil`. This is the case when a machine has partial configuration, e.g., when the machine has only a `SideroLinkConfig` in its config.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Talos Linux 1.7.0 will ship with Kubernetes v1.30.0.
Drop some compatibility for Kubernetes < 1.25, as 1.25 is the minimum
supported version now.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#4525
The previous implementation had several issues:
* etcd concurrency session never closed
* Unlock() with potentially closed context
* unlocking when upgrade sequence finishes, but this overlaps with the
machine reboot, so a chance that it never got unlocked
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
While we decide what to do with #8263 and #8256 this quickfix at least allows us to
see what went wrong
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
With switching to RSA service account, machine config generation time is
considerably higher now, so the test might not make it in time.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Also:
* Linux 6.6.14 + XDP enablement
* etcd 3.5.12
Various other bumps for the tools, utilities, and Go modules.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
It was deprecated 16 months ago, time to cleanup.
(This is to prepare for the first v1.7 release)
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8069
The image age from the CRI is the moment the image was pulled, so if it
was pulled long time ago, the previous version would nuke the image as
soon as it is unreferenced. The new version would allow the image to
stay for the full grace period in case the rollback is requested.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
In the previous implementation, even though `installer.err` was set, it
was never checked 🤦.
The run loop was stolen from the dashboard code.
Fixes#8205
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8202
If some mountpoint can't be queried successfully for 'diskfree'
information, don't treat that as an error, and report zero values for
disk usage/size instead.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This PR adds a new controller - `DNSServerController` that starts tcp and udp dns servers locally. Just like `EtcFileController` it monitors `ResolverStatusType` and updates the list of destinations from there.
Most of the caching logic is in our "lobotomized" "`CoreDNS` fork. We need this fork because default `CoreDNS` carries
full Caddy server and various other modules that we don't need in Talos. On our side we implement
random selection of the actual dns and request forwarding.
Closes#7693
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This is currently no-op, just noticed that while looking into another
bug. This should make the intention more clean.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8157
This PR contains two fixes, both related to the same problem.
Several routes for different links but same IPv6 destination might exist
at the same time, so route resource ID should handle that. The problem
was that these routes were mis-reported causing internally updates for
the same resources multiple times (equal to the number of the links).
Don't trigger controllers more often than 10 times/seconds (with burst of
5) for kernel notifications. This ensures Talos doesn't try to reflect
current state of the network subsystem too often as resources, which
causes excessive CPU usage and might potentially lead to the buffer
overrun under high rate of changes.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Before this change KubePrism used hardcoded "localhost" as destination which Go could resolve to IPv6 destination and
then fail to connect to. This change forces KubePrism to connect using IPv4 and uses hardcoded "127.0.0.1" destination so
it will always use IPv4.
For #8112
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
When the dashboard is used via the CLI through a proxy, e.g., through Omni, node names or IDs can be used in the `--nodes` flag instead of the IPs.
This caused rendering inconsistencies in the dashboard, as some parts of it used the IPs and some used the names passed in the context.
Fix this by collecting all node IPs on dashboard start, and map these IPs to the respective nodes passed as the `--nodes` flag.
On the dashboard footer, we always display the node names as they are passed in the `--nodes` flag.
As part of it, remove the node list change reactivity from the dashboard, so it will always take the passed nodes as the truth.
The IP to node mapping collection at dashboard startup also solves another issue where the first API call by the dashboard triggered the interactive API authentication (e.g., the OIDC flow). Previously, because the terminal was already switched to the raw mode, it was not possible to authenticate properly.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Add missing attributes to conversion of go-blockdevice disk
to protobuf disk.
Signed-off-by: Jonomir <68125495+Jonomir@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
We will use the default IPv6 gateway priority as 2048.
The RA default is 1024, which leads to verbose messages such as 'error adding route: netlink receive: file exists.'
Azure uses DHCPv6 and RA for configuring IPv6 on the node.
The platform sets the default gateway as a fallback in case 'accept_ra' is not set to 2.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Add some quirks to make images generated with newer Talos compatible
with images generated by older Talos.
Specifically, reset options were adding in Talos 1.4, so we shouldn't
add them for older versions.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This is going to be multipart effort to finally use safe.* wrappers in the production code.
Quick regexp search shows that there are around 150 direct type assertions on resources (excluding the ones in this commit).
Also - migrate from `interface{}` to `any` and use `slices.Sort*` instead of `sort.*` where possible.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
When creating an image under non-default mount prefix, it should be
used explicitly when copying SBC files.
See https://github.com/siderolabs/image-factory/issues/65
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Updated lots of documentation with new/updated flows.
Provide What's New for Talos 1.6.0.
Update Troubleshooting guide to cover more steps.
Make Talos 1.6 docs the default.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#8057
I went back and forth on the way to fix it exactly, and ended up with a
pretty simple version of a fix.
The problem was that discovery service was removing the member at the
initial phase of reset, which actually still requires KubeSpan to be up:
* leaving `etcd` (need to talk to other members)
* stopping pods (might need to talk to Kubernetes API with some CNIs)
Now leaving discovery service happens way later, when network
interactions are no longer required.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
It works well for small clusters, but with bigger clusters it puts too
much load on the discovery service, as it has quadratic complexity in
number of endpoints discovered/reported from each member.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This might come handy to distinguish sequences, tasks initiated by a
particular API request.
Signed-off-by: Artem Chernyshev <artem.chernyshev@talos-systems.com>
Reimplement `gopacket.PacketSource.PacketsCtx` as `forEachPacket`.
- Use `ZeroCopyPacketDataSource` instead of `PacketDataSource`. I didn't find any specific reason why `PacketDataSource` exists at all, since `NewPacket` is doing copy inside if you don't explicitly tell it not to.
- Use `WillPool` to pool packet buffers. It doesn't fully remove allocations, but it's a safe start.
Send packets back into the pool after we are done with them.
- Pass `Packet` directly to the closure instead of waiting for it on the channel. We don't store this packet anywhere so there is no reason to async this part.
- Drop `time.Sleep` code in `forEachPacket` body.
- Drop `SnapLen` support in client and server since it didn't work anyway (details in the PR).
Closes#7994
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Generate a structured table of contents following the structure of the
config.
Make high-level examples follow the full structure of the config.
Document new multi-doc machine config.
Fixes#8023
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The core blockdevice library already supported resolving symlinks, we
just need to get the raw block device name from it, and use it
afterwards.
In QEMU provisioner, leave the first (system) disk as virtio (for
performance), and mount user disks as 'ata', which allows `udevd` to
pick up the disk IDs (not available for `virtio`), and use the symlink
path in the tests.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#7854
Talos runs an emergency handler if the sequence experience and
unrecoverable failure. The emergency handler was unconditionally
executing "reboot" action if no other action was received (which only
gets received if the sequence completes successfully), so the Shutdown
request might result in a Reboot behavior on error during shutdown
phase.
This is not a pretty fix, but it's hard to deliver the intent from one
part of the code to another right now, so instead use a global variable
which stores default emergency intention, and gets overridden early in
the Shutdown sequence.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
As the controller reconciles every /etc file present, it might be called
multiple times for the same file, even if the actual contents haven't
changed.
Rewriting the file might lead to some concurrent process seeing
incomplete file contents more often than needed.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The problem was that bootloaders were correctly picking up defaults for
`installer` mode (vs. `imager` mode), but DTB and other SBC stuff wasn't
properly initialized, so installing on SBC fails.
Now all options are properly initialized with defaults early in the
process.
Fixes#8009
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#7947
This way etcd advertised address can be picked from the `external IPs`
of the machine.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#4421
See documentation for details on how to use the feature.
With `talosctl cluster create`, firewall can be easily test with
`--with-firewall=accept|block` (default mode).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
We already have the code which supports custom enums, so let's extend it to support custom enums in slices and
fix the NfTablesConntrackStateMatch proto definition.
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Many changes to the nftables backend which will be used in the follow-up
PR with #4421.
1. Add support for chain policy: drop/accept.
2. Properly handle match on all IPs in the set (`0.0.0.0/0` like).
3. Implement conntrack state matching.
4. Implement multiple ifname matching in a single rule.
5. Implement anonymous counters.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Implement initial set of backend controllers/resources to handle
nftables chains/rules etc.
Replace the KubeSpan nftables operations with controller-based.
See #4421
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The code will rotate through the endpoints, until it reaches the end, and only then it will try to do the provisioning again.
Closes#7973
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This PR adds support for custom node taints. Refer to `nodeTaints` in the `configuration` for more information.
Closes#7581
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This commit deprecates those things:
- Removes the support of `.persist` flag. From now, it should always be enabled or not defined in the config.
- Removes the documentation for `.bootloader`. It never worked anyway.
- Adds a warning for `.machine.install.extensions`, suggests to use boot-assets.
Closes#7972Closes#7507
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Ignore kernel command line for `SideroLink` and `EventsSink` config when
running in container mode. Otherwise when running Talos as a docker
container in Talos it picks up the host kernel cmdline and try to
configure SideroLink/EventsSink.
Signed-off-by: Noel Georgi <git@frezbo.dev>
Start container early in the boot process so system extension services
start in maintenance mode.
Fixes: #7083
Signed-off-by: Noel Georgi <git@frezbo.dev>
Fixes#7873
Some services which perform mounts inside the container which require
mounts to propagate back to the host (e.g. `stargz-snapshotter`) require
this configuration setting.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Support different providers, not only static file paths.
Drop `pcr-signing-key-public.pem` file, as we generate it on the fly
now.
See https://github.com/siderolabs/image-factory/issues/19
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This PR does those things:
- It allows API calls `MetaWrite` and `MetaRead` in maintenance mode.
- SystemInformation resource now waits for available META
- SystemInformation resource now overwrites UUID from META if there is an override
- META now supports "UUID override" and "unique token" keys
- ProvisionRequest now includes unique token and Talos version
For #7694
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
Previously a fix was deployed in the Talos API client, but when the
request passes through `apid`, we need to make sure that proxy doesn't
reject large responses.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Move `setupLogging` inside the controller, so that logger is set up
correctly before Talos starts printing first messages.
This fixes an inconsistency that first messages are printed using
"default" logger, while after that the proper logger is set up, and
format of the messages matches kernel log.
Also move `waitForUSBDelay` into the sequencer after `udevd` was
started (this is when blockdevices including USB ones are discovered).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
First of all, this interface is way more performant than `pcap`
interface. It is Linux-specific, but we don't care in Talos Linux :)
Second, this drop dependency of `machined` on `gopacket/layers` package,
which has huge issues with memory allocations and startup time.
This cuts around 20MiB of process RSS for all Talos processes.
(`talosctl` still requires this `gopacket/layers` library for decoding
packets).
Fixes#7880
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
As Talos doesn't consume `.machine.install` if already installed, there
is no point in validating it once already installed.
This fixes a problem users often run into: after a reboot/upgrade the
system disk blockdevice name changes, due to the kernel upgrade, or just
unpredictable behavior of device discovery. Talos fails to boot as it
can't validate the machine config, while it's already installed, so
actual blockdevice name doesn't matter.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
See https://github.com/siderolabs/image-factory/issues/44
Instead of using constants, use proper Talos version and kernel version
discovered from the image.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This leads to lots of unnecessary improts, as the chain from network
controllers is pretty long.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Use fixed partition name instead of trying to auto-discover by label.
Auto-discovery by label might hit completely wrong blockdevice.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
See https://github.com/siderolabs/image-factory/issues/43
Two fixes:
* pass path to the dtb, uboot and rpi-firmware explicitly
* include dtb, uboot and rpi-firmware into arm64 installer image when
generated via imager (regular arm64 installer was fine)
(The generation of SBC images was not broken for Talos itself, but only
when used via Image Factory).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Before we started a reboot/shutdown/reset/upgrade action with the action tracker (`--wait`), we were setting a flag to prevent cobra from printing the returned error from the command.
This was to prevent the error from being printed twice, as the reporter of the action tracker already prints any errors occurred during the action execution.
But if the error happens too early - i.e. before we even started the status printer goroutine, then that error wouldn't be printed at all, as we have suppressed the errors.
This PR moves the suppression flag to be set after the status printer is started - so we still do not double-print the errors, but neither do we suppress any early-stage error from being printed.
Closessiderolabs/talos#7900.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
This allows to "recover" secrets if the machine config was generated
first without explicitly saving secrets bundle.
Fixes#7895
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This commit integrates the GOMEMLIMIT environment variable into shipped K8S
manifests when resources.limits.memory is defined. It is set to 95% of the
memory limit to optimize the performance of the Go garbage collector,
mitigating the risk of OOMKills in containerized environments.
When configuring the controller-manager or scheduler custom resources in
machine config, they where accepted, but ignored.
This commit adds Resources to NewControlPlaneSchedulerController and
NewControlPlaneControllerManagerController so machine config resources
Fixes#7874
Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Use MAC address over network interface name.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
* Set static gateway IPv6 if it possible.
Some cni do not work properly with ipv6, so we will fix it.
* Disable talos dashboard.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
There were weird hacks put into the tests, while each test already runs
in a temporary directory as 'working directory', so no hacks are needed.
Moreover, using fixed `/tmp/...` paths leads to test failures, as CI
runs docker & QEMU tests in parallel conflicting with each other.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
* support for local-hostname parameter
* support for hostnames passed via user-data (for Proxmox VE)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
First of all, it breaks our backwards compatibility promises and breaks
documentation generation. Upstream `specs.Mount` might change at any
time.
The issue was that containerd 1.7.x brings in new `specs.Mount` which
contains extra fields which don't have `omitempty` for YAML, so
machinery always generates them which confuses old Talos versions.
Use a copy of the upstream struct with proper YAML tags, and also
provide a special trick to make sure if the upstream struct changes, we
have a chance to update our copy of the struct.
Also this fixes docs and JSON schema.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This does not fix the underlying digest mismatch issue, but does handle the error and should provide
further insight into issues (if present).
Refs: #7828
Signed-off-by: Thomas Way <thomas@6f.io>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The conversion from TPM 2 hash algorithm to Go crypto algorithm will fail for
uncommon algorithms like SM3256. This can be avoided by checking the constants
directly, rather than converting them. It should also be fine to allow some non
SHA-256 PCRs.
Fixes: #7810
Signed-off-by: Thomas Way <thomas@6f.io>
Signed-off-by: Noel Georgi <git@frezbo.dev>
drop `UpdateEndpointSuite` suite since KubePrism is enabled by default
starting Talos 1.6 and the test never passes since K8s node is always
ready since it can connect to api server over KubePrism.
Signed-off-by: Noel Georgi <git@frezbo.dev>
When STATE is reset, we need to make sure we wipe the META keys
containing encryption config as well.
Fixes#7819
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This was a fix some time ago, but it was incorrect (missing `continue`),
which was failing the unit-tests.
Also fix a data race in another unit-test (which is unit-test only, not
affecting production).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
When running on the machine, the extensionTreePath is not writeable, so
create and clean up a temporary directory to host `modules.dep`
extension.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Drop loop device/mounts completely, use userspace utilities to extract
and lay over module trees in the tmpfs.
Discover kernel version automatically instead of hardcoding it to be
current one (required for Image Service).
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Otherwise route gets created with priority '0' and it seems to get into
conflict with what Cilium tries to add.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes#7738
If the SideroLink address changes, maintenance service should listen on
new address. Previously it worked "sometimes", as there was a race on
maintenance config either be removed/recreated or just updated. In case
of an update the listen address was not updated properly, but recreate
case worked correctly.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This feature allows us to remove any comments from the machineconfig after
upgrading Kubernetes.
Signed-off-by: Serge Logvinov <serge.logvinov@sinextra.dev>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>