Currently, upgrade-k8s adds both node internal and external IPs.
This commit uses the internal IP if available; external IP is
only used as a fallback.
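
The selection rule described above can be sketched as follows. This is a minimal illustration only: the `nodeAddresses` type and `preferredIPs` function are hypothetical names and are not taken from the upgrade-k8s code.

```go
package main

import "fmt"

// nodeAddresses is a hypothetical, simplified view of a node's addresses.
type nodeAddresses struct {
	InternalIPs []string
	ExternalIPs []string
}

// preferredIPs returns the internal IPs when at least one is available and
// falls back to the external IPs otherwise.
func preferredIPs(n nodeAddresses) []string {
	if len(n.InternalIPs) > 0 {
		return n.InternalIPs
	}

	return n.ExternalIPs
}

func main() {
	both := nodeAddresses{
		InternalIPs: []string{"10.0.0.2"},
		ExternalIPs: []string{"203.0.113.7"},
	}
	onlyExternal := nodeAddresses{ExternalIPs: []string{"203.0.113.8"}}

	fmt.Println(preferredIPs(both))
	fmt.Println(preferredIPs(onlyExternal))
}
```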
Signed-off-by: Alex Lubbock <code@alexlubbock.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Use `pigz` and `--sparse` to compress the assets more efficiently.
Also move tasks out of the `setup-ci` step, as it always runs, including for
the promoted pipelines.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This is a port of ukify.py and systemd-measure from systemd.
This requires no actual TPM to be present to calculate the PCR
signatures.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
Provide a link that explains which versions are supported.
Signed-off-by: Alex Corcoles <alex@pdp7.net>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This cleans up the `Dockerfile` and `Makefile` targets to bring them closer
to parity with the `kres` auto-generated targets.
Now `make talosctl` builds only the binary for the local machine, which
makes development easier. Also added an `iso` docker target
that builds an ISO for local development without having to push and pull
the imager (`make local-iso DEST=_out`).
Signed-off-by: Noel Georgi <git@frezbo.dev>
See #7230
Refactor more config interfaces, move config accessor interfaces
to a different package to break the dependency loop.
Make `.RawV1Alpha1()` method typed to avoid type assertions everywhere.
No functional changes.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #7253
Also fix the case where the `kube-proxy` version was updated in the machine
config in `--dry-run` mode.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
See #7230
This is a step towards preparing for multi-doc config.
Split the `config.Provider` interface into parts which have different
implementations:
* `config.Config` accesses the config itself; it might be implemented by
`v1alpha1.Config`, for example
* `config.Container` will be a set of config documents, which implement
validation, encoding, etc.
The `Version()` method is dropped, as it makes little sense and was almost
never used.
The `Raw()` method is renamed to `RawV1Alpha1()` to support legacy direct
access to `v1alpha1.Config`; the next PR will refactor more to make it
return the proper type.
There will be many more changes coming up.
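
A rough sketch of what such a split could look like is shown below. It is illustrative only: the method names (`ClusterName`, `Validate`, `Bytes`), the idea that `Container` embeds `Config`, and the concrete types are assumptions made for this example and do not reflect the actual Talos interfaces.

```go
package main

import (
	"errors"
	"fmt"
)

// Config gives read access to the configuration itself.
// ClusterName is a hypothetical accessor used for illustration.
type Config interface {
	ClusterName() string
}

// Container wraps config documents and adds document-level concerns
// such as validation and encoding (assumed method set).
type Container interface {
	Config
	Validate() error
	Bytes() ([]byte, error)
}

// v1alpha1Config stands in for a concrete config document.
type v1alpha1Config struct {
	clusterName string
}

func (c *v1alpha1Config) ClusterName() string { return c.clusterName }

// container is a toy Container backed by a single document.
type container struct {
	doc *v1alpha1Config
}

func (c *container) ClusterName() string { return c.doc.ClusterName() }

func (c *container) Validate() error {
	if c.doc.clusterName == "" {
		return errors.New("cluster name is required")
	}

	return nil
}

func (c *container) Bytes() ([]byte, error) {
	return []byte("clusterName: " + c.doc.clusterName + "\n"), nil
}

func main() {
	var c Container = &container{doc: &v1alpha1Config{clusterName: "demo"}}

	fmt.Println(c.ClusterName(), c.Validate())
}
```

Code that only reads settings can then depend on the narrower `Config` interface, while code that loads or persists documents depends on `Container`.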
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This PR changes the default disk size for cloud images to 8GiB.
This was prompted because the disk price in Azure doubles between tiers
and the cutoff for the tier is 8GiB.
Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
Fixes #7246
The problem was that `udevd` watches via `inotify` any attempts to open
block devices with 'write' access.
Talos was opening with write access, but actually accessing as
read-only, so the fix is to open as read-only.
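
As an illustration of the fix, the sketch below opens a path with `os.O_RDONLY` and reads a few bytes from it. The `readHeader` and `demo` helpers are hypothetical and not taken from the Talos code; the demo uses a temporary file in place of a real block device.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
)

// readHeader opens path strictly read-only and returns up to n bytes from
// its start. Opening without write access avoids triggering watchers that
// react to a writable descriptor being closed.
func readHeader(path string, n int) ([]byte, error) {
	f, err := os.OpenFile(path, os.O_RDONLY, 0)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	buf := make([]byte, n)

	read, err := io.ReadFull(f, buf)
	if err != nil && !errors.Is(err, io.EOF) && !errors.Is(err, io.ErrUnexpectedEOF) {
		return nil, err
	}

	// A short read is fine: return only the bytes that were available.
	return buf[:read], nil
}

// demo exercises readHeader against a temporary file.
func demo() (string, error) {
	tmp, err := os.CreateTemp("", "blockdev-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name())

	if _, err := tmp.WriteString("hello world"); err != nil {
		tmp.Close()
		return "", err
	}

	if err := tmp.Close(); err != nil {
		return "", err
	}

	header, err := readHeader(tmp.Name(), 5)
	if err != nil {
		return "", err
	}

	return string(header), nil
}

func main() {
	header, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println(header)
}
```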
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
This PR adds a flag to imager that allows tweaking the size of the created disk. Additionally, it sets the default size of that disk to 10GB, as most images are cloud images that fail when uploaded because only a 1GB disk is currently picked up. It also adds some processing to the Makefile to make sure the default small value is set for metal images and SBCs.
Signed-off-by: Spencer Smith <spencer.smith@talos-systems.com>
This is controlled with a feature flag which gets enabled automatically
for Talos 1.5+.
Fixes #7181
If enabled, configures kubelet to use project quotas to track xfs volume
usage, which is much more efficient than doing `du` periodically.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Create Azure Community Gallery Image Version on release:
- Add /hack/cloud-image-uploader/azure.go
  - Upload vhd file to container for all architectures
  - Create managed disk from vhd file for all architectures
  - Create image version from managed disk for all architectures
- Modify /hack/cloud-image-uploader/main.go
  - Start Community Gallery processes concurrently with AWS upload
- Modify /hack/cloud-image-uploader/options.go
  - Add additional Options for Community Gallery processes
- Modify .drone.jsonnet to use secrets for environment variables
  - The following secrets need to be created for this to work:
    - azure_subscription_id
    - azure_client_id
    - azure_client_secret
    - azure_tenant_id
Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
chore: fix linting errors in readme
Fix linting errors in readme
Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
chore: fix markdown linting errors
Fix markdown linting errors in readme
Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
chore: fix markdown linting errors
Fix markdown linting errors in readme
Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
chore: change disk size to match new 10GB cloud image size
Change disk size to match 10GB cloud image size
Signed-off-by: Christian Rolland <christian.rolland@siderolabs.com>
Kubelet doesn't refresh self-issued serving certificates, so force it by
removing the cert on each restart.
Fix the code which forced a rejoin when the nodename changes: it was
broken, as it was checking the serving certificate instead of the client
certificate. It worked by accident when not using controlplane-issued
serving certificates.
Fixes #7235
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
* drop old resources API, which was deprecated a long time ago
* use bootstrapped event in `talosctl get --watch` to better align
columns in the table output
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
- github.com/containerd/typeurl to v2.1.1
- github.com/aws/aws-sdk-go to v1.44.264
- alpine to 3.18.0
- node to 20.2.0-alpine
- github.com/containernetworking/plugins to v1.3.0
- github.com/docker/docker to v23.0.6+incompatible
- github.com/hetznercloud/hcloud-go to v1.45.1
- github.com/insomniacslk/dhcp to v0.0.0-20230516061539-49801966e6cb
- github.com/rivo/tview to v0.0.0-20230511053024-822bd067b165
- tools to v1.5.0-alpha.0-7-gd2dde48
- pkgs to v1.5.0-alpha.0-16-g7958db1
Signed-off-by: Dmitriy Matrenichev <dmitry.matrenichev@siderolabs.com>
This PR changes the url used in the Makefile from a legacy
URL to point to the new community owned download host.
Signed-off-by: Ricky Sadowski <richard.j.sadowski@gmail.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
This reverts commit a2565f67416e9b9bc22f2d5506df9ea7771c0c8c.
The fix done in `a2565f67` was actually a no-op, caused by
misunderstanding the fix done in Go and backported to [Go 1.20.4](ecf7e00db8).
The fix gave false confidence that it was working when it was tested
against the Talos `main` branch, since PR #7190 bumped the `x/sys` package
from [v0.7.0 -> v0.8.0](ecf7e00db8). The actual change in `x/sys` can be found at ff18efa0a3, which meant that when updating Go to 1.20.4 the `x/sys` package should have been updated too. The `x/sys` package changed how the syscall to set the rlimit is called: it was moved into the Go stdlib instead of calling the rlimit syscall in the `x/sys` package. As a result, combining Go 1.20.4 with an older `x/sys` package means the `RLIMIT_NOFILE` value is not set back to the original value.
The Talos 1.4 release branch currently has `x/sys`
at [v0.7.0](https://github.com/siderolabs/talos/blob/v1.4.3/go.mod#L133),
so the backport would consist of this change along with another commit bumping the `x/sys` package to `v0.8.0`.
Fixes: #7198
Fixes: #7206
Co-authored-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Signed-off-by: Noel Georgi <git@frezbo.dev>
This bug is pretty cosmetic, but it shows up as a wrong check when
performing a worker upgrade: Talos pretends it checks e.g. the kube-apiserver
version, which doesn't make sense for workers.
There were two bugs in the code:
* check for machine type was done against `TypeWorker`, while
`MachineType` resource is initially created as `TypeUnknown`
* the cleanup code was not implemented
As I touched the code, I updated controller and tests to use modern
conventions.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Introduce a new resource, `SiderolinkConfig`, to store SideroLink connection configuration (api endpoint for now).
Introduce a controller for this resource which populates it from the Kernel cmdline.
Rework the SideroLink `ManagerController` to take this new resource as input and reconfigure the link on changes.
Additionally, if the siderolink connection is lost, reconnect to it and reconfigure the links/addresses.
Closes siderolabs/talos#7142, siderolabs/talos#7143.
Signed-off-by: Utku Ozdemir <utku.ozdemir@siderolabs.com>
Fixes #7159
The change looks big, but it's actually pretty simple inside: the static
pods had an annotation which tracks a version of the secrets which
forced control plane pods to reload on a change. At the same time
`kube-apiserver` can reload certificate inputs automatically from files
without restart.
So the inputs were split: the dynamic (for kube-apiserver) inputs don't
need to be reloaded, so their version is not tracked in the static pod
annotation and they don't cause a reload. The previous non-dynamic
resource still causes a reload, but it doesn't get updated when e.g.
node addresses change.
Much more refactoring could be done here, as the resource chain is a bit
of a mess, but I wanted to keep the number of changes minimal to keep
this backportable.
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Talos doesn't have `rpc.statd` running, so mounting without locking is
the only option. Some places in Kubernetes don't allow setting mount
options for NFS, so setting defaults is the only way.
Fixes #6582
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Ensure that we wait for as long as the kubelet shutdown timers allow.
Related to the fix for siderolabs#7138
Signed-off-by: Niklas Wik <niklas.wik@nokia.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Fixes #7137
The `umount` syscall might hang "forever" if the underlying network
filesystem endpoint is down.
To be on the safe side, add a timeout around unmount operations, and try
to umount with force as a last resort.
Sample log:
```
[14795.458779] [talos] task unmountPodMounts (2/2): unmounting /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/dbe8d7f58e21d06cbef1ae0849317661eba4e82776722e7db5c65194ad73e916/globalmount/0001-0009-rook-ceph-0000000000000001-1051beb3-8d7a-4291-bf45-5711c13523d1
[14795.459797] [talos] task unmountPodMounts (2/2): unmounting /var/lib/kubelet/pods/f3f4d789-7f48-4dd9-9ef5-649b002c8f9c/volumes/kubernetes.io~csi/pvc-a4e72749-a8a1-43d9-9152-5bc1f757c924/mount
[14795.460555] EXT4-fs (rbd0): unmounting filesystem.
[14813.461319] [talos] task unmountPodMounts (2/2): unmounting /var/lib/kubelet/pods/f3f4d789-7f48-4dd9-9ef5-649b002c8f9c/volumes/kubernetes.io~csi/pvc-a4e72749-a8a1-43d9-9152-5bc1f757c924/mount is taking longer than expected, still waiting for 1m11.999162834s
[14831.460813] [talos] task unmountPodMounts (2/2): unmounting /var/lib/kubelet/pods/f3f4d789-7f48-4dd9-9ef5-649b002c8f9c/volumes/kubernetes.io~csi/pvc-a4e72749-a8a1-43d9-9152-5bc1f757c924/mount is taking longer than expected, still waiting for 53.999567033s
[14849.461336] [talos] task unmountPodMounts (2/2): unmounting /var/lib/kubelet/pods/f3f4d789-7f48-4dd9-9ef5-649b002c8f9c/volumes/kubernetes.io~csi/pvc-a4e72749-a8a1-43d9-9152-5bc1f757c924/mount is taking longer than expected, still waiting for 35.998979117s
[14867.460748] [talos] task unmountPodMounts (2/2): unmounting /var/lib/kubelet/pods/f3f4d789-7f48-4dd9-9ef5-649b002c8f9c/volumes/kubernetes.io~csi/pvc-a4e72749-a8a1-43d9-9152-5bc1f757c924/mount is taking longer than expected, still waiting for 17.999502128s
[14885.461123] [talos] task unmountPodMounts (2/2): unmounting /var/lib/kubelet/pods/f3f4d789-7f48-4dd9-9ef5-649b002c8f9c/volumes/kubernetes.io~csi/pvc-a4e72749-a8a1-43d9-9152-5bc1f757c924/mount with force
[14885.462395] [talos] ignoring unmount error /var/lib/kubelet/pods/f3f4d789-7f48-4dd9-9ef5-649b002c8f9c/volumes/kubernetes.io~csi/pvc-a4e72749-a8a1-43d9-9152-5bc1f757c924/mount: invalid argument
[14885.463529] [talos] task unmountPodMounts (2/2): unmounting /var/run/netns/cni-0888dc71-ba9e-af8a-d322-074f654561e5
[14885.464267] [talos] task unmountPodMounts (2/2): done, 1m30.028862262s
```
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
API server takes care of setting priority for "regular" pods from
priorityClassName, but nothing does that for static pods, so we have to
specify the priority explicitly for static pods.
This fixes the graceful node shutdown (kubelet) to stop non-critical
pods before the api-server and friends (critical pods).
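
A small sketch of the idea follows, assuming a simplified `podSpec` type rather than the real Kubernetes API types. The two numeric values are the ones Kubernetes assigns to its built-in `system-cluster-critical` and `system-node-critical` priority classes; which class the static pods use is not stated here, so the sketch handles both.

```go
package main

import "fmt"

// Priority values of the built-in Kubernetes priority classes.
const (
	systemClusterCritical int32 = 2000000000
	systemNodeCritical    int32 = 2000001000
)

// podSpec is a hypothetical, trimmed-down pod spec for illustration.
type podSpec struct {
	PriorityClassName string
	Priority          *int32
}

// setStaticPodPriority fills in the numeric priority explicitly, since for
// static pods nothing resolves it from the priority class name.
func setStaticPodPriority(spec *podSpec) error {
	var value int32

	switch spec.PriorityClassName {
	case "system-cluster-critical":
		value = systemClusterCritical
	case "system-node-critical":
		value = systemNodeCritical
	default:
		return fmt.Errorf("unknown priority class %q", spec.PriorityClassName)
	}

	spec.Priority = &value

	return nil
}

func main() {
	spec := podSpec{PriorityClassName: "system-cluster-critical"}

	if err := setStaticPodPriority(&spec); err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println(spec.PriorityClassName, *spec.Priority)
}
```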
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Describe scaling down a Talos cluster.
Signed-off-by: Steve Francis <steve.francis@talos-systems.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
Adds back the required TARGETARCH for the installer so that extensions can be built off the installer again, as building the nvidia nonfree extension was broken.
Fixes: #7155
Refs: #7115
Signed-off-by: Michael A. Davis <6325127+mrmichaeladavis@users.noreply.github.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>