talos

Author	SHA1	Message	Date
Andrew Rynhard	6a85a47ffa	feat: upgrade containerd to v1.4.0 This brings in the latest stable containerd. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-09-04 02:59:08 -07:00
Andrey Smirnov	f6ecf000c9	refactor: extract packages loadbalancer and retry This removes in-tree packages in favor of: * github.com/talos-systems/go-retry * github.com/talos-systems/go-loadbalancer Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-09-02 13:46:22 -07:00
Andrey Smirnov	ac4ab11d36	chore: update k8s modules to 1.19 final version I think we missed that when updating K8s for the final version. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-09-02 06:25:28 -07:00
Andrey Smirnov	6a7cc02648	fix: handle bootkube recover correctly, support recovery from etcd Bootkube recover process (and `talosctl recover`) was actually regenerating assets each time `recover` runs forcing control plane to be at the state when cluster got created. This PR fixes that by running recover process correctly. Recovery via etcd was fixed to handle encrypted etcd data: it follows the way `apiserver` handles encryption at rest, and as at the moment AES CBC is the only supported encryption method, code simply follows the same path. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-08-18 14:24:14 -07:00
Andrey Smirnov	7875e9499f	chore: re-import talos-systems/pkg/crypto/tls See also https://github.com/talos-systems/crypto/pull/2 This should break dependency of `pkg/client` on `pkg/grpc`. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-08-17 08:06:38 -07:00
Andrey Smirnov	2697b99b7d	refactor: extract `pkg/net` as `github.com/talos-systems/net` This extracts common package as new module/repository. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-08-14 11:04:50 -07:00
Andrey Smirnov	52c5911fcd	chore: extract pkg/crypto as external module Package `pkg/crypto` was extracted as `github.com/talos-systems/crypto` repository and Go module. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-08-14 06:33:30 -07:00
Andrey Smirnov	7474b8ba52	feat: upgrade etcd to 3.4.10 This upgrades etcd to latest v3.4.x version as smooth upgrade from version 3.3.22 in 0.6. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-08-13 07:33:51 -07:00
Andrew Rynhard	849959fefc	feat: add dynamic config decoder This adds the ability to dynamically decode multi-doc YAML files. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-07-30 08:07:14 -07:00
Andrey Smirnov	3926442704	feat: taint master nodes with `NoSchedule` taint Fixes #2350 This also brings in a fix for `coredns` tolerations from https://github.com/talos-systems/bootkube-plugin/pull/19. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-29 14:02:41 -07:00
Andrew Rynhard	1b491d0a66	feat: upgrade Kubernetes to v1.19.0-rc.3 This brings in the latest version of Kubernetes. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-07-29 11:04:50 -07:00
Andrey Smirnov	f23c9111d1	feat: upgrade etcd to 3.3.22 version Latest version in 3.3 branch is 3.3.23, but it's broken, so we use previous stable version. Switch to official etcd gcr.io registry, early support for arm64. Move `etcd` service to run in system containerd. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-21 09:44:43 -07:00
Andrey Smirnov	4cd6e7e200	refactor: use `humanize.Bytes` everywhere This removes dependency on `bytefmt` package. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-20 07:26:33 -07:00
Andrey Smirnov	1a0e1bc393	chore: update module dependencies Fixes #2316 Simply update dependencies we don't track on version level to be compatible with Talos components (like etcd or k8s). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-16 12:00:50 -07:00
Andrew Rynhard	0617a10027	feat: upgrade Kubernetes to v1.19.0-rc.0 This brings in the latest version of Kubernetes. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-07-14 13:07:18 -07:00
Artem Chernyshev	8fc352ec4f	feat: merge mode in talosctl kubeconfig New flag `-m` will enable merge mechanism in `talosctl kubeconfig`. Command examples: `talosctl kubeconfig -m` and `talosctl kubeconfig -m ~/.kube/config`. Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>	2020-07-10 12:39:30 -07:00
Andrey Smirnov	4cc074cdba	feat: implement API access to event history 1. Add [xid-based](https://github.com/rs/xid) event IDs. Xids are sortable and unique enough. Xids also encode event publishing time with a second precision. 2. Add three ways to look back into event history: based on number of events, on time and ID. Lookup via ID might be used to restart event polling in case of broken API connection from the same moment. 3. Reimplement core event buffer with positions which are always incremented instead of generation+index, this implementation is much more simple (idea from circular buffer). 4. By default, Events API works the same - it shows no history and starts streaming new events only. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-07-08 10:54:50 -07:00
Andrew Rynhard	a5a2d959ed	feat: upgrade runc to v1.0.0-rc90 This updates runc to the same version vendored by containerd. Signed-off-by: Andrew Rynhard <andrew@rynhard.io>	2020-07-02 13:19:33 -07:00
Spencer Smith	90115bb3ef	feat: update kubernetes to 1.19.0-beta.1 This PR brings in all changes necessary to deploy kubernetes 1.19.x. It relies on an update to our bootkube-plugin project, as well as implementation of some Image() functions for our various control plane components, since they are all distinct images and not just hyperkube. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-06-10 15:01:11 -04:00
Andrey Smirnov	d1a4e6ee64	feat: adjust time properly in timed via adjtime() This should be the proper way to adjust time incrementally without causing jumps in the +/- direction. Time-sensitive services might be confused by huge jumps. This also implements timed health check based on first successful time sync. Fixed some random health check related issues in other services. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-06-03 09:39:48 -07:00
Andrew Rynhard	9412e2b478	fix: allow all seccomp profile names This updates the bootkube plugin that brings in a fix that allows any seccomp profile name to be used. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-05-19 18:48:22 -07:00
Andrew Rynhard	56d7bf19fe	feat: add recovery API This adds an API for recovering the self-hosted control plane. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-05-04 19:38:30 -07:00
Seán C McCord	c1299d3ff0	feat: allow dual-stack support with bootkube wrapper Handle dual-stack configurations with the bootkube wrapper. This uses the new PodCIDRs and ServiceCIDRs `asset.Config` parameters in bootkube. It also relies on the bootkube-plugin features for manipulating kube-proxy config and installing the dual-stack DNS service. Fixes #2055 Signed-off-by: Seán C McCord <ulexus@gmail.com>	2020-04-28 20:10:58 -07:00
Andrew Rynhard	49307d554d	refactor: improve machined This is a rewrite of machined. It addresses some of the limitations and complexity in the implementation. This introduces the idea of a controller. A controller is responsible for managing the runtime, the sequencer, and a new state type introduced in this PR. A few highlights are: - no more event bus - functional approach to tasks (no more types defined for each task) - the task function definition now offers a lot more context, like access to raw API requests, the current sequence, a logger, the new state interface, and the runtime interface. - no more panics to handle reboots - additional initialize and reboot sequences - graceful gRPC server shutdown on critical errors - config is now stored at install time to avoid having to download it at install time and at boot time - upgrades now use the local config instead of downloading it - the upgrade API's preserve option takes precedence over the config's install force option Additionally, this pulls various packages in under machined to make the code easier to navigate. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-04-28 08:20:55 -07:00
Andrew Rynhard	d34e9f0984	fix: pass dev path to mkfs This fixes a bug caused by a missing device argument to `mkfs.xfs`. Without a device, `mkfs.xfs` will error out. Additionally, this ensures that the installer container is started with the `kmsg` writer that ensures logs are formatted correctly for `/dev/kmsg`. Without this we lose a lot of the logs output by the container, one of them being any error from `mkfs.xfs` Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-04-21 16:30:35 -07:00
Andrew Rynhard	3791fb5cbc	refactor: use upstream bootkube This moves us off of our bootkube fork and makes use of our bootkube plugin. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-04-21 10:27:01 -07:00
Spencer Smith	3a4eaeeef0	feat: upgrade kubernetes to 1.18 This PR will pull in the latest release of k8s 1.18 so we can start validating it through our test suite. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-03-26 14:59:43 -04:00
Andrew Rynhard	eba80b453f	feat: update bootkube This brings in the latest changes from our fork of bootkube. One thing to note is a fix that stops the pod controller cache object. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-03-24 15:53:08 -07:00
Spencer Smith	3485ea9f09	fix: update k8s to 1.17.3 This PR will update k8s to v1.17.3 to address CVEs mentioned in https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/kubernetes-security-announce/2UOlsba2g0s Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-03-23 17:08:52 -07:00
Spencer Smith	2f4ccfda9a	fix: respect dns domain from machine config BREAKING CHANGE: This PR fixes a bug where we were only passing `cluster.local` to the kubelet configuration. It will also pull in a new version of the bootkube fork to ensure that custom domains get propagated down to the API Server certs, as well as the CoreDNS configuration for a cluster. Existing users should be aware that, if they were previously trying to use this option in machine configs, an upgrade may break their cluster. It will update a kubelet flag with the new domain, but CoreDNS and API Server certs will not change since bootkube has already run. One option may be to change these values manually inside the Kubernetes cluster. However, it may prove easier to rebuild the cluster if necessary. Additionally, this PR also exposes a flag to `osctl config generate` to allow tweaking this domain value as well. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-03-20 12:28:17 -04:00
Andrey Smirnov	7e136fee67	chore: update Firecracker Go SDK to the official release This release includes all the fixes we upstreamed before. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-03-18 01:11:46 +03:00
Andrey Smirnov	d5d3035c8c	test: enable upgrade tests 0.4.x -> latest With the fix #1904, it's now possible to upgrade 0.4.x with `machine.File` extra files (caused by registry mirror for registry.ci.svc). Bump resources for upgrade tests in attempt to speed it up. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-02-26 00:09:32 +03:00
Andrew Rynhard	64b5b32732	refactor: use go-procfs This makes use of the external procfs package that is based on the package we are removing here. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-02-19 15:58:57 -08:00
Andrew Rynhard	63ca83a02c	feat: support sending machine info This allows users to specify well known query parameters in `talos.config`. The only supported parameter in this change is `uuid`. This will send the node's UUID determined from SMBIOS along with the request for the config. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-02-19 13:15:28 -08:00
Andrew Rynhard	fe7847e0b8	feat: add reboot flag to reset API This adds the ability to automatically reboot a machine after a reset. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-02-19 05:10:58 -08:00
Andrey Smirnov	33332f4c74	chore: support bootloader emulation in firecracker provisioner Firecracker launcher tries to open VM disk image before every boot, parses partition table, finds boot partition, tries to read it as FAT32 filesystem, extracts uncompressed kernel from `bzImage` (firecracker doesn't support `bzImage` yet), extracts initramfs and passes it to firecracker binary. This flow allows for extended tests, e.g. testing installer, upgrade and downgrade tests, etc. Bootloader emulation is disabled by default for now, can be enabled via `--with-bootloader-emulation` flag to `osctl cluster create`. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-02-13 23:21:37 +03:00
Andrey Smirnov	76c2038b13	chore: implement loadbalancer for firecracker provisioner This PR contains generic simple TCP loadbalancer code, and glue code for firecracker provisioner to use this loadbalancer. K8s control plane is passed through the load balancer, and Talos API is passed only to the init node (for now, as some APIs, including kubeconfig, don't work with non-init node). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-02-13 23:07:13 +03:00
Andrey Smirnov	4950f35440	chore: use upstream version of Firecracker Go SDK With all our PRs merged, we can switch back to upstream version. No tag yet, so we have to follow `master` for now. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-02-04 08:59:39 -08:00
Spencer Smith	e27b0cbfdb	chore: update bootkube This PR updates the talos branch of bootkube to add extraArgs to bootstrap controlplane components as well. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-01-31 11:38:34 -08:00
Spencer Smith	ff393f8ae3	chore: update bootkube fork This PR will pull in the latest of our bootkube fork and fix a bug with extraArgs. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-01-31 09:39:43 -05:00
Andrey Smirnov	fae5e6915d	chore: rework firecracker code around upstream Go SDK + PRs This removes use of private fork with custom `ip=` kernel argument handling and switches fully to upstream version of it. Firecracker Go SDK version is `master` + following PRs: * https://github.com/firecracker-microvm/firecracker-go-sdk/pull/167 * https://github.com/firecracker-microvm/firecracker-go-sdk/pull/177 * https://github.com/firecracker-microvm/firecracker-go-sdk/pull/178 MTU handling support was implemented as well. Changes: * hostname to each node is passed via `talos.hostname=` kernel arg * IP configuration is generated by SDK from CNI result * fixed bugs with wrong netmask * nameservers & MTU is passed via Talos config Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-01-29 02:35:15 +03:00
Spencer Smith	aabd46e651	fix: re-enable control plane flags This PR aims to fix the ability to pass extra flags to control plane components. This will close #1523 Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-01-23 14:28:02 -05:00
Andrew Rynhard	f3623d22b0	refactor: use tls.Config as client credentials The `client.Creds` struct was not used very often, and made using the `client.NewClient` function impossible to use in combination with the `RemoteRenewingFileCertificateProvider`. This modifies `client.NewClient` to accept a `tls.Config` instead of `client.Creds`, allowing for the use of `RemoteRenewingFileCertificateProvider` with `client.NewClient`. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-01-21 17:10:07 -08:00
Spencer Smith	1368bfa451	chore: update bootkube config to include cluster name This PR will add the new cluster name field to our bootkube options. This allows for the generated kubeconfig to include the context-name for the default context. Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>	2020-01-21 16:56:57 -05:00
Andrey Smirnov	2bf8540855	test: provision Talos clusters via Firecracker VMs This is initial PR to push the initial code, it has several known problems which are going to be addressed in follow-up PRs: 1. there's no "cluster destroy", so the only way to stop the VMs is to `pkill firecracker` 2. provisioner creates state in `/tmp` and never deletes it, that is required to keep cluster running when `osctl cluster create` finishes 3. doesn't run any controller process around firecracker to support reboots/CNI cleanup (vethxyz interfaces are lingering on the host as they're never cleaned up) The plan is to create some structure in `~/.talos` to manage cluster state, e.g. `~/.talos/clusters/<name>` which will contain all the required files (disk images, file sockets, VM logs, etc.). This directory structure will also work as a way to detect running clusters and clean them up. For point number 3, `osctl cluster create` is going to exec lightweight process to control the firecracker VM process and to simulate VM reboots if firecracker finishes cleanly (when VM reboots). Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2020-01-16 00:27:08 +03:00
Brad Beam	95666900a7	fix: Update bootkube to include node ready check This ensures bootkube waits until all pods and nodes are ready before tearing down the bootstrap control plane. Signed-off-by: Brad Beam <brad.beam@talos-systems.com>	2020-01-14 16:51:53 -06:00
Andrew Rynhard	5cac4f5f39	fix: set kube-dns labels This updates CoreDNS to use the 'kube-dns' naming convention. This naming convention is used throughout the Kubernetes documentation. This also fixes the kube-dns service. The label selector was wrong, breaking cluster DNS. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-01-09 18:29:35 -08:00
Andrew Rynhard	79878c1d8d	feat: enable DynamicKubeletConfiguration This moves to using the KubeletConfiguration instead of flags to the kubelet. It also enables DynamicKubeletConfiguration, which allows users to configure kubelets using a ConfigMap. Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>	2020-01-08 16:06:44 -08:00
Brad Beam	0742e5245a	feat: Upgrade bootkube This brings in the changes to run controller manager and scheduler as a daemonset. Signed-off-by: Brad Beam <brad.beam@talos-systems.com>	2020-01-08 15:41:37 -08:00
Andrey Smirnov	0081ac5fac	refactor: extract Talos cluster provisioner as common code This extracts Docker Talos cluster provisioner as common code which might be shared between `osctl cluster` and integration-test. There should be almost no functional changes. As proof of concept, abstract cluster readiness checks were implemented based on provisioned cluster state. It implements same checks as `basic-integration.sh` in pure Go via Talos/K8s clients. `conditions` package was promoted from machined-internal to `internal/pkg` as it is used to run the checks. Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>	2019-12-27 12:14:19 -08:00

133 Commits