213 Commits

Author SHA1 Message Date
Spencer Smith
4238d4428b feat: update kubernetes to v1.19.0
This PR version bumps all of the kubnernetes version defaults to the
v1.19.0 release.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-08-26 15:30:36 -07:00
Andrew Rynhard
83aa3bd3ab chore: bump next version to v0.6.0-beta.2
This updates the "next" version in our integration tests.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-08-21 01:44:26 -07:00
Andrey Smirnov
bddd4f1bf6 refactor: move external API packages into machinery/
This moves `pkg/config`, `pkg/client` and `pkg/constants`
under `pkg/machinery` umbrella.

And `pkg/machinery` is published as Go module inside Talos repository.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-17 09:56:14 -07:00
Andrey Smirnov
050d34275a chore: integrate importvet
This integrates [importvet](https://github.com/talos-systems/importvet)
into `lint` target.

First rule file was added for public packages `pkg/` which shouldn't
depend on other parts of Talos tree (except for the API definitions).

Only one change: `internal/cis` was moved under single user -
`pkg/config/internal/cis` to satisfy the rules.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-08-11 13:19:15 -07:00
Andrew Rynhard
16c8f167c4 chore: update packages
This bring in:

- Go v1.14.7
- Linux v5.7.14

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-08-08 09:18:18 -07:00
Andrey Smirnov
a5d64d97c1 test: update qemu/firecracker provisioners
Fixes #2363 #2364 #2370 #2371

Several changes packed together:

* use compressed `vmlinuz` everywhere, firecracker provisioner
uncompresses it before first use, drop `vmlinux`

* handle reboots in qemu launcher to support reset API case, update
empty disk check to handle reset behavior (erasing partition table)

* make bootloader support default in provisioners, and flag to disable
that

* early support for target architecture for qemu provisioner

This should allow us to use `qemu` in CI/CD (not included into this PR):
integration test passes with qemu.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-30 21:17:25 +03:00
Andrew Rynhard
1b491d0a66 feat: upgrade Kubernetes to v1.19.0-rc.3
This brings in the latest version of Kubernetes.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-29 11:04:50 -07:00
Andrew Rynhard
6f5d24cc3d chore: add release notes
This ensures that releases have notes.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-28 14:14:33 -07:00
Andrey Smirnov
2770d6414c test: upgrade versions the upgrade tests are operating on
This bumps next version to the latest 0.6 alpha and latest 0.5.

This also enables single node preserve test.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-28 12:35:37 -07:00
Andrey Smirnov
76c44ac468 test: remove apid load balancer for firecracker
We're not using load balancer for `apid` (always using client-side load
balancing), so we can remove this safely.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-28 20:21:21 +03:00
Andrey Smirnov
b5b70ec858 chore: upgrade pkgs and tools for Go 1.14.6
This also brings in multi-arch pkgs and tools, but we're not consuming
arm64 images yet.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-27 12:33:53 -07:00
Andrew Rynhard
1f31d24e55 chore: use Kubernetes pipelines
This moves to using Kubernetes pipelines.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-27 12:09:53 -07:00
Andrey Smirnov
41d5f7859a chore: update golangci-lint to 1.28.3
Fixes #2272

`gofumpt` is now included into `golangci-lint`, but not the
`gofumports`, so we keep it using it as separate binary, but we keep
versions in sync with `golangci-lint`.

This contains fixes from:

* `gofumpt` (automated, mostly around octal constants)
* `exhaustive` in `switch` statements
* `noctx` (adding context with default timeout to http requests)

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-16 08:05:42 -07:00
Andrey Smirnov
e82895ccc5 chore: upgrade Go to 1.14.5
go1.14.5 (released 2020/07/14) includes security fixes to the
crypto/x509 and net/http packages.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-16 07:05:54 -07:00
Spencer Smith
f290f88160 chore: update clusterctl for CI testing
This PR brings in the latest version of clusterctl that has built-in
support for the talos repos. I'll be chasing this with a move to using
the control-plane provider as well!

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-07-15 19:33:59 -04:00
Andrew Rynhard
0617a10027 feat: upgrade Kubernetes to v1.19.0-rc.0
This brings in the latest version of Kubernetes.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-14 13:07:18 -07:00
Andrew Rynhard
a5a2d959ed feat: upgrade runc to v1.0.0-rc90
This updates runc to the same version vendored by containerd.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-02 13:19:33 -07:00
Andrey Smirnov
3ae5e0e749 test: add short integration test with custom CNI
This adds new flug to `cluster create` to launch cluster with custom
CNI, `integration` pipeline gets a new step to run short test with
Cilium 1.8.0 CNI.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-01 11:19:19 -07:00
Andrey Smirnov
e46a09f56a chore: make default pipeline run shorter integration test
This moves full integratation test and provision tests to
the `integration` pipeline.

Docker test wasn't affected much, as anyways docker can't run long
integration tests, so it mostly affects firecracker and provision tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-01 00:14:55 +03:00
Andrey Smirnov
197369bdc1 fix: make installer re-read partition table before formatting
This hopefully should fix errors like:

```
2020/06/25 18:23:22 attaching loopback device
2020/06/25 18:23:22 partitioning /dev/loop2 - ESP
2020/06/25 18:23:22 partitioning /dev/loop2 - EPHEMERAL
2020/06/25 18:23:22 formatting partition "/dev/loop2p1" as "fat" with label "ESP"
2020/06/25 18:23:22 detaching loopback device
2020/06/25 18:23:22 failed to format device: exit status 1: mkfs.vfat: can't open '/dev/loop2p1': No such file or directory
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-06-26 12:22:00 -07:00
Andrew Rynhard
d0d2ac3c74 test: default to using the bootstrap API
This moves our test scripts to using the bootstrap API. Some
automation around invoking the bootstrap API was also added
to give the same ease of use when creating clusters with the
CLI.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-06-24 08:46:10 -07:00
Spencer Smith
90115bb3ef feat: update kubernetes to 1.19.0-beta.1
This PR brings in all changes necessary to deploy kubernetes 1.19.x.

It relies on an update to our bootkube-plugin project, as well as
implementation of some Image() functions for our various control plane
components, since they are all distinct images and not just hyperkube.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-06-10 15:01:11 -04:00
Spencer Smith
e03a68f8eb feat: update k8s and sonobuoy versions
This PR will update k8s to the latest 1.18 release and bump sonobuoy to
help resolve some e2e flakes. Also adds some retry logic around the
sonobuoy run.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-06-10 06:47:36 -07:00
Andrew Rynhard
77150f51cf chore: update provision test versions
This adds latest 0.6 alpha and 0.5 stable to the upgrade tests.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-05-29 14:58:54 -07:00
Andrey Smirnov
2fb00344ab chore: upgrade Go to 1.14.3 and use toolchain for race detector
With Go 1.14.3 we can run race-enabled code on muslc, so this opens path
to run unit-tests-race under Talos environment with rootfs, enabling all
the tests to run under race detector.

Also fixed the tests run by specifying platform in the test environment.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-05-25 08:35:11 -07:00
Andrey Smirnov
652531853f test: update Talos versions for upgrade tests
Our policy it to support two last releases (0.4, 0.5 at the moment).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-05-20 07:43:10 -07:00
Spencer Smith
c1b6f05b00 chore: use clusterctl and v1alpha3 providers for tests
This PR will update our testing ocde to make use of the clusterctl tool,
as well as use the newer versions of various providers and updated
manifests.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-05-01 07:42:19 -07:00
Andrew Rynhard
49307d554d refactor: improve machined
This is a rewrite of machined. It addresses some of the limitations and
complexity in the implementation. This introduces the idea of a
controller. A controller is responsible for managing the runtime, the
sequencer, and a new state type introduced in this PR.

A few highlights are:

- no more event bus
- functional approach to tasks (no more types defined for each task)
  - the task function definition now offers a lot more context, like
    access to raw API requests, the current sequence, a logger, the new
    state interface, and the runtime interface.
- no more panics to handle reboots
- additional initialize and reboot sequences
- graceful gRPC server shutdown on critical errors
- config is now stored at install time to avoid having to download it at
  install time and at boot time
- upgrades now use the local config instead of downloading it
- the upgrade API's preserve option takes precedence over the config's
  install force option

Additionally, this pulls various packes in under machined to make the
code easier to navigate.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-28 08:20:55 -07:00
Andrew Rynhard
a10acd592a chore: address random CI nits
This PR does the following:

- updates the conform config
- cleans up conform scopes
- moves slash commands to the talos-bot
- adds a check list to the pull request template
- disables codecov comments
- uses `BOT_TOKEN` so all actions are performed as the talos-bot user
- adds a `make conformance` target to make it easy for contributors to
check their commit before creating a PR
- bumps golangci-lint to v1.24.0

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-13 13:01:14 -07:00
Andrey Smirnov
7fd19fd3b6 feat: upgrade Go to 1.14.2
https://github.com/talos-systems/tools/pull/91

https://github.com/talos-systems/pkgs/pull/114

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-04-09 10:15:58 -07:00
Andrey Smirnov
ff2267eb99 test: update versions used for upgrade tests
We should stick to the latest version in each release series.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-04-07 15:51:56 -07:00
Andrey Smirnov
2d5c6f4c10 test: serialize docs step execution
`make docs` removes and then regenerates contents of some docs, so it
might cause random `-dirty` issue when running concurrently with build
steps.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-04-07 23:46:16 +03:00
Spencer Smith
502611f28e chore: update sonobuoy to v0.18.0
This PR will bring in the latest sonobuoy that is designed for the 1.18
branch on kubernetes.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-04-06 15:11:40 -04:00
Spencer Smith
3a4eaeeef0 feat: upgrade kubernetes to 1.18
This PR will pull in the latest release of k8s 1.18 so we can start
validating it through our test suite.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-03-26 14:59:43 -04:00
Andrey Smirnov
e38cde9b48 chore: update upgrade tests for new version, split into two tracks
This updates upgrade tests to run two flows with 3+1 clusters:

1. 0.3 -> current (testing upgrade with partition wiping)
2. 0.4-alpha.7 -> current (testing upgrade without partition wiping,
boot-a/boot-b)

And small upgrade with preserve enabled for single-node cluster.

Provision tests are now split into two parallel tracks in Drone.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-24 15:30:00 -07:00
Spencer Smith
3485ea9f09 fix: update k8s to 1.17.3
This PR will update k8s to v1.17.3 to address CVEs mentioned in https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/kubernetes-security-announce/2UOlsba2g0s

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-03-23 17:08:52 -07:00
Andrew Rynhard
5dbc26c7a3 feat: rename osctl to talosctl
This is a rename of the osctl binary. We decided that talosctl is a
better name for the Talos CLI. This does not break any APIs, but does
make older documentation only accurate for previous versions of Talos.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-03-20 19:07:39 -07:00
Andrey Smirnov
a1350aa819 feat: upgrade Go to version 1.14.1
Fixes #1934

See talos-systems/pkgs#106, talos-systems/tools#90

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-20 21:42:47 +03:00
Andrey Smirnov
e6dc87dfa4 chore: update pkgs & tools for Go 1.14
See also:

* https://github.com/talos-systems/tools/pull/89
* https://github.com/talos-systems/pkgs/pull/103

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-27 01:15:46 +03:00
Andrey Smirnov
923ef4537b test: implement new class of tests: provision tests (upgrades)
This class of tests is included/excluded by build tags, but as it is
pretty different from other integration tests, we build it as separate
executable. Provision tests provision cluster for the test run, perform
some actions and verify results (could be upgrade, reset, scale up/down,
etc.)

There's now framework to implement upgrade tests, first of the tests
tests upgrade from latest 0.3 (0.3.2 at the moment) to current version
of Talos (being built in CI). Tests starts by booting with 0.3
kernel/initramfs, runs 0.3 installer to install 0.3.2 cluster, wait for
bootstrap, followed by upgrade to 0.4 in rolling fashion. As Firecracker
supports bootloader, this boots 0.4 system from boot disk (as installed
by installer).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-21 07:04:03 -08:00
Niklas Wik
08b1a782cd feat: support proxy in docker buildx
This allows building when http(s) proxy is enforced to download content on the build machine

Signed-off-by: Niklas Wik <niklas.wik@nokia.com>
2020-02-20 05:35:17 -08:00
Andrey Smirnov
5f330f1f64 chore: push installer & talos images to the CI registry on every build
This enables a way to run the matching installer image in firecracker
tests. New image is used in firecracker tests and bootloader support to
use installed kernel/initramfs, which opens path for upgrade tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-18 07:32:45 -08:00
Andrey Smirnov
f51e9a14fe chore: build app container images skipping export to host
Container images for `apid`, `networkd`, etc. are now built inside the
buildkit using the `img` tool. This means that all the dependencies are
now controlled in `buildkit` and many more stages can run in parallel
without problems (overwriting content in `_out/images`).

This also simplifies Drone configuration, as we can let buildkit handle
the dependencies. I also enabled more stages to run in parallel.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-14 13:17:25 -08:00
Andrew Rynhard
88667641df chore: refactor E2E scripts
This PR aims to simplify our E2E scripts.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-26 20:47:25 -08:00
Andrew Rynhard
f87c6d74d3 chore: use firecracker in basic-integration
This adds a basic integration step that uses firecracker.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-23 05:52:22 -08:00
Andrew Rynhard
a0d8656ca0 chore: use v0.1.0 tools and pkgs
This brings in the official v0.1.0 releases of tools and pkgs.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-20 07:53:08 -08:00
Andrey Smirnov
2bf8540855 test: provision Talos clusters via Firecracker VMs
This is initial PR to push the initial code, it has several known
problems which are going to be addressed in follow-up PRs:

1. there's no "cluster destroy", so the only way to stop the VMs is to
`pkill firecracker`

2. provisioner creates state in `/tmp` and never deletes it, that is
required to keep cluster running when `osctl cluster create` finishes

3. doesn't run any controller process around firecracker to support
reboots/CNI cleanup (vethxyz interfaces are lingering on the host as
they're never cleaned up)

The plan is to create some structure in `~/.talos` to manage cluster
state, e.g. `~/.talos/clusters/<name>` which will contain all the
required files (disk images, file sockets, VM logs, etc.). This
directory structure will also work as a way to detect running clusters
and clean them up.

For point number 3, `osctl cluster create` is going to exec lightweight
process to control the firecracker VM process and to simulate VM reboots
if firecracker finishes cleanly (when VM reboots).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-01-16 00:27:08 +03:00
Andrey Smirnov
810e9b418b chore: bump tools/pkgs for Go 1.13.6
Ref: https://github.com/talos-systems/tools/pull/85,
https://github.com/talos-systems/pkgs/pull/87

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-01-13 20:55:17 +03:00
Andrew Rynhard
d824d0bfdb chore: publish boot.tar.gz
This adds a convenience tarball that includes vmlinuz, and initramfs.xz
in a single tarball.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-09 12:38:21 -08:00
Andrew Rynhard
794d9e6066 chore: update all target in Makefile
We should build the most common things by default.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-06 11:08:27 -08:00