203 Commits

Author SHA1 Message Date
Andrey Smirnov
b5b70ec858 chore: upgrade pkgs and tools for Go 1.14.6
This also brings in multi-arch pkgs and tools, but we're not consuming
arm64 images yet.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-27 12:33:53 -07:00
Andrew Rynhard
1f31d24e55 chore: use Kubernetes pipelines
This moves to using Kubernetes pipelines.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-27 12:09:53 -07:00
Andrey Smirnov
41d5f7859a chore: update golangci-lint to 1.28.3
Fixes #2272

`gofumpt` is now included in `golangci-lint`, but `gofumports` is not,
so we keep using it as a separate binary, with its version kept in sync
with `golangci-lint`.

This contains fixes from:

* `gofumpt` (automated, mostly around octal constants)
* `exhaustive` in `switch` statements
* `noctx` (adding context with default timeout to http requests)

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-16 08:05:42 -07:00
Andrey Smirnov
e82895ccc5 chore: upgrade Go to 1.14.5
go1.14.5 (released 2020/07/14) includes security fixes to the
crypto/x509 and net/http packages.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-16 07:05:54 -07:00
Spencer Smith
f290f88160 chore: update clusterctl for CI testing
This PR brings in the latest version of clusterctl that has built-in
support for the talos repos. I'll be chasing this with a move to using
the control-plane provider as well!

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-07-15 19:33:59 -04:00
Andrew Rynhard
0617a10027 feat: upgrade Kubernetes to v1.19.0-rc.0
This brings in the latest version of Kubernetes.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-14 13:07:18 -07:00
Andrew Rynhard
a5a2d959ed feat: upgrade runc to v1.0.0-rc90
This updates runc to the same version vendored by containerd.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-02 13:19:33 -07:00
Andrey Smirnov
3ae5e0e749 test: add short integration test with custom CNI
This adds a new flag to `cluster create` to launch a cluster with a custom
CNI; the `integration` pipeline gets a new step to run a short test with
Cilium 1.8.0 CNI.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-01 11:19:19 -07:00
Andrey Smirnov
e46a09f56a chore: make default pipeline run shorter integration test
This moves the full integration test and provision tests to
the `integration` pipeline.

The Docker test isn't affected much, as Docker can't run long integration
tests anyway, so this mostly affects firecracker and provision tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-01 00:14:55 +03:00
Andrey Smirnov
197369bdc1 fix: make installer re-read partition table before formatting
This hopefully should fix errors like:

```
2020/06/25 18:23:22 attaching loopback device
2020/06/25 18:23:22 partitioning /dev/loop2 - ESP
2020/06/25 18:23:22 partitioning /dev/loop2 - EPHEMERAL
2020/06/25 18:23:22 formatting partition "/dev/loop2p1" as "fat" with label "ESP"
2020/06/25 18:23:22 detaching loopback device
2020/06/25 18:23:22 failed to format device: exit status 1: mkfs.vfat: can't open '/dev/loop2p1': No such file or directory
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-06-26 12:22:00 -07:00
Andrew Rynhard
d0d2ac3c74 test: default to using the bootstrap API
This moves our test scripts to using the bootstrap API. Some
automation around invoking the bootstrap API was also added
to give the same ease of use when creating clusters with the
CLI.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-06-24 08:46:10 -07:00
Spencer Smith
90115bb3ef feat: update kubernetes to 1.19.0-beta.1
This PR brings in all changes necessary to deploy kubernetes 1.19.x.

It relies on an update to our bootkube-plugin project, as well as
implementation of some Image() functions for our various control plane
components, since they are all distinct images and not just hyperkube.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-06-10 15:01:11 -04:00
Spencer Smith
e03a68f8eb feat: update k8s and sonobuoy versions
This PR will update k8s to the latest 1.18 release and bump sonobuoy to
help resolve some e2e flakes. Also adds some retry logic around the
sonobuoy run.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-06-10 06:47:36 -07:00
Andrew Rynhard
77150f51cf chore: update provision test versions
This adds latest 0.6 alpha and 0.5 stable to the upgrade tests.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-05-29 14:58:54 -07:00
Andrey Smirnov
2fb00344ab chore: upgrade Go to 1.14.3 and use toolchain for race detector
With Go 1.14.3 we can run race-enabled code on muslc, so this opens the path
to running unit-tests-race under the Talos environment with rootfs, enabling
all the tests to run under the race detector.

Also fixed the tests run by specifying platform in the test environment.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-05-25 08:35:11 -07:00
Andrey Smirnov
652531853f test: update Talos versions for upgrade tests
Our policy is to support the last two releases (0.4 and 0.5 at the moment).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-05-20 07:43:10 -07:00
Spencer Smith
c1b6f05b00 chore: use clusterctl and v1alpha3 providers for tests
This PR will update our testing code to make use of the clusterctl tool,
as well as use the newer versions of various providers and updated
manifests.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-05-01 07:42:19 -07:00
Andrew Rynhard
49307d554d refactor: improve machined
This is a rewrite of machined. It addresses some of the limitations and
complexity in the implementation. This introduces the idea of a
controller. A controller is responsible for managing the runtime, the
sequencer, and a new state type introduced in this PR.

A few highlights are:

- no more event bus
- functional approach to tasks (no more types defined for each task)
  - the task function definition now offers a lot more context, like
    access to raw API requests, the current sequence, a logger, the new
    state interface, and the runtime interface.
- no more panics to handle reboots
- additional initialize and reboot sequences
- graceful gRPC server shutdown on critical errors
- config is now stored at install time to avoid having to download it at
  install time and at boot time
- upgrades now use the local config instead of downloading it
- the upgrade API's preserve option takes precedence over the config's
  install force option

Additionally, this pulls various packages in under machined to make the
code easier to navigate.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-28 08:20:55 -07:00
Andrew Rynhard
a10acd592a chore: address random CI nits
This PR does the following:

- updates the conform config
- cleans up conform scopes
- moves slash commands to the talos-bot
- adds a check list to the pull request template
- disables codecov comments
- uses `BOT_TOKEN` so all actions are performed as the talos-bot user
- adds a `make conformance` target to make it easy for contributors to
check their commit before creating a PR
- bumps golangci-lint to v1.24.0

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-04-13 13:01:14 -07:00
Andrey Smirnov
7fd19fd3b6 feat: upgrade Go to 1.14.2
https://github.com/talos-systems/tools/pull/91

https://github.com/talos-systems/pkgs/pull/114

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-04-09 10:15:58 -07:00
Andrey Smirnov
ff2267eb99 test: update versions used for upgrade tests
We should stick to the latest version in each release series.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-04-07 15:51:56 -07:00
Andrey Smirnov
2d5c6f4c10 test: serialize docs step execution
`make docs` removes and then regenerates contents of some docs, so it
might cause random `-dirty` issue when running concurrently with build
steps.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-04-07 23:46:16 +03:00
Spencer Smith
502611f28e chore: update sonobuoy to v0.18.0
This PR will bring in the latest sonobuoy that is designed for the 1.18
branch on kubernetes.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-04-06 15:11:40 -04:00
Spencer Smith
3a4eaeeef0 feat: upgrade kubernetes to 1.18
This PR will pull in the latest release of k8s 1.18 so we can start
validating it through our test suite.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-03-26 14:59:43 -04:00
Andrey Smirnov
e38cde9b48 chore: update upgrade tests for new version, split into two tracks
This updates upgrade tests to run two flows with 3+1 clusters:

1. 0.3 -> current (testing upgrade with partition wiping)
2. 0.4-alpha.7 -> current (testing upgrade without partition wiping,
boot-a/boot-b)

There is also a small upgrade test with preserve enabled for a single-node cluster.

Provision tests are now split into two parallel tracks in Drone.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-24 15:30:00 -07:00
Spencer Smith
3485ea9f09 fix: update k8s to 1.17.3
This PR will update k8s to v1.17.3 to address CVEs mentioned in https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/kubernetes-security-announce/2UOlsba2g0s

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-03-23 17:08:52 -07:00
Andrew Rynhard
5dbc26c7a3 feat: rename osctl to talosctl
This is a rename of the osctl binary. We decided that talosctl is a
better name for the Talos CLI. This does not break any APIs, but does
make older documentation only accurate for previous versions of Talos.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-03-20 19:07:39 -07:00
Andrey Smirnov
a1350aa819 feat: upgrade Go to version 1.14.1
Fixes #1934

See talos-systems/pkgs#106, talos-systems/tools#90

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-03-20 21:42:47 +03:00
Andrey Smirnov
e6dc87dfa4 chore: update pkgs & tools for Go 1.14
See also:

* https://github.com/talos-systems/tools/pull/89
* https://github.com/talos-systems/pkgs/pull/103

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-27 01:15:46 +03:00
Andrey Smirnov
923ef4537b test: implement new class of tests: provision tests (upgrades)
This class of tests is included/excluded by build tags, but as it is
pretty different from other integration tests, we build it as separate
executable. Provision tests provision cluster for the test run, perform
some actions and verify results (could be upgrade, reset, scale up/down,
etc.)

There's now a framework to implement upgrade tests. The first test
covers the upgrade from the latest 0.3 (0.3.2 at the moment) to the current
version of Talos (being built in CI). The test starts by booting with the 0.3
kernel/initramfs, runs the 0.3 installer to install a 0.3.2 cluster, waits for
bootstrap, and then upgrades to 0.4 in a rolling fashion. As Firecracker
supports a bootloader, this boots the 0.4 system from the boot disk (as
installed by the installer).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-21 07:04:03 -08:00
Niklas Wik
08b1a782cd feat: support proxy in docker buildx
This allows building when an HTTP(S) proxy is required to download content on the build machine.

Signed-off-by: Niklas Wik <niklas.wik@nokia.com>
2020-02-20 05:35:17 -08:00
Andrey Smirnov
5f330f1f64 chore: push installer & talos images to the CI registry on every build
This enables a way to run the matching installer image in firecracker
tests. The new image is used in firecracker tests, along with bootloader
support for using the installed kernel/initramfs, which opens the path for
upgrade tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-18 07:32:45 -08:00
Andrey Smirnov
f51e9a14fe chore: build app container images skipping export to host
Container images for `apid`, `networkd`, etc. are now built inside the
buildkit using the `img` tool. This means that all the dependencies are
now controlled in `buildkit` and many more stages can run in parallel
without problems (overwriting content in `_out/images`).

This also simplifies Drone configuration, as we can let buildkit handle
the dependencies. I also enabled more stages to run in parallel.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-02-14 13:17:25 -08:00
Andrew Rynhard
88667641df chore: refactor E2E scripts
This PR aims to simplify our E2E scripts.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-26 20:47:25 -08:00
Andrew Rynhard
f87c6d74d3 chore: use firecracker in basic-integration
This adds a basic integration step that uses firecracker.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-23 05:52:22 -08:00
Andrew Rynhard
a0d8656ca0 chore: use v0.1.0 tools and pkgs
This brings in the official v0.1.0 releases of tools and pkgs.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-20 07:53:08 -08:00
Andrey Smirnov
2bf8540855 test: provision Talos clusters via Firecracker VMs
This is the initial PR to push the initial code; it has several known
problems which are going to be addressed in follow-up PRs:

1. there's no "cluster destroy", so the only way to stop the VMs is to
`pkill firecracker`

2. provisioner creates state in `/tmp` and never deletes it, that is
required to keep cluster running when `osctl cluster create` finishes

3. doesn't run any controller process around firecracker to support
reboots/CNI cleanup (vethxyz interfaces are lingering on the host as
they're never cleaned up)

The plan is to create some structure in `~/.talos` to manage cluster
state, e.g. `~/.talos/clusters/<name>` which will contain all the
required files (disk images, file sockets, VM logs, etc.). This
directory structure will also work as a way to detect running clusters
and clean them up.

For point number 3, `osctl cluster create` is going to exec lightweight
process to control the firecracker VM process and to simulate VM reboots
if firecracker finishes cleanly (when VM reboots).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-01-16 00:27:08 +03:00
Andrey Smirnov
810e9b418b chore: bump tools/pkgs for Go 1.13.6
Ref: https://github.com/talos-systems/tools/pull/85,
https://github.com/talos-systems/pkgs/pull/87

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-01-13 20:55:17 +03:00
Andrew Rynhard
d824d0bfdb chore: publish boot.tar.gz
This adds a convenience tarball that bundles vmlinuz and initramfs.xz
in a single archive.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-09 12:38:21 -08:00
Andrew Rynhard
794d9e6066 chore: update all target in Makefile
We should build the most common things by default.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-06 11:08:27 -08:00
Andrew Rynhard
288d4d0b51 chore: push latest tag on tag events
This ensures that the latest tag is updated on git tag events.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-01-01 11:41:49 -08:00
Andrey Smirnov
ebd40bd0eb chore: use osctl cluster --wait in basic-integration
There are a few workarounds for the way Drone runs the integration test:
DinD runs as a separate pod, and we can only access its ports exposed on the
"host", while this endpoint is not reachable from the Talos cluster.

So internally Talos nodes still use addresses like "10.5.0.2", while the
test uses "docker" to access it (that's the name of the `docker` service
in the pipeline).

When running locally, 127.0.0.1 is used as endpoint, which should work
fine both on OS X and Linux.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-12-30 15:15:42 -08:00
Andrew Rynhard
5a7eb631b2 feat: add installer command to installer container
This replaces the entrypoint.sh shell script with a go binary.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-26 06:41:25 -08:00
Andrew Rynhard
e4a1bc3cf9 chore: add help menu to the Makefile
This adds a help menu to the Makefile. It documents all build
dependencies, and how to get started.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-25 11:11:41 -08:00
Andrew Rynhard
831f5524a1 chore: refactor Makefile to be more DRY
This PR aims to make the Makefile more DRY.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-24 10:48:32 -08:00
Andrew Rynhard
6602a85976 chore: use docker buildx
This replaces buildkit and buildctl with the docker buildx plugin.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-12-24 08:30:39 -08:00
Brad Beam
9584b47cd7 feat: Upgrade kubernetes to 1.17.0
Primarily doc/constant changes.

Added additional bits to the `docs` target in the Makefile to generate osctl
docs as well as config files. Explicitly define a HOME variable so we
get consistent home directories for talosconfig variables in our docs.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-12-10 16:03:35 -08:00
Spencer Smith
264c5440ef chore: rewrite basic integration in go instead of bash
This PR will be the start of several. It rewrites the basic integration
in go. We'll do these one at a time.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-12-05 15:55:19 -05:00
Andrew Rynhard
70b9186be0 chore: push edge tag on successful conformance
This adds a step to the conformance pipeline that pushes all containers
with the tag "edge". This will allow us to start using an edge
"channel" for upgrades.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-11-27 08:10:25 -08:00
Andrew Rynhard
4680f66bc5 docs: add autogenerated config reference
This adds a small program to parse our config structs and generate
markdown from them. This will allow us to enforce a standard and require
documentation for fields as they get added.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-11-11 08:38:39 -08:00