229 Commits

Author SHA1 Message Date
Andrey Smirnov
c85608b8d9 test: add an option to bind docker to specific host IP
This allows to override default `0.0.0.0` (`*`) to a specific IP to
avoid conflicts.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-27 21:13:28 +03:00
Andrey Smirnov
f23c9111d1 feat: upgrade etcd to 3.3.22 version
Latest version in 3.3 branch is 3.3.23, but it's broken, so we use previous
stable version.

Switch to official etcd gcr.io registry, early support for arm64.

Move `etcd` service to run in system containerd.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-21 09:44:43 -07:00
Andrey Smirnov
70a65cbb01 feat: make partitions on additional disk without size occupy full disk
Fixes #2214

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-21 07:33:07 -07:00
dependabot[bot]
0aae950518 chore: bump lodash from 4.17.15 to 4.17.19 in /docs/website
Bumps [lodash](https://github.com/lodash/lodash) from 4.17.15 to 4.17.19.
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](https://github.com/lodash/lodash/compare/4.17.15...4.17.19)

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-20 11:17:18 -07:00
steverfrancis
8dd81b0693 docs: use latest talosctl download link
Update download example to reference latest release.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-18 14:45:52 -07:00
Andrey Smirnov
ad99cb6421 feat: implement talosctl dashboard command
This builds a simple CLI UI for Talos cluster monitoring.

Some new APIs were added for monitoring based on Prometheus procfs
package.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-16 14:24:04 -07:00
Andrey Smirnov
c54639e541 feat: implement server-side API for cluster health checks
This implements existing server-side health checks as defined in
`internal/pkg/cluster/checks` in Talos API.

Summary of changes:

* new `cluster` API

* `apid` now listens without auth on local file socket

* `cluster` API is for now implemented in `machined`, but we can move it
to the new service if we find it more appropriate

* `talosctl health` by default now does server-side health check

UX: `talosctl health` without arguments does health check for the
cluster if it has healthy K8s to return master/worker nodes. If needed,
node list can be overridden with flags.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-15 13:52:13 -07:00
Spencer Smith
7d10677ee8 docs: update worker creation flags for azure docs
This PR updates the worker flags for azure. Fixes an issue where, if you
have multiple subnets and the talos one isn't default, the workers and
control plane nodes came up on different subnets. Requires updating the
firewalls if they don't come up in the same subnet, so this is better
UX.

Also added a note that azure support is broken in v0.5.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-07-15 12:03:33 -07:00
Andrew Rynhard
0617a10027 feat: upgrade Kubernetes to v1.19.0-rc.0
This brings in the latest version of Kubernetes.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
2020-07-14 13:07:18 -07:00
Andrey Smirnov
cbb7ca8390 refactor: merge osd into machined
This merges `osd` API into `machined`. API was copied from `osd` into
`machined`, and `osd` API was deprecated.

For backwards compatibility, `machined` still implements `osd` API, so
older Talos API clients can still talk to the node without changes.

Docs were updated. No functional changes.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-13 12:50:00 -07:00
Artem Chernyshev
8fc352ec4f feat: merge mode in talosctl kubeconfig
New flag `-m` will enable merge mechanism in `talosctl kubeconfig`

Command examples:

```
talosctl kubeconfig -m

talosctl kubeconfig -m ~/.kube/config
```

Signed-off-by: Artem Chernyshev <artem.0xD2@gmail.com>
2020-07-10 12:39:30 -07:00
Andrey Smirnov
9590030a84 feat: print crash dump in talosctl cluster create on failure
When cluster fails to be bootstrapped or it fails the health check, it's
hard to find the root cause without the logs.

This change adds optional crashdump (it dumps firecracker logs or docker
logs) after provisioning failure. It's not enabled by default.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-10 11:54:07 -07:00
Andrey Smirnov
50db9b6073 docs: update firecracker for new home of tc-redirect-tap plugin
See https://github.com/firecracker-microvm/firecracker-go-sdk/issues/174#issuecomment-655798205

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-09 11:47:28 -07:00
Andrey Smirnov
4cc074cdba feat: implement API access to event history
1. Add [xid-based](https://github.com/rs/xid) event IDs. Xids
are sortable and unique enough. Xids also encode event publishing
time with a second precision.

2. Add three ways to look back into event history: based on number of
events, on time and ID. Lookup via ID might be used to restart event
polling in case of broken API connection from the same moment.

3. Reimplement core event buffer with positions which are always
incremented instead of generation+index, this implementation is much
more simple (idea from circular buffer).

4. By default, Events API works the same - it shows no history and
starts streaming new events only.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-08 10:54:50 -07:00
Andrey Smirnov
0cd86f17c3 fix: provide default DNS domain to talosctl cluster create
Fixes #2263

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-02 13:42:45 -07:00
Andrey Smirnov
3ae5e0e749 test: add short integration test with custom CNI
This adds new flug to `cluster create` to launch cluster with custom
CNI, `integration` pipeline gets a new step to run short test with
Cilium 1.8.0 CNI.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-07-01 11:19:19 -07:00
Patatman
90acb01a4e docs: digital rebar docs
Digital rebar docs in the guide section.

Signed-off-by: Patatman <git@jeursen.nl>
2020-06-30 18:52:39 -07:00
Andrey Smirnov
51112a1d86 fix: use kubernetes version in config generator
Update all k8s image references to point to the version specified by the user.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-06-26 17:05:19 -07:00
Andrey Smirnov
dacbac35c4 docs: add local registry cache documentation
This can be expanded one day to air-gapped solution, but gives good
starting point for those who run clusters locally.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-06-26 11:07:56 -04:00
Andrey Smirnov
470fc51c0a docs: update firecracker with one more CNI plugin
Plugin `static` is used for IPAM on interfaces.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-06-25 20:44:54 +03:00
Patatman
3369c0822c docs: specs added
specs added to the quickstart, to fix #2200

Signed-off-by: Patatman <git@jeursen.nl>
2020-06-18 08:20:53 -04:00
Patatman
69cb8a02f1 docs: specs added
specs added to the quickstart, to fix #2200

Signed-off-by: Patatman <git@jeursen.nl>
2020-06-18 08:20:53 -04:00
Spencer Smith
d57c97fdb6 feat: allow ability to create dummy nics
This PR will introduce a new field to v1alpha1 configs that allows users
to set `dummy: true` when specifying interfaces. If present, we will
create a dummy interface with the CIDR information given. This is useful
for users that don't want to use loopback for things like ECMP (or want
more than one dummy interface).

The created dummy interface looked like this with `ip a`:

```
3: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether 66:4a:e3:5f:38:10 brd ff:ff:ff:ff:ff:ff
    inet 10.254.0.5/32 brd 10.254.0.5 scope global dummy0
       valid_lft forever preferred_lft forever
```

Will close #2186.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-06-17 17:15:07 -04:00
Andrew Rynhard
c21b15f3cf feat: add rollback command
This adds the `rollback` command to `talosctl`.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-06-16 20:06:15 -07:00
Andrey Smirnov
3d8f20732a chore: use neutral terminology
Replace blacklist with denylist, it was only used internally.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-06-15 14:00:55 -07:00
Spencer Smith
90115bb3ef feat: update kubernetes to 1.19.0-beta.1
This PR brings in all changes necessary to deploy kubernetes 1.19.x.

It relies on an update to our bootkube-plugin project, as well as
implementation of some Image() functions for our various control plane
components, since they are all distinct images and not just hyperkube.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-06-10 15:01:11 -04:00
Andrew Rynhard
336f983c21 docs: add v0.6 docs
This adds the documentation for v0.6 and removes v0.3 since
it is no longer supported.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-06-10 10:39:38 -07:00
Spencer Smith
e03a68f8eb feat: update k8s and sonobuoy versions
This PR will update k8s to the latest 1.18 release and bump sonobuoy to
help resolve some e2e flakes. Also adds some retry logic around the
sonobuoy run.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-06-10 06:47:36 -07:00
Andrew Rynhard
8f472675ee docs: add kernel options to firecracker reqs
This adds a note on a few more requirements on the host kernel for
running Talos with firecracker.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-06-09 11:26:30 -07:00
Timothy Gerla
6a5b788d06 docs: remove repeated component in the Arges architecture image
- Removed the repeated "metal metadata server" line in the Arges
architecture image.

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-29 08:46:23 -07:00
Patatman
f648f555b6 docs: add talosctl docs document
Initial version of the talosctl docs.

Signed-off-by: Patatman <git@jeursen.nl>
2020-05-29 08:45:44 -07:00
Timothy Gerla
172a55f2f0 docs: fix a few minor styling issues
- center the "Certified Kubernetes" logo
- adjust margin on an unordered list

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-28 11:12:50 -07:00
Andrew Rynhard
20e721c47a docs: make v0.5 docs the default
This updates links and dropdown menus to point to the v0.5
documentation.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-05-27 10:09:47 -07:00
Patatman
cbc0ab9e58 docs: add metal overview diagram
This adds a diagram to the metal overview that illustrates the PXE boot and
installation process. Fixes #2130.

Signed-off-by: Patatman <git@jeursen.nl>
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-05-25 10:10:35 -07:00
Timothy Gerla
e70b7e3073 docs: fix broken links in components pages (fixes #2117)
- Intra-site docs links need to be relative
- Add nuxt-interpolation to rewrite <a> tags to <nuxt-link> tags
which improves the single-page-app behavior when clicking on internal links.

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-18 08:06:23 -07:00
Timothy Gerla
0b6b371bca docs: add some information about Arges and expand the bare metal section a bit
- Add links to Arges in 0.4 and 0.5 docs
- Add an Arges architecture diagram
- Add margins around images in docs

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-18 08:00:53 -07:00
nold
fa6ae016a9 docs: overview of talos components
This should fix issue #1933

Signed-off-by: Gerrit Pannek <nold@gnu.one>
2020-05-16 09:10:37 -07:00
Spencer Smith
c63c7f15e2 fix: respect nameservers when using docker cluster
This PR will fix some unexpected user behavior where nameservers were
always getting written to 8.8.8.8,1.1.1.1 for the docker-based talos
clusters. This occurred even when updating the docker daemon's config.
This PR will make the docker provisioner respect the --nameserver flag
and allow that to be used to override the defaults.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2020-05-15 13:58:30 -07:00
Timothy Gerla
8fca374ca6 docs: add a sitemap and Netlify redirects
- add nuxtjs/sitemap for an automatic sitemap generator
- add auto-generated explicit redirects for docs pages: right now, if you
navigate to a deep docs page (/docs/v0.5/en/guides/cloud/aws, for instance),
you will get an HTTP 404 from Netlify because the page doesn't exist
on disk, but the resulting single-page-app javascript will show you the content.
These redirects are an attempt to solve the 404 problem which probably affects
search engines.

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-13 12:28:01 -07:00
Andrew Rynhard
1902519727 feat: add events API
This adds an event stream to the runtime, and the ability to stream
events via the API.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-05-13 12:18:10 -07:00
Timothy Gerla
5348332d26 docs: adjust docs layouts and add tables of contents
- add an auto-generated table of contents with markdown-toc
- docs pages now fill the whole page width; other pages are are 4/5ths wide as before
- clean up and reorganize some styles
- version dropdown moved to the left
- cleaned up the github edit link
- a couple of responsive cleanups
- add page title to HTML title attribute

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-11 10:26:31 -07:00
Timothy Gerla
fdc4bc506c docs: update copyright date
- Update the page footer copyright date to 2020

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-11 07:24:01 -07:00
Andrew Rynhard
8e07b1bab3 feat: add bootstrap API
This adds the ability to bootstrap a cluster using the API.
The API simply starts the bootkube service.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-05-07 16:47:28 -07:00
Timothy Gerla
18f830f85f docs: backport intro text to 0.3 and 0.4 docs
- Replaced the basic intro text for 0.3 and 0.4 on the docs home page with
more useful information and links to next steps.

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-05 10:02:05 -07:00
Timothy Gerla
fb71eeed91 docs: fix netlify deep linking for 0.5 docs by generating fallback routes
From https://nuxtjs.org/faq/netlify-deployment#for-site-generated-in-spa-mode

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-05 07:35:07 -07:00
Andrew Rynhard
56d7bf19fe feat: add recovery API
This adds an API for recovering the self-hosted control plane.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2020-05-04 19:38:30 -07:00
Timothy Gerla
f59620473e docs: add 0.5 pre-release docs, add linkable anchors, other fixes
- add 0.5 docs branched from 0.4
- add intro page and "get help" pages
- moved Docker and Firecracker into a "Local Clusters" category
- switch to markdown-it from markd for consistency between corp site and docs site
- use markdown-it-anchor to create linkable anchors to sections within a page
- improve urls to use / instead of # for docs pages (WARNING: this breaks old links)
- continue to simplify handling in the Content.vue component
- update JS deps

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-04 16:04:53 -07:00
Timothy Gerla
688efabb93 fix: clean up docs page scripts in preparation for 0.5 docs
- simplify the docs page handling logic and get more nuxt-like
- the handleClick function was vestigial and didn't do anything anymore, remove it
- simplify the Vuex state quite a bit, remove activeDocPath
- clean up github link generation code, and fix #2076

Signed-off-by: Timothy Gerla <tim@gerla.net>
2020-05-02 02:49:19 -07:00
Seán C McCord
c1299d3ff0 feat: allow dual-stack support with bootkube wrapper
Handle dual-stack configurations with the bootkube wrapper.  This uses
the new PodCIDRs and ServiceCIDRs `asset.Config` parameters in bootkube.
It also relies on the bootkube-plugin features for manipulating
kube-proxy config and installing the dual-stack DNS service.

Fixes #2055

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2020-04-28 20:10:58 -07:00
Andrey Smirnov
55dcbbc8d0 feat: add commands talosctl health/crashdump
This extracts health & crashdump features which were specific to
provisioning code into separate package which can be used standalone.

Everything else is just new glue.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2020-04-27 20:43:10 -07:00