1091 Commits

Author SHA1 Message Date
Andrew Rynhard
82c706a0fb feat: upgrade Kubernetes to v1.16.0
Brings in Kubernetes v1.16.0.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-19 20:19:29 -07:00
Andrew Rynhard
9230ff4e35 feat: return a data structure in version RPC
A byte slice is not very useful. Having a struct with fields makes for a
better experience.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-19 16:58:07 -07:00
Seán C McCord
1a64ece04f fix(machined): add nil checks to metal initializer
Check that the userdata has an Install section before trying to use it

Fixes #1186

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-09-19 12:35:11 -07:00
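The nil check described in 1a64ece04f can be sketched roughly as follows. This is a minimal illustration, not the actual machined code: the `UserData` and `Install` types, the `installDisk` helper, and the `ErrNoInstall` error are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// Install and UserData are hypothetical stand-ins for the real userdata types.
type Install struct {
	Disk string
}

type UserData struct {
	Install *Install // nil when the userdata has no install section
}

// ErrNoInstall is a hypothetical sentinel error for a missing install section.
var ErrNoInstall = errors.New("userdata has no install section")

// installDisk returns the install disk, checking every pointer on the way
// instead of dereferencing data.Install blindly.
func installDisk(data *UserData) (string, error) {
	if data == nil || data.Install == nil {
		return "", ErrNoInstall
	}
	return data.Install.Disk, nil
}

func main() {
	// Userdata without an install section yields an error, not a panic.
	_, err := installDisk(&UserData{})
	fmt.Println(err)

	disk, err := installDisk(&UserData{Install: &Install{Disk: "/dev/sda"}})
	fmt.Println(disk, err)
}
```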
Andrew Rynhard
6efd6fbe08 chore: move gRPC API to public
In order for other projects to make use of our APIs, they must not
reside underneath the internal directory. This moves the protobuf
definitions to a top-level "api" directory and scopes them according to
their domain. This change also removes generated code from the gitignore
file so that users don't have to generate the code themselves.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-19 08:55:13 -07:00
Andrew Rynhard
20302eb8f6 chore: fix AWS image dependency
We no longer need to wait for the installer image to be pushed before
creating the AWS image.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-17 21:12:03 -07:00
Andrew Rynhard
c2e71bd2bc chore: prepare release v0.2.0-beta.0
This is the official v0.2.0-beta.0 release.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-17 20:17:41 -07:00
Andrew Rynhard
472f1aa6e8 chore: upgrade Sonobuoy to v0.15.4
This version has a fix for a bug that is affecting us.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-17 14:52:10 -07:00
Andrew Rynhard
21670978ca fix: log system services to /run/system/log
Writing system logs to /var/log breaks upgrades. The system disk unmount
fails with EBUSY. For now we can log to /run/system/log to avoid this.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-17 07:54:01 -07:00
Andrew Rynhard
db80688c5e chore: remove dead code
This code is never used.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-16 21:24:46 -07:00
Andrew Rynhard
b7755b3154 fix: conditionally set log path
This is not the best solution to this, but it stops the bleeding. We can
conditionally build the log base path based on the service logs
requested.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-16 18:29:30 -07:00
Andrew Rynhard
3e62973b2c chore: upgrade conformance image
This upgrades the kube-conformance image used by Sonobuoy to
v1.16.0-rc.2.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-16 16:05:24 -07:00
Andrew Rynhard
4912d71389 fix: generate client admin cert with 1 year expiry
The default of 24 hours is much too short for the admin credentials.
This makes them expire in a year.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-16 15:52:22 -07:00
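To illustrate the difference between the 24 hour default and the one year admin expiry in 4912d71389, here is a sketch using only the Go standard library. It is not how Talos generates its certificates: `newSelfSignedCert`, `defaultTTL`, and `adminTTL` are hypothetical names, and "a year" is assumed to mean 365 days.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// Hypothetical TTLs: a short default and a longer one for admin credentials.
const (
	defaultTTL = 24 * time.Hour
	adminTTL   = 365 * 24 * time.Hour // assumption: one year = 365 days
)

// newSelfSignedCert creates a throwaway self-signed certificate whose
// validity window is exactly ttl long.
func newSelfSignedCert(commonName string, ttl time.Duration) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	// X.509 timestamps have one-second resolution, so truncate up front.
	now := time.Now().UTC().Truncate(time.Second)
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: commonName},
		NotBefore:    now,
		NotAfter:     now.Add(ttl),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	for _, ttl := range []time.Duration{defaultTTL, adminTTL} {
		cert, err := newSelfSignedCert("admin", ttl)
		if err != nil {
			panic(err)
		}
		fmt.Println(cert.NotAfter.Sub(cert.NotBefore))
	}
}
```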
Andrew Rynhard
ab4e058489 feat: upgrade Kubernetes to v1.16.0-rc.2
This brings in the release candidate for Kubernetes v1.16.0.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-16 14:56:55 -07:00
Andrey Smirnov
54dd1bd95d chore: make ntpd depend on networkd
As ntpd relies on outbound networking, it makes sense to wait for
networkd.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-09-17 00:30:08 +03:00
Andrey Smirnov
c2176ee0fa chore: update github.com/stretchr/testify library to 1.4.0
The new release comes with bug fixes (we had already integrated some of
them from an untagged release) and a few interesting new assertions,
including `Eventually` for polling.

See: https://github.com/stretchr/testify/milestone/2?closed=1

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-09-16 19:06:47 +03:00
Andrey Smirnov
669bb5e1c6 chore: move interface type assertion to unit-tests
This moves optional interface checks to unit-tests, removing type checks
via global variable assignment.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-09-16 17:16:30 +03:00
Andrey Smirnov
7d8c40e3aa chore: randomize containerd namespace in tests
It looks like containerd creates shim sockets in the Linux abstract
namespace whose names are fixed (they don't depend on the containerd
root directory) and depend on the container namespace and ID. So if two
containerd instances on the same host run the same namespace/ID pair,
that creates a conflict on the shim socket.

Avoid that by randomizing namespace name. CRI tests should be fine as
namespace is fixed, but container ID is random.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-09-13 23:56:40 +03:00
Andrew Rynhard
75746266ce feat: upgrade Kubernetes to v1.16.0-rc.1
This brings in the latest RC of 1.16.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-12 20:20:48 -07:00
Andrey Smirnov
362d403707 chore: make TestRunRestartFailed test more reliable
Replace sleep with polling for desired state.

Fixes #1162

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-09-13 01:18:22 +03:00
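Replacing a fixed sleep with polling, as 362d403707 describes, might look something like the sketch below. The `waitFor` helper is hypothetical and is not the code used in TestRunRestartFailed; it only shows the general shape of the technique.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// waitFor polls cond every interval until it returns true or the timeout
// elapses. It reports whether the condition was met in time.
func waitFor(cond func() bool, timeout, interval time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	var ready int32

	// Simulate a state change that happens at an unpredictable time.
	go func() {
		time.Sleep(20 * time.Millisecond)
		atomic.StoreInt32(&ready, 1)
	}()

	// Instead of sleeping for a fixed duration and hoping the state has
	// changed, poll until it actually has (or give up after a second).
	ok := waitFor(func() bool { return atomic.LoadInt32(&ready) == 1 },
		time.Second, 5*time.Millisecond)
	fmt.Println("state reached:", ok)
}
```

A test written this way returns as soon as the state is reached and only fails when the generous timeout is exceeded, which is why it tends to be both faster and less flaky than a fixed sleep.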
Andrey Smirnov
b68e6395d8 feat(machined): filter actions stop/start/restart on per-service level
This implements a 'default deny' policy for service operations via the
API: by default, services do not allow operations.

A service whitelists itself for stop/start/restart by implementing the
interface and returning a boolean flag, which might depend on userdata.

Machined APIs `Stop/Start` were renamed to `ServiceStop`/`ServiceStart`
to avoid confusion with osd API `Restart` which is not related to
services. Old APIs are deprecated and compatibility code forwards old
APIs to the new code.

`ServiceRestart` API was introduced to distinguish restart action from
stop/start (previously restart was implemented as stop+start in the
CLI).

Service udevd-trigger was whitelisted for all operations (allows
stopping hanging run, restarting to trigger once again).

Services proxyd & ntpd were whitelisted for restart and start (start is
whitelisted to help with a service stuck in the stopped state while restarting).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-09-13 00:38:19 +03:00
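The 'default deny' approach from b68e6395d8 can be sketched with an optional interface. The names below (`Service`, `APIRestartable`, `APIRestartAllowed`, `lockedService`, `canRestart`) are hypothetical and are not taken from the machined source; only the idea of opting in by implementing an interface and returning a boolean comes from the commit message.

```go
package main

import "fmt"

// Service is a hypothetical minimal service type.
type Service interface {
	ID() string
}

// APIRestartable is a hypothetical optional interface. Only services that
// implement it can ever be restarted through the API.
type APIRestartable interface {
	APIRestartAllowed() bool
}

// ntpd opts in to restarts by implementing APIRestartable.
type ntpd struct{}

func (ntpd) ID() string              { return "ntpd" }
func (ntpd) APIRestartAllowed() bool { return true }

// lockedService is a made-up service that implements nothing extra,
// so it falls under the default deny rule.
type lockedService struct{}

func (lockedService) ID() string { return "locked" }

// canRestart denies by default: the operation is allowed only if the
// service implements the optional interface and returns true.
func canRestart(svc Service) bool {
	r, ok := svc.(APIRestartable)
	return ok && r.APIRestartAllowed()
}

func main() {
	for _, svc := range []Service{ntpd{}, lockedService{}} {
		fmt.Printf("%s restart allowed: %v\n", svc.ID(), canRestart(svc))
	}
}
```

Because the boolean is returned by a method rather than stored in a table, a real implementation could compute it from configuration such as userdata.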
Andrew Rynhard
5ee554128e chore: move from gofumpt to gofumports
gofumports does everything that gofumpt does and also formats imports.
This change proposes the use of the `-local` flag so
that we can have imports separated in the following order:

- standard library
- third party
- Talos specific

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-12 07:49:12 -07:00
Andrew Rynhard
2e8b570302 chore: add fmt target
This provides a target that can be useful for developers. It will format
code according to our standards.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-11 15:19:53 -07:00
Andrey Smirnov
980829708e chore: upgrade golangci-lint to 1.18.0
The new linter 'funlen' was disabled because too many functions exceed
its default limit, but it might be considered in the future.

To limit peak memory usage, `GOGC=50` was added to the golangci-lint run
to make Go's garbage collector more aggressive. With this setting, peak
memory usage seems to be around 8 GB.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-09-11 15:18:57 -07:00
Andrew Rynhard
d563988778 fix: use /var/log for default log path
This moves the default log path to /var/log. An exception is made for
machined-api and system-containerd since they must have zero
dependencies on the ephemeral disk. In the case of machined-api, we
cannot stop the service since it is required to perform an upgrade. As
for system-containerd, it starts before any ephemeral disk is mounted so
we will fail to start the service since /var/log is a read-only file system.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-11 15:07:34 -07:00
Andrew Rynhard
20c88bac2c feat: move node certificate to tmpfs
This ensures that node certificates are ephemeral by storing them in a
tmpfs.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-11 14:10:34 -07:00
Spencer Smith
fa9b08145f docs: add machine configuration proposal
This PR will add the machine configuration proposal for review and merge
once agreed upon.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-09-11 12:01:40 -07:00
Andrew Rynhard
2955428850 chore: format code with gofumpt
The gofumpt linter is a stricter drop-in replacement for gofmt. The
rules are ones that I strongly agree with and I think it would be better
if we added this linter instead of nit picking every PR.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-11 11:03:29 -07:00
Brad Beam
a8c69bf753 chore(machined): Clean up unnecessary ticker alloc
There was a new ticker being created for each run of the healthcheck.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-09-11 12:10:02 -05:00
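The fix in a8c69bf753 amounts to allocating the ticker once rather than on every health check run. A rough sketch of that pattern follows; `runChecks` and `countChecks` are hypothetical helpers, not the machined health check code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runChecks calls check once per interval until ctx is cancelled.
// The ticker is created once, outside the loop, and stopped on return.
func runChecks(ctx context.Context, interval time.Duration, check func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			check()
		}
	}
}

// countChecks runs the loop for roughly total and reports how many
// times the check fired.
func countChecks(total, interval time.Duration) int {
	ctx, cancel := context.WithTimeout(context.Background(), total)
	defer cancel()

	n := 0
	runChecks(ctx, interval, func() { n++ })
	return n
}

func main() {
	fmt.Println("checks run:", countChecks(60*time.Millisecond, 10*time.Millisecond))
}
```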
Andrew Rynhard
bf16b1e916 chore: remove invalid TODO
This TODO no longer applies. We have settled on a fixed boot size. This
also removes variables no longer needed.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-10 10:53:36 -07:00
Andrew Rynhard
298ddc8f49 fix: enable slub_debug=P
This is the last KSPP kernel parameter we need to be compliant with KSPP
guidelines.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-10 10:53:19 -07:00
Andrew Rynhard
38690d72df chore: remove unneeded packages
This removes packages we don't need anymore.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-10 08:12:07 -07:00
Spencer Smith
473df84cf6 fix: move to per-platform console setup
This PR will make sure that each platform gets the console settings it
needs by setting them as extra flags in the makefile. This should ensure
that we have console logs flowing properly for each cloud.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-09-10 07:50:34 -07:00
Andrew Rynhard
761805e910 feat: set expiry of certificates to 24 hours
This defaults certificates to a 24 hour TTL.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-10 07:34:25 -07:00
Andrew Rynhard
e48cee6343 chore: remove existing AMI
We need to remove the existing AMI, if there is one, in order to create a new
one with the same name.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-10 04:52:43 -07:00
Andrew Rynhard
44dd2fc7c9 chore: remove packer from installer
This aligns AWS releases with those for Azure and GCP. We no
longer need packer since we will now release an artifact that users can
import.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-09 18:54:37 -07:00
Brad Beam
9a50da0ed7 fix(osd): Mount host directory for grpc sockets
Should prevent broken mounts from occurring when services are restarted.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-09-09 16:20:38 -05:00
Brad Beam
309856083b fix: Add retry/delay to probing device file
Fixes flaky image creation.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-09-09 16:05:35 -05:00
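A retry with a delay, as 309856083b adds for probing the device file, could be sketched like this. The `retry` helper and `errNotReady` error are hypothetical, and the probe is simulated with a counter rather than a real device file.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error standing in for a failed probe.
var errNotReady = errors.New("device file not present yet")

// retry calls fn up to attempts times (attempts must be at least 1),
// sleeping for delay between failed attempts. It returns nil on the first
// success, or the last error wrapped with the attempt count.
func retry(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	// Simulated probe: the device "appears" on the third attempt.
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Println(calls, err)
}
```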
Brad Beam
63eb62f52c fix(machined): Fix hostname value when retrieving from cloud providers
There was an issue where the hostname was getting set too early in the boot. This caused
the hostname retrieved from platform.Hostname() to be ignored.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-09-09 15:55:46 -05:00
Brad Beam
f21d1244bd test(ci): Add aws for e2e and conformance targets
Add additional scripts and steps to enable doing tests against aws.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-09-09 13:56:19 -05:00
Spencer Smith
aed8c06730 chore: rename v1 node configs to v1alpha1
This PR moves to using v1alpha1 as the initial node config version, so
we can graduate these configs a little more cleanly later on.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-09-09 13:03:49 -04:00
Brad Beam
be4f7e1e6a chore: Rename maintainers channel
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-09-09 10:59:48 -05:00
Seán C McCord
a99637cc0a fix: use ntp client constructor
Uses NTP client constructor so that defaults are appropriately used.

Fixes #1126

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-09-08 19:18:37 -07:00
Seán C McCord
3c41770478 fix: translate machine.network to networking.os
Add translation for v1 to v0 machine networking.  Also adds "Ignore"
property to v1 network interfaces.

Fixes #1134

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-09-08 18:20:10 -07:00
Seán C McCord
beecb70374 feat: Allow spec of canonical controlplane addr
Broke the binding between the discrete IP addresses of the control plane
elements and the ControlPlaneEndpoint.  This allows the specification of
a canonical controlplane address which may optionally be a DNS name.

Fixes #1131

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-09-08 17:18:52 -07:00
Seán C McCord
47a361c5b6 fix(osctl): use real userdata as defaults for install
This modifies `osctl install` to use the provided userdata as the source
for default installation values.  This allows such things as
userdata-supplied extra kernel parameters to be automatically
included in the bootloader.

Fixes #1102

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-09-08 17:00:12 -07:00
Seán C McCord
bcb6a2d3a5 fix: prepend custom options for kernel commandline
Added a decomposition option to the kernel.NewDefaultCmdline() so that
the Defaults can be added _after_ constructing a custom commandline.
This is then implemented for `osctl install`.

Fixes #1128

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-09-08 16:58:49 -07:00
Seán C McCord
f7ad24ec4f feat: allow network interface to be ignored
Added a property to userdata to allow a network interface to be ignored,
such that Talos will perform no operations on it (including DHCP).

Also added kernel commandline parameter (talos.network.interface.ignore)
to specify a network interface should be ignored.

Also allows chaining of kernel cmdline parameter Contains() where the
parameter in question does not exist.

Fixes #1124

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-09-07 16:33:52 -07:00
Andrew Rynhard
71e8a5fccf chore: remove top output border
This should give it a closer feel to the rest of the UX.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-09-06 19:48:12 -07:00
Brad Beam
2fadd4da6f chore(machined): Increase pid_max to 262k
Minor improvement for busy systems

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-09-06 19:47:24 -07:00
Spencer Smith
8b019d8f33 chore: update provider-components for capi v0.1.9
This PR updates our e2e tests with the provider-components file that's
generated by our capi v0.1.9 update.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-09-06 22:45:44 -04:00