708 Commits

Author SHA1 Message Date
Andrew Rynhard
835d72b74a fix: create overlay mounts after install
Without running the install task first, /var is read-only. This causes
the overlay phase to fail as it tries to create /var/system.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-01 06:35:12 -07:00
Andrey Smirnov
3024c26a55 chore: update dockerfile/buildkit versions
New buildkit release: https://github.com/moby/buildkit/releases/tag/v0.6.0

New release was published for buildkit's dockerfile:
https://github.com/moby/buildkit/releases/tag/dockerfile%2F1.1.2-experimental,
so we can stick to release version now.

These releases include fixes/implementation for `RUN --security=insecure`.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-08-01 01:05:42 +03:00
Andrey Smirnov
084378ac04 fix(init): flip concurrency of tasks/services, fix small issues
Phases should run sequentially, while tasks within a phase run concurrently.

There are two potential issues fixed:

1. `result` multierror was updated inside goroutine without any
synchronization, so this is a data race
2. a panic inside the task/phase runner might happen, and as an unhandled
panic in a goroutine aborts the whole process, this might lead to a system
halt as 'machined' exits

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-31 14:21:07 -07:00
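The execution model described in 084378ac04 (sequential phases, concurrent tasks, synchronized error collection, panic recovery) can be sketched roughly as follows. This is a minimal illustration and not the machined code: the `Task` and `Phase` types and the `runPhase`/`runPhases` functions are hypothetical names, and `errors.Join` (Go 1.20+) stands in for the multierror type mentioned in the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Task is a hypothetical unit of work; it is not the Talos task interface.
type Task func() error

// Phase is a hypothetical named group of tasks that may run concurrently.
type Phase struct {
	Name  string
	Tasks []Task
}

// runPhase runs all tasks of one phase concurrently and waits for them.
// Errors are appended under a mutex (avoiding the data race), and a panic
// inside a task is converted into an error instead of aborting the process.
func runPhase(p Phase) error {
	var (
		mu   sync.Mutex
		errs []error
		wg   sync.WaitGroup
	)

	for i, task := range p.Tasks {
		wg.Add(1)

		go func(i int, task Task) {
			defer wg.Done()

			var err error

			// Runs before wg.Done, so the error is recorded before Wait returns.
			defer func() {
				if r := recover(); r != nil {
					err = fmt.Errorf("task %d panicked: %v", i, r)
				}

				if err != nil {
					mu.Lock()
					errs = append(errs, err)
					mu.Unlock()
				}
			}()

			err = task()
		}(i, task)
	}

	wg.Wait()

	// errors.Join returns nil when no errors were collected.
	return errors.Join(errs...)
}

// runPhases runs phases one after another and stops at the first failure.
func runPhases(phases []Phase) error {
	for _, p := range phases {
		if err := runPhase(p); err != nil {
			return fmt.Errorf("phase %q: %w", p.Name, err)
		}
	}

	return nil
}

func main() {
	phases := []Phase{
		{Name: "setup", Tasks: []Task{
			func() error { return nil },
			func() error { panic("boom") },
		}},
		{Name: "not reached", Tasks: []Task{
			func() error { return nil },
		}},
	}

	fmt.Println(runPhases(phases))
}
```

Recovering inside each goroutine matters because a deferred `recover` only catches panics raised in its own goroutine; a handler in the caller would not help.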
Spencer Smith
bc5fe085bd fix: set mtu value regardless of interface state
This PR will fix a bug we encountered in GCE, where the interface was
already up and the MTU value wasn't getting set.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-31 15:02:02 -04:00
Andrey Smirnov
ac963ad7e1 feat(osctl): allow configurable number of masters to cluster create
This allows running tiny Talos clusters (which is sometimes nice for
local testing), e.g. with just a single master and zero workers.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-30 15:32:16 -07:00
Andrew Rynhard
e2e5236f62 chore: prepare release v0.1.0
This is the official v0.1.0 release.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-29 21:17:10 -07:00
Andrew Rynhard
12486ef0e2 chore: remove rootfs output param
This removes the `--output` flag from the rootfs target. With the output
specified it was outputting the file directory structure to the build
directory.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-29 20:26:54 -07:00
Andrew Rynhard
f0c469c558 chore: prepare release v0.2.0-alpha.4
This is the official v0.2.0-alpha.4 release.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-29 19:31:59 -07:00
Andrew Rynhard
92b72311c7 chore: add AMI build
This will build AMIs and publish them to our official account on a
release.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-29 18:56:33 -07:00
Andrey Smirnov
587011e250 chore: remove hack/dev/ scripts & docker-compose
They are outdated, `osctl cluster` implements cluster up/down in a
better way. K8s manifests are left intact, they are used in integration
tests.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-30 00:47:58 +03:00
Andrew Rynhard
e63c882b89 refactor: split machined into phases
This change aims to standardize the boot process. It introduces the
concept of a phase, which is composed of tasks. Phases are run in serial and
the tasks that make up a phase are run concurrently.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-29 12:40:03 -07:00
Andrey Smirnov
f56a9d5b96 chore: implement first version of CRI runner
It runs containers via the CRI interface in a pod sandbox. This is the very
first version: I tried not to introduce any changes to the common runner
interface.

There should be some CRI-specific options for the runner (like polling
interval, as it doesn't have a nice `Wait()` API), plus my plan so far is
to use OCI as the common layer for container options, so that we can
analyze OCI and translate to CRI (when possible, returning errors when
an option is not implemented).

CRI interface doesn't have a concept of 'unpacking' an image, so we
probably need to unpack via containerd API (or any other
runtime-specific API) by targeting CRI namespace.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-26 21:07:46 +03:00
Andrey Smirnov
3e6993c648 chore: fix build cache
Remove `-a` flag to `go build` which caused cache to be missed all the
time. Add cache mount where missing, update path to match Go build cache
exactly.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-26 10:38:00 -07:00
Andrew Rynhard
b7a9acbe88 refactor: move setup logic into machined
The responsibility of init should only be to mount the rootfs. This
change moves Talos specific logic into machined. This will allow us to
define a version of Talos in a single binary instead of split across
two. This will enable cleaner upgrades and helps make the codebase
easier to reason about.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-26 07:48:49 -07:00
Andrew Rynhard
a7d76b9410 fix: Run cleanup script earlier in rootfs build
This change fixes a bug that caused the API server to fail due to a
missing directory at /usr/share/ca-certificates.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-25 14:51:13 -07:00
Andrew Rynhard
6852fa969f chore: create raw image as sparse file
This change reduces the size of raw disk significantly by creating it as
a sparse file.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-25 11:28:07 -07:00
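For readers unfamiliar with sparse files, the idea behind 6852fa969f can be illustrated in Go. The commit does not say which tool the build uses, so this is only a sketch under assumptions: `createSparse` and `sparseDemo` are hypothetical helpers, and whether blocks are really left unallocated depends on the filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// createSparse creates (or truncates) the file at path and sets its logical
// size without writing any data. On filesystems that support sparse files,
// the unwritten range occupies no disk blocks.
func createSparse(path string, size int64) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}

	if err := f.Truncate(size); err != nil {
		f.Close()
		return err
	}

	return f.Close()
}

// sparseDemo creates a 64 MiB sparse file in a temporary directory and
// returns its logical size as reported by os.Stat.
func sparseDemo() (int64, error) {
	dir, err := os.MkdirTemp("", "sparse-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "disk.raw")
	if err := createSparse(path, 64<<20); err != nil {
		return 0, err
	}

	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}

	return info.Size(), nil
}

func main() {
	size, err := sparseDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	fmt.Println("logical size in bytes:", size)
}
```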
Andrew Rynhard
0ec17e4169 feat: run rootfs from squashfs
This change moves the rootfs to a squashfs image.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-25 08:38:31 -07:00
Andrew Rynhard
0b8778d772 feat: enable missing KSPP sysctls
These were disabled in previous versions of Talos since BPF was
completely disabled. With this change, we now implement all recommended
sysctls.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-24 22:41:43 -07:00
Andrew Rynhard
5a68b8b371 fix: mount cgroups properly
This change mounts cgroups properly.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-24 22:10:15 -07:00
Andrew Rynhard
b4383e35db feat: move df API to init
This change allows for more accurate mount reporting as /proc/mounts is
a symlink to /proc/self/mounts and contains mounts that are relative to
the running process. In our case this was osd. This caused inaccurate
reporting of mounts since they were relative to osd when we really
wanted mounts relative to machined.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-24 15:28:37 -07:00
Seán C McCord
8884b85905 fix(trustd): allow hostnames for trustd endpoints
Fixes #666

Also adds IPv6 to tests for trustd endpoints

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-07-24 15:28:03 -07:00
Andrey Smirnov
b1c184b616 chore: fix GOCACHE dir location
`go env GOCACHE` reports that it is actually `/.cache` in our build
environment (probably because `$HOME` is not set?)

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-25 00:44:57 +03:00
Spencer Smith
2208eb5924 fix: check proper value of parseip in dhcp
This PR fixes a small bug where we weren't properly checking the value
of a net.ParseIP() call and setting the hostname to the first octet of
an IP.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-24 15:11:27 -04:00
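The class of bug fixed in 2208eb5924 comes down to how the result of `net.ParseIP` is interpreted: it returns `nil` when the input is not a valid IP address. A minimal sketch of such a guard is shown below; `hostnameFromDHCP` is a hypothetical helper, not the function changed by the commit.

```go
package main

import (
	"fmt"
	"net"
)

// hostnameFromDHCP decides whether a value handed out by DHCP may be used as
// a hostname. net.ParseIP returns nil for anything that is not an IP literal,
// so a non-nil result means the value is an address and must be rejected.
func hostnameFromDHCP(value string) (string, bool) {
	if value == "" || net.ParseIP(value) != nil {
		return "", false
	}

	return value, true
}

func main() {
	for _, v := range []string{"node-1", "10.0.0.5", "::1"} {
		host, ok := hostnameFromDHCP(v)
		fmt.Printf("%q -> hostname=%q usable=%v\n", v, host, ok)
	}
}
```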
Andrey Smirnov
8c59adb9dc chore: allow to run tests only for specified packages
This allows running `make test TESTPKGS=./internal/app/machined`.

Also update Dockerfile slug as
https://github.com/moby/buildkit/pull/1081 was merged into master.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-23 22:17:22 +03:00
Spencer Smith
45def0a242 feat: attempt to connect to all trustd endpoints when downloading PKI
This PR will connect to each trustd endpoint specified, returning once
successful. Closes #891.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-22 19:40:12 -04:00
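The behaviour described in 45def0a242, trying each endpoint and returning on the first success, might look roughly like this. It is a sketch only: `firstSuccessful` and the stand-in `fakeFetch` are hypothetical, the endpoint names are made up, and the real code talks to trustd rather than calling a plain function. `errors.Join` requires Go 1.20 or newer.

```go
package main

import (
	"errors"
	"fmt"
)

// firstSuccessful calls fetch for each endpoint in order and returns the
// first successful result. If every endpoint fails, the individual errors
// are joined so that none of them is lost.
func firstSuccessful(endpoints []string, fetch func(endpoint string) ([]byte, error)) ([]byte, error) {
	if len(endpoints) == 0 {
		return nil, errors.New("no endpoints configured")
	}

	var errs []error

	for _, ep := range endpoints {
		data, err := fetch(ep)
		if err == nil {
			return data, nil
		}

		errs = append(errs, fmt.Errorf("%s: %w", ep, err))
	}

	return nil, errors.Join(errs...)
}

// fakeFetch is a stand-in for a real download: only the endpoint named
// good succeeds, every other endpoint reports a connection error.
func fakeFetch(good string) func(string) ([]byte, error) {
	return func(endpoint string) ([]byte, error) {
		if endpoint != good {
			return nil, errors.New("connection refused")
		}

		return []byte("pki from " + endpoint), nil
	}
}

func main() {
	endpoints := []string{"master-1", "master-2", "master-3"}

	data, err := firstSuccessful(endpoints, fakeFetch("master-2"))
	fmt.Println(string(data), err)
}
```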
Andrew Rynhard
6fa7c1fcbd chore: compress Azure image
The image needs to be compressed in order to publish it to GitHub.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-22 14:48:24 -07:00
Andrew Rynhard
32961efbe0 chore: remove the raw disk after Azure build
The raw disk causes the release to fail.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-22 14:20:50 -07:00
Andrew Rynhard
fdb48c981a chore: fix release
This change makes the release step wait for the Azure artifact.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-22 13:49:04 -07:00
Andrew Rynhard
0e2b5f9227 chore: fix image builds on tags
The GCE and Azure steps need to run in serial since they both output the
same artifact name.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-22 13:45:32 -07:00
Andrew Rynhard
1553a31d20 chore: prepare v0.2.0-alpha.3 release
Please see the CHANGELOG for a list of changes.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-22 12:56:25 -07:00
Spencer Smith
089890f36b chore: setup gce for e2e builds
This PR will provide a basis for running e2e tests on GCE several times
a day. We'll need to add a cron event to the drone repo once merged.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-22 12:46:02 -04:00
Andrew Rynhard
88bdedf3e6 fix: make /etc/resolv.conf writable
We need /etc/resolv.conf to be writable so that networkd can update it.
This change achieves this by creating a symlink at /etc/resolv.conf that
points to /var/resolv.conf.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-19 20:37:00 -07:00
Andrey Smirnov
7df9ef049c chore: repair 'make all'
Target 'binaries' was referencing non-existent `Dockerfile` target, so
any `make all` failed.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-19 08:12:23 -07:00
Andrey Smirnov
9f9acf1f05 chore: run tests in the buildkit itself
This relies on two PRs to the buildkit:

* https://github.com/moby/buildkit/pull/1081
* https://github.com/moby/buildkit/pull/1085

The sysfs fix was merged upstream, so the tag was updated; by using the
`Dockerfile` slug I can switch to dockerfile2llb with support for
`--security=insecure`.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-19 07:53:49 -07:00
Spencer Smith
a15499d25a fix: Only generate pki from trustd if not control plane
This PR will fix a bug where the non-init nodes were not generating
their certs locally and were relying on trustd instead. This broke down
because we aren't saving the CA key when we're generating with the
trustd identity function (because we don't need it for workers).

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-18 20:20:38 -04:00
Matt Welch
8d3ee182d9 docs: minor spelling corrections.
Minor spelling corrections.

Signed-off-by: Matt Welch <matt.welch@gmail.com>
2019-07-18 16:40:08 -07:00
Spencer Smith
c9f0dbbd4c feat: set default mtu for gce platform
This PR is needed so that the eth0 device will have the proper MTU when
coming online in Google Cloud.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-17 19:16:50 -04:00
Spencer Smith
4a31b66850 feat: allow mtu specification for network devices
This PR is needed so we can specify an MTU of 1460 for GCE VMs.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-17 13:51:23 -07:00
Andrew Rynhard
7a5f56cd10 chore: prepare release v0.1.0-rc.0
This commit updates the CHANGELOG.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-17 13:27:38 -07:00
Brad Beam
f650e32833 fix: Truncate hostname if necessary
We should only set the hostname to the actual host name instead of the FQDN.
This hasn't been much of an issue, but GCE does return the FQDN for the
hostname field in DHCP.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-07-17 09:17:26 -07:00
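Truncating an FQDN to its host part, as f650e32833 describes, amounts to keeping the first DNS label. A small sketch follows; `shortHostname` is a hypothetical name, the sample FQDN is made up, and `strings.Cut` needs Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"strings"
)

// shortHostname returns the part of name before the first dot. A name
// without a dot is returned unchanged.
func shortHostname(name string) string {
	host, _, _ := strings.Cut(name, ".")

	return host
}

func main() {
	fmt.Println(shortHostname("talos-1.c.example-project.internal"))
	fmt.Println(shortHostname("talos-1"))
}
```

A helper like this should only be applied to names, not to IP literals, where the first dot-separated field would be the first octet.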
Spencer Smith
6fd685dad0 feat: allow specification of mtu for cluster create
This PR adds the ability to set mtu for the cluster create networks.
Default is 1440, which seems to be the default for calico.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-17 07:34:28 -07:00
Andrew Rynhard
75ea51633c fix: prefix file stat with rootfs prefix
Without this, the check for the existence of the symlinks created in the
rootfs preparation step will always fail. On a reboot init will fail
because it tries to create a symlink that already exists.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-16 22:09:30 -07:00
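The failure mode in 75ea51633c, checking for a symlink at the wrong path and then failing to create one that already exists, can be avoided by joining the rootfs prefix before the stat. The sketch below is an illustration under assumptions: `ensureSymlink` and `symlinkDemo` are hypothetical helpers, a temporary directory stands in for the rootfs, and the `/etc/resolv.conf` to `/var/resolv.conf` link is borrowed from 88bdedf3e6 purely as an example.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// ensureSymlink creates a symlink at prefix+link pointing to target, unless
// something already exists at that path. The existence check uses the same
// prefixed path as the creation, which makes the call safe to repeat.
func ensureSymlink(prefix, target, link string) error {
	path := filepath.Join(prefix, link)

	_, err := os.Lstat(path)
	if err == nil {
		return nil // already present, e.g. after a reboot
	}

	if !errors.Is(err, os.ErrNotExist) {
		return err
	}

	return os.Symlink(target, path)
}

// symlinkDemo runs ensureSymlink twice against a temporary rootfs, as if the
// node had rebooted, and returns the target the link points to.
func symlinkDemo() (string, error) {
	rootfs, err := os.MkdirTemp("", "rootfs-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(rootfs)

	if err := os.Mkdir(filepath.Join(rootfs, "etc"), 0o755); err != nil {
		return "", err
	}

	for boot := 0; boot < 2; boot++ {
		if err := ensureSymlink(rootfs, "/var/resolv.conf", "/etc/resolv.conf"); err != nil {
			return "", err
		}
	}

	return os.Readlink(filepath.Join(rootfs, "etc", "resolv.conf"))
}

func main() {
	target, err := symlinkDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	fmt.Println("link points to:", target)
}
```

`os.Lstat` is used rather than `os.Stat` so that a dangling link still counts as existing.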
Andrew Rynhard
4c4141d161 chore: publish Azure image on releases
Produces a VHD suitable for uploading to Azure and creating a Talos
node.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-16 21:21:53 -07:00
Andrew Rynhard
fe2b81f4b4 fix: create symlinks to /etc/ssl/certs
In order to accommodate the various ways that SSL certs are managed by
the different Linux distros, kubeadm creates control plane components
with volume mounts of the type DirectoryOrCreate to all well known SSL
cert locations. This change creates symlinks to /etc/ssl/certs at all the
well known paths to account for the fact that the rootfs is read-only.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-16 16:35:59 -07:00
Andrew Rynhard
8e8aae98dd feat: add machined
This commit splits our current init into init and machined.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-16 13:12:21 -07:00
Brad Beam
7adef1ea62 feat(init): Add azure as a supported platform
Update initramfs to interact with azure endpoints for userdata.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-07-16 12:59:53 -07:00
Brad Beam
e9482a4041 fix: Fix integration of extra kernel args
Switch from `StringSliceVar` to `StringArrayVar` to maintain commas
in kernel args.

Update entrypoint script to allow specifying extra kernel args.

Remove default console settings in kernel config.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-07-16 14:38:55 -05:00
Andrew Rynhard
40ae00d90c chore: add step to drone for kernel
Now that we manage dependencies manually, we need to explicitly build
the kernel target so that vmlinuz and vmlinux are placed into the build
directory.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-15 15:55:15 -07:00
Andrew Rynhard
0a21502e4d chore: prepare release v0.2.0-alpha.2
Details can be found in the CHANGELOG.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-15 14:40:42 -07:00
Andrew Rynhard
0c17564398 chore: move init to /sbin
In order to run Talos with ignite, we need to have init at /sbin/init.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-15 13:26:09 -07:00