66 Commits

Author SHA1 Message Date
Seán C McCord
d0ff28a8c7 fix: enclose server address in brackets if IPv6
Fixes #980

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-08-10 17:42:17 -07:00
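
The bracket rule mentioned in the commit above can be illustrated with
the Go standard library. This is a minimal sketch, not the code from
the commit: `formatEndpoint`, the addresses, and port 50000 are
hypothetical names and values chosen for the example.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// formatEndpoint joins a host and a port into "host:port".
// net.JoinHostPort wraps the host in square brackets when it contains
// a colon, which is the case for IPv6 literals.
func formatEndpoint(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	fmt.Println(formatEndpoint("10.5.0.2", 50000)) // IPv4: no brackets
	fmt.Println(formatEndpoint("fd00::2", 50000))  // IPv6: brackets added
}
```

Without the brackets, the colons of an IPv6 address are ambiguous with
the colon that separates host and port.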
Andrey Smirnov
ae54f7e40d fix: stalls in local Docker cluster boot
The problem was triggered by the udevd trigger. The root cause is not
clear, but the workaround is to disable it in container mode.

Implement CPU/mem limits for `osctl cluster create`, apply defaults,
bump defaults for cicd.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-08-10 13:31:47 +03:00
Andrew Rynhard
90c91807bd refactor: restructure the project layout
This change moves packages into more appropriate places.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-01 22:19:42 -07:00
Andrew Rynhard
ca35b85300 refactor: improve installation reliability
This change aims to make installations more unified and reliable. It
introduces the concept of a mountpoint manager that is capable of
mounting, unmounting, and moving a set of mountpoints in the correct
order.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-08-01 11:44:40 -07:00
Andrey Smirnov
9c63f4ed0a feat(init): implement complete API for service lifecycle (start/stop)
It is now possible to `start`/`stop`/`restart` any service via `osctl`
commands.

There are some changes in `ServiceRunner` to support re-use (re-entering
running state). `Services` singleton now tracks service running state to
avoid calling `Start()` on already running `ServiceRunner` instance.
Method `Start()` was renamed to `LoadAndStart()` to break up service
loading (adding to the list of services) and actual service start.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-08-01 11:16:57 -07:00
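
The idea of tracking running state so that a start request is ignored
for a service that is already running could look roughly like the
sketch below. The `registry` type and its methods are hypothetical
stand-ins, not the actual `Services` singleton or `ServiceRunner` API.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for a singleton that remembers
// which services are currently running.
type registry struct {
	mu      sync.Mutex
	running map[string]bool
}

func newRegistry() *registry {
	return &registry{running: make(map[string]bool)}
}

// Start marks id as running. It reports false when the service was
// already running, so the caller can skip starting it a second time.
func (r *registry) Start(id string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.running[id] {
		return false
	}
	r.running[id] = true
	return true
}

// Stop clears the running flag. It reports false when the service was
// not running.
func (r *registry) Stop(id string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if !r.running[id] {
		return false
	}
	delete(r.running, id)
	return true
}

func main() {
	r := newRegistry()
	fmt.Println(r.Start("ntpd")) // first start is accepted
	fmt.Println(r.Start("ntpd")) // duplicate start is rejected
	fmt.Println(r.Stop("ntpd"))  // stop makes the service startable again
	fmt.Println(r.Start("ntpd"))
}
```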
Andrey Smirnov
ac963ad7e1 feat(osctl): allow configurable number of masters to cluster create
This allows running tiny Talos clusters (which is sometimes nice for
local testing), e.g. with just a single master and zero workers.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-30 15:32:16 -07:00
Andrew Rynhard
e63c882b89 refactor: split machined into phases
This change aims to standardize the boot process. It introduces the
concept of a phase, which is composed of tasks. Phases are run in serial and
the tasks that make up a phase are run concurrently.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-29 12:40:03 -07:00
Andrew Rynhard
6852fa969f chore: create raw image as sparse file
This change reduces the size of raw disk significantly by creating it as
a sparse file.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-25 11:28:07 -07:00
Andrew Rynhard
0ec17e4169 feat: run rootfs from squashfs
This change moves the rootfs to a squashfs image.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-25 08:38:31 -07:00
Andrew Rynhard
b4383e35db feat: move df API to init
This change allows for more accurate mount reporting as /proc/mounts is
a symlink to /proc/self/mounts and contains mounts that are relative to
the running process. In our case this was osd. This caused inaccurate
reporting of mounts since they were relative to osd when we really
wanted mounts relative to machined.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-24 15:28:37 -07:00
Spencer Smith
6fd685dad0 feat: allow specification of mtu for cluster create
This PR adds the ability to set mtu for the cluster create networks.
Default is 1440, which seems to be the default for calico.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-17 07:34:28 -07:00
Andrew Rynhard
8e8aae98dd feat: add machined
This commit splits our current init into init and machined.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-16 13:12:21 -07:00
Brad Beam
e9482a4041 fix: Fix integration of extra kernel args
Switch from `StringSliceVar` to `StringArrayVar` to maintain commas
in kernel args.

Update entrypoint script to allow specifying extra kernel args.

Remove default console settings in kernel config.

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-07-16 14:38:55 -05:00
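
The difference between a flag that splits its value on commas and one
that keeps each occurrence verbatim is what matters for kernel args
such as `console=ttyS0,115200`. The sketch below reproduces the two
behaviours with the standard `flag` package only; it does not use the
flag library from the commit, and `sliceValue`, `arrayValue`, `parse`,
and the flag names are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// sliceValue splits every occurrence of the flag on commas.
type sliceValue []string

func (s *sliceValue) String() string { return strings.Join(*s, ",") }
func (s *sliceValue) Set(v string) error {
	*s = append(*s, strings.Split(v, ",")...)
	return nil
}

// arrayValue keeps every occurrence of the flag exactly as given.
type arrayValue []string

func (a *arrayValue) String() string { return strings.Join(*a, " ") }
func (a *arrayValue) Set(v string) error {
	*a = append(*a, v)
	return nil
}

// parse feeds the same kind of arguments to both flag styles.
func parse(args []string) (split, verbatim []string, err error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	var s sliceValue
	var a arrayValue
	fs.Var(&s, "split", "value is split on commas")
	fs.Var(&a, "verbatim", "value is kept as-is")
	err = fs.Parse(args)
	return s, a, err
}

func main() {
	split, verbatim, err := parse([]string{
		"-split", "console=ttyS0,115200",
		"-verbatim", "console=ttyS0,115200",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("split:    %q\n", split)    // the arg is broken in two
	fmt.Printf("verbatim: %q\n", verbatim) // the arg survives intact
}
```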
Andrew Rynhard
0c17564398 chore: move init to /sbin
In order to run Talos with ignite, we need to have init at /sbin/init.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-15 13:26:09 -07:00
Andrew Rynhard
d197d5c6cd feat: add install flag for extra kernel args
In addition to adding a flag, this adds a field to the user data that allows
for extra kernel arguments to be specified.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-12 13:27:44 -07:00
Spencer Smith
ff9934cfe2 chore: update toolchain version and output created config files
Decided to combine two very small changes (which I'm now grumpy at myself for doing).

First, we'll update the toolchain image versions to allow for the use of a new containerd and runc. Also updated go.mod and go.sum to make use of newer containerd version. Closes #743 and #744.

Second, I added the bit of logic to osctl config generate to determine the working directory and let the user know that we created the various yaml files there. Closes #760.

Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-07-05 17:59:25 -04:00
Andrew Rynhard
5d8ee0a3a5 fix: use existing logic to perform reset
This PR moves the reset API to the init API definition.
It leverages the same code we use for upgrades.

Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-04 18:26:14 -07:00
Andrew Rynhard
cca60ed121
fix: probe specified install device (#818)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-07-02 20:46:29 -07:00
Andrey Smirnov
237e903f91 feat(osd): implement CRI inspector for containers (#817)
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-07-02 15:48:00 -07:00
Andrey Smirnov
0662af19d1 chore: seed math.rand PRNG on startup in every service (#801)
This is important as otherwise `math/rand` outputs the same predictable
sequence on every run.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-28 11:03:15 -07:00
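
The predictability is easy to demonstrate: two generators created with
the same seed produce the same numbers. The sketch below uses explicit
`rand.New` sources rather than the global generator; `firstN` is a
hypothetical helper. Note that Go releases from 1.20 onward seed the
global generator randomly by default, which was not the case for the
Go version in use at the time of this commit.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// firstN returns the first n values in [0, 100) produced by a
// generator seeded with seed.
func firstN(seed int64, n int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = r.Intn(100)
	}
	return out
}

func main() {
	// Same seed: the two sequences are identical.
	fmt.Println(firstN(1, 5))
	fmt.Println(firstN(1, 5))

	// Time-based seed: the sequence differs from run to run.
	fmt.Println(firstN(time.Now().UnixNano(), 5))
}
```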
Andrey Smirnov
17f28d3461 feat(osctl): improve output of stats and ps commands (#788)
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-26 15:37:54 -07:00
Andrey Smirnov
6d5ee0ca80
feat(init): unify filesystem walkers for ls/cp APIs (#779)
This unifies low-level filesystem walker code for `ls` and `cp`.

New features:

* `ls` now reports relative filenames
* `ls` now prints symlink destination for symlinks
* `cp` now properly always reports errors from the API
* `cp` now reports all the errors back to the client

Example for `ls`:

```
osctl-linux-amd64 --talosconfig talosconfig ls -l /var
MODE          SIZE(B)   LASTMOD       NAME
drwxr-xr-x    4096      Jun 26 2019   .
Lrwxrwxrwx    4         Jun 25 2019   etc -> /etc
drwxr-xr-x    4096      Jun 26 2019   lib
drwxr-xr-x    4096      Jun 21 2019   libexec
drwxr-xr-x    4096      Jun 26 2019   log
drwxr-xr-x    4096      Jun 21 2019   mail
drwxr-xr-x    4096      Jun 26 2019   opt
Lrwxrwxrwx    6         Jun 21 2019   run -> ../run
drwxr-xr-x    4096      Jun 21 2019   spool
dtrwxrwxrwx   4096      Jun 21 2019   tmp
-rw-------    14979     Jun 26 2019   userdata.yaml
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-26 17:43:09 +03:00
Seán C. McCord
81163cefb4 feat(osd): extend Routes API (#756)
Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-06-22 08:03:13 -07:00
Andrey Smirnov
76071abbb8
feat(init): move 'ls' API to init from osd (#755)
Service `osd` doesn't have access to rootfs, as it is running in a
container, so move API to `init` which has unconstrained access to
rootfs. (This is in line with another API, `osctl cp`).

Fixes: #752

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-21 22:29:39 +03:00
Andrew Rynhard
1f36f0e7df
refactor(osctl): use UserHomeDir to detect user home directory (#749)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-06-20 17:57:57 -07:00
Andrey Smirnov
9ed45f7090 feat(osctl): implement 'cp' to copy files out of the Talos node (#740)
Actual API is implemented in the `init`, as it has access to root
filesystem. `osd` proxies API back to `init` with some tricks to support
grpc streaming.

Given some absolute path, `init` produces and streams back .tar.gz
archive with filesystem contents.

`osctl cp` works in two modes. First mode streams data to stdout, so
that we can do e.g.: `osctl cp /etc - | tar tz`. Second mode extracts
archive to specified location, dropping ownership info and adjusting
permissions a bit. Timestamps are not preserved.

If a full dump with owner/permissions is required, it's better to stream
data to `tar xz`. For a quick and dirty look into filesystem contents
under an unprivileged user, it's easier to use in-place extraction.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-20 17:02:58 -07:00
Andrey Smirnov
0c0a0340b2
fix(osctl): allow '-target' flag for osctl restart (#732)
I couldn't find any use for the `timeout` flag nor the value passed in
the API, but it blocks the much more useful 'target' flag, which is
present in other commands.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-14 21:37:57 +03:00
Andrey Smirnov
fb320a894b
fix(osctl): Revert "display non-fatal errors from ps/stats in osctl (#724)" (#727)
This reverts commit f200eb7a8a0b7c2d29710f695000eb7680ce8b7d.

grpc can't send back both response and an error.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-07 22:50:05 +03:00
Seán C. McCord
532a53bfaf feat(init): Implement 'ls' command (#721)
Fixes #719

Signed-off-by: Seán C McCord <ulexus@gmail.com>
2019-06-07 10:19:20 -07:00
Andrey Smirnov
f5969d2c6c
fix(osctl): avoid panic on empty 'talosconfig' (#725)
When talosconfig doesn't exist, `osctl` creates an empty one behind the
scenes, but that leads to an immediate panic if the command tries to
build an osd client:

```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x11d6786]

goroutine 1 [running]:
github.com/talos-systems/talos/cmd/osctl/pkg/client.NewDefaultClientCredentials(0x7ffd720f5100, 0xb, 0xc000559ce8, 0x757014, 0xc0000d5500)
	/src/cmd/osctl/pkg/client/client.go:50 +0xa6
github.com/talos-systems/talos/cmd/osctl/cmd.setupClient(0x16ca3f0)
	/src/cmd/osctl/cmd/root.go:100 +0x3d
github.com/talos-systems/talos/cmd/osctl/cmd.glob..func22(0x24ad7c0, 0xc00058c240, 0x0, 0x3)
	/src/cmd/osctl/cmd/ps.go:32 +0x37
github.com/spf13/cobra.(*Command).execute(0x24ad7c0, 0xc0005f8a00, 0x3, 0x4, 0x24ad7c0, 0xc0005f8a00)
	/toolchain/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:766 +0x2ae
github.com/spf13/cobra.(*Command).ExecuteC(0x24ae140, 0x2507030, 0x162f2d7, 0xb)
	/toolchain/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:852 +0x2ec
github.com/spf13/cobra.(*Command).Execute(...)
	/toolchain/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:800
github.com/talos-systems/talos/cmd/osctl/cmd.Execute()
	/src/cmd/osctl/cmd/root.go:93 +0x24f
main.main()
	/src/cmd/osctl/main.go:10 +0x20
```

Fix that by returning explicit error:

```
error getting client credentials: 'context' key is not set in the config
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-07 17:40:28 +03:00
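
The general shape of such a fix is to validate the config before
dereferencing anything that may be nil. The sketch below is an
assumption-laden illustration, not the code in `client.go`: the
`config` and `contextConfig` types, the `Target` field, and
`credentialsTarget` are hypothetical. Only the first error message is
taken from the commit message above.

```go
package main

import (
	"errors"
	"fmt"
)

// contextConfig and config are hypothetical stand-ins for the parsed
// talosconfig structure.
type contextConfig struct {
	Target string
}

type config struct {
	Context  string
	Contexts map[string]*contextConfig
}

// credentialsTarget returns the target of the selected context. It
// returns an explicit error instead of dereferencing a nil pointer
// when the config is empty or the selected context is missing.
func credentialsTarget(c *config) (string, error) {
	if c == nil || c.Context == "" {
		return "", errors.New("'context' key is not set in the config")
	}
	ctx, ok := c.Contexts[c.Context]
	if !ok || ctx == nil {
		return "", fmt.Errorf("context %q is not defined in the config", c.Context)
	}
	return ctx.Target, nil
}

func main() {
	// An empty config, like the one created behind the scenes.
	_, err := credentialsTarget(&config{})
	fmt.Println("error getting client credentials:", err)

	// A populated config.
	target, err := credentialsTarget(&config{
		Context:  "default",
		Contexts: map[string]*contextConfig{"default": {Target: "10.5.0.2"}},
	})
	fmt.Println(target, err)
}
```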
Andrey Smirnov
f200eb7a8a
fix(osctl): display non-fatal errors from ps/stats in osctl (#724)
Logging those errors in osd makes them hard to discover.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-06-07 17:10:18 +03:00
Spencer Smith
921114dd99
fix: ensure index remains in bounds for ud gen (#710)
Signed-off-by: Spencer Smith <robertspencersmith@gmail.com>
2019-06-04 17:37:54 -04:00
Andrey Smirnov
f96d3ce7cb
fix(osctl): don't print message on first ^C (#704)
This resolves extra messages when user does ^C to stop osctl. Message is
still printed on the second ^C and process is aborted on the third.

For the `logs` command, as it is streaming, suppress the context canceled
error (before the context changes, the process crashed before printing an error).

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-05-31 23:37:57 +03:00
Brad Beam
8537e7eeb6
feat(init): Add support for control plane join config (#700)
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-05-31 12:21:00 -05:00
Andrey Smirnov
ca95469247
feat(osctl): handle ^C by aborting context (#693)
This provides a bit better handling for hanging grpc
requests (or just slow requests):

```
$ osctl-linux-amd64 --talosconfig talosconfig version
Client:
	Tag:         ad410fb-dirty
	SHA:         ad410fb-dirty
	Built:
	Go version:  go1.12.5
	OS/Arch:     linux/amd64

^CSignal received, aborting, press Ctrl+C once again to abort immediately...
error getting version: rpc error: code = Canceled desc = context canceled
```

For now we catch `SIGINT` & `SIGTERM`. Second signal kills process
immediately as signal handler is removed.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-05-30 00:11:58 +03:00
Andrey Smirnov
ad410fb7f2
refactor(osctl): move cli code out of 'client' package (#692)
This moves cli code (rendering output, etc.) out of 'client' package, so
that client package is usable outside of cli.

Consistently accept context as first param to API methods, so that we
can build graceful request cancellation.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-05-29 01:10:25 +03:00
Brad Beam
6cf260c5af fix(osctl): Generate correct config with master IPs (#681)
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-05-27 18:59:41 -07:00
Andrey Smirnov
f704cb2cc3
refactor(osctl): DRY up osctl sources by using common client setup (#686)
Remove duplicated code which was setting up grpc client with common
method. Should have no functional changes otherwise.

Add args len check where missing.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-05-27 22:55:20 +03:00
Brad Beam
d8249c8779
refactor(init): Allow kubeadm init on controlplane (#658)
* refactor(init): Allow kubeadm init on controlplane

This shifts the cluster formation from init(bootstrap) and join(control plane)
to init(control plane).

This makes use of the previously implemented initToken to provide a TTL for
cluster initialization to take place and allows us to mostly treat all control
plane nodes equal. This also sets up the path for us to handle master upgrades
and not be concerned with odd behavior when upgrading the previously defined
init node.

To facilitate kubeadm init across all control plane nodes, we make use of the
initToken to run the `kubeadm init phase certs` command to generate any missing
certificates once. All other control plane nodes will attempt to sync the
necessary certs/files via all defined trustd endpoints and begin the startup
process.

* feat(init): Add service runner context to PreFunc

Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-05-24 16:05:49 -05:00
Brad Beam
b6a01d6e5b
fix: Address lint warning for unknown linter (#676)
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-05-21 10:59:13 -05:00
Brad Beam
a64de7ed51
feat(init): Add initToken parameter to userdata (#664)
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-05-20 14:23:38 -05:00
Andrew Rynhard
496bb83078
feat: add plural alias of service command (#670)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-05-18 09:17:09 -07:00
Andrey Smirnov
75b2ce7fd2
feat(init): implement services list API and osctl service CLI (#662)
This returns list of all the services registered, with their current
status, past events, health state, etc.

New CLI is `osctl service [<id>]`: without `<id>` it prints list of all
the services, with specific `<id>` it provides details for a service.

I decided to create "parallel" data structures in protobuf as Go
structures don't map nicely onto what protoc generates: pointers vs.
values, additional fields like mutexes, etc. Probably there's a better
approach; I'm open to it.

For CLI, I tried to keep CLI stuff in `cmd/` package, and I also created
simple wrapper to remove duplicated code which sets up client for each
command.

Examples:

```
$ osctl service
SERVICE      STATE     HEALTH   LAST CHANGE   LAST EVENT
containerd   Running   OK       21s ago       Health check successful
kubeadm      Running   ?        2s ago        Started task kubeadm (PID 280) for container kubeadm
kubelet      Running   ?        0s ago        Started task kubelet (PID 383) for container kubelet
ntpd         Running   ?        14s ago       Started task ntpd (PID 129) for container ntpd
osd          Running   ?        14s ago       Started task osd (PID 126) for container osd
proxyd       Waiting   ?        14s ago       Waiting for conditions
trustd       Running   ?        14s ago       Started task trustd (PID 125) for container trustd
udevd        Running   ?        14s ago       Started task udevd (PID 130) for container udevd
```

```
$ osctl service proxyd
ID       proxyd
STATE    Running
HEALTH   ?
EVENTS   [Preparing]: Running pre state (22s ago)
         [Waiting]: Waiting for conditions (22s ago)
         [Preparing]: Creating service runner (6s ago)
         [Running]: Started task proxyd (PID 461) for container proxyd (6s ago)
```

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
2019-05-17 18:01:12 +03:00
Andrew Rynhard
18a1536b01
feat: use osctl in installer (#654)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-05-15 16:14:30 -07:00
Brad Beam
0b33280915
feat(init): Add upgrade endpoint (#623)
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-05-13 15:15:25 -05:00
Brad Beam
5485b9ecb4
fix(osctl): Fix panic on osctl df if error is returned (#646)
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-05-11 21:52:10 -05:00
Brad Beam
3d5d419b93 fix(osctl): Fix formatting of command/args to be useful (#638)
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
2019-05-09 20:58:36 -07:00
Andrew Rynhard
f0e162a7f5
refactor: move osinstall into osctl (#629)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-05-09 08:49:32 -07:00
Andrew Rynhard
9b5b2f0c7c
fix(osctl): output talosconfig on generate (#627)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-05-08 20:27:50 -07:00
Andrew Rynhard
2ea7e055a2
feat(osctl): add flag for number of workers to create (#625)
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
2019-05-08 07:29:22 -07:00