Currently we generate a signature for the actual composefs image, and
then we apply that when we enable fsverity on the composefs
image. However, there are some issues with this.
First of all, such a signed fs-verity image file can only be read if
the corresponding public key is loaded into the fs-verity
keyring. In a typical secure setup we will have a per-commit key that
is loaded from the initrd. Additionally, the keyring is often sealed
to avoid loading more keys later.
This means you can only ever mount (or even look at) composefs images
from the current boot. While this is not a huge issue it is something
of a pain for example when debugging things.
Secondly, and more problematically, during a deploy we can't enable
fs-verity on the newly created composefs file, because at that
point you need to pass in the signature. Unfortunately this will fail
if the matching public key is not in the keyring, which it will not be,
for the same reasons as in the first issue.
The current workaround is to *not* enable fs-verity during deploy, but
write the signature to a file. Then the first time the particular
commit is booted we apply the signature to the image. This works
around issue two, but not issue one. It also causes us to do a lot of
writes and computation during the first boot, as we need to write the
fs-verity merkle tree to disk. It would be much better and more robust
if the merkle tree could be written during the deployment of the update
(i.e. before boot).
The new approach is to always deploy an unsigned, but fs-verity
enabled composefs image. Then we create separate files that contain
the expected digest, and a signature of that file. On the first boot
we sign the digest file, and on further boots we can just verify
that it is signed before using it.
This fixes issue 1, since all deploys are always readable, and it
makes the workaround for issue 2 much less problematic, as we only
need to change a much smaller file on the first boot.
Long term I would like to avoid the first-boot writing totally, and
I've been chatting with David Howells (kernel keyring maintainer) and
he proposed adding a new keyring syscall that verifies a PKCS#7
signature from userspace directly. This would be exactly what
fs-verity does, except we wouldn't have to write the digest to disk
during boot, we would just read the digest file and the signature file
each boot and ask the kernel to verify it.
Some kernel images are delivered as a signed kernel + cmdline +
initramfs + dtb blob. The cmdline is only known after this blob has
been added to the commit on the server side, which creates a recursion
issue. To avoid this, when the ostree=aboot karg is set, create a
symlink after deploy to the correct ostree target in the rootfs, as
the cmdline can't be malleable and secured client-side at the same
time.
In an installation environment (like a live ISO) we may
not have significant space outside of the target installation
repository.
There's no reason not to always open a linkable tempfile. In
the future we should fix the pull path to verify the checksum
and then just directly link in the object instead of copying.
Closes: https://github.com/ostreedev/ostree/issues/2571
This commit addresses a bug that caused ostree deployments to become
corrupted on large filesystems when any package was installed using
`rpm-ostree install`.
In such instances, multiple files were assigned the same inode. For
example, the '/home' directory and a regular file 'pkg-get' were
assigned the same inode (2147484070), making the deployment unusable.
A root cause analysis was performed, running the process under gdb,
which revealed a lossy conversion from guint64 to guint32, for example
6442451366 converted to 2147484070:
(gdb) p name
$10 = 0x7fe9224d2d70 "home"
(gdb) p inode
$73 = 6442451366
(gdb) s
device=66311, modifier=0x7fe914791840) at
src/libostree/ostree-repo-commit.c:1590
The conversion resulted in entirely independent files potentially
receiving the same inode.
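To illustrate the truncation, here is a minimal Python sketch (not the libostree code) that models storing a 64-bit inode number in a 32-bit unsigned field by keeping only the low 32 bits. The helper name `truncate_to_u32` is made up for this example.

```python
def truncate_to_u32(value: int) -> int:
    """Model a guint64 -> guint32 conversion: keep only the low 32 bits."""
    return value & 0xFFFFFFFF


inode = 6442451366  # the inode value from the gdb session
truncated = truncate_to_u32(inode)
print(truncated)  # 2147484070

# A different file whose real inode is 2147484070 now collides with it.
other_inode = 2147484070
print(truncate_to_u32(other_inode) == truncated)  # True
```

Any two inodes that differ only in bits 32 and above end up with the same truncated value, which is how unrelated files could be assigned the same inode.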
The issue was discovered on a PoC machine equipped with a large NVMe
drive (3.4TB), but the bug can be easily reproduced using `cosa run -m
4000 --qemu-size +3TB`, followed by installation of any package using
`rpm-ostree install`. The resulting deployment will be unusable due to
many files being "corrupted" by the aforementioned issue.
We can't safely apply the fs-verity with signature until we have
booted with the new initrd, because the public key that matches the
signature is loaded from it. So, instead we save the .sig file next
to the composefs, and on the first boot we detect that it is there and
that the composefs file isn't fs-verity enabled, so we apply it.
Things get a bit more complex due to having to temporarily make
/sysroot read-write for the fsverity operation too.
Instead of using pkg-config etc., we just include composefs directly.
In the end the library is just five C source files, and it is set up
to be easy to use as a submodule.
For now, composefs support is disabled by default.
When using composefs the root fs will always be read-only, but in this
case we should still continue remounting /sysroot. So, we record a
/run/ostree-composefs-root.stamp file in ostree-prepare-root if composefs
is used, and then react to it in ostree-remount.
In the case of composefs, we cannot compare the devino of the rootfs
and the deploy dir, because the root is the composefs mount, not a
bind mount. Instead we check the devino of the etc subdir of the
deploy, because this is a bind mount even when using composefs.
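The devino comparison described above can be sketched in Python as follows. This is an illustration only, not the libostree implementation; the function names and the `using_composefs` parameter are hypothetical.

```python
import os


def same_devino(path_a: str, path_b: str) -> bool:
    """Return True if both paths refer to the same (device, inode) pair."""
    a = os.stat(path_a)
    b = os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def is_booted_deployment(deploy_dir: str, using_composefs: bool) -> bool:
    """Hypothetical check for whether deploy_dir is the running deployment."""
    if using_composefs:
        # "/" is the composefs mount here, so compare the etc bind mount.
        return same_devino("/etc", os.path.join(deploy_dir, "etc"))
    # Without composefs, "/" is a bind mount of the deploy dir itself.
    return same_devino("/", deploy_dir)
```

A bind mount exposes the same device and inode as its source directory, which is why comparing `/etc` against the deploy's `etc` subdirectory still works when the root itself is not a bind mount.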
This changes ostree-prepare-root to use the .ostree.cfs image as a
composefs filesystem, instead of the checkout.
By default, composefs is used if support is built in and the .ostree.cfs
file exists in the deploy dir, otherwise we fall back to the old
method. However, if the ot-composefs kernel option is specified this
can be tweaked as per:
* off: Never use composefs
* maybe: Use if possible
* on: Fail if not possible
* signed: Fail if the cfs image is not fs-verity signed with
a key in the keyring.
* digest=....: Fail if the cfs image does not match the specified
digest.
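As a rough illustration of the option values listed above, the following Python sketch parses an `ot-composefs=` argument out of a kernel command line string. It is not the ostree-prepare-root code; the `ComposefsPolicy` type and the function name are invented for the example, and treating a missing option as `maybe` is an assumption based on the default behaviour described above.

```python
from dataclasses import dataclass
from typing import Optional

PREFIX = "ot-composefs="


@dataclass
class ComposefsPolicy:
    mode: str                     # "off", "maybe", "on", "signed" or "digest"
    digest: Optional[str] = None  # only set for digest=...


def parse_ot_composefs(cmdline: str) -> ComposefsPolicy:
    """Extract the ot-composefs policy from a kernel command line."""
    value = "maybe"  # assumed default: use composefs if possible
    for arg in cmdline.split():
        if arg.startswith(PREFIX):
            value = arg[len(PREFIX):]
    if value.startswith("digest="):
        return ComposefsPolicy("digest", value[len("digest="):])
    if value in ("off", "maybe", "on", "signed"):
        return ComposefsPolicy(value)
    raise ValueError(f"unsupported ot-composefs value: {value!r}")
```

For example, `parse_ot_composefs("root=/dev/vda3 ot-composefs=signed")` yields a policy with `mode == "signed"`.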
The final layout when composefs is active is:
  /         ro overlayfs mount for composefs
  /sysroot  "real" root
  /etc      rw bind mount to $deploydir/etc
  /var      rw bind mount to $vardir
We also specify the $deploydir/.ostree-mnt directory as the (internal)
mountpoint for the erofs mount for composefs. This can be used to map
the root fs back to the deploy id/dir in use.
A further note: I didn't test the .usr-ovl-work overlayfs case, but a
comment mentions that you can't mount overlayfs on top of a readonly
mount. That seems incompatible with composefs. If this is needed we
have to merge that with the overlayfs that composefs itself sets up,
which is possible with the libcomposefs APIs.
In many cases, such as when using osbuild, we are not preparing the final
deployment but rather a rootfs tree that will eventually be copied to the
final location. In that case we don't want to apply the signature directly,
but rather when the deployment is copied in place.
To make this situation workable we also write the signature to a file
next to the composefs image file. Then whatever mechanism that does
the final copy can apply the signature.
This can be used as a composefs source for the root fs instead of
the checkout by pointing the basedir to /ostree/repo/objects.
We only write the file if `composefs` is enabled.
We enable ensure_rootfs_dirs when building the image which adds the
required root dirs to the image. In particular, this includes /etc
which often isn't in ostree commits in use.
We also create an (empty) .ostree.mnt directory, where composefs
will mount the erofs image that will be used as overlayfs lowerdir
for the root overlayfs mount. This way we can find the deploy
dir from the root overlayfs mount options.
If the commit has composefs digests recorded we verify those with the
created file. It also applies the fs-verity signature if it is
recorded, unless this is disabled with the
ex-integrity.composefs-apply-sig=false option.
If `composefs-apply-sig` is enabled (default no) we add an
ostree.composefs digest to the commit metadata. This can be verified
on deploy.
This is a separate option from the generic `composefs` option which
controls whether composefs is used during deploy. It is separate
because we want to not have to force use of fs-verity, etc during the
build.
If the `composefs-certfile` and `composefs-keyfile` keys in the
ex-integrity group are set, then the commit metadata also gets an
ostree.composefs-sig containing the signature of the composefs file.
This supports checking out a commit into a tree which is then
converted into a composefs image containing fs-verity digests for all
the regular files, and payloads that are relative to the
`repo/objects` directory of a bare ostree repo.
Some special files are always created in the image. This ensures that
various directories (usr, etc, boot, var, sysroot) exist in the
created image, even if they were not in the source commit. These are
needed (as bindmount targets) if you want to boot from the image. In
the non-composefs case these are just created as needed in the checked
out deploydir, but we can't do that here.
This is all controlled by the new ex-integrity config section, which
has the following layout:
```
[ex-integrity]
fsverity=yes/no/maybe
composefs=yes/no/maybe
composefs-apply-sig=yes/no
composefs-add-metadata=yes/no
composefs-keyfile=/a/path
composefs-certfile=/a/path
```
The `fsverity` key overrides the old `ex-fsverity` section if
specified. The default for all these is for the new behaviour to be
disabled. Additionally, enabling composefs implies fsverity defaults
to `maybe`, to avoid having to set both.
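The defaulting rules above can be illustrated with a small Python sketch using `configparser`. This is not how ostree reads its repo config; the function name and the returned dictionary are made up for the example, and treating any `composefs` value other than `no` as "enabled" is an assumption.

```python
import configparser

SAMPLE = """
[ex-integrity]
composefs=yes
composefs-apply-sig=no
"""


def load_integrity(text: str) -> dict:
    """Resolve ex-integrity settings, applying the documented defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["ex-integrity"] if parser.has_section("ex-integrity") else {}
    composefs = section.get("composefs", "no")
    # Assumption: "yes" and "maybe" both count as enabling composefs.
    fsverity_default = "maybe" if composefs != "no" else "no"
    return {
        "composefs": composefs,
        "fsverity": section.get("fsverity", fsverity_default),
        "composefs-apply-sig": section.get("composefs-apply-sig", "no"),
        "composefs-add-metadata": section.get("composefs-add-metadata", "no"),
    }


print(load_integrity(SAMPLE)["fsverity"])  # maybe
```

With an empty config every key resolves to `no`, matching the statement that the new behaviour is disabled by default.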
The `f_bfree` member of the `statvfs` struct is documented as the
"number of free blocks". However, different filesystems have different
interpretations of this. E.g. on XFS, this is truly the number of blocks
free for allocating data. On ext4 however, it includes blocks that
are actually reserved by the filesystem and cannot be used for file
data. (Note this is separate from the distinction between `f_bfree` and
`f_bavail` which isn't relevant to us here since we're privileged.)
If a kernel and initrd are sized just right so that they're still within the
`f_bfree` limit but above what we can actually allocate, the early prune
code won't kick in since it'll think that there is enough space. So we
end up hitting `ENOSPC` when we actually copy the files in.
Rework the early prune code to instead use `fallocate` which guarantees
us that a file of a certain size can fit on the filesystem. `fallocate`
requires filesystem support, but all the filesystems we care about for
the bootfs support it (including even FAT).
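The idea of probing for space by allocating rather than by reading `f_bfree` can be sketched in Python as below. This is only an illustration, not the libostree code: the function name `can_fit` is hypothetical, and `os.posix_fallocate` wraps `posix_fallocate(3)`, which may emulate allocation by writing on filesystems without native support, unlike a direct `fallocate(2)` call. Treating every `OSError` as "does not fit" is also a simplification.

```python
import os
import tempfile


def can_fit(directory: str, size: int) -> bool:
    """Check whether a file of `size` bytes can be allocated in `directory`."""
    if size <= 0:
        return True
    # The temporary file is deleted automatically when it is closed,
    # which releases the blocks that were reserved for the probe.
    with tempfile.TemporaryFile(dir=directory) as tmp:
        try:
            os.posix_fallocate(tmp.fileno(), 0, size)
        except OSError:
            return False  # e.g. ENOSPC: the filesystem cannot hold the file
    return True
```

A caller could run `can_fit("/boot", kernel_size + initrd_size)` before copying and trigger an early prune when it returns `False`; the path and size names here are placeholders.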
(There's technically a TOCTOU race here that existed also with the
`statvfs` code where free space could change between when we check
and when we copy. Ideally we'd be able to pass down that fd to the
copying bits, but anyway in practice the bootfs is pretty much owned by
libostree and one doesn't expect concurrent writes during a finalization
operation.)
Flathub hit the 10MB limit in 2022, and we had to move less popular
CPU architectures from the main summary to subsummaries, effectively
cutting off users running too old a Flatpak version. Despite that, the
main summary containing only x86_64 is already at 7MB. As this is
eventually going to happen to subsummaries as well, preemptively bump
the limit by a factor of 12.
It takes between 2 and 3 years for a change like this to roll out across
Linux distributions so the best time for this was yesterday.
Fixes #2715
Main motivation is prep for composefs in
https://github.com/ostreedev/ostree/pull/2640
In the interest of that, we add a `bool using_composefs` but
it's currently always `false`.
Co-authored-by: Alexander Larsson <alexl@redhat.com>