694b798c73
Tracking issue: https://github.com/projectatomic/rpm-ostree/issues/1081 To briefly recap: Let's experiment with doing ostree-in-RPM, basically the "compose" process injects additional data (SELinux labels for example) in an "ostree image" RPM, like `fedora-atomic-host-27.8-1.x86_64.rpm`. That "ostree image" RPM will contain the OSTree commit+metadata, and tell us what RPMs we need need to download. For updates, like `yum update` we only download changed RPMs, plus the new "oirpm". But SELinux labeling, depsolving, etc. are still done server side, and we still have a reliable OSTree commit checksum. This is a lot like [Jigdo](http://atterer.org/jigdo/) Here we fully demonstrate the concept working end-to-end; we use the "traditional" `compose tree` to commit a bunch of RPMs to an OSTree repo, which has a checksum, version etc. Then the new `ex commit2jigdo` generates the "oirpm". This is the "server side" operation. Next simulating the client side, `jigdo2commit` takes the OIRPM and uses it and downloads the "jigdo set" RPMs, fully regenerating *bit for bit* the final OSTree commit. If you want to play with this, I'd take a look at the `test-jigdo.sh`; from there you can find other useful bits like the example `fedora-atomic-host.spec` file (though the canonical copy of this will likely land in the [fedora-atomic](http://pagure.io/fedora-atomic) manifest git repo. Closes: #1103 Approved by: jlebon
98 lines
4.9 KiB
Markdown
98 lines
4.9 KiB
Markdown
Introducing rpm-ostree jigdo
|
|
--------
|
|
|
|
In the rpm-ostree project, we're blending an image system (libostree)
|
|
with a package system (libdnf). The goal is to gain the
|
|
advantages of both. However, the dual nature also brings overhead;
|
|
this proposal aims to reduce some of that by adding a new "jigdo"
|
|
model to rpm-ostree that makes more operations use the libdnf side.
|
|
|
|
To do this, we're reviving an old idea: The [http://atterer.org/jigdo/](Jigdo)
|
|
approach to reassembling large "images" by downloading component packages. (We're
|
|
not using the code, just the idea).
|
|
|
|
In this approach, we're still maintaining the "image" model of libostree. When
|
|
one deploys an OSTree commit, it will reliably be bit-for-bit identical. It will
|
|
have a checksum and a version number. There will be *no* dependency resolution
|
|
on the client by default, etc.
|
|
|
|
The change is that we always use libdnf to download RPM packages as they exist
|
|
today, storing any additional data inside a new "ostree-image" RPM. In this
|
|
proposal, rather than using ostree branches, the system tracks an "ostree-image"
|
|
RPM that behaves a bit like a "metapackage".
|
|
|
|
Why?
|
|
----
|
|
|
|
The "dual" nature of the system appears in many ways; users and administrators
|
|
effectively need to understand and manage both systems.
|
|
|
|
An example is when one needs to mirror content. While libostree does support
|
|
mirroring, and projects like Pulp make use of it, support is not as widespread
|
|
as mirroring for RPM. And mirroring is absolutely critical for many
|
|
organizations that don't want to depend on Internet availability.
|
|
|
|
Related to this is the mapping of libostree "branches" and rpm-md repos. In
|
|
Fedora we offer multiple branches for Atomic Host, such as
|
|
`fedora/27/x86_64/atomic-host` as well as
|
|
`fedora/27/x86_64/testing/atomic-host`, where the latter is equivalent to `yum
|
|
--enablerepo=updates-testing update`. In many ways, I believe the way we're
|
|
exposing as OSTree branches is actually nicer - it's very clear when you're on
|
|
the testing branch.
|
|
|
|
However, it's also very *different* from the yum/dnf model. Once package
|
|
layering is involved (and for a lot of small scale use cases it will be,
|
|
particularly for desktop systems), the libostree side is something that many
|
|
users and administrators have to learn *in addition* to their previous "mental model"
|
|
of how the libdnf/yum/rpm side works with `/etc/yum.repos.d` etc.
|
|
|
|
Finally, for network efficiency; on the wire, libostree has two formats, and the
|
|
intention is that most updates hit the network-efficient static delta path, but
|
|
there are various cases where this doesn't happen, such as if one is skipping a
|
|
few updates, or today when rebasing between branches. In practice, as soon as
|
|
one involves libdnf, the repodata is already large enough that it's not worth
|
|
trying to optimize fetching content over simply redownloading changed RPMs.
|
|
|
|
(Aside: people doing custom systems tend to like the network efficiency of "pure
|
|
ostree" where one doesn't pay the "repodata cost" and we will continue to
|
|
support that.)
|
|
|
|
How?
|
|
----
|
|
|
|
We've already stated that a primary design goal is to preserve the "image"
|
|
functionality by default. Further, let's assume that we have an OSTree commit,
|
|
and we want to point it at a set of RPMs to use as the jigdo source. The source
|
|
OSTree commit can have modified, added to, or removed data from the RPM set, and
|
|
we will support that. Examples of additional data are the initramfs and RPM
|
|
database.
|
|
|
|
We're hence treating the RPM set as just data blobs; again, no dependency
|
|
resolution, `%post` scripts or the like will be executed on the client. Or again
|
|
to state this more strongly, installation will still result in an OSTree commit
|
|
with checksum that is bit-for-bit identical.
|
|
|
|
A simple approach is to scan over the set of files in the RPMs, then the set
|
|
of files in the OSTree commit, and add RPMs which contain files in the OSTree
|
|
commit to our "jigdo set".
|
|
|
|
However, a major complication is SELinux labeling. It turns out that in a lot of
|
|
cases, doing SELinux labeling is expensive; there are thousands of regular
|
|
expressions involved. However, RPM packages themselves don't contain labels;
|
|
instead the labeling is stored in the `selinux-policy-targeted` package, and
|
|
further complicating things is that there are also other packages that add
|
|
labeling configuration such as `container-selinux`. In other words there's a
|
|
circular dependency: packages have labels, but labels are contained in packages.
|
|
We go to great lengths to handle this in rpm-ostree for package layering, and we
|
|
need to do the same for jigdo.
|
|
|
|
We can address this by having our OIRPM contain a mapping of (package, file
|
|
path) to a set of extended attributes (including the key `security.selinux`
|
|
one).
|
|
|
|
At this point, if we add in the new objects such as the metadata objects from
|
|
the OSTree commit and all new content objects that aren't part of a package,
|
|
we'll have our OIRPM. (There is
|
|
some [further complexity](https://pagure.io/fedora-atomic/issue/94) around
|
|
handling the initramfs and SELinux labeling that we'll omit for now).
|