Introducing rpm-ostree rojig
--------

In the rpm-ostree project, we're blending an image system (libostree)
with a package system (libdnf). The goal is to gain the
advantages of both. However, the dual nature also brings overhead;
this proposal aims to reduce some of that by adding a new "rojig"
model to rpm-ostree that makes more operations use the libdnf side.

To do this, we're reviving an old idea: the [Jigdo](http://atterer.org/jigdo/)
approach to reassembling large "images" by downloading component packages. (We're
not using the code, just the idea.)

In this approach, we're still maintaining the "image" model of libostree. When
one deploys an OSTree commit, it will reliably be bit-for-bit identical. It will
have a checksum and a version number. There will be *no* dependency resolution
on the client by default, etc.

The change is that we always use libdnf to download RPM packages as they exist
today, storing any additional data inside a new "ostree-image" RPM. In this
proposal, rather than using ostree branches, the system tracks an "ostree-image"
RPM that behaves a bit like a "metapackage".

Why?
----
The "dual" nature of the system appears in many ways; users and administrators
effectively need to understand and manage both systems.

An example is when one needs to mirror content. While libostree does support
mirroring, and projects like Pulp make use of it, support is not as widespread
as mirroring for RPM. And mirroring is absolutely critical for many
organizations that don't want to depend on Internet availability.

Related to this is the mapping of libostree "branches" and rpm-md repos. In
Fedora we offer multiple branches for Atomic Host, such as
`fedora/27/x86_64/atomic-host` as well as
`fedora/27/x86_64/testing/atomic-host`, where the latter is equivalent to
`yum --enablerepo=updates-testing update`. In many ways, I believe the way we're
exposing these as OSTree branches is actually nicer - it's very clear when
you're on the testing branch.

However, it's also very *different* from the yum/dnf model. Once package
layering is involved (and for a lot of small-scale use cases it will be,
particularly for desktop systems), the libostree side is something that many
users and administrators have to learn *in addition* to their previous "mental model"
of how the libdnf/yum/rpm side works with `/etc/yum.repos.d` etc.

Finally, there is network efficiency. On the wire, libostree has two formats,
and the intention is that most updates hit the network-efficient static delta
path. But there are various cases where this doesn't happen, such as if one is
skipping a few updates, or today when rebasing between branches. In practice, as
soon as one involves libdnf, the repodata is already large enough that it's not
worth trying to optimize fetching content over simply redownloading changed RPMs.

(Aside: people doing custom systems tend to like the network efficiency of "pure
ostree" where one doesn't pay the "repodata cost" and we will continue to
support that.)

How?
----
We've already stated that a primary design goal is to preserve the "image"
functionality by default. Further, let's assume that we have an OSTree commit,
and we want to point it at a set of RPMs to use as the rojig source. The source
OSTree commit can have modified, added to, or removed data from the RPM set, and
we will support that. Examples of additional data are the initramfs and RPM
database.

We're hence treating the RPM set as just data blobs; again, no dependency
resolution, `%post` scripts or the like will be executed on the client. Or, to
state this more strongly, installation will still result in an OSTree commit
with a checksum that is bit-for-bit identical.

A simple approach is to scan over the set of files in the RPMs, then the set
of files in the OSTree commit, and add RPMs which contain files in the OSTree
commit to our "rojig set".

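The two-pass scan above can be sketched roughly as follows. This is only an
illustration, not rpm-ostree's actual code: the function name, the dict-based
inputs and the choice to match files by a SHA-256 of their content (rather than
by path) are all assumptions made for the example.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Hypothetical helper: identify a file by a digest of its content."""
    return hashlib.sha256(data).hexdigest()


def compute_rojig_set(rpm_files, commit_files):
    """Return (rojig_set, unpackaged_paths).

    rpm_files:    {package_name: {path: content_bytes}}  (assumed input shape)
    commit_files: {path: content_bytes}                  (assumed input shape)
    """
    # Pass 1: scan the RPMs and index every file they ship by content digest.
    digest_to_packages = {}
    for package, files in rpm_files.items():
        for data in files.values():
            digest_to_packages.setdefault(content_digest(data), set()).add(package)

    # Pass 2: scan the commit; any RPM providing one of its files joins the set.
    rojig_set = set()
    unpackaged = []
    for path, data in sorted(commit_files.items()):
        providers = digest_to_packages.get(content_digest(data))
        if providers is None:
            # Not found in any RPM (e.g. a generated initramfs).
            unpackaged.append(path)
        elif not (providers & rojig_set):
            # Several packages may ship identical content; pick one deterministically.
            rojig_set.add(min(providers))
    return rojig_set, unpackaged


# Toy data, invented for the example.
rpm_files = {
    "bash": {"/usr/bin/bash": b"bash-binary"},
    "coreutils": {"/usr/bin/ls": b"ls-binary"},
    "emacs": {"/usr/bin/emacs": b"emacs-binary"},
}
commit_files = {
    "/usr/bin/bash": b"bash-binary",
    "/usr/bin/ls": b"ls-binary",
    "/usr/lib/modules/initramfs.img": b"generated-initramfs",
}
rojig_set, unpackaged = compute_rojig_set(rpm_files, commit_files)
print(sorted(rojig_set), unpackaged)
```

In this toy run `emacs` is left out because none of its files appear in the
commit, and the initramfs is reported as content that no RPM provides.
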
However, a major complication is SELinux labeling. It turns out that in a lot of
cases, doing SELinux labeling is expensive; there are thousands of regular
expressions involved. Furthermore, RPM packages themselves don't contain labels;
instead the labeling is stored in the `selinux-policy-targeted` package, and
to complicate things further, other packages such as `container-selinux` add
labeling configuration too. In other words there's a
circular dependency: packages have labels, but labels are contained in packages.
We go to great lengths to handle this in rpm-ostree for package layering, and we
need to do the same for rojig.

We can address this by having our OIRPM (the "ostree-image" RPM) contain a
mapping of (package, file path) to a set of extended attributes (including the
key one, `security.selinux`).

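To make the shape of that mapping concrete, here is a minimal sketch. The
in-memory layout, the helper name and the sample labels are assumptions for
illustration; the proposal does not specify how the mapping is serialized
inside the RPM.

```python
from typing import Dict, Optional, Tuple

# (package name, file path) -> {xattr name: raw xattr value}
XattrMap = Dict[Tuple[str, str], Dict[str, bytes]]

# Sample entries, invented for the example.
xattr_map: XattrMap = {
    ("bash", "/usr/bin/bash"): {
        "security.selinux": b"system_u:object_r:shell_exec_t:s0",
    },
    ("coreutils", "/usr/bin/ls"): {
        "security.selinux": b"system_u:object_r:bin_t:s0",
    },
}


def selinux_label(mapping: XattrMap, package: str, path: str) -> Optional[str]:
    """Hypothetical helper: look up the precomputed label for one packaged file."""
    value = mapping.get((package, path), {}).get("security.selinux")
    return value.decode("ascii") if value is not None else None


print(selinux_label(xattr_map, "bash", "/usr/bin/bash"))
```

The point of such a table is that the client only performs lookups; it never
has to evaluate the policy's regular expressions itself.
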
At this point, if we add in the new objects such as the metadata objects from
the OSTree commit and all new content objects that aren't part of a package,
we'll have our rojigRPM. (There is some [further complexity](https://pagure.io/fedora-atomic/issue/94) around
handling the initramfs and SELinux labeling that we'll omit for now.)

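As a rough sketch of that last step, the selection of objects to ship could look
like the following. The function name, the input shapes and the shortened fake
checksums are assumptions for the example; only the object kinds (`commit`,
`dirtree`, `dirmeta`, `file`) are OSTree's own.

```python
# OSTree metadata object kinds; "file" objects carry content.
METADATA_KINDS = {"commit", "dirtree", "dirmeta"}


def select_rojig_rpm_objects(commit_objects, packaged_checksums):
    """Pick the objects that must be shipped alongside the RPM set.

    commit_objects:     {checksum: kind} for every object in the commit (assumed shape)
    packaged_checksums: checksums of content objects already provided by RPMs
    """
    selected = []
    for checksum, kind in commit_objects.items():
        if kind in METADATA_KINDS:
            # Metadata objects never come from a package, so always include them.
            selected.append((kind, checksum))
        elif kind == "file" and checksum not in packaged_checksums:
            # New content that no package provides.
            selected.append((kind, checksum))
    return sorted(selected)


# Toy data with shortened, made-up checksums.
commit_objects = {
    "c0ffee": "commit",
    "7ee7ee": "dirtree",
    "d1d1d1": "dirmeta",
    "ba5e01": "file",  # provided by an RPM, so not shipped again
    "f00d01": "file",  # e.g. the initramfs, not in any RPM
}
payload = select_rojig_rpm_objects(commit_objects, packaged_checksums={"ba5e01"})
print(payload)
```

Together with the extended attribute mapping, a list like `payload` is what the
text above describes as making up the rojigRPM.
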