# Managing content in OSTree repositories

Once you have a build system going, if you actually want client systems to retrieve the content, you will quickly feel a need for "repository management".

OSTree itself does not currently come with tools to do this. One reason is that how content is delivered and managed involves concerns very specific to each organization. For example, some operating system content vendors may want integration with a specific errata notification system.

In this section, we will describe some high level ideas and methods for managing content in OSTree repositories, mostly independent of any particular model or tool. That said, a goal is to include at least some sample scripts and workflows upstream in a potential new "contrib" git repository.

One example of software which can assist in managing OSTree repositories today is the [Pulp Project](http://www.pulpproject.org/), which has a [Pulp OSTree plugin](https://pulp-ostree.readthedocs.org/en/latest/).

## Separate development vs release repositories

By default, OSTree accumulates server side history. This is actually optional in that your build system can (using the API) write a commit with no parent. But first, we'll investigate the ramifications of server side history.

Many content vendors will want to separate their internal development from what is made public to the world. Therefore, you will want (at least) two OSTree repositories; we'll call them "dev" and "prod".

To phrase this another way, let's say you have a continuous delivery system which is building from git and committing into your "dev" OSTree repository. This might happen tens to hundreds of times per day. That's a substantial amount of history over time, and it's unlikely most of your content consumers (i.e. not developers/testers) will be interested in all of it.

The original vision of OSTree was to fulfill this "dev" role, and in particular the "archive-z2" format was designed for it.
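The two-repository layout described above can be sketched as follows. This is only an illustration: the directory names `repo-dev` and `repo-prod`, the `init_repos` helper, and the overridable `OSTREE` variable are assumptions made for the example, not something OSTree prescribes.

```shell
#!/bin/sh
# Sketch only: create the "dev" and "prod" repositories in archive-z2 mode.
# OSTREE can be overridden (for example OSTREE=echo) to preview the commands.
OSTREE="${OSTREE:-ostree}"

# init_repos REPO...
# Hypothetical helper: initialize each named repository in archive-z2 mode.
init_repos() {
    for repo in "$@"; do
        $OSTREE --repo="$repo" init --mode=archive-z2 || return 1
    done
}
```

A build host might call `init_repos repo-dev repo-prod` once during setup. Keeping the two repositories in separate directories makes it straightforward to export only the "prod" one over HTTP.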

Then, what you'll want to do is promote content from "dev" to "prod". We'll discuss this later, but first, let's talk about promotion *inside* our "dev" repository.

## Promoting content along OSTree branches - "buildmaster", "smoketested"

Besides multiple repositories, OSTree also supports multiple branches inside one repository, equivalent to git's branches. We saw in an earlier section an example branch name like `exampleos/x86_64/standard`. Choosing the branch name for your "prod" repository is absolutely critical, as client systems will reference it. It becomes an important part of your face to the world, in the same way the "master" branch in a git repository is.

Inside your internal "dev" repository, however, it can be very useful to use OSTree's branching concepts to represent different stages in a software delivery pipeline.

Deriving from `exampleos/x86_64/standard`, let's say our "dev" repository contains `exampleos/x86_64/buildmaster/standard`. We choose the term "buildmaster" to represent something that came straight from git master. It may not be tested very much.

Our next step should be to hook up a testing system (Jenkins, Buildbot, etc.) to this. When a build (commit) passes some tests, we want to "promote" that commit. Let's create a new branch called `smoketested` to say that some basic sanity checks pass on the complete system. This might be where human testers get involved, for example.

The build system can "promote" the `buildmaster` commit that passed testing like this:

```
ostree commit -b exampleos/x86_64/smoketested/standard -s 'Passed tests' --tree=ref=aec070645fe53...
```

Here we're generating a new commit object (perhaps including links to build logs, etc. in the commit log), but we're reusing the *content* from the `buildmaster` commit `aec070645fe53` that passed the smoketests.

We can easily generalize this model to have an arbitrary number of stages like `exampleos/x86_64/stage-1-pass/standard`, `exampleos/x86_64/stage-2-pass/standard`, etc.
depending on business requirements and logic. In this suggested model, the "stages" are increasingly expensive. The logic is that we don't want to spend substantial time on e.g. network performance tests if something basic like a systemd unit file fails on bootup.

## Promoting content between OSTree repositories

Now we have our internal continuous delivery stream flowing; it's being tested and works. We want to periodically take the latest commit on `exampleos/x86_64/stage-3-pass/standard` and expose it in our "prod" repository as `exampleos/x86_64/standard`, with a much smaller history.

We'll have other business requirements such as writing release notes (and potentially putting them in the OSTree commit message), etc.

In [Build Systems](buildsystem-and-repos.md) we saw how the `pull-local` command can be used to migrate content from the "build" repository (in `bare-user` mode) into an `archive-z2` repository for serving to client systems.

Following this section, we now have three repositories; let's call them `repo-build`, `repo-dev`, and `repo-prod`. We've been pulling content from `repo-build` into `repo-dev` (which involves gzip compression among other things, since it is a format change).

When using `pull-local` to migrate content between two `archive-z2` repositories, the binary content is taken unmodified. Let's go ahead and generate a new commit in our prod repository:

```
checksum=$(ostree --repo=repo-dev rev-parse exampleos/x86_64/stage-3-pass/standard)
ostree --repo=repo-prod pull-local repo-dev ${checksum}
ostree --repo=repo-prod commit -b exampleos/x86_64/standard \
       -s 'Release 1.2.3' --add-metadata-string=ostree.version=1.2.3 \
       --tree=ref=${checksum}
```

There are a few things going on here. First, we found the latest commit checksum for the "stage-3" dev branch, and told `pull-local` to copy it, without using the branch name. We do this because we don't want to expose the `exampleos/x86_64/stage-3-pass/standard` branch name in our "prod" repository.
Next, we generate a new commit in prod that references the exact binary content in dev. If the "dev" and "prod" repositories are on the same Unix filesystem, OSTree (like git) will use hard links to avoid copying any content at all, making the process very fast.

Another interesting thing to notice here is that we're adding an `ostree.version` metadata string to the commit. This is an optional piece of metadata, but we are encouraging its use in the OSTree ecosystem of tools. Commands like `ostree admin status` show it by default.

## Derived data - static deltas and the summary file

As discussed in [Formats](formats.md), the `archive-z2` repository we use for "prod" requires one HTTP request per changed object by default. If we're only performing a release e.g. once a week, it's appropriate to use "static deltas" to speed up client updates.

So once we've used the above command to pull content from `repo-dev` into `repo-prod`, let's generate a delta against the previous commit:

```
ostree --repo=repo-prod static-delta generate exampleos/x86_64/standard
```

We may also want to support client systems upgrading from *two* commits previous.

```
ostree --repo=repo-prod static-delta generate --from=exampleos/x86_64/standard^^ --to=exampleos/x86_64/standard
```

Generating a full permutation of deltas across all prior versions can get expensive, and there is some support in the OSTree core for static deltas which "recurse" to a parent. This can help create a model where clients download a chain of deltas. However, support for this is not yet fully implemented.

Regardless of whether or not you choose to generate static deltas, you should update the summary file:

```
ostree --repo=repo-prod summary -u
```

(Remember, the `summary` command cannot be run concurrently, so this should be triggered serially by other jobs.)

There is some more information on the design of the summary file in [Repo](repo.md).
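Because the summary update must not run concurrently, one possible way to wrap the derived-data steps is sketched below. The helper names, the overridable `OSTREE` variable, and the lock directory placed next to the repository are all assumptions for illustration. The sketch also assumes the branch already has a parent commit to generate a delta from.

```shell
#!/bin/sh
# Sketch only: refresh derived data after a release, one caller at a time.
# OSTREE can be overridden (for example OSTREE=echo) to preview the commands.
OSTREE="${OSTREE:-ostree}"

# update_summary_serialized REPO
# Hypothetical helper: run "summary -u" while holding a simple lock.
update_summary_serialized() {
    repo="$1"
    lock="${repo}.summary-lock"   # hypothetical lock directory beside the repo
    # mkdir is atomic, so only one caller can create the directory at a time.
    # A job killed while holding the lock leaves it behind; remove it by hand.
    until mkdir "$lock" 2>/dev/null; do
        sleep 1
    done
    $OSTREE --repo="$repo" summary -u
    status=$?
    rmdir "$lock"
    return "$status"
}

# refresh_derived_data REPO BRANCH
# Hypothetical helper: delta against the previous commit, then the summary.
refresh_derived_data() {
    repo="$1"
    branch="$2"
    $OSTREE --repo="$repo" static-delta generate "$branch" || return 1
    update_summary_serialized "$repo"
}
```

A release job could then end with `refresh_derived_data repo-prod exampleos/x86_64/standard`. If your CI system already guarantees that only one release job runs at a time, the lock is redundant and the two commands can be run directly.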

## Pruning our build and dev repositories

First, the OSTree author believes you should *not* use OSTree as a "primary content store". The binaries in an OSTree repository should be derived from a git repository. Your build system should record proper metadata such as the configuration options used to generate the build, and you should be able to rebuild it if necessary. Art assets should be stored in a system that's designed for that (e.g. [Git LFS](https://git-lfs.github.com/)).

Another way to say this is that five years down the line, we are unlikely to care about retaining the exact binaries from an OS build on Wednesday afternoon three years ago.

We want to save space and prune our "dev" repository.

```
ostree --repo=repo-dev prune --refs-only --keep-younger-than="6 months ago"
```

That will truncate the history older than 6 months. Deleted commits will have "tombstone markers" added so that you know they were explicitly deleted, but all content in them (that is not referenced by a still retained commit) will be garbage collected.
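The command above prunes only `repo-dev`, while the section title also mentions the build repository. A minimal sketch of applying the same prune to more than one repository follows. The `prune_repo` helper and the overridable `OSTREE` variable are assumptions for illustration, and any retention window other than the six months shown above is a placeholder to adapt.

```shell
#!/bin/sh
# Sketch only: prune a repository with a caller-chosen retention window.
# OSTREE can be overridden (for example OSTREE=echo) to preview the command.
OSTREE="${OSTREE:-ostree}"

# prune_repo REPO CUTOFF
# Hypothetical helper: drop history older than CUTOFF, e.g. "6 months ago".
prune_repo() {
    repo="$1"
    cutoff="$2"
    $OSTREE --repo="$repo" prune --refs-only --keep-younger-than="$cutoff"
}
```

For example, a nightly job might run `prune_repo repo-dev "6 months ago"` and, with a shorter window chosen purely for illustration, `prune_repo repo-build "1 week ago"`.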