After base62+golomb decoding, most deltas are under 65536 (in Provides
versions, average delta should be around 1536). So the whole version
can be stored using short deltas, effectively halving memory footprint.
However, this seems to be somewhat slower: per-delta copying and
decode_golomb must be invoked to recover hash values. On the other
hand, this allows increasing the cache size (128 -> 192). But note that,
with larger cache sizes, LRU linear search will take longer. So this is
a compromise - and apparently a favourable one.
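
The trade-off can be illustrated with a minimal sketch. It assumes a sorted
array of 32-bit values whose consecutive differences all fit in 16 bits; the
names pack_deltas and unpack_values are hypothetical and are not taken from
set.c. Storing 16-bit deltas halves the memory, but values have to be
recovered with a running sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: store n sorted values as a base value plus
 * n-1 16-bit deltas.  Returns 0 on success, -1 if a delta is too large.
 */
int pack_deltas(const uint32_t *v, size_t n, uint32_t *base, uint16_t *d)
{
    assert(n > 0);
    *base = v[0];
    for (size_t i = 1; i < n; i++) {
        assert(v[i] >= v[i - 1]);           /* input must be sorted */
        uint32_t delta = v[i] - v[i - 1];
        if (delta > UINT16_MAX)
            return -1;                      /* does not fit in a short delta */
        d[i - 1] = (uint16_t)delta;
    }
    return 0;
}

/* Recover the n original values; this is the extra per-delta work. */
void unpack_values(uint32_t base, const uint16_t *d, size_t n, uint32_t *v)
{
    assert(n > 0);
    v[0] = base;
    for (size_t i = 1; i < n; i++)
        v[i] = v[i - 1] + d[i - 1];
}
```

A caller would fall back to full 32-bit storage whenever pack_deltas
returns -1.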
Currently, set.c uses an array of chars to store a bit vector. Each char
stores one bit: 0 or 1.
Let's use a packed bitmap instead. It creates room for optimizations.
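
For illustration, a packed bitmap can look roughly like this. This is only
a sketch; the names bitmap_new, bitmap_set and bitmap_get are hypothetical
and do not come from set.c.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Number of bits stored in one word of the bitmap. */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Allocate a zeroed bitmap with room for nbits bits (NULL on failure). */
unsigned long *bitmap_new(size_t nbits)
{
    return calloc((nbits + WORD_BITS - 1) / WORD_BITS, sizeof(unsigned long));
}

/* Set bit i to 1. */
void bitmap_set(unsigned long *bm, size_t i)
{
    assert(bm != NULL);
    bm[i / WORD_BITS] |= 1UL << (i % WORD_BITS);
}

/* Return bit i as 0 or 1. */
int bitmap_get(const unsigned long *bm, size_t i)
{
    assert(bm != NULL);
    return (int)((bm[i / WORD_BITS] >> (i % WORD_BITS)) & 1UL);
}
```

Compared with one char per bit, this uses 8 times less memory and lets
whole words be tested or combined at once.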
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
In build.c:checkSpec(), build dependencies are checked by creating
a transaction, adding source header to the transaction and then running
rpmdepCheck(). Source headers have only BuildRequires and BuildConflicts
types of dependencies (no regular Requires and Conflicts). Also, source
packages have no Provides, but they have NAMES. When a self-conflicting
package is installed, its self conflict will be triggered by the source
package name.
To fix the problem, note that binary packages explicitly provide
their N=EVR among Provides, while source packages provide nothing,
not even their name. So the solution is as simple as not checking
the dependencies against package names.
Update: also, do not check installed Requires against erasedPackages names.
To check the dependencies of an installed package, a "transaction" is
created, and the package is added to the transaction. The transaction
is then checked with rpmdepCheck(). However, since the installed
package has not been marked for removal, a conflict can be triggered
between the installed copy of the package and its copy in the transaction.
The right thing to do is to mark the package for removal, re-add it to
the transaction, and then check the dependencies.
A header instance is the header's number in the /var/lib/rpm/Packages database.
When a header comes from the database, it is sometimes useful to know
its instance (I need this to adjust verify.c:verifyDependencies() for
self-conflicting packages). Setting instance numbers, on the other hand,
should happen only within librpmdb, which is why headerSetInstance()
comes with hidden visibility.
In the last two weeks or so alone, this issue has been raised twice.
By specifying "Provides: foo, Conflicts: foo", people expect that
other packages which provide "foo" will not be installed along with
the package. What people don't anticipate is that the package will
conflict with itself, and will not be installed at all. This is where
apt and rpm differ. In apt, "conflicts may never self match". In rpm,
Requires and Conflicts are handled in exactly the same way, except that
Requires should match, and Conflicts should not match (I call this
a symmetry). Both can match against the package they come from.
So, to permit self-conflicting packages, I have to break the symmetry
and pass an additional argument which indicates the type of dependency
being processed (either Requires or Conflicts). The code is then
adjusted to discard self-matching Conflicts.
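
The idea of breaking the symmetry can be sketched as follows. This is a
simplified model, not the actual rpm code: struct pkg, enum dep_type and
dep_matches are hypothetical names, and each package is assumed to provide
a single name.

```c
#include <assert.h>
#include <string.h>

enum dep_type { DEP_REQUIRES, DEP_CONFLICTS };

/* Hypothetical, simplified package: a name plus a single provided name. */
struct pkg {
    const char *name;
    const char *provides;
};

/*
 * Returns 1 if `candidate` matches the dependency `dep_name` that comes
 * from package `owner`.  Requires may match the owner itself, but a
 * Conflicts that matches its own package is discarded.
 */
int dep_matches(enum dep_type type, const struct pkg *owner,
                const char *dep_name, const struct pkg *candidate)
{
    assert(owner && dep_name && candidate);
    if (strcmp(candidate->provides, dep_name) != 0)
        return 0;
    if (type == DEP_CONFLICTS && candidate == owner)
        return 0;   /* self-matching Conflicts: the symmetry is broken here */
    return 1;
}
```

With "Provides: foo, Conflicts: foo", the package no longer conflicts with
itself, but still conflicts with any other provider of "foo".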
Obsoletes should be handled specially, too. In tsSatisfiesDepend(),
I attempt to handle the Obsoletes case as well. It is rather
unfortunate that, in rpmdepCheck(), Obsoletes are simply not checked
just yet.
Let's try to use a recent libbeecrypt in our World Best RPM.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
My previous assumption that strdup() was not needed for dbProvCache keys
was wrong. The keys point to header memory, which is right. However,
those are not only ts->addedPackages and ts->erasedPackages headers.
In checkDependent(), headers come from rpmdb and are disposed of immediately
after the check, so cache keys pointing into them would dangle.
For erasedPackages, the dirIndex and provIndex are unused, and
thus should not be created at all. There is arguably a better
option - to provide explicit alMakeIndex and alFreeIndex API.
Based on rpm.org 2e76d0e6 by Panu Matilainen:
> Add in-memory hash for caching rpmdb dependency lookups
> - worst case behavior for uncached dependency lookups can be disastrous,
> eg > 35s vs < 1s on my laptop for trying to remove /bin/sh provider
> - we only bother caching rpmdb lookups, the other cases plenty fast already
> - using in-memory cache avoids nasty in vs out of chroot issues with
> temporary db files, which otherwise were about as fast
However, we do not use full-blown printDepend-based caching (i.e.
we no longer cache depends with versions). This is because, well,
dependency versions are likely to differ. This is especially true
if we consider upcoming set-versions for soname symbols - hashing
symbol sets here will be just a waste of time and memory. And so
now we cache satisfied/unsatisfied depends by name only. Thus, a
"yes" hit can be used immediately only for unversioned dependencies.
Top 10 dependencies which will be handled by the cache:
$ rpm -qaR |grep -v rpmlib |grep -v = |sort |uniq -c |sort -n |tail
245 /usr/lib/perl5/vendor_perl
311 libm.so.6(GLIBC_2.2.5)(64bit)
386 libpthread.so.0(GLIBC_2.2.5)(64bit)
454 /lib64/ld-linux-x86-64.so.2
548 libc.so.6(GLIBC_2.3)(64bit)
587 /bin/sh
828 libc.so.6(GLIBC_2.3.4)(64bit)
906 libc.so.6(GLIBC_2.4)(64bit)
1128 rtld(GNU_HASH)
1140 libc.so.6(GLIBC_2.2.5)(64bit)
$
Top 10 dependencies which will not be handled by the cache:
$ rpm -qaR |grep -v rpmlib |grep -e = |sort |uniq -c |sort -n |tail
13 python-base = 2.6.5-alt2
14 mono(mscorlib) = 1.0
15 qt4-common = 4.6.2-alt6
16 mono(mscorlib) = 2.0
18 mktemp >= 1:1.3.1
20 koffice-common = 4:2.2.0-alt2
20 perl-base >= 1:5.8.0
23 alternatives >= 0:0.4
49 libqt4-core >= 4.6.2
54 perl-base >= 1:5.6.0
$
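
The name-only caching policy described above can be sketched like this.
It is a toy model under stated assumptions: struct dep_cache and the
cache_* functions are hypothetical names, the cache is a small array with
linear search rather than a real hash, and keys are borrowed pointers.
That a "no" hit is usable even for versioned dependencies is my reading of
the policy (if nothing provides the name, no version can match), not
something taken from the code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum dep_state { DEP_UNKNOWN = 0, DEP_UNSATISFIED, DEP_SATISFIED };

#define CACHE_MAX 64

/* Toy cache keyed by dependency name only (no version, no flags). */
struct dep_cache {
    size_t n;
    struct { const char *name; enum dep_state state; } e[CACHE_MAX];
};

/* Remember the lookup result for a name; silently drops when full. */
void cache_put(struct dep_cache *c, const char *name, enum dep_state st)
{
    assert(st != DEP_UNKNOWN);
    if (c->n < CACHE_MAX) {
        c->e[c->n].name = name;     /* borrowed pointer, not strdup'ed */
        c->e[c->n].state = st;
        c->n++;
    }
}

/* Raw lookup by name; DEP_UNKNOWN means "not cached". */
enum dep_state cache_get(const struct dep_cache *c, const char *name)
{
    for (size_t i = 0; i < c->n; i++)
        if (strcmp(c->e[i].name, name) == 0)
            return c->e[i].state;
    return DEP_UNKNOWN;
}

/*
 * Decide whether a cached answer can be used as is.  Returns DEP_UNKNOWN
 * when the caller must still go to rpmdb: either the name is not cached,
 * or it is a "yes" hit for a versioned dependency.
 */
enum dep_state cache_decide(const struct dep_cache *c, const char *name,
                            int has_version)
{
    enum dep_state st = cache_get(c, name);
    if (st == DEP_SATISFIED && has_version)
        return DEP_UNKNOWN;
    return st;
}
```

In this model "/bin/sh" is answered from the cache right away, while
"perl-base >= 1:5.6.0" still needs a version comparison after a "yes" hit.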
Here's a simple test to see if the cache works (using Panu's example -
trying to remove /bin/sh).
(before this change)
$ time LD_LIBRARY_PATH=$PWD/1 rpm -e --test sh 2>&1 |tail
/bin/sh is needed by groff-base-1.20.1-alt0.20091013
/bin/sh is needed by groff-base-1.20.1-alt0.20091013
/bin/sh is needed by groff-base-1.20.1-alt0.20091013
/bin/sh is needed by libgnome-sharp-2.24.1-alt1
/bin/sh is needed by kernel-image-std-def-2.6.32-alt15
/bin/sh is needed by kernel-image-std-def-2.6.32-alt15
/bin/sh is needed by kde4libs-4.4.5-alt1
/bin/sh is needed by kde4base-runtime-core-4.4.5-alt1
/bin/sh is needed by kde4base-konqueror-4.4.5-alt1
/usr/lib/bash is needed by bash-builtin-lockf-0.3.1-alt1
rpm -e --test sh 2>&1 6.18s user 3.44s system 94% cpu 10.182 total
$
(after this change)
$ time LD_LIBRARY_PATH=$PWD/2 rpm -e --test sh 2>&1 |tail
/bin/sh is needed by groff-base-1.20.1-alt0.20091013
/bin/sh is needed by groff-base-1.20.1-alt0.20091013
/bin/sh is needed by groff-base-1.20.1-alt0.20091013
/bin/sh is needed by libgnome-sharp-2.24.1-alt1
/bin/sh is needed by kernel-image-std-def-2.6.32-alt15
/bin/sh is needed by kernel-image-std-def-2.6.32-alt15
/bin/sh is needed by kde4libs-4.4.5-alt1
/bin/sh is needed by kde4base-runtime-core-4.4.5-alt1
/bin/sh is needed by kde4base-konqueror-4.4.5-alt1
/usr/lib/bash is needed by bash-builtin-lockf-0.3.1-alt1
rpm -e --test sh 2>&1 0.11s user 0.09s system 91% cpu 0.218 total
$
Before this change, only the rpm headers for addedPackages were loaded
during a transaction; for removedPackages, we only stored the list
of rpmdb header numbers (numeric instances). With this change,
the headers of removed packages are loaded, too (the list is now called
erasedPackages). The reason is that, when a header is loaded,
it is possible to point directly into e.g. header strings
without using strdup. This might come in handy as we try to reimplement
the depends cache.
Based on rpm.org 6bc5d870 by Panu Matilainen:
> Rip out dependency caching
> - it doesn't speed up things that much, is broken in some chroot
> scenarios and is ugly ugly hardwired BDB hackery where it doesn't belong
Based on rpm.org changes by Panu Matilainen:
fb2a6cb Make rpmdb index list hard-wired
e23a2bf Remove unused require- and provideversion indexes
2a52cc8 Remove unused _DBI defines
Some code (e.g. apt/genpkglist) explicitly relies on the fact that
header file list is represented with baseNames+dirNames+dirIndexes
arrays. Thus, generating legacy headers might have issues, and should
be disabled.
- continue processing as long as progress can be made instead of artificial
hardcoded magic "try ten times"
[rpm.org f39d2432f74bdc328ceafa8abc6cac517e02c73b]
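
The difference between a fixed number of retries and "as long as progress
can be made" can be shown with a generic sketch. It is not the rpm.org
code: process_all and the needs/done arrays are hypothetical, and each item
is assumed to depend on at most one other item.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Item i can be processed once item needs[i] is done (needs[i] < 0 means
 * no prerequisite).  Keep making passes until a full pass processes
 * nothing.  Returns the number of items processed.
 */
size_t process_all(const int *needs, int *done, size_t n)
{
    size_t total = 0;
    int progress;

    do {
        progress = 0;
        for (size_t i = 0; i < n; i++) {
            if (done[i])
                continue;
            if (needs[i] < 0 || done[needs[i]]) {
                done[i] = 1;
                total++;
                progress = 1;
            }
        }
    } while (progress);     /* stop only when a pass changes nothing */

    return total;
}
```

A chain of 12 items in which each item needs the next one takes 12 passes,
so a hardcoded limit of ten would leave two items unprocessed; a cycle
still terminates, because the first pass makes no progress.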
The code in question uses an improvised "strntoul" function (hidden
behind the GET_NUM_FIELD macro) which returns "int".
int cpioHeaderRead(FSM_t fsm, struct stat * st)
...
GET_NUM_FIELD(hdr.filesize, st->st_size);
When a file size undergoes an "int bottleneck", it cannot be safely
converted back to an unsigned 64-bit integer. By the C rules, if the
size is in the range 2G..4G-1, int becomes negative (or this may be
undefined behaviour already, I'm not a language lawyer), and conversion
to unsigned 64-bit is performed as if by adding 2^64 to the negative
value.
So you get a huge 64-bit file size. Funnily enough, if you truncate it
to 32 bits, it's back to normal! That's why things worked with 32-bit
size_t.
static int expandRegular(/*@special@*/ FSM_t fsm)
...
size_t left = st->st_size;
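
The "int bottleneck" can be reproduced in isolation. The sketch below uses
hypothetical helpers (widen_via_int, truncate32, check_round_trip) and
fixed-width types in place of int and off_t. The narrowing conversion is
implementation-defined in C; the check assumes the usual two's-complement
wraparound.

```c
#include <assert.h>
#include <stdint.h>

/* Model a 32-bit size field that passes through a signed 32-bit int. */
uint64_t widen_via_int(uint32_t field)
{
    /* Implementation-defined for values >= 2^31; wraps on common platforms. */
    int32_t narrow = (int32_t)field;
    /* Well defined: a negative value is converted by adding 2^64. */
    return (uint64_t)narrow;
}

/* What a 32-bit size_t would keep of the widened value. */
uint32_t truncate32(uint64_t size)
{
    return (uint32_t)size;
}

/* Self-check for a 3 GiB file (0xC0000000 bytes). */
void check_round_trip(void)
{
    uint64_t huge = widen_via_int(UINT32_C(0xC0000000));
    assert(huge == UINT64_C(0xFFFFFFFFC0000000));       /* huge 64-bit size */
    assert(truncate32(huge) == UINT32_C(0xC0000000));   /* back to normal */
}
```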
- Implemented limited support for large files: a 2GB+ file can now be packaged,
but the total size of uncompressed cpio payload is capped at 4 GB.
- Automatically downgrade LZMA compression levels 7-9 -> 6 on small payloads.
This is probably the last change of this kind. There are many other
instances left, and fixing them all is hopeless. On the other hand,
the way mod 2^32 arithmetic works, although technically not always
well defined, is to our advantage. I suggest that from now on only
user-visible discrepancies be fixed.
The following comparison to st_size looks particularly bad.
The code turns out to be unused!
lib/signature.c:
> verifySizeSignature(const char * datafile, int_32 size, /*@out@*/ char * result)
> [...]
> if (size != st.st_size) {
> sprintf(result, "Header+Archive size mismatch.\n"
Some of the preceding code is probably undefined or unspecified behavior,
but there's no easy way to fix it other than rewriting, which I'm not
going to do. Surprisingly enough, the code just happens to work, due to
a series of mutual cancellations mod 2^32. As they say in Russian,
the war will write off all. Likewise, mod 2^32 arithmetic can write off
a multitude of sins (James 5:20).
> static inline rpmRC checkSize(FD_t fd, int siglen, int pad, int datalen)
[...]
> int delta;
[...]
> delta = (sizeof(struct rpmlead) + siglen + pad + datalen) - st.st_size;
Here, the expression in parentheses yields a different numeric value
depending on whether datalen is signed or unsigned. However, when delta
is finally truncated to 32 bits, the result turns out to be the same.
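
The cancellation can be checked with a small model. It is a sketch under
assumptions: size_t is taken to be 64 bits wide, header_len stands in for
sizeof(struct rpmlead) + siglen + pad, and wrap32, delta_signed and
delta_unsigned are hypothetical names. wrap32 spells out the truncation to
a 32-bit int so that the sketch itself does not rely on
implementation-defined conversions.

```c
#include <assert.h>
#include <stdint.h>

/* Two's-complement truncation of a 64-bit value to int32_t. */
static int32_t wrap32(uint64_t v)
{
    uint32_t lo = (uint32_t)v;
    if (lo <= (uint32_t)INT32_MAX)
        return (int32_t)lo;
    return (int32_t)(lo - UINT32_C(0x80000000)) + INT32_MIN;
}

/* datalen is a signed int: it is sign-extended when added to a size_t. */
int32_t delta_signed(uint64_t header_len, int32_t datalen, int64_t file_size)
{
    assert(file_size >= 0);
    uint64_t sum = header_len + (uint64_t)(int64_t)datalen;
    return wrap32(sum - (uint64_t)file_size);
}

/* datalen is unsigned: it is zero-extended instead. */
int32_t delta_unsigned(uint64_t header_len, uint32_t datalen, int64_t file_size)
{
    assert(file_size >= 0);
    uint64_t sum = header_len + (uint64_t)datalen;
    return wrap32(sum - (uint64_t)file_size);
}
```

The two 64-bit sums differ by exactly 2^32 when datalen has its top bit
set, and that difference disappears in the truncation to 32 bits.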
> switch (delta) {
> case -32: /* XXX rpm-4.0 packages */
> case 32: /* XXX Legacy headers have a HEADER_IMAGE tag added. */
> case 0:
The diff context is just big enough to see what happens.
RPMTAG_FILESIZES are interpreted as signed integers, and st_size which
has type off_t is also signed. So the 32-bit file size from the header
gets sign-extended, after which the equality to st_size does not hold.
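
A minimal model of the failing comparison, with int64_t standing in for
off_t. The function names are hypothetical, and going through uint32_t is
only one possible way to avoid the sign extension, not necessarily what the
actual change does.

```c
#include <assert.h>
#include <stdint.h>

/* Header size treated as signed: sign-extended before the comparison. */
int size_matches_signed(int32_t hdr_size, int64_t st_size)
{
    assert(st_size >= 0);
    return (int64_t)hdr_size == st_size;
}

/* Header size reinterpreted as unsigned 32-bit: zero-extended instead. */
int size_matches_unsigned(int32_t hdr_size, int64_t st_size)
{
    assert(st_size >= 0);
    return (int64_t)(uint32_t)hdr_size == st_size;
}
```

For a 3 GiB file the header value has the bit pattern 0xC0000000, which
reads as -1073741824 when signed; only the unsigned variant compares equal
to st_size.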
This tag represents binary package build characteristic: if two binary
packages have equal RPMTAG_IDENTITY values, it means that these packages
have no significant differences.
One of the applications of RPMTAG_IDENTITY is reproducible build
verification.
Signed-off-by: Vladimir D. Seleznev <vseleznv@altlinux.org>
This tag is needed to track automatically installed packages with
rpmdb. A zero value means that the package was installed manually; other
values mean that the package was installed automatically as a dependency
of some other package.
Signed-off-by: Vladimir D. Seleznev <vseleznv@altlinux.org>
- package.c (readPackageHeaders): Use posix_fadvise(2) to disable readahead.
When scanning a large number of packages (with e.g. rpmquery), readahead
might cause negative effects on the buffer cache.
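
A minimal sketch of such a call follows. The helper name disable_readahead
is hypothetical, and the choice of POSIX_FADV_RANDOM is an assumption (on
Linux this advice disables file readahead); the actual change in package.c
may use different arguments.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Advise the kernel that fd will be read in random order.
 * Returns 0 on success or an error number, as posix_fadvise(2) does.
 */
int disable_readahead(int fd)
{
    assert(fd >= 0);
    /* offset 0 with len 0 covers the whole file */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);
    if (rc != 0)
        fprintf(stderr, "posix_fadvise: %s\n", strerror(rc));
    return rc;
}
```

The advice is only a hint, so a failure can be reported and otherwise
ignored.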