Do not erase permissions from regular files on package removal or
upgrade unless these files are both setXid and executable.
It is legal to have regular system files linked somewhere, e.g. by
chrooted installs, so we must be careful not to break these files.
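A minimal sketch of the intended check (helper name hypothetical,
not the actual rpm code):

    #include <sys/stat.h>

    /* Clear permissions only for regular files that are both
     * setuid/setgid ("setXid") and executable; anything else may be
     * legitimately hard-linked elsewhere (e.g. by chrooted installs)
     * and must keep its permissions. */
    static int must_clear_perms(const struct stat *st)
    {
        return S_ISREG(st->st_mode)
            && (st->st_mode & (S_ISUID | S_ISGID))
            && (st->st_mode & (S_IXUSR | S_IXGRP | S_IXOTH));
    }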
In decode_set_init(), we explicitly prohibit empty sets:
    // no empty sets for now
    if (*str == '\0')
        return -4;
This does not validate the *str characters, since the decoder will
check for errors anyway. It does assume, however, that a non-empty
input will otherwise decode to a non-empty set. The assumption is
wrong: it was actually possible to construct an "empty set" which
triggered an assertion failure.
$ /usr/lib/rpm/setcmp yx00 yx00
setcmp: set.c:705: decode_delta: Assertion `c > 0' failed.
zsh: abort /usr/lib/rpm/setcmp yx00 yx00
$
Here, the "00" part of the set-version yields a sequence of zero bits.
Since trailing zero bits are okay, golomb decoding routine basically
skips the whole sequence and returns 0.
To fix the problem, we have to observe that only up to 5 trailing zero
bits can be required to complete last base62 character, and the leading
"0" sequence occupies 6 or more bits.
Import rpm-4.2-owl-remove-unsafe-perms.diff from Owl, to remove unsafe
file permissions (chmod'ing files to 0) on package removal or upgrade to
prevent continued access to such files via hard-links possibly created
by a user (CVE-2005-4889, CVE-2010-2059).
Below I use 'apt-shell <<<unmet' as a baseline for measurements.
Cache performance with cache_size = 128: hit=39628 miss=22394 (64%)
Cache performance with cache_size = 160: hit=42031 miss=19991 (68%)
(11% fewer cache misses)
Cache performance with cache_size = 160 pivot_size = 1 (plain LRU):
hit=36172 miss=25850 (58%)
Total number of soname set-versions which must be decoded at least once:
miss=2173 (i.e., the achievable hit ratio is at most about 96%)
callgrind annotations, 4.0.4-alt100.27:
3,904,042,289 PROGRAM TOTALS
1,378,794,846 decode_base62_golomb
1,176,120,148 rpmsetcmp
291,805,495 __GI_strcmp
162,494,544 __GI_strlen
162,222,530 msort_with_tmp'2
56,758,517 memcpy
53,132,375 __GI_strcpy
...
callgrind annotations, this commit (rebuilt in hasher):
2,558,482,547 PROGRAM TOTALS
987,220,089 decode_base62_golomb
468,510,579 rpmsetcmp
162,222,530 msort_with_tmp'2
85,422,341 __GI_strcmp
82,063,609 bcmp
76,510,060 __GI_strlen
63,806,309 memcpy
...
Inclusive rpmsetcmp annotation, this commit:
1,719,199,968 rpmsetcmp
Typical execution time, 4.0.4-alt100.27:
1.87s user 0.29s system 96% cpu 2.242 total
Typical execution time, this commit:
1.52s user 0.31s system 96% cpu 1.895 total
Based on user time, this constitutes about a 20% speed-up. For some
reason, the speed-up is more noticeable on the i586 architecture (27%).
Note that the cache should not be increased further, for two reasons:
1) LRU search is linear - this is fixable; 2) cache memory cannot be
reclaimed - this is unfixable. On average, the cache now takes 1.3M
(max 2M). For cache sizes this small, linear search is acceptable
(cache_decode_set costs about 20M Ir, which is less than memcmp).
An interesting question is to what extent it is worth increasing
the cache size, assuming that memory footprint is not an issue.
A plausible answer is that decode_base62_golomb should cost no
more than 1/2 of rpmsetcmp inclusive time; these are currently 987M Ir
and 1,719M Ir respectively. Since the non-decoding part of rpmsetcmp
thus costs about 730M Ir, the cache should ideally be increased up to
the point where decode_base62_golomb also takes about 700M Ir.
Note, however, that using the midpoint insertion technique seems to
improve cache performance far more than simply increasing the cache size.
This partially reverts what was introduced with the previous commit.
Realize that strlen() must be called *only* when allocating space
for v[]. There is no reason to call strlen() for every Provides
string, since most of them are decoded via a cache hit.
Note, however, that now I have to use the following trick:

    memcmp(str, cur->str, cur->len + 1) == 0

I rely on the fact that this works as expected even when str is
shorter than cur->len: memcmp must start from lower addresses and stop
at the first difference (i.e., memcmp must not read past the end of
str, except possibly for a few trailing bytes on the same memory page);
this is not specified by the standard, but this is how it works in
practice.
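In context, the lookup looks roughly like this (struct layout
illustrative):

    #include <string.h>

    struct cache_ent {
        int   len; /* strlen of the cached base62 string */
        char *str; /* NUL-terminated base62 string */
    };

    /* Comparing len + 1 bytes covers the terminating NUL, so a match
     * implies equal lengths without calling strlen(str); when str is
     * shorter, memcmp is expected to stop at str's NUL byte before
     * reading much past its end. */
    static int cache_match(const struct cache_ent *cur, const char *str)
    {
        return memcmp(str, cur->str, cur->len + 1) == 0;
    }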
Also, since the cache now stores fully decoded values, it is possible
to avoid copying altogether and instead point into internal cache
memory. Copying must still be performed, however, when the set is to
be downsampled.
Note that the average Provides set size is around 1024 elements, which
corresponds to a base62 string of about 2K and a v[] of 4K. Saving a
strlen(2K) and a memcpy(4K) on every rpmsetcmp call is indeed an
improvement.
callgrind annotations for "apt-cache unmet", 4.0.4-alt100.27
1,900,016,996 PROGRAM TOTALS
694,132,522 decode_base62_golomb
583,376,772 rpmsetcmp
106,136,459 __GI_strcmp
102,581,178 __GI_strlen
80,781,386 msort_with_tmp'2
38,648,490 memcpy
26,936,309 __GI_strcpy
26,918,522 regionSwab.clone.2
21,000,896 _int_malloc
...
callgrind annotations for "apt-cache unmet", this commit (rebuilt in hasher):
1,264,977,497 PROGRAM TOTALS
533,131,492 decode_base62_golomb
230,706,690 rpmsetcmp
80,781,386 msort_with_tmp'2
60,541,804 __GI_strlen
42,518,368 memcpy
39,865,182 bcmp
26,918,522 regionSwab.clone.2
21,841,085 _int_malloc
...
Now that string functions are expensive, the API is redesigned so that
strlen is called only once, in rpmsetcmp. The length is then passed as
an argument down to decoding functions. With the length argument, it is
now possible to replace strcmp with memcmp and strcpy with memcpy.
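Schematically (helper names hypothetical, just to show the
substitution):

    #include <string.h>

    /* replaces strcmp(a, b) == 0, given b's known length */
    static int str_eq(const char *a, const char *b, size_t blen)
    {
        return memcmp(a, b, blen + 1) == 0;
    }

    /* replaces strcpy(dst, src), given src's known length */
    static void str_put(char *dst, const char *src, size_t srclen)
    {
        memcpy(dst, src, srclen + 1);
    }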
"Effectively avoided" means something like "prakticheski avoided"
in Russian. Multiple escapse are not avoided "prakticheski", though;
they are avoided altogether and "in principle". The right word does
not come to mind.
Now that decode_base62_golomb is much cheaper, the question is:
is it still worth storing short deltas, as opposed to storing
full values at the expense of shrinking the cache?
callgrind annotations for previous commit:
1,526,256,208 PROGRAM TOTALS
470,195,400 decode_base62_golomb
434,006,244 rpmsetcmp
106,137,949 __GI_strcmp
102,459,314 __GI_strlen
...
callgrind annotations for this commit:
1,427,199,731 PROGRAM TOTALS
533,131,492 decode_base62_golomb
231,592,751 rpmsetcmp
103,476,056 __GI_strlen
102,008,203 __GI_strcmp
...
So, decode_base62_golomb now takes more cycles, but the overall price
goes down. This is because caching short deltas requires two additional
stages: 1) the short deltas must be copied into the unsigned v[] array;
2) decode_delta must be invoked to recover the hash values. Both stages
iterate on a per-value basis, and both seem fast. They are, however,
no match for the single bare memcpy that now replaces them, which uses
xmm registers or something like this.
The loop is logically impeccable, but its main condition
(v1 < v1end && v2 < v2end) is somewhat redundant: in two
of the three cases, only one pointer gets advanced. To
save instructions, the conditions are now handled within
the cases. The loop is now a while (1) loop, a disguised
form of goto.
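Schematically (a sketch of the shape, not the actual set.c code;
assumes both sets are non-empty):

    static int setcmp_core(const unsigned *v1, const unsigned *v1end,
                           const unsigned *v2, const unsigned *v2end)
    {
        int common = 0;
        /* each case advances exactly one pointer (or both, on a
         * match) and checks only the bound it just changed */
        while (1) {
            if (*v1 < *v2) {
                if (++v1 == v1end) break;
            } else if (*v1 > *v2) {
                if (++v2 == v2end) break;
            } else {
                common++;
                if (++v1 == v1end) break;
                if (++v2 == v2end) break;
            }
        }
        return common;
    }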
Also note that, when comparing Requires against Provides,
the Requires set is usually sparse:
P: a b c d e f g h i j k l ...
R: a c h j ...
This means that a nested loop which skips intermediate Provides
elements towards the next Requires element may improve performance.
    while (v1 < v1end && *v1 < *v2)
        v1++;
However, note that the first condition (v1 < v1end) is also somewhat
redundant. This kind of boundary checking can be partially omitted if
the loop gets unrolled. There is a better technique, however, called
the barrier: *v1end must contain the biggest element possible, so that
the trailing *v1 is never smaller than any of *v2. The nested loop
then becomes as simple as

    while (*v1 < *v2)
        v1++;
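Put together, the loop core looks something like this (a sketch, not
the exact set.c code; v1[] must be allocated with one extra slot for
the barrier):

    #include <limits.h>

    static int count_common(unsigned *v1, int n1,
                            const unsigned *v2, int n2)
    {
        const unsigned *v2end = v2 + n2;
        int common = 0;
        v1[n1] = UINT_MAX; /* the barrier: never below any *v2 */
        while (v2 < v2end) {
            while (*v1 < *v2) /* no boundary check needed */
                v1++;
            if (*v1 == *v2)
                common++;
            v2++;
        }
        return common;
    }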
callgrind annotations, 4.0.4-alt100.27:
1,899,657,916 PROGRAM TOTALS
694,132,522 decode_base62_golomb
583,376,772 rpmsetcmp
106,225,572 __GI_strcmp
102,459,314 __GI_strlen
...
callgrind annotations, this commit (rebuilt in hasher):
1,526,256,208 PROGRAM TOTALS
470,195,400 decode_base62_golomb
434,006,244 rpmsetcmp
106,137,949 __GI_strcmp
102,459,314 __GI_strlen
...
Note that rpmsetcmp also absorbs cache_decode_set and decode_delta;
the loop itself is now about twice as fast.
The whole point of using a table is not only that comparisons
like (c >= 'a' && c <= 'z') can be eliminated, but also that conditional
branches (the "ands" and "ifs") can be eliminated as well.
The existing code, however, uses separate branches to check e.g. for the
end of the string, to check for an error, and to check for the
(num6b < 61) common case. With this change, the table is restructured
so that the common case is handled with only a single instruction.
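The restructured table might be built like this (alphabet order and
out-of-band codes illustrative, not the exact set.c layout):

    /* Every input byte maps to exactly one code: the common case is a
     * value below 61 fetched with a single indexed load; NUL and
     * invalid bytes get out-of-band codes, so the hot path needs no
     * separate "and"/"if" branches. */
    enum { C_EOS = 62, C_ERR = 63 };

    static unsigned char b62tab[256];

    static void b62tab_init(void)
    {
        int i;
        for (i = 0; i < 256; i++)
            b62tab[i] = C_ERR;
        b62tab['\0'] = C_EOS;
        for (i = 0; i < 10; i++) b62tab['0' + i] = i;
        for (i = 0; i < 26; i++) b62tab['a' + i] = 10 + i;
        for (i = 0; i < 26; i++) b62tab['A' + i] = 36 + i;
    }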
Note that the checkHardLinks function is now removed. It was unclear
whether it was supposed to verify %lang attributes (returning non-zero
on error) or to indicate whether all hardlinks are packaged within the
package.
It turns out that only a single package in our repo has
PartialHardlinkSets dependency:
$ cd /ALT/Sisyphus/files/x86_64/RPMS/
$ rpm -qp --qf '[%{NAME}\t%{REQUIRENAME}\n]' *.rpm |fgrep 'PartialHardlinkSets'
$ cd /ALT/Sisyphus/files/noarch/RPMS/
$ rpm -qp --qf '[%{NAME}\t%{REQUIRENAME}\n]' *.rpm |fgrep 'PartialHardlinkSets'
freeciv-common rpmlib(PartialHardlinkSets)
$
This probably means that freeciv-common has hardlinks with different
%lang attributes (which probably was supposed to be an error). So
the whole issue should be reconsidered. I leave XXX marks in the
code and suggest a new PartialHardlinkSets implementation (however,
the dependency is not being added yet).
Pushing new elements to the front tends to assign extra weight to those
elements, at the expense of elements that are already in the cache.
The idea, then, is to try first-time insertion somewhere in the middle;
see the sketch after the numbers below. Further attempts suggest that
the "pivot" should be closer to the end.
Cache performance for "apt-shell <<<unmet", previous commit:
hit=62375 miss=17252
Cache performance for "apt-shell <<<unmet", this commit:
hit=65085 miss=14542
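The insertion sketch (sizes, slot layout, and the exact pivot position
are illustrative; a hit is still promoted to the front as in plain LRU):

    #include <string.h>

    #define CACHE_SIZE 160
    #define PIVOT      120 /* "closer to the end" */

    struct slot { const char *str; unsigned *v; int n; };
    static struct slot cache[CACHE_SIZE];

    /* First-time insertion lands at PIVOT: entries in slots
     * 0..PIVOT-1 have proven themselves with at least one hit and
     * cannot be evicted by a burst of one-shot strings.  (The evicted
     * tail entry would be freed in real code.) */
    static void cache_insert_new(struct slot ent)
    {
        memmove(cache + PIVOT + 1, cache + PIVOT,
                (CACHE_SIZE - PIVOT - 1) * sizeof(struct slot));
        cache[PIVOT] = ent;
    }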
After base62+golomb decoding, most deltas are under 65536 (in Provides
versions, the average delta should be around 1536). So the whole version
can be stored using short deltas, effectively halving the memory
footprint. However, this seems to be somewhat slower: per-delta copying
and decode_delta must be invoked to recover the hash values. On the
other hand, this allows the cache size to be increased (128 -> 192).
But note that, with larger cache sizes, the LRU linear search will take
longer. So this is a compromise - and apparently a favourable one.
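The recovery pass, sketched (names illustrative):

    #include <stdint.h>

    /* Recover full hash values from cached 16-bit deltas; this is the
     * per-delta work that full-value caching would later trade for a
     * single memcpy. */
    static void expand_deltas(const uint16_t *delta, int n, unsigned *v)
    {
        unsigned acc = 0;
        int i;
        for (i = 0; i < n; i++) {
            acc += delta[i];
            v[i] = acc;
        }
    }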
Currently, set.c uses an array of chars to store the bit vector, each
char holding a single bit (0 or 1).
Let's use a packed bitmap instead. It creates room for optimizations.
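For instance (word type illustrative):

    #include <limits.h>

    typedef unsigned bmword;
    #define BM_BITS (sizeof(bmword) * CHAR_BIT)

    static void bm_set(bmword *bm, unsigned i)
    {
        bm[i / BM_BITS] |= (bmword) 1 << (i % BM_BITS);
    }

    static int bm_get(const bmword *bm, unsigned i)
    {
        return (bm[i / BM_BITS] >> (i % BM_BITS)) & 1;
    }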
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>