Commit Graph

11 Commits

Author SHA1 Message Date
Amar Tumballi
d54f5cdfbc timer-wheel: run the timer function outside of locked region
Surprisingly, this also reduced the 80% CPU overhead that the
'perf record' tool reported in posix_spin_lock during the volume
initialization process of a brick-mux test.
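
A minimal sketch of the general pattern, not the actual patch: expired
timers are unlinked while the lock is held, and their callbacks run only
after the lock is dropped. All names here (demo_timer, demo_base,
demo_run_expired, demo_rearm_cb) are hypothetical, and a pthread mutex
stands in for whatever lock the real code uses.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical minimal timer; not the gluster timer-wheel structure. */
struct demo_timer {
    struct demo_timer *next;
    unsigned long expires;
    void (*fn)(struct demo_timer *, void *);
    void *data;
};

struct demo_base {
    pthread_mutex_t lock;
    struct demo_timer *pending; /* singly linked list of armed timers */
};

/* Arm a timer. Takes the lock itself, so it must not be called with it held. */
void demo_add(struct demo_base *base, struct demo_timer *t)
{
    pthread_mutex_lock(&base->lock);
    t->next = base->pending;
    base->pending = t;
    pthread_mutex_unlock(&base->lock);
}

/* Run every timer with expires <= now; returns how many callbacks ran. */
int demo_run_expired(struct demo_base *base, unsigned long now)
{
    struct demo_timer *expired = NULL, **pp, *t;
    int fired = 0;

    /* Phase 1: under the lock, only move expired timers to a private list. */
    pthread_mutex_lock(&base->lock);
    pp = &base->pending;
    while ((t = *pp) != NULL) {
        if (t->expires <= now) {
            *pp = t->next;
            t->next = expired;
            expired = t;
        } else {
            pp = &t->next;
        }
    }
    pthread_mutex_unlock(&base->lock);

    /* Phase 2: callbacks run unlocked, so they may take the lock themselves. */
    while ((t = expired) != NULL) {
        expired = t->next;
        t->next = NULL;
        t->fn(t, t->data);
        fired++;
    }
    return fired;
}

/* Example callback: re-arms itself 100 ticks later via demo_add(). */
void demo_rearm_cb(struct demo_timer *t, void *data)
{
    t->expires += 100;
    demo_add((struct demo_base *)data, t);
}
```

Because demo_rearm_cb() takes the lock through demo_add(), it would
self-deadlock if callbacks were still invoked inside the locked region.
Keeping the locked phase this short also means other threads spend less
time waiting on the lock.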

updates: bz#1193929
Change-Id: I4e1df60d6fd094105c312df39f1527d3f07bed68
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2019-01-09 02:53:27 +00:00
ShyamsundarR
20ef211cfa libglusterfs: Move devel headers under glusterfs directory
The libglusterfs devel package headers are referenced in code using
the include semantics for a program's own headers. While this works,
it can be better, especially when dealing with out-of-tree xlator
builds or, in general, out-of-tree devel package usage.

Towards this, the following changes are done:
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs

This change, although big, just moves the headers around and
makes it correct to include these headers from other sources.

This helps us correctly include libglusterfs includes without
namespace conflicts.

Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2018-12-05 21:47:04 +00:00
Kaleb S. KEITHLEY
9d70343977 contrib/timerwheel: bad 32-bit, use builtin fls(), fix copyright
It's bad form to remove other people's copyright and license when you
copy their source for your own use.

Defining BITS_PER_LONG as 64 is incorrect on 32-bit platforms.

The mismatch between the unsigned long of the timer and the int
param to fls() means that, on 64-bit platforms, any bits set in the
high 32 bits of the timer are lost/ignored.

gf_tw_find_last_bit() is meant to find the last bit in an array of
longs. It's overkill for gluster's timerwheel, where we only ever pass
a single long; replacing it with a direct call to fls(), which is
renamed to gf_tw_fls().
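
A minimal sketch of the two problems described above, assuming the
GCC/Clang builtins __builtin_clzl() and __builtin_clz(). The names
DEMO_BITS_PER_LONG, demo_fls_long and demo_fls_int are hypothetical and
are not the code in this patch; the 1-based return convention (0 when no
bit is set) is also an assumption.

```c
#include <limits.h>

/* Width of long taken from the platform instead of hard-coding 64. */
#define DEMO_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))
#define DEMO_BITS_PER_INT ((int)(sizeof(unsigned int) * CHAR_BIT))

/* 1-based index of the highest set bit of a full long, 0 if none is set. */
int demo_fls_long(unsigned long word)
{
    if (word == 0)
        return 0; /* __builtin_clzl(0) is undefined */
    return DEMO_BITS_PER_LONG - __builtin_clzl(word);
}

/*
 * Same operation on an int-sized value. Passing an unsigned long here
 * narrows it first, so bits above the int width never reach the scan.
 */
int demo_fls_int(unsigned int word)
{
    if (word == 0)
        return 0; /* __builtin_clz(0) is undefined */
    return DEMO_BITS_PER_INT - __builtin_clz(word);
}
```

On a platform where long is wider than int, demo_fls_int() reports 0 for
a value whose only set bit lies above the int width, whereas
demo_fls_long() reports the correct position.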

The timer routines are slightly modified from the kernel timer
functions that first appeared circa 2.6.x in .../kernel/timer.c
AFAICT.

find_last_bit() comes from the (linux) kernel (.../lib/find_bit.c
in 4.x kernels, .../lib/find_last_bit.c in 3.x kernels) but as noted
above, it is removed with this patch.

__fls() comes from the linux kernel (.../include/asm-generic/
bitops/{__fls.h,builtin-__fls.h}).

Restoring/updating the copyright and license to the version from
the 4.x kernel find_bit.c. (timer.c does not have a license, __fls.h
and builtin-__fls.h do not have a copyright or license, but the whole
kernel is licensed under GPLv2 anyway.)

Change-Id: I2d2defccf1ccc74f55d99e94212747a36a1dff35
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: https://review.gluster.org/17146
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
2017-05-15 14:19:01 +00:00
Shyamsundar Ranganathan
9374338f9c Revert "contrib/timerwheel: probable bug on 32-bit, use __builtin_ffs()"
This reverts commit c92b8347ae.

Commit is not ready for a merge!

Change-Id: I3b3b52f7bfb4781dd42160e2b1059b4cdeb17956
Reviewed-on: https://review.gluster.org/17147
Tested-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
2017-05-01 21:43:38 +00:00
Kaleb S. KEITHLEY
c92b8347ae contrib/timerwheel: probable bug on 32-bit, use __builtin_ffs()
Simply always defining BITS_PER_LONG as 64 seems like it's almost
certainly wrong on 32-bit platforms and could potentially result in
incorrect results.

fls and, e.g., __builtin_ffs() return the same answer for any given
input, making it seem like the name fls (find last set) is a misnomer
and ffs (find first set, starting from the lsb) is the more accurate
name.
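
A small sketch for comparing the two operations directly, assuming the
GCC/Clang builtins __builtin_ffs() and __builtin_clz(); demo_ffs and
demo_fls are hypothetical helper names. The two agree for values with
exactly one bit set, but can differ once more than one bit is set.

```c
#include <limits.h>

/* 1-based index of the lowest set bit, 0 if x is 0. */
int demo_ffs(unsigned int x)
{
    return __builtin_ffs((int)x);
}

/* 1-based index of the highest set bit, 0 if x is 0. */
int demo_fls(unsigned int x)
{
    if (x == 0)
        return 0; /* __builtin_clz(0) is undefined */
    return (int)(sizeof(unsigned int) * CHAR_BIT) - __builtin_clz(x);
}
```

For example, both helpers return 5 for 0x10, while for 0x18 demo_ffs()
returns 4 and demo_fls() returns 5.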

Using __builtin_ffs() causes the compiler (on Intel) to emit code
with the bsf (bit scan forward) insn, which is approx 3x faster than
the code in ffs(), at least on the machine I tried it on. (Even so,
it takes 10M+ iterations for the speed difference to be measurable.
Choosing the "faster" implementation seems like a no-brainer, even
if there may not be any significant gain by doing so.)

Change-Id: I1616dda1a5b76f208ba737a713877c1673131e33
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: https://review.gluster.org/17142
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
2017-05-01 16:05:56 +00:00
Kaleb S. KEITHLEY
0fdf6c9db5 build: Mac OS X build issues, no spinlock, need sys_lgetxattr
use regular locks, use our syscall wrappers in libglusterfs

Change-Id: I7e0d00956366806af041b69b65d1f169aa0d2ae2
BUG: 1238793
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/11515
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
2015-07-05 13:03:19 -07:00
Venky Shankar
6ab37f0cb4 features/bitrot: cleanup, v2
This patch uses the "cleanup, v1" infrastructure to clean up the
scrubber (data structures, threads, timers, etc.) on brick
disconnection. The signer is not cleaned up yet; that would probably
be done as part of another patch.

Change-Id: I78a92b8a7f02b2f39078aa9a5a6b101fc499fd70
BUG: 1231619
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11148
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
2015-06-25 04:45:19 -07:00
Venky Shankar
9a314fa226 contrib/timer-wheel: fix deadlock in del_timer()
Commit eaf3bfa added mod_timer() and, in doing so, broke
del_timer() by incorrectly wrapping it within double lock
blocks.

del_timer() was included before the above commit for the sake of
timer API completeness; thankfully, no one has used it until now.
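
A sketch of the bug's general shape, not the actual gluster code. An
error-checking pthread mutex is used here (an assumption made purely for
illustration) so that the second lock attempt reports EDEADLK instead of
hanging, as it would with a plain lock. All demo_* names are
hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

struct demo_wheel { pthread_mutex_t lock; };
struct demo_entry { bool pending; };

/* Initialise the lock as an error-checking mutex; returns 0 on success. */
int demo_wheel_init(struct demo_wheel *w)
{
    pthread_mutexattr_t attr;
    int ret = pthread_mutexattr_init(&attr);

    if (ret != 0)
        return ret;
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (ret == 0)
        ret = pthread_mutex_init(&w->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return ret;
}

/* Internal helper: the caller must already hold w->lock. */
int demo_detach_locked(struct demo_entry *e)
{
    int was_pending = e->pending;

    e->pending = false;
    return was_pending;
}

/* Correct shape: the public entry point takes the lock exactly once. */
int demo_del_timer(struct demo_wheel *w, struct demo_entry *e)
{
    int ret;

    pthread_mutex_lock(&w->lock);
    ret = demo_detach_locked(e);
    pthread_mutex_unlock(&w->lock);
    return ret;
}

/* Buggy shape: locks again while already holding the lock. */
int demo_del_timer_double_locked(struct demo_wheel *w, struct demo_entry *e)
{
    int inner;

    pthread_mutex_lock(&w->lock);
    inner = pthread_mutex_lock(&w->lock); /* EDEADLK on this mutex type */
    if (inner == 0) {
        demo_detach_locked(e);
        pthread_mutex_unlock(&w->lock);
    }
    pthread_mutex_unlock(&w->lock);
    return inner; /* non-zero means the nested lock could not be taken */
}
```

Splitting the work into a "caller holds the lock" helper and a public
wrapper that locks once is one common way to avoid this kind of nesting.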

Change-Id: I07a454a216cf09dbb84777a23630e74a1e7f2830
BUG: 1227449
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11050
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
2015-06-02 20:24:27 -07:00
Venky Shankar
eaf3bfa188 contrib/timer-wheel: mod_timer() and friends
A couple of timer-wheel APIs to modify timer expiry times:

  mod_timer()
  mod_timer_pending()

Both APIs perform almost the same job, with one small
difference: mod_timer_pending() modifies the timer expiry only
if the timer is pending (i.e. being tracked in the timer-wheel).
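
The difference can be sketched as follows. This is a simplified model
and not the gluster implementation: struct demo_tw_timer and both
functions are hypothetical, locking and bucket handling are left out,
and the return convention (1 if the timer was pending, 0 otherwise) is
an assumption borrowed from the Linux kernel functions of the same name.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a timer tracked by the wheel. */
struct demo_tw_timer {
    unsigned long expires;
    bool pending; /* true while the wheel is tracking the timer */
};

/* Always sets the new expiry and (re)arms the timer. */
int demo_mod_timer(struct demo_tw_timer *t, unsigned long expires)
{
    int was_pending = t->pending;

    t->expires = expires;
    t->pending = true;
    return was_pending;
}

/* Changes the expiry only if the timer is already pending. */
int demo_mod_timer_pending(struct demo_tw_timer *t, unsigned long expires)
{
    if (!t->pending)
        return 0; /* an inactive timer is left untouched */
    t->expires = expires;
    return 1;
}
```

In this model, calling demo_mod_timer_pending() on an inactive timer is
a no-op, while demo_mod_timer() would activate it.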

Change-Id: Iae64934854ccfd6b081b849bff998ae3c3021bac
BUG: 1224596
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10892
Tested-by: NetBSD Build System
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
2015-05-28 02:44:25 -07:00
Venky Shankar
004f64e93d core: Global timer-wheel
Instantiate a process-wide global instance of the timer wheel
data structure. Spawning a glusterfs* process with the option
"--global-timer-wheel" instantiates a global instance of the
timer-wheel under the global context (->ctx).

Translators can make use of this process-wide instance [via a
call to glusterfs_global_timer_wheel()] instead of maintaining
an instance of their own and possibly consuming more memory.
The Linux kernel, too, has a single instance of the timer wheel
that subsystems such as IO, networking, etc. make use of.

The bitrot daemon would be an early consumer of this: bitrot
translator instances for multiple volumes would track objects
belonging to their respective bricks in this global expiry tracking
data structure. This is also a first step towards moving the
GlusterFS timer mechanism to the timer-wheel.

Change-Id: Ie882df607e07acaced846ea269ebf1ece306d6ae
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10380
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
2015-04-26 11:04:11 -07:00
Venky Shankar
5394f3cf60 contrib/timer-wheel: import linux kernel timer-wheel
This patch imports the timer-wheel[1] algorithm from the linux
kernel (~/kernel/time/timer.c) with some modifications.

Timer-wheel is an efficient way to track millions of timers for
expiry. It is a variant of the simple but RAM-heavy approach
of having a list (timer bucket) for every future second.
Timer-wheel categorizes every future second into a logarithmic
array of arrays. This is done by splitting the 32-bit "timeout"
value into fixed "sliced" bits, so that each category has a
fixed-size array to which buckets are assigned.

A classic split would be 8+6+6+6+6 (used in this patch), which
results in 256+64+64+64+64 == 512 buckets. Therefore, the entire
range of 32-bit future timeouts is mapped into 512 buckets.
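
The arithmetic can be checked with a short sketch. It assumes the Linux
kernel's classic layout of one 8-bit slice followed by four 6-bit
slices, which is the layout that adds up to 32 bits and 512 buckets. The
DEMO_* macros and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TVR_BITS 8                     /* first (finest) category */
#define DEMO_TVN_BITS 6                     /* each coarser category */
#define DEMO_TVR_SIZE (1u << DEMO_TVR_BITS) /* 256 buckets */
#define DEMO_TVN_SIZE (1u << DEMO_TVN_BITS) /* 64 buckets */
#define DEMO_LEVELS 5                       /* 8 + 6 + 6 + 6 + 6 */

/* Number of timeout bits covered by all categories together. */
unsigned int demo_total_bits(void)
{
    return DEMO_TVR_BITS + (DEMO_LEVELS - 1) * DEMO_TVN_BITS;
}

/* Number of buckets across all categories. */
unsigned int demo_total_buckets(void)
{
    return DEMO_TVR_SIZE + (DEMO_LEVELS - 1) * DEMO_TVN_SIZE;
}

/* Bucket index that `value` selects within category `level` (0..4). */
unsigned int demo_slice(uint32_t value, int level)
{
    assert(level >= 0 && level < DEMO_LEVELS);
    if (level == 0)
        return value & (DEMO_TVR_SIZE - 1);
    return (value >> (DEMO_TVR_BITS + (level - 1) * DEMO_TVN_BITS)) &
           (DEMO_TVN_SIZE - 1);
}
```

demo_total_bits() evaluates to 32 and demo_total_buckets() to 512;
demo_slice() shows how a single 32-bit value is cut into one bucket
index per category.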

[
   NOTE:
     There are other possible splits, such as "8+8+8+8", but
     this patch sticks to the widely used and tested default.
]

Therefore, the first category "holds" timers whose expiry range
is between 1..256, the next category holds 257..16384, the third
category 16385..1048576, and so on. When timers are added,
unless they are in the first category, timers with different
timeouts could end up in the same bucket. This means that the
timers are "partially sorted" -- sorted in their highest bits.

The expiry code walks the first array of buckets and expires
any pending timers (1..256). Next, at time value 257, timers
in the first bucket of the second array are "cascaded" onto
the first category and placed into their respective
buckets according to their timeout values. Cascading
"brings down" the timers' timeouts to the correct bucket
of their respective category. Therefore, timers are sorted
by the highest bits of the timeout value and then by the
lower bits too.
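
A much-reduced sketch of adding, cascading and expiring timers. It
models only the first two categories (256 and 64 buckets) and is not the
imported code: the mini_* names are hypothetical, a timer "fires" by
setting a flag, and ticks are counted from 0, so the first cascade
happens at tick 256.

```c
#include <stddef.h>

#define MINI_TVR_BITS 8
#define MINI_TVN_BITS 6
#define MINI_TVR_SIZE (1UL << MINI_TVR_BITS)
#define MINI_TVN_SIZE (1UL << MINI_TVN_BITS)
#define MINI_TVR_MASK (MINI_TVR_SIZE - 1)
#define MINI_TVN_MASK (MINI_TVN_SIZE - 1)

struct mini_timer {
    struct mini_timer *next;
    unsigned long expires;
    int fired;
};

struct mini_wheel {
    unsigned long now;                     /* next tick to process */
    struct mini_timer *tv1[MINI_TVR_SIZE]; /* one bucket per tick */
    struct mini_timer *tv2[MINI_TVN_SIZE]; /* one bucket per 256 ticks */
};

void mini_push(struct mini_timer **bucket, struct mini_timer *t)
{
    t->next = *bucket;
    *bucket = t;
}

/* Place a timer by its distance from now; -1 if beyond two categories. */
int mini_add(struct mini_wheel *w, struct mini_timer *t)
{
    /* A timer already in the past is treated as due on the current tick. */
    unsigned long expires = t->expires < w->now ? w->now : t->expires;
    unsigned long idx = expires - w->now;

    if (idx < MINI_TVR_SIZE)
        mini_push(&w->tv1[expires & MINI_TVR_MASK], t);
    else if (idx < (1UL << (MINI_TVR_BITS + MINI_TVN_BITS)))
        mini_push(&w->tv2[(expires >> MINI_TVR_BITS) & MINI_TVN_MASK], t);
    else
        return -1;
    return 0;
}

/* Process one tick; returns the number of timers that fired. */
int mini_tick(struct mini_wheel *w)
{
    unsigned long index = w->now & MINI_TVR_MASK;
    struct mini_timer *t, *next;
    int fired = 0;

    if (index == 0) {
        /* First array wrapped: cascade one second-level bucket down. */
        unsigned long b = (w->now >> MINI_TVR_BITS) & MINI_TVN_MASK;
        struct mini_timer *list = w->tv2[b];

        w->tv2[b] = NULL;
        while ((t = list) != NULL) {
            list = t->next;
            mini_add(w, t); /* now lands in tv1, sorted by its low bits */
        }
    }
    t = w->tv1[index];
    w->tv1[index] = NULL;
    w->now++;
    for (; t != NULL; t = next) {
        next = t->next;
        t->next = NULL;
        t->fired = 1;
        fired++;
    }
    return fired;
}

/* Process `ticks` consecutive ticks; returns the total number fired. */
int mini_advance(struct mini_wheel *w, unsigned long ticks)
{
    int fired = 0;

    while (ticks--)
        fired += mini_tick(w);
    return fired;
}
```

In this sketch, timers expiring at ticks 260 and 300 share one
second-level bucket when added at tick 0. They are only separated into
distinct first-level buckets when that bucket is cascaded.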

[1] https://lwn.net/Articles/152436/

Change-Id: I1219abf69290961ae9a3d483e11c107c5f49c4e3
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9707
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
2015-03-18 22:05:51 -07:00