1768 Commits

Author SHA1 Message Date
Amar Tumballi
63088d8225 multiple-files: clang-scan fixes
updates: bz#1622665
Change-Id: I9f3a75ed9be3d90f37843a140563c356830ef945
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-12-31 06:45:17 +00:00
Yaniv Kaul
3ce7b5dbf0 libglusterfs/src/mem-types.h: remove unused common enums from mem-types.h
They were not used at all, just taking space.
I've also marked all those that are not common really, but used
in just one place - they probably should move there (in follow-up
patches)

As a test, I've removed from the stripe xlator unused private
enums and moved one that was in the common list, but only
used in the stripe code, to be a private enum.

Compile-tested only!

updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I1158dc1d259f1fd3f69904336c46c9d83cea799f
2018-12-30 11:59:27 +00:00
Amar Tumballi
e1f92176a8 all: handle USE_AFTER_FREE warnings
* we shouldn't be using 'local' after DHT_STACK_UNWIND() as it frees
the content of local. Add a 'goto out' or similar logic to handle
the situation.

* fix possible overlook of unref(dict), instead of unref(xdata).

* make coverity happy by re-ordering unref in meta-defaults.

* gfid-access: re-order dictionary allocation so we don't have to
  do an extra unref.

* other obvious errors reported.
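
The first point can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: example_frame, example_local and example_stack_unwind() are hypothetical stand-ins for a call frame, its 'local' and an unwind macro that frees 'local'.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a call frame and its per-call 'local' state. */
struct example_local {
    int op_ret;
};

struct example_frame {
    struct example_local *local;
};

/* Stand-in for an unwind macro such as DHT_STACK_UNWIND(): it hands the
 * result to the caller and frees frame->local as a side effect. */
void example_stack_unwind(struct example_frame *frame, int *result_out)
{
    *result_out = frame->local->op_ret;
    free(frame->local);
    frame->local = NULL;
}

/* Returns 1 if the normal path ran, 0 if the error path was taken. */
int example_callback(struct example_frame *frame, int *result_out)
{
    struct example_local *local = frame->local;
    int normal_path = 0;

    assert(local != NULL);

    if (local->op_ret < 0) {
        example_stack_unwind(frame, result_out);
        /* 'local' is freed now: jump past every statement that reads it. */
        goto out;
    }

    local->op_ret += 1; /* safe, the frame has not been unwound yet */
    normal_path = 1;
    example_stack_unwind(frame, result_out);
out:
    return normal_path;
}
```

Without the 'goto out', the error branch would fall through into code that still dereferences 'local'.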

updates: bz#789278
Change-Id: If05961ee946b0c4868df19861d7e4a927a2a2489
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-12-20 06:26:37 +00:00
Poornima G
2af8fca492 posix: use synctask for janitor
With brick mux, the number of threads increases as the number of
bricks increases. To reduce the number of threads in the brick mux
scenario, replace the janitor thread with the synctask infra.

Now close() and closedir() are handled by a separate janitor
thread which is linked with glusterfs_ctx.

Updates #475
Change-Id: I0c4aaf728125ab7264442fde59f3d08542785f73
Signed-off-by: Poornima G <pgurusid@redhat.com>
2018-12-19 14:36:52 +00:00
karthik-us
9662504d45 cluster/afr: Allow lookup on root if it is from ADD_REPLICA_MOUNT
Problem: When trying to convert a plain distribute volume to replica-3
or arbiter type it is failing with ENOTCONN error as the lookup on
the root will fail as there is no quorum.

Fix: Allow lookup on root if it is coming from the ADD_REPLICA_MOUNT
which is used while adding bricks to a volume. It will try to set the
pending xattrs for the newly added bricks to allow the heal to happen
in the right direction and avoid data loss scenarios.

Note: This fix solves the type conversion problem only in the
case where the volume was mounted at least once. The conversion of
non mounted volumes will still fail: the dht selfheal that tries to
set the directory layout fails, because it does that with the PID
GF_CLIENT_PID_NO_ROOT_SQUASH set in the frame->root.

Change-Id: Ic511939981dad118cc946754341318b164954b3b
fixes: bz#1655854
Signed-off-by: karthik-us <ksubrahm@redhat.com>
2018-12-18 10:30:19 +00:00
Poornima G
b87c397091 iobuf: Get rid of pre allocated iobuf_pool and use per thread mem pool
The current implementation of iobuf_pool has two problems:
- prealloc of 12.5MB memory, this limits the scale factor of the gluster
  processes due to RAM requirements
- lock contention, as the current implementation has one global
  iobuf_pool lock. Credits for debugging and addressing the same goes to
  Krutika Dhananjay <kdhananj@redhat.com>. Issue: #410

Hence changing the iobuf implementation to use per thread mem pool.
This may theoretically appear to cause a perf dip as there is no preallocation.
But per thread mem pool will not have significant perf impact as the last
allocated memory is kept alive for subsequent allocs, for some time.
The worst case would be if iobufs requested are of random sizes each time.
The best case is, if we get iobuf request of the same size. From the perf
tests, this patch did not seem to cause any perf decrease.

Note that, with this patch, the rdma performance is going to degrade
drastically. In one of the previous patchsets we had fixes to not
degrade rdma perf, but rdma is not supported and also not tested [1].
Hence the decision was to not have code in rdma that is not tested
and not supported.

[1] https://lists.gluster.org/pipermail/gluster-users.old/2018-July/034400.html

Updates: #325
Change-Id: Ic2ef3bd498f9250dea25f25ba0c01fde19584b27
Signed-off-by: Poornima G <pgurusid@redhat.com>
2018-12-18 09:35:24 +00:00
Poornima G
9ff080382c mem-pool: Add api to mem_get based on requested size
Currently mem-pool implementation provides api to get from the
mem pool based on the struct type. This is to retain api
compatibility with the old implementation of mem pool. Internally
in the mem pool structure there is a mapping from struct to size
based pools.

In this patch, we are adding new APIs to fetch memory from mem pool,
given a size.
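
A size-based lookup could look roughly like the sketch below. It assumes power-of-two pools between 2^7 and 2^20 bytes; those bounds and the function name example_pool_index_for_size() are assumptions made for illustration, not taken from this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bounds for this sketch: pools hold objects of 2^7 .. 2^20 bytes. */
#define EXAMPLE_POOL_SMALLEST 7
#define EXAMPLE_POOL_LARGEST 20

/* Return the index of the smallest power-of-two pool that can hold
 * 'size' bytes, or -1 if the request is too large for any pool. */
int example_pool_index_for_size(size_t size)
{
    unsigned int power;

    for (power = EXAMPLE_POOL_SMALLEST; power <= EXAMPLE_POOL_LARGEST; power++) {
        if (size <= ((size_t)1 << power))
            return (int)(power - EXAMPLE_POOL_SMALLEST);
    }
    return -1;
}
```

A struct-based API can then be a thin wrapper that passes sizeof(type) to the size-based one.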

Change-Id: Ib220ee45ebd134a7be8f6482db5a592dbb7b9211
Updates: #325
Signed-off-by: Poornima G <pgurusid@redhat.com>
2018-12-17 17:26:59 +00:00
Amar Tumballi
d49b41e817 fuse: add --lru-limit option
The inode LRU mechanism is moot in fuse xlator (ie. there is no
limit for the LRU list), as fuse inodes are referenced from
kernel context, and thus they can only be dropped on request of
the kernel. This might result in a high number of passive
inodes which are useless for the glusterfs client, causing a
significant memory overhead.

This change tries to remedy this by extending the LRU semantics
and allowing to set a finite limit on the fuse inode LRU.

A brief history of problem:

When gluster's inode table was designed, fuse didn't have any
'invalidate' method, which means, userspace application could
never ask kernel to send a 'forget()' fop, instead had to wait
for kernel to send it based on kernel's parameters. Inode table
remembers the number of times kernel has cached the inode based
on the 'nlookup' parameter. And the 'nlookup' field is not used by
any other entry points (like server-protocol, gfapi etc).

Hence the inode_table of fuse module always has to have lru-limit
as '0', which means no limit. GlusterFS always had to keep all
inodes in memory as kernel would have had a reference to it.
Again, the reason for this is, kernel's glusterfs inode reference
was pointer of 'inode_t' structure in glusterfs. As it is a
pointer, we could never free it (to prevent segfault, or memory
corruption).

Solution:

In the inode table, handle the prune case of inodes with 'nlookup'
differently, and call an 'invalidator' method, which in this case is
fuse_invalidate(), and it sends the request to kernel for getting
the forget request.

When the kernel sends the forget, it means, it has dropped all
the reference to the inode, and it will send the forget with the
'nlookup' parameter too. We just need to make sure to reduce the
'nlookup' value we have when we get forget. That automatically
causes the relevant prune to happen.

Credits: Csaba Henk, Xavier Hernandez, Raghavendra Gowdappa, Nithya B

fixes: bz#1560969
Change-Id: Ifee0737b23b12b1426c224ec5b8f591f487d83a2
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-12-14 17:34:28 +00:00
Yaniv Kaul
e3d01793a3 Multiple posix related files: several modifications
Just looked at posix.c and related code and performed
some changes and cleanups. The only important one is #3 below,
but surely the others (#2 and #4) need careful review.
Changes to other files are as they were related to code paths
in posix.c.

I'll send a separate patch for other posix related files.

Main changes:
1. Proper initialization for parameters, where it made sense.
2. Logged outside the lock in several places.
3. Moved from CALLOC to MALLOC where it made sense.
4. Aligned structures.
5. moved dictionary functions to use _sizen where possible.
(dict_get() -> dict_get_sizen() for example)
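
The idea behind the _sizen variants (point 5) can be sketched as below: for a string literal key the length is known at compile time, so no strlen() is needed at run time. EXAMPLE_SLEN, example_getn() and example_get_sizen() are hypothetical names for illustration, not the libglusterfs definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of a string literal, computed at compile time. */
#define EXAMPLE_SLEN(str) (sizeof(str) - 1)

struct example_pair {
    const char *key;
    size_t keylen;
    int value;
};

/* Length-aware lookup: compare lengths first, then the bytes. */
const int *example_getn(const struct example_pair *pairs, size_t count,
                        const char *key, size_t keylen)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (pairs[i].keylen == keylen &&
            memcmp(pairs[i].key, key, keylen) == 0)
            return &pairs[i].value;
    }
    return NULL;
}

/* Only valid for string literal keys: with a char pointer, sizeof would
 * yield the pointer size instead of the string length. */
#define example_get_sizen(pairs, count, key) \
    example_getn((pairs), (count), (key), EXAMPLE_SLEN(key))
```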

Compile-tested only!

Change-Id: Ia84699fb495e06d095339c91c1ba770d1393bb6c
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
2018-12-14 04:53:36 +00:00
Amar Tumballi
af7e957b49 xlator: make 'xlator_api' mandatory
* Remove the options to load old symbol.
* export only the 'xlator_api' symbol, using xlator.sym
* add xlator_api to all the xlators where it's missing

NOTE: This covers all the xlators which have at least a test case
to validate its loading. If there is a translator, which doesn't
have any test, then we should probably remove that from codebase.

fixes: #164
Change-Id: Ibcdc8c9844cda6b4463d907a15813745d14c1ebb
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-12-13 14:41:50 +05:30
Mohit Agrawal
fb917bf10b [geo-rep]: Worker still ACTIVE after killing bricks
Problem: In changelog xlator after destroying listener it calls
         unlink to delete changelog socket file but socket file
         reference is not cleaned up from process memory

Solution: 1) To cleanup reference completely from process memory
             serialize transport cleanup for changelog and then
             unlink socket file
          2) Brick xlator will notify GF_EVENT_PARENT_DOWN to next
             xlator only after cleanup all xprts

Test: To test this, run the steps below
      1) Setup some volume and enable brick mux
      2) kill any one brick with gf_attach
      3) check in lsof for the changelog socket specific to the killed
         brick, it should be cleaned up completely

fixes: bz#1600145

Change-Id: Iba06cbf77d8a87b34a60fce50f6d8c0d427fa491
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
2018-12-13 04:46:50 +00:00
Milind Changire
c9d117d54a core: move invalid port logs to DEBUG log level
Stop spamming "invalid port" logs in case sysadmin has reserved a large
number of ports.

Change-Id: I244ef7693560cc404b36cadc6b05d92ec0e908d3
fixes: bz#1656517
Signed-off-by: Milind Changire <mchangir@redhat.com>
2018-12-12 15:59:56 +00:00
Raghavendra Bhat
7dadea15c5 copy_file_range support in GlusterFS
* libglusterfs changes to add new fop

    * Fuse changes:
      - Changes in fuse bridge xlator to receive and send responses

    * posix changes to perform the op on the backend filesystem

    * protocol and rpc changes for sending and receiving the fop

    * gfapi changes for performing the fop

    * tools: glfs-copy-file-range tool for testing copy_file_range fop

      - Although copy_file_range support has been added to the upstream
        fuse kernel module, no release has been made yet of a kernel
        which contains the support. It is expected to come in the
        upcoming release of linux-4.20

        So, as of now, executing copy_file_range fop on a fuse based
        filesystem results in fuse kernel module sending read on the
        source fd and write on the destination fd.

        Therefore a small gfapi based tool has been written to be able
        to test the copy_file_range fop. This tool is similar (in functionality)
        to the example program given in copy_file_range man page.

        So, running regular copy_file_range on a fuse mount point and
        running gfapi based glfs-copy-file-range tool gives some idea about
        how fast the copy_file_range (or reflink) can be.

        On the local machine this was the result obtained.

        mount -t glusterfs workstation:new /mnt/glusterfs
        [root@workstation ~]# cd /mnt/glusterfs/
        [root@workstation glusterfs]# ls
        file
        [root@workstation glusterfs]# cd
        [root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new
        real  0m6.495s
        user  0m0.000s
        sys   0m1.439s
        [root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr
        OPEN_SRC: opening /file is success
        OPEN_DST: opening /rrr is success
        FSTAT_SRC: fstat on /rrr is success
        copy_file_range successful

        real  0m0.309s
        user  0m0.039s
        sys   0m0.017s

        This tool needs following arguments
         1) hostname
         2) volume name
         3) log file path
         4) source file path (relative to the gluster volume root)
         5) destination file path (relative to the gluster volume root)

        "glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>"

      - Added a testcase as well to run glfs-copy-file-range tool

    * io-stats changes to capture the fop for profiling

    * NOTE:

      - Added conditional check to see whether the copy_file_range syscall
        is available or not. If not, then return ENOSYS.

      - Added conditional check for kernel minor version in fuse_kernel.h
        and fuse-bridge while referring to copy_file_range. And the kernel
        minor version is kept as it is. i.e. 24. Increment it in future
        when there is a kernel release which contains the support for
        copy_file_range fop in fuse kernel module.

    * The document which contains a writeup on this enhancement can be found at
      https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit

Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367
updates: #536
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
2018-12-12 15:56:55 +00:00
ShyamsundarR
20ef211cfa libglusterfs: Move devel headers under glusterfs directory
libglusterfs devel package headers are referenced in code using the
include semantics for a program's own headers. While this works, it
can be better, especially when dealing with out of tree xlator builds
or, in general, out of tree devel package usage.

Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs

This change although big, is just moving around the headers and
making it correct when including these headers from other sources.

This helps us correctly include libglusterfs includes without
namespace conflicts.

Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2018-12-05 21:47:04 +00:00
Yaniv Kaul
f80d9be732 libglusterfs/src/iobuf.c: small refactor to re-use code.
No functional changes (I hope).
Compile-tested only!

updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

Change-Id: Ifbec21c18a6dbe27c5271db156bff4d30ca85dbf
2018-12-05 03:59:29 +00:00
Milind Changire
748e993d1f rpc: bump up server.event-threads
Problem:
A single event-thread causes performance issues in the system.

Solution:
Bump up event-threads to 2 to make the system more performant.
This helps in making the system more responsive and helps avoid the
ping-timer-expiry problem as well. However, setting the event-threads
to 2 is not the only thing required to avoid ping-timer-expiry issues.

Change-Id: Idb0fd49e078db3bd5085dd083b0cdc77b59ddb00
fixes: bz#1653277
Signed-off-by: Milind Changire <mchangir@redhat.com>
2018-12-04 06:38:55 +00:00
Mohit Agrawal
46c15ea8fa server: Resolve memory leak path in server_init
Problem: 1) server_init does not clean up allocated resources
            when it fails, before returning the error
         2) dict leak at the time of graph destroying

Solution: 1) free resources in case server_init fails
          2) Take dict_ref of graph xlator before destroying
             the graph to avoid leak

Change-Id: I9e31e156b9ed6bebe622745a8be0e470774e3d15
fixes: bz#1654917
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
2018-12-03 11:34:35 +00:00
Raghavendra Gowdappa
95e380eca1 rpcsvc: provide each request handler thread its own queue
A single global per program queue is contended by all request handler
threads and event threads. This can lead to high contention. So,
reduce the contention by providing each request handler thread its own
private queue.

Thanks to "Manoj Pillai"<mpillai@redhat.com> for the idea of pairing a
single queue with a fixed request-handler-thread and event-thread,
which brought down the performance regression due to overhead of
queuing significantly.

Thanks to "Xavi Hernandez"<xhernandez@redhat.com> for discussion on
how to communicate the event-thread death to request-handler-thread.

Thanks to "Karan Sandha"<ksandha@redhat.com> for voluntarily running
the perf benchmarks to qualify that performance regression introduced
by ping-timer-fixes is fixed with this patch and patiently running
many iterations of regression tests while RCAing the issue.

Thanks to "Milind Changire"<mchangir@redhat.com> for patiently running
the many iterations of perf benchmarking tests while RCAing the
regression caused by ping-timer-expiry fixes.

Change-Id: I578c3fc67713f4234bd3abbec5d3fbba19059ea5
Fixes: bz#1644629
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
2018-11-29 01:19:12 +00:00
Kaleb S. KEITHLEY
f0232d07f7 core: ctx calls naked calloc()
libglusterfs provides wrapper functions MALLOC/__gf_default_malloc,
CALLOC/__gf_default_calloc, and REALLOC/__gf_default_realloc for those
few places outside of mempool.c that need to call malloc/calloc/realloc
directly.

Notable exceptions are "contrib" code, e.g. rbtree and timer-wheel,
and perhaps parsers generated by yacc+lex. But even parsers can be
fixed to at least call the wrappers mentioned above, if not our own
allocators.

Change-Id: Ib8069815eba9b6c04c3adaf59727ec8d8795c4d1
updates: bz#1193929
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2018-11-28 18:15:42 +00:00
Raghavendra Gowdappa
18b6d7ce7d libglusterfs: rename macros roof and floor to not conflict with math.h
Change-Id: I666eeb63ebd000711b3f793b948d4e0c04b1a242
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
Updates: bz#1644629
2018-11-28 15:11:59 +00:00
Xavi Hernandez
6d69a66349 mem-pool: minor fix and clarification
A comment has been added to pool_destructor() function to explain why
locks are not needed there. Also, the initialization of 'poison' field
has been moved inside a locked region for further safety and clarity.

Change-Id: Idbf23bda7f9228d60c644a1bea4b6c2cfc582090
updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
2018-11-28 03:09:28 +00:00
Atin Mukherjee
a52d2d7043 glusterd: perform store operation in cleanup lock
All glusterd store operation and cleanup thread should work under a
critical section to avoid any partial store write.

Change-Id: I4f12e738f597a1f925c87ea2f42565dcf9ecdb9d
Fixes: bz#1652430
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
2018-11-27 10:11:20 +00:00
Yaniv Kaul
01f3358501 libglusterfs/src/dict.c : consistent initialization of parameters.
Some were assigned NULL, for no good reason, some were assigned proper
initial value. Made them all consistent, as much as possible, to be
assigned reasonable initial values.
No expected functional changes (and I also assume the compiler
already did most of this work behind the scenes anyway, so no
performance implications either).

Compile-tested only!

updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

Change-Id: I2bc0d4f2221124b5f9ef6150c86b7259074e7013
2018-11-27 05:30:02 +00:00
Xavi Hernandez
31f85a4c7b libglusterfs: fix memory corruption caused by per-thread mem pools
There was a race in the per-thread memory pool management that could lead
to memory corruption. The race appeared when the following sequence of
events happened:

1. Thread T1 allocated a memory object O1 from its own private pool P1
2. T1 terminates and P1 is marked to be destroyed
3. The mem-sweeper thread is woken up and scans all private pools
4. It detects that P1 needs to be destroyed and starts releasing the
   objects from hot and cold lists.
5. Thread T2 releases O1
6. O1 is added to the hot list of P1

The problem happens because steps 4 and 6 are protected by different locks,
so they can run concurrently. This means that both the mem-sweeper thread
and T2 are modifying the same list at the same time, potentially causing
corruption.

This patch fixes the problem using the following approach:

1. When an object is released, it's only returned to the hot list of the
   corresponding memory pool if it's not marked to be destroyed. Otherwise
   the memory is released to the system.
2. Object release and mem-sweeper thread synchronize access to the deletion
   mark of the memory pool to prevent simultaneous access to the list.
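
The two points above can be sketched as follows. This is a simplified illustration under assumptions: a single mutex guards both the deletion mark and the hot list, and all names (example_pool, example_pool_put() and so on) are hypothetical rather than taken from mem-pool.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct example_obj {
    struct example_obj *next;
};

struct example_pool {
    pthread_mutex_t lock;         /* guards 'doomed' and 'hot_list' */
    bool doomed;                  /* owner thread exited, pool will go away */
    struct example_obj *hot_list;
};

/* Called when the owning thread terminates. */
void example_pool_owner_exit(struct example_pool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->doomed = true;
    pthread_mutex_unlock(&pool->lock);
}

/* Release an object. Returns true if it was cached on the hot list, false
 * if it was returned to the system because the pool is marked for deletion. */
bool example_pool_put(struct example_pool *pool, struct example_obj *obj)
{
    bool cached = false;

    pthread_mutex_lock(&pool->lock);
    if (!pool->doomed) {
        obj->next = pool->hot_list;
        pool->hot_list = obj;
        cached = true;
    }
    pthread_mutex_unlock(&pool->lock);

    if (!cached)
        free(obj); /* outside the lock, to keep the locked region short */
    return cached;
}

/* Sweeper side: detach the list under the same lock, then free the
 * detached objects without holding it. Returns the number freed. */
size_t example_pool_sweep(struct example_pool *pool)
{
    struct example_obj *obj = NULL;
    struct example_obj *next = NULL;
    size_t freed = 0;

    pthread_mutex_lock(&pool->lock);
    if (pool->doomed) {
        obj = pool->hot_list;
        pool->hot_list = NULL;
    }
    pthread_mutex_unlock(&pool->lock);

    for (; obj != NULL; obj = next) {
        next = obj->next;
        free(obj);
        freed++;
    }
    return freed;
}
```

Once the mark is set, no release can add to the list, so the sweeper owns whatever it detached.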

Some other minor adjustments are made to reduce the lengths of the locked
regions.

Fixes: bz#1651165
Change-Id: I63be3893f92096e57f54a6150e0461340084ddde
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
2018-11-26 04:24:23 +00:00
Poornima G
424978302c Coverity fix for calling risky function - fscanf
fscanf with %s reads a word, there is no restriction on the length
of that word, and the caller is required to pass a sufficiently
large buffer for storing the word. If the input word exceeds the
buffer size, it will cause buffer overflow.

To fix this, use fscanf with width parameter. Width specifies
the maximum number of characters to be read in the current reading
operation.
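
A minimal sketch of the width parameter, using sscanf() so that it needs no input file; the same conversion rules apply to fscanf(). The buffer size and the name example_read_word() are assumptions for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Longest word accepted, excluding the terminating NUL. This must match
 * the width hard-coded in the format string below. */
#define EXAMPLE_WORD_MAX 15

/* Read one whitespace-delimited word from 'input' into 'word', which must
 * have room for EXAMPLE_WORD_MAX + 1 bytes. Returns 1 on success. */
int example_read_word(const char *input, char *word)
{
    /* "%15s" stores at most 15 characters plus the NUL; a bare "%s"
     * would keep writing past the end of the buffer on long input. */
    return sscanf(input, "%15s", word) == 1;
}
```

Note that the width counts characters only, so the buffer needs one extra byte for the terminator.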

Change-Id: If250abf5eb637b9fc2a79047e3599f83254cd4e5
updates: bz#1193929
Signed-off-by: Poornima G <pgurusid@redhat.com>
2018-11-24 17:22:12 +00:00
Xavi Hernandez
a0fdc9202c core: create a constant for default network timeout
A new constant named GF_NETWORK_TIMEOUT has been defined and all
references to the hard-coded timeout of 42 seconds have been
replaced with this constant.

Change-Id: Id30f5ce4f1230f9288d9e300538624bcf1a6da27
fixes: bz#1652852
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
2018-11-23 11:05:02 +01:00
Yaniv Kaul
5231d3d165 libglusterfs/src/common-utils.h: faster mem_0filled() function
based on the amusing discussion @ https://rusty.ozlabs.org/?p=560
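
One technique from that discussion is to check a short prefix by hand and then memcmp() the buffer against itself at an offset. The sketch below shows that trick; whether this patch uses exactly this form is an assumption, and example_is_zero_filled() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if all 'length' bytes at 'data' are zero. */
bool example_is_zero_filled(const void *data, size_t length)
{
    const unsigned char *p = data;
    size_t checked;

    /* Check the first 16 bytes (or fewer, for short buffers) by hand. */
    for (checked = 0; checked < 16; checked++) {
        if (length == 0)
            return true;
        if (*p != 0)
            return false;
        p++;
        length--;
    }

    /* The first 16 bytes are zero. Comparing the buffer with itself,
     * shifted by 16, proves every later byte equals an earlier zero. */
    return memcmp(data, p, length) == 0;
}
```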

Compile-tested only!

updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

Change-Id: I1cac54067eb44801b216d5620fc5ee2c89befdd0
2018-11-20 03:11:30 +00:00
Kaleb S. KEITHLEY
76906af9d7 core: fix strncpy warnings
Since gcc-8.2.x (fedora-28 or so) gcc has been emitting warnings
about buggy use of strncpy.

e.g.
  warning: ‘strncpy’ output truncated before terminating nul
  copying as many bytes from a string as its length
and
  warning: ‘strncpy’ specified bound depends on the length of the
  source argument

Since we're copying string fragments and explicitly null terminating,
use memcpy to silence the warning.
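
A minimal sketch of the pattern, with a hypothetical helper name: the fragment length is known, so memcpy() plus an explicit terminator says what is meant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the first 'len' bytes of 'src' into 'dst' and NUL-terminate.
 * 'dst_size' is the capacity of 'dst' and must exceed 'len'. */
void example_copy_fragment(char *dst, size_t dst_size, const char *src,
                           size_t len)
{
    assert(dst_size > len);

    /* strncpy(dst, src, len) would copy the same bytes, but gcc warns
     * because it cannot see that the terminator is added right after. */
    memcpy(dst, src, len);
    dst[len] = '\0';
}
```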

Change-Id: I413d84b5f4157f15c99e9af3e154ce594d5bcdc1
updates: bz#1193929
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2018-11-15 05:05:34 +00:00
Mohit Agrawal
bcf1e8b074 core: Portmap entries showing stale brick entries when bricks are down
Problem: pmap shows stale brick entries after a brick goes down,
         because glusterd_brick_rpc_notify calls gf_is_service_running
         before calling pmap_registry_remove to check the brick instance.

Solution: 1) Change the condition in gf_is_pid_running to check for
             process existence, use open instead of access to achieve
             the same
          2) Call search_brick_path_from_proc in __glusterd_brick_rpc_notify
             along with gf_is_service_running

Change-Id: Ia663ac61c01fdee6c12f47c0300cdf93f19b6a19
fixes: bz#1646892
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
2018-11-12 03:31:57 +00:00
Yaniv Kaul
cac2dba48b libglusterfs multiple files: remove dead initialization
Per newer GCC releases and clang-scan, some trivial
dead initializations (values that were set but were never
read) were removed.

Compile-tested only!

updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

Change-Id: Ia9959b2ff87d2e9cb46864e68ffe7dccb984db34
2018-11-11 16:06:29 +00:00
ShyamsundarR
3056419608 coverity: ignore tainted access reported in gf_free
Earlier commit had the annotation incorrect, and also did not
wrap the sanitization in a separate function. (see commit 39a1db1)

The issues are corrected in this patch, and also a coverity
stand alone run has been tested to ensure the annotations are
respected by coverity.

Change-Id: I4a93b6981e2ff4bba9a29e590b17da248931c8ae
Updates: bz#789278
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2018-11-08 07:40:40 +00:00
Yaniv Kaul
d76611fbba libglusterfs/src/iobuf.c: don't forget to unlock a mutex
commit ed83a4ee7b73e6b04694d1ac11ed25d2983ac943 changed locking
order and forgot to unlock in a negative path (when index was -1).
Coverity caught it (thanks!) as  CID 1396581:  Program hangs  (LOCK)

Note: I'm unlocking before logging the failure. I think it's the right
order - logging can take a while (especially if your disk is slow).
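
The shape of the fix can be sketched as below. The struct, the function name and the fprintf() logging are hypothetical stand-ins, not the iobuf.c code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define EXAMPLE_ARENA_TYPES 4

struct example_iobuf_pool {
    pthread_mutex_t mutex;
    int arena_count[EXAMPLE_ARENA_TYPES];
};

/* Returns 0 on success, -1 if 'index' does not name an arena type. */
int example_take_arena(struct example_iobuf_pool *pool, int index)
{
    int ret = -1;

    pthread_mutex_lock(&pool->mutex);

    if (index < 0 || index >= EXAMPLE_ARENA_TYPES) {
        /* The negative path must release the mutex too, and it does so
         * before logging, because logging can be slow. */
        pthread_mutex_unlock(&pool->mutex);
        fprintf(stderr, "no arena for index %d\n", index);
        goto out;
    }

    pool->arena_count[index]++;
    ret = 0;
    pthread_mutex_unlock(&pool->mutex);
out:
    return ret;
}
```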

Compile-tested only!

updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

Change-Id: I82ac241edf1d511bf6807cf9c46c538ab9f4acc4
2018-11-06 18:33:36 +02:00
Csaba Henk
4c6b063463 fuse: diagnostic FLUSH interrupt
We add dummy interrupt handling for the FLUSH
fuse message. It can be enabled by the
"--fuse-flush-handle-interrupt" hidden command line
option, or "-ofuse-flush-handle-interrupt=yes"
mount option.

It serves no purpose other than diagnostics & demonstration
-- to exercise the interrupt handling framework
a bit and to give a usage example.

Documentation is also provided that showcases interrupt
handling via FLUSH.

Change-Id: I522f1e798501d06b74ac3592a5f73c1ab0590c60
updates: #465
Signed-off-by: Csaba Henk <csaba@redhat.com>
2018-11-06 04:21:57 +00:00
Csaba Henk
bceb9f2567 fuse: interrupt handling framework
- add sub-framework to send timed responses to kernel
- add interrupt handler queue
- implement INTERRUPT

fuse_interrupt looks up handlers for interrupted messages
in the queue. If found, it invokes the handler function.
Else responds with EAGAIN with a delay.
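
The lookup described above can be sketched as below. An array stands in for the handler queue, the delayed reply is left out, and every name is hypothetical rather than taken from fuse-bridge.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*example_intr_fn)(uint64_t unique);

struct example_intr_handler {
    uint64_t unique;    /* id of the message that may get interrupted */
    example_intr_fn fn; /* what to do when it is interrupted */
};

/* Look up the handler registered for 'unique'. If one is found, invoke it
 * and return its result. Otherwise return -EAGAIN (the real response is
 * sent with a delay, which this sketch omits). */
int example_interrupt(const struct example_intr_handler *queue, size_t count,
                      uint64_t unique)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (queue[i].unique == unique)
            return queue[i].fn(unique);
    }
    return -EAGAIN;
}

/* Hypothetical handler that just reports the interrupt as handled. */
int example_flush_handler(uint64_t unique)
{
    (void)unique;
    return 0;
}
```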

See spec at

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/fuse.txt?h=v4.17#n148

and explanation in comments.

Change-Id: I1a79d3679b31f36e14b4ac8f60b7f2c1ea2badfb
updates: #465
Signed-off-by: Csaba Henk <csaba@redhat.com>
2018-11-06 04:21:57 +00:00
Yaniv Kaul
258db71786 libglusterfs/src/iobuf.c: where possible, pass the index parameter
Don't 'calculate' again where it can be passed.
If possible, do not perform under lock.

Also remove some NULL checks, assuming they were done by the callers.
Left a remark for each such change.

Compile-tested only!

updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

Change-Id: Ia5c2851506da3388cb2d4445334c58881e2c4416
2018-11-06 04:18:50 +00:00
Yaniv Kaul
ed83a4ee7b libglusterfs/src/iobuf.c: take the pool lock once in new pool
When creating a new pool, take the pool lock once and create
all arenas, instead of taking and releasing for each arena.

Compile-tested only!

updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

Change-Id: I7daa39de960e47e66a32ecab724cf3a61ccdc01b
2018-11-06 04:18:50 +00:00
Yaniv Kaul
ba52abc693 libglusterfs/src/iobuf.c: remove some if statements
Small code refactoring to remove some if statements
in several functions. No functional changes expected.

Compile-tested only!

updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

Change-Id: If9f8d5d53c9688fb994b6d690aea66f65fa01c55
2018-11-06 04:18:50 +00:00
Amar Tumballi
d2b7453193 logging: create parent dir if not available
As glusterfs logging uses different directory than /var/log
(ie, /var/log/glusterfs), there is a chance it may not be
present when starting glusterfs. Create parent dir if it
doesn't exist.

Updates: bz#1193929
Change-Id: I8d6f7e5a608ba53258b14617f5d103d1e98b95c1
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-11-06 04:17:42 +00:00
Amar Tumballi
74e8328d3f all: fix the format string exceptions
Currently, there are possibilities in few places, where a user-controlled
(like filename, program parameter etc) string can be passed as 'fmt' for
printf(), which can lead to segfault, if the user's string contains '%s',
'%d' in it.

While fixing it, it makes sense to check explicitly for such issues
across the codebase, by making the format call properly.
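
A minimal sketch of the safe form, using snprintf() and a hypothetical helper name: the user-controlled string is passed as an argument to a fixed format, never as the format itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format an untrusted string into 'buf'. Returns what snprintf() returns,
 * i.e. the length the full output needs. */
int example_log_name(char *buf, size_t size, const char *untrusted)
{
    /* Unsafe variant, for comparison: snprintf(buf, size, untrusted);
     * a '%s' inside 'untrusted' would then read a missing argument. */
    return snprintf(buf, size, "%s", untrusted);
}
```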

Fixes: CVE-2018-14661

Fixes: bz#1644763
Change-Id: Ib547293f2d9eb618594cbff0df3b9c800e88bde4
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-11-05 18:50:59 +00:00
Raghavendra Bhat
643c9d049d features/snapview-server: change gf_log instances to gf_msg
Change-Id: Ib8bdf210a896423abcd7413dd4896d424ac0f561
fixes: bz#1626610
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
2018-11-05 07:05:23 +00:00
Amar Tumballi
c2a543ec94 xlator: add generic option parsing framework
As an example, and also as an enhancement, added 'log-level'
as a default option to every translator (glusterfs already
support infrastructure to handle xl->loglevel).

Corresponding infrastructure to add per xlator log-level
is not present in glusterd volume-set. Plan is to get it
sorted out in later patches or in GD2.

* Why is this needed? - Mainly because we need to set a
different log-level on just some xlators to debug a few things in a
production system, while not changing the overall log-level. This
helps in better debug-ability.

Updates: bz#1193929

Change-Id: Ia4098ce39197cd423345b3d31fe8315481681ab8
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-11-02 06:24:47 +00:00
Amar Tumballi
55a6ba56be tiering: remove the translator from build and glusterd
Based on the proposal to remove few features as they are not
actively maintained [1], removing tier translator from the
build. Also make sure no regression tests involving the
tiering feature are present.

[1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html

Change-Id: I2c177f711f9b54b7b24e1a13525ff3132bd9a9c5
updates: bz#1642807
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-11-02 02:39:35 +00:00
Amar Tumballi
7fac81aeab mem-pool: change the values to 64bits
total_allocs of certain types of variables can be 4 billion in a
single day depending on load. So, 32 bits for that is not enough.

Also, size_t is good variable size for one allocation, but the
sum of allocations, should be 64bits to make sure we don't
overflow the variable.
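
The overflow is easy to show in isolation. The two helpers below are hypothetical and only contrast a 32-bit counter with a 64-bit one.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit counter wraps back to 0 once it passes UINT32_MAX
 * (4294967295), silently losing the running total. */
uint32_t example_count32(uint32_t count)
{
    return count + 1u;
}

/* A 64-bit counter keeps counting past that point. */
uint64_t example_count64(uint64_t count)
{
    return count + 1u;
}
```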

Updates: bz#1639599
Change-Id: If3b19687f94425e913a0201ae5d73661eda51f06
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-11-01 02:38:39 +00:00
Mohit Agrawal
6cc573631d core: Use GF_ATOMIC ops to update inode->nlookup
fixes: bz#1644164

Change-Id: I0ac5aff565b3a30d5ff25ec5a3f20e0bda424a5d
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
2018-10-30 12:38:37 +05:30
Amar Tumballi
0f817c7795 server-protocol: set the frame type to TYPE_FOP
This will allow proper printing of exact 'fop' type to be logged in
string, not number, during backtraces.

Considering this was not done on brick processes, we have no easy
way to glance and understand which fops were pending.

What gets changed:

After a crash, most of the core-dumps logged were of the form:
```
pending frames:
frame : type(0) op(18)
frame : type(0) op(18)
frame : type(0) op(28)
```
would change to
```
pending frames:
frame : type(1) op(SETXATTR)
frame : type(1) op(SETXATTR)
frame : type(1) op(READDIR)
```

updates: bz#1639599
Change-Id: I0e3d2a8dee9cfde7ed0112a948f5213f546efb80
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-10-29 08:56:46 +00:00
Amar Tumballi
bf5bfa5f2f statedump: fix clang null dereference error
ctx->active can be null, and is checked elsewhere in the
same function. In another case, where 'ctx->active' gets
dereferenced, it needs to be validated before the loop
is hit.

Updates: bz#1622665
Change-Id: I4ec917e96c0756586fc7a74c76848bb9589a0293
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-10-26 02:54:55 +00:00
Milind Changire
0e7929ef02 glusterd: raise default transport.listen-backlog
Problem:
data center setups with a large number of bricks with replication
cause a flood of connections from bricks and self-heal daemons
to glusterd causing connections to be dropped due to insufficient
listener socket backlog queue length

Solution:
raise default value of transport.listen-backlog to 1024

Change-Id: I879e4161a88f1e30875046dff232499a8e2e6c51
fixes: bz#1642850
Signed-off-by: Milind Changire <mchangir@redhat.com>
2018-10-25 13:39:21 +00:00
ShyamsundarR
39a1db1402 coverity: ignore tainted access reported in gf_free
Coverity reports tainted pointer access in _gf_free if the pointer passed in
was used by any IO related function by the caller. The taint within gf_free
is a false positive, as the tainted region is from the passed in pointer
till its allocated length, and not for contents before the pointer (i.e
the GF_MEM_HEADER_SIZE bytes before the passed in pointer), as that is
exclusively handled by the gf_alloc family of functions.

CID: 1228602, 1292646, 1292647, 1292648, 1292649, 1383192, 1383195, 1389691

Should additionally fix,
CID: 1292650, 1292651, 1357874, 1382373, 1382404, 1382407

Change-Id: I48c5a4028e7b0224c432bbc30f8c29408c2a466b
Updates: bz#789278
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2018-10-16 09:03:24 +00:00
Milind Changire
593bbb28d9 libglusterfs: fix sys_socket coverity issue
CID 1396081:  Control flow issues  (UNREACHABLE)

Change-Id: Ifad303853224cb9abc91c1083bb1529f4c13b1d3
updates: bz#789278
Signed-off-by: Milind Changire <mchangir@redhat.com>
2018-10-16 06:33:47 +00:00
Kaleb S. KEITHLEY
7152ace0b3 core: libuuid-devel breakage
The #include "uuid.h" left over from using .../contrib/uuid is debatably
incorrect now that we use the "system header" file /usr/include/uuid/uuid.h
from libuuid-devel.

Unfortunately this is complicated by things like FreeBSD having its own
/usr/include/uuid.h, and the e2fsprogs-libuuid uuid.h is installed - as
most third-party packages in FreeBSD are - in /usr/local as
/usr/local/include/uuid/uuid.h

With a system header file it should at least be #include <uuid.h>, and
even better as #include <uuid/uuid.h>, much like the way <sys/types.h>
and <net/if.h> are included. Using #include <uuid/uuid.h> guarantees
not getting the /usr/include/uuid.h on FreeBSD, but clang/cc knows to
find "system" header files like this in /usr/local/include; with or
without the -I/... from uuid.pc. Also using #include "uuid.h" leaves
the compiler free to find a uuid.h from any -I option it might be passed.
(Fortunately we don't have any at this time.)

As we now require libuuid-devel or e2fsprogs-libuuid and configure will
exit with an error if the uuid.pc file doesn't exist, the HAVE_LIBUUID
(including the #elif FreeBSD) tests in compat-uuid.h are redundant. We
are guaranteed to have it, so testing for it is a bit silly IMO. It may
also break building third party configure scripts if they omit defining
it. (Just how hard do we want to make things for third party developers?)

Change-Id: I7317f63c806281a5d27de7d3b2208d86965545e1
updates: bz#1193929
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2018-10-16 05:04:33 +00:00