1601 Commits

Author SHA1 Message Date
Prashanth Pai
9e7082b756 Make glusterfsd binary print statedump & xlator dir
The glusterd2 needs following options, some of which are provided by
gluster CLI today:

--print-xlatordir
--print-statedumpdir
--print-logdir

However, the CLI package need not be present on the machine running
glusterd2. This change adds the above CLI options to glusterfsd binary
which glusterd2 depends on.

Reverts 9a1ae47c8d60836ae0628a04a153f28c1085c0e8

Related changes:
https://review.gluster.org/#/c/19882/
https://github.com/gluster/glusterd2/pull/663

Updates: bz#1193929
Change-Id: I18c123b0d3350d2bd4f2400783e3b94e402a4e29
Signed-off-by: Prashanth Pai <ppai@redhat.com>
2018-04-19 15:18:40 +05:30
Mohit Agrawal
0043c63f70 gluster: Sometimes Brick process is crashed at the time of stopping brick
Problem: Sometimes brick process is getting crashed at the time
         of stop brick while brick mux is enabled.

Solution: Brick process was getting crashed because of rpc connection
          was not cleaning properly while brick mux is enabled.In this patch
          after sending GF_EVENT_CLEANUP notification to xlator(server)
          waits for all rpc client connection destroy for specific xlator.Once rpc
          connections are destroyed in server_rpc_notify for all associated client
          for that brick then call xlator_mem_cleanup for for brick xlator as well as
          all child xlators.To avoid races at the time of cleanup introduce
          two new flags at each xlator cleanup_starting, call_cleanup.

BUG: 1544090
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>

Note: Run all test-cases in separate build (https://review.gluster.org/#/c/19700/)
      with same patch after enable brick mux forcefully, all test cases are
      passed.

Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf
updates: bz#1544090
2018-04-19 04:31:51 +00:00
hari gowtham
be26b0da2f glusterd: volume inode/fd status broken with brick mux
Problem:
The values for inode/fd was populated from the ctx received
from the server xlator.
Without brickmux, every brick from a volume belonged to a
single brick from the volume.
So searching the server and populating it worked.

With brickmux, a number of bricks can be confined to a single
process. These bricks can be from different volumes too (if
we use the max-bricks-per-process option).
If they are from different volumes, using the server xlator
to populate causes problem.

Fix:
Use the brick to validate and populate the inode/fd status.

Signed-off-by: hari gowtham <hgowtham@redhat.com>

Change-Id: I2543fa5397ea095f8338b518460037bba3dfdbfd
fixes: bz#1566067
2018-04-19 02:54:50 +00:00
Pranith Kumar K
a7525c507e cluster/afr: Make sure latency-arg is passed to afr
xlator_notify doesn't pass the extra arguments that come in the
input function, so XLATOR_NOTIFY macro should be used instead
to pass the extra arguments to the function.

BUG: 1567881
fixes bz#1567881
Change-Id: Ic15b6c446638cbacf3149693147a754219037c47
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
2018-04-18 13:55:49 +00:00
Xavi Hernandez
52d1b36e37 libglusterfs: fix comparison of a NULL dict with a non-NULL dict
Function are_dicts_equal() had a bug when the first argument was NULL and
the second one wasn't NULL. In this case it incorrectly returned that the
dicts were different when they could be equal.

Fixes: bz#1566732
Change-Id: I0fc245c2e7d1395865a76405dbd05e5d34db3273
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
2018-04-18 08:34:20 +00:00
Kaleb S. KEITHLEY
29024cfdd5 core/build/various: python3 compat, prepare for python2 -> python3
Note 1) we're not supposed to be using #!/usr/bin/env python, see
https://fedoraproject.org/wiki/Packaging:Guidelines?rd=Packaging/Guidelines#Shebang_lines

Note 2) we're also not supposed to be using "!/usr/bin/python,
see https://fedoraproject.org/wiki/Changes/Avoid_usr_bin_python_in_RPM_Build#Quick_Opt-Out

The previous patch (https://review.gluster.org/19767) tried to do too
much in one patch, so it was abandoned.

This patch does two things:
1) minor cleanup of configure(.ac) to explicitly use python2
2) change all the shebang lines to #!/usr/bin/python2 and add them
where they were missing based on warnings emitted during rpmbuild.

In a follow-up patch python2 will eventually be changed to python3.

Before that python2-isms (e.g. print, string.join(), etc.) need to be
converted to python3. Some of those can be rewritten in version agnostic
python. E.g. print statements become print() with "from __future_ import
print_function". The python 2to3 utility will be used for some of those.
Also Aravinda has given guidance in the comments to the first patch for
changes.

updates: #411
Change-Id: I471730962b2526022115a1fc33629fb078b74338
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2018-04-12 11:04:27 +00:00
Susant Palai
48623a33a0 experimental/cloudsync: Download xlator for archival feature
spec-files:
https://review.gluster.org/#/c/18854/

Overview:
* Cloudsync maintains three file states in it's inode-ctx i.e
  1 - LOCAL,
  2 - REMOTE,
  3 - DOWNLOADING.

* A data modifying fop is allowed only if the state is LOCAL.
  If the state is REMOTE or DOWNLOADING, client will download
  or wait for the download to finish initiated by other client.

* Multiple download and upload from different clients are synchronized
  by inodelk.

* In POSIX a state check is done (part of different commit)before
  allowing the fop to continue. If the state is remote/downloading the
  fop is unwound with EREMOTE. The client will then download the file
  and continue with the fop again.

* Basic Algo for fop (let's say write fop):
  - If LOCAL -> resume fop
  - If REMOTE ->
	- INODELK
	- STAT (this gets state and heal the state if needed)
	- DOWNLOAD
	- resume fop

Note:
* Developers will need to write plugins for download, based on the
remote store they choose. In phase-1, support will be added for
one remote store per volume. In future, more options for multiple
remote stores will be explored.

TODOs:
 - Implement stat/lookup/readdirp to return size info from xattr
 - Make plugins configurable
 - Implement unlink fop
 - Add metrics collection
 - Add sharding support

Design Contributions:
Aravinda V K <avishwan@redhat.com>
Amar Tumballi <amarts@redhat.com>
Ram Ankireddypalle <areddy@commvault.com>
Susant Palai <spalai@redhat.com>

updates: #387
Change-Id: Iddf711ee7ab4e946ae3e472ff62791a7b85e6d4b
Signed-off-by: Susant Palai <spalai@redhat.com>
2018-04-10 01:09:29 +00:00
Krutika Dhananjay
08fadcc2a7 mount/fuse: Add support for multi-threaded fuse readers
Usage: Use 'reader-thread-count=<NUM>' as command line option to
set the thread count at the time of mounting the volume.

Next task is to make these threads auto-scale based on the load,
instead of having the user remount the volume everytime to change
the thread count.

Updates #412

Change-Id: I94aa1505e5ae6a133683d473e0e4e0edd139b76b
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
2018-04-02 06:10:30 +00:00
Sanoj Unnikrishnan
04ede2e163 Quota: heal directory on newly added bricks when quota limit is reached
Problem: if a lookup is done on a newly added brick for a path on which limit
has been reached, the lookup fails to heal the directory tree due to quota.

Solution: Tag the lookup as an internal fop and ignore it in quota.
Since marking internal fop does not usually give enough contextual information.
Introducing new flags to pass the contextual info.

Adding dict_check_flag and dict_set_flag to aid flag operations.
A flag is a single bit in a bit array (currently limited to 256 bits).

Change-Id: Ifb6a68bcaffedd425dd0f01f7db24edd5394c095
fixes: bz#1505355
BUG: 1505355
Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com>
2018-03-28 04:07:12 +00:00
Pranith Kumar K
2da6650dfa storage/posix: Add active-fd-count option in gluster
Problem:
when dd happens on sharded replicate volume all the writes on shards happen
through anon-fd. When the writes don't come quick enough, old anon-fd closes
and new fd gets created to serve the new writes. open-fd-count is decremented
only after the fd is closed as part of fd_destroy(). So even when one fd is on
the way to be closed a new fd will be created and during this short period it
appears as though there are multiple fds opened on the file. AFR thinks another
application opened the same file and switches off eager-lock leading to
extra latency.

Fix:
Have a different option called active-fd whose life cycle starts at
fd_bind() and ends just before fd_destroy()

BUG: 1557932
Change-Id: I2e221f6030feeedf29fbb3bd6554673b8a5b9c94
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
2018-03-21 10:36:31 +05:30
Mohit Agrawal
cf06dd5440 glusterd: TLS verification fails while using intermediate CA
Problem: TLS verification fails while using intermediate CA
         if mgmt SSL is enabled.

Solution: There are two main issue of TLS verification failing
          1) not calling ssl_api to set cert_depth
          2) The current code does not allow to set certificate depth
             while MGMT SSL is enabled.
          After apply this patch to set certificate depth user
          need to set parameter option transport.socket.ssl-cert-depth <depth>
          in /var/lib/glusterd/secure_acccess instead to set in
          /etc/glusterfs/glusterd.vol. At the time of set secure_mgmt in ctx
          we will check the value of cert-depth and save the value of cert-depth
          in ctx.If user does not provide any value in cert-depth in that case
          it will consider default value is 1

BUG: 1555154
Change-Id: I89e9a9e1026e37efb5c20f9ec62b1989ef644f35
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
2018-03-19 19:00:03 +00:00
Sven Fischer
de52876407 cleanup: xlator_t structure's 'client_latency' variable is not used
- Removed unused struct member and its one time usage.
  - cleaned up wrong white space

member 'client_latency' was not used otherwise since it was added by

commit 07cc8679cdf3b29680f4f105d0222da168d8bfc1
Author: Kevin Vigor <kvigor@fb.com>
Date:   Tue Mar 21 08:23:25 2017 -0700

    Halo Replication feature for AFR translator

Change-Id: Ibb0ea828d4090bbe8897f6af326b317884162a00
BUG: 1495153
Signed-off-by: Sven Fischer <sven@fischer-abc.de>
2018-03-19 03:30:31 +00:00
ShyamsundarR
ece3f0f669 protocol: Fix 4.0 client, parsing older iatt in dict
In a mixed mode cluster involving 4.0 and older 3.x bricks, if
clients are newer, then the iatt encoded in the dictionary can be
of the older iatt format, which a newer client will map incorrectly
to the newer structure.

This causes failures in FOPs that depend on this iatt for some
functionality (seen in mkdir operations failing as EIO, when DHT
hits its internal setxattr call).

The fix provided is to convert the iatt in the dict, based on which
RPC version is used to communicate with the server.

IOW, this is the reverse of change in commit "b966c7790e"

Tested using a mixed mode cluster (i.e bricks in 3.12 and 4.0 versions)
and a mixed set of clients, 3.12 and 4.0 clients.

There is no regression test provided, as this needs a mixed mode cluster
to test and validate.

Change-Id: I454e54651ca836b9f7c28f45f51d5956106aefa9
BUG: 1554053
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2018-03-10 23:12:48 -05:00
ShyamsundarR
b966c7790e protocol: Added iatt conversion to older format
Added iatt conversion to an older format, when dealing with
older RPC versions. This enables iatt structure conformance
when dealing with older clients.

This helps fix rolling upgrade from 3.x versions to 4.0 version
of gluster by sending the right iatt in the dictionary when DHT
requests the same.

Change-Id: Ieaf925f81f8c7798a8fba1e90a59fa9dec82856c
BUG: 1544699
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2018-03-10 18:08:53 +00:00
Amar Tumballi
940f870f47 core: provide infra to make any xlator pass-through
updates: #304

Change-Id: If6a13d2e56b195390a386d720103a882e077f66c
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-03-09 18:32:56 +00:00
Poornima G
1369f313d1 libglusterfs: Fix coverity issue FORWARD_NULL
Change-Id: I1402046edb232ca9d23346db82a0cfd041c91e70
Signed-off-by: Poornima G <pgurusid@redhat.com>
2018-03-02 19:38:45 +00:00
Kaushal M
fc35d400cb libglusterfs: Fix volume_options_t struct
The volume_options_t struct was modified and a new member was introduced
in the middle of the struct. This caused GD2 to crash when it tried to
read the volume options. The new member has been moved to the end of the
struct to correct this.

And a note has been added to notify developers on how to modify this
struct, and the xlator_api_t struct.

Updates: gluster/glusterfs#302

Change-Id: I2e9899ec10516be29c7e9d574da53be8ec17a99e
Signed-off-by: Kaushal M <kaushal@redhat.com>
2018-03-02 09:00:14 +00:00
Kaleb S. KEITHLEY
bb4343fb1a libglusterfs: move compat RPC/XDR #defines to eliminate warnings
Building with libtirpc (versus legacy glibc rpc) results in many
warnings about xdr macros that are redefined in libtirpc headers
because of the way compat.h and glusterfs.h are usually #included.

And these xdr macros in libglusterfs/src/compat.h - which were copied
from legacy glibc's rpc headers - are different than the same-name macros
in libtirpc. I haven't checked to see that any of the macros are
expanded (incorrectly) between the definition in compat.h and the
redefinition in tirpc/rpc/xdr.h; the risk seems pretty minimal. Regardless
it seems better, from a truth-and-beauty perspective to not have the
old, incorrect definitions in the first place.

Not to mention that any file that #includes compat.h and not glusterfs.h
does not need these xdr macro definitions at all. They're really only
needed when using really old glibc rpc, which would only be evident if
including glusterfs.h and/or glusterfs-fops.h. (Which by the way, nothing
currently #includes glusterfs-fops.h by itself. And maybe nothing ever
should?)

Change-Id: Ic11e4407d6ab7c498a8745a99379cbf4788a24e8
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2018-02-27 10:55:10 +00:00
N Balachandran
745e522c3a options: framework for options levels
Framework in order to classify options.

Updates gluster/glusterfs#302

Change-Id: I3dd6ae27bd0eb8e0065ffca75838c801e4f3ac91
Signed-off-by: N Balachandran <nbalacha@redhat.com>
2018-02-27 10:05:57 +00:00
Varsha Rao
430bff7dc3 performance/io-threads: nuke everything from a client when it disconnects
> io-threads: nuke everything from a client when it disconnects
> Commit ID: 4d8268d760
> https://review.gluster.org/#/c/18254/
> By Jeff Darcy <jdarcy@fb.com>

This patch is required to forward port io-threads namespace patch.
Updates: #401

Change-Id: I13d3a74862eea3d01e8dbc8736987c3dae6e8b2a
Signed-off-by: Varsha Rao <varao@redhat.com>
2018-02-27 03:45:30 +00:00
Milind Changire
7d641313f4 rpcsvc: scale rpcsvc_request_handler threads
Scale rpcsvc_request_handler threads to match the scaling of event
handler threads.

Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1467614#c51
for a discussion about why we need multi-threaded rpcsvc request
handlers.

Change-Id: Ib6838fb8b928e15602a3d36fd66b7ba08999430b
Signed-off-by: Milind Changire <mchangir@redhat.com>
2018-02-26 15:14:38 +05:30
Varsha Rao
d0e7177416 xlators/features/namespace: Add namespace xlator and link into brick graph
The following release-3.8-fb branch patch is upstreamed:
> features/namespace: Add namespace xlator and link into brick graph
> Commit ID: dbd30776f26e
> https://review.gluster.org/#/c/18041/
> By Michael Goulet <mgoulet@fb.com>

Changes in this patch:
Removes extra config.h and namespace.h file in namespace.c
Adds default_getspec_cbk to libglusterfs.sym
Rename dict_for_each to dict_foreach_inline
Remove fd.h header file stack.h
Add test case for truncate, open and symlink

This patch is required to forward port io-threads namespace patch.
Updates: #401

Change-Id: Ib88c95b89eecee9b8957df8a4c8712c899c761d1
Signed-off-by: Varsha Rao <varao@redhat.com>
2018-02-21 09:52:17 +00:00
Ravishankar N
6daa653569 posix/afr: handle backward compatibility for rchecksum fop
Added a volume option 'fips-mode-rchecksum' tied to op version 4.
If not set, rchecksum fop will use MD5 instead of SHA256.

updates: #230
Change-Id: Id8ea1303777e6450852c0bc25503cda341a6aec2
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
2018-02-19 14:57:12 +00:00
Amar Tumballi
e80c10d5e6 metrics: set latency min value during xlator init
otherwise, the very first metrics will have all the min as 0.

also no need to print pending-fops if it is 0.

Updates #168

Change-Id: I233de6c92b1a73977bb468ba211ac6ec3c05298f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-02-16 16:18:11 +00:00
Prashanth Pai
664b946496 Fetch backup volfile servers from glusterd2
Clients will request for a list of volfile servers from glusterd2 by
setting a (optional) flag in GETSPEC RPC call. glusterd2 will check for
the presence of this flag and accordingly return a list of glusterd2
servers in GETSPEC RPC reply. Currently, this list of servers returned
only contains servers which have bricks belonging to the volume.

See:
https://github.com/gluster/glusterd2/issues/382
https://github.com/gluster/glusterfs/issues/351

Updates #351
Change-Id: I0eee3d0bf25a87627e562380ef73063926a16b81
Signed-off-by: Prashanth Pai <ppai@redhat.com>
2018-02-16 16:16:25 +00:00
Raghavendra G
adb266baa1 libglusterfs/syncop: Add syncop_entrylk
Change-Id: Idd86b9f0fa144c2316ab6276e2def28b696ae18a
BUG: 1543279
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
2018-02-13 12:14:17 +05:30
Kinglong Mee
248152767b gfapi: return pre/post attributes from glfs_ftruncate
Updates: #389
Change-Id: I8faea0828921fb17f05f7321c3cb01747373f21e
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
2018-02-12 21:34:46 +00:00
Kinglong Mee
09943beb49 gfapi: return pre/post attributes from glfs_fsync/fdatasync
Updates: #389
Change-Id: I4153df72d5eeecefa7579170899db4c340128bea
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
2018-02-12 21:34:46 +00:00
Kinglong Mee
d01f7244e9 gfapi: return pre/post attributes from glfs_pread/pwrite
As nfs-ganesha, a wcc data contains pre/post attributes is return
in read/write rpc reply. nfs-ganesha get those attributes by
two getattr between the real read/write right now.

But, gluster has return pre/post attributes from glusterfsd,
those attributes are skipped in syncop/gfapi, if gfapi return them,
the upper user (nfs-ganesha) can use them directly without any
duplicate getattr.

Updates: #389
Change-Id: I7b643ae4241cfe2aeb17063de00192d81674024a
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
2018-02-12 21:34:46 +00:00
Poornima G
bfb66cc535 io-threads: Implement put fop
Updates #353
Change-Id: I8a30b53a52618c6a6c740d2c67b19e5322ce4ddb
Signed-off-by: Poornima G <pgurusid@redhat.com>
2018-02-12 13:57:00 +05:30
Varsha Rao
aa4372bf42 performance/io-threads: expose io-thread queue depths
The following release-3.8-fb branch patch is upstreamed:

> io-stats: Expose io-thread queue depths
> Commit ID: 69509ee7d2
> https://review.gluster.org/#/c/18143/
> By Shreyas Siravara <sshreyas@fb.com>

Changes in this patch:
- Replace iot_pri_t with gf_fop_pri_t
- Replace IOT_PRI_{HI, LO, NORMAL, MAX, LEAST} with
  GF_FOP_PRI_{HI, LO, NORMAL, MAX, LEAST}
- Use dict_unref() instead of dict_destroy()

This patch is required to forward port io-threads namespace patch.
Updates: #401
Change-Id: I1b47a63185a441a30fbc423ca1015df7b36c2518
Signed-off-by: Varsha Rao <varao@redhat.com>
2018-02-08 17:01:12 +00:00
Susant Palai
545a7ce676 cluster/dht: avoid overwriting client writes during migration
For more details on this issue see
https://github.com/gluster/glusterfs/issues/308

Solution:
This is a restrictive solution where a file will not be migrated
if a client writes to it during the migration. This does not
check if the writes from the rebalance and the client actually
do overlap.

If dht_writev_cbk finds that the file is being migrated (PHASE1)
it will set an xattr on the destination file indicating the file
was updated by a non-rebalance client.
Rebalance checks if any other client has written to the dst file
and aborts the file migration if it finds the xattr.

updates gluster/glusterfs#308

Change-Id: I73aec28bc9dbb8da57c7425ec88c6b6af0fbc9dd
Signed-off-by: Susant Palai <spalai@redhat.com>
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Signed-off-by: N Balachandran <nbalacha@redhat.com>
2018-02-02 15:24:38 +00:00
Kinglong Mee
85f1d54447 statedump: sanity check of mem_acct and rec for xlator
With memory accounting is disabled, glusterfs crash when doing statedump at,

0  0x00007fe24cff543a in gf_proc_dump_xlator_mem_info_only_in_use (xl=0x7fe23e44dc00) at statedump.c:269
1  0x00007fe24cff6310 in gf_proc_dump_oldgraph_xlator_info (top=0x7fe23e44dc00) at statedump.c:530
2  0x00007fe24cff7114 in gf_proc_dump_info (signum=10, ctx=0x7fe24ac0e000) at statedump.c:845
3  0x00007fe24d4d4bab in glusterfs_sigwaiter (arg=0x7ffc6c080750) at glusterfsd.c:2109
4  0x00007fe24bbd5dc5 in start_thread () from /lib64/libpthread.so.0
5  0x00007fe24b51a73d in clone () from /lib64/libc.so.6

(gdb) p xl->mem_acct
$1 = (struct mem_acct *) 0x0
(gdb) p xl->mem_acct->rec
$2 = 0x10

Change-Id: I10858170431311833ae01224d51c66caaad5e9a3
BUG: 1539603
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
2018-01-31 13:05:42 +00:00
ShyamsundarR
b114303ef9 glusterd: Update op-version for master
Updated the op-version on master to the next release
op-version, for any future options appearing on master.

Change-Id: I2ef6f8874c638ade1d97477bdd8ffa1bd1a9f952
BUG: 1540338
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2018-01-30 14:40:35 -05:00
Mohit Agrawal
c142d26e44 rpc: Showing some unusual timer error logs during brick stop
Solution: Update msg condition in gf_timer_call_after function
          to avoid the message

BUG: 1538427
Change-Id: I849e8e052a8259cf977fd5e7ff3aeba52f9b5f27
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
2018-01-30 09:34:51 +00:00
Poornima G
d25b606546 quiesce, gfproxy: Implement failover across multiple gfproxy nodes
Updates: #242
Change-Id: I767e574a26e922760a7130bd209c178d74e8cf69
Signed-off-by: Poornima G <pgurusid@redhat.com>
2018-01-30 05:00:52 +00:00
Soumya Koduri
1a519f4bf9 gfapi : New APIs have been added to use lease feature in gluster
Following APIs glfs_h_lease(), glfs_lease() added, so that gfapi applications
can set and get lease which enables more efficient client side caching.

Updates: #350
Change-Id: Iede85be9af1d4df969b890d0937ed0afa4ca6596
Signed-off-by: Poornima G <pgurusid@redhat.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
2018-01-26 12:33:42 +00:00
Kaleb S. KEITHLEY
3cad44e8a7 build: use libtirpc by default, even if ipv6 is not the default
Another error snuck in with Change-Id I86f847dfd, or more
accurately I think, with Change-Id: Ic47065e9c2...

All libs, not just libgfrpc, need to be linked with libtirpc,
especially on systems that still have xdr functions in (g)libc
where you will get a mixture of calls to libtirpc functions
and glibc functions, with catastrophic results.

BUG: 1536186
Change-Id: I97dc39c7844f44c36fe210aa813480c219e1e415
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2018-01-26 10:55:26 +00:00
Sakshi Bansal
78868033bc dentry fop serializer: added new server side xlator for dentry fop serialization
Problems addressed by this xlator :

[1]. To prevent race between parallel mkdir,mkdir and lookup etc.

Fops like mkdir/create, lookup, rename, unlink, link that happen on a
particular dentry must be serialized to ensure atomicity.

Another possible case can be a fresh lookup to find existance of a path
whose gfid is not set yet. Further, storage/posix employs a ctime based
heuristic 'is_fresh_file' (interval time is less than 1 second of current
time) to check fresh-ness of file. With serialization of these two fops
(lookup & mkdir), we eliminate the race altogether.

[2]. Staleness of dentries

This causes exponential increase in traversal time for any inode in the
subtree of the directory pointed by stale dentry.

Cause :  Stale dentry is created because of following two operations:

      a. dentry creation due to inode_link, done during operations like
         lookup, mkdir, create, mknod, symlink, create and
      b. dentry unlinking due to various operations like rmdir, rename,
         unlink.

       The reason is __inode_link uses __is_dentry_cyclic, which explores
       all possible path to avoid cyclic link formation during inode
       linkage. __is_dentry_cyclic explores stale-dentry(ies) and its
       all ancestors which is increases traversing time exponentially.

Implementation : To acheive this all fops on dentry must take entry locks
before they proceed, once they have acquired locks, they perform the fop
and then release the lock.

Some documentation from email conversation:
[1] http://www.gluster.org/pipermail/gluster-devel/2015-December/047314.html

[2] http://www.gluster.org/pipermail/gluster-devel/2015-August/046428.html

With this patch, the feature is optional, enable it by running:

 `gluster volume set $volname features.sdfs enable`

Also the feature is tested for a month without issues in the
experiemental branch for all the regression.

Change-Id: I6e80ba3cabfa6facd5dda63bd482b9bf18b6b79b
Fixes: #397
BUG: 1304962
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
2018-01-24 05:39:44 +00:00
Nigel Babu
93f21655e1 libglusterfs: Reset errno before call
This was causing Gluster to return a failure when testing on Centos7.

BUG: 1536913
Change-Id: Idb90baef05058123a7f69e94a51dd79abd371815
Signed-off-by: Nigel Babu <nigelb@redhat.com>
2018-01-23 18:08:27 +00:00
Anoop C S
ec3df9e65a libgfapi: Add new api for supporting mandatory-locks
The current API for byte-range locks [glfs_posix_lock()] doesn't
allow applications to specify whether it is advisory or mandatory
type locks. This particular change is to introduce an extended
byte-range lock API with an additional argument for including
the byte-range lock mode to be one among advisory(default) or
mandatory. Patch also includes a gfapi test case which make use
of this new api to acquire mandatory locks.

Ref: https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.8/Mandatory%20Locks.md

Change-Id: Ia09042c755d891895d96da857321abc4ce03e20c
Updates #393
Signed-off-by: Anoop C S <anoopcs@redhat.com>
2018-01-22 09:54:02 +00:00
Poornima G
24bf771514 md-cache: Implement dynamic configuration of xattr list for caching
Currently, the list of xattrs that md-cache can cache is hard coded
in the md-cache.c file, this necessiates code change and rebuild
everytime a new xattr needs to be added to md-cache xattr cache
list.

With this patch, the user will be able to configure a comma
seperated list of xattrs to be cached by md-cache

Updates #297

Change-Id: Ie35ed607d17182d53f6bb6e6c6563ac52bc3132e
Signed-off-by: Poornima G <pgurusid@redhat.com>
2018-01-22 04:10:52 +00:00
Amar Tumballi
303cc2b547 protocol: make on-wire-change of protocol using new XDR definition.
With this patchset, some major things are changed in XDR, mainly:

* Naming: Instead of gfs3/gfs4 settle for gfx_ for xdr structures
* add iattx as a separate structure, and add conversion methods
* the *_rsp structure is now changed, and is also reduced in number
  (ie, no need for different strucutes if it is similar to other response).
* use proper XDR methods for sending dict on wire.

Also, with the change of xdr structure, there are changes needed
outside of xlator protocol layer to handle these properly. Mainly
because the abstraction was broken to support 0-copy RDMA with payload
for write and read FOP. This made transport layer know about the xdr
payload, hence with the change of xdr payload structure, transport layer
needed to know about the change.

Updates #384

Change-Id: I1448fbe9deab0a1b06cb8351f2f37488cefe461f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-01-19 22:48:39 +05:30
Soumya Koduri
9eefff096f gfapi : added glfs_setfsleaseid() for setting lease id
A new function glfs_setfsleaseid() added in gfapi. Currently lock owner
is saved in the thread context. Similarly the leaseid attribute can be
saved using glfs_setfsleaseid().

Updates: #350
Change-Id: I55966cca01d0f2649c32b87bd255568c3ffd1262
Signed-off-by: Poornima G <pgurusid@redhat.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
2018-01-19 09:27:40 +00:00
karthik-us
84c5c540b2 cluster/afr: Adding option to take full file lock
Problem:
In replica 3 volumes there is a possibilities of ending up in split
brain scenario, when multiple clients writing data on the same file
at non overlapping regions in parallel.

Scenario:
- Initially all the copies are good and all the clients gets the value
  of data readables as all good.
- Client C0 performs write W1 which fails on brick B0 and succeeds on
  other two bricks.
- C1 performs write W2 which fails on B1 and succeeds on other two bricks.
- C2 performs write W3 which fails on B2 and succeeds on other two bricks.
- All the 3 writes above happen in parallel and fall on different ranges
  so afr takes granular locks and all the writes are performed in parallel.
  Since each client had data-readables as good, it does not see
  file going into split-brain in the in_flight_split_brain check, hence
  performs the post-op marking the pending xattrs. Now all the bricks
  are being blamed by each other, ending up in split-brain.

Fix:
Have an option to take either full lock or range lock on files while
doing data transactions, to prevent the possibility of ending up in
split brains. With this change, by default the files will take full
lock while doing IO. If you want to make use of the old range lock
change the value of "cluster.full-lock" to "no".

Change-Id: I7893fa33005328ed63daa2f7c35eeed7c5218962
BUG: 1535438
Signed-off-by: karthik-us <ksubrahm@redhat.com>
2018-01-19 00:15:19 +00:00
Amar Tumballi
75b063d76d rpc/*: auth-header changes
Introduce another authentication header which can now send more data.
This is useful because this data can be common for all the fops, and
we don't need to change all the signatures.

As part of this, made rpc-clnt.c little more modular to support multiple
authentication structures.

stack.h changes are placeholder for the ctime etc, can be moved later
based on need.

updates #384

Change-Id: I6111c13cfd2ec92e2b4e9295896bf62a8a33b2c7
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-01-17 06:00:39 +00:00
Amar Tumballi
e3a191a0d3 dict: add another type to handle backward compatibility
This new type helps to avoid excessive logs. It should be
set only in case of
 * volume graph building (graph.y)
 * dict unserialize
   (happens once a dictionary is received on wire in old protocol)

All other dict set and get should have proper check and warning
logs if there is a mismatch.

updates #220

Change-Id: I1cccb304a877aa80c07aaac95f10f5005e35b9c5
Signed-off-by: Amar Tumballi <amarts@redhat.com>
2018-01-17 03:53:37 +00:00
Xavier Hernandez
7ba7a4b27d locks: added inodelk/entrylk contention upcall notifications
The locks xlator now is able to send a contention notification to
the current owner of the lock.

This is only a notification that can be used to improve performance
of some client side operations that might benefit from extended
duration of lock ownership. Nothing is done if the lock owner decides
to ignore the message and to not release the lock. For forced
release of acquired resources, leases must be used.

Change-Id: I7f1ad32a0b4b445505b09908a050080ad848f8e0
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
2018-01-16 10:37:22 +00:00
Atin Mukherjee
edf8224ab4 dict: fix VALIDATE_DATA_AND_LOG call
Couple of instances doesn't pass enough number of parameters to the
function resulting compilation to fail.

Updates #203

Change-Id: Id8caa6fe7fc611645ad7ff11d81a2462e4ec6bab
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
2018-01-07 04:43:45 +00:00
N Balachandran
515a832de0 libglusterfs: Include key name in data type validation
Printing the key name makes it easier for developers
to figure out which keys have dict data type mismatches.

Updates #337

Change-Id: I21d9a22488a4c5e5a8d991ca2d53f1e3039f7685
Signed-off-by: N Balachandran <nbalacha@redhat.com>
2018-01-05 15:40:26 +00:00