Commit Graph

12716 Commits

Author SHA1 Message Date
Andrew A. Vasilyev
ae8cfbc340 spec file clean up 2019-06-05 13:33:43 +03:00
Andrew A. Vasilyev
3b2d437ff5 spec name update 2019-05-31 14:06:57 +03:00
Andrew A. Vasilyev
d09e6544c5 spec for 6.2 release 2019-05-31 14:03:26 +03:00
Andrew A. Vasilyev
514ddcb13c 6.2 import 2019-05-31 13:58:12 +03:00
Hari Gowtham
630b896166 doc: Added release notes for 6.2
Fixes: bz#1701203

Change-Id: Id105192610726e370fa977df2c29723201b94695
Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
2019-05-23 18:54:14 +05:30
Kotresh HR
fa8c8a3fa8 geo-rep: Convert gfid conflict resolution logs into debug
The gfid conflict resolution code path is not supposed
to be hit in the generic case. But a few heavy rename
workloads (BUG: 1694820) make it the common case. So
logging the entries to be fixed as INFO floods the log
in these particular workloads. Hence convert them to DEBUG.

Backport of:
 > Patch: https://review.gluster.org/22720
 > BUG: 1709653
 > Change-Id: I4d5e102b87be5fe5b54f78f329e588882d72b9d9
 > Signed-off-by: Kotresh HR <khiremat@redhat.com>

fixes: bz#1712223
Change-Id: I4d5e102b87be5fe5b54f78f329e588882d72b9d9
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2019-05-21 10:44:53 +05:30
Kotresh HR
60df33ab0b geo-rep: Fix sync hang with tarssh
Problem:
Geo-rep sync hangs when tarssh is used as sync
engine at heavy workload.

Analysis and Root cause:
It was found that the tar process was hung.
On further debugging, the stderr buffer of the tar
process on master was found to be full, i.e., 64k.
When the buffer was copied to a file from /proc/pid/fd/2,
the hang was resolved.

This can happen when files picked by the tar process
to sync don't exist on master anymore. If this count
grows to around 1k, the stderr buffer fills up.

Fix:
The tar process is executed using Popen with stderr as PIPE.
The final execution is something like below.

tar | ssh <args> root@slave tar --overwrite -xf - -C <path>

It was waiting on the ssh process first using communicate() and then on tar.
Note that communicate() reads stdout and stderr. So when the stderr of the tar
process fills up, no one reads it until the untar via ssh has
completed. That cannot happen while tar is blocked, which leads to a deadlock.
Hence we should wait on both processes in parallel, so that stderr is
read on both processes.
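
The deadlock and the fix can be sketched as follows. This is a minimal
illustration, not the gsyncd code: two Python child processes stand in for
tar and ssh, and the names PRODUCER, CONSUMER and run_pipeline are
hypothetical. A helper thread keeps draining the producer's stderr while the
main thread waits on the consumer, so neither pipe can fill up.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for "tar": writes well over 64k to stderr before
# producing any stdout, like tar complaining about many missing files.
PRODUCER = (
    "import sys\n"
    "sys.stderr.write('x' * 200000)\n"
    "sys.stderr.flush()\n"
    "sys.stdout.write('payload')\n"
)
# Hypothetical stand-in for "ssh ... tar -xf -": consumes stdin until EOF.
CONSUMER = "import sys\nsys.stdout.write(sys.stdin.read().upper())\n"


def run_pipeline():
    producer = subprocess.Popen([sys.executable, "-c", PRODUCER],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    consumer = subprocess.Popen([sys.executable, "-c", CONSUMER],
                                stdin=producer.stdout,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    # Close the parent's copy so the consumer sees EOF when the producer exits.
    producer.stdout.close()

    producer_err = []

    def drain_producer():
        # Keep reading the producer's stderr so its pipe never fills up.
        producer_err.append(producer.stderr.read())
        producer.stderr.close()
        producer.wait()

    reader = threading.Thread(target=drain_producer)
    reader.start()
    # Meanwhile, communicate() drains the consumer's stdout and stderr.
    out, _ = consumer.communicate()
    reader.join()
    return out, len(producer_err[0]), producer.returncode, consumer.returncode
```

Without the helper thread, the producer in this sketch would block on its
stderr write and never reach its stdout write, so the consumer would wait on
stdin forever.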

Backport of:
 > Patch: https://review.gluster.org/22684/
 > Change-Id: I609c7cc5c07e210c504771115b4d551a2e891adf
 > BUG: 1707728
 > Signed-off-by: Kotresh HR <khiremat@redhat.com>

Change-Id: I609c7cc5c07e210c504771115b4d551a2e891adf
fixes: bz#1709738
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2019-05-21 10:44:53 +05:30
Kotresh HR
d6f523927b tests/geo-rep: Fix arequal checksum comparison
The arequal checksum comparison was always returning
success, even when the checksums did not match. Fixed the same.

Backport of:
> Patch: https://review.gluster.org/22682
> Change-Id: I5083da25c0954126e452d06311d2d376f8540555
> BUG: 1707742
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 288cffd1ab7180cccfcdea36d0c469b9fa52108f)

Change-Id: I5083da25c0954126e452d06311d2d376f8540555
fixes: bz#1712220
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2019-05-21 10:44:53 +05:30
Kotresh HR
072a21576a geo-rep: Fix sync-method config
Problem:
When 'use_tarssh' is set to true, the command exits with a success
message but the default 'rsync' is still used as the sync engine.
The new config 'sync-method' cannot be set from the cli.

Analysis and Fix:
The 'use_tarssh' config is deprecated with new
config framework and 'sync-method' is the new
config to choose sync-method i.e. tarssh or rsync.
This patch fixes the 'sync-method' config. The allowed
values are tarssh and rsync.

Backport of:
 > Patch: https://review.gluster.org/22683
 > Change-Id: I0edb0319cad0455b29e49f2f08a64ce324735e84
 > BUG: 1707686
 > Signed-off-by: Kotresh HR <khiremat@redhat.com>

Change-Id: I0edb0319cad0455b29e49f2f08a64ce324735e84
fixes: bz#1709737
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2019-05-17 07:47:53 +00:00
Sunny Kumar
219c9bc92c geo-rep: Fix rename with existing destination with same gfid
Problem:
   Geo-rep fails to sync the rename properly if the destination exists.
This leaves the source behind on the slave, resulting in more
files on the slave. Also, heavy rename workloads like logrotate caused
a lot of ESTALE errors.

Cause:
   Geo-rep fails to sync a rename if the destination exists and the creation
of the source file also falls into the single batch of changelogs being
processed. This is because, after fixing problematic gfids by verifying
from master, while re-processing the original entries, CREATE was also
re-processed, causing more files on the slave and the rename to fail.

Solution:
   Entries need to be removed from the retry list after fixing
problematic gfids on the slave so that they are not re-created on the slave.
   Also treat ESTALE as EEXIST so that the error is properly handled
by verifying the op on the master volume.

Backport of:
 > Patch: https://review.gluster.org/22519/
 > Change-Id: I50cf289e06b997adddff0552bf2466d9201dd1f9
 > BUG: 1694820
 > Signed-off-by: Kotresh HR <khiremat@redhat.com>
 > Signed-off-by: Sunny Kumar <sunkumar@redhat.com>

Change-Id: I50cf289e06b997adddff0552bf2466d9201dd1f9
fixes: bz#1709734
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2019-05-17 07:47:53 +00:00
Kotresh HR
4a4710b810 geo-rep: Fix entries and metadata counters in geo-rep status
Entries counter was incremented twice and decremented only
once. And entries count was being used in place of metadata
entries. This patch fixes both of them.

Backport of:
 > Patch: https://review.gluster.org/22603
 > BUG: 1512093
 > Change-Id: I5601a5fe8d25c9d65b72eb529171e7117ebbb67f
 > Signed-off-by: Kotresh HR <khiremat@redhat.com>
  (cherry picked from commit e0a6941af6ed352911698012ada895d1296b549e)

fixes: bz#1709685
Change-Id: I5601a5fe8d25c9d65b72eb529171e7117ebbb67f
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2019-05-17 07:47:53 +00:00
Pranith Kumar K
84b2d08081 cluster/ec: Reopen shouldn't happen with O_TRUNC
Problem:
Doing re-open with O_TRUNC will truncate the fragment even when it is not
needed, requiring extra heals.

Fix:
At the time of re-open don't use O_TRUNC.

fixes bz#1709660
Change-Id: Idc6408968efaad897b95a5a52481c66e843d3fb8
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
2019-05-15 10:36:50 +00:00
Ravishankar N
9f225fa2c4 afr: thin-arbiter lock release fixes
- pass fop state instead of afr local to
afr_ta_dom_lock_check_and_release()

- avoid afr_lock_release_synctask() being called simultaneously from
notify code path and transaction (post-op) code path due to races.

- Check if the post-op on TA is valid based on event_gen checks.

- Invalidate in-memory information when we get TA child down.

Note: This patch addresses some pending review comments of commit
053b1309dc
(https://review.gluster.org/#/c/glusterfs/+/20095/)

fixes: bz#1709130
Change-Id: I2ccd7e1b53362f9f3fed8680aecb23b5011eb18c
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 9ab2747da78061882f6734df4b265bce11adaef1)
2019-05-15 04:16:52 +00:00
Ashish Pandey
a1fa0379b7 cluster/afr : TA: Return actual error code in case of failure
In afr_ta_post_op_do, we were sending EIO for every failure.
However, the original error code should be sent.

Change-Id: I9fdc15dac00d758baf8e6f14db244f526481a63a
updates: bz#1709143
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
(cherry picked from commit 63159cdb5374f458d7d2bffec24d4720ffc96d6c)
2019-05-13 05:36:51 +00:00
N Balachandran
ab296b5be7 cluster/dht: Refactor dht lookup functions
Part 2: Modify dht_revalidate_cbk to call
dht_selfheal_directory instead of separate calls
to heal attrs and xattrs.

Change-Id: Id41ac6c4220c2c35484812bbfc6157fc3c86b142
fixes: bz#1707393
Signed-off-by: N Balachandran <nbalacha@redhat.com>
2019-05-09 17:40:25 +05:30
N Balachandran
7f780f30e5 cluster/dht: refactor dht lookup functions
Part 1:  refactor the dht_lookup_dir_cbk
and dht_selfheal_directory functions.
Added a simple dht selfheal directory test

Change-Id: I1410c26359e3c14b396adbe751937a52bd2fcff9
updates: bz#1707393
Signed-off-by: N Balachandran <nbalacha@redhat.com>
2019-05-08 14:00:05 +00:00
Sanju Rakonde
3fc7d08c1d glusterd: define dumpops in the xlator_api of glusterd
Problem: statedump is not capturing information related to glusterd

Solution: statedump is not capturing glusterd info because
trav->dumpops is null in gf_proc_dump_single_xlator_info ()
where trav is the glusterd xlator object. trav->dumpops is null
because dumpops was not defined in the xlator_api of glusterd.
Defining dumpops in the xlator_api of glusterd fixes the issue.

fixes: bz#1703759
Change-Id: If85429ecb1ef580aced8d5b88d09fc15258bfc4c
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
(cherry picked from commit 5d866c13efdcdeddf184f012aa88a652e90ff22e)
2019-05-08 13:57:24 +00:00
Kotresh HR
b2c6983d0c ctime: Fix repeated logging during open
The log "posix set mdata failed, No ctime" was logged repeatedly
after the fix [1]. Those could be internal fops. This patch
fixes the same.

[1] https://review.gluster.org/22540

Backport of:
 > Patch: https://review.gluster.org/#/c/glusterfs/+/22591/
 > BUG:1701457
 > Change-Id: I42799a90b976982cedb0ca11fa224d555eb05650
 > Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 2d39572821306496c96797f4d122f8200aae4585)

fixes: bz#1702734
Change-Id: I42799a90b976982cedb0ca11fa224d555eb05650
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2019-05-08 13:57:14 +00:00
Milan Zink
2aa9898720 extras/hooks: syntactical errors in SELinux hooks, script logic improved
Fixes: bz#1701818
Change-Id: Ia5fa1df81bbaec3a84653d136a331c76b457f42c
Signed-off-by: Milan Zink <zeten30@gmail.com>
(cherry picked from commit 1ad201a9fd6748d7ef49fb073fcfe8c6858d557d)
2019-05-08 13:56:20 +00:00
Pranith Kumar K
bf69fa4727 cluster/ec: fix fd reopen
Currently EC tries to reopen fd's that have been opened while a brick
was down. This is done as part of regular write operations, just after
having acquired the locks, and it's sent as a sub-fop of the main write
fop.

There were two problems:

1. The reopen was attempted on all UP bricks, even if a previous lock
didn't succeed. This is incorrect because most probably the open will
fail.

2. If reopen is sent and fails, the error is propagated to the main
operation, causing it to fail when it shouldn't.

To fix this, we only attempt reopens on bricks where the current fop
owns a lock, and we prevent any error from being propagated to the main
fop.

To implement this behaviour an argument used to indicate the minimum
number of required answers has been overloaded to also include some flags. To
make the change consistent, it has been necessary to rename the
argument, which means that a lot of files have been changed. However
there are no functional changes.

This change has also uncovered a problem in discard code, which didn't
correctly process requests of small sizes because no real discard fop
was being processed, only a write of 0's on some region. In this case
some fields of the fop remained uninitialized or with incorrect values.
To fix this, a new function has been created to simulate success on a
fop and it's used in the discard case.

Thanks to Pranith for providing a test script that has also detected an
issue in this patch. This patch includes a small modification of this
script to force data to be written into bricks before stopping them.

Backport of:
> Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
> BUG: bz#1699866
> Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>

Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
Fixes: bz#1699917
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
2019-05-08 13:54:59 +00:00
Xavi Hernandez
ab6c9ff91a core: handle memory accounting correctly
When a translator stops, memory accounting for that translator is not
destroyed (because there could remain memory allocated that references
it), but mutexes that coordinate updates of memory accounting were
destroyed. This caused incorrect memory accounting and even crashes in
debug mode.

This patch also fixes some other things:

* Reduce the number of atomic operations needed to manage memory
  accounting.
* Correctly account memory when realloc() is used.
* Merge two critical sections into one.
* Cleaned the code a bit.

Backport of:
> Change-Id: Id5eaee7338729b9bc52c931815ca3ff1e5a7dcc8
> BUG: bz#1659334
> Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>

Change-Id: Id5eaee7338729b9bc52c931815ca3ff1e5a7dcc8
Fixes: bz#1702271
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
2019-04-23 13:53:02 +02:00
ShyamsundarR
5c521d403f doc: Added release notes for 6.1
Fixes: bz#1692394
Change-Id: I44a28ec98932d54851dbf997988e1f8fd9877f0a
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-04-17 11:35:38 -04:00
Atin Mukherjee
1bbf2f7713 glusterd: fix loading ctime in client graph logic
Commit efbf8ab wasn't handling all the scenarios of toggling the ctime
option correctly and, moreover, a ! had completely tossed up the logic.

Fixes: bz#1698471
Change-Id: If12e2f69045e59878992ee2cd0518cc0eabcce0d
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
2019-04-17 15:21:37 +00:00
Aravinda VK
ec95f02b1e geo-rep: fix integer config validation
ssh-port validation is specified as `validation=int` in the template
`gsyncd.conf`, but this was not handled during geo-rep config set.
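
As a rough illustration of what handling `validation=int` at config-set time
could look like, the sketch below rejects a non-integer value before it is
stored. The function name validate_int is hypothetical and this is not the
gsyncd implementation.

```python
def validate_int(name, value):
    """Return value converted to int, or raise ValueError naming the option.

    Hypothetical helper: shows an integer check applied when a config
    value such as ssh-port is set.
    """
    try:
        return int(str(value).strip())
    except ValueError:
        raise ValueError("%s: expected an integer, got %r" % (name, value))
```

With a check like this, setting ssh-port to 2222 would be accepted and a value
such as "abc" would be refused with an error instead of being saved.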

Backport of https://review.gluster.org/22418

Fixes: bz#1695445
Change-Id: I3f19d9b471b0a3327e4d094dfbefcc58ed2c34f6
Signed-off-by: Aravinda VK <avishwan@redhat.com>
(cherry picked from commit c574984e19d59e351372eacce0ce11fb36e96dd4)
2019-04-17 13:58:52 +00:00
Aravinda VK
cbb52082da geo-rep: IPv6 support
`address_family=inet6` needs to be added while mounting master and
slave volumes in gverify script.

New option introduced to gluster cli(`--inet6`) which will be used
internally by geo-rep while calling `gluster volume info
--remote-host=<ipv6>`.

Backport of https://review.gluster.org/22363

Fixes: bz#1695436
Change-Id: I1e0d42cae07158df043e64a2f991882d8c897837
Signed-off-by: Aravinda VK <avishwan@redhat.com>
(cherry picked from commit 240e1d6821fbb779c3dd73f6f0225d755a5b7cc6)
2019-04-17 13:58:46 +00:00
Kinglong Mee
f39fc92d65 cluster-syncop: avoid duplicate unlock of inodelk/entrylk
When using ec, there are many messages in the brick log such as,

[inodelk.c:514:__inode_unlock_lock] 0-test-locks:  Matching lock not found for unlock 0-9223372036854775807, lo=68e040a84b7f0000 on 0x7f208c006f78
[MSGID: 115053] [server-rpc-fops_v2.c:280:server4_inodelk_cbk] 0-test-server: 2557439: INODELK <gfid:df4e41be-723f-4289-b7af-b4272b3e880c> (df4e41be-723f-4289-b7af-b4272b3e880c), client: CTX_ID:67d4a7f3-605a-4965-89a5-31309d62d1fa-GRAPH_ID:0-PID:1659-HOST:openfs-node2-PC_NAME:test-client-1-RECON_NO:-28, error-xlator: test-locks [Invalid argument]

> Change-Id: Ib164d29ebb071f620a4ca9679c4345ef7c88512a
> Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
(cherry-pick of https://review.gluster.org/#/c/glusterfs/+/22377/)

Change-Id: I6e0eaba6aca6cd99ba2a5ae2e580167d54d8ea26
Updates: bz#1690950
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
2019-04-17 13:47:52 +00:00
Raghavendra G
b1186532c7 transport/socket: log shutdown msg occasionally
Change-Id: If3fc0884e7e2f45de2d278b98693b7a473220a5e
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Fixes: bz#1679904
(cherry picked from commit ec1b84300fe267dd12c1e42e7e91905db935f1e2)
2019-04-16 13:38:58 +00:00
Pranith Kumar K
7ec3a8527f cluster/afr: Remove local from owners_list on failure of lock-acquisition
When eager-lock lock acquisition fails because of say network failures, the
local is not being removed from owners_list, this leads to accumulation of
waiting frames and the application will hang because the waiting frames are
under the assumption that another transaction is in the process of acquiring
lock because owner-list is not empty. Handled this case as well in this patch.
Added asserts to make it easier to find these problems in future.

fixes bz#1699731
Change-Id: I3101393265e9827755725b1f2d94a93d8709e923
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
2019-04-16 11:29:03 +00:00
Mohit Agrawal
d2de3f6639 core: Log level changes do not take effect on running client process
Problem: commit c34e4161f3 set log-level
         per xlator during reconfigure only for a brick process not for
         the client process.

Solution: 1) Change per xlator log-level only if brick_mux is enabled. To make sure
             about brick multiplex introduce a flag brick_mux at ctx->cmd_args.

Note: There are two other changes done with this patch
      1) Ignore client-log-level option to attach a brick with
         already running brick if brick_mux is enabled
      2) Add a log to print pid of the running process to make easier
         debugging

> Change-Id: I39e85de778e150d0685cd9a79425ce8b4783f9c9
> Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
> Fixes: bz#1696046
> (Cherry picked from commit 798aadbe51a9a02dd98a0f861cc239ecf7c8ed57)
> (Reviewed on upstream link https://review.gluster.org/#/c/glusterfs/+/22495/)

Change-Id: If91682830f894ab8f6857f19dcb1797fc15ca64c
Fixes: bz#1699715
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
2019-04-16 10:59:36 +00:00
Kotresh HR
80d2dae631 posix/ctime: Fix stat(time attributes) inconsistency during readdirp
Problem:
   Creation of tar file on gluster volume throws warning
'file changed as we read it'

Cause:
   During readdirp, for a few of the files whose inode is not
present, time attributes were served from the backend. This caused
the ctime of a few files to differ between before readdir
and after readdir by tar.

Solution:
  If ctime feature is enabled and inode is not present, don't
serve the time attributes from backend file, serve it from xattr.

Backport of:
> Patch: https://review.gluster.org/22540
> BUG: 1698078
> Change-Id: I427ef865f97399475faf5aa6ca495f7e317603ae
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit c56f102da21c5b69e656a055aaf736281596284d)

fixes: bz#1699703
Change-Id: I427ef865f97399475faf5aa6ca495f7e317603ae
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2019-04-16 10:57:12 +00:00
Kinglong Mee
5f51159463 ec: fix truncate lock to cover the write in truncate clean
ec_truncate_clean writes under the lock granted for truncate,
but the lock range is calculated by ec_adjust_offset_up, so
the write in ec_truncate_clean falls outside the lock.

Updates: bz#1699499
Change-Id: Idbe1fd48d26afe49c36b77db9f12e0907f5a4134
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
(cherry picked from commit 0e1223491e964096384edfae5032ed0d50d028ad)
2019-04-16 10:57:05 +00:00
Mohit Agrawal
88ecd64604 core: Brick is not able to detach successfully in brick_mux environment
Problem: In a brick_mux environment, while volumes are stopped in a
         loop, bricks are not detached successfully. Bricks are not
         detached because xprtrefcnt has not become 0 for the detached brick.
         At the time of initiating the brick detach process, server_notify
         saves xprtrefcnt on the detached brick and once the counter has become
         0, server_rpc_notify spawns a server_graph_janitor_threads
         to clean up brick resources. xprtrefcnt has not become 0 because
         the socket framework is not working due to 0 being assigned as the fd for a socket.
         In commit dc25d2c1ee
         there was a change in changelog fini to close htime_fd if htime_fd is not
         negative; by default htime_fd is 0, so it closes fd 0 as well.

Solution: Initialize htime_fd to -1 just after allocating changelog_priv
          with GF_CALLOC

> Fixes: bz#1699025
> Change-Id: I5f7ca62a0eb1c0510c3e9b880d6ab8af8d736a25
> Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
> (cherry picked from commit b777d83001d8006420b6c7d2d88fe68950aa7e00)

Change-Id: I7a2b6fc2d36405d51990376333e093661be48475
Fixes: bz#1699714
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
2019-04-16 10:55:51 +00:00
Mohit Agrawal
08278e8823 build: glusterfs build is failing on RHEL-6
Problem: glusterfs build is throwing error undefined
         reference to `dlclose' on RHEL 6

Solution: Add LIB_DL link in Makefile.am to resolve the same

> Fixes: bz#1696512
> Change-Id: I58019ca9e29d569d8e6df282b8ab178ad540843b
> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
> (cherry picked from commit 064aad721c249d63fb89686b32e5d15de50e2f8c)

Change-Id: I4f68553b501c283e2066ddc64e204db40552ee73
Fixes: bz#1699713
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
2019-04-16 10:55:43 +00:00
karthik-us
6bd52e5108 cluster/afr: Thin-arbiter SHD fixes
This patch addresses post-merge review comments for commit
5784a00f99

Change-Id: I7ed954664a2ae8e1091d23ee3ceb9c66e83bfeac
fixes: bz#1699319
Signed-off-by: karthik-us <ksubrahm@redhat.com>
2019-04-16 10:53:04 +00:00
Pranith Kumar K
fbba6e397f protocol/client: Do not fallback to anon-fd if fd is not open
If an open comes on a file when a brick is down and after the brick comes up,
a fop comes on the fd, client xlator would still wind the fop on anon-fd
leading to wrong behavior of the fops in some cases.

Example:
If lk fop is issued on the fd just after the brick is up in the scenario above,
lk fop will be sent on anon-fd instead of failing it on that client xlator.
This lock will never be freed upon close of the fd as flush on anon-fd is
invalid and is not wound below server xlator.

As a fix, fail the fop unless the fd has the FALLBACK_TO_ANON_FD flag.

Change-Id: I77692d056660b2858e323bdabdfe0a381807cccc
fixes bz#1699198
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
(cherry picked from commit 92ae26ae8039847e38c738ef98835a14be9d4296)
2019-04-16 10:52:32 +00:00
Ravishankar N
74db82dd5d afr: thin-arbiter read txn fixes
- Fixes afr_ta_read_txn() to handle the inode refresh failure
code-path.
- Fixes a double free issue of dict.

Note: This patch addresses post-merge review comments for commit
69532c141b

fixes: bz#1693992
Change-Id: Id5299b45b68569d47df6b73755918237a1592cb4
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 500bd0014128e6727e83b6cb77e8ac94304b8f4a)
2019-04-16 10:51:51 +00:00
Ashish Pandey
f792fd01aa cluster/ec: Don't enqueue an entry if it is already healing
Problem:
1 - heal-wait-qlength is by default 128. If shd is disabled
and we need to heal files, client side heal is needed.
If we access these files that will trigger the heal.
However, it has been observed that a file will be enqueued
multiple times in the heal wait queue, which in turn causes the
queue to fill up and prevents other files from being enqueued.

2 - While a file is going through healing and a write fop from
mount comes on that file, it sends write on all the bricks including
healing one. At the end it updates version and size on all the
bricks. However, it does not unset dirty flag on all the bricks,
even if this write fop was successful on all the bricks.
After healing completes, this dirty flag remains set and never
gets cleaned up if SHD is disabled.

Solution:
1 - If an entry is already in queue or going through heal process,
don't enqueue next client side request to heal the same file.

2 - Unset dirty on all the bricks at the end if fop has succeeded on
all the bricks even if some of the bricks are going through heal.
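
Point 1 of the solution can be pictured with a small bounded queue that
remembers which entries are already waiting or being healed. This is only a
sketch under assumptions: the class HealQueue and its method names are
hypothetical, and the real change lives in the EC translator's C code.

```python
from collections import deque


class HealQueue:
    """Bounded heal wait queue that refuses duplicate entries (illustrative)."""

    def __init__(self, max_waiting=128):
        self.max_waiting = max_waiting
        self.waiting = deque()
        # Entries that are either waiting in the queue or currently healing.
        self.tracked = set()

    def enqueue(self, gfid):
        # Skip entries already queued or healing so they cannot fill the queue.
        if gfid in self.tracked:
            return False
        if len(self.waiting) >= self.max_waiting:
            return False
        self.waiting.append(gfid)
        self.tracked.add(gfid)
        return True

    def start_next(self):
        # The entry stays tracked while its heal is in progress.
        return self.waiting.popleft() if self.waiting else None

    def finish(self, gfid):
        # Once healing is done the entry may be enqueued again later.
        self.tracked.discard(gfid)
```

Repeated accesses to the same file then take at most one slot in the queue,
leaving room for other files.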

Change-Id: Ia61ffe230c6502ce6cb934425d55e2f40dd1a727
updates: bz#1693223
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
(cherry picked from commit 313dcefe7a62bd16cd794040df068f9bec9c6927)
2019-04-16 10:50:49 +00:00
Atin Mukherjee
aca7ec21ed glusterd: load ctime in the client graph only if it's not turned off
Considering ctime is a client side feature, we can't blindly load ctime
xlator into the client graph if it's explicitly turned off, as that would
result in a backward compatibility issue where an old client can't mount
a volume configured on a server which has the ctime feature.

Fixes: bz#1698471
Change-Id: I6ae7b96d056073aa6746de9a449cf319786d45cc
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
(cherry picked from commit efbf8abcc3bc729a90d4a7b57dc515f1df8a5863)
2019-04-16 10:50:21 +00:00
Atin Mukherjee
c2723c57d2 logging: Fix GF_LOG_OCCASSIONALLY API
GF_LOG_OCCASSIONALLY doesn't log on the first instance but rather at every
42nd iteration, which isn't effective as in some cases we might not have
the code flow hitting the same log as many as 42 times and we'd end
up suppressing the log.
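
The difference can be illustrated with a counter-based sketch. It is one
reading of the description above, not the macro itself: the function names
and the exact call numbers on which the fixed version logs (1, 43, 85, ...)
are assumptions.

```python
INTERVAL = 42


def logs_before_fix(previous_calls):
    """Assumed old behaviour: log only on the 42nd, 84th, ... call."""
    return (previous_calls + 1) % INTERVAL == 0


def logs_after_fix(previous_calls):
    """Assumed new behaviour: log on the first call, then every 42nd call."""
    return previous_calls % INTERVAL == 0


def count_logged(decider, total_calls):
    """Count how many of total_calls would actually emit a log line."""
    return sum(1 for n in range(total_calls) if decider(n))
```

For a code path that is hit only ten times, the first variant never logs while
the second logs once.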

Updates: bz#1679904
Change-Id: Iee293281d25a652b64df111d59b13de4efce06fa
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
(cherry picked from commit d0d3e10d44366c68fc153e48b229e72a4aa26e61)
2019-04-16 10:49:22 +00:00
Atin Mukherjee
55c5e2ecc7 glusterd: fix txn-id mem leak
This commit ensures the following:
1. Don't send commit op request to the remote nodes when gluster v
status all is executed as for the status all transaction the local
commit gets the name of the volumes and remote commit ops are
technically a no-op. So no need for additional rpc requests.
2. In op state machine flow, if the transaction is in staged state and
op_info.skip_locking is true, then no need to set the txn id in the
priv->glusterd_txn_opinfo dictionary which never gets freed.

Fixes: bz#1694610
Change-Id: Ib6a9300ea29633f501abac2ba53fb72ff648c822
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
(cherry picked from commit 34e010d64905b7387de57840d3fb16a326853c9b)
2019-04-16 10:49:09 +00:00
Ravishankar N
5946db166a afr: add client-pid to all gf_event() calls
client-pid for glustershd is GF_CLIENT_PID_SELF_HEALD
client-pid for glfsheal is GF_CLIENT_PID_GLFS_HEALD

updates: bz#1693155
Change-Id: Ib3a863af160ff48c822a5e6b0c27c575c9887470
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 8016d51a3bbd410b0b927ed66be50a09574b7982)
2019-04-16 10:48:40 +00:00
Kaleb S. KEITHLEY
27a96f1f34 rpclib: slow floating point math and libm
In release-6 rpc/rpc-lib (libgfrpc) added the function
get_rightmost_set_bit() which calls log2(3), a call that takes
a floating point parameter and returns a floating point.

It's used thusly:
    right_most_unset_bit = get_rightmost_set_bit(...);

(So is it really the right-most unset bit, or the right-most set bit?)

It's unclear to me whether this is in the data path or not. If it is,
it's rather scary to think about integer-to-float and float-to-integer
conversions and slow calls to libm functions in the data path.

gcc and clang have __builtin_ctz() which returns the same result as
get_rightmost_set_bit(), and does it substantially faster. Approx
20M iterations of get_rightmost_set_bit() took ~33sec of wall clock
time on my devel machine, while 20M iterations of __builtin_ctz()
took < 9sec; get_rightmost_set_bit() is 3x slower than __builtin_ctz().

And as a side benefit, we can again eliminate the need to link libgfrpc
with libm.
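
The equivalence between the floating-point approach and an integer-only
count of trailing zeros can be checked with a short sketch. Python is used
here purely for illustration: both function names are hypothetical, the body
of get_rightmost_set_bit() is not shown in this message, and
`(x & -x).bit_length() - 1` is a Python analogue of what __builtin_ctz()
computes rather than the patch itself. As with __builtin_ctz(), the input is
assumed to be non-zero.

```python
import math


def rightmost_set_bit_float(x):
    """Index of the lowest set bit, via a floating-point log2 call."""
    return int(math.log2(x & -x))


def rightmost_set_bit_int(x):
    """Index of the lowest set bit, using integer operations only."""
    # x & -x isolates the lowest set bit; bit_length gives its position + 1.
    return (x & -x).bit_length() - 1
```

Both functions return 3 for 0b101000; the integer version avoids the
int-to-float and float-to-int conversions altogether.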

Change-Id: If9e7e80874577c52223f8125b385fc930de20699
fixes: bz#1692957
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2019-04-16 10:48:06 +00:00
Ashish Pandey
c5bc21ebbc cluster/ec: Fix handling of heal info cases without locks
When we use the heal info command, it takes a lot of time as in
some cases it takes a lock on entries to find out if the
entry actually needs heal or not.

There are some cases where we can avoid these locks and
can conclude if the entry needs heal or not.

1 - We do a lookup (without lock) on an entry, which we found in
.glusterfs/indices/xattrop, and find that lock count is
zero. Now if the file contains dirty bit set on all or any
brick, we can say that this entry needs heal.

2 - If the lock count is one and dirty is greater than 1,
then it also means that some fop had left the dirty bit set
which made the dirty count of current fop (which has taken lock)
more than one. At this point also we can definitely say that
this entry needs heal.
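
The two cases can be written down as a small decision function. This is a
sketch only: the function name and the representation of the dirty state as
one counter per brick are assumptions made for illustration, and False here
means "cannot conclude without taking the lock", not "no heal needed".

```python
def can_conclude_needs_heal(lock_count, dirty_per_brick):
    """Return True when heal is certainly needed without taking a lock.

    lock_count: number of locks currently held on the entry.
    dirty_per_brick: dirty counter seen on each brick (assumed layout).
    """
    if lock_count == 0:
        # Case 1: nobody holds a lock, so any dirty bit is left over.
        return any(d > 0 for d in dirty_per_brick)
    if lock_count == 1:
        # Case 2: the lock holder accounts for one dirty count; anything
        # above that was left behind by an earlier fop.
        return any(d > 1 for d in dirty_per_brick)
    # Otherwise the lock-free shortcut does not apply.
    return False
```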

This patch modifies the code to take the above two points into
consideration.
It also changes the code not to call ec_heal_inspect if ec_heal_do
was called from client side heal. Client side heal triggers heal
only when it is sure that it requires heal.

[We have changed the code not to call heal for lookup]

updates bz#1697764
Change-Id: I7f09f0ecd12f65a353297aefd57026fd2bebdf9c
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
(cherry picked from commit da47caf2405c08c9abafc4a55525a8b2c2dd5bb8)
2019-04-09 05:27:52 +00:00
Kotresh HR
381e7603d9 geo-rep: Fix syncing multiple rename of symlink
Problem:
Geo-rep fails to sync rename of symlink if it's
renamed multiple times if creation and rename
happened successively

Worker crash at slave:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py",  in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", in entry_ops
    [ESTALE, EINVAL, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in lsetxattr
    cls.raise_oserr()
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 12] Cannot allocate memory

Geo-rep Behaviour:
1. SYMLINK doesn't record target path in changelog.
   So while syncing SYMLINK, readlink is done on
   master to get target path.

2. Geo-rep will create destination if source is not
   present while syncing RENAME. Hence while syncing
   RENAME of SYMLINK, target path is collected from
   destination.

Cause:
If symlink is created and renamed multiple times, creation of
symlink is ignored, as it's no longer present on master at
that path. While symlink is renamed multiple times at master,
when syncing the first RENAME of SYMLINK, both source and destination
are not present, hence the target path is not known.  In this case,
while creating destination directly at slave,  regular file
attributes were encoded into blob instead of symlink,
causing failure in gfid-access translator while decoding
blob.

Solution:
While syncing a RENAME of SYMLINK, when the target is not known
and when src and destination are not present on the master,
don't create the destination. Ignore the rename. It's ok to ignore.
If it's unlinked, it's fine.  If it's renamed to something else,
it will be synced then.

Backport of:
> Change-Id: Ibdfa495513b7c05b5370ab0b89c69a6802338d87
> BUG: bz#1693648
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 877af725b3e35b548d6d7aeec5adb21721d8bf8b)

Change-Id: Ibdfa495513b7c05b5370ab0b89c69a6802338d87
fixes: bz#1694002
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 877af725b3e35b548d6d7aeec5adb21721d8bf8b)
2019-04-03 04:31:06 +00:00
Soumya Koduri
491ff40a7a gfapi: Unblock epoll thread for upcall processing
With commit#ad35193, we have made changes to offload
processing upcall notifications to synctask so as not
to block epoll threads. However, it seems the issue wasn't
fully addressed.

In "glfs_cbk_upcall_data" -> "synctask_new1" after creating synctask
if there is no callback defined, the thread waits on synctask_join
till the syncfn is finished. So that way even with those changes,
epoll threads are blocked till the upcalls are processed.

Hence the right fix now is to define a callback function for that
synctask "glfs_cbk_upcall_syncop" so as to unblock epoll/notify threads
completely and the upcall processing can happen in parallel by synctask
threads.
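
The difference between joining on the task and registering a completion
callback can be sketched with a thread pool. This is an analogy only, using
Python's concurrent.futures rather than gluster's synctask framework; the
function names below are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch_blocking(pool, fn, arg):
    """Caller waits for the result, like joining on the task."""
    return pool.submit(fn, arg).result()


def dispatch_with_callback(pool, fn, arg, on_done):
    """Caller returns at once; on_done runs when the task completes."""
    future = pool.submit(fn, arg)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future
```

In the first form the dispatching thread cannot do anything else until the
work finishes; in the second it is free immediately and the worker threads
process the work in parallel.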

Change-Id: I4d8645e3588fab2c3ca534e0112773aaab68a5dd
fixes: bz#1694561
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit 4a03a71c6171f6e8382664d9d29857d06ef37741)
2019-04-03 04:30:53 +00:00
Poornima G
dbfff66092 client-rpc: Fix the payload being sent on the wire
The fops allocate 3 kinds of payload (buffer) in the client xlator:
- fop payload, this is the buffer allocated by the write and put fop
- rsphdr payload, this is the buffer required by the reply cbk of
  some fops like lookup, readdir.
- rsp_payload, this is the buffer required by the reply cbk of fops like
  readv etc.

Currently, in the lookup and readdir fop the rsphdr is sent as payload,
hence the allocated rsphdr buffer is also sent on the wire, increasing
the bandwidth consumption on the wire.

With this patch, the issue is fixed.

Fixes: bz#1692101
Change-Id: Ie8158921f4db319e60ad5f52d851fa5c9d4a269b
Signed-off-by: Poornima G <pgurusid@redhat.com>
2019-03-29 15:23:52 +00:00
Ravishankar N
4e7afab231 gfapi: add function to set client-pid
This api offers the ability to set the pid of a client to a particular
value, identical to how gluster fuse clients provide the --client-pid
option. This is an internal API to be used by gluster processes only. See
https://lists.gluster.org/pipermail/gluster-devel/2019-March/055925.html
for more details. Currently glfsheal is the only proposed consumer.

updates: bz#1693155
Change-Id: I0620be2127d79d69cdd57cffb29bba44e6e5da1f
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 08d502c3b05c6f7831bb4cc764bd458b405a99b1)
2019-03-29 11:08:32 +00:00
Yaniv Kaul
cba59f6cb7 server.c: fix Coverity CID 1399758
1399758 Dereference before null check

It was introduced @ commit 67f48bfcc16a38052e6c9ae7c25e69b03b8ae008

updates: bz#1691187
> updates: bz#789278
> Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

> Change-Id: I1424b008b240691fe2a8924e31c708d0fb4f362d
> (cherry picked from commit 8aff9cc5c6277ef7dacfb89f1392b7c2eda9b825)

Change-Id: Ie2160fb9ae9cdeacf845e849da7f6001b3b6b10b
2019-03-21 04:57:14 +00:00
ShyamsundarR
3fadf5cc41 doc: Final version of release-6 release notes
Fixes: bz#1672818
Change-Id: I6a98985a7f25bc2b85af5bd85f4be3ffac7d619d
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-03-19 10:17:57 -04:00
Kotresh HR
7e90a3b592 release-notes/6.0: Add ctime feature changes in release notes
Change-Id: I3a305b9eb292a450c83de5628ceeadcb0a44afc7
updates: bz#1672818
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Signed-off-by: ShyamsundarR <srangana@redhat.com>
2019-03-19 09:51:05 -04:00