IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This patch will allow for developers to create unit tests for
their code. Documentation has been added to the patch and
is available here:
doc/hacker-guide/en-US/markdown/unittest.md
Also, unit tests are run when RPM is created.
BUG: 1067059
Change-Id: I95cf8bb0354d4ca4ed4476a0f2385436a17d2369
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Signed-off-by: Luis Pabon <lpabon@redhat.com>
Reviewed-on: http://review.gluster.org/7145
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Justin Clift <justin@gluster.org>
Tested-by: Justin Clift <justin@gluster.org>
To be used in afr metadata self-heal
Change-Id: I8dac4b19d61e331702427eeb5b606aab3d20b328
BUG: 1021686
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6941
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Memory is allocated for pump_priv and for pump_priv->resume_path, but if
an error is detected the references to that memory go out of scope and
the memory is never freed.
This patch assures that the memory is freed on error.
Patchset 2: These are Kaleb's recommended changes which, compared to my
original fix, are more comprehensive and provide a more
complete resolution to the memory leakage bugs in this
function. The bug reported by Coverity was limited to a
single memory allocation.
BUG: 789278
CID: 1124737
Change-Id: Ie239e3b5d28d97308bf948efec6a92f107bc648b
Signed-off-by: Christopher R. Hertel <crh@redhat.com>
Reviewed-on: http://review.gluster.org/6929
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Problem: With the current implementation we are allowing unlink
of a file if hashed subvol is down and cached subvol is up.
For the above op to work we should have the info of hashed_subvol.
But incase we do remount of the volume we will have a zeroed layout
for the disconnected subvol(start=0, stop=0, err=ENOTCONN) which will
result into hashed_subvol being NULL and failing unlink op.
Solution: Dont fail if hashed_subvol is NULL. Check cached subvol
and unlink in cached subvol. The linkto file in the hashed subvol
can be remove later.
Change-Id: Ic1982c15c8942a1adcb47ed0017d2d5ace5c9241
BUG: 983416
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/6851
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Files that are being rebalanced are created in the new volume
and access path needs to open these files to write changing
data in parallel to both the old and new locations. While opening
the file in the new location, we need to restrict the open flags
to not use truncate or create and fail if exist flags, to prevent
open failures or inadvertently truncate the file under rebalance.
Change-Id: I12130e0377adc393f1925c45585200ad991fd0d5
BUG: 1058569
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/6830
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
The layout was not being sorted in the ascending order leading
to the wrong detection of holes/overlaps.
From looking at the previous git commits it appears that the initial version itself had the err comparison code.
Deductions from the current dht_layout_sort():
1. The zero'ed out layouts should be in the from of list, if needed
2. The layout should be sorted in the ascending order of layout error value.
3. The layout should be sorted in the ascending order of the layout 'start'.
But In some cases, with the err comparison code its not sorted in the ascending order. Example: If the input is as below for dht_layout_sort(), the sorting doesn't happen in ascending order.
Input:
0-1 err:0 2-3 err:0 6-7 err:0 0-0 err:20 4-5 err:0
With the current sort, Output:
4-5 err:0 0-0 err:0 0-1 err:0 2-3 err:0 6-7 err:0
Expected: 0-0 err:20 0-1 err:0 2-3 err:0 4-5 err:0 6-7 err:0
Looking at dht_layout_anomalies() it appears that, it doesn't require the layout to be sorted based on error value.
The other solution was to replace line 468 with:
if ((layout->list[i].err || layout->list[j].err) && (layout->list[i].start > layout->list[j].start))
Since dht_layout_anomalies() didn't expect the layout to be sorted based on the error, removed the err comparison.
Change-Id: I1215f6cd53efc7dba01c0958ba6cc7609dab6ff5
BUG: 1056406
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/6757
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Anand Avati <avati@redhat.com>
If the call to dict_set_dynstr() fails, the memory indicated by
xattr_buf will not have been stored in the dictionary, so it must be
freed.
Patch set 2: Added a missed call to GF_FREE(). Fixed a formatting
consistency issue.
Patch set 3: Cleaned a minor style nit.
BUG: 789278
CID: 1124786
Change-Id: Id1f85bd2cbfac0b8727a3f6901f0a50ba921817d
Signed-off-by: Christopher R. Hertel <crh@redhat.com>
Reviewed-on: http://review.gluster.org/6826
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Currently with rmdir, if a directory contains only the linkfiles
we remove all the linkfiles and this is causing the problem when the cached
sub volume is down and end-up with duplicate files showing on the mount point.
Solution: Before removing a linkfile check if the
files exists in cached subvolume.
Change-Id: Iedffd0d9298ec8bb95d5ce27c341c9ade81f0d3c
BUG: 1042725
Signed-off-by: Vijaykumar M <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/6500
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Memory is allocated at the top of the while loop via a call to
gf_strdup(), but there are several goto calls that exit the loop, and
the memory is not freed before each of those calls to goto. This fix
moves the final call to GF_FREE() higher in the loop so that the memory
is correctly freed.
Two variables, dup_str and str_tmp1, point to portions of the allocated
memory. Neither are used past the final call to GF_FREE( dup_str ).
BUG: 789278
CID: 1124780
Change-Id: Id24b80cdbfd8b8855c80fffec63d7fce98cbed4a
Signed-off-by: Christopher R. Hertel <crh@redhat.com>
Reviewed-on: http://review.gluster.org/6771
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
This appears to have been a cut&paste error. The same set of 12 lines
was repeated three times, causing a pointer to allocated memory to be
overwritten twice resulting in a memory leak.
This patch removes the redundant code.
BUG: 789278
CID: 1128915
Change-Id: I3e4a3703b389c00e2a4e99e0a7368c5a3dda74d0
Signed-off-by: Christopher R. Hertel <crh@redhat.com>
Reviewed-on: http://review.gluster.org/6769
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Also fixed check in dht_is_subvol_in_layout to check if the
layouts are zero'ed out.
Change-Id: I4bf8ebf66d3ef1946309b6c9aac9e79bf8a6d495
BUG: 969461
Signed-off-by: shishir gowda <sgowda@redhat.com>
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/6392
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Problem:
We found a day-1 bug when syncop_xxx() infra is used inside a synctask with
compilation optimization (CFLAGS -O2).
Detailed explanation of the Root cause:
We found the bug in 'gf_defrag_migrate_data' in rebalance operation:
Lets look at interesting parts of the function:
int
gf_defrag_migrate_data (xlator_t *this, gf_defrag_info_t *defrag, loc_t *loc,
dict_t *migrate_data)
{
.....
code section - [ Loop ]
while ((ret = syncop_readdirp (this, fd, 131072, offset, NULL,
&entries)) != 0) {
.....
code section - [ ERRNO-1 ] (errno of readdirp is stored in readdir_operrno by a
thread)
/* Need to keep track of ENOENT errno, that means, there is no
need to send more readdirp() */
readdir_operrno = errno;
.....
code section - [ SYNCOP-1 ] (syncop_getxattr is called by a thread)
ret = syncop_getxattr (this, &entry_loc, &dict,
GF_XATTR_LINKINFO_KEY);
code section - [ ERRNO-2] (checking for failures of syncop_getxattr(). This
may not always be executed in same thread which executed [SYNCOP-1])
if (ret < 0) {
if (errno != ENODATA) {
loglevel = GF_LOG_ERROR;
defrag->total_failures += 1;
.....
}
the function above could be executed by thread(t1) till [SYNCOP-1] and code
from [ERRNO-2] can be executed by a different thread(t2) because of the way
syncop-infra schedules the tasks.
when the code is compiled with -O2 optimization this is the assembly code that
is generated:
[ERRNO-1]
1165 readdir_operrno = errno; <<---- errno gets expanded
as *(__errno_location())
0x00007fd149d48b60 <+496>: callq 0x7fd149d410c0 <address@hidden>
0x00007fd149d48b72 <+514>: mov %rax,0x50(%rsp) <<------ Address
returned by __errno_location() is stored in a special location in stack for
later use.
0x00007fd149d48b77 <+519>: mov (%rax),%eax
0x00007fd149d48b79 <+521>: mov %eax,0x78(%rsp)
....
[ERRNO-2]
1281 if (errno != ENODATA) {
0x00007fd149d492ae <+2366>: mov 0x50(%rsp),%rax <<----- Because
it already stored the address returned by __errno_location(), it just
dereferences the address to get the errno value. BUT THIS CODE NEED NOT BE
EXECUTED BY SAME THREAD!!!
0x00007fd149d492b3 <+2371>: mov $0x9,%ebp
0x00007fd149d492b8 <+2376>: mov (%rax),%edi
0x00007fd149d492ba <+2378>: cmp $0x3d,%edi
The problem is that __errno_location() value of t1 and t2 are different. So
[ERRNO-2] ends up reading errno of t1 instead of errno of t2 even though t2 is
executing [ERRNO-2] code section.
When code is compiled without any optimization for [ERRNO-2]:
1281 if (errno != ENODATA) {
0x00007fd58e7a326f <+2237>: callq 0x7fd58e797300
<address@hidden><<--- As it is calling __errno_location() again it gets the
location from t2 so it works as intended.
0x00007fd58e7a3274 <+2242>: mov (%rax),%eax
0x00007fd58e7a3276 <+2244>: cmp $0x3d,%eax
0x00007fd58e7a3279 <+2247>: je 0x7fd58e7a32a1
<gf_defrag_migrate_data+2287>
Fix:
Make syncop_xxx() return (-errno) value as the return value in
case of errors and all the functions which make syncop_xxx() will need to use
(-ret) to figure out the reason for failure in case of syncop_xxx() failures.
Change-Id: I314d20dabe55d3e62ff66f3b4adb1cac2eaebb57
BUG: 1040356
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6475
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
From the history (Patch: http://review.gluster.org/4668/)
When subvols-per-directory is < available subvols, then there are layouts
which are not populated. This leads to incorrect identification of holes or
overlaps. We need to ignore layouts, which have err == 0, and start == stop.
In the current scenario (start == stop == 0).
Additionally, in layout-merge, treat missing xattrs as err = 0. In case of
missing layouts, anomalies will reset them.
For any other valid subvoles, err != 0 in case of layouts being zeroed out.
Also reverted back dht_selfheal_dir_xattr, which does layout calculation only
on subvols which have errors.
Change-Id: Idb72a869f1a6f103046bb7e6fe0019f6ac853fd4
BUG: 1047331
Signed-off-by: Vijaykumar M <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/6618
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Problem:
Under the entry self heal, readlink is done at the
source and sink. When readlink is done at the sink,
because link is not present at the sink, afr expects
ENOENT. AFR translator takes decisions for new link
creation based on ENOENT but server translator is modified
to return ESTALE because of which afr xlator is not able
to heal.
Fix:
The check for inode absence at server includes ESTALE as
well.
Change-Id: I319e4cb4156a243afee79365b7b7a5a7823e9a24
BUG: 1046624
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/6599
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Full crawl is executed when index self-heal is useless, like
disk replacement. So if there are on-going index crawls, they
should be stopped inorder to start full self-heals.
Change-Id: I9a1545f1ec4ad9999dc08523ce859e4fa152e214
BUG: 1049355
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6659
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Problem:
------------------------------------------
afr_lookup_perform_self_heal() {
if(afr_is_transaction_running())
goto out
else
afr_launch_self_heal();
}
------------------------------------------
When 2 clients simultaneously access a file in split-brain, one of them
acquires the inode lock and proceeds with afr_launch_self_heal (which
eventually fails and sets "sh-failed" in the callback.)
The second client meanwhile bails out of afr_lookup_perform_self_heal()
because afr_is_transaction_running() returns true due to the lock obtained by
client-1. Consequetly in client-2, "sh-failed" does not get set in the dict,
causing quick-read translator to *not* invalidate the inode, thereby serving
data randomly from one of the bricks.
Fix:
If a possible split-brain is detected on lookup, forcefully traverse the
afr_launch_self_heal() code path in afr_lookup_perform_self_heal().
Change-Id: I316f9f282543533fd3c958e4b63ecada42c2a14f
BUG: 870565
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/6578
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Varun Shastry <vshastry@redhat.com>
Problem:
Whenever a new brick is added into a replicate volume, all
source bricks are not marked as source. Only one of them is
marked as source. Here marked as source refers to adding
extended attribute at the backend of a file corresponding to
the newly added brick. As well as source bricks should point
to the newly added brick so that heal can be triggered.
Fix:
All source bricks will now point to newly added bricks and heal
can be triggered based on the extended attributes.
Change-Id: I318e1f779a380c16c448a2d05c0140d8e4647fd4
BUG: 1037501
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/6540
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Locality can be now queried by unprivileged users with
key "glusterfs.pathinfo".
Setting both "glusterfs.pathinfo" and "trusted.glusterfs.pathinfo"
on disk is prevented with this patch.
Original Author: Vijay Bellur <vbellur@redhat.com>
Change-Id: I4f7a0db8ad59165c4aeda04b23173255157a8b79
Signed-off-by: Vijaykumar M <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/5101
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Had misssed out dht_access in the previous round of cleanup
Change-Id: Ib255b9ad13ca62a8bc2eea225c46632aff8e820f
BUG: 1032894
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6496
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@gmail.com>
Problem:
Whenever there syncop_xxx() is used inside a synctask and gcc
optimizes it when compiled with -O2 there is a problem where
'errno' would not work as expected.
Fix:
Until http://review.gluster.com/6475 is reviewed and merged
we are making sure the function is not going to be optimized.
Change-Id: I504c18c8a7789f0c776a56f0aa60db3618b21601
BUG: 1040356
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6481
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
xattr name can legally be NULL. Handle that case without crashing.
Change-Id: Ie214cb05ccd52565dc247a9234ad83ae799d3866
BUG: 1036879
Signed-off-by: Poornima <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/6412
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
@key can legally be NULL. Handle that case without crashing.
Change-Id: Iaae293caa7eeb24afc9cd2580799173e2ce00911
BUG: 1036879
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6395
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
When getxattr fails with errno other than ENODATA fail rebalance
on that file. Log the reason for error.
Change-Id: Ia519870b88e6e6dd464d1c0415411aa999f80bc9
BUG: 1032927
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6341
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shishir Gowda <sgowda@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Creating linkfile could have failed, but we dont care about linkfile
for setting layout in the inode ctx (could be EEXIST etc.)
So ignore @inode in cbk and pick it up from local->loc.inode
Change-Id: I2952799d7ae0d3441b84b2ca2981afd75d7576e2
BUG: 1032859
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6319
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
When clients refer to a GFID which does not exist, the errno to
be returned in ESTALE (and not ENOENT). Even though ENOENT might
look "proper" most of the time, as the application eventually expects
ENOENT even if a parent directory does not exist, not returning
ESTALE results in resolvers (FUSE and GFAPI) to not retry resolution
in uncached mode. This can result in spurious ENOENTs during
concurrent path modification operations.
Change-Id: I7a06ea6d6a191739f2e9c6e333a1969615e05936
BUG: 1032894
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6318
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
This is needed for two reasons:
* since dht-linkfiles are internal, they shouldn't be accounted.
* hardlink handling in marker is broken. link/unlink of hardlinks
present in same directory can break marker accounting. Hence, if src
and dst are in same directory in case of rename, dht - if it breaks
rename into link/unlink operations - should instruct marker to not to
do accounting.
Change-Id: I9c9f7384569f75a2792f6450ee7a5279bf751ae7
BUG: 1022995
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6203
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
components other than distribute (like marker to exclude linkfiles
from being accounted) also need awareness of what constitutes a
linkfile. Hence its good to separate out this functionality into
core.
Change-Id: Ib944eeacc991bb1de464c9e73ee409fc7a689ff1
BUG: 1022995
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6152
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
* Two stages of quota enforcement is done:
Soft and hard quota Upon reaching soft quota limit on the directory
it logs/alerts in the quota daemon log (ie DEFAULT_LOG_DIR/quotad.log)
and no more writes allowed after hard
quota limit. After reaching the soft-limit the daemon alerts the
user/admin repeatively for every 'alert-time', which is
configurable.
* Quota enforcer is moved to server-side.
It takes care of enforcing quota. Since enforcer doesn't have the
cluster view, it relies on another service called
quota-aggregator. Aggregator, on query can return the size of a
directory based on the cluster view.
Enforcer is always loaded in the server graph and is by passed if
the feature is not enabled.
Options specific to enforcer:
server-quota - Specifies whether the feature is on/off. It is used
to by pass the quota if turned off.
deem-statfs - If set to on, it takes quota limits into consideration
while estimating fs size. (df command). The algorithm followed is,
i. Adjust statvfs based on limit configured on root.
ii. If limit is set on the inode passed, use size/limits on that inode to
populate statvfs. Otherwise, use size/limits configured on root.
iii. Upon statvfs, update the ctx->size on the inode.
iv. Don't let DHT aggregate, instead take the maximum of the usages from the
subvols of the DHT, since each of it contains the complete information.
Enforcer also makes use of gfid-to-path conversion functionality to
work correctly when a client like nfs predominently relies on
nameless lookups.
* Quota Aggregator acts as a thin client to provide cluster view
Its a lightweight *gluster client* process with no mount point,
started upon enabling quota or restarting the volume. This is a
single process run on each brick, which can answer queries on all
volumes in the cluster. Its volfile stored in
GLUSTERD_DEFAULT_WORKING_DIR/quotad/quotad.vol.
Credits:
Raghavendra Bhat <rabhat@redhat.com>
Varun Shastry <vshastry@redhat.com>
Shishir Gowda <sgowda@redhat.com>
Kruthika Dhananjay <kdhananj@redhat.com>
Brian Foster <bfoster@redhat.com>
Krishnan Parthasarathi <kparthas@redhat.com>
Change-Id: Id1cb25b414951da34c665a55f77385d482e0f9de
BUG: 969461
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/5952
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Problem:
afr_[f]getxattr_pathinfo_cbks fail the fop even when it succeeded on
one of the bricks. This can happen if the last response to pathinfo
[f]getxattr is a failure.
Fix:
Remember if any of the [f]getxattr_pathinfos are successful and send
that as the op_ret/op_errno value to the xlators above.
Note:
Winding fop to a client xlator that is not connected to server produces
an error log. Preventing that by not even winding fop when client xlator
is DOWN.
Change-Id: I846e8c47423ffcfa2eabffe8924534781a36841a
BUG: 1032927
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6332
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
glfs_zerofill() can be potentially called to zero-out entire file and
hence allow for bigger value of length parameter.
Change-Id: I75f1d11af298915049a3f3a7cb3890a2d72fca63
BUG: 1028673
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-on: http://review.gluster.org/6266
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com>
Tested-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-by: Anand Avati <avati@redhat.com>
* migrate all the fd's on an inode to newer subvol after rebalance
* use the migration in progress flag in inode, so all the operations
on the inode can make use of it
Change-Id: Ib807a46e927a1062688fc15119c916797c52a350
BUG: 1013456
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/5891
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Make changes to distributed xlator to work with BD xlator. Unlike files,
a block device can't be removed when its opened. So some part of the
code were moved down to avoid this situation. Also before truncating a
BD file its BD_XATTR should be set otherwise truncate will result in
truncating posix file. So file is created with needed BD_XATTR and
truncate is invoked. Also enables BD xlator in stripe volume type.
Change-Id: If127516e261fac5fc5b137e7fe33e100bc92acc0
BUG: 1028672
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5235
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in
the specified range. This fop will be useful when a whole file needs to
be initialized with zero (could be useful for zero filled VM disk image
provisioning or during scrubbing of VM disk images).
Client/application can issue this FOP for zeroing out. Gluster server
will zero out required range of bytes ie server offloaded zeroing. In
the absence of this fop, client/application has to repetitively issue
write (zero) fop to the server, which is very inefficient method because
of the overheads involved in RPC calls and acknowledgements.
WRITESAME is a SCSI T10 command that takes a block of data as input and
writes the same data to other blocks and this write is handled
completely within the storage and hence is known as offload . Linux ,now
has support for SCSI WRITESAME command which is exposed to the user in
the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to
implement this fop. Thus zeroing out operations can be completely
offloaded to the storage device , making it highly efficient.
The fop takes two arguments offset and size. It zeroes out 'size' number
of bytes in an opened file starting from 'offset' position.
This patch adds zerofill support to the following areas:
- libglusterfs
- io-stats
- performance/md-cache,open-behind
- quota
- cluster/afr,dht,stripe
- rpc/xdr
- protocol/client,server
- io-threads
- marker
- storage/posix
- libgfapi
Client applications can exloit this fop by using glfs_zerofill introduced in
libgfapi.FUSE support to this fop has not been added as there is no system call
for this fop.
Changes from previous version 3:
* Removed redundant memory failure log messages
Changes from previous version 2:
* Rebased and fixed build error
Changes from previous version 1:
* Rebased for latest master
TODO :
* Add zerofill support to trace xlator
* Expose zerofill capability as part of gluster volume info
Here is a performance comparison of server offloaded zeofill vs zeroing
out using repeated writes.
[root@llmvm02 remote]# time ./offloaded aakash-test log 20
real 3m34.155s
user 0m0.018s
sys 0m0.040s
[root@llmvm02 remote]# time ./manually aakash-test log 20
real 4m23.043s
user 0m2.197s
sys 0m14.457s
[root@llmvm02 remote]# time ./offloaded aakash-test log 25;
real 4m28.363s
user 0m0.021s
sys 0m0.025s
[root@llmvm02 remote]# time ./manually aakash-test log 25
real 5m34.278s
user 0m2.957s
sys 0m18.808s
The argument log is a file which we want to set for logging purpose and
the third argument is size in GB .
As we can see there is a performance improvement of around 20% with this
fop.
Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18
BUG: 1028673
Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com>
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5327
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
This patch avoids giving more info to the user about the
internal heuristic employed in afr, for quota sizes.
Change-Id: Ice3a164399f09b6967500ec0c17dc340e7ae9aba
BUG: 1016683
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6098
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>