850 Commits

Author SHA1 Message Date
Raghavendra G
44b8fbe753 cluster/afr: handle GF_XATTR_LOCKINFO_KEY appropriately.
values from all children need to be aggregated into a dictionary
and serialized buffer of this aggregated dictionary has to be
the value of GF_XATTR_LOCKINFO_KEY in the dict sent as a result of
fgetxattr.

Change-Id: Ie877f7c637c07feaee4c44d7ef86aa967a17b7e7
BUG: 808400
Signed-off-by: Raghavendra G <raghavendra@gluster.com>
Reviewed-on: http://review.gluster.org/4121
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2012-11-27 22:43:18 -08:00
Jeff Darcy
2389042e64 replicate: don't stop checking xattrs because one was absent
The functional issue is described by the subject line.  This patch also
addresses several efficiency/structure issues, such as...

* Calling dict_set_ptr once for each txn type, instead of once overall.

* Calling afr_index_for_transaction_type once per iteration instead of
  once per call (or better yet zero since the conversion is unnecessary).

* Implementation of inner functions in a different file than their one
  caller, creating a spurious header-file dependency.

Change-Id: I29e0df906a820533b66b9ced73e015dfe77267d2
BUG: 865825
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/4070
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-11-26 21:34:33 -08:00
Venkatesh Somyajula
2b1bf891f5 Cluster/afr: Fix output for gluster volume heal vn info healed
Problem:
Whenever gluster volume heal vol full command is executed, the entries
stored in the circular buffer for sh->healed are added in the dictionary
in the _crawl_post_sh_action function irrespective of whether actual self heal
(due to non-zero values in change log) takes place or not.

Fix:
Value of key (actual-sh-done) will be set to 1 whenever self heal takes place
due to non-zero change log values. If, for some FOP, the self heal daemon finds
that no self heal is required after examining the pending matrix, the value will
be 0.

Change-Id: I11fd0b9ee76759af17c5bca6bfafbaf66bcaacbc
BUG: 863068
Signed-off-by: Venkatesh Somyajula <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/4181
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2012-11-26 01:04:52 -08:00
Pranith Kumar K
76a4afec6e libglusterfs: Implement float percentage
Change-Id: Ia7ea63471f0bbd74686873f5f6f183475880f1a0
BUG: 839595
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.org/4162
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2012-11-23 23:13:39 -08:00
Pranith Kumar K
2e40e0f428 cluster/afr: check transaction type for eager-lock after it is set
Problem:
Eager locking lk-owner decision is taken before transaction
type is set. Default transaction type is DATA so all transactions
are treated as DATA transactions at the time of eager-locking
decision.

Fix:
Move the code that takes lk-owner decision after the transaction
type is set.

Test:
Checked that the transaction type is set properly in gdb at
the time of the lk-owner decision.

Change-Id: I7607c7ff4f88c7ced5416a1cddb6586cf45d88f9
BUG: 861335
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4220
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-11-21 12:03:37 -08:00
Raghavendra Bhat
f90ca96f54 cluster/dht: dump the layout information of directories only
testcase:
The changes are for removing gf_log from statedump related sections in dht and
using pthread_mutex_trylock in statedump sections. Changes are internal. So
tests were done by attaching gdb to the process and executing by manually
changing the values of some of the pointers.

Change-Id: I41fa76c1812b462cb76f5bbf2fd14de080e73895
BUG: 843822
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4117
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2012-11-19 02:58:06 -08:00
Venky Shankar
59dfcf1557 cluster/dht: ignore empty ->hashed_subvol during lookup
->hashed_subvol is not valid (== NULL) when the subvolume
the entity hashes to is down. For directories, we need not
rely on ->hashed_subvol as we aggregate information from all
subvolumes. So, during lookup, NULL ->hashed_subvol is ignored
but logged.

Change-Id: I306e4e274fe29d60ff028add4a6c3bcd67b2f314
Signed-off-by: Venky Shankar <vshankar@redhat.com>
BUG: 856459
Reviewed-on: http://review.gluster.org/4046
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Anand Avati <avati@redhat.com>
2012-10-17 00:16:05 -07:00
shishir gowda
53e49e8592 cluster/distribute: Always return the latest time in struct iatt.
Save the a/c/mtime in inode_ctx; dht_inode_ctx_update
checks the passed iatt, and updates the stat's time,
and inode_ctx's time accordingly. For preparent times, only
the iatt stat to be returned is updated, not the ctx.

With this update, WIPE is removed, as we would always be passing
back the latest mtime, and hence cache times will be relevant.

TODO-handle rename WIPE calls

Change-Id: I8e4c738cd830f3fafeef789c9181f9c242ac96a2
BUG: 857791
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/3737
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-16 14:27:15 -07:00
Venkatesh Somyajulu
760a564f2c cluster/afr : Edited log message in afr_sh_entry_expunge_entry_cbk
Change-Id: I9f7562d28c8bc798552c403164397f929a7bd1e7
BUG: 860246
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/4052
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-12 09:17:40 -07:00
linbaiye
5459e74ef2 Prevent client crash when calls to GF_CALLOC fail.
Calls to GF_CALLOC can, on rare occasions, fail, and the glusterfs client
then crashes with a segmentation fault. We should return as soon as the
variables of the transaction's local cannot be allocated.
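
The guard described above can be sketched in plain C. This is only an illustration, not the patch itself: `txn_local_t`, `txn_local_init` and `txn_local_cleanup` are hypothetical names, and standard `calloc` stands in for GF_CALLOC.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transaction's per-call state. */
typedef struct {
    int            child_count;
    unsigned char *pre_op;       /* one flag per child */
    int           *child_errno;  /* one errno per child */
} txn_local_t;

/* Allocate the per-child arrays (child_count must be > 0).
 * Returns 0 on success or -ENOMEM if any allocation fails. On failure
 * nothing is left allocated, so the caller can return immediately
 * instead of dereferencing a NULL pointer later. */
int txn_local_init(txn_local_t *local, int child_count)
{
    local->child_count = child_count;
    local->child_errno = NULL;

    local->pre_op = calloc((size_t)child_count, sizeof(*local->pre_op));
    if (!local->pre_op)
        return -ENOMEM;

    local->child_errno = calloc((size_t)child_count,
                                sizeof(*local->child_errno));
    if (!local->child_errno) {
        free(local->pre_op);
        local->pre_op = NULL;
        return -ENOMEM;
    }
    return 0;
}

/* Release whatever txn_local_init allocated. */
void txn_local_cleanup(txn_local_t *local)
{
    free(local->pre_op);
    free(local->child_errno);
    local->pre_op = NULL;
    local->child_errno = NULL;
}
```

A caller would check the return value and unwind with the error instead of continuing with a partly initialised local.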

Change-Id: Ia3798b8349d832b23c7825e64dbad93ebe29cd1b
BUG: 861335
Signed-off-by: linbaiye <linbaiye@gmail.com>
Reviewed-on: http://review.gluster.org/4005
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-11 20:59:45 -07:00
Jeff Darcy
10c1a9c26e replicate: don't use synctask_new from within a synctask
Change-Id: Iebf821ff720c63ab6da4b219d82c7f1d00769992
BUG: 862838
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/4032
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-11 18:08:49 -07:00
Kaushal M
f1f3d1c62d cli: Changes and enhancements to XML output
This patch contains several xml related changes which fix some bugs and
introduce xml output for commands which were missing it. These include,
* XML output for rebalance & remove-brick status
* XML output for replace-brick
* XML output for 'volume status all' in one XML document
* proper XML output for "volume {create|start|stop|delete}"
* type & status of a volume in 'volume info' is now given as a string as well

This patch also cleans up the '#if (HAVE_LIB_XML)' sections from the code-base,
so that it is not littered around.

Change-Id: I5bb022adf0fedf7e3ead92b4b79bfa02b0b5fef5
BUG: 828131
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/3869
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-11 16:34:16 -07:00
Venkatesh Somyajulu
7ed127844d cluster/afr Changed the message's log level from Error to Debug
Change-Id: Ic2506561367bfec9022dc53e9b17b03dc343df95
BUG: 859411
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/4055
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-10 19:14:57 -07:00
Pranith Kumar K
1d814dcae0 cluster/afr: check transaction type for eager-lock after it is set
Problem:
Eager locking lk-owner decision is taken before transaction
type is set. Default transaction type is DATA so all transactions
are treated as DATA transactions at the time of eager-locking
decision.

Fix:
Move the code that takes lk-owner decision after the transaction
type is set.

Test:
Checked that the transaction type is set properly in gdb at
the time of the lk-owner decision.

Change-Id: Ib1c886866f28788aed67622982e86d667b2cdb80
BUG: 864786
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.org/4053
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-10 19:11:30 -07:00
Jeff Darcy
58e6296fa2 build: split CPPFLAGS from CFLAGS
Automake provides a separate variable for preprocessor flags
(*_CPPFLAGS). They are already used in a few places, so make it
consistent and use it everywhere. Note that cflags obtained from
pkg-config often are cppflags, which is why LIBXML2_CFLAGS moves
into AM_CPPFLAGS, for example.

Change-Id: I15feed1d18b2ca497371271c4b5876d5ec6289dd
BUG: 862082
Original-author: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/4029
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-03 12:26:45 -07:00
Jeff Darcy
1ecbb7ca68 build: remove useless explicit -fPIC -shared from CFLAGS

libtool will automatically add "-fPIC" to the compiler command line as
needed, so there is no need to specify it separately.

"-shared" is normally a linker flag and has an odd effect when used with
libtool --mode=compile, namely that it inhibits production of static
objects. For that however, using AC_DISABLE_STATIC is a lot simpler.

Change-Id: Ic4cba0fad18ffd985cf07f8d6951a976ae59a48f
BUG: 862082
Original-author: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/4027
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-03 12:24:59 -07:00
Jeff Darcy
04371377f2 build: remove -nostartfiles flag
The "-nostartfiles" is a discouraged option and is documented to
potentially result in undesired behavior. Since I see no reason why it
should be in glusterfs, remove it.

Change-Id: I56f2b08874516ebad91447b2583ca2fb776bb7ab
BUG: 862082
Original-author: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/4018
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-02 13:11:43 -07:00
Jeff Darcy
9059a76c67 build: consolidate common compilation flags into one variable
Some -D flags are present in all files, so collect them.
This adds -D${GF_HOST_OS} to some compiler command lines,
but this should not be a problem.

Change-Id: I1aeb346143d4984c9cc4f2750c465ce09af1e6ca
BUG: 862082
Original-author: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/4013
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-01 16:44:50 -07:00
Pranith Kumar K
dd8eb06e26 cluster/afr: Provide option to set readdir-size in entry-self-heal
Problem:
Entry self-heal does lookups on all the entries that are read
in readdir. The larger the readdir size, the more lookups happen
in parallel. This has been observed to cause huge CPU spikes,
rendering everything else on the system unusable.

Fix:
Provided the option self-heal-readdir-size to configure the size.
The default value is 1KB.

Tests:
Checked that the readdirs are happening with the configured value
in entry-self-heal.

Change-Id: Icaa937ad88857e6f9a12375b1e7f6a49192bc8b1
BUG: 860895
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.org/4002
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-10-01 12:39:36 -07:00
Varun Shastry
1ee65fe16f Fixed some general typing errors.
Eg: changed recieved to received

Change-Id: I360fcb99c97c8a0222e373fee20ea2fccfb938db
BUG: 860543
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/3998
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2012-09-27 11:20:20 -07:00
Pranith Kumar K
e8712f3633 cluster/afr: Trigger heal on local subvols on any child_up
Problem:
The index in the child that comes online is generally empty
because the changes would have happened on the other child which
has been up.  So the sync begins when the other child's poll
time-out happens (i.e. 10 minutes). The expectation is that the
sync must be triggered as soon as the connection with any brick
is established.

Fix:
Whenever any child_up happens trigger the index self-heal on all
local children in the replicate subvolume.

Tests:
1) Checked that the self-heal is triggered on all local children
whenever any child comes online.
2) Checked that the volume heal commands are working fine.

Change-Id: I4f64737866470a2f989349a889ea52782930e11d
BUG: 852741
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.org/3972
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-09-25 19:41:25 -07:00
Pranith Kumar K
7add67bcca cluster/afr: Wake up post-op on non-co-operative transaction
Problem:
The problem is observed when kernel untar is done. One file untar
happens every second. The reason for this is, setattr lock is blocked
on the prev fd data-transaction full-lock (because of eager-lock).
Because of post-op-delay the post-op (xattrop + unlock) of the prev
data-transaction happens after 1 sec.
Until then the setattr is blocked, resulting in performance problems
in untar.

Fix:
Whenever an loc data, meta-data transaction comes, it should wakeup
the prev-post-op on the same process' fd.

Tests:
The performance problem in untar went away. I put a breakpoint in
client_finodelk for a 2G file dd and the inodelk is hit only 4 times.
This confirms that the change does not affect post-op-delay in a
-ve way.

Change-Id: Ice3c2a1211f4dca6520a19bc4ba6cb9efb2902ad
BUG: 845754
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.org/3975
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-09-25 10:45:08 -07:00
Varun Shastry
f7342ad3a9 Clean up of type-punning errors (strict aliasing warnings)
Change-Id: I48733967facc526fb523a8dc9bd068f8c5cc5971
BUG: 764282
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/3950
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-09-17 21:19:37 -07:00
Varun Shastry
3e2057542d All: License message change
License message changed for server-side, dual license GPLv2 and LGPLv3+.

Change-Id: Ia9e53061b9d2df3b3ef3bc9778dceff77db46a09
BUG: 852318
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/3940
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-09-13 13:19:37 -07:00
Anand Avati
4f87fd0ae2 dht: improve dht_fix_layout_of_directory for better re-assignment
Jeff Darcy wrote:
> AFAICT, the fix-layout code doesn't do the same rotation that the
> new-directory code does. Therefore, the new bricks always claim
> completely predictable hash ranges for every directory, leading to
> either a 0-1-2-3 pattern or a 1-0-2-3 pattern.  In other words, a
> file whose hash falls into the second quarter of the range will always
> be assigned to brick 2, and a file whose hash falls into the fourth
> quarter will always be assigned to brick 3.  The rest will be split
> according to the original pattern.  Put still another way, instead of
> same-named files in different directories being spread across N bricks,
> they might be spread across only two bricks (bad) or totally
> concentrated on one brick (worse) regardless of N.

The current dht_fix_layout_of_directory() code, in an attempt to
maximize overlap of new layout with existing layout (to minimize
movement of data) fails to do a good job of randomizing new assignment
even when it could do a better job. In an example where we expand
from 2 nodes to 4 nodes, the current possibilities are limited in the
following way -

(theoretical hash range: 00 - 99)

OLD 1
-----
server1: 00 - 49
server2: 50 - 99

NEW 1
-----
server1: 00 - 24
server2: 50 - 74
server3: 25 - 49
server4: 75 - 99

OLD 2
-----
server1: 50 - 99
server2: 00 - 49

NEW 2
------
server1: 50 - 74
server2: 00 - 24
server3: 25 - 49
server4: 75 - 99

The above shows that when add-brick expands from 2 bricks to 4 bricks, server3
and server4 always get the _same_ hash range no matter what the original
hash range assignment was.

The fix in this patch is to first do the standard new directory assignment
to a directory (with rotation etc.) and then do the reassignment to
maximize overlap. This way newly added servers still get random ranges
and existing servers have a probability of getting either of the quarters
which were part of its half previously. The same principles hold for
all add-brick from M to M+N.
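
The rotation step mentioned above can be sketched as follows, using the same theoretical 00 - 99 hash range. This is only an illustration under stated assumptions: `range_t`, `assign_rotated_layout` and the `rotation` parameter are hypothetical names, and the follow-up reassignment that maximizes overlap with the old layout is not shown.

```c
#define HASH_SPACE 100  /* theoretical hash range 00 - 99 */

/* Inclusive hash range owned by one brick. */
typedef struct {
    int start;
    int stop;
} range_t;

/* Split the hash space into n equal chunks and give chunk i to brick
 * (i + rotation) % n. Different rotation values (for example, one
 * derived per directory) give each brick a different chunk. The last
 * chunk absorbs any remainder so the whole space stays covered. */
void assign_rotated_layout(int n, int rotation, range_t *layout)
{
    int chunk = HASH_SPACE / n;

    for (int i = 0; i < n; i++) {
        int brick = (i + rotation) % n;

        layout[brick].start = i * chunk;
        layout[brick].stop = (i == n - 1) ? HASH_SPACE - 1
                                          : (i + 1) * chunk - 1;
    }
}
```

With four bricks, rotation 0 gives the third brick (index 2) the range 50 - 74, while rotation 1 gives it 25 - 49, so a newly added brick no longer lands on the same range in every directory.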

Change-Id: I0cbbf3bfa334645728072d66aaaa80120d0b295f
BUG: 853258
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/3883
Tested-by: Gluster Build System <jenkins@build.gluster.com>
2012-09-12 14:29:51 -07:00
Amar Tumballi
c78919ce37 cluster/dht: handle percent option for 'min-free-disk'
* with the init option cleanups, setting of 'conf->disk_unit'
  was reset, which made it not set the '%' in the option.

* bring a global check, which makes the option assume it is a
  percent, as long as value is < 100.
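
The global check in the second point might look roughly like the sketch below. The type `min_free_disk_t` and the function `parse_min_free_disk` are hypothetical and only illustrate the rule; they are not taken from the dht sources.

```c
#include <stdbool.h>

/* Hypothetical parsed form of the 'min-free-disk' option. */
typedef struct {
    double value;
    bool   is_percent;  /* true: value is a percentage of the disk */
} min_free_disk_t;

/* Treat the value as a percentage if it carried a '%' suffix, or,
 * as a fallback, whenever it is below 100. */
min_free_disk_t parse_min_free_disk(double value, bool has_percent_suffix)
{
    min_free_disk_t opt;

    opt.value = value;
    opt.is_percent = has_percent_suffix || value < 100.0;
    return opt;
}
```

Under this rule a bare value of 10 is read as 10%, while 2048 without a suffix keeps its absolute meaning.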

Change-Id: I00bd1395a309cdc596a2b2b80304c6d98696a24a
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 852889
Reviewed-on: http://review.gluster.org/3918
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-09-07 16:48:51 -07:00
Jules.Wang
dd7bc2d883 afr: add option description of 'open'.
Signed-off-by: Jules Wang <lancelotds@163.com>
Change-Id: I6c7dd337c758e82e9d58d4d65f53b5aa72ac5dfb
BUG: 764890
Reviewed-on: http://review.gluster.org/3895
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-09-06 18:25:19 -07:00
Amar Tumballi
fb6c8f8b4e cluster/distribute: remove gf_log() from statedump functions
Change-Id: I83cccab6819d6a74e96c2717ca539fa1568cac89
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 843822
Reviewed-on: http://review.gluster.org/3912
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-09-06 13:22:30 -07:00
Amar Tumballi
d6c99b6134 libglusterfs/dict: make 'dict_t' a opaque object
* ie, don't dereference dict_t pointer, instead use APIs everywhere
* other than dict_t only 'data_t' should be the valid export from dict.h

* added 'dict_foreach_fnmatch()' API
* changed dict_lookup() to use data_t, instead of data_pair_t

Change-Id: I400bb0dd55519a7c5d2a107e67c8e7a7207228dc
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 850917
Reviewed-on: http://review.gluster.org/3829
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-09-06 00:34:15 -07:00
Pranith Kumar K
a9ba00f5e4 cluster/afr: Don't stop entry/data self-heal on metadata split-brain
Problem:
Entry/Data self-heal is orthogonal to meta-data self-heal.
meta-data split-brain should not affect entry/data self-heal.

Fix:
Prevented aborting rest of the self-heals when metadata split-brain
happens.

Tests:
1) Simulated meta-data split-brain then checked that data-self-heal
succeeds on a regular file and entry-self-heal succeeds on a dir.
2) Reset meta-data change-log on one of the subvols and checked
that meta-data self-heal also completes.
3) Executed self-heal sanity script.

Change-Id: I05ca222d855d3a6000703e3775471d0f874d35d6
BUG: 851451
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.org/3853
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <obdurodon@gmail.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-08-29 20:49:37 -07:00
shishir gowda
e442b07f1d dht/rebalance: set the correct ownership on the dst file.
Currently, the dst file created has root:root ownership, till
migration is completed. During this phase, open fails on the dst
file if uid/gid is non-root.
Setting the dst_file to the correct ownership fixes the issue

Change-Id: Icfec89eb10dc866cdee38dab17695fe21174ef99
BUG: 852361
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/3861
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-08-28 23:15:59 -07:00
Varun Shastry
2ff5e1c2a1 All: License message change
The license message is changed to
  Copyright (c) 2008-2012 Red Hat, Inc. <http://www.redhat.com>
  This file is part of GlusterFS.

  This file is licensed to you under your choice of the GNU Lesser
  General Public License, version 3 or any later version (LGPLv3 or
  later), or the GNU General Public License, version 2 (GPLv2), in all
  cases as published by the Free Software Foundation.

Change-Id: I07d2b63ed5fbbbd1884f1e74f2dd56013d15b0f4
BUG: 852318
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/3858
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2012-08-28 03:45:06 -07:00
Krishnan Parthasarathi
16e880a958 afr: Avoid excessive logging in self-heal.
- (Excessive) Logging has been very useful as 'bread-crumbs' in
  many a root-cause analyses. This patch aims at avoiding logging when
  the information could be reconstructed using the xattrs, statedump,
  and/or "volume heal" CLI commands.

Change-Id: Iebc6b10ae18f0dd9704bdc6dd03bcfe0f2a09abd
BUG: 844804
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/3805
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-08-23 21:42:19 -07:00
Venkatesh Somyajulu
defc74df52 Self-heald: Prevent logging of errno ENOENT
Change-Id: Ie56228dfbdc7e519a344681487164a835488a470
BUG: 835423
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/3826
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-08-20 07:50:46 -07:00
Amar Tumballi
2f2e3bfb5e syncop: handle 'dataonly' flag in syncop_fsync()
* and also in syncop_readv(), don't look at _cbk args if op_ret
  is < 0.

Change-Id: I3ab2982bc6d186e75b6adb74c8981e4ff7058bbe
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 839950
Reviewed-on: http://review.gluster.org/3828
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-08-20 00:16:01 -07:00
Jim Meyering
b757819289 cluster/dht: don't leak upon GF_REALLOC failure
Change-Id: I7dfabcc2981df5c5a1e1a54c3135400a60626cd1
BUG: 846755
Signed-off-by: Jim Meyering <meyering@redhat.com>
Reviewed-on: http://review.gluster.com/3798
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-08-19 10:14:31 -07:00
shishir gowda
e404e9b81f cluster/dht: Optimize readdirp calls in DHT
Bring in option which is supported by posix xlator
to filter out directory's entries from being returned.
DHT would now request non-first subvols to filter out
directory entries.

dht xlator-option readdir-optimize will enable this
optimization

Change-Id: I35224bc81c9657f54f952efac02790276c35ded5
BUG: 838199
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.com/3772
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-08-13 23:24:04 -07:00
Pranith Kumar K
5f76656742 cluster/afr: Unwind with correct pre/post parent bufs
RCA:
In case of dir fops create, mknod, mkdir, link, symlink, rename
if the fop fails on read-child then unwinds are happening with
all-zero pre/post iatt-bufs. The bug occurs because the parent
bufs are not saved if the response is not from read-child.

Fix:
Save the pre/post-bufs for the first response. If the response
comes from read-child, overwrite whatever we have cached.
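
The caching rule in the fix can be sketched as below. The names `buf_t`, `local_t` and `record_parent_bufs` are hypothetical stand-ins for the real iatt and afr local structures, and the function is assumed to be called only for successful responses.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct iatt. */
typedef struct {
    long ia_mtime;
} buf_t;

/* Hypothetical stand-in for the per-fop state. */
typedef struct {
    int   read_child;        /* index of the read-child */
    bool  have_parent_bufs;  /* set once any response was cached */
    buf_t preparent;
    buf_t postparent;
} local_t;

/* Cache the parent bufs from the first response, and let a response
 * from the read-child overwrite whatever was cached before. */
void record_parent_bufs(local_t *local, int child,
                        const buf_t *pre, const buf_t *post)
{
    if (!local->have_parent_bufs || child == local->read_child) {
        local->preparent = *pre;
        local->postparent = *post;
        local->have_parent_bufs = true;
    }
}
```

If the fop fails on the read-child, the bufs cached from another child remain available, so the unwind no longer carries all-zero values.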

Tests:
Attached the mount process to gdb.
Tested that the unwinds happen with proper pre/post iatt bufs in
the following cases:
1) All success case
2) Failure on read-child
3) Failure on non-read-child
4) Failure on all children.

Tested soft-link self-heal to test the change made in that.
Tested errno ENOTEMPTY for rmdir, rename fops.

Change-Id: I82882423d2d766b4f4a3044203bcb5dbcaee1755
BUG: 845242
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.com/3775
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-08-02 13:12:21 -07:00
Pranith Kumar K
5301183161 cluster/afr: Handle child_up & fd not opened case in xaction
RCA:
When an fd is opened while a brick is down, after the brick
comes back up afr issues open on the other brick. It can
fail for a number of reasons (enoent etc). While the system
is in that state, inode/entrylks pre-op happen only on the
brick that is up and fd is opened for fd-fops. post-op should
consider only the bricks where both pre-op and fop succeeded
as success, rest of them as failures. Code now marks only the
children that are down as failures as opposed to child_down &
fd-not-opened. This makes change-log appear as success on the
subvolume where we did not do any fop leading to no change-log
but differences in data/metadata for reg-files.

Fix:
Mark non-participants of fop as failure. This is tracked in
transaction.pre_op[].
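
A minimal sketch of the rule, assuming one flag per child for "pre-op was done here" and one for "the fop succeeded here". The function name `compute_post_op_success` and the flat arrays are hypothetical and do not mirror the afr structures.

```c
/* A child counts as a success for post-op only if it took part in
 * the pre-op and the fop succeeded on it. Children that were down,
 * or up but without an opened fd, never did the pre-op and are
 * therefore marked as failures. */
void compute_post_op_success(const unsigned char *pre_op,
                             const unsigned char *fop_ok,
                             unsigned char *success,
                             int child_count)
{
    for (int i = 0; i < child_count; i++)
        success[i] = (pre_op[i] && fop_ok[i]) ? 1 : 0;
}
```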

Tests:
Simulated the scenario using err-gen on top of one of the client
xlator which fails all fops always. Performed fops and the changelog
represented pending fops on the brick with err-gen loaded. Tested
the case of brick down and perform entry/metadata/data operations
to confirm they still work as expected.

Change-Id: I41905936126b19abba56ca581c0301a894507e1a
BUG: 844987
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.com/3765
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-08-01 22:51:37 -07:00
Pranith Kumar K
9fcc3f4ded cluster/afr: Handle failures in fop_cbk gracefully
RCA:
Afr crashes when a last fop response fails and
'fop output' arguments are NULL. Afr does not handle
these gracefully.

Fix:
Changed the fops to not access the 'fop output' arguments
in case of failures.
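
The pattern can be sketched as below. `stat_buf_t`, `result_t` and `example_fop_cbk` are hypothetical names that only illustrate the ordering of the checks; they are not the afr callback signatures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a fop output argument. */
typedef struct {
    unsigned long ia_size;
} stat_buf_t;

/* Hypothetical aggregated result of the fop. */
typedef struct {
    int           op_ret;
    int           op_errno;
    unsigned long size;
} result_t;

/* Record the status first, and touch the output argument only when
 * the fop succeeded: on failure 'buf' may legitimately be NULL. */
void example_fop_cbk(result_t *res, int op_ret, int op_errno,
                     const stat_buf_t *buf)
{
    res->op_ret = op_ret;
    res->op_errno = op_errno;

    if (op_ret < 0)
        return;

    res->size = buf->ia_size;
}
```

This mirrors the test described below it, where the last response is failed with op_ret -1, op_errno ENOMEM and NULL outputs.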

Tests:
Changed afr wind_cbk code to fail the last response by setting
op_ret as -1 and op_errno as ENOMEM and setting all other output
variables as NULL to test the change. Removed the code to verify
success cases. No crashes or errors seen.

Change-Id: Iad9bc54db093a162f85bfb8dbeeda5b95acd21d8
BUG: 844689
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.com/3760
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-07-31 23:47:23 -07:00
Pranith Kumar K
b21395aee3 cluster/afr: update loc inode after inode_link
RCA:
inode passed to inode_link is not assigned any gfid if the
inode with that gfid is already linked, so loc for opendir
does not have a valid inode

Fix:
Use the linked_inode returned by inode_link in the loc to
perform further operations on the entry.

Tests:
Checked that opendir comes with an loc with valid inode.
Checked that re-opendir happens successfully. Tested index,
full self-heal work fine with the fix.

Change-Id: Idf4ced4cc2320133744962059d363e373af0e5ec
BUG: 826580
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.com/3748
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-07-31 11:25:39 -07:00
Brian Foster
879c01087d cluster/stripe: handle short writes and errors in writev callback
cluster/stripe write callback handling is broken in the event of
server side errors and short writes due to crudely summing up the
return values from each node. This can produce incorrect results
or cause an application to rewrite the wrong portions of a buffer
in an attempt to handle this condition.

Modify cluster/stripe writev handling to record the requested size
of each write and use this data to return the number of consecutive
bytes written from the original request. This allows an application
to retry a write at the point of error (and potentially consume
said error).
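
The accounting described above can be sketched as follows. This is an illustration only: `consecutive_bytes_written` and its array arguments are hypothetical, and returning the first error when nothing at all was written is an assumption about the error convention.

```c
/* requested[i]: bytes asked of the i-th chunk, in request order.
 * returned[i]:  bytes that chunk's node reported as written, or a
 *               negative value on error.
 * Returns the number of consecutive bytes written from the start of
 * the original request. Counting stops at the first short write or
 * error, so the caller can retry from exactly that offset. If the
 * very first chunk failed, its error value is returned instead. */
long consecutive_bytes_written(const long *requested,
                               const long *returned, int count)
{
    long total = 0;

    for (int i = 0; i < count; i++) {
        if (returned[i] < 0)
            return total > 0 ? total : returned[i];

        total += returned[i];

        if (returned[i] < requested[i])
            break;  /* short write: later chunks are not contiguous */
    }
    return total;
}
```

For three 128-byte chunks where the second node writes only 64 bytes, this yields 192, whereas summing all return values would report 320.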

BUG: 809975
Change-Id: Ic35cb1e092c29545205aa32e352485c507534ce0
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.com/3700
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shishir Gowda <sgowda@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-07-30 11:39:03 -07:00
Pranith Kumar K
f153c83580 cluster/afr: Modified split-brain handling
RCA
The bug is observed because the decision to mark
a file in split-brain is taken outside appropriate locks.
Lookup gathers xattrs outside any lock. The xattrs being
in split-brain in lookup should only be taken as a hint.
Appropriate inodelks should be taken before confirming
a split-brain. Self-heal confirms this at the moment.
If data/metadata self-heal is turned off, inspecting of
xattrs could not be performed so split-brain behavior
does not work correctly if the self-heal options are turned off.

Fix
Self-heals are launched to inspect xattrs even when the
data/metadata self-heal options are turned off. The decision
to heal data/metadata after the xattrs are inspected is based
on whether the options are turned on/off. So decision to set/reset
split-brain flag is taken inside appropriate locks.

Testcases:
tests 33-36 in
https://github.com/pranithk/gluster-tests/blob/master/afr/self-heal.sh

Change-Id: Ia8aeab08208b50c06609ad35a9d72f3d553ee343
BUG: 833727
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.com/3626
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-07-26 10:14:54 -07:00
Pranith Kumar K
c2a7a22bfe cluster/afr: Filter O_TRUNC in afr-fix-open
RCA:
When open was done while a brick was down, afr opens the file after
the brick comes back up. If this happens after the self-heal on the file
is completed by self-heald etc, the file will end up in truncated state.

Fix:
Filter O_TRUNC during afr-fix-open because afr_open turns O_TRUNC
into a truncate transaction, so there will be a pending changelog for
the subvolume on which open fails.
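
The flag filtering described in the fix can be sketched as below. This is a
minimal illustration only: fix_open_flags is a hypothetical helper name, not
a function from the afr source, and it masks only the O_TRUNC flag that the
message mentions.

```c
#include <assert.h>
#include <fcntl.h>

/* Hypothetical helper: compute the flags to use when re-opening a file on
 * a brick that was down during the original open. O_TRUNC is dropped so a
 * late re-open cannot truncate a file that has already been healed. */
int fix_open_flags(int original_flags)
{
    int flags = original_flags & ~O_TRUNC;

    assert((flags & O_TRUNC) == 0);
    return flags;
}
```

All other flags pass through unchanged, so the re-opened fd keeps the access
mode of the original open.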

Testing:
Had to simulate the race by stopping fix-open until self-heald completes
self-heal on the file after the brick comes online.

Change-Id: I32759cc37f4bb34f206d01606a279f17b246dba4
BUG: 841840
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.com/3705
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-07-25 17:45:22 -07:00
Brian Foster
34d395fc16 cluster: fix crash on link of named pipe in stripe/replicate vol
A crash occurs when attempting to link a named pipe on a striped,
replicated volume. The cause for this crash is attempting to deref
a NULL inode pointer in stripe_link_cbk(). The RCA for this bug
uncovered a couple of problems:

- AFR ignores the inode pointer it receives on failure (returning
  NULL).
- stripe assumes the inode pointer is valid on failure.

Either one of these changes addresses the crash, but this patch
includes both changes. AFR is modified to pass along the inode
pointer it receives (which could still be NULL). stripe is
modified to not assume the inode pointer is valid on fop failure.
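
The defensive check on the stripe side can be sketched as follows. The
types and names here are hypothetical stand-ins, not the real
stripe_link_cbk() signature; the point is only that the callback looks at
the fop result before touching the inode pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; not the GlusterFS inode_t. */
struct inode_sketch {
    int nlink;
};

/* Hypothetical callback in the upper layer. On fop failure the inode
 * pointer handed up by the lower layer may be NULL, so it must not be
 * dereferenced. Returns 0 on success, -1 on failure. */
int upper_link_cbk(int op_ret, struct inode_sketch *inode, int *nlink_out)
{
    assert(nlink_out != NULL);

    if (op_ret < 0 || inode == NULL)
        return -1;              /* failure path: never touch inode */

    *nlink_out = inode->nlink;  /* safe: success and non-NULL inode */
    return 0;
}
```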

BUG: 842825
Change-Id: I9cb2cc918552620929c3ecbd69bc66d4635eafdc
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.com/3727
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-07-25 15:03:57 -07:00
Pranith Kumar K
75ee490213 cluster/afr: Perform data self-heal for non regular files
RCA:
Data self-heal for non-regular files opens the file
and then proceeds using that fd. This approach
does not work for symlinks because open on a symlink opens
the file it resolves to.

Fix:
If the file is not a regular file then perform self-heal using
loc. It needs to take the 'big' lock, perform a lookup to get the
changelog, erase the data part of the changelog, and then unlock.

Test cases:
Automated at
https://github.com/pranithk/gluster-tests/blob/master/afr/special-file-self-heal-test.sh

Change-Id: I924a922f5135872efe2cccf2e712ada082c5689f
BUG: 811317
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.com/3724
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-07-25 15:02:39 -07:00
Brian Foster
787d52d8e8 cluster/stripe: don't fail if no fctx on a non-regular file
cluster/stripe broke directory rename. Only check for fctx on regular
files.

BUG: 842652
Change-Id: I8a1e7ff30d57c994082cb10471f610023713ee53
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.com/3720
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-07-25 14:55:06 -07:00
shishir gowda
7cce8c843e cluster/distribute: Suppress user xattr mismatch log message
Changing the log-level to DEBUG.
Xattr mismatch can occur when parallel setxattrs race, or when
one of the bricks was down. A subsequent setxattr will fix the
condition when all the subvols are up. In this case, the 'user.swift'
xattr used by ufo was out of sync, but did not cause any other error.

Change-Id: I6fdff78869b8ff72c305bbe122033e6c1d9d3cff
BUG: 838197
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.com/3722
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Mohammed Junaid <junaid@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2012-07-25 08:20:35 -07:00
Brian Foster
d6f88e9edb afr: pass back xdata in create
A striped, replicated volume spits an error on file creation because
stripe requires xdata to process stripe information and AFR isn't
passing it back.

This fix was suggested by Amar Tumballi.

BUG: 842373
Change-Id: Ia7063590ca5e873d4a4e155989cf067e8a07501f
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.com/3713
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
2012-07-23 11:48:29 -07:00
Amar Tumballi
f2c110aa4a stripe: filter coalesce key in getxattr()/listxattr()
As 'stripe-coalesce' is an internal key, there is no need to show it
on the mount-point.
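
For illustration, hiding one name from a listxattr(2)-style buffer (names
separated by NUL bytes) could look like the sketch below. This is a
standalone example under that assumption; the function name is hypothetical
and the translator itself may filter the key differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove every entry equal to `key` from a buffer of NUL-separated xattr
 * names, in place. Returns the new length of the buffer in bytes. */
size_t filter_xattr_name(char *list, size_t len, const char *key)
{
    size_t in = 0, out = 0;
    size_t key_len;

    assert(list != NULL && key != NULL);
    key_len = strlen(key);

    while (in < len) {
        const char *nul = memchr(list + in, '\0', len - in);
        size_t name_len = nul ? (size_t)(nul - (list + in)) : len - in;
        size_t entry_len = nul ? name_len + 1 : name_len; /* keep the NUL */

        if (name_len != key_len || memcmp(list + in, key, key_len) != 0) {
            memmove(list + out, list + in, entry_len);    /* keep this name */
            out += entry_len;
        }
        in += entry_len;
    }
    return out;
}
```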

Change-Id: Iab836e73d59c42774db8a2eee13fe3b0cd994bc9
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 801887
Reviewed-on: http://review.gluster.com/3680
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shishir Gowda <sgowda@redhat.com>
2012-07-17 23:05:31 -07:00