1190 Commits

Author SHA1 Message Date
Kaleb S KEITHLEY
1d0a0d180b core: use syscall wrappers instead of direct syscalls - tail
tail, as in dog chasing its tail. These are the unwrapped
syscalls that have crept in (or were missed) in the previous
patches.

various xlators and other components are invoking system calls
directly instead of using the libglusterfs/syscall.[ch] wrappers.

If not using the system call wrappers there should be a comment
in the source explaining why the wrapper isn't used.

Change-Id: If183487de92fc7cbc47d4c5aa3f3e80eae50b84f
BUG: 1267967
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/12589
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
2015-11-16 15:20:17 -08:00
Kaleb S. KEITHLEY
2099cc875a core: use syscall wrappers instead of direct syscalls - libglusterfs
various xlators and other components are invoking system calls
directly instead of using the libglusterfs/syscall.[ch] wrappers.

If not using the system call wrappers there should be a comment
in the source explaining why the wrapper isn't used.

Change-Id: Ieeca2d36adbc884e4cfa0026dba40df70310d40b
BUG: 1267967
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/12275
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
2015-11-16 06:18:10 -08:00
Sakshi Bansal
3f0c70f2d5 afr: replica pair going offline does not require CHILD_MODIFIED event
As a part of CHILD_MODIFIED event DHT forgets the current layout and
performs fresh lookup. However this is not required when a replica pair
goes offline as the xattrs can be read from other replica pairs. Hence
setting different event to handle replica pair going down.

Change-Id: I5ede2a6398e63f34f89f9d3c9bc30598974402e3
BUG: 1281230
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/12573
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
2015-11-16 05:43:15 -08:00
Kaleb S. KEITHLEY
82e45b87e3 build: install and package header files more conventionally
The current way we install and package header files for the -devel
package is a hack. This patch uses more conventional autoconf, libtool,
and rpmbuild idioms to package -devel headers and libraries.

Change-Id: I63ffb3460f5c12b6b355493bd00824ac9e5354c5
BUG: 1271907
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/12360
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
2015-11-16 01:15:04 -08:00
Joseph Fernandes
d4aaa00d77 tier/ctr: Providing option to record or ignore metadata heat
Currently we heat up a file for both data and metadata write.
Here we provide a ctr xlator option called "ctr-record-metadata-heat"
were the admin can decide on recording metadata heat i.e heatup a
file on metadata writes or not.
Metadata data operation are
a. setattr: explicit changing of atime/mtime using utimes,
            changing of posix permissions of the file
b. rename: Renaming a file,
c. unlink, link: adding or deleting hardlinks
d. xattrs: setting or removal of xattrs.

NOTE: atime, mtime and ctime change through writev, readv, truncate, mknod
and create will not be considered here as these fops are data and primary
metadata fops.

Defaultly "ctr-record-metadata-heat" is off. Admin can
switch it on using gluster volume set command.

Change-Id: I91157509255dd5cb429cda2b6d4f64582e155e7b
BUG: 1279166
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/12540
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
2015-11-15 13:47:48 -08:00
Joseph Fernandes
3dadb0f4ef tier/libgfdb: Extending log level flexibity in libgfdb
Extending log level flexibity to relevant fops and operations
This is an extension to
http://review.gluster.org/#/c/12491/

Change-Id: I33b2f7732f89f52569fb99baa692c7e3bb4c7ab1
BUG: 1277352
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/12567
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
2015-11-11 06:21:02 -08:00
vmallika
2794cb71b9 marker: do remove xattr only for last link
With unlink, rename, rmdir, contribution xattrs
are removed. If the file is a last link
then remove_xattr will fail with ENOENT.

So it better to perform remove_xattr
only if there are more links to the file

Change-Id: Ifc1e7fda4d310fd87f6f28a635c9ea78b8f3929d
BUG: 1257694
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/12033
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
2015-11-09 03:12:26 -08:00
Joseph Fernandes
bbaf4cdb42 tier/ctr: Ignore CTR Lookup heal insert errors
CTR doesnt read from the DB, so to make sure that file records are
created it does a heal during a lookup. It remembers the decision in
the inode context cache and retrys periodically. When the volume is
restarted it looses all the inode cache from the previous time and CTR
lookup heals tries the heal again, but this time it finds that the records
are already there from sql and logs this error, and remembers this until the
volume is restarted or inode is flushed out of inode cache of the brick.

Solution: the log levels should be reduced to trace for this case and
customers need not see this.

Change-Id: I67b568fb6904f8597e2c6d32894a247c4f500b94
BUG: 1277352
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/12491
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
2015-11-06 17:14:03 -08:00
Joseph Fernandes
1d0402b25a tier/libgfdb: Replacing ASCII query file with binary
Earlier, when the database was queried we used to save
all the queried records in an ASCII format in the query file.
This caused issues like filename having ASCII delimiter and used
to take a lot of space. The tier.c file also had a lot of parsing code.

Here we changed the format of the query file to binary.
All the logic of serialization and formating of query record is done
by libgfdb. Libgfdb provides API,
gfdb_write_query_record() and gfdb_read_query_record(),
which the user i.e tier migrator and CTR xlator can use to
write to and read from query file.
With this binary format we save on disk space i.e reduce to 50% atleast
as we are saving GFID's in binary format 16 bytes and not the string format
which takes 36 bytes + We are not saving path of the file + we are also saving on
ASCII delimiters.

The on disk format of query record is as follows,

+---------------------------------------------------------------------------+
| Length of serialized query record |       Serialized Query Record         |
+---------------------------------------------------------------------------+
             4 bytes                     Length of serialized query record
                                                      |
                                                      |
     -------------------------------------------------|
     |
     |
     V
   Serialized Query Record Format:
   +---------------------------------------------------------------------------+
   | GFID |  Link count   |  <LINK INFO>  |.....                      | FOOTER |
   +---------------------------------------------------------------------------+
     16 B        4 B         Link Length                                  4 B
                                |                                          |
                                |                                          |
   -----------------------------|                                          |
   |                                                                       |
   |                                                                       |
   V                                                                       |
   Each <Link Info> will be serialized as                                  |
   +-----------------------------------------------+                       |
   | PGID | BASE_NAME_LENGTH |      BASE_NAME      |                       |
   +-----------------------------------------------+                       |
     16 B       4 B             BASE_NAME_LENGTH                           |
                                                                           |
                                                                           |
   ------------------------------------------------------------------------|
   |
   |
   V
   FOOTER is a magic number 0xBAADF00D indicating the end of the record.
   This also serves as a serialized schema validator.

Change-Id: I9db7416fd421e118dd44eafab8b535caafe50d5a
BUG: 1272207
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/12354
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
2015-11-06 10:30:45 -08:00
vmallika
d90b87eed2 quota: add version to quota xattrs
When a quota is disable and the clean-up process terminated
without completely cleaning-up the quota xattrs.
Now when quota is enabled again, this can mess-up the accounting

A version number is suffixed for all quota xattrs and this version
number is specific to marker xaltor, i.e when quota xattrs are
requested by quotad/client marker will remove the version suffix in the
key before sending the response

Change-Id: I1ca2c11460645edba0f6b68db70d476d8d26e1eb
BUG: 1272411
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/12386
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
2015-11-02 00:54:35 -08:00
Richard Wareing
d3e496cbcd debug/io-stats: Add FOP sampling feature
Summary:
- Using sampling feature you can record details about every Nth FOP.
  The fields in each sample are: FOP type, hostname, uid, gid, FOP priority,
  port and time taken (latency) to fufill the request.
- Implemented using a ring buffer which is not (m/c) allocated in the IO path,
  this should make the sampling process pretty cheap.
- DNS resolution done @ dump time not @ sample time for performance w/
  cache
- Metrics can be used for both diagnostics, traffic/IO profiling as well
  as P95/P99 calculations
- To control this feature there are two new volume options:
  diagnostics.fop-sample-interval - The sampling interval, e.g. 1 means
  sample every FOP, 100 means sample every 100th FOP
  diagnostics.fop-sample-buf-size - The size (in bytes) of the ring
  buffer used to store the samples.  In the even more samples
  are collected in the stats dump interval than can be held in this buffer,
  the oldest samples shall be discarded.  Samples are stored in the log
  directory under /var/log/glusterfs/samples.
- Uses DNS cache written by sshreyas@fb.com (Thank-you!), the DNS cache
  TTL is controlled by the diagnostics.stats-dnscache-ttl-sec option
  and defaults to 24hrs.

Test Plan:
- Valgrind'd to ensure it's leak free
- Run prove test(s)
- Shadow testing on 100+ brick cluster

Change-Id: I9ee14c2fa18486b7efb38e59f70687249d3f96d8
BUG: 1271310
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/12210
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2015-11-01 09:14:34 -08:00
Mohamed Ashiq
323e71617f cli : 'gluster volume help' output sorted alphabetically
'gluster volume help' output is not sorted alphabetically.
This makes little harder for the user to search or get to know of
few gluster volume commands usage just from gluster cli.

Change-Id: I855da2e4748a5c2ff3be319c50fa9548d676ee8a
BUG: 1242894
Signed-off-by: Mohamed Ashiq <mliyazud@redhat.com>
Reviewed-on: http://review.gluster.org/11663
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-by: Anand Nekkunti <anekkunt@redhat.com>
2015-10-28 13:42:32 -07:00
Kaleb S. KEITHLEY
063d4ead61 core: use syscall wrappers instead of direct syscalls -- glusterd
various xlators and other components are invoking system calls
directly instead of using the libglusterfs/syscall.[ch] wrappers.

If not using the system call wrappers there should be a comment
in the source explaining why the wrapper isn't used.

Change-Id: I28bf2a5f7730b35914e7ab57fed91e1966b30073
BUG: 1267967
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/12379
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
2015-10-28 06:55:04 -07:00
Jeff Darcy
fb2a511493 libglusterfs: replace default functions with generated versions
Replacing repetitive code like this with code generated from a more
compact "canonical" definition carries several advantages.

 * Ease the process of adding new fops (e.g. GF_FOP_IPC).

 * Ease the process of making global changes to existing fops (e.g.
   adding "xdata").

 * Ensure strict consistency between all of the pieces that must be
   compatible with each other, through both kinds of changes.

What we have right now is just a start.  The above benefits will only
truly be realized when we use the same definitions to generate stubs,
syncops, and perhaps even parts of gfapi or glupy.

This same infrastructure can also be used to reduce code duplication and
potential for error in many of our translators.  NSR already uses a
similar technique, using a few hundred lines of templates to generate a
few *thousand* lines of code.  The ability to make a global "aspect"
change (e.g. to quorum checking) in one place instead of seventy has
already been demonstrated there.

Other candidates for code generation include the AFR/EC transaction
infrastructure, or stub creation/resumption in io-threads.

Change-Id: If7d59de7a088848b557f5aea00741b4fe19017c1
BUG: 1271325
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/9411
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2015-10-22 22:51:44 -07:00
Gaurav Kumar Garg
816ca94f5d libglusterfs: pass buffer size to gf_store_read_and_tokenize function
Previously if user set an option where length of key=value goes beyond
PATH_MAX (4096) character then tokenzing the option at the time of
reading configuration file will fail.
This is because of the we was having restraction in fgets to read maximum
of PATH_MAX (4096) length of character.
Consequence of this is when user try to restart glusterd, after setting
key=value length beyond PATH_MAX (4096) character, glusterd will not restart.

With this fix instead of PATH_MAX, consumer of gf_store_read_and_tokenize
function will decide the size of the buffer length.

Change-Id: I655a8ce982effdfff8f3e785ea31f543dbe39301
BUG: 1271150
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/12346
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Nekkunti <anekkunt@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
2015-10-14 13:42:30 -07:00
Joseph Fernandes
f6618acd4f tier/ctr: CTR DB named lookup heal of cold tier during attach tier
Heal hardlink in the db for already existing data in the cold
tier during attach tier. i.e during fix layout do lookup to files
in the cold tier.

CTR xlator on the  brick/server side does db update/insert of the hardlink on a namelookup.
Currently the namedlookup is done synchronous to the fixlayout that is
triggered by attach tier. This is not performant, adding more time to
fixlayout. The performant approach is record the hardlinks on a compressed
datastore and then do the namelookup asynchronously later, giving the ctr db
eventual consistency

Change-Id: I4ffc337fffe7d447804786851a9183a51b5044a9
BUG: 1252586
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/11828
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
2015-10-10 10:42:07 -07:00
Pranith Kumar K
fe3c6f0fa2 cluster/ec: Implement gfid-hash read-policy
Add a policy in ec to performs reads from same bricks as long as they
are good. Based on the gfid of the file/directory it determines the
bricks to be considered for reading.

Change-Id: Ic97b5c54c086a28b5e07a330a4fd448551b49376
BUG: 1261260
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/12133
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
2015-10-09 05:26:05 -07:00
Milind Changire
47d8d2fc9c gfapi: xattr key length check to avoid brick crash
Added check to test if xattr key length > max allowed for OS
distribution and return:
EINVAL if xattr name pointer is NULL or 0 length
ENAMETOOLONG if xattr name length > max allowed for distribution

Typically the VFS does this in the kernel for us.  But since we are
bypassing the VFS by providing the libgfapi to talk directly to the
brick process, we need to add such checks.

Change-Id: I610a8440871200ae4640351902b752777a3ec0c2
BUG: 1263056
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: http://review.gluster.org/12207
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
2015-10-09 05:15:06 -07:00
Joseph Fernandes
58d1a9be56 tier/ctr: Solution for db locks for tier migrator and ctr using sqlite version less than 3.7 i.e rhel 6.7
Problem: On RHEL 6.7, we have sqlite version 3.6.2 which doesnt support
WAL journaling mode, as this journaling mode is only available in sqlite 3.7 and above.
As a result we cannot have to progreses concurrently accessing sqlite, without
running into db locks! Well WAL is also need for performace on CTR side.

Solution: This solution is to use CTR db connection for doing queries when WAL mode is
absent. i,e tier migrator will send sync_op ipc calls to CTR, which in turn will
do the query and create/update the query file suggested by tier migrator.

Pending: Well this solution will stop the db locks but the performance is still an issue for CTR.
We are developing an in-Memory Transaction Log (iMeTaL) which will help boost the CTR
performance by doing in memory udpates on the IO path and later flush the updates to
the db in a batch/segment flush.

Change-Id: Ie3149643ded159234b5cc6aa6cf93b9022c2f124
BUG: 1240577
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/12191
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Luis Pabon <lpabon@redhat.com>
2015-10-08 12:00:31 -07:00
Richard Wareing
722ed51222 xlators: add JSON FOP statistics dumps every N seconds
Summary:
- Adds a thread to the io-stats translator which dumps out statistics
  every N seconds where N is configurable by an option called
  "diagnostics.stats-dump-interval"
- Thread cleanly starts/stops when translator is unloaded
- Updates macros to use "Atomic Builtins" (e.g. intel CPU extentions) to
  use memory barries to update counters vs using locks.  This should
  reduce overhead and prevent any deadlock bugs due to lock contention.

Test Plan:
- Test on development machine
- Run prove -v tests/basic/stats-dump.t

Change-Id: If071239d8fdc185e4e8fd527363cc042447a245d
BUG: 1266476
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/12209
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
2015-10-08 05:44:27 -07:00
Ashish Pandey
84e9a590df cluster/ec : Mark new entry changelog in entry self-heal
Problem :
When a new entry is created dirty mark xattrs are not
created this will need full heal to be performed, even
when there are partial failures.

Solution :
Marks new entry changelog in self-heal.

PS: Also fixed erasing of dirty markers when no data heal
is required.

BUG: 1254121
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Change-Id: I156e3d3201afa77efe118e1aaace1d91c90a9613
Reviewed-on: http://review.gluster.org/11938
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
2015-10-06 03:41:49 -07:00
Pranith Kumar K
059db0254f glusterd, dht: volume set for use-readdirp in dht
Change-Id: Icab246b1d02808864d878d949fa56f9f889b538a
BUG: 1265677
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/12221
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
2015-10-01 00:09:02 -07:00
Raghavendra Talur
007bfd9632 gfapi: transport and port are optional for glfs_set_volfile_server
Only server is the required argument for glfs_set_volfile_server
and both transport and port are optional. When glfs_set_volfile_server
is invocated multiple times, only on the first invocation we replace
port 0 with 24007 and transport NULL with "tcp".

Hence, replacing the parameters at the entry function is the right way.

Change-Id: If9f4a5f7fd9038eed140e2f47167a8fd11acc2f6
BUG: 1260561
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-on: http://review.gluster.org/12114
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
2015-09-28 04:55:08 -07:00
vmallika
ecfa2edd78 posix: xattrop 'GF_XATTROP_ADD_ARRAY_WITH_DEFAULT' implementation
Implementation of xattrop type:
GF_XATTROP_ADD_ARRAY_WITH_DEFAULT
GF_XATTROP_ADD_ARRAY64_WITH_DEFAULT

These operations are similar to 'GF_XATTROP_ADD_ARRAY',
except that it adds a default value if xattr is missing
or its value is zero on disk.

One use-case of this operation is in inode-quota.
When a new directory is created, its default dir_count
should be set to 1. So when a xattrop performed setting
inode-xattrs, it should account initial dir_count
1 if the xattrs are not present

Here is the usage of this operation

value required in xdata for each key
struct array {
    int32_t   newvalue_1;
    int32_t   newvalue_2;
    ...
    int32_t   newvalue_n;
    int32_t   default_1;
    int32_t   default_2;
    ...
    int32_t   default_n;
};

or

struct array {
    int32_t   value_1;
    int32_t   value_2;
    ...
    int32_t   value_n;
} data[2];
fill data[0] with new value to add
fill data[1] with default value

xattrop GF_XATTROP_ADD_ARRAY_WITH_DEFAULT
for i from 1 to n
{
    if (xattr (dest_i) is zero or not set in the disk)
        dest_i = newvalue_i + default_i
    else
        dest_i = dest_i + newvalue_i
}

value in xdata after xattrop is successful
struct array {
    int32_t   dest_1;
    int32_t   dest_2;
    ...
    int32_t   dest_n;
};

Change-Id: Ic6a08473e99fd98299a839d4d8416081a7534efd
BUG: 1243946
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/11702
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
2015-09-28 04:12:16 -07:00
Krutika Dhananjay
0428a0acf7 features/shard: Port log messages to new framework
Change-Id: Iac01e6a89a0d0c37a12a5e47f17f7ced85a31590
BUG: 1265516
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/12217
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
2015-09-27 23:08:36 -07:00
Dan Lambright
c90c03e9a2 cluster/tier fix bug with sql includes introduced by 12031
We accidentally introduced a bug where client translators have a
dependency on sql. This broke freebsd smoke tests. Fix is to
abstract from the client those dependencies.

Change-Id: I7152573a489bacc8f32e6eb139f9ff4408288f5b
BUG: 1260730
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/12155
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
2015-09-11 17:12:55 -07:00
Dan Lambright
d55a6fee5c cluster/tier: add gluster v tier <vol>
Currently the tier feature piggy backs off the rebalance command
syntax to obtain status and this is clumsy. Introduce a new
tier command that can do tier specific operations, starting
with volume status to display counters.

Old commands:
gluster volume attach-tier <vol> [replica count] {bricklist..}
gluster volume detach-tier <vol> {start|stop|commit}

New commands:
gluster volume tier <vol> attach [replica count] {bricklist} |
                          detach {start|stop|commit} |
                          status

Change-Id: Ic07b3c6260588162de7d34380f8cbd3d8a7f35d3
BUG: 1255693
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/11984
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
2015-09-09 14:56:36 -07:00
Joseph Fernandes
96af474045 tier/ctr: Solving DB Lock issue due to write contention from db connections
Problem: The DB on the brick is been accessed by CTR, for write and
tier migrator, for read and write. The write from tier migrator is reseting
the heat counters after a cycle. Since we are using sqlite, two connections
trying to write would cause a db lock contention. As a result CTR used to fail
to update the db.

Solution: Using the same db connection of CTR for reseting the heat counters.
1) Introducted a new IPC FOP for CTR
2) After the query do a ipc syncop to the underlying client xlator associated
   to the brick.
3) CTR in brick will catch the IPC FOP and cleat the heat counters.

Change-Id: I53306bfc08dcdba479deb4ccc154896521336150
BUG: 1260730
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/12031
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
2015-09-08 05:13:00 -07:00
Kaleb S. KEITHLEY
8b7af53cb8 build: Fix build on Mac OS X, boolean
bool and true conflict with clang macros in clang on Mac OS X,
possibly with newer (?) versions of clang on Linux

Change-Id: Ia8c56ae68b4ebffb99b0684ac72d68ec50eaa7fa
BUG: 1249391
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/11816
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
2015-09-01 07:23:09 -07:00
Mohamed Ashiq
7b39098376 logging : GF_LOG_NONE logs always
Shouldn't GF_LOG_NONE mean "Never log this"? If so, it's not being
tested for and is, instead, treated as a higher priority than
CRITICAL thus is always logged.

Change-Id: Icad1e02a720a05ff21bd54ebf19c0032e6d5ce03
BUG: 1246794
Signed-off-by: Mohamed Ashiq <mliyazud@redhat.com>
Reviewed-on: http://review.gluster.org/11797
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
2015-09-01 05:58:40 -07:00
Jeff Darcy
0773ca67fd all: reduce "inline" usage
There are three kinds of inline functions: plain inline, extern inline,
and static inline.  All three have been removed from .c files, except
those in "contrib" which aren't our problem.  Inlines in .h files, which
are overwhelmingly "static inline" already, have generally been left
alone.  Over time we should be able to "lower" these into .c files, but
that has to be done in a case-by-case fashion requiring more manual
effort.  This part was easy to do automatically without (as far as I can
tell) any ill effect.

In the process, several pieces of dead code were flagged by the
compiler, and were removed.

Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155
BUG: 1245331
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/11769
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2015-09-01 04:55:15 -07:00
Mohamed Ashiq
038dfe57cf libglusterfs:Porting log messages to new framework
Change-Id: I8625b7dc8941720cc7a864b8fddbcc7b4c485fcd
BUG: 1252836
Signed-off-by: Mohamed Ashiq <mliyazud@redhat.com>
Reviewed-on: http://review.gluster.org/11896
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
2015-09-01 01:53:32 -07:00
Krishnan Parthasarathi
5b3d64b734 event: remove vestigial mutex_unlock from event_dispatch
This was missed erroneously by self in http://review.gluster.org/12004

BUG: 1242421
Change-Id: Id6cb156fd584704486e49bc4c4a7bdd44a98679f
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/12044
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
2015-08-29 09:56:26 -07:00
Xavier Hernandez
368f96700e cluster/ec: Allow read fops to be processed in parallel
Currently ec only sends a single read request at a time for a given
inode. Since reads do not interfere between them, this patch allows
multiple concurrent read requests to be sent in parallel.

Change-Id: If853430482a71767823f39ea70ff89797019d46b
BUG: 1245689
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/11742
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
2015-08-29 03:55:28 -07:00
Pranith Kumar K
e55579bdb1 fd: Do fd_bind on successful open
- fd_unref should decrement fd->inode->fd_count only if it is present in the
inode's fd list.
- successful open/opendir should perform fd_bind.

Change-Id: I81dd04f330e2fee86369a6dc7147af44f3d49169
BUG: 1207735
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11044
Reviewed-by: Anoop C S <anoopcs@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
2015-08-28 10:26:17 -07:00
Krishnan Parthasarathi
dedf2bde88 event-epoll: Use pollers[] to check if event_pool_dispatch was called
BUG: 1242421
Change-Id: I1a0044653f15d33f89ffe16edc5baba40393dec3
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/12004
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
2015-08-28 10:14:01 -07:00
vmallika
6ad05cd4b7 posix: xattrop 'GF_XATTROP_GET_AND_SET' implementation
GF_XATTROP_GET_AND_SET stores the existing xattr
value in xdata and sets the new value

xattrop was reusing input xattr dict to set the results
instead of creating new dict.
This can be problem for server side xlators as the inout dict
will have the value changed.

Change-Id: I43369082e1d0090d211381181e9f3b9075b8e771
BUG: 1251454
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/11995
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
2015-08-27 06:44:16 -07:00
Krishnan Parthasarathi
d80537d8e5 event: add dispatched flag to know if event_dispatch was called
This is important for glusterfs processes that choose to reconfigure
no. of event-threads (a.k.a epoll worker-threads) before they call
event_dispatch on the event_pool. glusterd needs this today.

Change-Id: Ia8df3c958545324472262c555ed84b71797f002e
BUG: 1242421
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/11911
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
2015-08-21 05:03:06 -07:00
Raghavendra Bhat
53e57a62ab protocol/server: forget the inodes which got ENOENT in lookup
If a looked up object is removed from the backend, then upon getting a
revalidated lookup on that object ENOENT error is received. protocol/server
xlator handles it by removing dentry upon which ENOENT is received. But the
inode associated with it still remains in the inode table, and whoever does
nameless lookup on the gfid of that object will be able to do it successfully
despite the object being not present.

For handling this issue, upon getting ENOENT on a looked up entry in revalidate
lookups, protocol/server should forget the inode as well.

Though removing files directly from the backend is not allowed, in case of
objects corrupted due to bitrot and marked as bad by scrubber, objects are
removed directly from the backend in case of replicate volumes, so that the
object is healed from the good copy. For handling this, the inode of the bad
object removed from the backend should be forgotten. Otherwise, the inode which
knows the object it represents is bad, does not allow read/write operations
happening as part of self-heal.

Change-Id: I23b7a5bef919c98eea684aa1e977e317066cfc71
BUG: 1238188
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/11489
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
2015-08-20 22:02:45 -07:00
Dan Lambright
6e055b7d73 cluster/tier: fix 64 bit issue with sql query using times
We overflowed when converting seconds to usecs in preperation for
sql queries. The fix uses uint64_t throughout including subexpressions.

Change-Id: I59bdb742197400dede97f54735b52030920b0d19
BUG: 1231268
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/11885
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Joseph Fernandes
2015-08-13 04:38:53 -07:00
Krishnan Parthasarathi
f7668938cd rpc: add owner xlator argument to rpc_clnt_new
The @owner argument tells RPC layer the xlator that owns
the connection and to which xlator THIS needs be set during
network notifications like CONNECT and DISCONNECT.

Code paths that originate from the head of a (volume) graph and use
STACK_WIND ensure that the RPC local endpoint has the right xlator saved
in the frame of the call (callback pair). This guarantees that the
callback is executed in the right xlator context.

The client handshake process which includes fetching of brick ports from
glusterd, setting lk-version on the brick for the session, don't have
the correct xlator set in their frames. The problem lies with RPC
notifications. It doesn't have the provision to set THIS with the xlator
that is registered with the corresponding RPC programs. e.g,
RPC_CLNT_CONNECT event received by protocol/client doesn't have THIS set
to its xlator. This implies, call(-callbacks) originating from this
thread don't have the right xlator set too.

The fix would be to save the xlator registered with the RPC connection
during rpc_clnt_new. e.g, protocol/client's xlator would be saved with
the RPC connection that it 'owns'. RPC notifications such as CONNECT,
DISCONNECT, etc inherit THIS from the RPC connection's xlator.

Change-Id: I9dea2c35378c511d800ef58f7fa2ea5552f2c409
BUG: 1235582
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/11436
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
2015-08-12 23:12:50 -07:00
Joseph Fernandes
b5a98df634 tier/libgfdb : Setting Freq counters of un-selected files to zero
Change Time Recorder increments the write/read frequency counters
on a read or write of a file, if the "features.record-counters" is
"on". It is the responsibility of the tiering migrator to reset
these counters to zero for un-selected files to reset them to zero
as frequency counters are function of promotion/Demotion cycles.
If the counters are not set to zero then,

1) the counters may overflow in the DB
2) The file may be wrongly promoted or demoted.

This fix will reset the freq counters of un-selected files to zero
after promotion/demotion frequency.

Change-Id: Ideea2c76a52d421a7e67c37fb0c823f552b3da7a
BUG: 1242504
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/11648
Tested-by: Joseph Fernandes
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
2015-08-12 06:47:43 -07:00
Anusha Rao
64e6836ac8 features/changelog: Porting log messages to new logging framework
Change-Id: Ic7f842acca52908fd88e0796dc90b82650405b25
BUG: 1194640
Signed-off-by: Anusha Rao <anusha91rao@gmail.com>
Reviewed-on: http://review.gluster.org/10532
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
2015-08-11 22:36:44 -07:00
Soumya Koduri
3e05e2c699 logging: Stop using global xlator for log_buf allocations
This reverts commit 765849ee00f6661c9059122ff2346b03b224745f.

With http://review.gluster.org/#/c/10417/, struct mem_accnt
is no longer embedded in a xlator_t object, but instead is
allocated separately. Hence this workaround provided to avoid
crashes in logging infrastructure is no longer needed

BUG: 1243806
Change-Id: I1bc006172ebe5e417a5923f8be2f37a6277c5e5d
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/11811
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
2015-08-08 03:51:16 -07:00
Niels de Vos
64a5bf3749 fuse: add "resolve-gids" mount option to overcome 32-groups limit
Add a --resolve-gids commandline option to the glusterfs binary. This
option gets set when executing "mount -t glusterfs -o resolve-gids ...".

This option is most useful in combination with the "acl" mount option.
POSIX ACL permission checking is done on the FUSE-client side to improve
performance (in addition to the checking on the bricks).

The fuse-bridge reads /proc/$PID/status by default, and this file
contains maximum 32 groups. Any local (client-side) permission checking
that requires more than the first 32 groups will fail.

By enabling the "resolve-gids" option, the fuse-bridge will call
getgrouplist() to retrieve all the groups from the user accessing the
mountpoint. This is comparable to how "nfs.server-aux-gids" works.

Note that when a user belongs to more than ~93 groups, the volume option
server.manage-gids needs to be enabled too. Without this option, the
RPC-layer will need to reduce the number of groups to make them fit in
the RPC-header.

Change-Id: I7ede90d0e41bcf55755cced5747fa0fb1699edb2
BUG: 1246275
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11732
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
2015-08-05 04:53:12 -07:00
Emmanuel Dreyfus
28fc199d5d SSL improvements: ECDH, DH, CRL, and accessible options
- Introduce ssl.dh-param option to specify a file containinf DH parameters.
  If it is provided, EDH ciphers are available.

- Introduce ssl.ec-curve option to specify an elliptic curve name. If
  unspecified, ECDH ciphers are available using the prime256v1 curve.

- Introduce ssl.crl-path option to specify the directory where the
  CRL hash file can be found. Setting to NULL disable CRL checking,
  just like the default.

- Make all ssl.* options accessible through gluster volume set.

- In default cipher list, exclude weak ciphers instead of listing
  the strong ones.

- Enforce server cipher preference.

- introduce RPC_SET_OPT macro to factor repetitive code in glusterd-volgen.c

- Add ssl-ciphers.t test to check all the features touched by this change.

Change-Id: I7bfd433df6bbf176f4a58e770e06bcdbe22a101a
BUG: 1247152
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/11735
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
2015-08-05 04:51:43 -07:00
Gaurav Kumar Garg
e7f737e62a libglusterfs: write error handling when filesystem have no space left
When no space left on filesystem and user want to perform any operation
which result to change in /var/lib/glusterd/* files then glusterd is
failing to write on temporary file.

Fix is to handle error when write failed on temporary files due to no
space left on file system.

Change-Id: I79d595776b42580da35e1282b4a3cd7fe3d14afe
BUG: 1226829
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/11029
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Cedric Buissart
2015-07-31 07:12:39 -07:00
Anand Avati
ea90c92820 timer: fix race between gf_timer_call_cancel() and gf_timer_proc()
Change-Id: Ie264d3d591352e4a8ddaa90ae2174d9c552396f1
BUG: 1243187
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6459
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Poornima G <pgurusid@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
2015-07-29 04:17:25 -07:00
Kaleb KEITHLEY
2d85abedb2 Revert "Revert "core: avoid crashes in gf_msg dup-detection code""
This reverts commit ca67ac071c56a3bd6f2b2ba3a958f0305db50a3d.

Change-Id: Iba688b524c78b84aaa0992afa5ee8e549603d990
Reviewed-on: http://review.gluster.org/11777
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
2015-07-28 10:38:54 -07:00
Xavier Hernandez
8d915d196f cluster/ec: Minimize usage of EIO error
Change-Id: I82e245615419c2006a2d1b5e94ff0908d2f5e891
BUG: 1245276
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/11741
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
2015-07-28 04:12:17 -07:00