IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
gluster volume set <VOLUME> group <GROUP> is used for setting multiple
pre-defined volume options, but this was undocumented. This patch doc-
ments this feature.
fixes: bz#1243991
Change-Id: Id346cf2537f85179caff32479f09555ce2e72e76
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
Scenarios tested:
* Upgrade the node when there are stripe / tiering and regular
type of volumes are present.
- All volumes are started fine (as the change was not on brick volfile)
- For tier, the functionality may not even work, as changetimerecorder
is not present.
- 'gluster volume info' properly shows as 'NOT SUPPORTED' for stripe and
tier type of volume.
* Upgrade in a rolling upgrade scenario, where an old version is
able to connect to higher master.
- on a normal volume, if the volfile-server was new, the newer client
volfiles needed to have utime xlator conditionally.
- with this one change, all other changes seem to work fine.
Change-Id: Ib2d3b69dafa02b2c695a735b13c1aa70aba07cb8
updates: bz#1635688
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Fuse sets a random gfid-req value for a fresh lookup. Posix
lookup will set this gfid on entries with missing gfids causing
a GFID mismatch for directories.
DHT will now ignore the Fuse provided gfid-req and use the GFID
returned from other subvols to heal the missing gfid.
Change-Id: I5f541978808f246ba4542564251e341ec490db14
fixes: bz#1670259
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Auto invalidation is necessary when same (meta)data is shared/access
across multiple mounts. However, if (meta)data is not shared, all
relevant I/O goes through the cache of single mount and hence is
coherent with (meta)data on bricks always. So, fuse-auto-invalidation
can be disabled for this case which gives a huge performance boost for
workloads that write data and then immediately read the data they just
wrote.
From glusterfs --help,
<snip>
--auto-invalidation[=BOOL] controls whether fuse-kernel can
auto-invalidate attribute, dentry and page-cache.
Disable this only if same files/directories are
not accessed across two different mounts
concurrently [default: "on"]
</snip>
Details on how disabling auto-invalidation helped to reduce pgbench
init times can be found at [1]. Time taken for pgbench init of scale
8000 was 8340s. That will be an improvement of 86% (59280s vs 8340s)
with auto-invalidations turned off along with other
optimizations. Just disabling auto-invalidation contributed 56%
improvement by reducing the total time taken by 33260s.
[1] https://www.spinics.net/lists/gluster-devel/msg25907.html
Change-Id: I0ed730dba9064bd9c576ad1800170a21e100e1ce
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
updates: bz#1664934
This patch creates a specific function to set the thread name using a
string format and a variable argument list, like printf().
This function is used to set the thread name from gf_thread_create(),
which now accepts a variable argument list to create the full name. It's
not necessary anymore to use a local array to build the name of the
thread. This is done automatically.
Change-Id: Idd8d01fd462c227359b96e98699f8c6d962dc17c
Updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
If a fop to create an entry fails on one of the data brick,
we mark the pending changelog on the entry on brick for which
it was successful. This is done as part of post op phase to
make sure that entry gets healed even if it gets renamed to
some other path where its parent was not marked as bad.
As it happens as part of post op, we should consider thin-arbiter
to check if the brick, which was successful, is the good brick or not.
This will avoide split brain and other issues.
Change-Id: I12686675be98f02f70a5186b3ed748c541514d53
updates: bz#1662264
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Rebalance sets the sgid and t bits on a file
that is being migrated. These permissions are
not removed in dht_readdirp_cbk when listing files
causing them to show up on the mountpoint.
We now remove these permissions if a non-linkto
file has the linkto xattr set.
Change-Id: I5c69b2ecfe2df804fe50faea903b242d01729596
fixes: bz#1669937
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Too many logs get printed if dict_ref() and dict_unref() are passed NULL
pointer.
fixes: bz#1671213
Change-Id: I18afd849d64318f68baa7b549ee310dac0e1e786
Signed-off-by: Milind Changire <mchangir@redhat.com>
A call to gf_backtrace_save() was done on each context switch of a
synctask. The backtrace is generated writing to the filesystem, so it
can have an important impact on latency.
The generated backtrace was not used anywhere, so it's been removed.
Change-Id: I399a93b932c5b6e981c696c72c3e1ef44710ba52
Updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Problem: Avoid thread creation for bitrot-stub
for a volume if feature is not enabled
Solution: Before thread creation check the flag if feature
is enabled
Updates: #475
Change-Id: I2c6cc35bba142d4b418cc986ada588e558512c8e
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Problem: When rpc-transport-disconnect happens, server_connection_cleanup_flush_cbk()
is supposed to call rpc_transport_unref() after open-files on
that transport are flushed per transport.But open-fd-count is
maintained in bound_xl->fd_count, which can be incremented/decremented
cumulatively in server_connection_cleanup() by all transport
disconnect paths. So instead of rpc_transport_unref() happening
per transport, it ends up doing it only once after all the files
on all the transports for the brick are flushed leading to
rpc-leaks.
Solution: To avoid races maintain fd_cnt at client instead of maintaining
on brick
Credits: Pranith Kumar Karampuri
Change-Id: I6e8ea37a61f82d9aefb227c5b3ab57a7a36850e6
fixes: bz#1668190
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
...when ctime is zero. ia_type and ia_gfid always need to be non-zero
for things to work correctly.
Problem:
Commit c9bde30212 zeroed out the iatt
buffer in the cbks of modification fops before unwinding if the ctime in
the buffer was zero. This was causing the fops to fail: noticeable when
AFR's 'consistent-metadata' option was enabled. (AFR zeros out the ctime
when the option is set. See commit
4c4624c9ba).
Fixes:
-Do not zero out the ia_type and ia_gfid of the iatt buff under any
circumstance.
-Also, fixed _rda_inode_ctx_update_iatts() to always update these values from
the incoming buf when ctime is zero. Otherwise we end up with zero
ia_type and ia_gfid the first time the function is called *and* the
incoming buf has ctime set to zero.
fixes: bz#1670253
Reported-By:Michael Hanselmann <public@hansmi.ch>
Change-Id: Ib72228892d42c3513c19fc6dfb543f2aa3489eca
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
With the feature enabled, some of the performance testing results,
specially those which create millions of small files, got approximately
4x regression compared to version before enabling this.
On master without this patch: 765 creates/sec
On master with this patch : 3380 creates/sec
Also there seems to be regression caused by this in 'ls -l' workload.
On master without this patch: 3030 files/sec
On master with this patch : 16610 files/sec
This is a feature added to handle multiple clients parallely operating
(specially those which race for file creates with same name) on a single
namespace/directory. Considering that is < 3% of Gluster's usecase right
now, it makes sense to disable the feature by default, so we don't
penalize the default users who doesn't bother about this usecase.
Also note that the client side translators, specially, distribute,
replicate and disperse already handle the issue upto 99.5% of the cases
without SDFS, so it makes sense to keep the feature disabled by default.
Credits: Shyamsunder <srangana@redhat.com> for running the tests and
getting the numbers.
Change-Id: Iec49ce1d82e621e9db25eb633fcb1d932e74f4fc
Updates: bz#1670031
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Mostly, unlock before logging.
In some cases, moved different code that was not needed
to be under lock (for example, taking time, or malloc'ing)
to be executed before taking the lock.
Note: logging might be slightly less accurate in order, since it may
not be done now under the lock, so order of logs is racy. I think
it's a reasonable compromise.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
Found an issue on concurrent invoke of event handler to the same socket
fd, causing memory corruption. This issue arises after applying commit
"socket: Remove redundant in_lock in incoming message handling" that
removes priv->in_lock to serialize socket read.
The following call sequence describes how concurrent socket event handle
happens.
thread 1 thread 2 thread 3
epoll_wait() return
(slot->in_handler is 0) call select_on_epoll()
and epoll_ctl() on fd
epoll_wait() return
slot->in_handler++
(slot->in_handler is 1)
slot->in_handler++
(slot->in_handler is 2)
call handler() call handler()
Fix this issue by skip invoke of handler if there is already a handler
inprogress.
Change-Id: I437126ac772debcadb00993a948919c931cd607b
updates: bz#1467614
Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
PROBLEM:
Lot of the earlier changes in the management of shards in lru, fsync
lists assumed that if a given shard exists in fsync list, it must be
part of lru list as well. This was found to be not true.
Consider this - a file is FALLOCATE'd to a size which would make the
number of participant shards to be greater than the lru list size.
In this case, some of the resolved shards that are to participate in
this fop will be evicted from lru list to give way to the rest of the
shards. And once FALLOCATE completes, these shards are added to fsync
list but without a ref. After the fop completes, these shard inodes
are unref'd and destroyed while their inode ctxs are still part of
fsync list. Now when an FSYNC is called on the base file and the
fsync-list traversed, the client crashes due to illegal memory access.
FIX:
Hold a ref on the shard inode when adding to fsync list as well.
And unref under following conditions:
1. when the shard is evicted from lru list
2. when the base file is fsync'd
3. when the shards are deleted.
Change-Id: Iab460667d091b8388322f59b6cb27ce69299b1b2
fixes: bz#1669077
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
gf_dirent struct has d_type variable which should check
with DT_DIR istead of IA_IFDIR or IA_IFDIR has to compare
with entry->d_stat.ia_type
Change-Id: Idf1059ce2a590734bc5b6adaad73604d9a708804
updates: bz#1653359
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Problem: At the time of deleting block hosting volume
through heketi-cli , it is throwing an error "target is busy".
cli is throwing an error because brick is not detached successfully
and brick is not detached due to race condition to cleanp xprt
associated with detached brick
Solution: To avoid xprt specifc race condition introduce an atomic flag
on rpc_transport
Change-Id: Id4ff1fe8375a63be71fb3343f455190a1b8bb6d4
fixes: bz#1668190
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
so that we can understand more about process memory and thread consumptions
With this, we will also be able to understand more about the process details
with brick-mux.
updates: bz#1193929
Change-Id: I147a3e3814fc37dfb635217d0a0f0184fae0994f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
We had only changed the log level to DEBUG in release branch earlier.
But considering 90%+ of our deployments happen in same env, we can look
at these specific logs on need basis. With this change, the master
branch will be easier to debug with lesser logs.
Change-Id: I4157a7ec7d5ec9c2948b2bbc1e4cb8317f28d6b8
Updates: bz#1666833
Signed-off-by: Amar Tumballi <amarts@redhat.com>
This patch fixes a lock contention in reaadir-ahead xlator.
There are two issues, one is the processing of "." ".."
entry while holding an fd_ctx lock. The other one is destroying
the stack inside a fd_ctx lock.
Change-Id: Id0bf83a3d9fea6b40015b8d167525c59c6cfa25e
updates: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Updated full find to filter for files and directories.
--full --type f lists only the files,
--full --type d lists only the directories,
--full (by default) lists both files and directories.
fixes: #579
Change-Id: If2c91a21a131667d5de34635c1846013e8fa20b7
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
Problem: In commit 2261e444a4
wrong data type of nr_files was changed to dump
nr_files to statedump so build is failing on 32bit
environment
Solution: Change data type to avoid errors
Change-Id: Ifb4b19feda6e0e56d110b23351e7a0efd5bfa29b
fixes: bz#1657607
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
The value rsp.xdata.xdata_val was being freed twice. It was assigned
to dict->extra_stdfree, dict_destroy would free it and also there was
an explicit free. Getting rid of explicit free in this patch.
Change-Id: Ia9c73454bec3970b33f154fa754398bf3b045645
fixes: bz#1668268
Signed-off-by: Poornima G <pgurusid@redhat.com>
This patch helps enable IPv6 connections in the cluster.
The default address-family is IPv4 without using this option explicitly.
When address-family is set to "inet6" in the /etc/glusterfs/glusterd.vol
file, the mount command-line also needs to have
-o xlator-option="transport.address-family=inet6" added to it.
This option also gets added to the brick command-line.
Snapshot and gfapi use-cases should also use this option to pass in the
inet6 address-family.
Change-Id: I97db91021af27bacb6d7578e33ea4817f66d7270
fixes: bz#1635863
Signed-off-by: Milind Changire <mchangir@redhat.com>
Automatic Splitbrain with size as policy must
not resolve splitbrains when the copies are of same size.
Determining if the sizes of copies are same and
returning -1 in that case.
updates: bz#1655052
Change-Id: I3d8e8b4d7962b070ed16c3ee02a1e5a926fd5eab
Signed-off-by: Iraj Jamali <ijamali@redhat.com>
Event handler handles socket level error only, while protocol handler
handles in protocol level error. If protocol handler decides to
disconnect on error in any case, it should call disconnect instead of
return an error back to event handler.
Change-Id: I9375be98cc52cb969085333f3c7229a91207d1bd
updates: bz#1666143
Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
In the case socket read returns EAGAIN, positive value about remaining
vector to send is returned. This return value will be passed all the way
back to event handler, making it complains.
[2018-12-29 08:02:25.603199] T [socket.c:1640:__socket_read_simple_payload] 0-test-client-0-extra.0: partial read on non-blocking socket.
[2018-12-29 08:02:25.603201] T [rpc-clnt.c:654:rpc_clnt_reply_init] 0-test-client-2-extra.1: received rpc message (RPC XID: 0xfa6 Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 12) from rpc-transport (test-client-2-extra.1)
[2018-12-29 08:02:25.603207] T [socket.c:3129:socket_event_handler] 0-test-client-0-extra.0: (sock:32) socket_event_poll_in returned 1
Formerly, in socket_proto_state_machine, return value of socket_readv is
used to check if message is all read-in. In this commit, it is checked
whether size of bytes indicated in header are all read in. In this way,
only 0 and -1 will be returned from socket_proto_state_machine(),
indicating whether there is error in the underlying socket.
Change-Id: I8be0d178b049f0720d738a03aec41c4b375d2972
updates: bz#1666143
Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
There is a low level security issue with fencing since one client
can preempt another client's lock.
This patch does not completely eliminate the issue of a client
misbehaving, but certainly it adds a security layer for default use cases
that does not need fencing.
Change-Id: I55cd15f2ed1ae0f2556e3d27a2ef4bc10fdada1c
updates: #466
Signed-off-by: Susant Palai <spalai@redhat.com>
rm -rf <dir> fails on dirs which contain linkto files
that point to themselves because dht incorrectly thought
that they were cached files after looking them up.
The fix now treats them as invalid linkto files
and deletes them.
Change-Id: I376c72a5309714ee339c74485e02cfb4e29be643
fixes: bz#1667804
Signed-off-by: N Balachandran <nbalacha@redhat.com>
To avoid the warning and preparing for adding writesame support.
Updates: #617
Change-Id: I0710b1e4c240368a9bf52968bddc6e250ae2028d
Signed-off-by: Xiubo Li <xiubli@redhat.com>
In gluster get-state volumeoptions command there was some amount of leak
observed. This fix resolves the identified leaks.
Change-Id: Ibde5743d1136fa72c531d48bb1b0b5da0c0b82a1
fixes: bz#1667779
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
In automatic Splitbrain resolution when favorite child policy
is set as size, split brain resolution must not work for
directories.
Currently, if a directory is in split brain with both copies
having same size, the source is selected arbitrarily
and healed.
fixes: bz#1655050
Change-Id: I5739498639c17c89874cc577362e543adab55f5d
Signed-off-by: Sheetal Pamecha <sheetal.pamecha08@gmail.com>
When gluster command is fired without any arguments it just shows the
prompt, so added a welcome message and info to get help.
fixes: bz#1535528
Change-Id: I627b66b67443716e9270025c1e47b98b6facba13
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
1. cli_req.dict.dict_val,
It must be freed no metter operation error or success.
Fix it as lookup "alloca" memory before decode.
2. args.xdata.xdata_val,
It is allocated by "alloca", free is unneeded.
3. qd_nameless_lookup,
It olny needs gfid, a gfs3_lookup_req argument is unneeded.
Change-Id: I746dddf7f3d1465b1885af2644afe0bcf0a5665b
fixes: bz#1656682
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
Added functionality to gluster volume set auth.allow command to
accept CIDR IP addresses. Modified few functions to isolate cidr
feature so that it prevents other gluster commands such as peer
probe to use cidr format ip. The functions are modified in such
a way that they have an option to enable accepting of cidr
format for other gluster commands if required in furture.
updates: bz#1138841
Change-Id: Ie6734002a7078f1820e5df42d404411cce945e8b
Credits: Mohit Agrawal
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
design reference: https://review.gluster.org/#/c/glusterfs-specs/+/21925/
This patch adds the lock preempt support.
Note: The current model stores lock enforcement information as separate
xattr on disk. There is another effort going in parallel to store this
in stat(x) of the file. This patch is self sufficient to add fencing
support. Based on the availability of the stat(x) support either I will
rebase this patch or we can modify the necessary bits post merging this
patch.
Change-Id: If4a42f3e0afaee1f66cdb0360ad4e0c005b5b017
updates: #466
Signed-off-by: Susant Palai <spalai@redhat.com>
Problem: When geo-rep is configured as hybrid crawl
directory renames are not synced to the slave.
Solution: Rename sync of directory was failing due to incorrect
destination path calculation.
During check for existence on slave we miscalculated
realpath. <host:brickpath/dir>.
Change-Id: I23f1ea60e86a917598fe869d5d24f8da654d8a0a
fixes: bz#1665826
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
In the case socket write return with EAGAIN, the remaining vector count
is return all way back to event handler, making followup pollin event to
skip handling and dispatch loop complains about failure. Even thought
temporary write failure is not an error.
[2018-12-29 07:31:41.772310] E [MSGID: 101191] [event-epoll.c:674:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
Change-Id: Idf03d120b5f7619eda19720a583cbcc3e7da2504
updates: bz#1666143
Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
Problem: Some functions are not freeing memory allocated by
xdr_to_genric so it has become leak
Solution: Call free to avoid leak
Change-Id: I3524fe2831d1511d378a032f21467edae3850314
fixes: bz#1656682
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
Initially glfs_fsetattr and glfs_setattr, both functions accepted iatt as arguements
but now they accept stat and later in the function the stat is being converted to iatt
so that it can be passed to syncop_fsetattr/syncop_setattr.
Change-Id: I41a9e0124785a32ca19ef4d492c5ed5002e66ede
updates: #389
Signed-off-by: Arjun Sharma <arjsharm@redhat.com>
Problem: Sometime brick is getting crash at the time of handling
pmap signin request
Solution: glusterfs_mgmt_pamp_signin is using same frame to send
pmap signin request so to avoid crash send signin request
on separate frame
Change-Id: I443f854171ec4372e8d5f84bdc576c468e92c493
fixes: bz#1665656
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
Problem: In gluster code some of the places it call's get_new_dict
to create a dictionary without taking reference so at the time
of dict_unref it has become a leak
Solution: To resolve the same call dict_new instead of get_new_dict
updates bz#1650403
Change-Id: I3ccbbf5af07079a4fa09aad2cd0458c8625b2f06
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>