IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
bd_super is currently reset to NULL in kill_block_super() so we rely on previous
users of the block_device object to initialise this value for the next user.
This quirk was exposed on RHEL5 when a third party filesystem did not always use
kill_block_super() and therefore bd_super wasn't being reset when a block_device
object was recycled within the cache. This may not be a problem upstream but
makes sense to be defensive.
Signed-off-by: Lachlan McIlroy <lmcilroy@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
inode_lru_list_del() is expensive because of per superblock lru locking,
while some inodes are not in lru list.
Adding a check in iput_final() can speedup pipe/sockets workloads on
SMP.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Some inodes (pipes, sockets, ...) are not hashed, no need to take
contended inode_hash_lock at dismantle time.
nice speedup on SMP machines on socket intensive workloads.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Some inodes (pipes, sockets, ...) are not in bdi writeback list.
evict() can avoid calling inode_wb_list_del() and its expensive spinlock
by checking inode i_wb_list being empty or not.
At this point, no other cpu/user can concurrently manipulate this inode
i_wb_list
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Autofs may set the DCACHE_NEED_AUTOMOUNT flag on negative dentries. These
need attention from the automounter daemon regardless of the LOOKUP_FOLLOW flag.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
In btrfs we have 2 indexes for inodes. One is for readdir, it's in this nice
sequential order and works out brilliantly for readdir. However if you use ls,
it usually stat's each file it gets from readdir. This is where the second
index comes in, which is based on a hash of the name of the file. So then the
lookup has to lookup this index, and then lookup the inode. The index lookup is
going to be in random order (since its based on the name hash), which gives us
less than stellar performance. Since we know the inode location from the
readdir index, I create a dummy dentry and copy the location key into
dentry->d_fsdata. Then on lookup if we have d_fsdata we use that location to
lookup the inode, avoiding looking up the other directory index. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-and-acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Currently, we take a sb->s_active reference and a cifsFileInfo reference
when an oplock break workqueue job is queued. This is unnecessary and
more complicated than it needs to be. Also as Al points out,
deactivate_super has non-trivial locking implications so it's best to
avoid that if we can.
Instead, just cancel any pending oplock breaks for this filehandle
synchronously in cifsFileInfo_put after taking it off the lists.
That should ensure that this job doesn't outlive the structures it
depends on.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The recent fix to the above function causes this compiler warning to pop
on some gcc versions:
CC [M] fs/cifs/cifssmb.o
fs/cifs/cifssmb.c: In function ‘CIFSSMBQAllEAs’:
fs/cifs/cifssmb.c:5708: warning: ‘ea_name_len’ may be used uninitialized in
this function
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The code that matches EA names in CIFSSMBQAllEAs is incorrect. It
uses strncmp to do the comparison with the length limited to the
name_len sent in the response.
Problem: Suppose we're looking for an attribute named "foobar" and
have an attribute before it in the EA list named "foo". The
comparison will succeed since we're only looking at the first 3
characters. Fix this by also comparing the length of the provided
ea_name with the name_len in the response. If they're not equal then
it shouldn't match.
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Sniffing traffic on the wire shows that windows clients send a zeroed
out signature field in a NEGOTIATE request, and send "BSRSPYL" in the
signature field during SESSION_SETUP. Make the cifs client behave the
same way.
It doesn't seem to make much difference in any server that I've tested
against, but it's probably best to follow windows behavior as closely as
possible here.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Currently, we mirror the same size back to the server that it sends us.
That makes little sense. Instead we should be sending the server the
maximum buffer size that we can handle -- CIFSMaxBufSize minus the
4 byte RFC1001 header.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
For invalid extents, find other pages in the same fsblock and write them out.
[pnfsblock: write_begin]
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Note: When upper layer's read/write request cannot be fulfilled, the block
layout driver shouldn't silently mark the page as error. It should do
what can be done and leave the rest to the upper layer. To do so, we
should set rdata/wdata->res.count properly.
When upper layer re-send the read/write request to finish the rest
part of the request, pgbase is the position where we should start at.
[pnfsblock: bl_write_pagelist support functions]
[pnfsblock: bl_write_pagelist adjust for missing PG_USE_PNFS]
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: handle errors when read or write pagelist.]
Signed-off-by: Zhang Jingwang <yyalone@gmail.com>
[pnfs-block: use new write_pagelist api]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
[SQUASHME: pnfsblock: mds_offset is set in the generic layer]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[pnfsblock: mark IO error with NFS_LAYOUT_{RW|RO}_FAILED]
Signed-off-by: Peng Tao <peng_tao@emc.com>
[pnfsblock: SQUASHME: adjust to API change]
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: fixup blksize alignment in bl_setup_layoutcommit]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[pnfsblock: bl_write_pagelist adjust for missing PG_USE_PNFS]
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: handle errors when read or write pagelist.]
Signed-off-by: Zhang Jingwang <yyalone@gmail.com>
[pnfs-block: use new write_pagelist api]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Note: When upper layer's read/write request cannot be fulfilled, the block
layout driver shouldn't silently mark the page as error. It should do
what can be done and leave the rest to the upper layer. To do so, we
should set rdata/wdata->res.count properly.
When upper layer re-send the read/write request to finish the rest
part of the request, pgbase is the position where we should start at.
[pnfsblock: mark IO error with NFS_LAYOUT_{RW|RO}_FAILED]
Signed-off-by: Peng Tao <peng_tao@emc.com>
[pnfsblock: read path error handling]
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: handle errors when read or write pagelist.]
Signed-off-by: Zhang Jingwang <yyalone@gmail.com>
[pnfs-block: use new read_pagelist api]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In blocklayout driver. There are two things happening
while layoutcommit/cleanup.
1. the modified extents are encoded.
2. On cleanup the extents are put back on the layout rw
extents list, for reads.
In the new system where actual xdr encoding is done in
encode_layoutcommit() directly into xdr buffer, these are
the new commit stages:
1. On setup_layoutcommit, the range is adjusted as before
and a structure is allocated for communication with
bl_encode_layoutcommit && bl_cleanup_layoutcommit
(Generic layer provides a void-star to hang it on)
2. bl_encode_layoutcommit is called to do the actual
encoding directly into xdr. The commit-extent-list is not
freed and is stored on above structure.
FIXME: The code is not yet converted to the new XDR cleanup
3. On cleanup the commit-extent-list is put back by a call
to set_to_rw() as before, but with no need for XDR decoding
of the list as before. And the commit-extent-list is freed.
Finally allocated structure is freed.
[rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
Signed-off-by: Jim Rees <rees@umich.edu>
[pnfsblock: introduce bl_committing list]
Signed-off-by: Peng Tao <peng_tao@emc.com>
[pnfsblock: SQUASHME: adjust to API change]
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[blocklayout: encode_layoutcommit implementation]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[pnfsblock: fix bug setting up layoutcommit.]
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
[pnfsblock: cleanup_layoutcommit wants a status parameter]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In blocklayout driver. There are two things happening
while layoutcommit/cleanup.
1. the modified extents are encoded.
2. On cleanup the extents are put back on the layout rw
extents list, for reads.
In the new system where actual xdr encoding is done in
encode_layoutcommit() directly into xdr buffer, these are
the new commit stages:
1. On setup_layoutcommit, the range is adjusted as before
and a structure is allocated for communication with
bl_encode_layoutcommit && bl_cleanup_layoutcommit
(Generic layer provides a void-star to hang it on)
2. bl_encode_layoutcommit is called to do the actual
encoding directly into xdr. The commit-extent-list is not
freed and is stored on above structure.
FIXME: The code is not yet converted to the new XDR cleanup
3. On cleanup the commit-extent-list is put back by a call
to set_to_rw() as before, but with no need for XDR decoding
of the list as before. And the commit-extent-list is freed.
Finally allocated structure is freed.
[rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
[pnfsblock: get rid of deprecated xdr macros]
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[blocklayout: encode_layoutcommit implementation]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[pnfsblock: fix bug setting up layoutcommit.]
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
[pnfsblock: prevent commit list corruption]
[pnfsblock: fix layoutcommit with an empty opaque]
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Adds working implementations of various support functions
to handle INVAL extents, needed by writes, such as
bl_mark_sectors_init and bl_is_sector_init.
[pnfsblock: fix 64-bit compiler warnings for extent manipulation]
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[Implement release_inval_marks]
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Implement bl_find_get_extent(), one of the core extent manipulation
routines.
[pnfsblock: Lookup list entry of layouts and tags in reverse order]
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Jim Rees <rees@umich.edu>
pnfsblock: fix print format warnings for sector_t and size_t
gcc spews warnings about these on x86_64, e.g.:
fs/nfs/blocklayout/blocklayout.c:74: warning: format ‘%Lu’ expects type ‘long long unsigned int’, but argument 2 has type ‘sector_t’
fs/nfs/blocklayout/blocklayout.c:388: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘size_t’
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
XDR decodes the block layout payload sent in LAYOUTGET result, storing
the result in an extent list.
[pnfsblock: get rid of deprecated xdr macros]
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: fix bug getting pnfs_layout_type in translate_devid().]
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Call GETDEVICELIST during mount, then call and parse GETDEVICEINFO
for each device returned.
[pnfsblock: get rid of deprecated xdr macros]
Signed-off-by: Jim Rees <rees@umich.edu>
[pnfsblock: fix pnfs_deviceid references]
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: fix print format warnings for sector_t and size_t]
[pnfs-block: #include <linux/vmalloc.h>]
[pnfsblock: no PNFS_NFS_SERVER]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfsblock: fix bug determining size of striped volume]
[pnfsblock: fix oops when using multiple devices]
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[pnfsblock: get rid of vmap and deviceid->area structure]
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Replace a stub, so that extents underlying the layouts are properly
added, merged, or ignored as necessary.
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: delete the new node before put it]
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This gives layout driver a chance to cleanup structures they put in at
encode_layoutcommit.
Signed-off-by: Andy Adamson <andros@netapp.com>
[fixup layout header pointer for layoutcommit]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
To allow layout driver to issue getdevicelist at mount time, and clean up
at umount time.
[fixup non NFS_V4_1 set_pnfs_layoutdriver definition]
[pnfs: pass mntfh down the init_pnfs path]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Using NFS4_MAX_UINT64 will break current protocol.
[Needed in v3.0]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
There can be multiple lseg per file, so layoutcommit should be
able to handle it.
[Needed in v3.0]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
No need to save it for every lseg.
No need to save it at every pnfs_set_layoutcommit.
[Needed in v3.0]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
No need to save it for every lseg.
[Needed in v3.0]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This fixes a soft lockup on conditions
a) the flusher is working on a work by __bdi_start_writeback(), while
b) someone else calls writeback_inodes_sb*() or sync_inodes_sb(), which
grab sb->s_umount and enqueue a new work for the flusher to execute
The s_umount grabbed by (b) will fail the grab_super_passive() in (a).
Then if the inode is requeued, wb_writeback() will busy retry on it.
As a result, wb_writeback() loops for ever without releasing
wb->list_lock, which further blocks other tasks.
Fix the busy loop by redirtying the inode. This may undesirably delay
the writeback of the inode, however most likely it will be picked up
soon by the queued work by writeback_inodes_sb*(), sync_inodes_sb() or
even writeback_inodes_wb().
bug url: http://www.spinics.net/lists/linux-fsdevel/msg47292.html
Reported-by: Christoph Hellwig <hch@infradead.org>
Tested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Print out the name of the file that triggers the cookie loop message to
make it slightly easier to track down the cause.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the directory contents change, then we have to accept that the
file->f_pos value may shrink if we do a 'search-by-cookie'. In that
case, we should turn off the loop detection and let the NFS client
try to recover.
The patch also fixes a second loop detection bug by ensuring
that after turning on the ctx->duped flag, we read at least one new
cookie into ctx->dir_cookie before attempting to match with
ctx->dup_cookie.
Reported-by: Petr Vandrovec <petr@vandrovec.name>
Cc: stable@kernel.org [2.6.39+]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We added some more error handling in b40971426a "ext4: add error
checking to calls to ext4_handle_dirty_metadata()". But we need to
call kfree() as well to avoid a memory leak.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Fix problems if fsync() races against a rename of a parent directory
as pointed out by Al Viro in his own inimitable way:
>While we are at it, could somebody please explain what the hell is ext4
>doing in
>static int ext4_sync_parent(struct inode *inode)
>{
> struct writeback_control wbc;
> struct dentry *dentry = NULL;
> int ret = 0;
>
> while (inode && ext4_test_inode_state(inode, EXT4_STATE_NEWENTRY)) {
> ext4_clear_inode_state(inode, EXT4_STATE_NEWENTRY);
> dentry = list_entry(inode->i_dentry.next,
> struct dentry, d_alias);
> if (!dentry || !dentry->d_parent || !dentry->d_parent->d_inode)
> break;
> inode = dentry->d_parent->d_inode;
> ret = sync_mapping_buffers(inode->i_mapping);
> ...
>Note that dentry obviously can't be NULL there. dentry->d_parent is never
>NULL. And dentry->d_parent would better not be negative, for crying out
>loud! What's worse, there's no guarantees that dentry->d_parent will
>remain our parent over that sync_mapping_buffers() *and* that inode won't
>just be freed under us (after rename() and memory pressure leading to
>eviction of what used to be our dentry->d_parent)......
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
ecryptfs: Make inode bdi consistent with superblock bdi
eCryptfs: Unlock keys needed by ecryptfsd
When commit 4e34e719e457 ("fs: take the ACL checks to common code")
changed the xyz_check_acl() functions into the more natural
xyz_get_acl() interface, we grew two copies of the
#define ext2_get_acl NULL
define for the non-acl case.
Remove the extra one.
Reported-by: Marco Stornelli <marco.stornelli@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>