18160 Commits

Author SHA1 Message Date
Christoph Hellwig
e970a573ce nfsd: open a file descriptor for fsync in nfs4 recovery
Instead of just looking up a path use do_filp_open to get us a file
structure for the nfs4 recovery directory.  This allows us to get
rid of the last non-standard vfs_fsync caller with a NULL file
pointer.

[AV: should be using fput(), not filp_close()]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:21 -04:00
Richard Kennedy
2e147f1ef7 fs: inode.c use atomic_inc_return in __iget
Using atomic_inc_return in __iget(struct inode *inode) makes the intent
of this code clearer and generates less code on processors that have
this operation.

On x86_64 this patch reduces the text size of inode.o by 12 bytes.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>

----
patch against 2.6.34-rc7
compiled & tested on x86_64 AMD X2

I've been running with this patch applied for several weeks with no
obvious problems.
regards
Richard
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:21 -04:00
Eric Paris
a7cf4145bb anon_inode: set S_IFREG on the anon_inode
anon_inode_mkinode() sets inode->i_mode = S_IRUSR | S_IWUSR;  This means
that (inode->i_mode & S_IFMT) == 0.  This trips up some SELinux code that
needs to determine if a given inode is a regular file, a directory, etc.
The easiest solution is to just make sure that the anon_inode also sets
S_IFREG.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:20 -04:00
Stephen Hemminger
b7bb0a1291 gfs: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:20 -04:00
Stephen Hemminger
537d81ca7c ocfs: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:20 -04:00
Stephen Hemminger
365f0cb9d2 jffs2: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:20 -04:00
Stephen Hemminger
46e58764f0 xfs: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:19 -04:00
Stephen Hemminger
94d09a98cd reiserfs: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:19 -04:00
Stephen Hemminger
11e2752807 ext4: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:19 -04:00
Stephen Hemminger
d1f21049f9 ext3: constify xattr handlers
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:19 -04:00
Stephen Hemminger
749c72efa4 ext2: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:18 -04:00
Stephen Hemminger
f01cbd3f81 btrfs: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:18 -04:00
Stephen Hemminger
bb4354538e fs: xattr_handler table should be const
The entries in xattr handler table should be immutable (ie const)
like other operation tables.

Later patches convert common filesystems. Uncoverted filesystems
will still work, but will generate a compiler warning.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:18 -04:00
Josef Bacik
18e9e5104f Introduce freeze_super and thaw_super for the fsfreeze ioctl
Currently the way we do freezing is by passing sb>s_bdev to freeze_bdev and then
letting it do all the work.  But freezing is more of an fs thing, and doesn't
really have much to do with the bdev at all, all the work gets done with the
super.  In btrfs we do not populate s_bdev, since we can have multiple bdev's
for one fs and setting s_bdev makes removing devices from a pool kind of tricky.
This means that freezing a btrfs filesystem fails, which causes us to corrupt
with things like tux-on-ice which use the fsfreeze mechanism.  So instead of
populating sb->s_bdev with a random bdev in our pool, I've broken the actual fs
freezing stuff into freeze_super and thaw_super.  These just take the
super_block that we're freezing and does the appropriate work.  It's basically
just copy and pasted from freeze_bdev.  I've then converted freeze_bdev over to
use the new super helpers.  I've tested this with ext4 and btrfs and verified
everything continues to work the same as before.

The only new gotcha is multiple calls to the fsfreeze ioctl will return EBUSY if
the fs is already frozen.  I thought this was a better solution than adding a
freeze counter to the super_block, but if everybody hates this idea I'm open to
suggestions.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:18 -04:00
Al Viro
e1e46bf186 Trim includes in fs/super.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:17 -04:00
Al Viro
d3f2147307 Move grabbing s_umount to callers of grab_super()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:17 -04:00
Al Viro
7ed1ee6118 Take statfs variants to fs/statfs.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:17 -04:00
Al Viro
01a05b337a new helper: iterate_supers()
... and switch the simple "loop over superblocks and do something"
loops to it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:16 -04:00
Al Viro
35cf7ba0b4 Bury __put_super_and_need_restart()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:16 -04:00
Al Viro
79893c17b4 fix prune_dcache()/umount() race
... and get rid of the last __put_super_and_need_restart() caller

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:16 -04:00
Al Viro
df40c01a92 In get_super() and user_get_super() restarts are unconditional
If superblock had been still alive, we would've returned it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:16 -04:00
Al Viro
1494583de5 fix get_active_super()/umount() race
This one needs restarts...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:15 -04:00
Al Viro
e7fe0585ca fix do_emergency_remount()/umount() races
need list_for_each_entry_safe() here.  Original didn't even
have restart logics, so if you race with umount() it blew up.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:15 -04:00
Al Viro
6754af6464 Convert simple loops over superblocks to list_for_each_entry_safe
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:15 -04:00
Al Viro
8edd64bd60 get rid of restarts in sync_filesystems()
At the same time we can kill s_need_restart and local mutex in there.
__put_super() made public for a while; will be gone later.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:15 -04:00
Al Viro
551de6f34d Leave superblocks on s_list until the end
We used to remove from s_list and s_instances at the same
time.  So let's *not* do the former and skip superblocks
that have empty s_instances in the loops over s_list.

The next step, of course, will be to get rid of rescan logics
in those loops.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:14 -04:00
Al Viro
1712ac8fda Saner locking around deactivate_super()
Make sure that s_umount is acquired *before* we drop the final
active reference; we still have the fast path (atomic_dec_unless)
and we have gotten rid of the window between the moment when
s_active hits zero and s_umount is acquired.  Which simplifies
the living hell out of grab_super() and inotify pin_to_kill()
stuff.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:14 -04:00
Al Viro
b20bd1a5e7 get rid of S_BIAS
use atomic_inc_not_zero(&sb->s_active) instead of playing games with
checking ->s_count > S_BIAS

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:14 -04:00
Al Viro
389b8be6ef get rid of open-coded grab_super() in get_active_super()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:14 -04:00
Al Viro
3981f2e2a0 ceph: should use deactivate_locked_super() on failure exits
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:13 -04:00
Al Viro
2ccde7c631 Clean ecryptfs ->get_sb() up
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:13 -04:00
Al Viro
decabd6650 fix a couple of ecryptfs leaks
First of all, get_sb_nodev() grabs anon dev minor and we
never free it in ecryptfs ->kill_sb().  Moreover, on one
of the failure exits in ecryptfs_get_sb() we leak things -
it happens before we set ->s_root and ->put_super() won't
be called in that case.  Solution: kill ->put_super(), do
all that stuff in ->kill_sb().  And use kill_anon_sb() instead
of generic_shutdown_super() to deal with anon dev leak.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:13 -04:00
Al Viro
894680710d Simplify devpts_get_sb() failure exits
postpone simple_set_mnt() until we know we won't fail.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:12 -04:00
Christoph Hellwig
a135aa2cd7 remove incorrect comment in do_emergency_remount
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:12 -04:00
Al Viro
13e3c5e5b9 clean DCACHE_CANT_MOUNT in d_delete()
We set the "it's dead, don't mount on it" flag _and_ do not remove it if
we turn the damn thing negative and leave it around.  And if it goes
positive afterwards, well...

Fortunately, there's only one place where that needs to be caught:
only d_delete() can turn the sucker negative without immediately freeing
it; all other places that can lead to ->d_iput() call are followed by
unconditionally freeing struct dentry in question.  So the fix is obvious:

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16014
Reported-by: Adam Tkac <vonsch@gmail.com>
Tested-by: Adam Tkac <vonsch@gmail.com>
Cc: <stable@kernel.org>         [2.6.34.x]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:12 -04:00
Sage Weil
970690012c ceph: avoid resending queued message to monitor
The auth_reply handler will (re)send any pending requests.  For the
initial mon authenticate phase, that's correct, but when a auth ticket
renewal races with an in-flight request, we may resend a request message
that is already in flight.  Avoid this by revoking the message before
sending it.

We should also avoid resending requests at all during ticket renewal; that
will come soon.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-21 15:01:22 -07:00
Tobias Klauser
9e32789f63 ceph: Storage class should be before const qualifier
The C99 specification states in section 6.11.5:

The placement of a storage-class specifier other than at the beginning
of the declaration specifiers in a declaration is an obsolescent
feature.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-21 15:01:21 -07:00
Sripathi Kodi
4681dbdacb 9p: add 9P2000.L rename operation
I made a V2 of this patch on top of my patches for VFS switches.
All the changes were due to change in some offsets.

rename - change name of file or directory

size[4] Trename tag[2] fid[4] newdirfid[4] name[s]
size[4] Rrename tag[2]

The rename message is used to change the name of a file, possibly moving it
to a new directory.  The 9P wstat message can only rename a file within the
same directory.

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-05-21 16:44:34 -05:00
Sripathi Kodi
bda8e77520 9p: add 9P2000.L statfs operation
I made a V2 of this patch on top of my patches for VFS switches. The
change was adding v9fs_statfs pointer to v9fs_super_ops_dotl
instead of v9fs_super_ops.

statfs - get file system statistics

size[4] Tstatfs tag[2] fid[4]
size[4] Rstatfs tag[2] type[4] bsize[4] blocks[8] bfree[8] bavail[8]
                files[8] ffree[8] fsid[8] namelen[4]

The statfs message is used to request file system information returned
by the statfs(2) system call, which is used by df(1) to report file
system and disk space usage.

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-05-21 16:44:33 -05:00
Sripathi Kodi
9b6533c9b3 9p: VFS switches for 9p2000.L: VFS switches
Implements VFS switches for 9p2000.L protocol.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-05-21 16:44:33 -05:00
Jens Axboe
ee9a3607fb Merge branch 'master' into for-2.6.35
Conflicts:
	fs/ext3/fsync.c

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:27:26 +02:00
Jens Axboe
b492e95be0 pipe: set lower and upper limit on max pages in the pipe page array
We need at least two to guarantee proper POSIX behaviour, so
never allow a smaller limit than that.

Also expose a /proc/sys/fs/pipe-max-pages sysctl file that allows
root to define a sane upper limit. Make it default to 16 times the
default size, which is 16 pages.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:52 +02:00
Jens Axboe
35f3d14dbb pipe: add support for shrinking and growing pipes
This patch adds F_GETPIPE_SZ and F_SETPIPE_SZ fcntl() actions for
growing and shrinking the size of a pipe and adjusts pipe.c and splice.c
(and relay and network splice) usage to work with these larger (or smaller)
pipes.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:40 +02:00
Linus Torvalds
d515e86e63 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
  udf: BKL ioctl pushdown
2010-05-21 11:17:43 -07:00
Jens Axboe
c2c4986edd writeback: fix problem with !CONFIG_BLOCK compilation
When CONFIG_BLOCK isn't enabled:

mm/page-writeback.c: In function 'laptop_mode_timer_fn':
mm/page-writeback.c:708: error: dereferencing pointer to incomplete type
mm/page-writeback.c:709: error: dereferencing pointer to incomplete type

Fix this by essentially eliminating the laptop sync handlers when
CONFIG_BLOCK isn't set, as most are only used from the block layer code.
The exception is laptop_sync_completion() which is used from sys_sync(),
make that an empty declaration in that case.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 20:01:03 +02:00
Tejun Heo
b403a98e26 block: improve automatic native capacity unlocking
Currently, native capacity unlocking is initiated only when a
recognized partition extends beyond the end of the disk.  However,
there are several other unhandled cases where truncated capacity can
lead to misdetection of partitions.

* Partition table is fully beyond EOD.

* Partition table is partially beyond EOD (daisy chained ones).

* Recognized partition starts beyond EOD.

This patch updates generic partition check code such that all the
above three cases are handled too.  For the first two, @state tracks
whether low level partition check code tried to read beyond EOD during
partition scan and triggers native capacity unlocking accordingly.
The third is now handled similarly to the original unlocking case.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 20:01:03 +02:00
Tejun Heo
1493bf217f block: use struct parsed_partitions *state universally in partition check code
Make the following changes to partition check code.

* Add ->bdev to struct parsed_partitions.

* Introduce read_part_sector() which is a simple wrapper around
  read_dev_sector() which takes struct parsed_partitions *state
  instead of @bdev.

* For functions which used to take @state and @bdev, drop @bdev.  For
  functions which used to take @bdev, replace it with @state.

* While updating, drop superflous checks on NULL state/bdev in ldm.c.

This cleans up the API a bit and enables better handling of IO errors
during partition check as the generic partition check code now has
much better visibility into what went wrong in the low level code
paths.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 20:01:02 +02:00
Tejun Heo
c3e33e043f block,ide: simplify bdops->set_capacity() to ->unlock_native_capacity()
bdops->set_capacity() is unnecessarily generic.  All that's required
is a simple one way notification to lower level driver telling it to
try to unlock native capacity.  There's no reason to pass in target
capacity or return the new capacity.  The former is always the
inherent native capacity and the latter can be handled via the usual
device resize / revalidation path.  In fact, the current API is always
used that way.

Replace ->set_capacity() with ->unlock_native_capacity() which take
only @disk and doesn't return anything.  IDE which is the only current
user of the API is converted accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 20:01:02 +02:00
Tejun Heo
56bca01738 block: restart partition scan after resizing a device
Device resize via ->set_capacity() can reveal new partitions (e.g. in
chained partition table formats such as dos extended parts).  Restart
partition scan from the beginning after resizing a device.  This
change also makes libata always revalidate the disk after resize which
makes lower layer native capacity unlocking implementation simpler and
more robust as resize can be handled in the usual path.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 20:01:02 +02:00
Tejun Heo
fa4b9074cd buffer: make invalidate_bdev() drain all percpu LRU add caches
invalidate_bdev() should release all page cache pages which are clean
and not being used; however, if some pages are still in the percpu LRU
add caches on other cpus, those pages are considered in used and don't
get released.  Fix it by calling lru_add_drain_all() before trying to
invalidate pages.

This problem was discovered while testing block automatic native
capacity unlocking.  Null pages which were read before automatic
unlocking didn't get released by invalidate_bdev() and ended up
interfering with partition scan after unlocking.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 20:01:02 +02:00