40714 Commits

Author SHA1 Message Date
NeilBrown
92acca4542 UDF: support NFSv2 export
The "fh_len" passed to ->fh_to_* is not guaranteed to be that same as
that returned by encode_fh - it may be larger.

With NFSv2, the filehandle is fixed length, so it may appear longer
than expected and be zero-padded.

So we must test that fh_len is at least some value, not exactly equal
to it.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:08 +02:00
Adir Kuhn
4649f6b362 fs: ext3: super: fixed a space coding style issue
Fixed a coding style issue

Signed-off-by: Adir Kuhn <adirkuhn@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:08 +02:00
Fabian Frederick
231473f6dd udf: Return error from udf_find_entry()
Return appropriate error from udf_find_entry() instead of just NULL.
That way we can distinguish the fact that some error happened when
looking up filename (and return error to userspace) from the fact that
we just didn't find the filename. Also update callers of
udf_find_entry() accordingly.

[JK: Improved udf_find_entry() documentation]

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:07 +02:00
Fabian Frederick
6ce6383673 udf: Make udf_get_filename() return error instead of 0 length file name
Zero length file name isn't really valid. So check the length of the
final file name generated by udf_translate_to_linux() and return -EINVAL
instead of zero length file name. Update caller of udf_get_filename() to
not check for 0 return value.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:06 +02:00
Fabian Frederick
5dce54b71e udf: bug on exotic flag in udf_get_filename()
UDF volume is only mounted with UDF_FLAG_UTF8
or UDF_FLAG_NLS_MAP (see fill udf_fill_super().
BUG() if we have something different in udf_get_filename()

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:06 +02:00
Fabian Frederick
78fc2e694f udf: improve error management in udf_CS0toNLS()
Only callsite udf_get_filename() now returns error code as well.

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:05 +02:00
Fabian Frederick
e9d4cf411f udf: improve error management in udf_CS0toUTF8()
udf_CS0toUTF8() now returns -EINVAL on error.
udf_load_pvoldesc() and udf_get_filename() do the same.

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:05 +02:00
Fabian Frederick
d67e4a4814 udf: unicode: update function name in comments
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:04 +02:00
Fabian Frederick
31f2566f33 udf: remove unnecessary test in udf_build_ustr_exact()
We can remove parameter checks:

udf_build_ustr_exact() is only called by udf_get_filename()
which now assures dest is not NULL

udf_find_entry() and udf_readdir() call udf_get_filename()
after checking sname
udf_symlink_filler() calls udf_pc_to_char() with sname=kmap(page).

udf_find_entry() and udf_readdir() call udf_get_filename with UDF_NAME_LEN
udf_pc_to_char() with PAGE_SIZE

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:04 +02:00
Fabian Frederick
5ceb8b554d udf: Return -ENOMEM when allocation fails in udf_get_filename()
Return -ENOMEM when allocation fails in udf_get_filename(). Update
udf_pc_to_char(), udf_readdir(), and udf_find_entry() to handle the
error appropriately. This allows us to pass appropriate error to
userspace instead of corrupting symlink contents by omitting some path
elements.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-18 11:23:03 +02:00
Linus Torvalds
92752b5cdd Merge branch 'for-linus-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML hostfs fix from Richard Weinberger:
 "This contains a single fix for a regression introduced in 4.1-rc1"

* 'for-linus-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  hostfs: Use correct mask for file mode
2015-05-16 16:33:59 -07:00
Linus Torvalds
6a8098a447 Fix a number of ext4 bugs; the most serious of which is a bug in the
lazytime mount optimization code where we could end up updating the
 timestamps to the wrong inode.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJVV1l1AAoJEPL5WVaVDYGjdPwH/RzNut4bfgq7yK2yUVNqPpPN
 QzjR848fT1lj7C1eN7eEh+NRG+KNM2QnmMJBU8jVnwq2l3r8AGFV/bDRC+Zx4U8L
 cz9mZJMU7ZDP5TH/WVyimySGAXpaFKruXA+3L8CyC3LQEI6TUOxKt5CqNi0/9nND
 B8HoF+Ei7jIILrcW7KKj55/fSfh4iiy+iUb0kjrSnZj0y5sROfFG2QhQwIhJRk7I
 /8aeg2HYbhWXCKQHnQ5F4lLNCf44kdJ/EoCpz6aOHtVwrnBcQ44yeqm5MtHSh6Qw
 lj8iPCIlcHYGZE4im+pWAavDMeHBm/VnOnH9545t6nNFq6W7WNdkD99ZJ/AQyWQ=
 =JJxO
 -----END PGP SIGNATURE-----

Merge tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Fix a number of ext4 bugs; the most serious of which is a bug in the
  lazytime mount optimization code where we could end up updating the
  timestamps to the wrong inode"

* tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix an ext3 collapse range regression in xfstests
  jbd2: fix r_count overflows leading to buffer overflow in journal recovery
  ext4: check for zero length extent explicitly
  ext4: fix NULL pointer dereference when journal restart fails
  ext4: remove unused function prototype from ext4.h
  ext4: don't save the error information if the block device is read-only
  ext4: fix lazytime optimization
2015-05-16 15:55:31 -07:00
Linus Torvalds
c7309e88a6 Merge branch 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "The first commit is a fix from Filipe for a very old extent buffer
  reuse race that triggered a BUG_ON.  It hasn't come up often, I looked
  through old logs at FB and we hit it a handful of times over the last
  year.

  The rest are other corners he hit during testing"

* 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix race when reusing stale extent buffers that leads to BUG_ON
  Btrfs: fix race between block group creation and their cache writeout
  Btrfs: fix panic when starting bg cache writeout after IO error
  Btrfs: fix crash after inode cache writeback failure
2015-05-16 15:50:58 -07:00
Linus Torvalds
4b470f1208 Merge branch 'parisc-4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
 "One important patch which fixes crashes due to stack randomization on
  architectures where the stack grows upwards (currently parisc and
  metag only).

  This bug went unnoticed on parisc since kernel 3.14 where the flexible
  mmap memory layout support was added by commit 9dabf60dc4ab.  The
  changes in fs/exec.c are inside an #ifdef CONFIG_STACK_GROWSUP section
  and will not affect other platforms.

  The other two patches rename args of the kthread_arg() function and
  fixes a printk output"

* 'parisc-4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc,metag: Fix crashes due to stack randomization on stack-grows-upwards architectures
  parisc: copy_thread(): rename 'arg' argument to 'kthread_arg'
  parisc: %pf is only for function pointers
2015-05-15 13:06:06 -07:00
Al Viro
b853a16176 turn user_{path_at,path,lpath,path_dir}() into static inlines
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:45 -04:00
Al Viro
9883d1855e namei: move saved_nd pointer into struct nameidata
these guys are always declared next to each other; might as well put
the former (pointer to previous instance) into the latter and simplify
the calling conventions for {set,restore}_nameidata()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:45 -04:00
Al Viro
520ae68747 inline user_path_create()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:44 -04:00
Al Viro
a2ec4a2d5c inline user_path_parent()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:44 -04:00
Al Viro
76ae2a5ab1 namei: trim do_last() arguments
now that struct filename is stashed in nameidata we have no need to
pass it in

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:43 -04:00
Al Viro
c8a53ee5ee namei: stash dfd and name into nameidata
fewer arguments to pass around...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:43 -04:00
Al Viro
102b8af266 namei: fold path_cleanup() into terminate_walk()
they are always called next to each other; moreover,
terminate_walk() is more symmetrical that way.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:42 -04:00
Al Viro
5c31b6cedb namei: saner calling conventions for filename_parentat()
a) make it reject ERR_PTR() for name
b) make it putname(name) on all other failure exits
c) make it return name on success

again, simplifies the callers

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:42 -04:00
Al Viro
181c37b6e4 namei: saner calling conventions for filename_create()
a) make it reject ERR_PTR() for name
b) make it putname(name) upon return in all other cases.

seriously simplifies the callers...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:42 -04:00
Al Viro
391172c46e namei: shift nameidata down into filename_parentat()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:41 -04:00
Al Viro
abc9f5beb1 namei: make filename_lookup() reject ERR_PTR() passed as name
makes for much easier life in callers

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:41 -04:00
Al Viro
9ad1aaa615 namei: shift nameidata inside filename_lookup()
pass root instead; non-NULL => copy to nd.root and
set LOOKUP_ROOT in flags

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:40 -04:00
Al Viro
e4bd1c1a95 namei: move putname() call into filename_lookup()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:40 -04:00
Al Viro
625b6d1054 namei: pass the struct path to store the result down into path_lookupat()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:39 -04:00
Al Viro
18d8c86011 namei: uninline set_root{,_rcu}()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:39 -04:00
Al Viro
aed434ada6 namei: be careful with mountpoint crossings in follow_dotdot_rcu()
Otherwise we are risking a hard error where nonlazy restart would be the right
thing to do; it's a very narrow race with mount --move and most of the time it
ends up being completely harmless, but it's possible to construct a case when
we'll get a bogus hard error instead of falling back to non-lazy walk...

For one thing, when crossing _into_ overmount of parent we need to check for
mount_lock bumps when we get NULL from __lookup_mnt() as well.

For another, and less exotically, we need to make sure that the data fetched
in follow_up_rcu() had been consistent.  ->mnt_mountpoint is pinned for as
long as it is a mountpoint, but we need to check mount_lock after fetching
to verify that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:38 -04:00
Al Viro
89076bc319 get rid of assorted nameidata-related debris
pointless forward declarations, stale comments

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:37 -04:00
Al Viro
5a8d87e8ed namei: unlazy_walk() doesn't need to mess with current->fs anymore
now that we have ->root_seq, legitimize_path(&nd->root, nd->root_seq)
will do just fine...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:36 -04:00
Al Viro
8f47a0167c namei: handle absolute symlinks without dropping out of RCU mode
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:10:22 -04:00
Al Viro
8c1b456689 enable passing fast relative symlinks without dropping out of RCU mode
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:06:28 -04:00
NeilBrown
8fa9dd2466 VFS/namei: make the use of touch_atime() in get_link() RCU-safe.
touch_atime is not RCU-safe, and so cannot be called on an RCU walk.
However, in situations where RCU-walk makes a difference, the symlink
will likely to accessed much more often than it is useful to update
the atime.

So split out the test of "Does the atime actually need to be updated"
into  atime_needs_update(), and have get_link() unlazy if it finds that
it will need to do that update.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:06:27 -04:00
Al Viro
bc40aee053 namei: don't unlazy until get_link()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:06:27 -04:00
Al Viro
7973387a2f namei: make unlazy_walk and terminate_walk handle nd->stack, add unlazy_link
We are almost done - primitives for leaving RCU mode are aware of nd->stack
now, a new primitive for going to non-RCU mode when we have a symlink on hands
added.

The thing we are heavily relying upon is that *any* unlazy failure will be
shortly followed by terminate_walk(), with no access to nameidata in between.
So it's enough to leave the things in a state terminate_walk() would cope with.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-15 01:06:01 -04:00
Theodore Ts'o
b9576fc362 ext4: fix an ext3 collapse range regression in xfstests
The xfstests test suite assumes that an attempt to collapse range on
the range (0, 1) will return EOPNOTSUPP if the file system does not
support collapse range.  Commit 280227a75b56: "ext4: move check under
lock scope to close a race" broke this, and this caused xfstests to
fail when run when testing file systems that did not have the extents
feature enabled.

Reported-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-15 00:24:10 -04:00
Vladimir Davydov
499611ed45 kernfs: do not account ino_ida allocations to memcg
root->ino_ida is used for kernfs inode number allocations. Since IDA has
a layered structure, different IDs can reside on the same layer, which
is currently accounted to some memory cgroup. The problem is that each
kmem cache of a memory cgroup has its own directory on sysfs (under
/sys/fs/kernel/<cache-name>/cgroup). If the inode number of such a
directory or any file in it gets allocated from a layer accounted to the
cgroup which the cache is created for, the cgroup will get pinned for
good, because one has to free all kmem allocations accounted to a cgroup
in order to release it and destroy all its kmem caches. That said we
must not account layers of ino_ida to any memory cgroup.

Since per net init operations may create new sysfs entries directly
(e.g. lo device) or indirectly (nf_conntrack creates a new kmem cache
per each namespace, which, in turn, creates new sysfs entries), an easy
way to reproduce this issue is by creating network namespace(s) from
inside a kmem-active memory cgroup.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>	[4.0.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-14 17:55:51 -07:00
Darrick J. Wong
e531d0bceb jbd2: fix r_count overflows leading to buffer overflow in journal recovery
The journal revoke block recovery code does not check r_count for
sanity, which means that an evil value of r_count could result in
the kernel reading off the end of the revoke table and into whatever
garbage lies beyond.  This could crash the kernel, so fix that.

However, in testing this fix, I discovered that the code to write
out the revoke tables also was not correctly checking to see if the
block was full -- the current offset check is fine so long as the
revoke table space size is a multiple of the record size, but this
is not true when either journal_csum_v[23] are set.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
2015-05-14 19:11:50 -04:00
Eryu Guan
2f974865ff ext4: check for zero length extent explicitly
The following commit introduced a bug when checking for zero length extent

5946d08 ext4: check for overlapping extents in ext4_valid_extent_entries()

Zero length extent could pass the check if lblock is zero.

Adding the explicit check for zero length back.

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2015-05-14 19:00:45 -04:00
Lukas Czerner
9d50659406 ext4: fix NULL pointer dereference when journal restart fails
Currently when journal restart fails, we'll have the h_transaction of
the handle set to NULL to indicate that the handle has been effectively
aborted. We handle this situation quietly in the jbd2_journal_stop() and just
free the handle and exit because everything else has been done before we
attempted (and failed) to restart the journal.

Unfortunately there are a number of problems with that approach
introduced with commit

41a5b913197c "jbd2: invalidate handle if jbd2_journal_restart()
fails"

First of all in ext4 jbd2_journal_stop() will be called through
__ext4_journal_stop() where we would try to get a hold of the superblock
by dereferencing h_transaction which in this case would lead to NULL
pointer dereference and crash.

In addition we're going to free the handle regardless of the refcount
which is bad as well, because others up the call chain will still
reference the handle so we might potentially reference already freed
memory.

Moreover it's expected that we'll get aborted handle as well as detached
handle in some of the journalling function as the error propagates up
the stack, so it's unnecessary to call WARN_ON every time we get
detached handle.

And finally we might leak some memory by forgetting to free reserved
handle in jbd2_journal_stop() in the case where handle was detached from
the transaction (h_transaction is NULL).

Fix the NULL pointer dereference in __ext4_journal_stop() by just
calling jbd2_journal_stop() quietly as suggested by Jan Kara. Also fix
the potential memory leak in jbd2_journal_stop() and use proper
handle refcounting before we attempt to free it to avoid use-after-free
issues.

And finally remove all WARN_ON(!transaction) from the code so that we do
not get random traces when something goes wrong because when journal
restart fails we will get to some of those functions.

Cc: stable@vger.kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2015-05-14 18:55:18 -04:00
Theodore Ts'o
92c8263910 ext4: remove unused function prototype from ext4.h
The ext4_extent_tree_init() function hasn't been in the ext4 code for
a long time ago, except in an unused function prototype in ext4.h

Google-Bug-Id: 4530137
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-14 18:43:36 -04:00
Theodore Ts'o
1b46617b8d ext4: don't save the error information if the block device is read-only
Google-Bug-Id: 20939131
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-14 18:37:30 -04:00
Theodore Ts'o
8f4d855839 ext4: fix lazytime optimization
We had a fencepost error in the lazytime optimization which means that
timestamp would get written to the wrong inode.

Cc: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-14 18:19:01 -04:00
Miklos Szeredi
d377c5eb54 ovl: don't remove non-empty opaque directory
When removing an opaque directory we can't just call rmdir() to check for
emptiness, because the directory will need to be replaced with a whiteout.
The replacement is done with RENAME_EXCHANGE, which doesn't check
emptiness.

Solution is just to check emptiness by reading the directory.  In the
future we could add a new rename flag to check for emptiness even for
RENAME_EXCHANGE to optimize this case.

Reported-by: Vincent Batts <vbatts@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Jordi Pujol Palomer <jordipujolp@gmail.com>
Fixes: 263b4a0fee43 ("ovl: dont replace opaque dir")
Cc: <stable@vger.kernel.org> # v4.0+
2015-05-14 10:04:44 +02:00
Jeff Layton
feaff8e5b2 nfs: take extra reference to fl->fl_file when running a setlk
We had a report of a crash while stress testing the NFS client:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
    IP: [<ffffffff8127b698>] locks_get_lock_context+0x8/0x90
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_filter ebtable_broute bridge stp llc ebtables ip6table_security ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw ip6table_filter ip6_tables iptable_security iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw coretemp crct10dif_pclmul ppdev crc32_pclmul crc32c_intel ghash_clmulni_intel vmw_balloon serio_raw vmw_vmci i2c_piix4 shpchp parport_pc acpi_cpufreq parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih mptbase e1000 ata_generic pata_acpi
    CPU: 1 PID: 399 Comm: kworker/1:1H Not tainted 4.1.0-0.rc1.git0.1.fc23.x86_64 #1
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
    Workqueue: rpciod rpc_async_schedule [sunrpc]
    task: ffff880036aea7c0 ti: ffff8800791f4000 task.ti: ffff8800791f4000
    RIP: 0010:[<ffffffff8127b698>]  [<ffffffff8127b698>] locks_get_lock_context+0x8/0x90
    RSP: 0018:ffff8800791f7c00  EFLAGS: 00010293
    RAX: ffff8800791f7c40 RBX: ffff88001f2ad8c0 RCX: ffffe8ffffc80305
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: ffff8800791f7c88 R08: ffff88007fc971d8 R09: 279656d600000000
    R10: 0000034a01000000 R11: 279656d600000000 R12: ffff88001f2ad918
    R13: ffff88001f2ad8c0 R14: 0000000000000000 R15: 0000000100e73040
    FS:  0000000000000000(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000150 CR3: 0000000001c0b000 CR4: 00000000000407e0
    Stack:
     ffffffff8127c5b0 ffff8800791f7c18 ffffffffa0171e29 ffff8800791f7c58
     ffffffffa0171ef8 ffff8800791f7c78 0000000000000246 ffff88001ea0ba00
     ffff8800791f7c40 ffff8800791f7c40 00000000ff5d86a3 ffff8800791f7ca8
    Call Trace:
     [<ffffffff8127c5b0>] ? __posix_lock_file+0x40/0x760
     [<ffffffffa0171e29>] ? rpc_make_runnable+0x99/0xa0 [sunrpc]
     [<ffffffffa0171ef8>] ? rpc_wake_up_task_queue_locked.part.35+0xc8/0x250 [sunrpc]
     [<ffffffff8127cd3a>] posix_lock_file_wait+0x4a/0x120
     [<ffffffffa03e4f12>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
     [<ffffffffa03bf108>] ? nfs41_sequence_done+0xd8/0x2d0 [nfsv4]
     [<ffffffffa03c116d>] do_vfs_lock+0x2d/0x30 [nfsv4]
     [<ffffffffa03c251d>] nfs4_lock_done+0x1ad/0x210 [nfsv4]
     [<ffffffffa0171a30>] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
     [<ffffffffa0171a30>] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
     [<ffffffffa0171a5c>] rpc_exit_task+0x2c/0xa0 [sunrpc]
     [<ffffffffa0167450>] ? call_refreshresult+0x150/0x150 [sunrpc]
     [<ffffffffa0172640>] __rpc_execute+0x90/0x460 [sunrpc]
     [<ffffffffa0172a25>] rpc_async_schedule+0x15/0x20 [sunrpc]
     [<ffffffff810baa1b>] process_one_work+0x1bb/0x410
     [<ffffffff810bacc3>] worker_thread+0x53/0x480
     [<ffffffff810bac70>] ? process_one_work+0x410/0x410
     [<ffffffff810bac70>] ? process_one_work+0x410/0x410
     [<ffffffff810c0b38>] kthread+0xd8/0xf0
     [<ffffffff810c0a60>] ? kthread_worker_fn+0x180/0x180
     [<ffffffff817a1aa2>] ret_from_fork+0x42/0x70
     [<ffffffff810c0a60>] ? kthread_worker_fn+0x180/0x180

Jean says:

"Running locktests with a large number of iterations resulted in a
 client crash.  The test run took a while and hasn't finished after close
 to 2 hours. The crash happened right after I gave up and killed the test
 (after 107m) with Ctrl+C."

The crash happened because a NULL inode pointer got passed into
locks_get_lock_context. The call chain indicates that file_inode(filp)
returned NULL, which means that f_inode was NULL. Since that's zeroed
out in __fput, that suggests that this filp pointer outlived the last
reference.

Looking at the code, that seems possible. We copy the struct file_lock
that's passed in, but if the task is signalled at an inopportune time we
can end up trying to use that file_lock in rpciod context after the process
that requested it has already returned (and possibly put its filp
reference).

Fix this by taking an extra reference to the filp when we allocate the
lock info, and put it in nfs4_lock_release.

Reported-by: Jean Spector <jean@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-05-13 14:56:06 -04:00
Chuck Lever
6b19687563 nfs: stat(2) fails during cthon04 basic test5 on NFSv4.0
When running the Connectathon basic tests against a Solaris NFS
server over NFSv4.0, test5 reports that stat(2) returns a file size
of zero instead of 1MB.

On success, nfs_commit_inode() can return a positive result; see
other call sites such as nfs_file_fsync_commit() and
nfs_commit_unstable_pages().

The call site recently added in nfs_wb_all() does not prevent that
positive return value from leaking to its callers. If it leaks
through nfs_sync_inode() back to nfs_getattr(), that causes stat(2)
to return a positive return value to user space while also not
filling in the passed-in struct stat.

Additional clean up: the new logic in nfs_wb_all() is rewritten in
bfields-normal form.

Fixes: 5bb89b4702e2 ("NFSv4.1/pnfs: Separate out metadata . . .")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-05-13 14:56:03 -04:00
David S. Miller
b04096ff33 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Four minor merge conflicts:

1) qca_spi.c renamed the local variable used for the SPI device
   from spi_device to spi, meanwhile the spi_set_drvdata() call
   got moved further up in the probe function.

2) Two changes were both adding new members to codel params
   structure, and thus we had overlapping changes to the
   initializer function.

3) 'net' was making a fix to sk_release_kernel() which is
   completely removed in 'net-next'.

4) In net_namespace.c, the rtnl_net_fill() call for GET operations
   had the command value fixed, meanwhile 'net-next' adjusted the
   argument signature a bit.

This also matches example merge resolutions posted by Stephen
Rothwell over the past two days.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 14:31:43 -04:00
Helge Deller
d045c77c1a parisc,metag: Fix crashes due to stack randomization on stack-grows-upwards architectures
On architectures where the stack grows upwards (CONFIG_STACK_GROWSUP=y,
currently parisc and metag only) stack randomization sometimes leads to crashes
when the stack ulimit is set to lower values than STACK_RND_MASK (which is 8 MB
by default if not defined in arch-specific headers).

The problem is, that when the stack vm_area_struct is set up in fs/exec.c, the
additional space needed for the stack randomization (as defined by the value of
STACK_RND_MASK) was not taken into account yet and as such, when the stack
randomization code added a random offset to the stack start, the stack
effectively got smaller than what the user defined via rlimit_max(RLIMIT_STACK)
which then sometimes leads to out-of-stack situations and crashes.

This patch fixes it by adding the maximum possible amount of memory (based on
STACK_RND_MASK) which theoretically could be added by the stack randomization
code to the initial stack size. That way, the user-defined stack size is always
guaranteed to be at minimum what is defined via rlimit_max(RLIMIT_STACK).

This bug is currently not visible on the metag architecture, because on metag
STACK_RND_MASK is defined to 0 which effectively disables stack randomization.

The changes to fs/exec.c are inside an "#ifdef CONFIG_STACK_GROWSUP"
section, so it does not affect other platformws beside those where the
stack grows upwards (parisc and metag).

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: linux-parisc@vger.kernel.org
Cc: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Cc: stable@vger.kernel.org # v3.16+
2015-05-12 22:03:44 +02:00