iv/linux

Author	SHA1	Message	Date
Al Viro	d85b399b64	fix proc_fill_cache() in case of d_alloc_parallel() failure If d_alloc_parallel() returns ERR_PTR(...), we don't want to dput() that. Small reorganization allows to have all error-in-lookup cases rejoin the main codepath after dput(child), avoiding the entire problem. Spotted-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Fixes: `0168b9e38c` "procfs: switch instantiate_t to d_splice_alias()" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-06-08 01:17:11 -04:00
Al Viro	888e2b03ef	switch the rest of procfs lookups to d_splice_alias() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-26 14:20:50 -04:00
Al Viro	0168b9e38c	procfs: switch instantiate_t to d_splice_alias() ... and get rid of pointless struct inode *dir argument of those, while we are at it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-26 14:20:50 -04:00
Al Viro	9883638641	don't bother with tid_fd_revalidate() in lookups what we want it for is actually updating inode metadata; take _that_ into a separate helper and use it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-26 14:20:28 -04:00
Al Viro	1ae9bd8b7e	proc_lookupfd_common(): don't bother with instantiate unless the file is open ... and take the "check if file is open, pick ->f_mode" into a helper; tid_fd_revalidate() can use it. The next patch will get rid of tid_fd_revalidate() calls in instantiate callbacks. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:28:04 -04:00
Al Viro	1bbc55131e	procfs: get rid of ancient BS in pid_revalidate() uses First of all, calling pid_revalidate() in the end of <pid>/* lookups is not about closing any kind of races; that used to be true once upon a time, but these days those comments are actively misleading. Especially since pid_revalidate() doesn't even do d_drop() on failure anymore. It doesn't matter, anyway, since once pid_revalidate() starts returning false, ->d_delete() of those dentries starts saying "don't keep"; they won't get stuck in dcache any longer than they are pinned. These calls cannot be just removed, though - the side effect of pid_revalidate() (updating i_uid/i_gid/etc.) is what we are calling it for here. Let's separate the "update ownership" into a new helper (pid_update_inode()) and use it, both in lookups and in pid_revalidate() itself. The comments in pid_revalidate() are also out of date - they refer to the time when pid_revalidate() used to call d_drop() directly... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:28:03 -04:00
Al Viro	11f17c9bd7	cifs_lookup(): switch to d_splice_alias() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:28:03 -04:00
Al Viro	a8b75f663e	cifs_lookup(): cifs_get_inode_...() never returns 0 with *inode left NULL not since 2004... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:28:02 -04:00
Al Viro	500e2ab6c3	9p: unify paths in v9fs_vfs_lookup() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:28:02 -04:00
Al Viro	1c5fedbb50	ncp_lookup(): use d_splice_alias() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:28:01 -04:00
Al Viro	293542d8e5	hfsplus: switch to d_splice_alias() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:28:00 -04:00
Al Viro	0e5c56fd07	hfs: don't allow mounting over .../rsrc That's one case when unlink() destroys a subtree, thanks to "resource fork" idiocy. We might forcibly evict that shit on unlink(2), but for now let's just disallow overmounting; as it is, anything that plays games with those would leak mounts. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:28:00 -04:00
Al Viro	6b9cceead0	hfs: use d_splice_alias() code is simpler that way Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:59 -04:00
Al Viro	18fbbfc2bf	omfs_lookup(): report IO errors, use d_splice_alias() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:58 -04:00
Al Viro	04bb1ba141	orangefs_lookup: simplify d_splice_alias() can handle NULL and ERR_PTR() for inode just fine... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:58 -04:00
Al Viro	0ed883fd80	openpromfs: switch to d_splice_alias() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:57 -04:00
Al Viro	b113a6d3cf	xfs_vn_lookup: simplify a bit have all post-xfs_lookup() branches converge on d_splice_alias() Cc: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:57 -04:00
Al Viro	9a7dddcaff	adfs_lookup: do not fail with ENOENT on negatives, use d_splice_alias() Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:56 -04:00
Al Viro	686bb96d1b	adfs_lookup_byname: .. is taken care of in fs/namei.c Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:55 -04:00
Al Viro	8130c15176	romfs_lookup: switch to d_splice_alias() ... and hash negative lookups Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:55 -04:00
Al Viro	c1481700f4	qnx6_lookup: switch to d_splice_alias() ... and hash negative lookups Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:54 -04:00
Al Viro	191ac107f9	ubifs_lookup: use d_splice_alias() code is simpler that way Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:54 -04:00
Al Viro	5bf3544970	sysv_lookup: use d_splice_alias() code is simpler that way Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:53 -04:00
Al Viro	b135dcea37	qnx4_lookup: use d_splice_alias() code is simpler that way Acked-by: Anders Larsen <al@alarsen.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:52 -04:00
Al Viro	b014951692	minix_lookup: use d_splice_alias() code is simpler that way Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:52 -04:00
Al Viro	72ff0b038d	freevxfs_lookup(): use d_splice_alias() code is simpler that way Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:51 -04:00
Al Viro	d023b3a19f	cramfs_lookup(): use d_splice_alias() simpler code that way, actually Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:51 -04:00
Al Viro	b455ecd4bb	bfs_add_entry: pass name/len as qstr pointer same story as with bfs_find_entry() Cc: "Tigran A. Aivazian" <aivazian.tigran@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:50 -04:00
Al Viro	33ebdebece	bfs_find_entry: pass name/len as qstr pointer all callers feed something->name/something->len anyway Cc: "Tigran A. Aivazian" <aivazian.tigran@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:49 -04:00
Al Viro	a596a23b9a	bfs_lookup(): use d_splice_alias() code is actually simpler that way. Acked-by: "Tigran A. Aivazian" <aivazian.tigran@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-22 14:27:38 -04:00
Al Viro	837f3ec692	Merge branch 'work.misc' into work.lookup	2018-05-21 17:43:32 -04:00
Al Viro	baf10564fb	aio: fix io_destroy(2) vs. lookup_ioctx() race kill_ioctx() used to have an explicit RCU delay between removing the reference from ->ioctx_table and percpu_ref_kill() dropping the refcount. At some point that delay had been removed, on the theory that percpu_ref_kill() itself contained an RCU delay. Unfortunately, that was the wrong kind of RCU delay and it didn't care about rcu_read_lock() used by lookup_ioctx(). As the result, we could get ctx freed right under lookup_ioctx(). Tejun has fixed that in `a6d7cff472` ("fs/aio: Add explicit RCU grace period when freeing kioctx"); however, that fix is not enough. Suppose io_destroy() from one thread races with e.g. io_setup() from another; CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2 has picked it (under rcu_read_lock()). Then CPU1 proceeds to drop the refcount, getting it to 0 and triggering a call of free_ioctx_users(), which proceeds to drop the secondary refcount and once that reaches zero calls free_ioctx_reqs(). That does INIT_RCU_WORK(&ctx->free_rwork, free_ioctx); queue_rcu_work(system_wq, &ctx->free_rwork); and schedules freeing the whole thing after RCU delay. In the meanwhile CPU2 has gotten around to percpu_ref_get(), bumping the refcount from 0 to 1 and returned the reference to io_setup(). Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get freed until after percpu_ref_get(). Sure, we'd increment the counter before ctx can be freed. Now we are out of rcu_read_lock() and there's nothing to stop freeing of the whole thing. Unfortunately, CPU2 assumes that since it has grabbed the reference, ctx is NOT going away until it gets around to dropping that reference. The fix is obvious - use percpu_ref_tryget_live() and treat failure as miss. It's not costlier than what we currently do in normal case, it's safe to call since freeing is delayed and it closes the race window - either lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx() fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see the object in question at all. Cc: stable@kernel.org Fixes: `a6d7cff472` "fs/aio: Add explicit RCU grace period when freeing kioctx" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-21 14:30:11 -04:00
Al Viro	5aa1437d2d	ext2: fix a block leak open file, unlink it, then use ioctl(2) to make it immutable or append only. Now close it and watch the blocks not freed... Immutable/append-only checks belong in ->setattr(). Note: the bug is old and backport to anything prior to `737f2e93b9` ("ext2: convert to use the new truncate convention") will need these checks lifted into ext2_setattr(). Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-21 14:30:11 -04:00
Al Viro	3819bb0d79	nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed That can (and does, on some filesystems) happen - ->mkdir() (and thus vfs_mkdir()) can legitimately leave its argument negative and just unhash it, counting upon the lookup to pick the object we'd created next time we try to look at that name. Some vfs_mkdir() callers forget about that possibility... Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-21 14:30:10 -04:00
Al Viro	9c3e9025a3	cachefiles: vfs_mkdir() might succeed leaving dentry negative unhashed That can (and does, on some filesystems) happen - ->mkdir() (and thus vfs_mkdir()) can legitimately leave its argument negative and just unhash it, counting upon the lookup to pick the object we'd created next time we try to look at that name. Some vfs_mkdir() callers forget about that possibility... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-21 14:30:10 -04:00
Al Viro	7b745a4e40	unfuck sysfs_mount() new_sb is left uninitialized in case of early failures in kernfs_mount_ns(), and while IS_ERR(root) is true in all such cases, using IS_ERR(root) \|\| !new_sb is not a solution - IS_ERR(root) is true in some cases when new_sb is true. Make sure new_sb is initialized (and matches the reality) in all cases and fix the condition for dropping kobj reference - we want it done precisely in those situations where the reference has not been transferred into a new super_block instance. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-21 14:30:09 -04:00
Al Viro	82382acec0	kernfs: deal with kernfs_fill_super() failures make sure that info->node is initialized early, so that kernfs_kill_sb() can list_del() it safely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-21 14:30:08 -04:00
Joe Perches	08a8f30868	cramfs: Fix IS_ENABLED typo There's an extra C here... Fixes: `99c18ce580` ("cramfs: direct memory access support") Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-21 14:30:08 -04:00
Al Viro	f4e4d434fe	befs_lookup(): use d_splice_alias() RTFS(Documentation/filesystems/nfs/Exporting) if you try to make something exportable. Fixes: `ac632f5b63` "befs: add NFS export support" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-21 14:30:07 -04:00
Al Viro	87fbd639c0	affs_lookup: switch to d_splice_alias() Making something exportable takes more than providing ->s_export_ops. In particular, ->lookup() MUST use d_splice_alias() instead of d_add(). Reading Documentation/filesystems/nfs/Exporting would've been a good idea; as it is, exporting AFFS is badly (and exploitably) broken. Partially-Fixes: `ed4433d723` "fs/affs: make affs exportable" Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-21 14:29:12 -04:00
Al Viro	30da870ce4	affs_lookup(): close a race with affs_remove_link() we unlock the directory hash too early - if we are looking at secondary link and primary (in another directory) gets removed just as we unlock, we could have the old primary moved in place of the secondary, leaving us to look into freed entry (and leaving our dentry with ->d_fsdata pointing to a freed entry). Cc: stable@vger.kernel.org # 2.4.4+ Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-21 14:27:45 -04:00
Danilo Krummrich	030c7e0bb7	vfs: namei: use path_equal() in follow_dotdot() Use path_equal() to detect whether we're already in root. Signed-off-by: Danilo Krummrich <danilokrummrich@dk-develop.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-17 22:19:39 -04:00
Li Qiang	75abe32946	fs.h: fix outdated comment about file flags The __dentry_open function was removed in commit <2a027e7a18738>("fold __dentry_open() into its sole caller"). Signed-off-by: Li Qiang <liq3ea@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-17 22:19:39 -04:00
Al Viro	e919328810	__inode_security_revalidate() never gets NULL opt_dentry Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-14 09:51:42 -04:00
Al Viro	2220c5b0a7	make xattr_getsecurity() static many years overdue... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-14 09:51:34 -04:00
Al Viro	b127125d9d	fix breakage caused by d_find_alias() semantics change "VFS: don't keep disconnected dentries on d_anon" had a non-trivial side-effect - d_unhashed() now returns true for those dentries, making d_find_alias() skip them altogether. For most of its callers that's fine - we really want a connected alias there. However, there is a codepath where we relied upon picking such aliases if nothing else could be found - selinux delayed initialization of contexts for inodes on already mounted filesystems used to rely upon that. Cc: stable@kernel.org # `f1ee616214` "VFS: don't keep disconnected dentries on d_anon" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-13 15:47:58 -04:00
Al Viro	f6ddc16175	vfat: simplify checks in vfat_lookup() vfat_d_anon_disconn() is called only if alias->d_parent is equal to dentry->d_parent and it returns false unless alias->d_parent == alias. But in that case alias is the directory we are doing lookup in, and d_splice_alias() would've done the right thing. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-13 12:09:14 -04:00
Al Viro	61fec493c9	get rid of dead code in d_find_alias() All "try disconnected alias if nothing else fits" logics in d_find_alias() got accidentally disabled by Neil a while ago; for most of the callers it was the right thing to do, so fixes belong in few callers that do want disconnected aliases. This just takes the now-dead code in d_find_alias() out. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-13 12:08:32 -04:00
Dave Chinner	79f546a696	fs: don't scan the inode cache before SB_BORN is set We recently had an oops reported on a 4.14 kernel in xfs_reclaim_inodes_count() where sb->s_fs_info pointed to garbage and so the m_perag_tree lookup walked into lala land. It produces an oops down this path during the failed mount: radix_tree_gang_lookup_tag+0xc4/0x130 xfs_perag_get_tag+0x37/0xf0 xfs_reclaim_inodes_count+0x32/0x40 xfs_fs_nr_cached_objects+0x11/0x20 super_cache_count+0x35/0xc0 shrink_slab.part.66+0xb1/0x370 shrink_node+0x7e/0x1a0 try_to_free_pages+0x199/0x470 __alloc_pages_slowpath+0x3a1/0xd20 __alloc_pages_nodemask+0x1c3/0x200 cache_grow_begin+0x20b/0x2e0 fallback_alloc+0x160/0x200 kmem_cache_alloc+0x111/0x4e0 The problem is that the superblock shrinker is running before the filesystem structures it depends on have been fully set up. i.e. the shrinker is registered in sget(), before ->fill_super() has been called, and the shrinker can call into the filesystem before fill_super() does its setup work. Essentially we are exposed to both use-after-free and use-before-initialisation bugs here. To fix this, add a check for the SB_BORN flag in super_cache_count. In general, this flag is not set until ->fs_mount() completes successfully, so we know that it is set after the filesystem setup has completed. This matches the trylock_super() behaviour which will not let super_cache_scan() run if SB_BORN is not set, and hence will not allow the superblock shrinker to enter the filesystem while it is being set up or after it has failed setup and is being torn down. Cc: stable@kernel.org Signed-Off-By: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-11 15:37:57 -04:00
Al Viro	1e2e547a93	do d_instantiate/unlock_new_inode combinations safely For anything NFS-exported we do _not_ want to unlock new inode before it has grown an alias; original set of fixes got the ordering right, but missed the nasty complication in case of lockdep being enabled - unlock_new_inode() does lockdep_annotate_inode_mutex_key(inode) which can only be done before anyone gets a chance to touch ->i_mutex. Unfortunately, flipping the order and doing unlock_new_inode() before d_instantiate() opens a window when mkdir can race with open-by-fhandle on a guessed fhandle, leading to multiple aliases for a directory inode and all the breakage that follows from that. Correct solution: a new primitive (d_instantiate_new()) combining these two in the right order - lockdep annotate, then d_instantiate(), then the rest of unlock_new_inode(). All combinations of d_instantiate() with unlock_new_inode() should be converted to that. Cc: stable@kernel.org # 2.6.29 and later Tested-by: Mike Marshall <hubcap@omnibond.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-05-11 15:36:37 -04:00
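
The message of `baf10564fb` above argues that an unconditional "get" can bump a refcount that has already dropped to zero, while a "tryget live" that fails once the object has been killed closes that window. The block below is a minimal userspace sketch of that idea in C11 atomics, not the kernel's percpu_ref code: `struct ref`, `REF_DEAD`, `REF_ONE` and every `ref_*()` function are hypothetical names invented for illustration. It assumes the "dead" flag and the count share one atomic word, so that the liveness check and the increment happen in a single compare-and-swap.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: bit 0 is the "killed" flag, the count lives above it. */
#define REF_DEAD 1UL
#define REF_ONE  2UL

struct ref {
	atomic_ulong state;
};

/* Start with one reference, standing in for the one held by a lookup table. */
static inline void ref_init(struct ref *r)
{
	atomic_init(&r->state, REF_ONE);
}

/* Current number of references, ignoring the flag bit. */
static inline unsigned long ref_count(struct ref *r)
{
	return atomic_load(&r->state) / REF_ONE;
}

/* Unconditional get: also "succeeds" on a count that has already hit zero. */
static inline void ref_get(struct ref *r)
{
	atomic_fetch_add(&r->state, REF_ONE);
}

/* Take a reference only if the object has not been killed. */
static inline bool ref_tryget_live(struct ref *r)
{
	unsigned long old = atomic_load(&r->state);

	while (!(old & REF_DEAD)) {
		/* On failure the CAS reloads old, so the flag is rechecked. */
		if (atomic_compare_exchange_weak(&r->state, &old, old + REF_ONE))
			return true;
	}
	return false;
}

/* Drop one reference; returns true if it was the last one. */
static inline bool ref_put(struct ref *r)
{
	unsigned long old = atomic_fetch_sub(&r->state, REF_ONE);

	return old / REF_ONE == 1;
}

/* Mark the object dead, then drop the initial reference. */
static inline bool ref_kill(struct ref *r)
{
	atomic_fetch_or(&r->state, REF_DEAD);
	return ref_put(r);
}
```

In this sketch, a lookup that calls `ref_tryget_live()` after `ref_kill()` gets `false` and can treat it as a miss, leaving the count untouched. A lookup that calls `ref_get()` at the same point raises the count from 0 to 1 on an object whose release has already been triggered, which is the situation the commit message describes.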

752042 Commits