linux

iv/linux

Author	SHA1	Message	Date
Yan, Zheng	508b32d866	ceph: request xattrs if xattr_version is zero Following sequence of events can happen. - Client releases an inode, queues cap release message. - A 'lookup' reply brings the same inode back, but the reply doesn't contain xattrs because MDS didn't receive the cap release message and thought client already has up-to-data xattrs. The fix is force sending a getattr request to MDS if xattrs_version is 0. The getattr mask is set to CEPH_STAT_CAP_XATTR, so MDS knows client does not have xattr. Signed-off-by: Yan, Zheng <zyan@redhat.com>	2014-10-14 21:03:38 +04:00
Yan, Zheng	03974e8177	ceph: make sure request isn't in any waiting list when kicking request. we may corrupt waiting list if a request in the waiting list is kicked. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>	2014-10-14 21:03:24 +04:00
Yan, Zheng	656e438294	ceph: protect kick_requests() with mdsc->mutex Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>	2014-10-14 21:03:24 +04:00
Yan, Zheng	5d23371fdb	ceph: trim unused inodes before reconnecting to recovering MDS So the recovering MDS does not need to fetch these ununsed inodes during cache rejoin. This may reduce MDS recovery time. Signed-off-by: Yan, Zheng <zyan@redhat.com>	2014-10-14 21:03:22 +04:00
Martin K. Petersen	e19a8a0ad2	block: Remove REQ_KERNEL REQ_KERNEL is no longer used. Remove it and drop the redundant uio argument to nfs_file_direct_{read,write}. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-10-14 09:00:44 -06:00
Vinícius Tinti	0458a953d8	btrfs: LLVMLinux: Remove VLAIS Replaced the use of a Variable Length Array In Struct (VLAIS) with a C99 compliant equivalent. This patch instead allocates the appropriate amount of memory using a char array using the SHASH_DESC_ON_STACK macro. The new code can be compiled with both gcc and clang. Signed-off-by: Vinícius Tinti <viniciustinti@gmail.com> Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de> Reviewed-by: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Behan Webster <behanw@converseincode.com> Acked-by: Chris Mason <clm@fb.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net>	2014-10-14 10:51:22 +02:00
Linus Torvalds	1b5a5f59e3	FS-Cache fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIVAwUAVDwD3ROxKuMESys7AQLayg//Tmdi4eLzcky/HcOfAoVIY3B5Wvs1MBbN 3HhaYWKDeJvWxFmRDfQK0c1dyjBA2Xe7bPhdwQ8S9epAWAoW6D4g3Mg2+YReGLCK U/CcrMHN77RSydTG0Mj/Z99IynSdf9rwdNrCEy8NiNkGe8Z/JCFPpZurRCc4PL44 4miTUq3ESMTGkUsa9BH+T0ngEka2ZdwnmzlYkdzeqmjmlbFx8RxcEewBeAoAlU73 eihKKyX+1uWX/2DmJol5NtZx+BbNkFsO+pX+s+70TsbjiyILCAmgh5meTpkGsDrW iJGcgxwhcmyq1aTPcHRmXeNsVenbqRefGUtz7B5Q0x1Uk+ofRYfVVdiyTS2juGbC DFGyNBUcFqsmbSMxM+yZGSzgR9KbzoZHDR/ppbJfMqIoe+oGju/NE+AZ6Q3f2/Es AIGc8imc96QU08OnrZtreZxfgFMcFxBoGHvAM9AUr1ue80SWhVRZjwYx/JcIP7Cm TKyilgb5hfxJ7zon+JuHSqttpeG3zOTjjhcKDmJlybYkKlTeRXm6ZcKVrro5d2+z GLnH32HQRJvXBZslymqb7OgkxIW4ySO3PcAWTosUv9+zG0BPR1mB0NVQrSLEPk4L JHA+Mjp8O37pN3kRantVNHk73t0z4qkbi8Ixft0yAus9qNNFMeKh+7NbBRjxUZAU ARcAbvVMyT0= =RtLr -----END PGP SIGNATURE----- Merge tag 'fscache-fixes-20141013' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull fs-cache fixes from David Howells: "Two fixes for bugs in CacheFiles and a cleanup in FS-Cache" * tag 'fscache-fixes-20141013' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: fs/fscache/object-list.c: use __seq_open_private() CacheFiles: Fix incorrect test for in-memory object collision CacheFiles: Handle object being killed before being set up	2014-10-14 08:40:15 +02:00
Linus Torvalds	b11445f830	* Fix for a theoretical race condition which could lead to a situation when UBIFS is unable to mount a file-system (Hujianyang) * Few fixes for the ubiblock sybsystem, error path fixes * The ubiblock subsystem has had the volume size change handling improved * Few fixes and nicifications in the fastmap subsystem -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUO9l0AAoJECmIfjd9wqK01TwP/jAcA7GEnxUpQ8UFBZhJEIN0 0Ad4oDrGShpuEYgYyRFjCstuXErJBhMwImrevJhRmwxaY2fzGqBeDO9YKKGkKDfa qjGsQrUaCJgV6qC2iT056ZmI7V/XyZfnZQ4Z8nQbzafoJ3MPbB6ExqBy8CZi8q/6 A516cen/cnZfHOQ1aqN6gyw2l976IzdJx8v0WOeYaXcvfDMrmfY8mkfh7EahOIVm Kz9BVlVRxxfKPCqMpm+xV8KAOsMueOnKy+6zL7rFh+AvLQBACq44BV1HkZtg2avX NBAo1RTPumeCht2t4nLJfgc+BJZ7cNpNFAijsWVJxp6umUqlsnbqckAx69O+JE9/ VZjM1KN1suI0bm01bj6xysGvg+JNTMiZ+HEqiseICSWtDbnCT4qDL3MPFgmD9OYh 9ar92Ku2HeY3DakKNd89gqw0ey28cv4i957KleneYzewcfFQ5pC/dp4thcDWa5fH AHoblC4ShmcURDPYsIKRZsiTUf/uf3iLFIWAGJBDnSRg4dzzjoJkenz4W5ecWFDj JokceklSf0zm8qAAdIUXw5Sihza1cnSBAIYBxVR808U+bwkCTOFF5xcTQy6wKf3y NBb+ygh/ugps8B2evJEmp6ByLWQZr8j1q7IokZtglKWN2qOTfzyMxzlWl9vOQJYq EQytnka5OEEXamr7g1iB =2XCN -----END PGP SIGNATURE----- Merge tag 'upstream-3.18-rc1-v2' of git://git.infradead.org/linux-ubifs Pull UBI/UBIFS fixes from Artem Bityutskiy: - fix for a theoretical race condition which could lead to a situation when UBIFS is unable to mount a file-system (Hujianyang) - a few fixes for the ubiblock sybsystem, error path fixes - the ubiblock subsystem has had the volume size change handling improved - a few fixes and nicifications in the fastmap subsystem * tag 'upstream-3.18-rc1-v2' of git://git.infradead.org/linux-ubifs: UBI: Fastmap: Calc fastmap size correctly UBIFS: Fix trivial typo in power_cut_emulated() UBI: Fix trivial typo in __schedule_ubi_work UBI: wl: Rename cancel flag to shutdown UBI: ubi_eba_read_leb: Remove in vain variable assignment UBIFS: Align the dump messages of SB_NODE UBI: Fix livelock in produce_free_peb() UBI: return on error in rename_volumes() UBI: Improve comment on work_sem UBIFS: Remove bogus assert UBI: Dispatch update notification if the volume is updated UBI: block: Add support for the UBI_VOLUME_UPDATED notification UBI: block: Fix block device size setting UBI: block: fix dereference on uninitialized dev UBI: add missing kmem_cache_free() in process_pool_aeb error path UBIFS: fix free log space calculation UBIFS: fix a race condition	2014-10-14 08:38:54 +02:00
Darrick J. Wong	813d32f913	ext4: check s_chksum_driver when looking for bg csum presence Convert the ext4_has_group_desc_csum predicate to look for a checksum driver instead of the metadata_csum flag and change the bg checksum calculation function to look for GDT_CSUM before taking the crc16 path. Without this patch, if we mount with ^uninit_bg,^metadata_csum and later metadata_csum gets turned on by accident, the block group checksum functions will incorrectly assume that checksumming is enabled (metadata_csum) but that crc16 should be used (!s_chksum_driver). This is totally wrong, so fix the predicate and the checksum formula selection. (Granted, if the metadata_csum feature bit gets enabled on a live FS then something underhanded is going on, but we could at least avoid writing garbage into the on-disk fields.) Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org> Cc: stable@vger.kernel.org	2014-10-14 02:35:49 -04:00
Linus Torvalds	0ef3a56b1c	Merge branch 'CVE-2014-7975' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux Pull do_umount fix from Andy Lutomirski: "This fix really ought to be safe. Inside a mountns owned by a non-root user namespace, the namespace root almost always has MNT_LOCKED set (if it doesn't, then there's a bug, because rootfs could be exposed). In that case, calling umount on "/" will return -EINVAL with or without this patch. Outside a userns, this patch will have no effect. may_mount, required by umount, already checks ns_capable(current->nsproxy->mnt_ns->user_ns, CAP_SYS_ADMIN) so an additional capable(CAP_SYS_ADMIN) check will have no effect. That leaves anything that calls umount on "/" in a non-root userns while chrooted. This is the case that is currently broken (it remounts ro, which shouldn't be allowed) and that my patch changes to -EPERM. If anything relies on that, I'd be surprised" * 'CVE-2014-7975' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux: fs: Add a missing permission check to do_umount	2014-10-14 08:35:01 +02:00
Peter Feiner	64e455079e	mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared For VMAs that don't want write notifications, PTEs created for read faults have their write bit set. If the read fault happens after VM_SOFTDIRTY is cleared, then the PTE's softdirty bit will remain clear after subsequent writes. Here's a simple code snippet to demonstrate the bug: char* m = mmap(NULL, getpagesize(), PROT_READ \| PROT_WRITE, MAP_ANONYMOUS \| MAP_SHARED, -1, 0); system("echo 4 > /proc/$PPID/clear_refs"); /* clear VM_SOFTDIRTY / assert(m == '\0'); /* new PTE allows write access / assert(!soft_dirty(x)); m = 'x'; /* should dirty the page / assert(soft_dirty(x)); / fails */ With this patch, write notifications are enabled when VM_SOFTDIRTY is cleared. Furthermore, to avoid unnecessary faults, write notifications are disabled when VM_SOFTDIRTY is set. As a side effect of enabling and disabling write notifications with care, this patch fixes a bug in mprotect where vm_page_prot bits set by drivers were zapped on mprotect. An analogous bug was fixed in mmap by commit c9d0bf241451 ("mm: uncached vma support with writenotify"). Signed-off-by: Peter Feiner <pfeiner@google.com> Reported-by: Peter Feiner <pfeiner@google.com> Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Jamie Liu <jamieliu@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:28 +02:00
Zach Brown	9470dd5d35	fs: check bh blocknr earlier when searching lru It's very common for the buffer heads in the lru to have different block numbers. By comparing the blocknr before the bdev and size we can reduce the cost of searching in the very common case where all the entries have the same bdev and size. In quick hot cache cycle counting tests on a single fs workstation this cut the cost of a miss by about 20%. A diff of the disassembly shows the reordering of the bdev and blocknr comparisons. This is in such a tiny loop that skipping one comparison is a meaningful portion of the total work being done: 1628: 83 c1 01 add $0x1,%ecx 162b: 83 f9 08 cmp $0x8,%ecx 162e: 74 60 je 1690 <__find_get_block+0xa0> 1630: 89 c8 mov %ecx,%eax 1632: 65 4c 8b 04 c5 00 00 mov %gs:0x0(,%rax,8),%r8 1639: 00 00 163b: 4d 85 c0 test %r8,%r8 163e: 4c 89 c3 mov %r8,%rbx 1641: 74 e5 je 1628 <__find_get_block+0x38> - 1643: 4d 3b 68 30 cmp 0x30(%r8),%r13 + 1643: 4d 3b 68 18 cmp 0x18(%r8),%r13 1647: 75 df jne 1628 <__find_get_block+0x38> - 1649: 4d 3b 60 18 cmp 0x18(%r8),%r12 + 1649: 4d 3b 60 30 cmp 0x30(%r8),%r12 164d: 75 d9 jne 1628 <__find_get_block+0x38> 164f: 49 39 50 20 cmp %rdx,0x20(%r8) 1653: 75 d3 jne 1628 <__find_get_block+0x38> Signed-off-by: Zach Brown <zab@zabbo.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:26 +02:00
Rasmus Villemoes	a97df4277d	isofs: replace strnicmp with strncasecmp The kernel used to contain two functions for length-delimited, case-insensitive string comparison, strnicmp with correct semantics and a slightly buggy strncasecmp. The latter is the POSIX name, so strnicmp was renamed to strncasecmp, and strnicmp made into a wrapper for the new strncasecmp to avoid breaking existing users. To allow the compat wrapper strnicmp to be removed at some point in the future, and to avoid the extra indirection cost, do s/strnicmp/strncasecmp/g. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:24 +02:00
Rasmus Villemoes	2bd63329cb	ocfs2: replace strnicmp with strncasecmp The kernel used to contain two functions for length-delimited, case-insensitive string comparison, strnicmp with correct semantics and a slightly buggy strncasecmp. The latter is the POSIX name, so strnicmp was renamed to strncasecmp, and strnicmp made into a wrapper for the new strncasecmp to avoid breaking existing users. To allow the compat wrapper strnicmp to be removed at some point in the future, and to avoid the extra indirection cost, do s/strnicmp/strncasecmp/g. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:24 +02:00
Rasmus Villemoes	87e747cdb9	cifs: replace strnicmp with strncasecmp The kernel used to contain two functions for length-delimited, case-insensitive string comparison, strnicmp with correct semantics and a slightly buggy strncasecmp. The latter is the POSIX name, so strnicmp was renamed to strncasecmp, and strnicmp made into a wrapper for the new strncasecmp to avoid breaking existing users. To allow the compat wrapper strnicmp to be removed at some point in the future, and to avoid the extra indirection cost, do s/strnicmp/strncasecmp/g. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Steve French <sfrench@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:24 +02:00
Fabian Frederick	76e5121089	FS/OMFS: block number sanity check during fill_super operation This patch defines maximum block number to 2^31. It also converts bitmap_size and array_size to unsigned int in omfs_get_imap Signed-off-by: Fabian Frederick <fabf@skynet.be> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Bob Copeland <me@bobcopeland.com> Acked-by: Bob Copeland <me@bobcopeland.com> Tested-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:22 +02:00
Fabian Frederick	c70b17b653	fs/affs: remove redundant sys_tz declarations sys_tz is already declared in include/linux/time.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:22 +02:00
Fabian Frederick	73516ace94	fs/affs/file.c: fix shadow warnings Four functions declared variables twice resulting in shadow warnings. This patch renames internal variables and adds blank line after declarations. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:22 +02:00
Fabian Frederick	3bc759931d	fs/affs/inode.c: remove unused variable head is set to AFFS_HEAD(bh) but never used. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:22 +02:00
Fabian Frederick	1e907f4f11	fs/affs/super.c: remove unused variable key is set in affs_fill_super but never used. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:21 +02:00
Oleg Nesterov	b03023ecbd	coredump: add %i/%I in core_pattern to report the tid of the crashed thread format_corename() can only pass the leader's pid to the core handler, but there is no simple way to figure out which thread originated the coredump. As Jan explains, this also means that there is no simple way to create the backtrace of the crashed process: As programs are mostly compiled with implicit gcc -fomit-frame-pointer one needs program's .eh_frame section (equivalently PT_GNU_EH_FRAME segment) or .debug_frame section. .debug_frame usually is present only in separate debug info files usually not even installed on the system. While .eh_frame is a part of the executable/library (and it is even always mapped for C++ exceptions unwinding) it no longer has to be present anywhere on the disk as the program could be upgraded in the meantime and the running instance has its executable file already unlinked from disk. One possibility is to echo 0x3f >/proc/*/coredump_filter and dump all the file-backed memory including the executable's .eh_frame section. But that can create huge core files, for example even due to mmapped data files. Other possibility would be to read .eh_frame from /proc/PID/mem at the core_pattern handler time of the core dump. For the backtrace one needs to read the register state first which can be done from core_pattern handler: ptrace(PTRACE_SEIZE, tid, 0, PTRACE_O_TRACEEXIT) close(0); // close pipe fd to resume the sleeping dumper waitpid(); // should report EXIT PTRACE_GETREGS or other requests The remaining problem is how to get the 'tid' value of the crashed thread. It could be read from the first NT_PRSTATUS note of the core file but that makes the core_pattern handler complicated. Unfortunately %t is already used so this patch uses %i/%I. Automatic Bug Reporting Tool (https://github.com/abrt/abrt/wiki/overview) is experimenting with this. It is using the elfutils (https://fedorahosted.org/elfutils/) unwinder for generating the backtraces. Apart from not needing matching executables as mentioned above, another advantage is that we can get the backtrace without saving the core (which might be quite large) to disk. [mmilata@redhat.com: final paragraph of changelog] Signed-off-by: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Mark Wielaard <mjw@redhat.com> Cc: Martin Milata <mmilata@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:21 +02:00
Fabian Frederick	877aabd6ce	fat: remove redundant sys_tz declaration sys_tz is already declared extern struct in include/linux/time.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:20 +02:00
Fabian Frederick	54cc6cea73	fs/reiserfs/journal.c: fix sparse context imbalance warning Merge conditional unlock/lock in the same condition to avoid sparse warning: fs/reiserfs/journal.c:703:36: warning: context imbalance in 'add_to_chunk' - unexpected unlock Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:20 +02:00
Fabian Frederick	35c0b380d8	fs/ufs/balloc.c: remove unused variable ucg is defined and set in ufs_bitmap_search but never used. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:20 +02:00
Fabian Frederick	a792d90829	fs/hfs/hfs_fs.h: remove redundant sys_tz declaration sys_tz is already declared in include/linux/time.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:20 +02:00
Andreas Rohner	b9f6614072	nilfs2: improve the performance of fdatasync() Support for fdatasync() has been implemented in NILFS2 for a long time, but whenever the corresponding inode is dirty the implementation falls back to a full-flegded sync(). Since every write operation has to update the modification time of the file, the inode will almost always be dirty and fdatasync() will fall back to sync() most of the time. But this fallback is only necessary for a change of the file size and not for a change of the various timestamps. This patch adds a new flag NILFS_I_INODE_SYNC to differentiate between those two situations. * If it is set the file size was changed and a full sync is necessary. * If it is not set then only the timestamps were updated and fdatasync() can go ahead. There is already a similar flag I_DIRTY_DATASYNC on the VFS layer with the exact same semantics. Unfortunately it cannot be used directly, because NILFS2 doesn't implement write_inode() and doesn't clear the VFS flags when inodes are written out. So the VFS writeback thread can clear I_DIRTY_DATASYNC at any time without notifying NILFS2. So I_DIRTY_DATASYNC has to be mapped onto NILFS_I_INODE_SYNC in nilfs_update_inode(). Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:20 +02:00
Andreas Rohner	e2c7617ae3	nilfs2: add missing blkdev_issue_flush() to nilfs_sync_fs() Under normal circumstances nilfs_sync_fs() writes out the super block, which causes a flush of the underlying block device. But this depends on the THE_NILFS_SB_DIRTY flag, which is only set if the pointer to the last segment crosses a segment boundary. So if only a small amount of data is written before the call to nilfs_sync_fs(), no flush of the block device occurs. In the above case an additional call to blkdev_issue_flush() is needed. To prevent unnecessary overhead, the new flag nilfs->ns_flushed_device is introduced, which is cleared whenever new logs are written and set whenever the block device is flushed. For convenience the function nilfs_flush_device() is added, which contains the above logic. Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:20 +02:00
Himangi Saraogi	0f2a84f41a	fs/befs/btree.c: remove typedef befs_btree_node The Linux kernel coding style guidelines suggest not using typedefs for structure types. This patch gets rid of the typedef for befs_btree_node. The following Coccinelle semantic patch detects the case. @tn1@ type td; @@ typedef struct { ... } td; @script:python tf@ td << tn1.td; tdres; @@ coccinelle.tdres = td; @@ type tn1.td; identifier tf.tdres; @@ -typedef struct + tdres { ... } -td ; @@ type tn1.td; identifier tf.tdres; @@ -td + struct tdres Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:20 +02:00
NeilBrown	ef16cc5909	autofs4: d_manage() should return -EISDIR when appropriate in rcu-walk mode. If rcu-walk mode we don't have to return -EISDIR for non-mount-traps as we will simply drop into REF-walk and handling DCACHE_NEED_AUTOMOUNT dentrys the slow way. But it is better if we do when possible. In 'oz_mode', use the same condition as ref-walk: if not a mountpoint, then it must be -EISDIR. In regular mode there are most tests needed. Most of them can be performed without taking any spinlocks. If we find a directory that isn't obviously empty, and isn't mounted on, we need to call 'simple_empty()' which does take a spinlock. If this turned out to hurt performance, some other approach could be found to signal when a directory is known to be empty. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Ian Kent <raven@themaw.net> Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:16 +02:00
NeilBrown	4d885f90e3	autofs4: avoid taking fs_lock during rcu-walk ->fs_lock protects AUTOFS_INF_EXPIRING. We need to be sure that once the flag is set, no new references beneath the dentry are taken. So rcu-walk currently needs to take fs_lock before checking the flag. This hurts performance. Change the expiry to a two-stage process. First set AUTOFS_INF_NO_RCU which forces any path walk into ref-walk mode, then drop the lock and call synchronize_rcu(). Once that returns we can be sure no rcu-walk is active beneath the dentry and we can check reference counts again. Now during an RCU-walk we can test AUTOFS_INF_EXPIRING without taking the lock as along as we test AUTOFS_INF_NO_RCU too. If either are set, we must abort the RCU-walk If neither are set, we know that refcounts will be tested again after we finish the RCU-walk so we are safe to continue. ->fs_lock is still taken in d_manage() to check for a non-trap directory. That will be resolved in the next patch. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Ian Kent <raven@themaw.net> Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:16 +02:00
NeilBrown	6ece08e618	autofs4: make "autofs4_can_expire" idempotent. Have a "test" function change the value it is testing can be confusing, particularly as a future patch will be calling this function twice. So move the update for 'last_used' to avoid repeat expiry to the place where the final determination on what to expire is known. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Ian Kent <raven@themaw.net> Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:16 +02:00
NeilBrown	a5d1dba143	autofs4: factor should_expire() out of autofs4_expire_indirect. Future patch will potentially call this twice, so make it separate. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Ian Kent <raven@themaw.net> Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:16 +02:00
NeilBrown	23bfc2a24e	autofs4: allow RCU-walk to walk through autofs4 This series teaches autofs about RCU-walk so that we don't drop straight into REF-walk when we hit an autofs directory, and so that we avoid spinlocks as much as possible when performing an RCU-walk. This is needed so that the benefits of the recent NFS support for RCU-walk are fully available when NFS filesystems are automounted. Patches have been carefully reviewed and tested both with test suites and in production - thanks a lot to Ian Kent for his support there. This patch (of 6): Any attempt to look up a pathname that passes though an autofs4 mount is currently forced out of RCU-walk into REF-walk. This can significantly hurt performance of many-thread work loads on many-core systems, especially if the automounted filesystem supports RCU-walk but doesn't get to benefit from it. So if autofs4_d_manage is called with rcu_walk set, only fail with -ECHILD if it is necessary to wait longer than a spinlock. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Ian Kent <raven@themaw.net> Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:16 +02:00
Fabian Frederick	8a273345dc	fs/ncpfs/dir.c: remove redundant sys_tz declaration sys_tz is already declared in include/linux/time.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:16 +02:00
Arnd Bergmann	de8288b1f8	binfmt_misc: work around gcc-4.9 warning gcc-4.9 on ARM gives us a mysterious warning about the binfmt_misc parse_command function: fs/binfmt_misc.c: In function 'parse_command.part.3': fs/binfmt_misc.c:405:7: warning: array subscript is above array bounds [-Warray-bounds] I've managed to trace this back to the ARM implementation of memset, which is called from copy_from_user in case of a fault and which does #define memset(p,v,n) \ ({ \ void *__p = (p); size_t __n = n; \ if ((__n) != 0) { \ if (__builtin_constant_p((v)) && (v) == 0) \ __memzero((__p),(__n)); \ else \ memset((__p),(v),(__n)); \ } \ (__p); \ }) Apparently gcc gets confused by the check for "size != 0" and believes that the size might be zero when it gets to the line that does "if (s[count-1] == '\n')", so it would access data outside of the array. gcc is clearly wrong here, since this condition was already checked earlier in the function and the 'size' value can not change in the meantime. Fortunately, we can work around it and get rid of the warning by rearranging the function to check for zero size after doing the copy_from_user. It is still safe to pass a zero size into copy_from_user, so it does not cause any side effects. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:16 +02:00
Mike Frysinger	bbaecc0882	binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the best case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:15 +02:00
Rob Jones	d5d962265d	fs/fscache/object-list.c: use __seq_open_private() Reduce boilerplate code by using __seq_open_private() instead of seq_open() in fscache_objlist_open(). Signed-off-by: Rob Jones <rob.jones@codethink.co.uk> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com>	2014-10-13 17:52:21 +01:00
David Howells	a30efe261b	CacheFiles: Fix incorrect test for in-memory object collision When CacheFiles cache objects are in use, they have in-memory representations, as defined by the cachefiles_object struct. These are kept in a tree rooted in the cache and indexed by dentry pointer (since there's a unique mapping between object index key and dentry). Collisions can occur between a representation already in the tree and a new representation being set up because it takes time to dispose of an old representation - particularly if it must be unlinked or renamed. When such a collision occurs, cachefiles_mark_object_active() is meant to check to see if the old, already-present representation is in the process of being discarded (ie. FSCACHE_OBJECT_IS_LIVE is not set on it) - and, if so, wait for the representation to be removed (ie. CACHEFILES_OBJECT_ACTIVE is then cleared). However, the test for whether the old representation is still live is checking the new object - which always will be live at this point. This leads to an oops looking like: CacheFiles: Error: Unexpected object collision object: OBJ1b354 objstate=LOOK_UP_OBJECT fl=8 wbusy=2 ev=0[0] ops=0 inp=0 exc=0 parent=ffff88053f5417c0 cookie=ffff880538f202a0 [pr=ffff8805381b7160 nd=ffff880509c6eb78 fl=27] key=[8] '2490000000000000' xobject: OBJ1a600 xobjstate=DROP_OBJECT fl=70 wbusy=2 ev=0[0] xops=0 inp=0 exc=0 xparent=ffff88053f5417c0 xcookie=ffff88050f4cbf70 [pr=ffff8805381b7160 nd= (null) fl=12] ------------[ cut here ]------------ kernel BUG at fs/cachefiles/namei.c:200! ... Workqueue: fscache_object fscache_object_work_func [fscache] ... RIP: ... cachefiles_walk_to_object+0x7ea/0x860 [cachefiles] ... Call Trace: [<ffffffffa04dadd8>] ? cachefiles_lookup_object+0x58/0x100 [cachefiles] [<ffffffffa01affe9>] ? fscache_look_up_object+0xb9/0x1d0 [fscache] [<ffffffffa01afc4d>] ? fscache_parent_ready+0x2d/0x80 [fscache] [<ffffffffa01b0672>] ? fscache_object_work_func+0x92/0x1f0 [fscache] [<ffffffff8107e82b>] ? process_one_work+0x16b/0x400 [<ffffffff8107fc16>] ? worker_thread+0x116/0x380 [<ffffffff8107fb00>] ? manage_workers.isra.21+0x290/0x290 [<ffffffff81085edc>] ? kthread+0xbc/0xe0 [<ffffffff81085e20>] ? flush_kthread_worker+0x80/0x80 [<ffffffff81502d0c>] ? ret_from_fork+0x7c/0xb0 [<ffffffff81085e20>] ? flush_kthread_worker+0x80/0x80 Reported-by: Manuel Schölling <manuel.schoelling@gmx.de> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com>	2014-10-13 17:52:21 +01:00
Trond Myklebust	b8fb9c30f2	NFS: Fix a bogus warning in nfs_generic_pgio It is OK for pageused == pagecount in the loop, as long as we don't add another entry to the *pages array. Move the test so that it only triggers in that case. Reported-by: Steve Dickson <SteveD@redhat.com> Fixes: bba5c1887a92 (nfs: disallow duplicate pages in pgio page vectors) Cc: Weston Andros Adamson <dros@primarydata.com> Cc: stable@vger.kernel.org # 3.16.x Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-10-13 11:04:02 -04:00
Trond Myklebust	3caa0c6ed7	NFS: Fix an uninitialised pointer Oops in the writeback error path SteveD reports the following Oops: RIP: 0010:[<ffffffffa053461d>] [<ffffffffa053461d>] __put_nfs_open_context+0x1d/0x100 [nfs] RSP: 0018:ffff880fed687b90 EFLAGS: 00010286 RAX: 0000000000000024 RBX: 0000000000000000 RCX: 0000000000000006 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff880fed687bc0 R08: 0000000000000092 R09: 000000000000047a R10: 0000000000000000 R11: ffff880fed6878d6 R12: ffff880fed687d20 R13: ffff880fed687d20 R14: 0000000000000070 R15: ffffea000aa33ec0 FS: 00007fce290f0740(0000) GS:ffff8807ffc60000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000070 CR3: 00000007f2e79000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: 0000000000000000 ffff880036c5e510 ffff880fed687d20 ffff880fed687d20 ffff880036c5e200 ffffea000aa33ec0 ffff880fed687bd0 ffffffffa0534710 ffff880fed687be8 ffffffffa053d5f0 ffff880036c5e200 ffff880fed687c08 Call Trace: [<ffffffffa0534710>] put_nfs_open_context+0x10/0x20 [nfs] [<ffffffffa053d5f0>] nfs_pgio_data_destroy+0x20/0x40 [nfs] [<ffffffffa053d672>] nfs_pgio_error+0x22/0x40 [nfs] [<ffffffffa053d8f4>] nfs_generic_pgio+0x74/0x2e0 [nfs] [<ffffffffa06b18c3>] pnfs_generic_pg_writepages+0x63/0x210 [nfsv4] [<ffffffffa053d579>] nfs_pageio_doio+0x19/0x50 [nfs] [<ffffffffa053eb84>] nfs_pageio_complete+0x24/0x30 [nfs] [<ffffffffa053cb25>] nfs_direct_write_schedule_iovec+0x115/0x1f0 [nfs] [<ffffffffa053675f>] ? nfs_get_lock_context+0x4f/0x120 [nfs] [<ffffffffa053d252>] nfs_file_direct_write+0x262/0x420 [nfs] [<ffffffffa0532d91>] nfs_file_write+0x131/0x1d0 [nfs] [<ffffffffa0532c60>] ? nfs_need_sync_write.isra.17+0x40/0x40 [nfs] [<ffffffff812127b8>] do_io_submit+0x3b8/0x840 [<ffffffff81212c50>] SyS_io_submit+0x10/0x20 [<ffffffff81610f29>] system_call_fastpath+0x16/0x1b This is due to the calls to nfs_pgio_error() in nfs_generic_pgio(), which happen before the nfs_pgio_header's open context is referenced in nfs_pgio_rpcsetup(). Reported-by: Steve Dickson <SteveD@redhat.com> Cc: stable@vger.kernel.org # 3.16.x Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-10-13 10:26:43 -04:00
Linus Torvalds	faafcba3b5	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave Hansen) - Various sched/idle refinements for better idle handling (Nicolas Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot) - sched/numa updates and optimizations (Rik van Riel) - sysbench speedup (Vincent Guittot) - capacity calculation cleanups/refactoring (Vincent Guittot) - Various cleanups to thread group iteration (Oleg Nesterov) - Double-rq-lock removal optimization and various refactorings (Kirill Tkhai) - various sched/deadline fixes ... and lots of other changes" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) sched/dl: Use dl_bw_of() under rcu_read_lock_sched() sched/fair: Delete resched_cpu() from idle_balance() sched, time: Fix build error with 64 bit cputime_t on 32 bit systems sched: Improve sysbench performance by fixing spurious active migration sched/x86: Fix up typo in topology detection x86, sched: Add new topology for multi-NUMA-node CPUs sched/rt: Use resched_curr() in task_tick_rt() sched: Use rq->rd in sched_setaffinity() under RCU read lock sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask' sched: Use dl_bw_of() under RCU read lock sched/fair: Remove duplicate code from can_migrate_task() sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW sched: print_rq(): Don't use tasklist_lock sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock() sched: Fix the task-group check in tg_has_rt_tasks() sched/fair: Leverage the idle state info when choosing the "idlest" cpu sched: Let the scheduler see CPU idle states sched/deadline: Fix inter- exclusive cpusets migrations sched/deadline: Clear dl_entity params when setscheduling to different class sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault() ...	2014-10-13 16:23:15 +02:00
Linus Torvalds	d6dd50e07c	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main changes in this cycle were: - changes related to No-CBs CPUs and NO_HZ_FULL - RCU-tasks implementation - torture-test updates - miscellaneous fixes - locktorture updates - RCU documentation updates" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits) workqueue: Use cond_resched_rcu_qs macro workqueue: Add quiescent state between work items locktorture: Cleanup header usage locktorture: Cannot hold read and write lock locktorture: Fix __acquire annotation for spinlock irq locktorture: Support rwlocks rcu: Eliminate deadlock between CPU hotplug and expedited grace periods locktorture: Document boot/module parameters rcutorture: Rename rcutorture_runnable parameter locktorture: Add test scenario for rwsem_lock locktorture: Add test scenario for mutex_lock locktorture: Make torture scripting account for new _runnable name locktorture: Introduce torture context locktorture: Support rwsems locktorture: Add infrastructure for torturing read locks torture: Address race in module cleanup locktorture: Make statistics generic locktorture: Teach about lock debugging locktorture: Support mutexes locktorture: Add documentation ...	2014-10-13 15:44:12 +02:00
Linus Torvalds	5ff0b9e1a1	xfs: update for 3.18-rc1 This update contains: o various cleanups o log recovery debug hooks o seek hole/data implementation merge o extent shift rework to fix collapse range bugs o various sparse warning fixes o log recovery transaction processing rework to fix use after free bugs o metadata buffer IO infrastructuer rework to ensure all buffers under IO have valid reference counts o various fixes for ondisk flags, writeback and zero range corner cases -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJUOxyCAAoJEK3oKUf0dfodzt8QAKcFdE8hyCAnD8IK85v46gWG IHnxOlTLrhs/22wfD1fSUcjCBQsQAIloQihvVGStugFnkEUHOUjlZ/oMcGNFPECC L7B4Ns6WmA9TA8ibgYvLZepautNjzhS5/lGfqSWpw4hQPsJJp2fGyCVF/ZhwnP6D qPeflVic8E8rgaJp98X8uFyZ+9EEoSF7/9EhmvVNwsO6UaThhIO/oPydx8oNrhKS k6aADmxNYtFWJb6kUjFbXaJwrFIFLvJc60FZz2eUViVGBx6K8D5FBiVbzZKe2WZ6 VOz4fj63BYI7Nxk4rZGJPoyql+ChO/pIVwH15ZmYRkkgUXs8FGy85mNKMg7DHnFm K/ZUhW5IBc6GtkwPCjNIM642IQYnTR5SdQfFxMS2EYPBUumcQ3EbD44aGZY69YYu pP+2g4b1diadNkGACccj6teQ9V0fbyF0lfZqoZMeN/W0As6l9oYa0yFBGsK9sblq yrPfce+wEy5HBy9M7Fqpvm3bwMunNViqilGZXKlOyodSgXxSF3JwXuc+8/TNwcnL O0RSD7R7k6TvrmAntTgwT4beZi4ziG+/tVa0rD3mJM/sXyzcP2bwbA1APM74NcHh p8mrJRci6vtKPwIylQ1xeCeK/WD21OhbJWBYR+0JOEJSnAjtv8flk7mqGLhy+M+Y yCdHJIfuJ4NKj4X3f0Kc =TdAB -----END PGP SIGNATURE----- Merge tag 'xfs-for-linus-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs Pull xfs update from Dave Chinner: "This update contains: - various cleanups - log recovery debug hooks - seek hole/data implementation merge - extent shift rework to fix collapse range bugs - various sparse warning fixes - log recovery transaction processing rework to fix use after free bugs - metadata buffer IO infrastructuer rework to ensure all buffers under IO have valid reference counts - various fixes for ondisk flags, writeback and zero range corner cases" * tag 'xfs-for-linus-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (56 commits) xfs: fix agno increment in xfs_inumbers() loop xfs: xfs_iflush_done checks the wrong log item callback xfs: flush the range before zero range conversion xfs: restore buffer_head unwritten bit on ioend cancel xfs: check for null dquot in xfs_quota_calc_throttle() xfs: fix crc field handling in xfs_sb_to/from_disk xfs: don't send null bp to xfs_trans_brelse() xfs: check for inode size overflow in xfs_new_eof() xfs: only set extent size hint when asked xfs: project id inheritance is a directory only flag xfs: kill time.h xfs: compat_xfs_bstat does not have forkoff xfs: simplify xfs_zero_remaining_bytes xfs: check xfs_buf_read_uncached returns correctly xfs: introduce xfs_buf_submit[_wait] xfs: kill xfs_bioerror_relse xfs: xfs_bioerror can die. xfs: kill xfs_bdstrat_cb xfs: rework xfs_buf_bio_endio error handling xfs: xfs_buf_ioend and xfs_buf_iodone_work duplicate functionality ...	2014-10-13 12:06:54 +02:00
Linus Torvalds	77c688ac87	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "The big thing in this pile is Eric's unmount-on-rmdir series; we finally have everything we need for that. The final piece of prereqs is delayed mntput() - now filesystem shutdown always happens on shallow stack. Other than that, we have several new primitives for iov_iter (Matt Wilcox, culled from his XIP-related series) pushing the conversion to ->read_iter()/ ->write_iter() a bit more, a bunch of fs/dcache.c cleanups and fixes (including the external name refcounting, which gives consistent behaviour of d_move() wrt procfs symlinks for long and short names alike) and assorted cleanups and fixes all over the place. This is just the first pile; there's a lot of stuff from various people that ought to go in this window. Starting with unionmount/overlayfs mess... ;-/" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (60 commits) fs/file_table.c: Update alloc_file() comment vfs: Deduplicate code shared by xattr system calls operating on paths reiserfs: remove pointless forward declaration of struct nameidata don't need that forward declaration of struct nameidata in dcache.h anymore take dname_external() into fs/dcache.c let path_init() failures treated the same way as subsequent link_path_walk() fix misuses of f_count() in ppp and netlink ncpfs: use list_for_each_entry() for d_subdirs walk vfs: move getname() from callers to do_mount() gfs2_atomic_open(): skip lookups on hashed dentry [infiniband] remove pointless assignments gadgetfs: saner API for gadgetfs_create_file() f_fs: saner API for ffs_sb_create_file() jfs: don't hash direct inode [s390] remove pointless assignment of ->f_op in vmlogrdr ->open() ecryptfs: ->f_op is never NULL android: ->f_op is never NULL nouveau: __iomem misannotations missing annotation in fs/file.c fs: namespace: suppress 'may be used uninitialized' warnings ...	2014-10-13 11:28:42 +02:00
Dmitry Monakhov	aef4885ae1	ext4: move error report out of atomic context in ext4_init_block_bitmap() Error report likely result in IO so it is bad idea to do it from atomic context. This patch should fix following issue: BUG: sleeping function called from invalid context at include/linux/buffer_head.h:349 in_atomic(): 1, irqs_disabled(): 0, pid: 137, name: kworker/u128:1 5 locks held by kworker/u128:1/137: #0: ("writeback"){......}, at: [<ffffffff81085618>] process_one_work+0x228/0x4d0 #1: ((&(&wb->dwork)->work)){......}, at: [<ffffffff81085618>] process_one_work+0x228/0x4d0 #2: (jbd2_handle){......}, at: [<ffffffff81242622>] start_this_handle+0x712/0x7b0 #3: (&ei->i_data_sem){......}, at: [<ffffffff811fa387>] ext4_map_blocks+0x297/0x430 #4: (&(&bgl->locks[i].lock)->rlock){......}, at: [<ffffffff811f3180>] ext4_read_block_bitmap_nowait+0x5d0/0x630 CPU: 3 PID: 137 Comm: kworker/u128:1 Not tainted 3.17.0-rc2-00184-g82752e4 #165 Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011 Workqueue: writeback bdi_writeback_workfn (flush-1:0) 0000000000000411 ffff880813777288 ffffffff815c7fdc ffff880813777288 ffff880813a8bba0 ffff8808137772a8 ffffffff8108fb30 ffff880803e01e38 ffff880803e01e38 ffff8808137772c8 ffffffff811a8d53 ffff88080ecc6000 Call Trace: [<ffffffff815c7fdc>] dump_stack+0x51/0x6d [<ffffffff8108fb30>] __might_sleep+0xf0/0x100 [<ffffffff811a8d53>] __sync_dirty_buffer+0x43/0xe0 [<ffffffff811a8e03>] sync_dirty_buffer+0x13/0x20 [<ffffffff8120f581>] ext4_commit_super+0x1d1/0x230 [<ffffffff8120fa03>] save_error_info+0x23/0x30 [<ffffffff8120fd06>] __ext4_error+0xb6/0xd0 [<ffffffff8120f260>] ? ext4_group_desc_csum+0x140/0x190 [<ffffffff811f2d8c>] ext4_read_block_bitmap_nowait+0x1dc/0x630 [<ffffffff8122e23a>] ext4_mb_init_cache+0x21a/0x8f0 [<ffffffff8113ae95>] ? lru_cache_add+0x55/0x60 [<ffffffff8112e16c>] ? add_to_page_cache_lru+0x6c/0x80 [<ffffffff8122eaa0>] ext4_mb_init_group+0x190/0x280 [<ffffffff8122ec51>] ext4_mb_good_group+0xc1/0x190 [<ffffffff8123309a>] ext4_mb_regular_allocator+0x17a/0x410 [<ffffffff8122c821>] ? ext4_mb_use_preallocated+0x31/0x380 [<ffffffff81233535>] ? ext4_mb_new_blocks+0x205/0x8e0 [<ffffffff8116ed5c>] ? kmem_cache_alloc+0xfc/0x180 [<ffffffff812335b0>] ext4_mb_new_blocks+0x280/0x8e0 [<ffffffff8116f2c4>] ? __kmalloc+0x144/0x1c0 [<ffffffff81221797>] ? ext4_find_extent+0x97/0x320 [<ffffffff812257f4>] ext4_ext_map_blocks+0xbc4/0x1050 [<ffffffff811fa387>] ? ext4_map_blocks+0x297/0x430 [<ffffffff811fa3ab>] ext4_map_blocks+0x2bb/0x430 [<ffffffff81200e43>] ? ext4_init_io_end+0x23/0x50 [<ffffffff811feb44>] ext4_writepages+0x564/0xaf0 [<ffffffff815cde3b>] ? _raw_spin_unlock+0x2b/0x40 [<ffffffff810ac7bd>] ? lock_release_non_nested+0x2fd/0x3c0 [<ffffffff811a009e>] ? writeback_sb_inodes+0x10e/0x490 [<ffffffff811a009e>] ? writeback_sb_inodes+0x10e/0x490 [<ffffffff811377e3>] do_writepages+0x23/0x40 [<ffffffff8119c8ce>] __writeback_single_inode+0x9e/0x280 [<ffffffff811a026b>] writeback_sb_inodes+0x2db/0x490 [<ffffffff811a0664>] wb_writeback+0x174/0x2d0 [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190 [<ffffffff811a0863>] wb_do_writeback+0xa3/0x200 [<ffffffff811a0a40>] bdi_writeback_workfn+0x80/0x230 [<ffffffff81085618>] ? process_one_work+0x228/0x4d0 [<ffffffff810856cd>] process_one_work+0x2dd/0x4d0 [<ffffffff81085618>] ? process_one_work+0x228/0x4d0 [<ffffffff81085c1d>] worker_thread+0x35d/0x460 [<ffffffff810858c0>] ? process_one_work+0x4d0/0x4d0 [<ffffffff810858c0>] ? process_one_work+0x4d0/0x4d0 [<ffffffff8108a885>] kthread+0xf5/0x100 [<ffffffff810990e5>] ? local_clock+0x25/0x30 [<ffffffff8108a790>] ? __init_kthread_worker+0x70/0x70 [<ffffffff815ce2ac>] ret_from_fork+0x7c/0xb0 [<ffffffff8108a790>] ? __init_kthread_work Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2014-10-13 03:42:12 -04:00
Dmitry Monakhov	9aa5d32ba2	ext4: Replace open coded mdata csum feature to helper function Besides the fact that this replacement improves code readability it also protects from errors caused direct EXT4_S(sb)->s_es manipulation which may result attempt to use uninitialized csum machinery. #Testcase_BEGIN IMG=/dev/ram0 MNT=/mnt mkfs.ext4 $IMG mount $IMG $MNT #Enable feature directly on disk, on mounted fs tune2fs -O metadata_csum $IMG # Provoke metadata update, likey result in OOPS touch $MNT/test umount $MNT #Testcase_END # Replacement script @@ expression E; @@ - EXT4_HAS_RO_COMPAT_FEATURE(E, EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) + ext4_has_metadata_csum(E) https://bugzilla.kernel.org/show_bug.cgi?id=82201 Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2014-10-13 03:36:16 -04:00
Dave Chinner	6889e783cd	Merge branch 'xfs-misc-fixes-for-3.18-3' into for-next	2014-10-13 10:22:45 +11:00
Eric Sandeen	a8b1ee8baf	xfs: fix agno increment in xfs_inumbers() loop caused a regression in xfs_inumbers, which in turn broke xfsdump, causing incomplete dumps. The loop in xfs_inumbers() needs to fill the user-supplied buffers, and iterates via xfs_btree_increment, reading new ags as needed. But the first time through the loop, if xfs_btree_increment() succeeds, we continue, which triggers the ++agno at the bottom of the loop, and we skip to soon to the next ag - without the proper setup under next_ag to read the next ag. Fix this by removing the agno increment from the loop conditional, and only increment agno if we have actually hit the code under the next_ag: target. Cc: stable@vger.kernel.org Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-10-13 10:21:53 +11:00
Eric Biggers	a457606a6f	fs/file_table.c: Update alloc_file() comment This comment is 5 years outdated; init_file() no longer exists. Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-12 17:09:10 -04:00
Eric Biggers	8cc431165d	vfs: Deduplicate code shared by xattr system calls operating on paths The following pairs of system calls dealing with extended attributes only differ in their behavior on whether the symbolic link is followed (when the named file is a symbolic link): - setxattr() and lsetxattr() - getxattr() and lgetxattr() - listxattr() and llistxattr() - removexattr() and lremovexattr() Despite this, the implementations all had duplicated code, so this commit redirects each of the above pairs of system calls to a corresponding function to which different lookup flags (LOOKUP_FOLLOW or 0) are passed. For me this reduced the stripped size of xattr.o from 8824 to 8248 bytes. Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-12 17:09:10 -04:00

1 2 3 4 5 ...

38390 Commits