commit 2e38bea99a80eab408adee27f873a188d57b76cb upstream.
fuse_file_put() was missing the "force" flag for the RELEASE request when
sending synchronously (fuseblk).
If this flag is not set, then a sync request may be interrupted before it
is dequeued by the userspace filesystem. In this case the OPEN won't be
balanced with a RELEASE.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 5a18ec176c93 ("fuse: fix hang of single threaded fuseblk filesystem")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c68bb0f62bf8de8bb30123ea840d5168f25abea upstream.
Running with KASAN and crypto tests currently gives
BUG: KASAN: global-out-of-bounds in __test_aead+0x9d9/0x2200 at addr ffffffff8212fca0
Read of size 16 by task cryptomgr_test/1107
Address belongs to variable 0xffffffff8212fca0
CPU: 0 PID: 1107 Comm: cryptomgr_test Not tainted 4.10.0+ #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
Call Trace:
dump_stack+0x63/0x8a
kasan_report.part.1+0x4a7/0x4e0
? __test_aead+0x9d9/0x2200
? crypto_ccm_init_crypt+0x218/0x3c0 [ccm]
kasan_report+0x20/0x30
check_memory_region+0x13c/0x1a0
memcpy+0x23/0x50
__test_aead+0x9d9/0x2200
? kasan_unpoison_shadow+0x35/0x50
? alg_test_akcipher+0xf0/0xf0
? crypto_skcipher_init_tfm+0x2e3/0x310
? crypto_spawn_tfm2+0x37/0x60
? crypto_ccm_init_tfm+0xa9/0xd0 [ccm]
? crypto_aead_init_tfm+0x7b/0x90
? crypto_alloc_tfm+0xc4/0x190
test_aead+0x28/0xc0
alg_test_aead+0x54/0xd0
alg_test+0x1eb/0x3d0
? alg_find_test+0x90/0x90
? __sched_text_start+0x8/0x8
? __wake_up_common+0x70/0xb0
cryptomgr_test+0x4d/0x60
kthread+0x173/0x1c0
? crypto_acomp_scomp_free_ctx+0x60/0x60
? kthread_create_on_node+0xa0/0xa0
ret_from_fork+0x2c/0x40
Memory state around the buggy address:
ffffffff8212fb80: 00 00 00 00 01 fa fa fa fa fa fa fa 00 00 00 00
ffffffff8212fc00: 00 01 fa fa fa fa fa fa 00 00 00 00 01 fa fa fa
>ffffffff8212fc80: fa fa fa fa 00 05 fa fa fa fa fa fa 00 00 00 00
^
ffffffff8212fd00: 01 fa fa fa fa fa fa fa 00 00 00 00 01 fa fa fa
ffffffff8212fd80: fa fa fa fa 00 00 00 00 00 05 fa fa fa fa fa fa
This always happens on the same IV which is less than 16 bytes.
Per Ard,
"CCM IVs are 16 bytes, but due to the way they are constructed
internally, the final couple of bytes of input IV are dont-cares.
Apparently, we do read all 16 bytes, which triggers the KASAN errors."
Fix this by padding the IV with null bytes to be at least 16 bytes.
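The idea of the fix can be sketched in a few lines; this is only an illustration with made-up names (template_iv, template_iv_len), not the actual testmgr.c hunk:
    unsigned char iv[16];   /* full CCM IV size */

    memset(iv, 0, sizeof(iv));          /* pad with null bytes */
    if (template_iv)
        memcpy(iv, template_iv,
               template_iv_len < sizeof(iv) ? template_iv_len : sizeof(iv));
    /* reading all 16 bytes of iv[] is now safe even for short test vectors */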
Fixes: 0bc5a6c5c79a ("crypto: testmgr - Disable rfc4309 test and convert test vectors")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa33b9b9a2ebb00d33c83a5312d4fbf2d5aeba36 upstream.
If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor. Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.
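A hedged sketch of the get/put pattern this relies on, using perf's existing map__get()/map__put() helpers on a callchain cursor node (the exact call sites in the patch differ):
    /* pin the map while a callchain cursor node points at it */
    map__put(node->map);        /* drop any previously held reference */
    node->map = map__get(map);  /* hold this one until the cursor moves on */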
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 84c2cafa2889 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0bb03924f1a80e7f65900e36c8e6b3dc167c5f8 upstream.
DoS protection conditions were altered in WS2016 and now it's easy to get
-EAGAIN returned from vmbus_post_msg() (e.g. when we try changing MTU on a
netvsc device in a loop). None of the vmbus_post_msg() callers retry the
operation, and we usually end up with a non-functional device or a crash.
While the host's DoS protection conditions are unknown to me, my tests show
that it can take up to 10 seconds before the message is sent, so doing
udelay() is not an option; we really need to sleep. Almost all vmbus_post_msg()
callers are ready to sleep but there is one special case:
vmbus_initiate_unload() which can be called from interrupt/NMI context and
we can't sleep there. I'm also not sure about the lonely
vmbus_send_tl_connect_request() which has no in-tree users but its external
users are most likely waiting for the host to reply so sleeping there is
also appropriate.
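A simplified sketch of the retry behaviour described above; hv_post_message_once() is a hypothetical stand-in for the real hypercall path, and the retry limit and delays are illustrative:
static int post_msg_with_retry(void *msg, size_t len, bool can_sleep)
{
    int ret, retries;

    for (retries = 0; retries < 100; retries++) {
        ret = hv_post_message_once(msg, len);   /* hypothetical helper */
        if (ret != -EAGAIN)
            break;
        if (can_sleep)
            usleep_range(1000, 2000);   /* host may need seconds overall */
        else
            udelay(1000);               /* interrupt/NMI context */
    }
    return ret;
}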
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a7275a3d867b228216886aae35e1f64291180b1 upstream.
eb5767122feb ("PCI: altera: Simplify TLB_CFG_DW0 usage") used
TLP_FMTTYPE_CFGRD* (instead of TLP_FMTTYPE_CFGWR*) for TLP writes, which
causes writing to configuration space to fail. Fix it by using correct
FMTTYPE for write operation.
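Conceptually the fix is just selecting the write format/type for write TLPs; a rough sketch with illustrative variable names (the 0/1 suffixes distinguish type 0 and type 1 configuration TLPs):
    u32 fmttype;

    if (is_root_bus)
        fmttype = is_write ? TLP_FMTTYPE_CFGWR0 : TLP_FMTTYPE_CFGRD0;
    else
        fmttype = is_write ? TLP_FMTTYPE_CFGWR1 : TLP_FMTTYPE_CFGRD1;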
Fixes: eb5767122feb ("PCI: altera: Simplify TLB_CFG_DW0 usage")
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49f4b08e61547a5ccd2db551d994c4503efe5666 upstream.
pnv_php_disable_irq() can be called in two paths: the bail-out path in
pnv_php_enable_irq() or when releasing the slot. The MSI (or MSIx)
interrupts are disabled unconditionally in pnv_php_disable_irq(), which is
wrong because they might have been enabled by drivers other than pnv-php.
This disables the MSI (or MSIx) interrupts and the PCI device only if they
were enabled by pnv-php. In the error path of pnv_php_enable_irq() we rely
on the newly added parameter @disable_device; in the slot release path,
@pnv_php->irq is checked.
Fixes: 360aebd85a4c ("drivers/pci/hotplug: Support surprise hotplug in powernv driver")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60e2e2fbafdd1285ae1b4ad39ded41603e0c74d0 upstream.
The devfn of 00:02.0 is 0x10. devfn_to_wslot(0x10) == 0x2, and
wslot_to_devfn(0x2) should be 0x10, while it's 0x2 in the current code.
Due to this, hv_eject_device_work() -> pci_get_domain_bus_and_slot()
returns NULL and pci_stop_and_remove_bus_device() is not called.
Later when the real device driver's .remove() is invoked by
hv_pci_remove() -> pci_stop_root_bus(), some warnings can be noticed
because the VM has lost the access to the underlying device at that
time.
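The expected round trip can be checked with the standard PCI macros; this is a standalone illustration of the arithmetic (the driver itself uses a slot-encoding union rather than these macros directly):
#include <assert.h>

#define PCI_SLOT(devfn)         (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)         ((devfn) & 0x07)
#define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))

int main(void)
{
    int devfn = 0x10;               /* 00:02.0 */
    int wslot = PCI_SLOT(devfn);    /* 0x2, as sent on the VMBus channel */

    /* converting back must yield 0x10 again, not 0x2 */
    assert(PCI_DEVFN(wslot, PCI_FUNC(devfn)) == 0x10);
    return 0;
}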
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
CC: K. Y. Srinivasan <kys@microsoft.com>
CC: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c9f1e32600816d695f817477d56490bfc2ba43c6 upstream.
This patch fixes the OTP register definitions for the AR934x and AR9550
WMAC SoC.
Previously, the ath9k driver was unable to initialize the integrated
WMAC on an Aerohive AP121:
| ath: phy0: timeout (1000 us) on reg 0x30018: 0xbadc0ffe & 0x00000007 != 0x00000004
| ath: phy0: timeout (1000 us) on reg 0x30018: 0xbadc0ffe & 0x00000007 != 0x00000004
| ath: phy0: Unable to initialize hardware; initialization status: -5
| ath9k ar934x_wmac: failed to initialize device
| ath9k: probe of ar934x_wmac failed with error -5
It turns out that the AR9300_OTP_STATUS and AR9300_OTP_DATA
definitions contain a typo.
Cc: Gabor Juhos <juhosg@openwrt.org>
Fixes: add295a4afbdf5852d0 "ath9k: use correct OTP register offsets for AR9550"
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: Chris Blake <chrisrblake93@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a5e969bb2f6692a256352649355d56d018d6b88 upstream.
The code currently relies on refcounting to disable IRQs from within the
IRQ handler and re-enabling them again after the tasklet has run.
However, due to race conditions sometimes the IRQ handler might be
called twice, or the tasklet may not run at all (if interrupted in the
middle of a reset).
This can cause nasty imbalances in the irq-disable refcount which will
get the driver permanently stuck until the entire radio has been stopped
and started again (ath_reset will not recover from this).
Instead of using this fragile logic, change the code to ensure that
running the irq handler during tasklet processing is safe, and leave the
refcount untouched.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb4281528b62207918b1e95827cad7527aa4dbaa upstream.
Rx filter reset and the dynamic tx switch mode (EXT_RESOURCE_CFG)
configuration are causing the following errors when UTF firmware
is loaded to the target.
Error message 1:
[ 598.015629] ath10k_pci 0001:01:00.0: failed to ping firmware: -110
[ 598.020828] ath10k_pci 0001:01:00.0: failed to reset rx filter: -110
[ 598.141556] ath10k_pci 0001:01:00.0: failed to start core (testmode): -110
Error message 2:
[ 668.615839] ath10k_ahb a000000.wifi: failed to send ext resource cfg command : -95
[ 668.618902] ath10k_ahb a000000.wifi: failed to start core (testmode): -95
Avoiding these configurations while bringing up the target in
testmode solves the problem.
Signed-off-by: Tamizh chelvam <c_traja@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb97fbbcac15982406e0c74cd5512a8b6fcf10b3 upstream.
Parallel reads from multiple threads on a file descriptor
are not well defined and racy. It is safer to return to original
behavior and simply fail the additional read.
The solution is to remove request for next read credit.
Fixes: ff1586a7ea57 ("mei: enqueue consecutive reads")
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 916cafdc95843fb9af5fd5f83ca499d75473d107 upstream.
There were some bugs in the JNE64 and JLT64 comparison macros. This fixes
them, improves comments, and cleans up the file while we are at it.
Reported-by: Stephen Röttger <sroettger@google.com>
Signed-off-by: Mathias Svensson <idolf@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4753d8a24d4588657bc0a4cd66d4e282dff15c8c upstream.
If the file system requires journal recovery, and the device is
read-only, return EROFS to the mount system call. This allows xfstests
generic/050 to pass.
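The shape of such a check, sketched (not the literal ext4 diff; the real code sits in the journal-loading path and prints a more specific message):
    if (ext4_has_feature_journal_needs_recovery(sb) &&
        bdev_read_only(sb->s_bdev)) {
        ext4_msg(sb, KERN_ERR,
                 "journal recovery required on read-only device");
        return -EROFS;
    }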
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 97abd7d4b5d9c48ec15c425485f054e1c15e591b upstream.
If the journal is aborted, the needs_recovery feature flag should not
be removed. Otherwise, the journal might not get replayed and
this could lead to more data getting lost.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb5efbcb762aee4b454b04f7115f73ccbcf8f0ef upstream.
The write_end() function must always unlock the page and drop its ref
count, even on an error.
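In other words, every exit path of a ->write_end() implementation has to end roughly like this (a sketch, not the exact ext4 hunk):
out:
    /* reached on both success and error paths */
    unlock_page(page);
    put_page(page);
    return ret ? ret : copied;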
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd01b690f8f4b1e414f89e5a9a5326bf720d6652 upstream.
In the case where the child's encryption context was inconsistent with
its parent directory, we were using inode->i_sb and inode->i_ino after
the inode had already been iput(). Fix this by doing the iput() in the
correct places.
Note: only ext4 had this bug, not f2fs and ubifs.
Fixes: d9cdc9033181 ("ext4 crypto: enforce context consistency")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b136499e906460919f0d21a49db1aaccf0ae963 upstream.
ext4_journalled_write_end() did not properly handle all the cases when
generic_perform_write() did not copy all the data into the target page
and could mark buffers with uninitialized contents as uptodate and dirty
leading to possible data corruption (which would be quickly fixed by
generic_perform_write() retrying the write but still). Fix the problem
by carefully handling the case when the page that is written to is not
uptodate.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd648b8a8fd5071d232242d5ee7ee3c0815776af upstream.
If filesystem groups are artificially small (using parameter -g to
mkfs.ext4), ext4_mb_normalize_request() can result in a request that is
larger than a block group. Trim the request size to not confuse
allocation code.
Reported-by: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03e916fa8b5577d85471452a3d0c5738aa658dae upstream.
Inside the ext4_ext_shift_extents() function, ext4_find_extent() is called
without the EXT4_EX_NOCACHE flag, which should prevent cache population.
This leads to outdated offsets in the extents tree and wrong blocks
afterwards.
The patch fixes the problem by providing the EXT4_EX_NOCACHE flag for each
ext4_find_extent() call inside the ext4_ext_shift_extents() function.
Fixes: 331573febb6a2
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a9b8cba62c0741109c33a2be700ff3d7703a7c2 upstream.
While doing an 'insert range' operation, the start block should also be shifted right.
The bug can be easily reproduced by the following test:
/* Reproducer: insert a 4k range at offset 4096 repeatedly and rewrite it. */
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    void *ptr;
    int fd, rc, i, block;

    ptr = malloc(4096);
    assert(ptr);
    fd = open("./ext4.file", O_CREAT | O_TRUNC | O_RDWR, 0600);
    assert(fd >= 0);
    rc = fallocate(fd, 0, 0, 8192);
    assert(rc == 0);
    for (i = 0; i < 2048; i++)
        *((unsigned short *)ptr + i) = 0xbeef;
    rc = pwrite(fd, ptr, 4096, 0);
    assert(rc == 4096);
    rc = pwrite(fd, ptr, 4096, 4096);
    assert(rc == 4096);
    for (block = 2; block < 1000; block++) {
        rc = fallocate(fd, FALLOC_FL_INSERT_RANGE, 4096, 4096);
        assert(rc == 0);
        for (i = 0; i < 2048; i++)
            *((unsigned short *)ptr + i) = block;
        rc = pwrite(fd, ptr, 4096, 4096);
        assert(rc == 4096);
    }
    return 0;
}
Because the start block is not included in the range, the hole appears at
the wrong offset (just after the desired offset) and the following
pwrite() overwrites an already existing block, leaving the hole untouched.
A simple way to verify the wrong behaviour is to check for zeroed blocks
after the test:
$ hexdump ./ext4.file | grep '0000 0000'
The root cause of the bug is a wrong range (start, stop], where start
should be inclusive, i.e. [start, stop].
This patch fixes the problem by including start in the range. In order
not to break left shift (range collapse), stop points to the beginning
of a block, not to the end.
The other non-obvious change is the iterator validity check in the main
loop. Because the iterator is unsigned, the following corner case should
be considered with care: inserting a block at offset 0, where the stop
variable overflows and never becomes less than start, which is 0.
To handle this special case the iterator is set to NULL to indicate that
the end of the loop has been reached.
Fixes: 331573febb6a2
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e02898b423802b1f3a3aaa7f16e896da069ba8f7 upstream.
loop_reread_partitions() needs to do I/O, but we just froze the queue,
so we end up waiting forever. This can easily be reproduced with losetup
-P. Fix it by moving the reread to after we unfreeze the queue.
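A sketch of the reordering this describes, with the surrounding error handling omitted and the loop_reread_partitions() signature assumed from context:
    blk_mq_unfreeze_queue(lo->lo_queue);

    /* only now can the queue serve the I/O issued by the partition rescan */
    if (lo->lo_flags & LO_FLAGS_PARTSCAN)
        loop_reread_partitions(lo, bdev);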
Fixes: ecdd09597a57 ("block/loop: fix race between I/O and set_status")
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e112666b4959b25a8552d63bc564e1059be703e8 upstream.
If the journal has been aborted, we shouldn't mark the underlying
buffer head as dirty, since that will cause the metadata block to get
modified. And if the journal has been aborted, we shouldn't allow
this since it will almost certainly lead to a corrupted file system.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 907565337ebf998a68cb5c5b2174ce5e5da065eb upstream.
Userspace applications should be allowed to expect the membarrier system
call with MEMBARRIER_CMD_SHARED command to issue memory barriers on
nohz_full CPUs, but synchronize_sched() does not take those into
account.
Given that we do not want unrelated processes to be able to affect
real-time sensitive nohz_full CPUs, simply return ENOSYS when membarrier
is invoked on a kernel with enabled nohz_full CPUs.
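The resulting guard is tiny; a hedged sketch of the syscall with only the MEMBARRIER_CMD_SHARED case shown:
SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
{
    /* nohz_full CPUs are not covered by synchronize_sched() */
    if (tick_nohz_full_enabled())
        return -ENOSYS;

    switch (cmd) {
    case MEMBARRIER_CMD_SHARED:
        if (num_online_cpus() > 1)
            synchronize_sched();
        return 0;
    default:
        return -EINVAL;
    }
}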
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Lai Jiangshan <jiangshanlai@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b0408745e7ff24757cbfd571d69026c0ddb803c upstream.
LPDDR memories can only handle up to 400 uncontrolled power-off events. Ensure the
proper power off sequence is used before shutting down the platform.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 857de6e00778738dc3d61f75acbac35bdc48e533 upstream.
The device handler needs to check if a given queue belongs to a scsi
device; only then does it make sense to attach a device handler.
[mkp: dropped flags]
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c421530bf848604e97d0785a03b3fe2c62775083 upstream.
The driver currently checks the SELF_TEST_FAILED first and then
KERNEL_PANIC next. Under error conditions (boot code failure) both
SELF_TEST_FAILED and KERNEL_PANIC can be set at the same time.
The driver has the capability to reset the controller on a KERNEL_PANIC,
but not on SELF_TEST_FAILED.
Fix this by checking KERNEL_PANIC first and then the others.
Fixes: e8b12f0fb835223752 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC base controller family)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40630f462824ee24bc00d692865c86c3828094e0 upstream.
On I/O errors, the Windows driver doesn't set data_transfer_length
on error conditions other than SRB_STATUS_DATA_OVERRUN.
In these cases we need to set data_transfer_length to 0,
indicating there is no data transferred. On SRB_STATUS_DATA_OVERRUN,
data_transfer_length is set by the Windows driver to the actual data transferred.
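The described behaviour boils down to a small conditional in the completion path; sketched here with simplified names (io_failed is a hypothetical stand-in for the driver's error check), not the literal driver hunk:
    /* on error, only DATA_OVERRUN reports a valid transfer length */
    if (io_failed && vm_srb->srb_status != SRB_STATUS_DATA_OVERRUN)
        data_transfer_length = 0;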
Reported-by: Shiva Krishna <Shiva.Krishna@nimblestorage.com>
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bba5dc332ec2d3a685cb4dae668c793f6a3713a3 upstream.
When a sense message is present on error, we should pass it along to the
upper layer to decide how to deal with the error.
This patch fixes connectivity issues with Fiber Channel devices.
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3cd6d3d9b1abab8dcdf0800224ce26daac24eea2 upstream.
Properly set SRB flags when hosting device supports tagged queuing.
This patch improves the performance on Fiber Channel disks.
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d36a19541fe8f392778ac137d60f9be8dfdd8f9d upstream.
The lvm2 sequence to manage dm-raid constructor flags that trigger a
rebuild or a reshape is defined as:
1) load table with flags (e.g. rebuild/delta_disks/data_offset)
2) clear out the flags in lvm2 metadata
3) store the lvm2 metadata, reload the table to reset the flags
previously established during the initial load (1) -- in order to
prevent repeatedly requesting a rebuild or a reshape on activation
Currently, loading an inactive table with rebuild/reshape flags
specified will cause dm-raid to rebuild/reshape on resume and thus start
updating the raid metadata (about the progress). When the second table
reload (to reset the flags) occurs, the constructor accesses the volatile
progress state kept in the raid superblocks. Because the active mapping
is still processing the rebuild/reshape, that position will be stale by
the time the device is resumed.
In the reshape case, this causes data corruption by processing already
reshaped stripes again. In the rebuild case, it does _not_ cause data
corruption but instead involves superfluous rebuilds.
Fix by keeping the raid set frozen during the first resume and then
allow the rebuild/reshape during the second resume.
Fixes: 9dbd1aa3a ("dm raid: add reshaping support to the target")
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37a098e9d10db6e2efc05fe61e3a6ff2e9802c53 upstream.
The sloppy nature of lockless access to percpu pointers
(s->current_path) in rr_select_path(), from multiple threads, is
causing some paths to be used more than others -- which results in less
IO performance being observed.
Revert these upstream commits to restore truly symmetric round-robin
IO submission in DM multipath:
b0b477c dm round robin: use percpu 'repeat_count' and 'current_path'
802934b dm round robin: do not use this_cpu_ptr() without having preemption disabled
There is no benefit to all this complexity if repeat_count = 1 (which is
the recommended default).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ca763d0a53b264a650342cee206512bc92ac7050 upstream.
A rounding bug due to a compiler-generated temporary being 32-bit was
found in remap_to_cache(). A localized cast in remap_to_cache() fixes the
corruption, but this preferred fix (changing from uint32_t to sector_t)
eliminates the potential for future rounding errors elsewhere.
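The class of bug is easy to demonstrate outside the driver: multiplying two 32-bit values yields a 32-bit temporary even if the result is stored in a 64-bit variable. A standalone illustration (not dm-cache code):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t block = 9000000;           /* hypothetical block number */
    uint32_t sectors_per_block = 1024;  /* 512 KiB blocks */

    uint64_t bad  = block * sectors_per_block;            /* 32-bit product, truncated */
    uint64_t good = (uint64_t)block * sectors_per_block;  /* widened before multiplying */

    printf("bad=%llu good=%llu\n",
           (unsigned long long)bad, (unsigned long long)good);
    return 0;
}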
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 30582c25a4b4e0a5e456a309fde79b845e9473b2 upstream.
Until now, the trans_stat information of a passive devfreq device was not
updated. This patch updates the trans_stat information after setting the
target frequency of the passive devfreq device.
Fixes: 996133119f57 ("PM / devfreq: Add new passive governor")
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bcf23c79c4e46130701370af4383b61a3cba755c upstream.
A devfreq device using the passive governor is not able to change its
governor, so the user can not change the governor through the
'available_governors' sysfs entry. Also, a devfreq device which doesn't use
the passive governor is not able to change to the 'passive' governor on the fly.
Fixes: 996133119f57 ("PM / devfreq: Add new passive governor")
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc15ed663e7e53ee4dc3e60f8d09c93a0528c694 upstream.
On failure to return a pathname from ima_d_path(), a pointer to
dname is returned, which is subsequently used in the IMA measurement
list, the IMA audit records, and other audit logging. Saving the
pointer to dname for later use has the potential to race with rename.
Instead of returning a pointer to dname on failure, this patch returns
a pointer to a copy of the filename.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95e91b831f87ac8e1f8ed50c14d709089b4e01b8 upstream.
The issue is described here, with a nice testcase:
https://bugzilla.kernel.org/show_bug.cgi?id=192931
The problem is that shmat() calls do_mmap_pgoff() with MAP_FIXED, and
the address rounded down to 0. For the regular mmap case, the
protection mentioned above is that the kernel gets to generate the
address -- arch_get_unmapped_area() will always check for MAP_FIXED and
return that address. So by the time we do security_mmap_addr(0) things
get funky for shmat().
The testcase itself shows that while a regular user crashes, root will
not have a problem attaching a nil-page. There are two possible fixes
to this. The first, and which this patch does, is to simply allow root
to crash as well -- this is also regular mmap behavior, ie when hacking
up the testcase and adding mmap(... |MAP_FIXED). While this approach
is the safer option, the second alternative is to ignore SHM_RND if the
rounded address is 0, thus only having MAP_SHARED flags. This makes the
behavior of shmat() identical to the mmap() case. The downside of this
is obviously user visible, but does make sense in that it maintains
semantics after the round-down wrt 0 address and mmap.
Passes shm related ltp tests.
Link: http://lkml.kernel.org/r/1486050195-18629-1-git-send-email-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Gareth Evans <gareth.evans@contextis.co.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 441398d378f29a5ad6d0fcda07918e54e4961800 upstream.
Currently SS_AUTODISARM is not supported in compatibility mode, but does
not return -EINVAL either. This makes dosemu built with -m32 on x86_64
to crash. Also the kernel's sigaltstack selftest fails if compiled with
-m32.
This patch adds the needed support.
Link: http://lkml.kernel.org/r/20170205101213.8163-2-stsp@list.ru
Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
Cc: Milosz Tanski <milosz@adfin.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Waiman Long <Waiman.Long@hpe.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71ab6cfe88dcf9f6e6a65eb85cf2bda20a257682 upstream.
get_scan_count() considers the whole node LRU size when
- doing SCAN_FILE due to many page cache inactive pages
- calculating the number of pages to scan
In both cases this might lead to unexpected behavior especially on 32b
systems where we can expect lowmem memory pressure very often.
A large highmem zone can easily distort SCAN_FILE heuristic because
there might be only a few file pages from the eligible zones on the node
lru and we would still enforce file lru scanning, which can lead to
thrashing while we could still scan anonymous pages.
The latter use of lruvec_lru_size can be problematic as well. Especially
when there are not many pages from the eligible zones. We would have to
skip over many pages to find anything to reclaim but shrink_node_memcg
would only reduce the remaining number to scan by SWAP_CLUSTER_MAX at
maximum. Therefore we can end up going over a large LRU many times
without actually having a chance to reclaim much, if anything at all. The
closer we are to running out of memory on the lowmem zone, the worse the
problem will be.
Fix this by filtering out all the ineligible zones when calculating the
lru size for both paths and consider only sc->reclaim_idx zones.
The patch would need to be tweaked a bit to apply to 4.10 and older but
I will do that as soon as it hits the Linus tree in the next merge
window.
Link: http://lkml.kernel.org/r/20170117103702.28542-3-mhocko@kernel.org
Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Tested-by: Trevor Cordes <trevor@tecnopolis.ca>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd538803731e50367b7c59ce4ad3454426a3d671 upstream.
lruvec_lru_size returns the full size of the LRU list while we sometimes
need a value reduced only to eligible zones (e.g. for lowmem requests).
inactive_list_is_low is one such user. Later patches will add more of
them. Add a new parameter to lruvec_lru_size and allow it filter out
zones which are not eligible for the given context.
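A simplified, memcg-agnostic sketch of what the new parameter does (the upstream implementation differs, notably in how memcg LRU sizes are accounted):
static unsigned long lruvec_lru_size_sketch(struct lruvec *lruvec,
                                            enum lru_list lru, int zone_idx)
{
    struct pglist_data *pgdat = lruvec_pgdat(lruvec);
    unsigned long size = 0;
    int zid;

    /* count only LRU pages sitting in zones up to and including zone_idx */
    for (zid = 0; zid <= zone_idx; zid++) {
        struct zone *zone = &pgdat->node_zones[zid];

        if (!managed_zone(zone))
            continue;
        size += zone_page_state(zone, NR_ZONE_LRU_BASE + lru);
    }
    return size;
}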
Link: http://lkml.kernel.org/r/20170117103702.28542-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c57b5808c625f4fc93da330b932647eaff321f7 upstream.
With CONFIG_BALLOON_COMPACTION=y the kernel will mount balloon_mnt for
balloon page migration when we probe a virtio_balloon device. However
we do not unmount it when removing the device. Fix this.
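The cleanup is essentially symmetric with the probe-time kern_mount(); a hedged sketch of the remove path:
static void virtballoon_remove(struct virtio_device *vdev)
{
    struct virtio_balloon *vb = vdev->priv;

    /* ... existing teardown of the balloon and its virtqueues ... */

#ifdef CONFIG_BALLOON_COMPACTION
    kern_unmount(balloon_mnt);  /* undo the mount done at probe time */
    balloon_mnt = NULL;
#endif
    kfree(vb);
}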
Fixes: b1123ea6d3b3 ("mm: balloon: use general non-lru movable page feature")
Link: http://lkml.kernel.org/r/1486531318-35189-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd8416c47715cf324c9a16f13273f9fda87acfed upstream.
With rw_page, page_endio is used for completing IO on a page and it
propagates a write error to the address space if the IO fails. The
problem is that it accesses page->mapping directly, which might be okay for
file-backed pages but not for anonymous pages. Otherwise, it
can corrupt one of the fields of anon_vma under us and the system panics
randomly.
swap_writepage
bdev_writepage
ops->rw_page
I encountered the BUG while developing a new zram feature and it was
really hard to figure out because it caused random crashes: sometimes an
mmap_sem lockdep splat, sometimes crashes in places never related to
zram/zsmalloc, and it was not reproducible with some configurations.
Considering how subtle the bug is and that people do fast-swap tests with
brd, I think it's worth adding a stable tag.
Fixes: dd6bd0d9c7db ("swap: use bdev_read_page() / bdev_write_page()")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e1587a4945408faa58d0485002c110eb2454740c upstream.
At the end of a window period, if the number of reclaimed pages is greater
than the number scanned, an unsigned underflow can result in a huge pressure
value and thus a critical event. Reclaimed pages are found to go higher than
scanned because reclaimed slab pages are added to reclaimed in
shrink_node without a corresponding increment to scanned pages.
Minchan Kim mentioned that this can also happen in the case of a THP
page where the scanned is 1 and reclaimed could be 512.
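The underflow itself is plain unsigned arithmetic; an illustrative guard with a stand-in formula (not the literal vmpressure.c change, and assuming scanned != 0):
    unsigned long pressure;

    /*
     * If reclaimed > scanned (slab or THP accounting), the subtraction
     * below underflows and yields a bogus, huge pressure value.
     */
    if (reclaimed > scanned)
        reclaimed = scanned;        /* clamp, or skip this window */
    pressure = 100 - (100 * reclaimed) / scanned;   /* illustrative formula */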
Link: http://lkml.kernel.org/r/1486641577-11685-1-git-send-email-vinmenon@codeaurora.org
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Shiraz Hashim <shashim@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e02dc017c3032dcdce1b993af0db135462e1b4b7 upstream.
When @node_reclaim_mode isn't 0, the page allocator tries to reclaim
pages if the amount of free memory in the zones is below the low
watermark. On the Power platform, none of the NUMA nodes are scanned for
page reclaim because no node matches the condition in zone_allows_reclaim().
On the Power platform, RECLAIM_DISTANCE is set to 10, which is the distance
from Node-A to Node-A. So even the preferred node won't be scanned for
page reclaim.
__alloc_pages_nodemask()
get_page_from_freelist()
zone_allows_reclaim()
Anton proposed the test code as below:
# cat alloc.c
:
int main(int argc, char *argv[])
{
    void *p;
    unsigned long size;
    unsigned long start, end;

    start = time(NULL);
    size = strtoul(argv[1], NULL, 0);
    printf("To allocate %ldGB memory\n", size);
    size <<= 30;
    p = malloc(size);
    assert(p);
    memset(p, 0, size);
    end = time(NULL);
    printf("Used time: %ld seconds\n", end - start);
    sleep(3600);
    return 0;
}
The system I use for testing has two NUMA nodes. Both have 128GB
memory. In the scenario below, the page caches on node#0 should be reclaimed
when it encounters pressure to accommodate an allocation request.
# echo 2 > /proc/sys/vm/zone_reclaim_mode; \
sync; \
echo 3 > /proc/sys/vm/drop_caches; \
# taskset -c 0 cat file.32G > /dev/null; \
grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 33619712 kB
# taskset -c 0 ./alloc 128
# grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 33619840 kB
# grep MemFree /sys/devices/system/node/node0/meminfo
Node 0 MemFree: 186816 kB
With the patch applied, the pagecache on node-0 is reclaimed when its
free memory is running out. It's the expected behaviour.
# echo 2 > /proc/sys/vm/zone_reclaim_mode; \
sync; \
echo 3 > /proc/sys/vm/drop_caches
# taskset -c 0 cat file.32G > /dev/null; \
grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 33605568 kB
# taskset -c 0 ./alloc 128
# grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 1379520 kB
# grep MemFree /sys/devices/system/node/node0/meminfo
Node 0 MemFree: 317120 kB
Fixes: 5f7a75acdb24 ("mm: page_alloc: do not cache reclaim distances")
Link: http://lkml.kernel.org/r/1486532455-29613-1-git-send-email-gwshan@linux.vnet.ibm.com
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5d24fda9c3dce51fcb4eee459550a458eaaf1e2 upstream.
The mem_hotplug_{begin,done} lock coordinates with {get,put}_online_mems()
to hold off "readers" of the current state of memory from new hotplug
actions. mem_hotplug_begin() expects exclusive access, via the
device_hotplug lock, to set mem_hotplug.active_writer. Calling
mem_hotplug_begin() without locking device_hotplug can lead to
corrupting mem_hotplug.refcount and missed wakeups / soft lockups.
[dan.j.williams@intel.com: v2]
Link: http://lkml.kernel.org/r/148728203365.38457.17804568297887708345.stgit@dwillia2-desk3.amr.corp.intel.com
Link: http://lkml.kernel.org/r/148693885680.16345.17802627926777862337.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: f931ab479dd2 ("mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c25702cee1405099f982894c865c163de7909a8 upstream.
Currently we call copy_page_to_iter() for uncached reading into a pipe.
This is wrong because it treats pages as VFS cache pages and copies references
rather than actual data. When we are trying to read from the pipe we end up
calling page_cache_pipe_buf_confirm() which returns -ENODATA. This error
is translated into 0 which is returned to a user.
This issue is reproduced by running xfs-tests suite (generic test #249)
against mount points with "cache=none". Fix it by mapping pages manually
and calling copy_to_iter() that copies data into the pipe.
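A hedged sketch of the per-page copy this describes; pages[], npages and iter are contextual names, and partial-page lengths and error propagation are omitted:
    for (i = 0; i < npages; i++) {
        void *addr = kmap(pages[i]);        /* map the page contents */
        size_t copied = copy_to_iter(addr, PAGE_SIZE, iter);

        kunmap(pages[i]);
        if (copied < PAGE_SIZE)
            break;                          /* pipe/user buffer is full */
    }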
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21e722c4c8377b5bc82ad058fed12165af739c1b upstream.
The check to set identity map for tylersburg is done too late. It needs
to be done before the check for identity_map domain is done.
To: Joerg Roedel <joro@8bytes.org>
To: David Woodhouse <dwmw2@infradead.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Cc: Ashok Raj <ashok.raj@intel.com>
Fixes: 86080ccc22 ("iommu/vt-d: Allocate si_domain in init_dmars()")
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Reported-by: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>