637148 Commits

Author SHA1 Message Date
Tony Luck
af5ef6dafe x86/mce: Fix copy/paste error in exception table entries
commit 26a37ab319a26d330bab298770d692bb9c852aff upstream.

Back in commit:

  92b0729c34cab ("x86/mm, x86/mce: Add memcpy_mcsafe()")

... I made a copy/paste error setting up the exception table entries
and ended up with two for label .L_cache_w3 and none for .L_cache_w2.

This means that if we take a machine check on:

  .L_cache_w2: movq 2*8(%rsi), %r10

then we don't have an exception table entry for this instruction
and we can't recover.

Fix: s/3/2/

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 92b0729c34cab ("x86/mm, x86/mce: Add memcpy_mcsafe()")
Link: http://lkml.kernel.org/r/1490046030-25862-1-git-send-email-tony.luck@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:35 +02:00
Baoquan He
2211d19ac6 x86/mm/KASLR: Exclude EFI region from KASLR VA space randomization
commit a46f60d76004965e5669dbf3fc21ef3bc3632eb4 upstream.

Currently KASLR is enabled on three regions: the direct mapping of physical
memory, vamlloc and vmemmap. However the EFI region is also mistakenly
included for VA space randomization because of misusing EFI_VA_START macro
and assuming EFI_VA_START < EFI_VA_END.

(This breaks kexec and possibly other things that rely on stable addresses.)

The EFI region is reserved for EFI runtime services virtual mapping which
should not be included in KASLR ranges. In Documentation/x86/x86_64/mm.txt,
we can see:

  ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space

EFI uses the space from -4G to -64G thus EFI_VA_START > EFI_VA_END,
Here EFI_VA_START = -4G, and EFI_VA_END = -64G.

Changing EFI_VA_START to EFI_VA_END in mm/kaslr.c fixes this problem.

Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Thomas Garnier <thgarnie@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1490331592-31860-1-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:35 +02:00
Lucas Stach
4366c7e346 drm/etnaviv: (re-)protect fence allocation with GPU mutex
commit f3cd1b064f1179d9e6188c6d67297a2360880e10 upstream.

The fence allocation needs to be protected by the GPU mutex, otherwise
the fence seqnos of concurrent submits might not match the insertion order
of the jobs in the kernel ring. This breaks the assumption that jobs
complete with monotonically increasing fence seqnos.

Fixes: d9853490176c (drm/etnaviv: take GPU lock later in the submit process)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:35 +02:00
Eric Anholt
6acf520708 drm/vc4: Allocate the right amount of space for boot-time CRTC state.
commit 6d6e500391875cc372336c88e9a8af377be19c36 upstream.

Without this, the first modeset would dereference past the allocation
when trying to free the mm node.

Signed-off-by: Eric Anholt <eric@anholt.net>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170328201343.4884-1-eric@anholt.net
Fixes: d8dbf44f13b9 ("drm/vc4: Make the CRTCs cooperate on allocating display lists.")
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:35 +02:00
Michel Dänzer
fa68e1d3ce drm/radeon: Override fpfn for all VRAM placements in radeon_evict_flags
commit ce4b4f228e51219b0b79588caf73225b08b5b779 upstream.

We were accidentally only overriding the first VRAM placement. For BOs
with the RADEON_GEM_NO_CPU_ACCESS flag set,
radeon_ttm_placement_from_domain creates a second VRAM placment with
fpfn == 0. If VRAM is almost full, the first VRAM placement with
fpfn > 0 may not work, but the second one with fpfn == 0 always will
(the BO's current location trivially satisfies it). Because "moving"
the BO to its current location puts it back on the LRU list, this
results in an infinite loop.

Fixes: 2a85aedd117c ("drm/radeon: Try evicting from CPU accessible to
                      inaccessible VRAM first")
Reported-by: Zachary Michaels <zmichaels@oblong.com>
Reported-and-Tested-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:35 +02:00
David Hildenbrand
1563625c71 KVM: kvm_io_bus_unregister_dev() should never fail
commit 90db10434b163e46da413d34db8d0e77404cc645 upstream.

No caller currently checks the return value of
kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on
freeing their device. A stale reference will remain in the io_bus,
getting at least used again, when the iobus gets teared down on
kvm_destroy_vm() - leading to use after free errors.

There is nothing the callers could do, except retrying over and over
again.

So let's simply remove the bus altogether, print an error and make
sure no one can access this broken bus again (returning -ENOMEM on any
attempt to access it).

Fixes: e93f8a0f821e ("KVM: convert io_bus to SRCU")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Peter Xu
ef46a13b9c KVM: x86: clear bus pointer when destroyed
commit df630b8c1e851b5e265dc2ca9c87222e342c093b upstream.

When releasing the bus, let's clear the bus pointers to mark it out. If
any further device unregister happens on this bus, we know that we're
done if we found the bus being released already.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Uwe Kleine-König
5289f1ce39 serial: mxs-auart: Fix baudrate calculation
commit a6040bc610554c66088fda3608ae5d6307c548e4 upstream.

The reference manual for the i.MX28 recommends to calculate the divisor
as

	divisor = (UARTCLK * 32) / baud rate, rounded to the nearest integer

, so let's do this. For a typical setup of UARTCLK = 24 MHz and baud
rate = 115200 this changes the divisor from 6666 to 6667 and so the
actual baud rate improves from 115211.521 Bd (error ≅ 0.01 %) to
115194.240 Bd (error ≅ 0.005 %).

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Alan Stern
67e41b1368 USB: fix linked-list corruption in rh_call_control()
commit 1633682053a7ee8058e10c76722b9b28e97fb73f upstream.

Using KASAN, Dmitry found a bug in the rh_call_control() routine: If
buffer allocation fails, the routine returns immediately without
unlinking its URB from the control endpoint, eventually leading to
linked-list corruption.

This patch fixes the problem by jumping to the end of the routine
(where the URB is unlinked) when an allocation failure occurs.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Nicolas Ferre
1f1c9e2965 tty/serial: atmel: fix TX path in atmel_console_write()
commit 497e1e16f45c70574dc9922c7f75c642c2162119 upstream.

A side effect of 89d8232411a8 ("tty/serial: atmel_serial: BUG: stop DMA
from transmitting in stop_tx") is that the console can be called with
TX path disabled. Then the system would hang trying to push charecters
out in atmel_console_putchar().

Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Fixes: 89d8232411a8 ("tty/serial: atmel_serial: BUG: stop DMA from transmitting in stop_tx")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Richard Genoud
e087ae68e8 tty/serial: atmel: fix race condition (TX+DMA)
commit 31ca2c63fdc0aee725cbd4f207c1256f5deaabde upstream.

If uart_flush_buffer() is called between atmel_tx_dma() and
atmel_complete_tx_dma(), the circular buffer has been cleared, but not
atmel_port->tx_len.
That leads to a circular buffer overflow (dumping (UART_XMIT_SIZE -
atmel_port->tx_len) bytes).

Tested-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Joerg Roedel
b3641939b1 ACPI: Do not create a platform_device for IOAPIC/IOxAPIC
commit 08f63d97749185fab942a3a47ed80f5bd89b8b7d upstream.

No platform-device is required for IO(x)APICs, so don't even
create them.

[ rjw: This fixes a problem with leaking platform device objects
  after IOAPIC/IOxAPIC hot-removal events.]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Josh Poimboeuf
000d2bb6c0 ACPI: Fix incompatibility with mcount-based function graph tracing
commit 61b79e16c68d703dde58c25d3935d67210b7d71b upstream.

Paul Menzel reported a warning:

  WARNING: CPU: 0 PID: 774 at /build/linux-ROBWaj/linux-4.9.13/kernel/trace/trace_functions_graph.c:233 ftrace_return_to_handler+0x1aa/0x1e0
  Bad frame pointer: expected f6919d98, received f6919db0
    from func acpi_pm_device_sleep_wake return to c43b6f9d

The warning means that function graph tracing is broken for the
acpi_pm_device_sleep_wake() function.  That's because the ACPI Makefile
unconditionally sets the '-Os' gcc flag to optimize for size.  That's an
issue because mcount-based function graph tracing is incompatible with
'-Os' on x86, thanks to the following gcc bug:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109

I have another patch pending which will ensure that mcount-based
function graph tracing is never used with CONFIG_CC_OPTIMIZE_FOR_SIZE on
x86.

But this patch is needed in addition to that one because the ACPI
Makefile overrides that config option for no apparent reason.  It has
had this flag since the beginning of git history, and there's no related
comment, so I don't know why it's there.  As far as I can tell, there's
no reason for it to be there.  The appropriate behavior is for it to
honor CONFIG_CC_OPTIMIZE_FOR_{SIZE,PERFORMANCE} like the rest of the
kernel.

Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Helge Deller
76343bfbca parisc: Fix access fault handling in pa_memcpy()
commit 554bfeceb8a22d448cd986fc9efce25e833278a1 upstream.

pa_memcpy() is the major memcpy implementation in the parisc kernel which is
used to do any kind of userspace/kernel memory copies.

Al Viro noticed various bugs in the implementation of pa_mempcy(), most notably
that in case of faults it may report back to have copied more bytes than it
actually did.

Fixing those bugs is quite hard in the C-implementation, because the compiler
is messing around with the registers and we are not guaranteed that specific
variables are always in the same processor registers. This makes proper fault
handling complicated.

This patch implements pa_memcpy() in assembler. That way we have correct fault
handling and adding a 64-bit copy routine was quite easy.

Runtime tested with 32- and 64bit kernels.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Helge Deller
99e354a59a parisc: Avoid stalled CPU warnings after system shutdown
commit 476e75a44b56038bee9207242d4bc718f6b4de06 upstream.

Commit 73580dac7618 ("parisc: Fix system shutdown halt") introduced an endless
loop for systems which don't provide a software power off function.  But the
soft lockup detector will detect this and report stalled CPUs after some time.
Avoid those unwanted warnings by disabling the soft lockup detector.

Fixes: 73580dac7618 ("parisc: Fix system shutdown halt")
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Helge Deller
09b931fcb8 parisc: Clean up fixup routines for get_user()/put_user()
commit d19f5e41b344a057bb2450024a807476f30978d2 upstream.

Al Viro noticed that userspace accesses via get_user()/put_user() can be
simplified a lot with regard to usage of the exception handling.

This patch implements a fixup routine for get_user() and put_user() in such
that the exception handler will automatically load -EFAULT into the register
%r8 (the error value) in case on a fault on userspace.  Additionally the fixup
routine will zero the target register on fault in case of a get_user() call.
The target register is extracted out of the faulting assembly instruction.

This patch brings a few benefits over the old implementation:
1. Exception handling gets much cleaner, easier and smaller in size.
2. Helper functions like fixup_get_user_skip_1 (all of fixup.S) can be dropped.
3. No need to hardcode %r9 as target register for get_user() any longer. This
   helps the compiler register allocator and thus creates less assembler
   statements.
4. No dependency on the exception_data contents any longer.
5. Nested faults will be handled cleanly.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:34 +02:00
Kinglong Mee
3967cf7e6a nfsd: map the ENOKEY to nfserr_perm for avoiding warning
commit c952cd4e949ab3d07287efc2e80246e03727d15d upstream.

Now that Ext4 and f2fs filesystems support encrypted directories and
files, attempts to access those files may return ENOKEY, resulting in
the following WARNING.

Map ENOKEY to nfserr_perm instead of nfserr_io.

[ 1295.411759] ------------[ cut here ]------------
[ 1295.411787] WARNING: CPU: 0 PID: 12786 at fs/nfsd/nfsproc.c:796 nfserrno+0x74/0x80 [nfsd]
[ 1295.411806] nfsd: non-standard errno: -126
[ 1295.411816] Modules linked in: nfsd nfs_acl auth_rpcgss nfsv4 nfs lockd fscache tun bridge stp llc fuse ip_set nfnetlink vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event coretemp crct10dif_pclmul crc32_generic crc32_pclmul snd_ens1371 gameport ghash_clmulni_intel snd_ac97_codec f2fs intel_rapl_perf ac97_bus snd_seq ppdev snd_pcm snd_rawmidi snd_timer vmw_balloon snd_seq_device snd joydev soundcore parport_pc parport nfit acpi_cpufreq tpm_tis vmw_vmci tpm_tis_core tpm shpchp i2c_piix4 grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel e1000 mptspi scsi_transport_spi serio_raw mptscsih mptbase ata_generic pata_acpi fjes [last unloaded: nfs_acl]
[ 1295.412522] CPU: 0 PID: 12786 Comm: nfsd Tainted: G        W       4.11.0-rc1+ #521
[ 1295.412959] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[ 1295.413814] Call Trace:
[ 1295.414252]  dump_stack+0x63/0x86
[ 1295.414666]  __warn+0xcb/0xf0
[ 1295.415087]  warn_slowpath_fmt+0x5f/0x80
[ 1295.415502]  ? put_filp+0x42/0x50
[ 1295.415927]  nfserrno+0x74/0x80 [nfsd]
[ 1295.416339]  nfsd_open+0xd7/0x180 [nfsd]
[ 1295.416746]  nfs4_get_vfs_file+0x367/0x3c0 [nfsd]
[ 1295.417182]  ? security_inode_permission+0x41/0x60
[ 1295.417591]  nfsd4_process_open2+0x9b2/0x1200 [nfsd]
[ 1295.418007]  nfsd4_open+0x481/0x790 [nfsd]
[ 1295.418409]  nfsd4_proc_compound+0x395/0x680 [nfsd]
[ 1295.418812]  nfsd_dispatch+0xb8/0x1f0 [nfsd]
[ 1295.419233]  svc_process_common+0x4d9/0x830 [sunrpc]
[ 1295.419631]  svc_process+0xfe/0x1b0 [sunrpc]
[ 1295.420033]  nfsd+0xe9/0x150 [nfsd]
[ 1295.420420]  kthread+0x101/0x140
[ 1295.420802]  ? nfsd_destroy+0x60/0x60 [nfsd]
[ 1295.421199]  ? kthread_park+0x90/0x90
[ 1295.421598]  ret_from_fork+0x2c/0x40
[ 1295.421996] ---[ end trace 0d5a969cd7852e1f ]---

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Olga Kornievskaia
461bbb9094 NFSv4.1 fix infinite loop on IO BAD_STATEID error
commit 0e3d3e5df07dcf8a50d96e0ecd6ab9a888f55dfc upstream.

Commit 63d63cbf5e03 "NFSv4.1: Don't recheck delegations that
have already been checked" introduced a regression where when a
client received BAD_STATEID error it would not send any TEST_STATEID
and instead go into an infinite loop of resending the IO that caused
the BAD_STATEID.

Fixes: 63d63cbf5e03 ("NFSv4.1: Don't recheck delegations that have already been checked")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Ludovic Desroches
80df2b3e18 mmc: sdhci-of-at91: fix MMC_DDR_52 timing selection
commit d0918764c17b94c30bbb2619929b1719ff52707a upstream.

The controller has different timings for MMC_TIMING_UHS_DDR50 and
MMC_TIMING_MMC_DDR52. Configuring the controller with SDHCI_CTRL_UHS_DDR50,
when MMC_TIMING_MMC_DDR52 timings are requested, is not correct and can
lead to unexpected behavior.

Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Fixes: bb5f8ea4d514 ("mmc: sdhci-of-at91: introduce driver for the Atmel SDMMC")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Hans de Goede
fa3b4f4f57 mmc: sdhci: Disable runtime pm when the sdio_irq is enabled
commit 923713b357455cfb9aca2cd3429cb0806a724ed2 upstream.

SDIO cards may need clock to send the card interrupt to the host.

On a cherrytrail tablet with a RTL8723BS wifi chip, without this patch
pinging the tablet results in:

PING 192.168.1.14 (192.168.1.14) 56(84) bytes of data.
64 bytes from 192.168.1.14: icmp_seq=1 ttl=64 time=78.6 ms
64 bytes from 192.168.1.14: icmp_seq=2 ttl=64 time=1760 ms
64 bytes from 192.168.1.14: icmp_seq=3 ttl=64 time=753 ms
64 bytes from 192.168.1.14: icmp_seq=4 ttl=64 time=3.88 ms
64 bytes from 192.168.1.14: icmp_seq=5 ttl=64 time=795 ms
64 bytes from 192.168.1.14: icmp_seq=6 ttl=64 time=1841 ms
64 bytes from 192.168.1.14: icmp_seq=7 ttl=64 time=810 ms
64 bytes from 192.168.1.14: icmp_seq=8 ttl=64 time=1860 ms
64 bytes from 192.168.1.14: icmp_seq=9 ttl=64 time=812 ms
64 bytes from 192.168.1.14: icmp_seq=10 ttl=64 time=48.6 ms

Where as with this patch I get:

PING 192.168.1.14 (192.168.1.14) 56(84) bytes of data.
64 bytes from 192.168.1.14: icmp_seq=1 ttl=64 time=3.96 ms
64 bytes from 192.168.1.14: icmp_seq=2 ttl=64 time=1.97 ms
64 bytes from 192.168.1.14: icmp_seq=3 ttl=64 time=17.2 ms
64 bytes from 192.168.1.14: icmp_seq=4 ttl=64 time=2.46 ms
64 bytes from 192.168.1.14: icmp_seq=5 ttl=64 time=2.83 ms
64 bytes from 192.168.1.14: icmp_seq=6 ttl=64 time=1.40 ms
64 bytes from 192.168.1.14: icmp_seq=7 ttl=64 time=2.10 ms
64 bytes from 192.168.1.14: icmp_seq=8 ttl=64 time=1.40 ms
64 bytes from 192.168.1.14: icmp_seq=9 ttl=64 time=2.04 ms
64 bytes from 192.168.1.14: icmp_seq=10 ttl=64 time=1.40 ms

Cc: Dong Aisheng <b29396@freescale.com>
Cc: Ian W MORRISON <ianwmorrison@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Aaron Armstrong Skomra
8d6c332242 HID: wacom: Don't add ghost interface as shared data
commit 8b4073596997f2ccbf68d8e72e07b827388a4536 upstream.

A previous commit (below) adds a check for already probed interfaces to
Wacom's matching heuristic. Unfortunately this causes the Bamboo Pen
(CTL-460) to match itself to its 'ghost' touch interface. After
subsequent changes to the driver this match to the ghost causes the
kernel to crash. This patch avoids calling wacom_add_shared_data()
for the BAMBOO_PEN's ghost touch interface.

Fixes: 41372d5d40e7 ("HID: wacom: Augment 'oVid' and 'oPid' with heuristics for HID_GENERIC")
Signed-off-by: Aaron Armstrong Skomra <aaron.skomra@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Takashi Sakamoto
e5a1347391 ASoC: Intel: Skylake: fix invalid memory access due to wrong reference of pointer
commit d1a6fe41d3c4ff0d26f0b186d774493555ca5282 upstream.

In 'skl_tplg_set_module_init_data()', a pointer to 'params' member of
'struct skl_algo_data' is calculated, then casted to (u32 *) and assigned
to a member of configuration data. The configuration data is passed to the
other functions and used to process intel IPC. In this processing, the
value of member is used to get message data, however this can bring invalid
memory access in 'skl_set_module_params()' as a result of calculation of
a pointer for actual message data.

(sound/soc/intel/skylake/skl-topology.c)
skl_tplg_init_pipe_modules()
->skl_tplg_set_module_init_data() (has this bug)
->skl_tplg_set_module_params()
  (sound/soc/intel/skylake/skl-messages.c)
  ->skl_set_module_params()
    ((char *)param) + data_offset

This commit fixes the bug.

Fixes: abb740033b56 ("ASoC: Intel: Skylake: Add support to configure module params")
Signed-off-by: Takashi Sakamoto <takashi.sakamoto@miraclelinux.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Songjun Wu
7a042a4eeb ASoC: atmel-classd: fix audio clock rate
commit cd3ac9affc43b44f49d7af70d275f0bd426ba643 upstream.

Fix the audio clock rate according to the datasheet.

Reported-by: Dushara Jayasinghe <dushara@successful.com.au>
Signed-off-by: Songjun Wu <songjun.wu@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Hui Wang
8aabccdc9d ALSA: hda - fix a problem for lineout on a Dell AIO machine
commit 2f726aec19a9d2c63bec9a8a53a3910ffdcd09f8 upstream.

On this Dell AIO machine, the lineout jack does not work.

We found the pin 0x1a is assigned to lineout on this machine, and in
the past, we applied ALC298_FIXUP_DELL1_MIC_NO_PRESENCE to fix the
heaset-set mic problem for this machine, this fixup will redefine
the pin 0x1a to headphone-mic, as a result the lineout doesn't
work anymore.

After consulting with Dell, they told us this machine doesn't support
microphone via headset jack, so we add a new fixup which only defines
the pin 0x18 as the headset-mic.

[rearranged the fixup insertion position by tiwai in order to make the
 merge with other branches easier -- tiwai]

Fixes: 59ec4b57bcae ("ALSA: hda - Fix headset mic detection problem for two dell machines")
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Takashi Iwai
74a2c1ff88 ALSA: seq: Fix race during FIFO resize
commit 2d7d54002e396c180db0c800c1046f0a3c471597 upstream.

When a new event is queued while processing to resize the FIFO in
snd_seq_fifo_clear(), it may lead to a use-after-free, as the old pool
that is being queued gets removed.  For avoiding this race, we need to
close the pool to be deleted and sync its usage before actually
deleting it.

The issue was spotted by syzkaller.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Bjorn Helgaas
0dd5b335ed PCI: iproc: Save host bridge window resource in struct iproc_pcie
commit 6e347b5e05ea2ac4ac467a5a1cfaebb2c7f06f80 upstream.

The host bridge memory window resource is inserted into the iomem_resource
tree and cannot be deallocated until the host bridge itself is removed.

Previously, the window was on the stack, which meant the iomem_resource
entry pointed into the stack and was corrupted as soon as the probe
function returned, which caused memory corruption and errors like this:

  pcie_iproc_bcma bcma0:8: resource collision: [mem 0x40000000-0x47ffffff] conflicts with PCIe MEM space [mem 0x40000000-0x47ffffff]

Move the memory window resource from the stack into struct iproc_pcie so
its lifetime matches that of the host bridge.

Fixes: c3245a566400 ("PCI: iproc: Request host bridge window resources")
Reported-and-tested-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Bart Van Assche
8f9155989f scsi: scsi_dh_alua: Ensure that alua_activate() calls the completion function
commit 7cb689fe42927281b8d98606ae5450173fcc66a9 upstream.

Callers of scsi_dh_activate(), e.g. dm-mpath, assume that this function
either returns an error code or calls the completion function. Make
alua_activate() call the completion function even if scsi_device_get()
fails.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:33 +02:00
Bart Van Assche
68b275b7cb scsi: scsi_dh_alua: Check scsi_device_get() return value
commit 625fe857e4fac6518716f3c0ff5e5deb8ec6d238 upstream.

Do not queue ALUA work nor call scsi_device_put() if the
scsi_device_get() call fails. This patch fixes the following crash:

general protection fault: 0000 [#1] SMP
RIP: 0010:scsi_device_put+0xb/0x30
Call Trace:
 scsi_disk_put+0x2d/0x40
 sd_release+0x3d/0xb0
 __blkdev_put+0x29e/0x360
 blkdev_put+0x49/0x170
 dm_put_table_device+0x58/0xc0 [dm_mod]
 dm_put_device+0x70/0xc0 [dm_mod]
 free_priority_group+0x92/0xc0 [dm_multipath]
 free_multipath+0x70/0xc0 [dm_multipath]
 multipath_dtr+0x19/0x20 [dm_multipath]
 dm_table_destroy+0x67/0x120 [dm_mod]
 dev_suspend+0xde/0x240 [dm_mod]
 ctl_ioctl+0x1f5/0x520 [dm_mod]
 dm_ctl_ioctl+0xe/0x20 [dm_mod]
 do_vfs_ioctl+0x8f/0x700
 SyS_ioctl+0x3c/0x70
 entry_SYSCALL_64_fastpath+0x18/0xad

Fixes: commit 03197b61c5ec ("scsi_dh_alua: Use workqueue for RTPG")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
John Garry
cf31d6d215 scsi: libsas: fix ata xfer length
commit 9702c67c6066f583b629cf037d2056245bb7a8e6 upstream.

The total ata xfer length may not be calculated properly, in that we do
not use the proper method to get an sg element dma length.

According to the code comment, sg_dma_len() should be used after
dma_map_sg() is called.

This issue was found by turning on the SMMUv3 in front of the hisi_sas
controller in hip07. Multiple sg elements were being combined into a
single element, but the original first element length was being use as
the total xfer length.

Fixes: ff2aeb1eb64c8a4770a6 ("libata: convert to chained sg")
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
peter chang
c2a8695278 scsi: sg: check length passed to SG_NEXT_CMD_LEN
commit bf33f87dd04c371ea33feb821b60d63d754e3124 upstream.

The user can control the size of the next command passed along, but the
value passed to the ioctl isn't checked against the usable max command
size.

Signed-off-by: Peter Chang <dpf@google.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
Christoph Hellwig
d5dbd1c959 xfs: try any AG when allocating the first btree block when reflinking
commit 2fcc319d2467a5f5b78f35f79fd6e22741a31b1e upstream.

When a reflink operation causes the bmap code to allocate a btree block
we're currently doing single-AG allocations due to having ->firstblock
set and then try any higher AG due a little reflink quirk we've put in
when adding the reflink code.  But given that we do not have a minleft
reservation of any kind in this AG we can still not have any space in
the same or higher AG even if the file system has enough free space.
To fix this use a XFS_ALLOCTYPE_FIRST_AG allocation in this fall back
path instead.

[And yes, we need to redo this properly instead of piling hacks over
 hacks.  I'm working on that, but it's not going to be a small series.
 In the meantime this fixes the customer reported issue]

Also add a warning for failing allocations to make it easier to debug.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
Brian Foster
da617af8f0 xfs: use iomap new flag for newly allocated delalloc blocks
commit f65e6fad293b3a5793b7fa2044800506490e7a2e upstream.

Commit fa7f138 ("xfs: clear delalloc and cache on buffered write
failure") fixed one regression in the iomap error handling code and
exposed another. The fundamental problem is that if a buffered write
is a rewrite of preexisting delalloc blocks and the write fails, the
failure handling code can punch out preexisting blocks with valid
file data.

This was reproduced directly by sub-block writes in the LTP
kernel/syscalls/write/write03 test. A first 100 byte write allocates
a single block in a file. A subsequent 100 byte write fails and
punches out the block, including the data successfully written by
the previous write.

To address this problem, update the ->iomap_begin() handler to
distinguish newly allocated delalloc blocks from preexisting
delalloc blocks via the IOMAP_F_NEW flag. Use this flag in the
->iomap_end() handler to decide when a failed or short write should
punch out delalloc blocks.

This introduces the subtle requirement that ->iomap_begin() should
never combine newly allocated delalloc blocks with existing blocks
in the resulting iomap descriptor. This can occur when a new
delalloc reservation merges with a neighboring extent that is part
of the current write, for example. Therefore, drop the
post-allocation extent lookup from xfs_bmapi_reserve_delalloc() and
just return the record inserted into the fork. This ensures only new
blocks are returned and thus that preexisting delalloc blocks are
always handled as "found" blocks and not punched out on a failed
rewrite.

Reported-by: Xiong Zhou <xzhou@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
Chandan Rajendra
77aedb0cbe xfs: Use xfs_icluster_size_fsb() to calculate inode alignment mask
commit d5825712ee98d68a2c17bc89dad2c30276894cba upstream.

When block size is larger than inode cluster size, the call to
XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size) returns 0. Also, mkfs.xfs
would have set xfs_sb->sb_inoalignmt to 0. Hence in
xfs_set_inoalignment(), xfs_mount->m_inoalign_mask gets initialized to
-1 instead of 0. However, xfs_mount->m_sinoalign would get correctly
intialized to 0 because for every positive value of xfs_mount->m_dalign,
the condition "!(mp->m_dalign & mp->m_inoalign_mask)" would evaluate to
false.

Also, xfs_imap() worked fine even with xfs_mount->m_inoalign_mask having
-1 as the value because blks_per_cluster variable would have the value 1
and hence we would never have a need to use xfs_mount->m_inoalign_mask
to compute the inode chunk's agbno and offset within the chunk.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
Christoph Hellwig
d07b5855ab xfs: fix and streamline error handling in xfs_end_io
commit 787eb485509f9d58962bd8b4dbc6a5ac6e2034fe upstream.

There are two different cases of buffered I/O errors:

 - first we can have an already shutdown fs.  In that case we should skip
   any on-disk operations and just clean up the appen transaction if
   present and destroy the ioend
 - a real I/O error.  In that case we should cleanup any lingering COW
   blocks.  This gets skipped in the current code and is fixed by this
   patch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
Christoph Hellwig
3b83a02af2 xfs: only reclaim unwritten COW extents periodically
commit 3802a345321a08093ba2ddb1849e736f84e8d450 upstream.

We only want to reclaim preallocations from our periodic work item.
Currently this is archived by looking for a dirty inode, but that check
is rather fragile.  Instead add a flag to xfs_reflink_cancel_cow_* so
that the caller can ask for just cancelling unwritten extents in the COW
fork.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix typos in commit message]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
Christoph Hellwig
a240293694 xfs: tune down agno asserts in the bmap code
commit 410d17f67e583559be3a922f8b6cc336331893f3 upstream.

In various places we currently assert that xfs_bmap_btalloc allocates
from the same as the firstblock value passed in, unless it's either
NULLAGNO or the dop_low flag is set.  But the reflink code does not
fully follow this convention as it passes in firstblock purely as
a hint for the allocator without actually having previous allocations
in the transaction, and without having a minleft check on the current
AG, leading to the assert firing on a very full and heavily used
file system.  As even the reflink code only allocates from equal or
higher AGs for now we can simply the check to always allow for equal
or higher AGs.

Note that we need to eventually split the two meanings of the firstblock
value.  At that point we can also allow the reflink code to allocate
from any AG instead of limiting it in any way.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
Chandan Rajendra
9559c48c1a xfs: Use xfs_icluster_size_fsb() to calculate inode chunk alignment
commit 8ee9fdbebc84b39f1d1c201c5e32277c61d034aa upstream.

On a ppc64 system, executing generic/256 test with 32k block size gives the following call trace,

XFS: Assertion failed: args->maxlen > 0, file: /root/repos/linux/fs/xfs/libxfs/xfs_alloc.c, line: 2026

kernel BUG at /root/repos/linux/fs/xfs/xfs_message.c:113!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=2048
DEBUG_PAGEALLOC
NUMA
pSeries
Modules linked in:
CPU: 2 PID: 19361 Comm: mkdir Not tainted 4.10.0-rc5 #58
task: c000000102606d80 task.stack: c0000001026b8000
NIP: c0000000004ef798 LR: c0000000004ef798 CTR: c00000000082b290
REGS: c0000001026bb090 TRAP: 0700   Not tainted  (4.10.0-rc5)
MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>
CR: 28004428  XER: 00000000
CFAR: c0000000004ef180 SOFTE: 1
GPR00: c0000000004ef798 c0000001026bb310 c000000001157300 ffffffffffffffea
GPR04: 000000000000000a c0000001026bb130 0000000000000000 ffffffffffffffc0
GPR08: 00000000000000d1 0000000000000021 00000000ffffffd1 c000000000dd4990
GPR12: 0000000022004444 c00000000fe00800 0000000020000000 0000000000000000
GPR16: 0000000000000000 0000000043a606fc 0000000043a76c08 0000000043a1b3d0
GPR20: 000001002a35cd60 c0000001026bbb80 0000000000000000 0000000000000001
GPR24: 0000000000000240 0000000000000004 c00000062dc55000 0000000000000000
GPR28: 0000000000000004 c00000062ecd9200 0000000000000000 c0000001026bb6c0
NIP [c0000000004ef798] .assfail+0x28/0x30
LR [c0000000004ef798] .assfail+0x28/0x30
Call Trace:
[c0000001026bb310] [c0000000004ef798] .assfail+0x28/0x30 (unreliable)
[c0000001026bb380] [c000000000455d74] .xfs_alloc_space_available+0x194/0x1b0
[c0000001026bb410] [c00000000045b914] .xfs_alloc_fix_freelist+0x144/0x480
[c0000001026bb580] [c00000000045c368] .xfs_alloc_vextent+0x698/0xa90
[c0000001026bb650] [c0000000004a6200] .xfs_ialloc_ag_alloc+0x170/0x820
[c0000001026bb7c0] [c0000000004a9098] .xfs_dialloc+0x158/0x320
[c0000001026bb8a0] [c0000000004e628c] .xfs_ialloc+0x7c/0x610
[c0000001026bb990] [c0000000004e8138] .xfs_dir_ialloc+0xa8/0x2f0
[c0000001026bbaa0] [c0000000004e8814] .xfs_create+0x494/0x790
[c0000001026bbbf0] [c0000000004e5ebc] .xfs_generic_create+0x2bc/0x410
[c0000001026bbce0] [c0000000002b4a34] .vfs_mkdir+0x154/0x230
[c0000001026bbd70] [c0000000002bc444] .SyS_mkdirat+0x94/0x120
[c0000001026bbe30] [c00000000000b760] system_call+0x38/0xfc
Instruction dump:
4e800020 60000000 7c0802a6 7c862378 3c82ffca 7ca72b78 38841c18 7c651b78
38600000 f8010010 f821ff91 4bfff94d <0fe00000> 60000000 7c0802a6 7c892378

When block size is larger than inode cluster size, the call to
XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size) returns 0. Also, mkfs.xfs
would have set xfs_sb->sb_inoalignmt to 0. This causes
xfs_ialloc_cluster_alignment() to return 0.  Due to this
args.minalignslop (in xfs_ialloc_ag_alloc()) gets the unsigned
equivalent of -1 assigned to it. This later causes alloc_len in
xfs_alloc_space_available() to have a value of 0. In such a scenario
when args.total is also 0, the assert statement "ASSERT(args->maxlen >
0);" fails.

This commit fixes the bug by replacing the call to XFS_B_TO_FSBT() in
xfs_ialloc_cluster_alignment() with a call to xfs_icluster_size_fsb().

Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
Brian Foster
5db7b41b60 xfs: don't reserve blocks for right shift transactions
commit 48af96ab92bc68fb645068b978ce36df2379e076 upstream.

The block reservation for the transaction allocated in
xfs_shift_file_space() is an artifact of the original collapse range
support. It exists to handle the case where a collapse range occurs,
the initial extent is left shifted into a location that forms a
contiguous boundary with the previous extent and thus the extents
are merged. This code was subsequently refactored and reused for
insert range (right shift) support.

If an insert range occurs under low free space conditions, the
extent at the starting offset is split before the first shift
transaction is allocated. If the block reservation fails, this
leaves separate, but contiguous extents around in the inode. While
not a fatal problem, this is unexpected and will flag a warning on
subsequent insert range operations on the inode. This problem has
been reproduce intermittently by generic/270 running against a
ramdisk device.

Since right shift does not create new extent boundaries in the
inode, a block reservation for extent merge is unnecessary. Update
xfs_shift_file_space() to conditionally reserve fs blocks for left
shift transactions only. This avoids the warning reproduced by
generic/270.

Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:32 +02:00
Darrick J. Wong
e5e2e56fd4 xfs: fix uninitialized variable in _reflink_convert_cow
commit 93aaead52a9eebdc20dc8fa673c350e592a06949 upstream.

Fix an uninitialize variable.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Brian Foster
c251c6c2de xfs: split indlen reservations fairly when under reserved
commit 75d65361cf3c0dae2af970c305e19c727b28a510 upstream.

Certain workoads that punch holes into speculative preallocation can
cause delalloc indirect reservation splits when the delalloc extent is
split in two. If further splits occur, an already short-handed extent
can be split into two in a manner that leaves zero indirect blocks for
one of the two new extents. This occurs because the shortage is large
enough that the xfs_bmap_split_indlen() algorithm completely drains the
requested indlen of one of the extents before it honors the existing
reservation.

This ultimately results in a warning from xfs_bmap_del_extent(). This
has been observed during file copies of large, sparse files using 'cp
--sparse=always.'

To avoid this problem, update xfs_bmap_split_indlen() to explicitly
apply the reservation shortage fairly between both extents. This smooths
out the overall indlen shortage and defers the situation where we end up
with a delalloc extent with zero indlen reservation to extreme
circumstances.

Reported-by: Patrick Dung <mpatdung@gmail.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Brian Foster
2d7c1c7ffa xfs: handle indlen shortage on delalloc extent merge
commit 0e339ef8556d9e567aa7925f8892c263d79430d9 upstream.

When a delalloc extent is created, it can be merged with pre-existing,
contiguous, delalloc extents. When this occurs,
xfs_bmap_add_extent_hole_delay() merges the extents along with the
associated indirect block reservations. The expectation here is that the
combined worst case indlen reservation is always less than or equal to
the indlen reservation for the individual extents.

This is not always the case, however, as existing extents can less than
the expected indlen reservation if the extent was previously split due
to a hole punch. If a new extent merges with such an extent, the total
indlen requirement may be larger than the sum of the indlen reservations
held by both extents.

xfs_bmap_add_extent_hole_delay() assumes that the worst case indlen
reservation is always available and assigns it to the merged extent
without consideration for the indlen held by the pre-existing extent. As
a result, the subsequent xfs_mod_fdblocks() call can attempt an
unintentional allocation rather than a free (indicated by an ASSERT()
failure). Further, if the allocation happens to fail in this context,
the failure goes unhandled and creates a filesystem wide block
accounting inconsistency.

Fix xfs_bmap_add_extent_hole_delay() to function as designed. Cap the
indlen reservation assigned to the merged extent to the sum of the
indlen reservations held by each of the individual extents.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Christoph Hellwig
47d7d1ea6c xfs: don't fail xfs_extent_busy allocation
commit 5e30c23d13919a718b22d4921dc5c0accc59da27 upstream.

We don't just need the structure to track busy extents which can be
avoided with a synchronous transaction, but also to keep track of
pending discard.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Christoph Hellwig
5bbf5ba693 xfs: reject all unaligned direct writes to reflinked files
commit 54a4ef8af4e0dc5c983d17fcb9cf5fd25666d94e upstream.

We currently fall back from direct to buffered writes if we detect a
remaining shared extent in the iomap_begin callback.  But by the time
iomap_begin is called for the potentially unaligned end block we might
have already written most of the data to disk, which we'd now write
again using buffered I/O.  To avoid this reject all writes to reflinked
files before starting I/O so that we are guaranteed to only write the
data once.

The alternative would be to unshare the unaligned start and/or end block
before doing the I/O. I think that's doable, and will actually be
required to support reflinks on DAX file system.  But it will take a
little more time and I'd rather get rid of the double write ASAP.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[slight changes in context due to the new direct I/O code in 4.10+]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Christoph Hellwig
67eb7bf836 xfs: update ctime and mtime on clone destinatation inodes
commit c5ecb42342852892f978572ddc6dca703460f25a upstream.

We're changing both metadata and data, so we need to update the
timestamps for clone operations.  Dedupe on the other hand does
not change file data, and only changes invisible metadata so the
timestamps should not be updated.

This follows existing btrfs behavior.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: remove redundant is_dedupe test]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Hou Tao
e060f4884c xfs: reset b_first_retry_time when clear the retry status of xfs_buf_t
commit 4dd2eb633598cb6a5a0be2fd9a2be0819f5eeb5f upstream.

After successful IO or permanent error, b_first_retry_time also
needs to be cleared, else the invalid first retry time will be
used by the next retry check.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Darrick J. Wong
e02f0ff252 xfs: mark speculative prealloc CoW fork extents unwritten
commit 5eda43000064a69a39fb7869cc63c9571535ad29 upstream.

Christoph Hellwig pointed out that there's a potentially nasty race when
performing simultaneous nearby directio cow writes:

"Thread 1 writes a range from B to c

"                    B --------- C
                           p

"a little later thread 2 writes from A to B

"        A --------- B
               p

[editor's note: the 'p' denote cowextsize boundaries, which I added to
make this more clear]

"but the code preallocates beyond B into the range where thread
"1 has just written, but ->end_io hasn't been called yet.
"But once ->end_io is called thread 2 has already allocated
"up to the extent size hint into the write range of thread 1,
"so the end_io handler will splice the unintialized blocks from
"that preallocation back into the file right after B."

We can avoid this race by ensuring that thread 1 cannot accidentally
remap the blocks that thread 2 allocated (as part of speculative
preallocation) as part of t2's write preparation in t1's end_io handler.
The way we make this happen is by taking advantage of the unwritten
extent flag as an intermediate step.

Recall that when we begin the process of writing data to shared blocks,
we create a delayed allocation extent in the CoW fork:

D: --RRRRRRSSSRRRRRRRR---
C: ------DDDDDDD---------

When a thread prepares to CoW some dirty data out to disk, it will now
convert the delalloc reservation into an /unwritten/ allocated extent in
the cow fork.  The da conversion code tries to opportunistically
allocate as much of a (speculatively prealloc'd) extent as possible, so
we may end up allocating a larger extent than we're actually writing
out:

D: --RRRRRRSSSRRRRRRRR---
U: ------UUUUUUU---------

Next, we convert only the part of the extent that we're actively
planning to write to normal (i.e. not unwritten) status:

D: --RRRRRRSSSRRRRRRRR---
U: ------UURRUUU---------

If the write succeeds, the end_cow function will now scan the relevant
range of the CoW fork for real extents and remap only the real extents
into the data fork:

D: --RRRRRRRRSRRRRRRRR---
U: ------UU--UUU---------

This ensures that we never obliterate valid data fork extents with
unwritten blocks from the CoW fork.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Darrick J. Wong
8370826f7d xfs: allow unwritten extents in the CoW fork
commit 05a630d76bd3f39baf0eecfa305bed2820796dee upstream.

In the data fork, we only allow extents to perform the following state
transitions:

delay -> real <-> unwritten

There's no way to move directly from a delalloc reservation to an
/unwritten/ allocated extent.  However, for the CoW fork we want to be
able to do the following to each extent:

delalloc -> unwritten -> written -> remapped to data fork

This will help us to avoid a race in the speculative CoW preallocation
code between a first thread that is allocating a CoW extent and a second
thread that is remapping part of a file after a write.  In order to do
this, however, we need two things: first, we have to be able to
transition from da to unwritten, and second the function that converts
between real and unwritten has to be made aware of the cow fork.  Do
both of those things.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Darrick J. Wong
3d2bd2fd5c xfs: verify free block header fields
commit de14c5f541e78c59006bee56f6c5c2ef1ca07272 upstream.

Perform basic sanity checking of the directory free block header
fields so that we avoid hanging the system on invalid data.

(Granted that just means that now we shutdown on directory write,
but that seems better than hanging...)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Darrick J. Wong
4056a74aaf xfs: check for obviously bad level values in the bmbt root
commit b3bf607d58520ea8c0666aeb4be60dbb724cd3a2 upstream.

We can't handle a bmbt that's taller than BTREE_MAXLEVELS, and there's
no such thing as a zero-level bmbt (for that we have extents format),
so if we see this, send back an error code.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:31 +02:00
Darrick J. Wong
efab3ae29c xfs: filter out obviously bad btree pointers
commit d5a91baeb6033c3392121e4d5c011cdc08dfa9f7 upstream.

Don't let anybody load an obviously bad btree pointer.  Since the values
come from disk, we must return an error, not just ASSERT.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08 09:30:30 +02:00