Commit Graph

638488 Commits

Author SHA1 Message Date
3272bad0c2 usb: dwc3: replace %p with %pK
commit 04fb365c45 upstream.

%p will leak kernel pointers, so let's not expose the information on
dmesg and instead use %pK. %pK will only show the actual addresses if
explicitly enabled under /proc/sys/kernel/kptr_restrict.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-12 15:01:03 +02:00
366d9207d9 drm/virtio: don't leak bo on drm_gem_object_init failure
commit 385aee965b upstream.

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170406155941.458-1-kraxel@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-12 15:01:03 +02:00
b96976c1a8 media: entity: Fix stream count check
commit 41387a59c8 upstream.

There's a sanity check for the stream count remaining positive or zero on
error path, but instead of performing the check on the traversed entity it
is performed on the entity where traversal ends. Fix this.

Fixes: commit 3801bc7d1b ("[media] media: Media Controller fix to not let stream_count go negative")

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-12 15:01:02 +02:00
3693042f1c tracing/kprobes: Allow to create probe with a module name starting with a digit
commit 9e52b32567 upstream.

Always try to parse an address, since kstrtoul() will safely fail when
given a symbol as input. If that fails (which will be the case for a
symbol), try to parse a symbol instead.

This allows creating a probe such as:

    p:probe/vlan_gro_receive 8021q:vlan_gro_receive+0

Which is necessary for this command to work:

    perf probe -m 8021q -a vlan_gro_receive

Link: http://lkml.kernel.org/r/fd72d666f45b114e2c5b9cf7e27b91de1ec966f1.1498122881.git.sd@queasysnail.net

Fixes: 413d37d1e ("tracing: Add kprobe-based event tracer")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-12 15:01:02 +02:00
9403514ba1 ceph: choose readdir frag based on previous readdir reply
commit b50c2de51e upstream.

The dirfragtree is lazily updated, it's not always accurate. Infinite
loops happens in following circumstance.

- client send request to read frag A
- frag A has been fragmented into frag B and C. So mds fills the reply
  with contents of frag B
- client wants to read next frag C. ceph_choose_frag(frag value of C)
  return frag A.

The fix is using previous readdir reply to calculate next readdir frag
when possible.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-12 15:01:02 +02:00
c4c592b2c1 driver core: platform: fix race condition with driver_override
commit 6265539776 upstream.

The driver_override implementation is susceptible to race condition when
different threads are reading vs storing a different driver override.
Add locking to avoid race condition.

Fixes: 3d713e0e38 ("driver core: platform: add device binding path 'driver_override'")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Salido <salidoa@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-12 15:01:02 +02:00
26ff065b84 fs: completely ignore unknown open flags
commit 629e014bb8 upstream.

Currently we just stash anything we got into file->f_flags, and the
report it in fcntl(F_GETFD).  This patch just clears out all unknown
flags so that we don't pass them to the fs or report them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-12 15:01:02 +02:00
6efb1b0b6c fs: add a VALID_OPEN_FLAGS
commit 80f18379a7 upstream.

Add a central define for all valid open flags, and use it in the uniqueness
check.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-12 15:01:02 +02:00
9f86f302ec Linux 4.9.36 2017-07-05 14:40:44 +02:00
a29fd27ca2 KVM: nVMX: Fix exception injection
commit d4912215d1 upstream.

 WARNING: CPU: 3 PID: 2840 at arch/x86/kvm/vmx.c:10966 nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel]
 CPU: 3 PID: 2840 Comm: qemu-system-x86 Tainted: G           OE   4.12.0-rc3+ #23
 RIP: 0010:nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel]
 Call Trace:
  ? kvm_check_async_pf_completion+0xef/0x120 [kvm]
  ? rcu_read_lock_sched_held+0x79/0x80
  vmx_queue_exception+0x104/0x160 [kvm_intel]
  ? vmx_queue_exception+0x104/0x160 [kvm_intel]
  kvm_arch_vcpu_ioctl_run+0x1171/0x1ce0 [kvm]
  ? kvm_arch_vcpu_load+0x47/0x240 [kvm]
  ? kvm_arch_vcpu_load+0x62/0x240 [kvm]
  kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
  ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
  ? __fget+0xf3/0x210
  do_vfs_ioctl+0xa4/0x700
  ? __fget+0x114/0x210
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x81/0x220
  entry_SYSCALL64_slow_path+0x25/0x25

This is triggered occasionally by running both win7 and win2016 in L2, in
addition, EPT is disabled on both L1 and L2. It can't be reproduced easily.

Commit 0b6ac343fc (KVM: nVMX: Correct handling of exception injection) mentioned
that "KVM wants to inject page-faults which it got to the guest. This function
assumes it is called with the exit reason in vmcs02 being a #PF exception".
Commit e011c663 (KVM: nVMX: Check all exceptions for intercept during delivery to
L2) allows to check all exceptions for intercept during delivery to L2. However,
there is no guarantee the exit reason is exception currently, when there is an
external interrupt occurred on host, maybe a time interrupt for host which should
not be injected to guest, and somewhere queues an exception, then the function
nested_vmx_check_exception() will be called and the vmexit emulation codes will
try to emulate the "Acknowledge interrupt on exit" behavior, the warning is
triggered.

Reusing the exit reason from the L2->L0 vmexit is wrong in this case,
the reason must always be EXCEPTION_NMI when injecting an exception into
L1 as a nested vmexit.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Fixes: e011c663b9 ("KVM: nVMX: Check all exceptions for intercept during delivery to L2")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:31 +02:00
d1d3756f07 KVM: x86: zero base3 of unusable segments
commit f0367ee1d6 upstream.

Static checker noticed that base3 could be used uninitialized if the
segment was not present (useable).  Random stack values probably would
not pass VMCS entry checks.

Reported-by:  Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 1aa366163b ("KVM: x86 emulator: consolidate segment accessors")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:31 +02:00
f3c3ec96e5 KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh()
commit 34b0dadbdf upstream.

Static analysis noticed that pmu->nr_arch_gp_counters can be 32
(INTEL_PMC_MAX_GENERIC) and therefore cannot be used to shift 'int'.

I didn't add BUILD_BUG_ON for it as we have a better checker.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 25462f7f52 ("KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:31 +02:00
1eeb794263 KVM: x86: fix emulation of RSM and IRET instructions
commit 6ed071f051 upstream.

On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm
on hflags is reverted later on in x86_emulate_instruction where hflags are
overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests
as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu.

Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after
an instruction is emulated, this commit deletes emul_flags altogether and
makes the emulator access vcpu->arch.hflags using two new accessors. This
way all changes, on the emulator side as well as in functions called from
the emulator and accessing vcpu state with emul_to_vcpu, are preserved.

More details on the bug and its manifestation with Windows and OVMF:

  It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD.
  I believe that the SMM part explains why we started seeing this only with
  OVMF.

  KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates
  the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because
  later on in x86_emulate_instruction we overwrite arch.hflags with
  ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call.
  The AMD-specific hflag of interest here is HF_NMI_MASK.

  When rebooting the system, Windows sends an NMI IPI to all but the current
  cpu to shut them down. Only after all of them are parked in HLT will the
  initiating cpu finish the restart. If NMI is masked, other cpus never get
  the memo and the initiating cpu spins forever, waiting for
  hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe.

Fixes: a584539b24 ("KVM: x86: pass the whole hflags field to emulator and back")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:31 +02:00
982d8d92f2 arm64: fix NULL dereference in have_cpu_die()
commit 335d2c2d19 upstream.

Commit 5c492c3f52 ("arm64: smp: Add function to determine if cpus are
stuck in the kernel") added a helper function to determine if die() is
supported in cpu_ops. This function assumes a cpu will have a valid
cpu_ops entry, but that may not be the case for cpu0 is spin-table or
parking protocol is used to boot secondary cpus. In that case, there
is a NULL dereference if have_cpu_die() is called by cpu0. So add a
check for a valid cpu_ops before dereferencing it.

Fixes: 5c492c3f52 ("arm64: smp: Add function to determine if cpus are stuck in the kernel")
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:31 +02:00
a4bfcab309 mtd: nand: brcmnand: Check flash #WP pin status before nand erase/program
commit 9d2ee0a60b upstream.

On brcmnand controller v6.x and v7.x, the #WP pin is controlled through
the NAND_WP bit in CS_SELECT register.

The driver currently assumes that toggling the #WP pin is
instantaneously enabling/disabling write-protection, but it actually
takes some time to propagate the new state to the internal NAND chip
logic. This behavior is sometime causing data corruptions when an
erase/program operation is executed before write-protection has really
been disabled.

Fixes: 27c5b17cd1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller")
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:31 +02:00
de5862335e i2c: brcmstb: Fix START and STOP conditions
commit 2de3ec4f1d upstream.

The BSC data buffers to send and receive data are each of size 32 bytes
or 8 bytes 'xfersz' depending on SoC. The problem observed for all the
combined message transfer was if length of data transfer was a multiple
of 'xfersz' a repeated START was being transmitted by BSC driver. Fixed
this by appropriately setting START/STOP conditions for such transfers.

Fixes: dd1aa2524b ("i2c: brcmstb: Add Broadcom settop SoC i2c controller driver")
Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
Acked-by: Kamal Dasu <kdasu.kdev@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:30 +02:00
8ee785016d brcmfmac: avoid writing channel out of allocated array
commit 77c0d0cd10 upstream.

Our code was assigning number of channels to the index variable by
default. If firmware reported channel we didn't predict this would
result in using that initial index value and writing out of array. This
never happened so far (we got a complete list of supported channels) but
it means possible memory corruption so we should handle it anyway.

This patch simply detects unexpected channel and ignores it.

As we don't try to create new entry now, it's also safe to drop hw_value
and center_freq assignment. For known channels we have these set anyway.

I decided to fix this issue by assigning NULL or a target channel to the
channel variable. This was one of possible ways, I prefefred this one as
it also avoids using channel[index] over and over.

Fixes: 58de92d2f9 ("brcmfmac: use static superset of channels for wiphy bands")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:30 +02:00
65fc82cea8 infiniband: hns: avoid gcc-7.0.1 warning for uninitialized data
commit 5b0ff9a007 upstream.

hns_roce_v1_cq_set_ci() calls roce_set_bit() on an uninitialized field,
which will then change only a few of its bits, causing a warning with
the latest gcc:

infiniband/hw/hns/hns_roce_hw_v1.c: In function 'hns_roce_v1_cq_set_ci':
infiniband/hw/hns/hns_roce_hw_v1.c:1854:23: error: 'doorbell[1]' is used uninitialized in this function [-Werror=uninitialized]
  roce_set_bit(doorbell[1], ROCEE_DB_OTHERS_H_ROCEE_DB_OTH_HW_SYNS_S, 1);

The code is actually correct since we always set all bits of the
port_vlan field, but gcc correctly points out that the first
access does contain uninitialized data.

This initializes the field to zero first before setting the
individual bits.

Fixes: 9a4435375c ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:30 +02:00
3e51ccbadd objtool: Fix another GCC jump table detection issue
commit 5c51f4ae84 upstream.

Arnd Bergmann reported a (false positive) objtool warning:

  drivers/infiniband/sw/rxe/rxe_resp.o: warning: objtool: rxe_responder()+0xfe: sibling call from callable instruction with changed frame pointer

The issue is in find_switch_table().  It tries to find a switch
statement's jump table by walking backwards from an indirect jump
instruction, looking for a relocation to the .rodata section.  In this
case it stopped walking prematurely: the first .rodata relocation it
encountered was for a variable (resp_state_name) instead of a jump
table, so it just assumed there wasn't a jump table.

The fix is to ignore any .rodata relocation which refers to an ELF
object symbol.  This works because the jump tables are anonymous and
have no symbols associated with them.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 3732710ff6 ("objtool: Improve rare switch jump table pattern detection")
Link: http://lkml.kernel.org/r/20170302225723.3ndbsnl4hkqbne7a@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:30 +02:00
92e6667652 clk: scpi: don't add cpufreq device if the scpi dvfs node is disabled
commit 67bcc2c5f1 upstream.

Currently we add the virtual cpufreq device unconditionally even when
the SCPI DVFS clock provider node is disabled. This will cause cpufreq
driver to throw errors when it gets initailised on boot/modprobe and
also when the CPUs are hot-plugged back in.

This patch fixes the issue by adding the virtual cpufreq device only if
the SCPI DVFS clock provider is available and registered.

Fixes: 9490f01e24 ("clk: scpi: add support for cpufreq virtual device")
Reported-by: Michał Zegan <webczat_200@poczta.onet.pl>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Michał Zegan <webczat_200@poczta.onet.pl>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:30 +02:00
8a6f400a37 cpufreq: s3c2416: double free on driver init error path
commit a69261e447 upstream.

The "goto err_armclk;" error path already does a clk_put(s3c_freq->hclk);
so this is a double free.

Fixes: 34ee550752 ([CPUFREQ] Add S3C2416/S3C2450 cpufreq driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:30 +02:00
1781a29b31 iommu/amd: Fix interrupt remapping when disable guest_mode
commit 84a21dbdef upstream.

Pass-through devices to VM guest can get updated IRQ affinity
information via irq_set_affinity() when not running in guest mode.
Currently, AMD IOMMU driver in GA mode ignores the updated information
if the pass-through device is setup to use vAPIC regardless of guest_mode.
This could cause invalid interrupt remapping.

Also, the guest_mode bit should be set and cleared only when
SVM updates posted-interrupt interrupt remapping information.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Joerg Roedel <jroedel@suse.de>
Fixes: d98de49a53 ('iommu/amd: Enable vAPIC interrupt remapping mode by default')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:30 +02:00
0e55856b8f iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
commit 73dbd4a423 upstream.

In function amd_iommu_bind_pasid(), the control flow jumps
to label out_free when pasid_state->mm and mm is NULL. And
mmput(mm) is called.  In function mmput(mm), mm is
referenced without validation. This will result in a NULL
dereference bug. This patch fixes the bug.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Fixes: f0aac63b87 ('iommu/amd: Don't hold a reference to mm_struct')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:30 +02:00
f0c31c674a iommu/dma: Don't reserve PCI I/O windows
commit 938f1bbe35 upstream.

Even if a host controller's CPU-side MMIO windows into PCI I/O space do
happen to leak into PCI memory space such that it might treat them as
peer addresses, trying to reserve the corresponding I/O space addresses
doesn't do anything to help solve that problem. Stop doing a silly thing.

Fixes: fade1ec055 ("iommu/dma: Avoid PCI host bridge windows")
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:30 +02:00
d7fcb303d1 iommu: Handle default domain attach failure
commit 797a8b4d76 upstream.

We wouldn't normally expect ops->attach_dev() to fail, but on IOMMUs
with limited hardware resources, or generally misconfigured systems,
it is certainly possible. We report failure correctly from the external
iommu_attach_device() interface, but do not do so in iommu_group_add()
when attaching to the default domain. The result of failure there is
that the device, group and domain all get left in a broken,
part-configured state which leads to weird errors and misbehaviour down
the line when IOMMU API calls sort-of-but-don't-quite work.

Check the return value of __iommu_attach_device() on the default domain,
and refactor the error handling paths to cope with its failure and clean
up correctly in such cases.

Fixes: e39cb8a3aa ("iommu: Make sure a device is always attached to a domain")
Reported-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:30 +02:00
c19bfc6765 iommu/vt-d: Don't over-free page table directories
commit f7116e115a upstream.

dma_pte_free_level() recurses down the IOMMU page tables and frees
directory pages that are entirely contained in the given PFN range.
Unfortunately, it incorrectly calculates the starting address covered
by the PTE under consideration, which can lead to it clearing an entry
that is still in use.

This occurs if we have a scatterlist with an entry that has a length
greater than 1026 MB and is aligned to 2 MB for both the IOMMU and
physical addresses. For example, if __domain_mapping() is asked to map a
two-entry scatterlist with 2 MB and 1028 MB segments to PFN 0xffff80000,
it will ask if dma_pte_free_pagetable() is asked to PFNs from
0xffff80200 to 0xffffc05ff, it will also incorrectly clear the PFNs from
0xffff80000 to 0xffff801ff because of this issue. The current code will
set level_pfn to 0xffff80200, and 0xffff80200-0xffffc01ff fits inside
the range being cleared. Properly setting the level_pfn for the current
level under consideration catches that this PTE is outside of the range
being cleared.

This patch also changes the value passed into dma_pte_free_level() when
it recurses. This only affects the first PTE of the range being cleared,
and is handled by the existing code that ensures we start our cursor no
lower than start_pfn.

This was found when using dma_map_sg() to map large chunks of contiguous
memory, which immediatedly led to faults on the first access of the
erroneously-deleted mappings.

Fixes: 3269ee0bd6 ("intel-iommu: Fix leaks in pagetable freeing")
Reviewed-by: Benjamin Serebrin <serebrin@google.com>
Signed-off-by: David Dillow <dillow@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
d5c5e8ba5d ocfs2: o2hb: revert hb threshold to keep compatible
commit 33496c3c3d upstream.

Configfs is the interface for ocfs2-tools to set configure to kernel and
$configfs_dir/cluster/$clustername/heartbeat/dead_threshold is the one
used to configure heartbeat dead threshold.  Kernel has a default value
of it but user can set O2CB_HEARTBEAT_THRESHOLD in /etc/sysconfig/o2cb
to override it.

Commit 45b997737a ("ocfs2/cluster: use per-attribute show and store
methods") changed heartbeat dead threshold name while ocfs2-tools did
not, so ocfs2-tools won't set this configurable and the default value is
always used.  So revert it.

Fixes: 45b997737a ("ocfs2/cluster: use per-attribute show and store methods")
Link: http://lkml.kernel.org/r/1490665245-15374-1-git-send-email-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
8af88a950b x86/mm: Fix flush_tlb_page() on Xen
commit dbd68d8e84 upstream.

flush_tlb_page() passes a bogus range to flush_tlb_others() and
expects the latter to fix it up.  native_flush_tlb_others() has the
fixup but Xen's version doesn't.  Move the fixup to
flush_tlb_others().

AFAICS the only real effect is that, without this fix, Xen would
flush everything instead of just the one page on remote vCPUs in
when flush_tlb_page() was called.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: e7b52ffd45 ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range")
Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
3667dafd6c x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
commit 5ed386ec09 upstream.

When this function fails it just sends a SIGSEGV signal to
user-space using force_sig(). This signal is missing
essential information about the cause, e.g. the trap_nr or
an error code.

Fix this by propagating the error to the only caller of
mpx_handle_bd_fault(), do_bounds(), which sends the correct
SIGSEGV signal to the process.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: fe3d197f84 ('x86, mpx: On-demand kernel allocation of bounds tables')
Link: http://lkml.kernel.org/r/1491488362-27198-1-git-send-email-joro@8bytes.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
b287ade87c x86/boot/KASLR: Fix kexec crash due to 'virt_addr' calculation bug
commit 8eabf42ae5 upstream.

Kernel text KASLR is separated into physical address and virtual
address randomization. And for virtual address randomization, we
only randomiza to get an offset between 16M and KERNEL_IMAGE_SIZE.
So the initial value of 'virt_addr' should be LOAD_PHYSICAL_ADDR,
but not the original kernel loading address 'output'.

The bug will cause kernel boot failure if kernel is loaded at a different
position than the address, 16M, which is decided at compiled time.
Kexec/kdump is such practical case.

To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial
value.

Tested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately")
Link: http://lkml.kernel.org/r/1498567146-11990-3-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
15541e6416 tools arch: Sync arch/x86/lib/memcpy_64.S with the kernel
commit e883d09c9e upstream.

Just a minor fix done in:

  Fixes: 26a37ab319 ("x86/mce: Fix copy/paste error in exception table entries")

Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/n/tip-ni9jzdd5yxlail6pq8cuexw2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
a2c222bef0 ARM: 8685/1: ensure memblock-limit is pmd-aligned
commit 9e25ebfe56 upstream.

The pmd containing memblock_limit is cleared by prepare_page_table()
which creates the opportunity for early_alloc() to allocate unmapped
memory if memblock_limit is not pmd aligned causing a boot-time hang.

Commit 965278dcb8 ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
attempted to resolve this problem, but there is a path through the
adjust_lowmem_bounds() routine where if all memory regions start and
end on pmd-aligned addresses the memblock_limit will be set to
arm_lowmem_limit.

Since arm_lowmem_limit can be affected by the vmalloc early parameter,
the value of arm_lowmem_limit may not be pmd-aligned. This commit
corrects this oversight such that memblock_limit is always rounded
down to pmd-alignment.

Fixes: 965278dcb8 ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
7661b19687 ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation
commit cb7cf772d8 upstream.

The BAD_MADT_GICC_ENTRY() macro checks if a GICC MADT entry passes
muster from an ACPI specification standpoint. Current macro detects the
MADT GICC entry length through ACPI firmware version (it changed from 76
to 80 bytes in the transition from ACPI 5.1 to ACPI 6.0 specification)
but always uses (erroneously) the ACPICA (latest) struct (ie struct
acpi_madt_generic_interrupt - that is 80-bytes long) length to check if
the current GICC entry memory record exceeds the MADT table end in
memory as defined by the MADT table header itself, which may result in
false negatives depending on the ACPI firmware version and how the MADT
entries are laid out in memory (ie on ACPI 5.1 firmware MADT GICC
entries are 76 bytes long, so by adding 80 to a GICC entry start address
in memory the resulting address may well be past the actual MADT end,
triggering a false negative).

Fix the BAD_MADT_GICC_ENTRY() macro by reshuffling the condition checks
and update them to always use the firmware version specific MADT GICC
entry length in order to carry out boundary checks.

Fixes: b6cfb27737 ("ACPI / ARM64: add BAD_MADT_GICC_ENTRY() macro")
Reported-by: Julien Grall <julien.grall@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Al Stone <ahs3@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
4efe34b500 ARM: dts: OMAP3: Fix MFG ID EEPROM
commit 06e1a5cc57 upstream.

The manufacturing information is stored in the EEPROM.  This chip
is an AT24C64 not not (nor has it ever been) 24C02.  This patch will
correctly address the EEPROM to read the entire contents and not just
256 bytes (of 0xff).

Fixes: 5e3447a29a ("ARM: dts: LogicPD Torpedo: Add AT24 EEPROM Support")

Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
07bb2c7e7e ARM: OMAP2+: omap_device: Sync omap_device and pm_runtime after probe defer
commit 04abaf07f6 upstream.

Starting from commit 5de85b9d57 ("PM / runtime: Re-init runtime PM
states at probe error and driver unbind") pm_runtime core now changes
device runtime_status back to after RPM_SUSPENDED after a probe defer.
Certain OMAP devices make use of "ti,no-idle-on-init" flag which causes
omap_device_enable to be called during the BUS_NOTIFY_ADD_DEVICE event
during probe, along with pm_runtime_set_active.

This call to pm_runtime_set_active typically will prevent a call to
pm_runtime_get in a driver probe function from re-enabling the
omap_device. However, in the case of a probe defer that happens before
the driver probe function is able to run, such as a missing pinctrl
states defer, pm_runtime_reinit will set the device as RPM_SUSPENDED and
then once driver probe is actually able to run, pm_runtime_get will see
the device as suspended and call through to the omap_device layer,
attempting to enable the already enabled omap_device and causing errors
like this:

omap-gpmc 50000000.gpmc: omap_device: omap_device_enable() called from
invalid state 1
omap-gpmc 50000000.gpmc: use pm_runtime_put_sync_suspend() in driver?

We can avoid this error by making sure the pm_runtime status of a device
matches the omap_device state before a probe attempt. By extending the
omap_device bus notifier to act on the BUS_NOTIFY_BIND_DRIVER event we
can check if a device is enabled in omap_device but with a pm_runtime
status of RPM_SUSPENDED and once again mark the device as RPM_ACTIVE to
avoid a second incorrect call to omap_device_enable.

Fixes: 5de85b9d57 ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind")
Tested-by: Franklin S Cooper Jr. <fcooper@ti.com>
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
e57aa416ca regulator: tps65086: Fix DT node referencing in of_parse_cb
commit 6308f1787f upstream.

When we check for additional DT properties in the current node we
use the device_node passed in with the configuration data, this
will not point to the correct DT node, use the one passed in
for this purpose.

Fixes: d2a2e729a6 ("regulator: tps65086: Add regulator driver for the TPS65086 PMIC")
Reported-by: Steven Kipisz <s-kipisz2@ti.com>
Signed-off-by: Andrew F. Davis <afd@ti.com>
Tested-by: Steven Kipisz <s-kipisz2@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:29 +02:00
88baad2e71 regulator: tps65086: Fix expected switch DT node names
commit 1c47f7c316 upstream.

The three load switches are called SWA1, SWB1, and SWB2. The
node names describing properties for these are expected to be
the same, but due to a typo they are not. Fix this here.

Fixes: d2a2e729a6 ("regulator: tps65086: Add regulator driver for the TPS65086 PMIC")
Reported-by: Steven Kipisz <s-kipisz2@ti.com>
Signed-off-by: Andrew F. Davis <afd@ti.com>
Tested-by: Steven Kipisz <s-kipisz2@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:28 +02:00
9846c67974 spi: fix device-node leaks
commit 8324147f38 upstream.

Make sure to release the device-node reference taken in
of_register_spi_device() on errors and when deregistering the device.

Fixes: 284b018973 ("spi: Add OF binding support for SPI busses")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:28 +02:00
c52829f60f spi: When no dma_chan map buffers with spi_master's parent
commit 88b0aa544a upstream.

Back before commit 1dccb598df ("arm64: simplify dma_get_ops"), for
arm64, devices for which dma_ops were not explicitly set were automatically
configured to use swiotlb_dma_ops, since this was hard-coded as the
global "dma_ops" in arm64_dma_init().

Now that global "dma_ops" has been removed, all devices much have their
dma_ops explicitly set by a call to arch_setup_dma_ops(), otherwise the
device is assigned dummy_dma_ops, and thus calls to map_sg for such a
device will fail (return 0).

Mediatek SPI uses DMA but does not use a dma channel.  Support for this
was added by commit c37f45b5f1 ("spi: support spi without dma channel
to use can_dma()"), which uses the master_spi dev to DMA map buffers.

The master_spi device is not a platform device, rather it is created
in spi_alloc_device(), and therefore its dma_ops are never set.

Therefore, when the mediatek SPI driver when it does DMA (for large SPI
transactions > 32 bytes), SPI will use spi_map_buf()->dma_map_sg() to
map the buffer for use in DMA.  But dma_map_sg()->dma_map_sg_attrs() returns
0, because ops->map_sg is dummy_dma_ops->__dummy_map_sg, and hence
spi_map_buf() returns -ENOMEM (-12).

Fix this by using the real spi_master's parent device which should be a
real physical device with DMA properties.

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Fixes: c37f45b5f1 ("spi: support spi without dma channel to use can_dma()")
Cc: Leilk Liu <leilk.liu@mediatek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:28 +02:00
478273e115 sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting
commit 6e5f32f7a4 upstream.

If we crossed a sample window while in NO_HZ we will add LOAD_FREQ to
the pending sample window time on exit, setting the next update not
one window into the future, but two.

This situation on exiting NO_HZ is described by:

  this_rq->calc_load_update < jiffies < calc_load_update

In this scenario, what we should be doing is:

  this_rq->calc_load_update = calc_load_update		     [ next window ]

But what we actually do is:

  this_rq->calc_load_update = calc_load_update + LOAD_FREQ   [ next+1 window ]

This has the effect of delaying load average updates for potentially
up to ~9seconds.

This can result in huge spikes in the load average values due to
per-cpu uninterruptible task counts being out of sync when accumulated
across all CPUs.

It's safe to update the per-cpu active count if we wake between sample
windows because any load that we left in 'calc_load_idle' will have
been zero'd when the idle load was folded in calc_global_load().

This issue is easy to reproduce before,

  commit 9d89c257df ("sched/fair: Rewrite runnable load and utilization average tracking")

just by forking short-lived process pipelines built from ps(1) and
grep(1) in a loop. I'm unable to reproduce the spikes after that
commit, but the bug still seems to be present from code review.

Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Fixes: commit 5167e8d ("sched/nohz: Rewrite and fix load-avg computation -- again")
Link: http://lkml.kernel.org/r/20170217120731.11868-2-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:28 +02:00
eea0261db8 watchdog: bcm281xx: Fix use of uninitialized spinlock.
commit fedf266f99 upstream.

The bcm_kona_wdt_set_resolution_reg() call takes the spinlock, so
initialize it earlier.  Fixes a warning at boot with lock debugging
enabled.

Fixes: 6adb730dc2 ("watchdog: bcm281xx: Watchdog Driver")
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:28 +02:00
4211442b20 netfilter: use skb_to_full_sk in ip_route_me_harder
commit 29e09229d9 upstream.

inet_sk(skb->sk) is illegal in case skb is attached to request socket.

Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Reported by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:28 +02:00
ac2730234c xfrm: Oops on error in pfkey_msg2xfrm_state()
commit 1e3d0c2c70 upstream.

There are some missing error codes here so we accidentally return NULL
instead of an error pointer.  It results in a NULL pointer dereference.

Fixes: df71837d50 ("[LSM-IPSec]: Security association restriction.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:28 +02:00
c460f2beb6 xfrm: NULL dereference on allocation failure
commit e747f64336 upstream.

The default error code in pfkey_msg2xfrm_state() is -ENOBUFS.  We
added a new call to security_xfrm_state_alloc() which sets "err" to zero
so there several places where we can return ERR_PTR(0) if kmalloc()
fails.  The caller is expecting error pointers so it leads to a NULL
dereference.

Fixes: df71837d50 ("[LSM-IPSec]: Security association restriction.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:28 +02:00
1e1666257c xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY
commit 9b3eb54106 upstream.

When CONFIG_XFRM_SUB_POLICY=y, xfrm_dst stores a copy of the flowi for
that dst. Unfortunately, the code that allocates and fills this copy
doesn't care about what type of flowi (flowi, flowi4, flowi6) gets
passed. In multiple code paths (from raw_sendmsg, from TCP when
replying to a FIN, in vxlan, geneve, and gre), the flowi that gets
passed to xfrm is actually an on-stack flowi4, so we end up reading
stuff from the stack past the end of the flowi4 struct.

Since xfrm_dst->origin isn't used anywhere following commit
ca116922af ("xfrm: Eliminate "fl" and "pol" args to
xfrm_bundle_ok()."), just get rid of it.  xfrm_dst->partner isn't used
either, so get rid of that too.

Fixes: 9d6ec93801 ("ipv4: Use flowi4 in public route lookup interfaces.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:28 +02:00
647f605276 mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings
commit 029c54b095 upstream.

Existing code that uses vmalloc_to_page() may assume that any address
for which is_vmalloc_addr() returns true may be passed into
vmalloc_to_page() to retrieve the associated struct page.

This is not un unreasonable assumption to make, but on architectures
that have CONFIG_HAVE_ARCH_HUGE_VMAP=y, it no longer holds, and we need
to ensure that vmalloc_to_page() does not go off into the weeds trying
to dereference huge PUDs or PMDs as table entries.

Given that vmalloc() and vmap() themselves never create huge mappings or
deal with compound pages at all, there is no correct answer in this
case, so return NULL instead, and issue a warning.

When reading /proc/kcore on arm64, you will hit an oops as soon as you
hit the huge mappings used for the various segments that make up the
mapping of vmlinux.  With this patch applied, you will no longer hit the
oops, but the kcore contents willl be incorrect (these regions will be
zeroed out)

We are fixing this for kcore specifically, so it avoids vread() for
those regions.  At least one other problematic user exists, i.e.,
/dev/kmem, but that is currently broken on arm64 for other reasons.

Link: http://lkml.kernel.org/r/20170609082226.26152-1-ard.biesheuvel@linaro.org
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: zhong jiang <zhongjiang@huawei.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ardb: non-trivial backport to v4.9]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:28 +02:00
f9f73c58fe ravb: Fix use-after-free on ifconfig eth0 down
[ Upstream commit 79514ef670 ]

Commit a47b70ea86 ("ravb: unmap descriptors when freeing rings") has
introduced the issue seen in [1] reproduced on H3ULCB board.

Fix this by relocating the RX skb ringbuffer free operation, so that
swiotlb page unmapping can be done first. Freeing of aligned TX buffers
is not relevant to the issue seen in [1]. Still, reposition TX free
calls as well, to have all kfree() operations performed consistently
_after_ dma_unmap_*()/dma_free_*().

[1] Console screenshot with the problem reproduced:

salvator-x login: root
root@salvator-x:~# ifconfig eth0 up
Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: \
       attached PHY driver [Micrel KSZ9031 Gigabit PHY]   \
       (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=235)
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
root@salvator-x:~#
root@salvator-x:~# ifconfig eth0 down

==================================================================
BUG: KASAN: use-after-free in swiotlb_tbl_unmap_single+0xc4/0x35c
Write of size 1538 at addr ffff8006d884f780 by task ifconfig/1649

CPU: 0 PID: 1649 Comm: ifconfig Not tainted 4.12.0-rc4-00004-g112eb07287d1 #32
Hardware name: Renesas H3ULCB board based on r8a7795 (DT)
Call trace:
[<ffff20000808f11c>] dump_backtrace+0x0/0x3a4
[<ffff20000808f4d4>] show_stack+0x14/0x1c
[<ffff20000865970c>] dump_stack+0xf8/0x150
[<ffff20000831f8b0>] print_address_description+0x7c/0x330
[<ffff200008320010>] kasan_report+0x2e0/0x2f4
[<ffff20000831eac0>] check_memory_region+0x20/0x14c
[<ffff20000831f054>] memcpy+0x48/0x68
[<ffff20000869ed50>] swiotlb_tbl_unmap_single+0xc4/0x35c
[<ffff20000869fcf4>] unmap_single+0x90/0xa4
[<ffff20000869fd14>] swiotlb_unmap_page+0xc/0x14
[<ffff2000080a2974>] __swiotlb_unmap_page+0xcc/0xe4
[<ffff2000088acdb8>] ravb_ring_free+0x514/0x870
[<ffff2000088b25dc>] ravb_close+0x288/0x36c
[<ffff200008aaf8c4>] __dev_close_many+0x14c/0x174
[<ffff200008aaf9b4>] __dev_close+0xc8/0x144
[<ffff200008ac2100>] __dev_change_flags+0xd8/0x194
[<ffff200008ac221c>] dev_change_flags+0x60/0xb0
[<ffff200008ba2dec>] devinet_ioctl+0x484/0x9d4
[<ffff200008ba7b78>] inet_ioctl+0x190/0x194
[<ffff200008a78c44>] sock_do_ioctl+0x78/0xa8
[<ffff200008a7a128>] sock_ioctl+0x110/0x3c4
[<ffff200008365a70>] vfs_ioctl+0x90/0xa0
[<ffff200008365dbc>] do_vfs_ioctl+0x148/0xc38
[<ffff2000083668f0>] SyS_ioctl+0x44/0x74
[<ffff200008083770>] el0_svc_naked+0x24/0x28

The buggy address belongs to the page:
page:ffff7e001b6213c0 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x4000000000000000()
raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff
raw: 0000000000000000 ffff7e001b6213e0 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8006d884f680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff8006d884f700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff8006d884f780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                   ^
 ffff8006d884f800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff8006d884f880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Disabling lock debugging due to kernel taint
root@salvator-x:~#

Fixes: a47b70ea86 ("ravb: unmap descriptors when freeing rings")
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:27 +02:00
adfe95fe5b ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets
[ Upstream commit 0e9a709560 ]

This fix addresses two problems in the way the DSCP field is formulated
 on the encapsulating header of IPv6 tunnels.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195661

1) The IPv6 tunneling code was manipulating the DSCP field of the
 encapsulating packet using the 32b flowlabel. Since the flowlabel is
 only the lower 20b it was incorrect to assume that the upper 12b
 containing the DSCP and ECN fields would remain intact when formulating
 the encapsulating header. This fix handles the 'inherit' and
 'fixed-value' DSCP cases explicitly using the extant dsfield u8 variable.

2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was
 incorrect and resulted in the DSCP value always being set to 0.

Commit 90427ef5d2 ("ipv6: fix flow labels when the traffic class
 is non-0") caused the regression by masking out the flowlabel
 which exposed the incorrect handling of the DSCP portion of the
 flowlabel in ip6_tunnel and ip6_gre.

Fixes: 90427ef5d2 ("ipv6: fix flow labels when the traffic class is non-0")
Signed-off-by: Peter Dawson <peter.a.dawson@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:27 +02:00
168bd51ec5 sctp: check af before verify address in sctp_addr_id2transport
[ Upstream commit 912964eacb ]

Commit 6f29a13061 ("sctp: sctp_addr_id2transport should verify the
addr before looking up assoc") invoked sctp_verify_addr to verify the
addr.

But it didn't check af variable beforehand, once users pass an address
with family = 0 through sockopt, sctp_get_af_specific will return NULL
and NULL pointer dereference will be caused by af->sockaddr_len.

This patch is to fix it by returning NULL if af variable is NULL.

Fixes: 6f29a13061 ("sctp: sctp_addr_id2transport should verify the addr before looking up assoc")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:27 +02:00
399566f8a4 net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV
[ Upstream commit 9577b174cd ]

When running SRIOV, warnings for SRQ LIMIT events flood the Hypervisor's
message log when (correct, normally operating) apps use SRQ LIMIT events
as a trigger to post WQEs to SRQs.

Add more information to the existing debug printout for SRQ_LIMIT, and
output the warning messages only for the SRQ CATAS ERROR event.

Fixes: acba2420f9 ("mlx4_core: Add wrapper functions and comm channel and slave event support to EQs")
Fixes: e0debf9cb5 ("mlx4_core: Reduce warning message for SRQ_LIMIT event to debug level")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:40:27 +02:00