IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 46437f9a554fbe3e110580ca08ab703b59f2f95a upstream.
If the indirect_ptr bit is set on a slot, that indicates we need to redo
the lookup. Introduce a new function radix_tree_iter_retry() which
forces the loop to retry the lookup by setting 'slot' to NULL and
turning the iterator back to point at the problematic entry.
This is a pretty rare problem to hit at the moment; the lookup has to
race with a grow of the radix tree from a height of 0. The consequences
of hitting this race are that gang lookup could return a pointer to a
radix_tree_node instead of a pointer to whatever the user had inserted
in the tree.
Fixes: cebbd29e1c2f ("radix-tree: rewrite gang lookup using iterator")
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6611d8d76132f86faa501de9451a89bf23fb2371 upstream.
A spare array holding mem cgroup threshold events is kept around to make
sure we can always safely deregister an event and have an array to store
the new set of events in.
In the scenario where we're going from 1 to 0 registered events, the
pointer to the primary array containing 1 event is copied to the spare
slot, and then the spare slot is freed because no events are left.
However, it is freed before calling synchronize_rcu(), which means
readers may still be accessing threshold->primary after it is freed.
Fixed by only freeing after synchronize_rcu().
Signed-off-by: Martijn Coenen <maco@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c2ff95e41c9290d16556cd02e35b25d81be8fe0 upstream.
When working with hugetlbfs ptes (which are actually pmds) is not valid to
directly use pte functions like pte_present() because the hardware bit
layout of pmds and ptes can be different. This is the case on s390.
Therefore we have to convert the hugetlbfs ptes first into a valid pte
encoding with huge_ptep_get().
Currently the /proc/<pid>/numa_maps code uses hugetlbfs ptes without
huge_ptep_get(). On s390 this leads to the following two problems:
1) The pte_present() function returns false (instead of true) for
PROT_NONE hugetlb ptes. Therefore PROT_NONE vmas are missing
completely in the "numa_maps" output.
2) The pte_dirty() function always returns false for all hugetlb ptes.
Therefore these pages are reported as "mapped=xxx" instead of
"dirty=xxx".
Therefore use huge_ptep_get() to correctly convert the hugetlb ptes.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9aacdd354d197ad64685941b36d28ea20ab88757 upstream.
Hillf Danton noticed bugs in the hugetlb_vmtruncate_list routine. The
argument end is of type pgoff_t. It was being converted to a vaddr
offset and passed to unmap_hugepage_range. However, end was also being
used as an argument to the vma_interval_tree_foreach controlling loop.
In addition, the conversion of end to vaddr offset was incorrect.
hugetlb_vmtruncate_list is called as part of a file truncate or
fallocate hole punch operation.
When truncating a hugetlbfs file, this bug could prevent some pages from
being unmapped. This is possible if there are multiple vmas mapping the
file, and there is a sufficiently sized hole between the mappings. The
size of the hole between two vmas (A,B) must be such that the starting
virtual address of B is greater than (ending virtual address of A <<
PAGE_SHIFT). In this case, the pages in B would not be unmapped. If
pages are not properly unmapped during truncate, the following BUG is
hit:
kernel BUG at fs/hugetlbfs/inode.c:428!
In the fallocate hole punch case, this bug could prevent pages from
being unmapped as in the truncate case. However, for hole punch the
result is that unmapped pages will not be removed during the operation.
For hole punch, it is also possible that more pages than desired will be
unmapped. This unnecessary unmapping will cause page faults to
reestablish the mappings on subsequent page access.
Fixes: 1bfad99ab (" hugetlbfs: hugetlb_vmtruncate_list() needs to take a range")Reported-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72214a24a7677d4c7501eecc9517ed681b5f2db2 upstream.
In Python3+ print is a function so the old syntax is not correct
anymore:
$ ./scripts/bloat-o-meter vmlinux.o vmlinux.o.old
File "./scripts/bloat-o-meter", line 61
print "add/remove: %s/%s grow/shrink: %s/%s up/down: %s/%s (%s)" % \
^
SyntaxError: invalid syntax
Fix by calling print as a function.
Tested on python 2.7.11, 3.5.1
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea535e418c01837d07b6c94e817540f50bfdadb0 upstream.
In include/asm-generic/sections.h:
/*
* Usage guidelines:
* _text, _data: architecture specific, don't use them in
* arch-independent code
* [_stext, _etext]: contains .text.* sections, may also contain
* .rodata.*
* and/or .init.* sections
_text is not guaranteed across architectures. Architectures such as ARM
may reuse parts which are not actually text and erroneously trigger a bug.
Switch to using _stext which is guaranteed to contain text sections.
Came out of https://lkml.kernel.org/g/<567B1176.4000106@redhat.com>
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 601f1db653217f205ffa5fb33514b4e1711e56d1 upstream.
The build of m32104ut_defconfig for m32r arch was failing for long long
time with the error:
ERROR: "memory_start" [fs/udf/udf.ko] undefined!
ERROR: "memory_end" [fs/udf/udf.ko] undefined!
ERROR: "memory_end" [drivers/scsi/sg.ko] undefined!
ERROR: "memory_start" [drivers/scsi/sg.ko] undefined!
ERROR: "memory_end" [drivers/i2c/i2c-dev.ko] undefined!
ERROR: "memory_start" [drivers/i2c/i2c-dev.ko] undefined!
As done in other architectures export the symbols to fix the error.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c82171167adb8e4ac77b91a42cd49fb211a81a0 upstream.
xhci driver frees data for all devices, both usb2 and and usb3 the
first time usb_remove_hcd() is called, including td_list and and xhci_ring
structures.
When usb_remove_hcd() is called a second time for the second xhci bus it
will try to dequeue all pending urbs, and touches td_list which is already
freed for that endpoint.
Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a6835090716a85f2297668ba593bd00e1051e662 upstream.
This reverts commit e210c422b6fd ("xhci: don't finish a TD if we get a
short transfer event mid TD")
Turns out that most host controllers do not follow the xHCI specs and never
send the second event for the last TRB in the TD if there was a short event
mid-TD.
Returning the URB directly after the first short-transfer event is far
better than never returning the URB. (class drivers usually timeout
after 30sec). For the hosts that do send the second event we will go
back to treating it as misplaced event and print an error message for it.
The origial patch was sent to stable kernels and needs to be reverted from
there as well
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 46924008273ed03bd11dbb32136e3da4cfe056e1 upstream.
According to the VT-d specification we need to clear the PPR bit in
the Page Request Status register when handling page requests, or the
hardware won't generate any more interrupts.
This wasn't actually necessary on SKL/KBL (which may well be the
subject of a hardware erratum, although it's harmless enough). But
other implementations do appear to get it right, and we only ever get
one interrupt unless we clear the PPR bit.
Reported-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fda3bec12d0979aae3f02ee645913d66fbc8a26e upstream.
This is a 32-bit register. Apparently harmless on real hardware, but
causing justified warnings in simulation.
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e57e58bd390a6843db58560bf7b8341665d2e058 upstream.
Holding mm_users works OK for graphics, which was the first user of SVM
with VT-d. However, it works less well for other devices, where we actually
do a mmap() from the file descriptor to which the SVM PASID state is tied.
In this case on process exit we end up with a recursive reference count:
- The MM remains alive until the file is closed and the driver's release()
call ends up unbinding the PASID.
- The VMA corresponding to the mmap() remains intact until the MM is
destroyed.
- Thus the file isn't closed, even when exit_files() runs, because the
VMA is still holding a reference to it. And the MM remains alive…
To address this issue, we *stop* holding mm_users while the PASID is bound.
We already hold mm_count by virtue of the MMU notifier, and that can be
made to be sufficient.
It means that for a period during process exit, the fun part of mmput()
has happened and exit_mmap() has been called so the MM is basically
defunct. But the PGD still exists and the PASID is still bound to it.
During this period, we have to be very careful — exit_mmap() doesn't use
mm->mmap_sem because it doesn't expect anyone else to be touching the MM
(quite reasonably, since mm_users is zero). So we also need to fix the
fault handler to just report failure if mm_users is already zero, and to
temporarily bump mm_users while handling any faults.
Additionally, exit_mmap() calls mmu_notifier_release() *before* it tears
down the page tables, which is too early for us to flush the IOTLB for
this PASID. And __mmu_notifier_release() removes every notifier from the
list, so when exit_mmap() finally *does* tear down the mappings and
clear the page tables, we don't get notified. So we work around this by
clearing the PASID table entry in our MMU notifier release() callback.
That way, the hardware *can't* get any pages back from the page tables
before they get cleared.
Hardware designers have confirmed that the resulting 'PASID not present'
faults should be handled just as gracefully as 'page not present' faults,
the important criterion being that they don't perturb the operation for
any *other* PASID in the system.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b1a12d29109234d2b9718d04d4d404b7da4e794 upstream.
In below commit alias DTE is set when its peripheral is
setting DTE. However there's a code bug here to wrongly
set the alias DTE, correct it in this patch.
commit e25bfb56ea7f046b71414e02f80f620deb5c6362
Author: Joerg Roedel <jroedel@suse.de>
Date: Tue Oct 20 17:33:38 2015 +0200
iommu/amd: Set alias DTE in do_attach/do_detach
Signed-off-by: Baoquan He <bhe@redhat.com>
Tested-by: Mark Hounschell <markh@compro.net>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d4f1b06d685d11ebdaccf11c0db1cb3c78736862 upstream.
We should set device's capabilities first, and then register it,
otherwise various handlers already present in the kernel will not be
able to connect to the device.
Reported-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 564b026fbd0d28e9f70fb3831293d2922bb7855b upstream.
It was noticed that we lose precision in the final calculation for some
inputs. The most egregious example is size=3000 blk_size=1900 in units
of 10 should yield 5.70 MB but in fact yields 3.00 MB (oops).
This is because the current algorithm doesn't correctly account for
all the remainders in the logarithms. Fix this by doing a correct
calculation in the remainders based on napier's algorithm.
Additionally, now we have the correct result, we have to account for
arithmetic rounding because we're printing 3 digits of precision. This
means that if the fourth digit is five or greater, we have to round up,
so add a section to ensure correct rounding. Finally account for all
possible inputs correctly, including zero for block size.
Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd0d0d4de582a6a61c032332c91f4f4cb2bab569 upstream.
Without i8042.nomux=1 the Elantech touch pad is not working at all on
a Fujitsu Lifebook U745. This patch does not seem necessary for all
U745 (maybe because of different BIOS versions?). However, it was
verified that the patch does not break those (see opensuse bug 883192:
https://bugzilla.opensuse.org/show_bug.cgi?id=883192).
Signed-off-by: Aurélien Francillon <aurelien@francillon.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6544a1df11c48c8413071aac3316792e4678fbfb upstream.
When using a protocol v2 or v3 hardware, elantech uses the function
elantech_report_semi_mt_data() to report data. This devices are rather
creepy because if num_finger is 3, (x2,y2) is (0,0). Yes, only one valid
touch is reported.
Anyway, userspace (libinput) is now confused by these (0,0) touches,
and detect them as palm, and rejects them.
Commit 3c0213d17a09 ("Input: elantech - fix semi-mt protocol for v3 HW")
was sufficient enough for xf86-input-synaptics and libinput before it has
palm rejection. Now we need to actually tell libinput that this device is
a semi-mt one and it should not rely on the actual values of the 2 touches.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48f7df329474b49d83d0dffec1b6186647f11976 upstream.
Grazvydas Ignotas has reported a regression in remap_file_pages()
emulation.
Testcase:
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#define SIZE (4096 * 3)
int main(int argc, char **argv)
{
unsigned long *p;
long i;
p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
if (p == MAP_FAILED) {
perror("mmap");
return -1;
}
for (i = 0; i < SIZE / 4096; i++)
p[i * 4096 / sizeof(*p)] = i;
if (remap_file_pages(p, 4096, 0, 1, 0)) {
perror("remap_file_pages");
return -1;
}
if (remap_file_pages(p, 4096 * 2, 0, 1, 0)) {
perror("remap_file_pages");
return -1;
}
assert(p[0] == 1);
munmap(p, SIZE);
return 0;
}
The second remap_file_pages() fails with -EINVAL.
The reason is that remap_file_pages() emulation assumes that the target
vma covers whole area we want to over map. That assumption is broken by
first remap_file_pages() call: it split the area into two vma.
The solution is to check next adjacent vmas, if they map the same file
with the same flags.
Fixes: c8d78c1823f4 ("mm: replace remap_file_pages() syscall with emulation")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12352d3cae2cebe18805a91fab34b534d7444231 upstream.
Sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if
anon_vma appeared between lock and unlock. We have to check anon_vma
first or call anon_vma_prepare() to be sure that it's here. There are
only few users of these legacy helpers. Let's get rid of them.
This patch fixes anon_vma lock imbalance in validate_mm(). Write lock
isn't required here, read lock is enough.
And reorders expand_downwards/expand_upwards: security_mmap_addr() and
wrapping-around check don't have to be under anon vma lock.
Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9nFg9_7DhRMLppBo2g@mail.gmail.com
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7162a1e87b3e380133dadc7909081bb70d0a7041 upstream.
Tetsuo Handa reported underflow of NR_MLOCK on munlock.
Testcase:
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#define BASE ((void *)0x400000000000)
#define SIZE (1UL << 21)
int main(int argc, char *argv[])
{
void *addr;
system("grep Mlocked /proc/meminfo");
addr = mmap(BASE, SIZE, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_PRIVATE | MAP_LOCKED | MAP_FIXED,
-1, 0);
if (addr == MAP_FAILED)
printf("mmap() failed\n"), exit(1);
munmap(addr, SIZE);
system("grep Mlocked /proc/meminfo");
return 0;
}
It happens on munlock_vma_page() due to unfortunate choice of nr_pages
data type:
__mod_zone_page_state(zone, NR_MLOCK, -nr_pages);
For unsigned int nr_pages, implicitly casted to long in
__mod_zone_page_state(), it becomes something around UINT_MAX.
munlock_vma_page() usually called for THP as small pages go though
pagevec.
Let's make nr_pages signed int.
Similar fixes in 6cdb18ad98a4 ("mm/vmstat: fix overflow in
mod_zone_page_state()") used `long' type, but `int' here is OK for a
count of the number of sub-pages in a huge page.
Fixes: ff6a6da60b89 ("mm: accelerate munlock() treatment of THP pages")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michel Lespinasse <walken@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e07ecd76d4db7bda1e9495395b2110a3fe28845a upstream.
When btt devices were re-worked to be child devices of regions this
routine was overlooked. It mistakenly attempts to_nd_namespace_pmem()
or to_nd_namespace_blk() conversions on btt and pfn devices. By luck to
date we have happened to be hitting valid memory leading to a uuid
miscompare, but a recent change to struct nd_namespace_common causes:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
IP: [<ffffffff814610dc>] memcmp+0xc/0x40
[..]
Call Trace:
[<ffffffffa0028631>] is_uuid_busy+0xc1/0x2a0 [libnvdimm]
[<ffffffffa0028570>] ? to_nd_blk_region+0x50/0x50 [libnvdimm]
[<ffffffff8158c9c0>] device_for_each_child+0x50/0x90
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d96b339f453997f2f08c52da3f41423be48c978f upstream.
I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on thps, along with a background process that
calls migratepages command repeatedly (doing ping-pong among different
NUMA nodes) for the first process:
Soft offlining page 0x60000 at 0x700000600000
__get_any_page: 0x60000 free buddy page
page:ffffea0001800000 count:0 mapcount:-127 mapping: (null) index:0x1
flags: 0x1fffc0000000000()
page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
------------[ cut here ]------------
kernel BUG at /src/linux-dev/include/linux/mm.h:342!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G O 4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
RIP: 0010:[<ffffffff8118998c>] [<ffffffff8118998c>] put_page+0x5c/0x60
RSP: 0018:ffff88007c213e00 EFLAGS: 00010246
Call Trace:
put_hwpoison_page+0x4e/0x80
soft_offline_page+0x501/0x520
SyS_madvise+0x6bc/0x6f0
entry_SYSCALL_64_fastpath+0x12/0x6a
Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
RIP [<ffffffff8118998c>] put_page+0x5c/0x60
RSP <ffff88007c213e00>
The root cause resides in get_any_page() which retries to get a refcount
of the page to be soft-offlined. This function calls
put_hwpoison_page(), expecting that the target page is putback to LRU
list. But it can be also freed to buddy. So the second check need to
care about such case.
Fixes: af8fae7c0886 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3caeaa562733c4836e61086ec07666635006a787 upstream.
While recording guest samples in host using perf kvm record, it will
populate unprocessable sample error, though samples will be recorded
properly. While generating report using perf kvm report, no samples will
be processed and same error will populate. We have seen this behaviour
with upstream perf(4.4-rc3) on x86 and ppc64 hardware.
Reason behind this failure is, when it tries to fetch machine from
rb_tree of machines, it fails. As a part of tracing a bug, we figured
out that this code was incorrectly refactored in commit 54245fdc3576
("perf session: Remove wrappers to machines__find").
This patch will change the functionality such that if it can't fetch
machine in first trial, it will create one node of machine and add that to
rb_tree. So next time when it tries to fetch same machine from rb_tree,
it won't fail. Actually it was the case before refactoring of code in
aforementioned commit.
This patch is generated from acme perf/core branch.
Below I've mention an example that demonstrate the behaviour before and
after applying patch.
Before applying patch:
[Note: One needs to run guest before recording data in host]
ravi@ravi-bangoria:~$ ./perf kvm record -a
Warning:
5903 unprocessable samples recorded.
Do you have a KVM guest running and not using 'perf kvm'?
[ perf record: Captured and wrote 1.409 MB perf.data.guest (285 samples) ]
ravi@ravi-bangoria:~$ ./perf kvm report --stdio
Warning:
5903 unprocessable samples recorded.
Do you have a KVM guest running and not using 'perf kvm'?
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 285 of event 'cycles'
# Event count (approx.): 88715406
#
# Overhead Command Shared Object Symbol
# ........ ....... ............. ......
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
After applying patch:
ravi@ravi-bangoria:~$ ./perf kvm record -a
[ perf record: Captured and wrote 1.188 MB perf.data.guest (17 samples) ]
ravi@ravi-bangoria:~$ ./perf kvm report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 17 of event 'cycles'
# Event count (approx.): 700746
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
34.19% :5758 [unknown] [g] 0xffffffff818682ab
22.79% :5758 [unknown] [g] 0xffffffff812dc7f8
22.79% :5758 [unknown] [g] 0xffffffff818650d0
14.83% :5758 [unknown] [g] 0xffffffff8161a1b6
2.49% :5758 [unknown] [g] 0xffffffff818692bf
0.48% :5758 [unknown] [g] 0xffffffff81869253
0.05% :5758 [unknown] [g] 0xffffffff81869250
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Fixes: 54245fdc3576 ("perf session: Remove wrappers to machines__find")
Link: http://lkml.kernel.org/r/1449471302-11283-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4d7f161feb3015d6306e1d35b565c888ff70c9d upstream.
The get and set operations got exchanged by mistake when moving the
code from book3s.c to powerpc.c.
Fixes: 3840edc8033ad5b86deee309c1c321ca54257452
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 760a7364f27d974d100118d88190e574626e18a6 upstream.
In the old DABR register, the BT (Breakpoint Translation) bit
is bit number 61. In the new DAWRX register, the WT (Watchpoint
Translation) bit is bit number 59. So to move the DABR-BT bit
into the position of the DAWRX-WT bit, it has to be shifted by
two, not only by one. This fixes hardware watchpoints in gdb of
older guests that only use the H_SET_DABR/X interface instead
of the new H_SET_MODE interface.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3aff6ccbb1d25e506b60ccd9c559013903f3464 upstream.
Commit 4b4b4512da2a ("arm/arm64: KVM: Rework the arch timer to use
level-triggered semantics") brought the virtual architected timer
closer to the VGIC. There is one occasion were we don't properly
check for the VGIC actually having been initialized before, but
instead go on to check the active state of some IRQ number.
If userland hasn't instantiated a virtual GIC, we end up with a
kernel NULL pointer dereference:
=========
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ffffffc9745c5000
[00000000] *pgd=00000009f631e003, *pud=00000009f631e003, *pmd=0000000000000000
Internal error: Oops: 96000006 [#2] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 2144 Comm: kvm_simplest-ar Tainted: G D 4.5.0-rc2+ #1300
Hardware name: ARM Juno development board (r1) (DT)
task: ffffffc976da8000 ti: ffffffc976e28000 task.ti: ffffffc976e28000
PC is at vgic_bitmap_get_irq_val+0x78/0x90
LR is at kvm_vgic_map_is_active+0xac/0xc8
pc : [<ffffffc0000b7e28>] lr : [<ffffffc0000b972c>] pstate: 20000145
....
=========
Fix this by bailing out early of kvm_timer_flush_hwstate() if we don't
have a VGIC at all.
Reported-by: Cosmin Gorgovan <cosmin@linux-geek.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 722ec35f7faefcc34d12616eca7976a848870f9d upstream.
This patch ensures that devices, which got registered before arch_initcall
will be handled correctly by IOMMU-based DMA-mapping code.
Fixes: 13b8629f6511 ("arm64: Add IOMMU dma_ops")
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4da597d16602d14405b71a18d45e1c59f28f0fd2 upstream.
We don't want to write to .text so let's move ppa_zero_params and
ppa_por_params to .data and access them via pointers.
Note that I have not been able to test as we I don't have a HS
omap4 to test with. The code has been changed in similar way as
for omap3 though.
Cc: Kees Cook <keescook@chromium.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Nishanth Menon <nm@ti.com>
Cc: Richard Woodruff <r-woodruff2@ti.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tero Kristo <t-kristo@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
non-executable")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5311d4d13df80bd71a9e47f9ecaf327f478fab1 upstream.
We don't want to write to .text and we can move save_secure_ram_context
into .data as it all gets copied into SRAM anyways.
Cc: Kees Cook <keescook@chromium.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Nishanth Menon <nm@ti.com>
Cc: Richard Woodruff <r-woodruff2@ti.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: Tero Kristo <t-kristo@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
non-executable")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eeaf9646aca89d097861caa24d9818434e48810e upstream.
We don't want to write to .text section. Let's move l2dis_3630
to .data and access it via a pointer.
For calculating the offset, let's optimize out the add and do it
in ldr/str as suggested by Nicolas Pitre <nicolas.pitre@linaro.org>.
Cc: Kees Cook <keescook@chromium.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Nishanth Menon <nm@ti.com>
Cc: Richard Woodruff <r-woodruff2@ti.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tero Kristo <t-kristo@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
non-executable")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a0b13275558c32bbf6241464a7244b1ffd5afb3 upstream.
We don't want to write to .text, so let's move l2_inv_api_params
to .data and access it via a pointer.
Cc: Kees Cook <keescook@chromium.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Nishanth Menon <nm@ti.com>
Cc: Richard Woodruff <r-woodruff2@ti.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tero Kristo <t-kristo@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
non-executable")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d9db59103305eb5ec2a86369f32063e9921b6ac5 upstream.
We don't want to be writing to .text so it can be set rodata.
Fix error "Unable to handle kernel paging request at virtual address
c012396c" in wait_dll_lock_timed if CONFIG_DEBUG_RODATA is selected.
As these counters are for debugging only and unused, we can just
remove them.
Cc: Kees Cook <keescook@chromium.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Nishanth Menon <nm@ti.com>
Cc: Richard Woodruff <r-woodruff2@ti.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tero Kristo <t-kristo@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
non-executable")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aae6b18f5c95b9dc78de66d1e27e8afeee2763b7 upstream.
On SAMA5D4EK board, the Ethernet doesn't work after resuming from the suspend
state.
Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
[nicolas.ferre@atmel.com: adapt to newer kernel]
Fixes: 38153a017896 ("ARM: at91/dt: sama5d4: add dts for sama5d4 xplained board")
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e873cc022ce5e2c04bbc53b5874494b657e29d3f upstream.
For phy0 KSZ8081, the type of GPIO IRQ should be "level low" instead of
"edge falling".
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Fixes: 38153a017896 ("ARM: at91/dt: sama5d4: add dts for sama5d4 xplained board")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f505dba762ae826bb68978a85ee5c8ced7dea8d7 upstream.
No interrupt were received from the phy because PIOE 1 may not be properly
muxed. It prevented proper link detection, especially since commit
321beec5047a ("net: phy: Use interrupts when available in NOLINK state")
disables polling.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af756bbccff85504ce05c63a50f80b9d7823c500 upstream.
The palmas PMIC has two control lines that need to be muxed properly
for things to work. The sys_nirq pin is used for interrupts, and msecure
pin is used for enabling writes to some PMIC registers.
Without these pins configured properly things can fail in mysterious
ways. For example, we can't update the RTC registers on palmas PMIC
unless the msecure pin is configured. And this is probably the reason
why we had RTC missing from the omap5 dts file.
According to "OMAP5430 ES2.0 Data Manual [Public] VErsion A (Rev. F)"
swps052f.pdf, mux mode 1 is for sys_drm_msecure so in theory there's
should be no need to configure it as a GPIO pin.
However, it seems there are some reliability issues using the msecure
mux mode. And the TI trees configure the msecure pin as GPIO out high
instead.
As the PMIC only cares that the msecure line is high to allow access
to the RTC registers, let's use a GPIO hog as suggested by Nishanth
Menon <nm@ti.com>. Also the use of the internal pull was considered
but supposedly that may not be capable of keeping the line high in
a noisy environment.
If we ever see high security omap5 products in the mainline tree,
those need to skip the msecure pin muxing and ignore setting the GPIO
hog. Chances are the related pin mux registers are locked in that case
and the msecure pin is managed by whatever software may be running in
the ARM TrustZone.
Who knows what the original intention of the msecure pin was. Maybe
it was supposed to prevent the system time to be set back for some
game demo modes to time out? Anyways, it seems that later PMICs like
tps659037 have recycled this pin for "powerhold" and devices like
beagle-x15 do not need changes to the msecure pin configuration.
To avoid further confusion with TWL variant PMICs, beagle-x15 does
not have a back-up battery for RTC palmas. Instead the mcp79410 RTC
is used with rtc-ds1307 driver. There is a "powerhold" jumper j5
holes near the palmas PMIC, and shorting it seems to power up
beagle-x15 automatically. It is unknown if it also has other side
effects to the beagle-x15 power up sequence.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ea24daae053a9ba65d2f3eb20523002c1a8af38 upstream.
The tcxo-clock-frequency binding is listed as optional,
but without it the wl12xx used on the torpedo + wireless
may hang. Scanning also appears broken without this patch.
Signed-off-by: Adam Ford <aford173@gmail.com>
Fixes: 687c27676151 ("ARM: dts: Add minimal support for LogicPD
Torpedo DM3730 devkit")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 418d5516568b3fdbc4e7b53677dd78aed8514565 upstream.
The DTSI file for the Nomadik does not properly specify how the
PL180 levelshifter is connected: the Nomadik actually needs all
the five st,sig-dir-* flags set to properly control all lines out.
Further this board supports full power cycling of the card, and
since this variant has no hardware clock gating, it needs a
ridiculously low frequency setting to keep up with the ever
overflowing FIFO.
The pin configuration set-up is a bit of a mystery, because of
course these pins are a mix of inputs and outputs. However the
reference implementation sets all pins to "output" with
unspecified initial value, so let's do that here as well.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5070fb14a0154f075c8b418e5bc58a620ae85a45 upstream.
When trying to set the ICST 307 clock to 25174000 Hz I ran into
this arithmetic error: the icst_hz_to_vco() correctly figure out
DIVIDE=2, RDW=100 and VDW=99 yielding a frequency of
25174000 Hz out of the VCO. (I replicated the icst_hz() function
in a spreadsheet to verify this.)
However, when I called icst_hz() on these VCO settings it would
instead return 4122709 Hz. This causes an error in the common
clock driver for ICST as the common clock framework will call
.round_rate() on the clock which will utilize icst_hz_to_vco()
followed by icst_hz() suggesting the erroneous frequency, and
then the clock gets set to this.
The error did not manifest in the old clock framework since
this high frequency was only used by the CLCD, which calls
clk_set_rate() without first calling clk_round_rate() and since
the old clock framework would not call clk_round_rate() before
setting the frequency, the correct values propagated into
the VCO.
After some experimenting I figured out that it was due to a simple
arithmetic overflow: the divisor for 24Mhz reference frequency
as reference becomes 24000000*2*(99+8)=0x132212400 and the "1"
in bit 32 overflows and is lost.
But introducing an explicit 64-by-32 bit do_div() and casting
the divisor into (u64) we get the right frequency back, and the
right frequency gets set.
Tested on the ARM Versatile.
Cc: linux-clk@vger.kernel.org
Cc: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e972c37459c813190461dabfeaac228e00aae259 upstream.
Since the dawn of time the ICST code has only supported divide
by one or hang in an eternal loop. Luckily we were always dividing
by one because the reference frequency for the systems using
the ICSTs is 24MHz and the [min,max] values for the PLL input
if [10,320] MHz for ICST307 and [6,200] for ICST525, so the loop
will always terminate immediately without assigning any divisor
for the reference frequency.
But for the code to make sense, let's insert the missing i++
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 57adec866c0440976c96a4b8f5b59fb411b1cacb upstream.
Calling apply_to_page_range with an empty range results in a BUG_ON
from the core code. This can be triggered by trying to load the st_drv
module with CONFIG_DEBUG_SET_MODULE_RONX enabled:
kernel BUG at mm/memory.c:1874!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 3 PID: 1764 Comm: insmod Not tainted 4.5.0-rc1+ #2
Hardware name: ARM Juno development board (r0) (DT)
task: ffffffc9763b8000 ti: ffffffc975af8000 task.ti: ffffffc975af8000
PC is at apply_to_page_range+0x2cc/0x2d0
LR is at change_memory_common+0x80/0x108
This patch fixes the issue by making change_memory_common (called by the
set_memory_* functions) a NOP when numpages == 0, therefore avoiding the
erroneous call to apply_to_page_range and bringing us into line with x86
and s390.
Reviewed-by: Laura Abbott <labbott@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Mika Penttilä <mika.penttila@nextfour.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 079ae0c121fd23287f4ad2be9e9f8a13f63cae73 upstream.
The Armada 388 GP Device Tree file describes two times a regulator
named 'reg_usb2_1_vbus', with the exact same description. This has
been wrong since Armada 388 GP support was introduced.
Fixes: 928413bd859c0 ("ARM: mvebu: Add Armada 388 General Purpose Development Board support")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ecad912a0073c768db1491c27ca55ad2d0ee68f upstream.
Quite often drivers set only "write" permission assuming that this
includes "read" permission as well and this works on plenty of
platforms. However IODA2 is strict about this and produces an EEH when
"read" permission is not set and reading happens.
This adds a workaround in the IODA code to always add the "read" bit
when the "write" bit is set.
Fixes: 10b35b2b7485 ("powerpc/powernv: Do not set "read" flag if direction==DMA_NONE")
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1bc74f1ccd457832dc515fc1febe6655985fdcd2 upstream.
When PCI bus is unplugged during full hotplug for EEH recovery,
the platform PE instance (struct pnv_ioda_pe) isn't released and
it dereferences the stale PCI bus that has been released. It leads
to kernel crash when referring to the stale PCI bus.
This fixes the issue by correcting the PE's primary bus when it's
oneline at plugging time, in pnv_pci_dma_bus_setup() which is to
be called by pcibios_fixup_bus().
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05ba75f848647135f063199dc0e9f40fee769724 upstream.
When PE is created, its primary bus is cached to pe->bus. At later
point, the cached primary bus is returned from eeh_pe_bus_get().
However, we could get stale cached primary bus and run into kernel
crash in one case: full hotplug as part of fenced PHB error recovery
releases all PCI busses under the PHB at unplugging time and recreate
them at plugging time. pe->bus is still dereferencing the PCI bus
that was released.
This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity
of pe->bus. pe->bus is updated when its first child EEH device is
online and the flag is set. Before unplugging in full hotplug for
error recovery, the flag is cleared.
Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE")
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7e56f627768da4e6480986b5145dc3422bc448a5 upstream.
In eeh_pe_loc_get(), the PE location code is retrieved from the
"ibm,loc-code" property of the device node for the bridge of the
PE's primary bus. It's not correct because the property indicates
the parent PE's location code.
This reads the correct PE location code from "ibm,io-base-loc-code"
or "ibm,slot-location-code" property of PE parent bus's device node.
Fixes: 357b2f3dd9b7 ("powerpc/eeh: Dump PE location code")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>