IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.
pte_alloc_one_kernel uses __get_order_pte but this is obviously always
zero because BITS_FOR_PTE is not larger than 9 yet the page size is
always larger than 4K. This means that this flag has never been
actually useful here because it has always been used only for
PAGE_ALLOC_COSTLY requests.
Link: http://lkml.kernel.org/r/1464599699-30131-7-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.
{pte,pmd,pud}_alloc_one{_kernel}, late_pgtable_alloc use PGALLOC_GFP for
__get_free_page (aka order-0).
pgd_alloc is slightly more complex because it allocates from pgd_cache
if PGD_SIZE != PAGE_SIZE and PGD_SIZE depends on the configuration
(CONFIG_ARM64_VA_BITS, PAGE_SHIFT and CONFIG_PGTABLE_LEVELS).
As per
config PGTABLE_LEVELS
int
default 2 if ARM64_16K_PAGES && ARM64_VA_BITS_36
default 2 if ARM64_64K_PAGES && ARM64_VA_BITS_42
default 3 if ARM64_64K_PAGES && ARM64_VA_BITS_48
default 3 if ARM64_4K_PAGES && ARM64_VA_BITS_39
default 3 if ARM64_16K_PAGES && ARM64_VA_BITS_47
default 4 if !ARM64_64K_PAGES && ARM64_VA_BITS_48
we should have the following options
CONFIG_ARM64_VA_BITS:48 CONFIG_PGTABLE_LEVELS:4 PAGE_SIZE:4k size:4096 pages:1
CONFIG_ARM64_VA_BITS:48 CONFIG_PGTABLE_LEVELS:4 PAGE_SIZE:16k size:16 pages:1
CONFIG_ARM64_VA_BITS:48 CONFIG_PGTABLE_LEVELS:3 PAGE_SIZE:64k size:512 pages:1
CONFIG_ARM64_VA_BITS:47 CONFIG_PGTABLE_LEVELS:3 PAGE_SIZE:16k size:16384 pages:1
CONFIG_ARM64_VA_BITS:42 CONFIG_PGTABLE_LEVELS:2 PAGE_SIZE:64k size:65536 pages:1
CONFIG_ARM64_VA_BITS:39 CONFIG_PGTABLE_LEVELS:3 PAGE_SIZE:4k size:4096 pages:1
CONFIG_ARM64_VA_BITS:36 CONFIG_PGTABLE_LEVELS:2 PAGE_SIZE:16k size:16384 pages:1
All of them fit into a single page (aka order-0). This means that this
flag has never been actually useful here because it has always been used
only for PAGE_ALLOC_COSTLY requests.
Link: http://lkml.kernel.org/r/1464599699-30131-6-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.
efi_alloc_page_tables uses __GFP_REPEAT but it allocates an order-0
page. This means that this flag has never been actually useful here
because it has always been used only for PAGE_ALLOC_COSTLY requests.
Link: http://lkml.kernel.org/r/1464599699-30131-4-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.
PGALLOC_GFP uses __GFP_REPEAT but none of the allocation which uses this
flag is for more than order-0. This means that this flag has never been
actually useful here because it has always been used only for
PAGE_ALLOC_COSTLY requests.
Link: http://lkml.kernel.org/r/1464599699-30131-3-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the third version of the patchset previously sent [1]. I have
basically only rebased it on top of 4.7-rc1 tree and dropped "dm: get
rid of superfluous gfp flags" which went through dm tree. I am sending
it now because it is tree wide and chances for conflicts are reduced
considerably when we want to target rc2. I plan to send the next step
and rename the flag and move to a better semantic later during this
release cycle so we will have a new semantic ready for 4.8 merge window
hopefully.
Motivation:
While working on something unrelated I've checked the current usage of
__GFP_REPEAT in the tree. It seems that a majority of the usage is and
always has been bogus because __GFP_REPEAT has always been about costly
high order allocations while we are using it for order-0 or very small
orders very often. It seems that a big pile of them is just a
copy&paste when a code has been adopted from one arch to another.
I think it makes some sense to get rid of them because they are just
making the semantic more unclear. Please note that GFP_REPEAT is
documented as
* __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
* _might_ fail. This depends upon the particular VM implementation.
while !costly requests have basically nofail semantic. So one could
reasonably expect that order-0 request with __GFP_REPEAT will not loop
for ever. This is not implemented right now though.
I would like to move on with __GFP_REPEAT and define a better semantic
for it.
$ git grep __GFP_REPEAT origin/master | wc -l
111
$ git grep __GFP_REPEAT | wc -l
36
So we are down to the third after this patch series. The remaining
places really seem to be relying on __GFP_REPEAT due to large allocation
requests. This still needs some double checking which I will do later
after all the simple ones are sorted out.
I am touching a lot of arch specific code here and I hope I got it right
but as a matter of fact I even didn't compile test for some archs as I
do not have cross compiler for them. Patches should be quite trivial to
review for stupid compile mistakes though. The tricky parts are usually
hidden by macro definitions and thats where I would appreciate help from
arch maintainers.
[1] http://lkml.kernel.org/r/1461849846-27209-1-git-send-email-mhocko@kernel.org
This patch (of 19):
__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations. Yet we
have the full kernel tree with its usage for apparently order-0
allocations. This is really confusing because __GFP_REPEAT is
explicitly documented to allow allocation failures which is a weaker
semantic than the current order-0 has (basically nofail).
Let's simply drop __GFP_REPEAT from those places. This would allow to
identify place which really need allocator to retry harder and formulate
a more specific semantic for what the flag is supposed to do actually.
Link: http://lkml.kernel.org/r/1464599699-30131-2-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: John Crispin <blogic@openwrt.org>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The INIT_TASK() initializer was similarly confused about the stack vs
thread_info allocation that the allocators had, and that were fixed in
commit b235beea9e99 ("Clarify naming of thread info/stack allocators").
The task ->stack pointer only incidentally ends up having the same value
as the thread_info, and in fact that will change.
So fix the initial task struct initializer to point to 'init_stack'
instead of 'init_thread_info', and make sure the ia64 definition for
that exists.
This actually makes the ia64 tsk->stack pointer be sensible for the
initial task, but not for any other task. As mentioned in commit
b235beea9e99, that whole pointer isn't actually used on ia64, since
task_stack_page() there just points to the (single) allocation.
All the other architectures seem to have copied the 'init_stack'
definition, even if it tended to be generally unusued.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As the actual pointer value is the same for the thread stack allocation
and the thread_info, code that confused the two worked fine, but will
break when the thread info is moved away from the stack allocation. It
also looks very confusing.
For example, the kprobe code wanted to know the current top of stack.
To do that, it used this:
(unsigned long)current_thread_info() + THREAD_SIZE
which did indeed give the correct value. But it's not only a fairly
nonsensical expression, it's also rather complex, especially since we
actually have this:
static inline unsigned long current_top_of_stack(void)
which not only gives us the value we are interested in, but happens to
be how "current_thread_info()" is currently defined as:
(struct thread_info *)(current_top_of_stack() - THREAD_SIZE);
so using current_thread_info() to figure out the top of the stack really
is a very round-about thing to do.
The other cases are just simpler confusion about task_thread_info() vs
task_stack_page(), which currently return the same pointer - but if you
want the stack page, you really should be using the latter one.
And there was one entirely unused assignment of the current stack to a
thread_info pointer.
All cleaned up to make more sense today, and make it easier to move the
thread_info away from the stack in the future.
No semantic changes.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We've had the thread info allocated together with the thread stack for
most architectures for a long time (since the thread_info was split off
from the task struct), but that is about to change.
But the patches that move the thread info to be off-stack (and a part of
the task struct instead) made it clear how confused the allocator and
freeing functions are.
Because the common case was that we share an allocation with the thread
stack and the thread_info, the two pointers were identical. That
identity then meant that we would have things like
ti = alloc_thread_info_node(tsk, node);
...
tsk->stack = ti;
which certainly _worked_ (since stack and thread_info have the same
value), but is rather confusing: why are we assigning a thread_info to
the stack? And if we move the thread_info away, the "confusing" code
just gets to be entirely bogus.
So remove all this confusion, and make it clear that we are doing the
stack allocation by renaming and clarifying the function names to be
about the stack. The fact that the thread_info then shares the
allocation is an implementation detail, and not really about the
allocation itself.
This is a pure renaming and type fix: we pass in the same pointer, it's
just that we clarify what the pointer means.
The ia64 code that actually only has one single allocation (for all of
task_struct, thread_info and kernel thread stack) now looks a bit odd,
but since "tsk->stack" is actually not even used there, that oddity
doesn't matter. It would be a separate thing to clean that up, I
intentionally left the ia64 changes as a pure brute-force renaming and
type change.
Acked-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
None of the code actually wants a thread_info, it all wants a
task_struct, and it's just converting to a thread_info pointer much too
early.
No semantic change.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When page tables entries are set using xen_set_pte_init() during early
boot there is no page fault handler that could handle a fault when
performing an M2P lookup.
In 64 bit guests (usually dom0) early_ioremap() would fault in
xen_set_pte_init() because an M2P lookup faults because the MFN is in
MMIO space and not mapped in the M2P. This lookup is done to see if
the PFN in in the range used for the initial page table pages, so that
the PTE may be set as read-only.
The M2P lookup can be avoided by moving the check (and clear of RW)
earlier when the PFN is still available.
Reported-by: Kevin Moraga <kmoragas@riseup.net>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
xen_cleanhighmap() is operating on level2_kernel_pgt only. The upper
bound of the loop setting non-kernel-image entries to zero should not
exceed the size of level2_kernel_pgt.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Classic BPF JIT was never ported completely to work on little endian
powerpc. However, it can be enabled and will crash the system when used.
As such, disable use of BPF JIT on ppc64le.
Fixes: 7c105b63bd98 ("powerpc: Add CONFIG_CPU_LITTLE_ENDIAN kernel config option.")
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
As part of the Radix MMU support we added some feature sections in the
SLB miss handler. These are intended to catch the case that we
incorrectly take an SLB miss when Radix is enabled, and instead of
crashing weirdly they bail out to a well defined exit path and trigger
an oops.
However the way they were written meant the bailout case was enabled by
default until we did CPU feature patching.
On powermacs the early debug prints in setup_system() can cause an SLB
miss, which happens before code patching, and so the SLB miss handler
would incorrectly bailout and crash during boot.
Fix it by inverting the sense of the feature section, so that the code
which is in place at boot is correct for the hash case. Once we
determine we are using Radix - which will never happen on a powermac -
only then do we patch in the bailout case which unconditionally jumps.
Fixes: caca285e5ab4 ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code")
Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
Tested-by: Denis Kirjanov <kda@linux-powerpc.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Hibernate relies on cpu hotplug to prevent secondary cores executing
the kernel text while it is being restored.
Add a call to cpus_are_stuck_in_kernel() to determine if there are
CPUs not counted by 'num_online_cpus()', and prevent hibernate in this
case.
Fixes: 82869ac57b5 ("arm64: kernel: Add support for hibernate/suspend-to-disk")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
kernel/smp.c has a fancy counter that keeps track of the number of CPUs
it marked as not-present and left in cpu_park_loop(). If there are any
CPUs spinning in here, features like kexec or hibernate may release them
by overwriting this memory.
This problem also occurs on machines using spin-tables to release
secondary cores.
After commit 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N")
we bring all known cpus into the secondary holding pen, meaning this
memory can't be re-used by kexec or hibernate.
Add a function cpus_are_stuck_in_kernel() to determine if either of these
cases have occurred.
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
__sync_icache_dcache unconditionally skips the cache maintenance for
anonymous pages, under the assumption that flushing is only required in
the presence of D-side aliases [see 7249b79f6b4cc ("arm64: Do not flush
the D-cache for anonymous pages")].
Unfortunately, this breaks migration of anonymous pages holding
self-modifying code, where userspace cannot be reasonably expected to
reissue maintenance instructions in response to a migration.
This patch fixes the problem by removing the broken page_mapping(page)
check from the cache syncing code, otherwise we may end up fetching and
executing stale instructions from the PoU.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
I fixed boot image dependencies for arch/arm in commit 3939f3345050
("ARM: 8418/1: add boot image dependencies to not generate invalid
images").
I see a similar problem for arch/arm64; "make -jN Image Image.gz"
would sometimes end up generating bad images where N > 1.
Fix the dependency in arch/arm64/Makefile to avoid the race
between "make Image" and "make Image.*".
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
During a rollover, we mark the active ASID on each CPU as reserved, before
allocating a new ID for the task that caused the rollover. This means that
with N CPUs, we can only guarantee the new task to obtain a valid ASID if
we have at least N+1 ASIDs. Update this limit in the initcall check.
Note that this restriction was introduced by commit 8e648066 on the
arch/arm side, which disallow re-using the previously active ASID on the
local CPU, as it would introduce a TLB race.
In addition, we only dispose of NUM_USER_ASIDS-1, since ASID 0 is
reserved. Add this restriction as well.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Pull s390 fixes from Martin Schwidefsky:
"Two more bugs fixes for 4.7:
- a KVM regression introduced with the pgtable.c code split
- a perf issue with two hardware PMUs using a shared event context"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cpum_cf: use perf software context for hardware counters
KVM: s390/mm: Fix CMMA reset during reboot
Another batch of fixes for ARM SoC platforms. Most are smaller fixes,
Two areas that are worth pointing out are:
* OMAP had a handful of changes to voltage specs that caused a bit of churn,
most of volume of change in this branch is due to this.
* There are a couple of _rcuidle fixes from Paul that touch common code and
came in through the OMAP tree since they were the ones who saw the problems.
The rest is smaller changes across a handful of platforms.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXZjf6AAoJEIwa5zzehBx3ExoP/2eGTZyGUt9DFutZs2OhZRgh
tI3zBgfPaAEmt+rnvVE3NhDbfjPsV2cANDxE/MaZsFaKkgkHNBbmZpJ3Y7OlgB+k
3kl4y87Ez1NEJrzQKzqVICzCD3IKA3cxUwUELIp3C7LhnKO1UXmRXp8UXee1Yc1E
gL23Z2FncrDLOdvVfp/dTj1scB1XQrt3kePSu7sIuyDuGiPLRvO8fNjvIfOQaGDt
Y27Yk1GrNpvqiOAkziOzUmSGZ6ZZ+wUdUKc/+QcxSnqxrSldtaQDsmmL4z6DQ1xj
j0jagfVGXNLrCUj0zyWwwPG7pZ37BDJ1mj7AMiX9N7LDQFHR9owEVNf2zd1ar37k
64Vlz+38m8lXNHM2/gL6gqFZIm0Kjt9C/wrvyealsuflGmx4xMSTRr2yvde5URBO
diFzee3y2NPhvaRaEd1/yFJ/c0D5bVS8M1lced6GXn/l8SrWjg1SrYZley2PGjQH
esEr7odTR7Um1UIXalpL1yBxoOVfGJl3bPYe8/veniFSi4DV+yeCOCc6pN0Km4WQ
huUlzJkIXgtgAUt5gvWAw7sC+qzYPL+qOMAJsfb/vANoGg2QMrt+u9RAWnMKidpo
GcjKNvAhAmAfwUgVHeLxO714MjIHWhKEVGkGsiDwoLCisn7HTLmaYk8qmOTlExcC
g/nj7vtaXFfMDcD3uco0
=C0Az
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Another batch of fixes for ARM SoC platforms. Most are smaller fixes.
Two areas that are worth pointing out are:
- OMAP had a handful of changes to voltage specs that caused a bit of
churn, most of volume of change in this branch is due to this.
- There are a couple of _rcuidle fixes from Paul that touch common
code and came in through the OMAP tree since they were the ones who
saw the problems.
The rest is smaller changes across a handful of platforms"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (36 commits)
ARM: dts: STi: stih407-family: Disable reserved-memory co-processor nodes
ARM: dts: am437x-sk-evm: Reduce i2c0 bus speed for tps65218
ARM: OMAP2+: timer: add probe for clocksources
ARM: OMAP1: fix ams-delta FIQ handler to work with sparse IRQ
memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing
arm: Use _rcuidle for smp_cross_call() tracepoints
MAINTAINERS: Add myself as reviewer of ARM FSL/NXP
ARM: OMAP: DRA7: powerdomain data: Remove unused pwrsts_mem_ret
ARM: OMAP: DRA7: powerdomain data: Remove unused pwrsts_logic_ret
ARM: OMAP: DRA7: powerdomain data: Set L3init and L4per to ON
ARM: imx6ul: Fix Micrel PHY mask
ARM: OMAP2+: Select OMAP_INTERCONNECT for SOC_AM43XX
ARM: dts: DRA74x: fix DSS PLL2 addresses
ARM: OMAP2: Enable Errata 430973 for OMAP3
ARM: dts: socfpga: Add missing PHY phandle
ARM: dts: exynos: Fix port nodes names for Exynos5420 Peach Pit board
ARM: dts: exynos: Fix port nodes names for Exynos5250 Snow board
ARM: dts: sun6i: yones-toptech-bs1078-v2: Drop constraints on dc1sw regulator
ARM: dts: sun6i: primo81: Drop constraints on dc1sw regulator
ARM: dts: sunxi: Add OLinuXino Lime2 eMMC to the Makefile
...
- Fix dra7 for hardware issues limiting L4Per and L3init power domains
to on state. Without this the devices may not work correctly after
some time of use because of asymmetric aging. And related to this,
let's also remove the unusable states.
- Always select omap interconnect for am43x as otherwise the am43x
only configurations will not boot properly. This can happen easily
for any product kernels that leave out other SoCs to save memory.
- Fix DSS PLL2 addresses that have gone unused for now
- Select erratum 430973 for omap3, this is now safe to do and can
save quite a bit of debugging time for people who may have left
it out.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXXm6cAAoJEBvUPslcq6VzgiAP/3j+Zvaks93gLf0Hc1sQ5ow+
UxQ3Gb3/gVGEh1OSb/c4MI800UBK1B0f6CqLK7zDFAuDHyUwmqJ27nrARfazoMPD
DxRYZoEs897peB0/SWwDtHR+yje5UmIB0P31kRJ+t5nYwXBKvmvkWPFrOISxgI1Z
yLc62tFoVy37IYfeH6pRNwMyJz9scl4qXjiBCHTmBQvgo4I3IPpvhFAWN5YMBZlz
VwXDtmR9B/WlcRYel+RplYQrQrXVvaaT01wTPfejKHI9dyNQmbJQDWFMuuvdQKjE
O7yjcgR6DdWjdDwCmIHLuc2FyrwW+wt1AY/5UXKGroxfW6Ct3JKuhUPPxsHfRMKX
2NnQtgUcUxiIAGrsEZFadUAIEedd+DBVK+aztn1reaaHR1R/pBnMcEnch9WRuAOQ
srOaaL2Had/NG+QRE0psgck9ayzpDHw+LMd18BckCN+1mIiFBnXYZrCVUPrutgPP
5RbDWIeSVeAwbdaxPRqkXOcMGZ1MDRGoS+UBlTms+gWuSLjFj6sye0+Dao+68Ehz
im/xhh0YCHgN0TvuvTla5BgLgOunlCNtXMNWyg801lDIvzOj/ngEQlasng9uzuri
mfGf5w8ctub0Ileq4eU+rYb+bRDiDagjmiVBjbRmLWtgOUcsrNr+FX9sp5NDELmS
oEa/VB8jNmI3aSv6OrNi
=mQyc
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v4.7/fixes-powedomain' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omaps for v4.7-rc cycle:
- Fix dra7 for hardware issues limiting L4Per and L3init power domains
to on state. Without this the devices may not work correctly after
some time of use because of asymmetric aging. And related to this,
let's also remove the unusable states.
- Always select omap interconnect for am43x as otherwise the am43x
only configurations will not boot properly. This can happen easily
for any product kernels that leave out other SoCs to save memory.
- Fix DSS PLL2 addresses that have gone unused for now
- Select erratum 430973 for omap3, this is now safe to do and can
save quite a bit of debugging time for people who may have left
it out.
* tag 'omap-for-v4.7/fixes-powedomain' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP: DRA7: powerdomain data: Remove unused pwrsts_mem_ret
ARM: OMAP: DRA7: powerdomain data: Remove unused pwrsts_logic_ret
ARM: OMAP: DRA7: powerdomain data: Set L3init and L4per to ON
ARM: OMAP2+: Select OMAP_INTERCONNECT for SOC_AM43XX
ARM: dts: DRA74x: fix DSS PLL2 addresses
ARM: OMAP2: Enable Errata 430973 for OMAP3
+ Linux 4.7-rc2
Signed-off-by: Olof Johansson <olof@lixom.net>
- Two boot warning fixes from the RCU tree that should have gotten
merged several weeks ago already but did not because of issues
with who merges them. Paul has now split the RCU warning fixes into
sets for various maintainers.
- Fix ams-delta FIQ regression caused by omap1 sparse IRQ changes
- Fix PM for omap3 boards using timer12 and gptimer, like the
original beagleboard
- Fix hangs on am437x-sk-evm by lowering the I2C bus speed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXY8wuAAoJEBvUPslcq6VzGBQQAJ6OIH0Gws19Wyi8IqnjMLJN
npu+JXU0xP5bBZ+HbCVjyN8k32drhXdwDMQ+u1DvBYwUuyLIIRZPZF4aHb8fDfOC
v1VqUzQRzj1FCh9MlkdqTedA180WCo5PCGlFOon0BmaZlv9WevEaTOYrEgyZPrmk
quBnaE+baZfGxWBbDSN+OrGYobQRs7Eu8bel0gh628CDiajrbwlIyAcNdEn5C/Uu
GHiEuIQcxb4b62mwAwh/t7el9ureirsS1b6mFB41puPmF/lYawI6YaCWIL48lbMd
XsgKGnFDU6dgSO5QRx5PhP/7YP9FetS0U+g7iFhgjmArNCraNQYBY0ltMweOG0qe
M8BPxDuCnhm1Q+PcjBORteN/PF47kcnBMpiJVVTmq5JRlXUqXecKpoScCt9HfPgy
EJq+esLQNIuRw9cEwVPQBj80GyxFUVqL/Rzo7xjEwTDPYRQERGCB7V68iV25on3w
07dOVl/lSwe902ik580wnlGUq+J09wk+9hIKV2XwQAV/8mKaWMc3pA8rE/efLJoC
buAsccxVcEsR3+uLSsU/P+fFm8nfBRmiOO9yIR4gez0BhbiDMc1zpwwhLkI+vTT4
D3PnuUdVeBvoGTNnpwqSURxajhaK0DSKTwhWnWGubYfXd3B48sW76rvKLO1FThgL
qyaed06QFeWj8gV+VZLb
=P0Vi
-----END PGP SIGNATURE-----
Merge tag 'fixes-rcu-fiq-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omaps for v4.7-rc cycle:
- Two boot warning fixes from the RCU tree that should have gotten
merged several weeks ago already but did not because of issues
with who merges them. Paul has now split the RCU warning fixes into
sets for various maintainers.
- Fix ams-delta FIQ regression caused by omap1 sparse IRQ changes
- Fix PM for omap3 boards using timer12 and gptimer, like the
original beagleboard
- Fix hangs on am437x-sk-evm by lowering the I2C bus speed
* tag 'fixes-rcu-fiq-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am437x-sk-evm: Reduce i2c0 bus speed for tps65218
ARM: OMAP2+: timer: add probe for clocksources
ARM: OMAP1: fix ams-delta FIQ handler to work with sparse IRQ
arm: Use _rcuidle for smp_cross_call() tracepoints
arm: Use _rcuidle tracepoint to allow use from idle
Signed-off-by: Olof Johansson <olof@lixom.net>
This patch fixes a non-booting issue in Mainline.
When booting with a compressed kernel, we need to be careful how we
populate memory close to DDR start. AUTO_ZRELADDR is enabled by default
in multi-arch enabled configurations, which place some restrictions on
where the kernel is placed and where it will be uncompressed to on boot.
AUTO_ZRELADDR takes the decompressor code's start address and masks out
the bottom 28 bits to obtain an address to uncompress the kernel to
(thus a load address of 0x42000000 means that the kernel will be
uncompressed to 0x40000000 i.e. DDR START on this platform).
Even changing the load address to after the co-processor's shared memory
won't render a booting platform, since the AUTO_ZRELADDR algorithm still
ensures the kernel is uncompressed into memory shared with the first
co-processor (0x40000000).
Another option would be to move loading to 0x4A000000, since this will
mean the decompressor will decompress the kernel to 0x48000000. However,
this would mean a large chunk (0x44000000 => 0x48000000 (64MB)) of
memory would essentially be wasted for no good reason.
Until we can work with ST to find a suitable memory location to
relocate co-processor shared memory, let's disable the shared memory
nodes. This will ensure a working platform in the mean time.
NB: The more observant of you will notice that we're leaving the DMU
shared memory node enabled; this is because a) it is the only one in
active use at the time of this writing and b) it is not affected by
the current default behaviour which is causing issues.
Fixes: fe135c6 (ARM: dts: STiH407: Move over to using the 'reserved-memory' API for obtaining DMA memory)
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
- Correct Micrel PHY mask to fix the issue that i.MX6UL ethernet works
in U-Boot but not in kernel.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXYg70AAoJEFBXWFqHsHzO/dcH/jrXxr9zJgXclSVjLx0a+HJB
27v2KtrTBKjeLBOuv1WE5J49SA4PaQ8og7MeAUGIVeuXjGk7v4Ba/+3CIauXQ7fF
oEBr6e5WbzHeoSXpwgPkuMAA0vQ09kIKy2uUOecdIVrCCfS+N1koPj/jjzSTWNdk
7OOTI5itm2T4KeuFTm5RqN0JwhsAgVECuA3JZWvmTCXScC1bjFKhi9wRcBIjt5zf
jOVYJ3kCM7OqxCp82EM6SBY0q2MNkUlV50PjKVYeSpeiSJw3nvg+yBlvwCMTssbn
EdH8cYI/Y+snbcvfsEJMzEpoPgBAujKxQaT3aDVF0mSkH0UykfoPjUmTFrOpjSY=
=fD61
-----END PGP SIGNATURE-----
Merge tag 'imx-fixes-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
The i.MX fixes for 4.7:
- Correct Micrel PHY mask to fix the issue that i.MX6UL ethernet works
in U-Boot but not in kernel.
* tag 'imx-fixes-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: imx6ul: Fix Micrel PHY mask
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull ARM fixes from Russell King:
"A couple of fixes for pmd_mknotpresent()/pmd_present() for LPAE
systems"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8579/1: mm: Fix definition of pmd_mknotpresent
ARM: 8578/1: mm: ensure pmd_present only checks the valid bit
Here are a small number of debugfs, ISA, and one driver core fix for 4.7-rc4.
All of these resolve reported issues. The ISA ones have spent the least
amount of time in linux-next, sorry about that, I didn't realize they
were regressions that needed to get in now (thanks to Thorsten for the
prodding!) but they do all pass the 0-day bot tests. The others have
been in linux-next for a while now.
Full details about them are in the shortlog below.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAldlbeQACgkQMUfUDdst+ymnFACfaWhEKA/84jwNNHiim92diJrY
zYsAoLOmpBw68yL6qTSZbcWJF4Flb6Xk
=N8M2
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are a small number of debugfs, ISA, and one driver core fix for
4.7-rc4.
All of these resolve reported issues. The ISA ones have spent the
least amount of time in linux-next, sorry about that, I didn't realize
they were regressions that needed to get in now (thanks to Thorsten
for the prodding!) but they do all pass the 0-day bot tests. The
others have been in linux-next for a while now.
Full details about them are in the shortlog below"
* tag 'driver-core-4.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
isa: Dummy isa_register_driver should return error code
isa: Call isa_bus_init before dependent ISA bus drivers register
watchdog: ebc-c384_wdt: Allow build for X86_64
iio: stx104: Allow build for X86_64
gpio: Allow PC/104 devices on X86_64
isa: Allow ISA-style drivers on modern systems
base: make module_create_drivers_dir race-free
debugfs: open_proxy_open(): avoid double fops release
debugfs: full_proxy_open(): free proxy on ->open() failure
kernel/kcov: unproxify debugfs file's fops
Several modern devices, such as PC/104 cards, are expected to run on
modern systems via an ISA bus interface. Since ISA is a legacy interface
for most modern architectures, ISA support should remain disabled in
general. Support for ISA-style drivers should be enabled on a per driver
basis.
To allow ISA-style drivers on modern systems, this patch introduces the
ISA_BUS_API and ISA_BUS Kconfig options. The ISA bus driver will now
build conditionally on the ISA_BUS_API Kconfig option, which defaults to
the legacy ISA Kconfig option. The ISA_BUS Kconfig option allows the
ISA_BUS_API Kconfig option to be selected on architectures which do not
enable ISA (e.g. X86_64).
The ISA_BUS Kconfig option is currently only implemented for X86
architectures. Other architectures may have their own ISA_BUS Kconfig
options added as required.
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Plug the ongoing spin_unlock_wait/spin_is_locked mess
- KGDB protocol fix to sync w/ GDB
- Fix MIDR-based PMU probing for old 32-bit SMP systems (OMAP4/Realview)
- Minor tweaks to the fault handling path
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJXY97DAAoJELescNyEwWM08fYH/3QSj8P11PPIoyu1qEs+9kj1
r7lQGLMZoI/VS7OGPmgwpbQRHk227h0Rz4UnQ4/fTharVs5Pv9iPxnFKjCRCVVGK
VuI815tWC+0mKlJ4ztn8cJk7+GgWiZttAn5qWafuXQUTfqm+ywZOxfiU0+aC9Ytn
k5m3G46cyAvrBcNwLFcrwavzdyVjy/O3nEQVWmzT7uXmPknr52BmWjk3VuCMPdYd
tEjoRK3Nt2xMV3E4A6oIINBz1KmmqGVW3Nzx2prD0d0oGwyT+3FDlHbm+b/fn7aK
ti18FPUDiNSE67enYQ1+3wEyn1Ge/0Z/lLnKJxWXv4+qk3zwsCkp4h/SkOqv3fM=
=0YtF
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The main things are getting kgdb up and running with upstream GDB
after a protocol change was reverted and fixing our spin_unlock_wait
and spin_is_locked implementations after doing some similar work with
PeterZ on the qspinlock code last week. Whilst we haven't seen any
failures in practice, it's still worth getting this fixed.
Summary:
- Plug the ongoing spin_unlock_wait/spin_is_locked mess
- KGDB protocol fix to sync w/ GDB
- Fix MIDR-based PMU probing for old 32-bit SMP systems
(OMAP4/Realview)
- Minor tweaks to the fault handling path"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kgdb: Match pstate size with gdbserver protocol
arm64: spinlock: Ensure forward-progress in spin_unlock_wait
arm64: spinlock: fix spin_unlock_wait for LSE atomics
arm64: spinlock: order spin_{is_locked,unlock_wait} against local locks
arm: pmu: Fix non-devicetree probing
arm64: mm: mark fault_info table const
arm64: fix dump_instr when PAN and UAO are in use
Based on the latest timing specifications for the TPS65218 from the data
sheet, http://www.ti.com/lit/ds/symlink/tps65218.pdf, document SLDS206
from November 2014, we must change the i2c bus speed to better fit within
the minimum high SCL time required for proper i2c transfer.
When running at 400khz, measurements show that SCL spends
0.8125 uS/1.666 uS high/low which violates the requirement for minimum
high period of SCL provided in datasheet Table 7.6 which is 1 uS.
Switching to 100khz gives us 5 uS/5 uS high/low which both fall above
the minimum given values for 100 khz, 4.0 uS/4.7 uS high/low.
Without this patch occasionally a voltage set operation from the kernel
will appear to have worked but the actual voltage reflected on the PMIC
will not have updated, causing problems especially with cpufreq that may
update to a higher OPP without actually raising the voltage on DCDC2,
leading to a hang.
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com>
Signed-off-by: Aparna Balasubramanian <aparnab@ti.com>
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The PE primary bus cannot be got from its child devices when having
full hotplug in error recovery. The PE primary bus is cached, which
is done in commit <05ba75f84864> ("powerpc/eeh: Fix stale cached primary
bus"). In eeh_reset_device(), the flag (EEH_PE_PRI_BUS) is cleared
before the PCI hot remove. eeh_pe_bus_get() then returns NULL as the
PE primary bus in pnv_eeh_reset() and it crashes the kernel eventually.
This fixes the issue by clearing the flag (EEH_PE_PRI_BUS) before the
PCI hot add. With it, the PowerNV EEH reset backend (pnv_eeh_reset())
can get valid PE primary bus through eeh_pe_bus_get().
Fixes: 67086e32b564 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE")
Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
ISA 3.0 updated it to be encoded as Radix tree size = 2^(RTS + 31). We
have it encoded as 2^(RTS + 28). Add a helper with the correct encoding
and use it instead of opencoding.
Fixes: 2bfd65e45e87 ("powerpc/mm/radix: Add radix callbacks for early init routines")
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
H_ENTER hcall handling in qemu had assumptions that a cache inhibited
hpte entry won't have memory conference set. Also older kernel
mentioned that some version of pHyp required this (the code removed
by the below commit says:
/* Make pHyp happy */
if ((rflags & _PAGE_NO_CACHE) && !(rflags & _PAGE_WRITETHRU))
hpte_r &= ~HPTE_R_M;
But with older kernel we had some inconsistent memory conherence
mapping. We always enabled memory conherence in the page fault path and
removed memory conherence is _PAGE_NO_CACHE was set when we mapped the
page via htab_bolt_mapping. The commit mentioned below tried to
consolidate that by always enabling memory conherence. But as mentioned
above that breaks Qemu H_ENTER handling.
This patch update this such that we enable memory conherence only if
cache inhibited is not set and bring fault handling, lpar and bolt
mapping in sync.
Fixes: commit 30bda41aba4e("powerpc/mm: Drop WIMG in favour of new constant")
Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
A few platforms are currently missing clocksource_probe() completely
in their time_init functionality. On OMAP3430 for example, this is
causing cpuidle to be pretty much dead, as the counter32k is not
going to be registered and instead a gptimer is used as a clocksource.
This will tick in periodic mode, preventing any deeper idle states.
While here, also drop one unnecessary check for populated DT before
existing clocksource_probe() call.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
After OMAP1 IRQ definitions have been changed by commit 685e2d08c54b
("ARM: OMAP1: Change interrupt numbering for sparse IRQ") introduced
in v4.2, ams-delta FIQ handler which depends on them no longer works
as expected. Fix it.
Created and tested on Amstrad Delta against Linux-4.7-rc3
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
- One new kvm_stat for s390
- Correctly disable VT-d posted interrupts with the rest of posted interrupts
- "make randconfig" fix for x86 AMD
- Off-by-one in irq route check (the "good" kind that errors out a bit too
early!)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJXYrXKAAoJEL/70l94x66D1MUH/i9kPqfDq+XveHyiY4ovI2Vl
lD1P0dJoXPRjrJJ/LRulr3TiGDVsW6QZ8SnA5QNQvxDdlc7CzS8ZgqaiLPUh8TKJ
OofVUaFgm77MDvGJuJOOJ159ghO+7KwPsq1P05xpO2HRxAD+q1/u1yjfOz7fIEqC
iMne68rfv0OeiMlBOo8G2e1Xmtk1GKNBhmRItUgOF/jVtP2RSvV5o+2rcQ5LS3g6
KV/fpWtRumd3R+TdRvacjADgvWrSokDfph+Ha9qp7sBjkVGLLZ/hdHzTzIimXKF6
x4muv1HYzKSGaCJB2yMLYuy/KJ8zbsk7co0bjn1SmzrSweJxMkDGwLp1Ffau6iM=
=N4kr
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
- miscellaneous fixes for MIPS and s390
- one new kvm_stat for s390
- correctly disable VT-d posted interrupts with the rest of posted
interrupts
- "make randconfig" fix for x86 AMD
- off-by-one in irq route check (the "good" kind that errors out a bit
too early!)
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: vmx: check apicv is active before using VT-d posted interrupt
kvm: Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES
kvm: svm: Do not support AVIC if not CONFIG_X86_LOCAL_APIC
kvm: svm: Fix implicit declaration for __default_cpu_present_to_apicid()
MIPS: KVM: Fix CACHE triggered exception emulation
MIPS: KVM: Don't unwind PC when emulating CACHE
MIPS: KVM: Include bit 31 in segment matches
MIPS: KVM: Fix modular KVM under QEMU
KVM: s390: Add stats for PEI events
KVM: s390: ignore IBC if zero
Current versions of gdb do not interoperate cleanly with kgdb on arm64
systems because gdb and kgdb do not use the same register description.
This patch modifies kgdb to work with recent releases of gdb (>= 7.8.1).
Compatibility with gdb (after the patch is applied) is as follows:
gdb-7.6 and earlier Ok
gdb-7.7 series Works if user provides custom target description
gdb-7.8(.0) Works if user provides custom target description
gdb-7.8.1 and later Ok
When commit 44679a4f142b ("arm64: KGDB: Add step debugging support") was
introduced it was paired with a gdb patch that made an incompatible
change to the gdbserver protocol. This patch was eventually merged into
the gdb sources:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=a4d9ba85ec5597a6a556afe26b712e878374b9dd
The change to the protocol was mostly made to simplify big-endian support
inside the kernel gdb stub. Unfortunately the gdb project released
gdb-7.7.x and gdb-7.8.0 before the protocol incompatibility was identified
and reversed:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bdc144174bcb11e808b4e73089b850cf9620a7ee
This leaves us in a position where kgdb still uses the no-longer-used
protocol; gdb-7.8.1, which restored the original behaviour, was
released on 2014-10-29.
I don't believe it is possible to detect/correct the protocol
incompatiblity which means the kernel must take a view about which
version of the gdb remote protocol is "correct". This patch takes the
view that the original/current version of the protocol is correct
and that version found in gdb-7.7.x and gdb-7.8.0 is anomalous.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
VT-d posted interrupt is relying on the CPU side's posted interrupt.
Need to check whether VCPU's APICv is active before enabing VT-d
posted interrupt.
Fixes: d62caabb41f33d96333f9ef15e09cd26e1c12760
Cc: stable@vger.kernel.org
Signed-off-by: Yang Zhang <yang.zhang.wz@gmail.com>
Signed-off-by: Shengge Ding <shengge.dsg@alibaba-inc.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The commit 8221c1370056 ("svm: Manage vcpu load/unload when enable AVIC")
introduces a build error due to implicit function declaration
when #ifdef CONFIG_X86_32 and #ifndef CONFIG_X86_LOCAL_APIC
(as reported by Kbuild test robot i386-randconfig-x0-06121009).
So, this patch introduces kvm_cpu_get_apicid() wrapper
around __default_cpu_present_to_apicid() with additional
handling if CONFIG_X86_LOCAL_APIC is not defined.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: commit 8221c1370056 ("svm: Manage vcpu load/unload when enable AVIC")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rather than wait until we observe the lock being free (which might never
happen), we can also return from spin_unlock_wait if we observe that the
lock is now held by somebody else, which implies that it was unlocked
but we just missed seeing it in that state.
Furthermore, in such a scenario there is no longer a need to write back
the value that we loaded, since we know that there has been a lock
hand-off, which is sufficient to publish any stores prior to the
unlock_wait because the ARm architecture ensures that a Store-Release
instruction is multi-copy atomic when observed by a Load-Acquire
instruction.
The litmus test is something like:
AArch64
{
0:X1=x; 0:X3=y;
1:X1=y;
2:X1=y; 2:X3=x;
}
P0 | P1 | P2 ;
MOV W0,#1 | MOV W0,#1 | LDAR W0,[X1] ;
STR W0,[X1] | STLR W0,[X1] | LDR W2,[X3] ;
DMB SY | | ;
LDR W2,[X3] | | ;
exists
(0:X2=0 /\ 2:X0=1 /\ 2:X2=0)
where P0 is doing spin_unlock_wait, P1 is doing spin_unlock and P2 is
doing spin_lock.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against
concurrent lockers") fixed spin_unlock_wait for LL/SC-based atomics under
the premise that the LSE atomics (in particular, the LDADDA instruction)
are indivisible.
Unfortunately, these instructions are only indivisible when used with the
-AL (full ordering) suffix and, consequently, the same issue can
theoretically be observed with LSE atomics, where a later (in program
order) load can be speculated before the write portion of the atomic
operation.
This patch fixes the issue by performing a CAS of the lock once we've
established that it's unlocked, in much the same way as the LL/SC code.
Fixes: d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")
Signed-off-by: Will Deacon <will.deacon@arm.com>
spin_is_locked has grown two very different use-cases:
(1) [The sane case] API functions may require a certain lock to be held
by the caller and can therefore use spin_is_locked as part of an
assert statement in order to verify that the lock is indeed held.
For example, usage of assert_spin_locked.
(2) [The insane case] There are two locks, where a CPU takes one of the
locks and then checks whether or not the other one is held before
accessing some shared state. For example, the "optimized locking" in
ipc/sem.c.
In the latter case, the sequence looks like:
spin_lock(&sem->lock);
if (!spin_is_locked(&sma->sem_perm.lock))
/* Access shared state */
and requires that the spin_is_locked check is ordered after taking the
sem->lock. Unfortunately, since our spinlocks are implemented using a
LDAXR/STXR sequence, the read of &sma->sem_perm.lock can be speculated
before the STXR and consequently return a stale value.
Whilst this hasn't been seen to cause issues in practice, PowerPC fixed
the same issue in 51d7d5205d33 ("powerpc: Add smp_mb() to
arch_spin_is_locked()") and, although we did something similar for
spin_unlock_wait in d86b8da04dfa ("arm64: spinlock: serialise
spin_unlock_wait against concurrent lockers") that doesn't actually take
care of ordering against local acquisition of a different lock.
This patch adds an smp_mb() to the start of our arch_spin_is_locked and
arch_spin_unlock_wait routines to ensure that the lock value is always
loaded after any other locks have been taken by the current CPU.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Further testing with false negatives suppressed by commit 293e2421fe25
("rcu: Remove superfluous versions of rcu_read_lock_sched_held()")
identified another unprotected use of RCU from the idle loop. Because RCU
actively ignores idle-loop code (for energy-efficiency reasons, among
other things), using RCU from the idle loop can result in too-short
grace periods, in turn resulting in arbitrary misbehavior.
The resulting lockdep-RCU splat is as follows:
------------------------------------------------------------------------
===============================
[ INFO: suspicious RCU usage. ]
4.6.0-rc5-next-20160426+ #1112 Not tainted
-------------------------------
include/trace/events/ipi.h:35 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112
Hardware name: Generic OMAP4 (Flattened Device Tree)
[<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
[<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4)
[<c047fec8>] (dump_stack) from [<c010dcfc>] (smp_cross_call+0xbc/0x188)
[<c010dcfc>] (smp_cross_call) from [<c01c9e28>] (generic_exec_single+0x9c/0x15c)
[<c01c9e28>] (generic_exec_single) from [<c01ca0a0>] (smp_call_function_single_async+0 x38/0x9c)
[<c01ca0a0>] (smp_call_function_single_async) from [<c0603728>] (cpuidle_coupled_poke_others+0x8c/0xa8)
[<c0603728>] (cpuidle_coupled_poke_others) from [<c0603c10>] (cpuidle_enter_state_coupled+0x26c/0x390)
[<c0603c10>] (cpuidle_enter_state_coupled) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0)
[<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
[<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
------------------------------------------------------------------------
Reported-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: <linux-omap@vger.kernel.org>
Cc: <linux-arm-kernel@lists.infradead.org>
Unlike the debug_fault_info table, we never intentionally alter the
fault_info table at runtime, and all derived pointers are treated as
const currently.
Make the table const so that it can be placed in .rodata and protected
from unintentional writes, as we do for the syscall tables.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
If the kernel is set to show unhandled signals, and a user task does not
handle a SIGILL as a result of an instruction abort, we will attempt to
log the offending instruction with dump_instr before killing the task.
We use dump_instr to log the encoding of the offending userspace
instruction. However, dump_instr is also used to dump instructions from
kernel space, and internally always switches to KERNEL_DS before dumping
the instruction with get_user. When both PAN and UAO are in use, reading
a user instruction via get_user while in KERNEL_DS will result in a
permission fault, which leads to an Oops.
As we have regs corresponding to the context of the original instruction
abort, we can inspect this and only flip to KERNEL_DS if the original
abort was taken from the kernel, avoiding this issue. At the same time,
remove the redundant (and incorrect) comments regarding the order
dump_mem and dump_instr are called in.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: <stable@vger.kernel.org> #4.6+
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Fixes: 57f4959bad0a154a ("arm64: kernel: Add support for User Access Override")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Fix kprobe_fault_handler() to clear the TF (trap flag) bit of
the flags register in the case of a fault fixup on single-stepping.
If we put a kprobe on the instruction which caused a
page fault (e.g. actual mov instructions in copy_user_*),
that fault happens on the single-stepping buffer. In this
case, kprobes resets running instance so that the CPU can
retry execution on the original ip address.
However, current code forgets to reset the TF bit. Since this
fault happens with TF bit set for enabling single-stepping,
when it retries, it causes a debug exception and kprobes
can not handle it because it already reset itself.
On the most of x86-64 platform, it can be easily reproduced
by using kprobe tracer. E.g.
# cd /sys/kernel/debug/tracing
# echo p copy_user_enhanced_fast_string+5 > kprobe_events
# echo 1 > events/kprobes/enable
And you'll see a kernel panic on do_debug(), since the debug
trap is not handled by kprobes.
To fix this problem, we just need to clear the TF bit when
resetting running kprobe.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: systemtap@sourceware.org
Cc: stable@vger.kernel.org # All the way back to ancient kernels
Link: http://lkml.kernel.org/r/20160611140648.25885.37482.stgit@devbox
[ Updated the comments. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The NMI watchdog uses either the fixed cycles or a generic cycles
counter. This causes a lot of conflicts with users of the PMU who want
to run a full group including the cycles fixed counter, for example
the --topdown support recently added to perf stat. The code needs to
fall back to not use groups, which can cause measurement inaccuracy
due to multiplexing errors.
This patch switches the NMI watchdog to use reference cycles
on Intel systems. This is actually more accurate than cycles,
because cycles can tick faster than the measured CPU Frequency
due to Turbo mode.
The ref cycles always tick at their frequency, or slower when
the system is idling. That means the NMI watchdog can never
expire too early, unlike with cycles.
The reference cycles tick roughly at the frequency of the TSC,
so the same period computation can be used.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1465478079-19993-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Remove redundant pci_get_drvdata() call. There's another call a few lines
down, just before we test "box" for NULL.
No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20160531212527.28718.92371.stgit@bhelgaas-glaptop2.roam.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>