IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Pull x86 shadow stack support from Dave Hansen:
"This is the long awaited x86 shadow stack support, part of Intel's
Control-flow Enforcement Technology (CET).
CET consists of two related security features: shadow stacks and
indirect branch tracking. This series implements just the shadow stack
part of this feature, and just for userspace.
The main use case for shadow stack is providing protection against
return oriented programming attacks. It works by maintaining a
secondary (shadow) stack using a special memory type that has
protections against modification. When executing a CALL instruction,
the processor pushes the return address to both the normal stack and
to the special permission shadow stack. Upon RET, the processor pops
the shadow stack copy and compares it to the normal stack copy.
For more information, refer to the links below for the earlier
versions of this patch set"
Link: https://lore.kernel.org/lkml/20220130211838.8382-1-rick.p.edgecombe@intel.com/
Link: https://lore.kernel.org/lkml/20230613001108.3040476-1-rick.p.edgecombe@intel.com/
* tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
x86/shstk: Change order of __user in type
x86/ibt: Convert IBT selftest to asm
x86/shstk: Don't retry vm_munmap() on -EINTR
x86/kbuild: Fix Documentation/ reference
x86/shstk: Move arch detail comment out of core mm
x86/shstk: Add ARCH_SHSTK_STATUS
x86/shstk: Add ARCH_SHSTK_UNLOCK
x86: Add PTRACE interface for shadow stack
selftests/x86: Add shadow stack test
x86/cpufeatures: Enable CET CR4 bit for shadow stack
x86/shstk: Wire in shadow stack interface
x86: Expose thread features in /proc/$PID/status
x86/shstk: Support WRSS for userspace
x86/shstk: Introduce map_shadow_stack syscall
x86/shstk: Check that signal frame is shadow stack mem
x86/shstk: Check that SSP is aligned on sigreturn
x86/shstk: Handle signals for shadow stack
x86/shstk: Introduce routines modifying shstk
x86/shstk: Handle thread shadow stack
x86/shstk: Add user-mode shadow stack support
...
Patch series "Add support for DAX vmemmap optimization for ppc64", v6.
This patch series implements changes required to support DAX vmemmap
optimization for ppc64. The vmemmap optimization is only enabled with
radix MMU translation and 1GB PUD mapping with 64K page size.
The patch series also splits the hugetlb vmemmap optimization as a
separate Kconfig variable so that architectures can enable DAX vmemmap
optimization without enabling hugetlb vmemmap optimization. This should
enable architectures like arm64 to enable DAX vmemmap optimization while
they can't enable hugetlb vmemmap optimization. More details of the same
are in patch "mm/vmemmap optimization: Split hugetlb and devdax vmemmap
optimization".
With 64K page size for 16384 pages added (1G) we save 14 pages
With 4K page size for 262144 pages added (1G) we save 4094 pages
With 4K page size for 512 pages added (2M) we save 6 pages
This patch (of 13):
Architectures like powerpc would like to enable transparent huge page pud
support only with radix translation. To support that add
has_transparent_pud_hugepage() helper that architectures can override.
[aneesh.kumar@linux.ibm.com: use the new has_transparent_pud_hugepage()]
Link: https://lkml.kernel.org/r/87tttrvtaj.fsf@linux.ibm.com
Link: https://lkml.kernel.org/r/20230724190759.483013-1-aneesh.kumar@linux.ibm.com
Link: https://lkml.kernel.org/r/20230724190759.483013-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The x86 Shadow stack feature includes a new type of memory called shadow
stack. This shadow stack memory has some unusual properties, which requires
some core mm changes to function properly.
One of these unusual properties is that shadow stack memory is writable,
but only in limited ways. These limits are applied via a specific PTE
bit combination. Nevertheless, the memory is writable, and core mm code
will need to apply the writable permissions in the typical paths that
call pte_mkwrite(). Future patches will make pte_mkwrite() take a VMA, so
that the x86 implementation of it can know whether to create regular
writable or shadow stack mappings.
But there are a couple of challenges to this. Modifying the signatures of
each arch pte_mkwrite() implementation would be error prone because some
are generated with macros and would need to be re-implemented. Also, some
pte_mkwrite() callers operate on kernel memory without a VMA.
So this can be done in a three step process. First pte_mkwrite() can be
renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite()
added that just calls pte_mkwrite_novma(). Next callers without a VMA can
be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers
can be changed to take/pass a VMA.
Previous work pte_mkwrite() renamed pte_mkwrite_novma() and converted
callers that don't have a VMA were to use pte_mkwrite_novma(). So now
change pte_mkwrite() to take a VMA and change the remaining callers to
pass a VMA. Apply the same changes for pmd_mkwrite().
No functional change.
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/all/20230613001108.3040476-4-rick.p.edgecombe%40intel.com
The page table debug tests need a physical address to validate low-level
page table manipulation with. The memory at this address is not actually
touched, it just encoded in the page table entries at various levels
during the tests only.
Since the memory is not used, the code just picks the physical address of
the start_kernel symbol. This value is then truncated to get a properly
aligned address that is to be used for various tests. Because of the
truncation, the address might not actually exist, or might not describe a
complete huge page. That's not a problem for most tests, but the
arch-specific code may check for attribute validity and consistency. The
x86 version of {pud,pmd}_set_huge actually validates the MTRRs for the
PMD/PUD range. This may fail with an address derived from start_kernel,
depending on where the kernel was loaded and what the physical memory
layout of the system is. This then leads to false negatives for the
{pud,pmd}_set_huge tests.
Avoid this by finding a properly aligned memory range that exists and is
usable. If such a range is not found, skip the tests that needed it.
[fvdl@google.com: v3]
Link: https://lkml.kernel.org/r/20230110181208.1633879-1-fvdl@google.com
Link: https://lkml.kernel.org/r/20230109174332.329366-1-fvdl@google.com
Fixes: 399145f9eb ("mm/debug: add tests validating architecture page table helpers")
Signed-off-by: Frank van der Linden <fvdl@google.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
so it will be consistent with code mm directory and with
Documentation/admin-guide/mm and won't be confused with virtual machines.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Acked-by: Wu XiangCheng <bobwxc@email.cn>
In page table entry modifying tests, set_xxx_at() are used to populate
the page table entries. On ARM64, PG_arch_1 (PG_dcache_clean) flag is
set to the target page flag if execution permission is given. The logic
exits since commit 4f04d8f005 ("arm64: MMU definitions"). The page
flag is kept when the page is free'd to buddy's free area list. However,
it will trigger page checking failure when it's pulled from the buddy's
free area list, as the following warning messages indicate.
BUG: Bad page state in process memhog pfn:08000
page:0000000015c0a628 refcount:0 mapcount:0 \
mapping:0000000000000000 index:0x1 pfn:0x8000
flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff)
raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set
This fixes the issue by clearing PG_arch_1 through flush_dcache_page()
after set_xxx_at() is called. For architectures other than ARM64, the
unexpected overhead of cache flushing is acceptable.
Link: https://lkml.kernel.org/r/20210809092631.1888748-13-gshan@redhat.com
Fixes: a5c3b9ffb0 ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> [powerpc 8xx]
Tested-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> [s390]
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chunyu Hu <chuhu@redhat.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm/debug_vm_pgtable: Enhancements", v6.
There are a couple of issues with current implementations and this series
tries to resolve the issues:
(a) All needed information are scattered in variables, passed to various
test functions. The code is organized in pretty much relaxed fashion.
(b) The page isn't allocated from buddy during page table entry modifying
tests. The page can be invalid, conflicting to the implementations
of set_xxx_at() on ARM64. The target page is accessed so that the
iCache can be flushed when execution permission is given on ARM64.
Besides, the target page can be unmapped and accessing to it causes
kernel crash.
"struct pgtable_debug_args" is introduced to address issue (a). For issue
(b), the used page is allocated from buddy in page table entry modifying
tests. The corresponding tets will be skipped if we fail to allocate the
(huge) page. For other test cases, the original page around to kernel
symbol (@start_kernel) is still used.
The patches are organized as below. PATCH[2-10] could be combined to one
patch, but it will make the review harder:
PATCH[1] introduces "struct pgtable_debug_args" as place holder of all
needed information. With it, the old and new implementation
can coexist.
PATCH[2-10] uses "struct pgtable_debug_args" in various test functions.
PATCH[11] removes the unused code for old implementation.
PATCH[12] fixes the issue of corrupted page flag for ARM64
This patch (of 6):
In debug_vm_pgtable(), there are many local variables introduced to track
the needed information and they are passed to the functions for various
test cases. It'd better to introduce a struct as place holder for these
information. With it, what the tests functions need is the struct. In
this way, the code is simplified and easier to be maintained.
Besides, set_xxx_at() could access the data on the corresponding pages in
the page table modifying tests. So the accessed pages in the tests should
have been allocated from buddy. Otherwise, we're accessing pages that
aren't owned by us. This causes issues like page flag corruption or
kernel crash on accessing unmapped page when CONFIG_DEBUG_PAGEALLOC is
enabled.
This introduces "struct pgtable_debug_args". The struct is initialized
and destroyed, but the information in the struct isn't used yet. It will
be used in subsequent patches.
Link: https://lkml.kernel.org/r/20210809092631.1888748-1-gshan@redhat.com
Link: https://lkml.kernel.org/r/20210809092631.1888748-2-gshan@redhat.com
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> [powerpc 8xx]
Tested-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> [s390]
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In pmd/pud_advanced_tests(), the vaddr is aligned up to the next pmd/pud
entry, and so it does not match the given pmdp/pudp and (aligned down)
pfn any more.
For s390, this results in memory corruption, because the IDTE
instruction used e.g. in xxx_get_and_clear() will take the vaddr for
some calculations, in combination with the given pmdp. It will then end
up with a wrong table origin, ending on ...ff8, and some of those
wrongly set low-order bits will also select a wrong pagetable level for
the index addition. IDTE could therefore invalidate (or 0x20) something
outside of the page tables, depending on the wrongly picked index, which
in turn depends on the random vaddr.
As result, we sometimes see "BUG task_struct (Not tainted): Padding
overwritten" on s390, where one 0x5a padding value got overwritten with
0x7a.
Fix this by aligning down, similar to how the pmd/pud_aligned pfns are
calculated.
Link: https://lkml.kernel.org/r/20210525130043.186290-2-gerald.schaefer@linux.ibm.com
Fixes: a5c3b9ffb0 ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers")
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: <stable@vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently the basic tests just validate various page table transformations
after starting with vm_get_page_prot(VM_READ|VM_WRITE|VM_EXEC) protection.
Instead scan over the entire protection_map[] for better coverage. It
also makes sure that all these basic page table tranformations checks hold
true irrespective of the starting protection value for the page table
entry. There is also a slight change in the debug print format for basic
tests to capture the protection value it is being tested with. The
modified output looks something like
[pte_basic_tests ]: Validating PTE basic ()
[pte_basic_tests ]: Validating PTE basic (read)
[pte_basic_tests ]: Validating PTE basic (write)
[pte_basic_tests ]: Validating PTE basic (read|write)
[pte_basic_tests ]: Validating PTE basic (exec)
[pte_basic_tests ]: Validating PTE basic (read|exec)
[pte_basic_tests ]: Validating PTE basic (write|exec)
[pte_basic_tests ]: Validating PTE basic (read|write|exec)
[pte_basic_tests ]: Validating PTE basic (shared)
[pte_basic_tests ]: Validating PTE basic (read|shared)
[pte_basic_tests ]: Validating PTE basic (write|shared)
[pte_basic_tests ]: Validating PTE basic (read|write|shared)
[pte_basic_tests ]: Validating PTE basic (exec|shared)
[pte_basic_tests ]: Validating PTE basic (read|exec|shared)
[pte_basic_tests ]: Validating PTE basic (write|exec|shared)
[pte_basic_tests ]: Validating PTE basic (read|write|exec|shared)
This adds a missing argument 'struct mm_struct *' in pud_basic_tests()
test . This never got exposed before as PUD based THP is available only
on X86 platform where mm_pmd_folded(mm) call gets macro replaced without
requiring the mm_struct i.e __is_defined(__PAGETABLE_PMD_FOLDED).
Link: https://lkml.kernel.org/r/1611137241-26220-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Reviewed-by: Steven Price <steven.price@arm.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>