e71e2ace57
Patch series "userfaultfd: do not untag user pointers", v5.

If a user program uses userfaultfd on ranges of heap memory, it may end up
passing a tagged pointer to the kernel in the range.start field of the
UFFDIO_REGISTER ioctl. This can happen when using an MTE-capable
allocator, or on Android if using the Tagged Pointers feature for MTE
readiness [1].

When a fault subsequently occurs, the tag is stripped from the fault
address returned to the application in the fault.address field of struct
uffd_msg. However, from the application's perspective, the tagged address
*is* the memory address, so if the application is unaware of memory tags,
it may get confused by receiving an address that is, from its point of
view, outside of the bounds of the allocation. We observed this behavior
in the kselftest for userfaultfd [2] but other applications could have the
same problem.

Address this by not untagging pointers passed to the userfaultfd ioctls.
Instead, let the system call fail. Also change the kselftest to use mmap
so that it doesn't encounter this problem.

[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c

This patch (of 2):

Do not untag pointers passed to the userfaultfd ioctls. Instead, let the
system call fail. This will provide an early indication of problems with
tag-unaware userspace code instead of letting the code get confused later,
and is consistent with how we decided to handle brk/mmap/mremap in commit
dcde237319 ("mm: Avoid creating virtual address aliases in
brk()/mmap()/mremap()"), as well as being consistent with the existing
tagged address ABI documentation relating to how ioctl arguments are
handled.

The code change is a revert of commit 7d0325749a ("userfaultfd: untag user
pointers") plus some fixups to some additional calls to validate_range
that have appeared since then.

Link: https://lkml.kernel.org/r/20210714195437.118982-1-pcc@google.com
Link: https://lkml.kernel.org/r/20210714195437.118982-2-pcc@google.com
Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0a25501b
Fixes: 63f0c60379 ("arm64: Introduce prctl() options to control the tagged user addresses ABI")
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alistair Delva <adelva@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mitch Phillips <mitchp@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: William McVicker <willmcvicker@google.com>
Cc: <stable@vger.kernel.org> [5.4]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

180 lines
6.0 KiB
ReStructuredText
==========================
AArch64 TAGGED ADDRESS ABI
==========================

Authors: Vincenzo Frascino <vincenzo.frascino@arm.com>
         Catalin Marinas <catalin.marinas@arm.com>

Date: 21 August 2019

This document describes the usage and semantics of the Tagged Address
ABI on AArch64 Linux.

1. Introduction
---------------

On AArch64 the ``TCR_EL1.TBI0`` bit is set by default, allowing
userspace (EL0) to perform memory accesses through 64-bit pointers with
a non-zero top byte. This document describes the relaxation of the
syscall ABI that allows userspace to pass certain tagged pointers to
kernel syscalls.

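As a minimal sketch of what a "non-zero top byte" means in practice, the
helpers below place an 8-bit tag in bits 63:56 of a pointer and strip it
again. The names ``TAG_MASK``, ``ptr_set_tag()``, ``ptr_get_tag()`` and
``ptr_clear_tag()`` are illustrative only and are not part of any kernel or
libc API; the code assumes 64-bit pointers and only manipulates the pointer
value, it does not dereference anything.

```c
#include <stdint.h>

#define TAG_SHIFT 56                        /* tag occupies bits 63:56 */
#define TAG_MASK  (0xffULL << TAG_SHIFT)

/* Return ptr with its top byte replaced by tag (hypothetical helper). */
static inline void *ptr_set_tag(void *ptr, uint8_t tag)
{
	uint64_t addr = (uint64_t)(uintptr_t)ptr & ~TAG_MASK;

	return (void *)(uintptr_t)(addr | ((uint64_t)tag << TAG_SHIFT));
}

/* Return the top byte of ptr; 0 means the pointer is untagged. */
static inline uint8_t ptr_get_tag(const void *ptr)
{
	return (uint8_t)((uint64_t)(uintptr_t)ptr >> TAG_SHIFT);
}

/* Return ptr with its top byte cleared (hypothetical helper). */
static inline void *ptr_clear_tag(void *ptr)
{
	return (void *)(uintptr_t)((uint64_t)(uintptr_t)ptr & ~TAG_MASK);
}
```
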
2. AArch64 Tagged Address ABI
-----------------------------

From the kernel syscall interface perspective and for the purposes of
this document, a "valid tagged pointer" is a pointer with a potentially
non-zero top-byte that references an address in the user process address
space obtained in one of the following ways:

- ``mmap()`` syscall where either:

  - flags have the ``MAP_ANONYMOUS`` bit set or
  - the file descriptor refers to a regular file (including those
    returned by ``memfd_create()``) or ``/dev/zero``

- ``brk()`` syscall (i.e. the heap area between the initial location of
  the program break at process creation and its current location).

- any memory mapped by the kernel in the address space of the process
  during creation and with the same restrictions as for ``mmap()`` above
  (e.g. data, bss, stack).

The AArch64 Tagged Address ABI has two stages of relaxation depending on
how the user addresses are used by the kernel:

1. User addresses not accessed by the kernel but used for address space
   management (e.g. ``mprotect()``, ``madvise()``). The use of valid
   tagged pointers in this context is allowed with these exceptions:

   - ``brk()``, ``mmap()`` and the ``new_address`` argument to
     ``mremap()`` as these have the potential to alias with existing
     user addresses.

     NOTE: This behaviour changed in v5.6 and so some earlier kernels may
     incorrectly accept valid tagged pointers for the ``brk()``,
     ``mmap()`` and ``mremap()`` system calls.

   - The ``range.start``, ``start`` and ``dst`` arguments to the
     ``UFFDIO_*`` ``ioctl()``\ s used on a file descriptor obtained from
     ``userfaultfd()``, as fault addresses subsequently obtained by reading
     the file descriptor will be untagged, which may otherwise confuse
     tag-unaware programs.

     NOTE: This behaviour changed in v5.14 and so some earlier kernels may
     incorrectly accept valid tagged pointers for this system call.

2. User addresses accessed by the kernel (e.g. ``write()``). This ABI
   relaxation is disabled by default and the application thread needs to
   explicitly enable it via ``prctl()`` as follows:

   - ``PR_SET_TAGGED_ADDR_CTRL``: enable or disable the AArch64 Tagged
     Address ABI for the calling thread.

     The ``(unsigned int) arg2`` argument is a bit mask describing the
     control mode used:

     - ``PR_TAGGED_ADDR_ENABLE``: enable AArch64 Tagged Address ABI.
       Default status is disabled.

     Arguments ``arg3``, ``arg4``, and ``arg5`` must be 0.

   - ``PR_GET_TAGGED_ADDR_CTRL``: get the status of the AArch64 Tagged
     Address ABI for the calling thread.

     Arguments ``arg2``, ``arg3``, ``arg4``, and ``arg5`` must be 0.

   The ABI properties described above are thread-scoped, inherited on
   ``clone()`` and ``fork()`` and cleared on ``exec()``.

   Calling ``prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0)``
   returns ``-EINVAL`` if the AArch64 Tagged Address ABI is globally
   disabled by ``sysctl abi.tagged_addr_disabled=1``. The default
   ``sysctl abi.tagged_addr_disabled`` configuration is 0.

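The ``prctl()`` interface described above could be wrapped along the
following lines. This is a sketch under stated assumptions rather than a
reference implementation: ``tagged_addr_enable()`` and
``tagged_addr_enabled()`` are hypothetical helper names, and the fallback
value 56 for ``PR_GET_TAGGED_ADDR_CTRL`` comes from the Linux UAPI headers,
not from this document. From userspace a failing ``prctl()`` is seen as a
return value of -1 with ``errno`` set, so the ``-EINVAL`` case above shows
up as ``errno == EINVAL``.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Fallbacks for older libc headers; values as in the Linux UAPI. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_GET_TAGGED_ADDR_CTRL
#define PR_GET_TAGGED_ADDR_CTRL 56
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

/*
 * Try to enable the tagged address ABI for the calling thread.
 * Returns 0 on success or a negative errno value (hypothetical helper).
 */
static inline int tagged_addr_enable(void)
{
	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0))
		return -errno;
	return 0;
}

/*
 * Returns 1 if the ABI is enabled for the calling thread, 0 if it is
 * disabled or its status cannot be queried (hypothetical helper).
 */
static inline int tagged_addr_enabled(void)
{
	int ctrl = prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0);

	return ctrl >= 0 && (ctrl & PR_TAGGED_ADDR_ENABLE);
}
```

Since the setting is per thread, a program that creates threads before
enabling the ABI would need to call such a helper in each thread that passes
tagged pointers to syscalls.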
When the AArch64 Tagged Address ABI is enabled for a thread, the
following behaviours are guaranteed:

- All syscalls except the cases mentioned in section 3 can accept any
  valid tagged pointer.

- The syscall behaviour is undefined for invalid tagged pointers: it may
  result in an error code being returned, a (fatal) signal being raised,
  or other modes of failure.

- The syscall behaviour for a valid tagged pointer is the same as for
  the corresponding untagged pointer.

A definition of the meaning of tagged pointers on AArch64 can be found
in Documentation/arm64/tagged-pointers.rst.

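Referring back to the ``UFFDIO_*`` exception under stage 1: because fault
addresses read from a ``userfaultfd()`` descriptor are untagged, a tag-aware
program may want to pass the untagged address in ``range.start``, ``start``
or ``dst`` and compare fault addresses against untagged bounds. The sketch
below shows only that address arithmetic. ``untag_addr()`` and
``fault_in_range()`` are hypothetical helpers, and ``fault_addr`` stands for
the value the program has read from the descriptor.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_MASK (0xffULL << 56)    /* top byte, bits 63:56 */

/* Address with the tag removed, suitable for range.start, start or dst. */
static inline uint64_t untag_addr(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr & ~TAG_MASK;
}

/* True if the untagged fault address lies within [base, base + len). */
static inline bool fault_in_range(uint64_t fault_addr, const void *base,
				  size_t len)
{
	uint64_t start = untag_addr(base);

	return fault_addr >= start && fault_addr - start < len;
}
```
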
3. AArch64 Tagged Address ABI Exceptions
-----------------------------------------

The following system call parameters must be untagged regardless of the
ABI relaxation:

- ``prctl()`` other than pointers to user data either passed directly or
  indirectly as arguments to be accessed by the kernel.

- ``ioctl()`` other than pointers to user data either passed directly or
  indirectly as arguments to be accessed by the kernel.

- ``shmat()`` and ``shmdt()``.

- ``brk()`` (since kernel v5.6).

- ``mmap()`` (since kernel v5.6).

- ``mremap()``, the ``new_address`` argument (since kernel v5.6).

Any attempt to use non-zero tagged pointers may result in an error code
being returned, a (fatal) signal being raised, or other modes of
failure.

4. Example of correct usage
---------------------------

.. code-block:: c

   #include <stdlib.h>
   #include <string.h>
   #include <unistd.h>
   #include <sys/mman.h>
   #include <sys/prctl.h>

   #define PR_SET_TAGGED_ADDR_CTRL 55
   #define PR_TAGGED_ADDR_ENABLE (1UL << 0)

   #define TAG_SHIFT 56

   int main(void)
   {
       int tbi_enabled = 0;
       unsigned long tag = 0;
       char *ptr;

       /* check/enable the tagged address ABI */
       if (!prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0))
           tbi_enabled = 1;

       /* memory allocation */
       ptr = mmap(NULL, sysconf(_SC_PAGE_SIZE), PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
       if (ptr == MAP_FAILED)
           return 1;

       /* set a non-zero tag if the ABI is available */
       if (tbi_enabled)
           tag = rand() & 0xff;
       ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));

       /* memory access to a tagged address */
       strcpy(ptr, "tagged pointer\n");

       /* syscall with a tagged pointer */
       write(1, ptr, strlen(ptr));

       return 0;
   }