// SPDX-License-Identifier: GPL-2.0-only
/*
 * Stress userfaultfd syscall.
 *
 * Copyright (C) 2015 Red Hat, Inc.
 *
 * This test allocates two virtual areas and bounces the physical
 * memory across the two virtual areas (from area_src to area_dst)
 * using userfaultfd.
 *
 * There are three threads running per CPU:
 *
 * 1) one per-CPU thread takes a per-page pthread_mutex in a random
 *    page of the area_dst (while the physical page may still be in
 *    area_src), and increments a per-page counter in the same page,
 *    and checks its value against a verification region.
 *
 * 2) another per-CPU thread handles the userfaults generated by
 *    thread 1 above. userfaultfd blocking reads or poll() modes are
 *    exercised interleaved.
 *
 * 3) one last per-CPU thread transfers the memory in the background
 *    at maximum bandwidth (if not already transferred by thread
 *    2). Each cpu thread takes care of transferring a portion of the
 *    area.
 *
 * When all threads of type 3 have completed the transfer, one bounce
 * is complete. area_src and area_dst are then swapped. All threads are
 * respawned and so the bounce is immediately restarted in the
 * opposite direction.
 *
 * per-CPU threads 1 by triggering userfaults inside
 * pthread_mutex_lock will also verify the atomicity of the memory
 * transfer (UFFDIO_COPY).
 */

#define _GNU_SOURCE
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <time.h>
#include <signal.h>
#include <poll.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <pthread.h>
#include <linux/userfaultfd.h>
#include <setjmp.h>
#include <stdbool.h>
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <sys/random.h>

#include "../kselftest.h"

#ifdef __NR_userfaultfd

static unsigned long nr_cpus, nr_pages, nr_pages_per_cpu, page_size;

#define BOUNCE_RANDOM		(1<<0)
#define BOUNCE_RACINGFAULTS	(1<<1)
#define BOUNCE_VERIFY		(1<<2)
#define BOUNCE_POLL		(1<<3)
static int bounces;

#define TEST_ANON	1
#define TEST_HUGETLB	2
#define TEST_SHMEM	3
static int test_type;

/* exercise the test_uffdio_*_eexist every ALARM_INTERVAL_SECS */
#define ALARM_INTERVAL_SECS 10
static volatile bool test_uffdio_copy_eexist = true;
static volatile bool test_uffdio_zeropage_eexist = true;
/* Whether to test uffd write-protection */
static bool test_uffdio_wp = false;
/* Whether to test uffd minor faults */
static bool test_uffdio_minor = false;

static bool map_shared;
static int shm_fd;
static int huge_fd;
static char *huge_fd_off0;
static unsigned long long *count_verify;
static int uffd = -1;
static int uffd_flags, finished, *pipefd;
static char *area_src, *area_src_alias, *area_dst, *area_dst_alias;
static char *zeropage;
pthread_attr_t attr;

/* Userfaultfd test statistics */
struct uffd_stats {
	int cpu;
	unsigned long missing_faults;
	unsigned long wp_faults;
	unsigned long minor_faults;
};

/* pthread_mutex_t starts at page offset 0 */
#define area_mutex(___area, ___nr)					\
	((pthread_mutex_t *) ((___area) + (___nr)*page_size))
/*
 * count is placed in the page after pthread_mutex_t naturally aligned
 * to avoid non alignment faults on non-x86 archs.
 */
#define area_count(___area, ___nr)					\
	((volatile unsigned long long *) ((unsigned long)		\
				 ((___area) + (___nr)*page_size +	\
				  sizeof(pthread_mutex_t) +		\
				  sizeof(unsigned long long) - 1) &	\
				 ~(unsigned long)(sizeof(unsigned long long) \
						  - 1)))

const char *examples =
	"# Run anonymous memory test on 100MiB region with 99999 bounces:\n"
	"./userfaultfd anon 100 99999\n\n"
	"# Run shared memory test on 1GiB region with 99 bounces:\n"
	"./userfaultfd shmem 1000 99\n\n"
	"# Run hugetlb memory test on 256MiB region with 50 bounces (using /dev/hugepages/hugefile):\n"
	"./userfaultfd hugetlb 256 50 /dev/hugepages/hugefile\n\n"
	"# Run the same hugetlb test but using shmem:\n"
	"./userfaultfd hugetlb_shared 256 50 /dev/hugepages/hugefile\n\n"
	"# 10MiB-~6GiB 999 bounces anonymous test, "
	"continue forever unless an error triggers\n"
	"while ./userfaultfd anon $[RANDOM % 6000 + 10] 999; do true; done\n\n";

static void usage(void)
{
	fprintf(stderr, "\nUsage: ./userfaultfd <test type> <MiB> <bounces> "
		"[hugetlbfs_file]\n\n");
	fprintf(stderr, "Supported <test type>: anon, hugetlb, "
		"hugetlb_shared, shmem\n\n");
	fprintf(stderr, "Examples:\n\n");
	fprintf(stderr, "%s", examples);
	exit(1);
}

#define _err(fmt, ...)						\
	do {							\
		int ret = errno;				\
		fprintf(stderr, "ERROR: " fmt, ##__VA_ARGS__);	\
		fprintf(stderr, " (errno=%d, line=%d)\n",	\
			ret, __LINE__);				\
	} while (0)

#define err(fmt, ...)				\
	do {					\
		_err(fmt, ##__VA_ARGS__);	\
		exit(1);			\
	} while (0)

static void uffd_stats_reset(struct uffd_stats *uffd_stats,
			     unsigned long n_cpus)
{
	int i;

	for (i = 0; i < n_cpus; i++) {
		uffd_stats[i].cpu = i;
		uffd_stats[i].missing_faults = 0;
		uffd_stats[i].wp_faults = 0;
		uffd_stats[i].minor_faults = 0;
	}
}

static void uffd_stats_report(struct uffd_stats *stats, int n_cpus)
{
	int i;
	unsigned long long miss_total = 0, wp_total = 0, minor_total = 0;

	for (i = 0; i < n_cpus; i++) {
		miss_total += stats[i].missing_faults;
		wp_total += stats[i].wp_faults;
		minor_total += stats[i].minor_faults;
	}

	printf("userfaults: ");
	if (miss_total) {
		printf("%llu missing (", miss_total);
		for (i = 0; i < n_cpus; i++)
			printf("%lu+", stats[i].missing_faults);
		printf("\b) ");
	}
	if (wp_total) {
		printf("%llu wp (", wp_total);
		for (i = 0; i < n_cpus; i++)
			printf("%lu+", stats[i].wp_faults);
		printf("\b) ");
	}
	if (minor_total) {
		printf("%llu minor (", minor_total);
		for (i = 0; i < n_cpus; i++)
			printf("%lu+", stats[i].minor_faults);
		printf("\b) ");
	}
	printf("\n");
}

static void anon_release_pages(char *rel_area)
{
	if (madvise(rel_area, nr_pages * page_size, MADV_DONTNEED))
		err("madvise(MADV_DONTNEED) failed");
}

static void anon_allocate_area(void **alloc_area)
{
	*alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
			   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (*alloc_area == MAP_FAILED)
		err("mmap of anonymous memory failed");
}

static void noop_alias_mapping(__u64 *start, size_t len, unsigned long offset)
{
}

static void hugetlb_release_pages(char *rel_area)
{
	if (fallocate(huge_fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      rel_area == huge_fd_off0 ? 0 : nr_pages * page_size,
		      nr_pages * page_size))
		err("fallocate() failed");
}

static void hugetlb_allocate_area(void **alloc_area)
{
	void *area_alias = NULL;
	char **alloc_area_alias;

	*alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
			   (map_shared ? MAP_SHARED : MAP_PRIVATE) |
			   MAP_HUGETLB |
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-30 20:12:31 -08:00
			   (*alloc_area == area_src ? 0 : MAP_NORESERVE),
2022-01-14 14:08:04 -08:00
			   huge_fd, *alloc_area == area_src ? 0 :
			   nr_pages * page_size);
2021-06-30 18:48:55 -07:00
	if (*alloc_area == MAP_FAILED)
		err("mmap of hugetlbfs file failed");
2017-02-22 15:43:07 -08:00
2017-09-06 16:23:46 -07:00
if ( map_shared ) {
area_alias = mmap ( NULL , nr_pages * page_size , PROT_READ | PROT_WRITE ,
2022-01-14 14:08:04 -08:00
MAP_SHARED | MAP_HUGETLB ,
2017-09-06 16:23:46 -07:00
				  huge_fd, *alloc_area == area_src ? 0 :
				  nr_pages * page_size);
2021-06-30 18:48:55 -07:00
		if (area_alias == MAP_FAILED)
			err("mmap of hugetlb file alias failed");
2017-09-06 16:23:46 -07:00
}
2017-09-06 16:23:46 -07:00
	if (*alloc_area == area_src) {
2017-02-22 15:43:07 -08:00
huge_fd_off0 = * alloc_area ;
2017-09-06 16:23:46 -07:00
alloc_area_alias = & area_src_alias ;
} else {
alloc_area_alias = & area_dst_alias ;
}
if ( area_alias )
* alloc_area_alias = area_alias ;
}
static void hugetlb_alias_mapping(__u64 *start, size_t len, unsigned long offset)
{
	if (!map_shared)
		return;
	/*
	 * We can't zap just the pagetable with hugetlbfs because
	 * MADV_DONTNEED won't work. So exercise -EEXIST on an alias
	 * mapping where the pagetables are not established initially;
	 * this way we'll exercise the -EEXIST at the fs level.
	 */
	*start = (unsigned long) area_dst_alias + offset;
2017-02-22 15:43:07 -08:00
}
2021-06-30 18:48:55 -07:00
static void shmem_release_pages ( char * rel_area )
2017-02-22 15:43:46 -08:00
{
2021-06-30 18:48:55 -07:00
if ( madvise ( rel_area , nr_pages * page_size , MADV_REMOVE ) )
err ( " madvise(MADV_REMOVE) failed " ) ;
2017-02-22 15:43:46 -08:00
}
2017-05-03 14:54:54 -07:00
static void shmem_allocate_area ( void * * alloc_area )
2017-02-22 15:43:46 -08:00
{
2021-06-30 18:49:38 -07:00
void * area_alias = NULL ;
	bool is_src = alloc_area == (void **)&area_src;
	unsigned long offset = is_src ? 0 : nr_pages * page_size;
userfaultfd/selftests: use memfd_create for shmem test type
This is a preparatory commit. In the future, we want to be able to setup
alias mappings for area_src and area_dst in the shmem test, like we do in
the hugetlb_shared test. With a VMA obtained via mmap(MAP_ANONYMOUS |
MAP_SHARED), it isn't clear how to do this.
So, mmap() with an fd, so we can create alias mappings. Use memfd_create
instead of actually passing in a tmpfs path like hugetlb does, since it's
more convenient / simpler to run, and works just as well.
Future commits will:
1. Setup the alias mappings.
2. Extend our tests to actually take advantage of this, to test new
userfaultfd behavior being introduced in this series.
Also, a small fix in the area we're changing: when the hugetlb setup fails
in main(), pass in the right argv[] so we actually print out the hugetlb
file path.
Link: https://lkml.kernel.org/r/20210503180737.2487560-8-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 18:49:34 -07:00
2017-02-22 15:43:46 -08:00
* alloc_area = mmap ( NULL , nr_pages * page_size , PROT_READ | PROT_WRITE ,
MAP_SHARED , shm_fd , offset ) ;
2021-06-30 18:48:55 -07:00
	if (*alloc_area == MAP_FAILED)
		err("mmap of memfd failed");
2021-06-30 18:49:38 -07:00
	area_alias = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
			  MAP_SHARED, shm_fd, offset);
	if (area_alias == MAP_FAILED)
		err("mmap of memfd alias failed");

	if (is_src)
		area_src_alias = area_alias;
	else
		area_dst_alias = area_alias;
}
static void shmem_alias_mapping(__u64 *start, size_t len, unsigned long offset)
{
	*start = (unsigned long) area_dst_alias + offset;
2017-02-22 15:43:46 -08:00
}
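The alias mappings set up by shmem_allocate_area() rely on one property: two MAP_SHARED mappings of the same fd range are backed by the same pages, so a store through one address is visible through the other. A minimal sketch of that property follows. The helper name alias_mapping_demo() and the memfd name "alias-demo" are made up for illustration and are not part of the selftest.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the selftest: map the same memfd
 * range twice and check that a write through one mapping is visible
 * through the other. Returns 0 on success, -1 on any failure.
 */
int alias_mapping_demo(void)
{
	const size_t len = (size_t)sysconf(_SC_PAGESIZE);
	char *area, *alias;
	int fd, ret = -1;

	fd = memfd_create("alias-demo", 0);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len))
		goto out_close;

	area = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (area == MAP_FAILED)
		goto out_close;
	alias = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (alias == MAP_FAILED)
		goto out_unmap_area;

	/* Two distinct virtual addresses, one shared backing page. */
	strcpy(area, "hello");
	if (area != alias && strcmp(alias, "hello") == 0)
		ret = 0;

	munmap(alias, len);
out_unmap_area:
	munmap(area, len);
out_close:
	close(fd);
	return ret;
}
```

The selftest uses the alias for a different purpose (issuing ioctls against an address whose page tables are not populated yet), but the setup is the same two mmap() calls on one fd and offset.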
2017-05-03 14:54:54 -07:00
struct uffd_test_ops {
void ( * allocate_area ) ( void * * alloc_area ) ;
2021-06-30 18:48:55 -07:00
void ( * release_pages ) ( char * rel_area ) ;
2017-09-06 16:23:46 -07:00
void ( * alias_mapping ) ( __u64 * start , size_t len , unsigned long offset ) ;
2017-05-03 14:54:54 -07:00
} ;
static struct uffd_test_ops anon_uffd_test_ops = {
. allocate_area = anon_allocate_area ,
. release_pages = anon_release_pages ,
2017-09-06 16:23:46 -07:00
. alias_mapping = noop_alias_mapping ,
2017-05-03 14:54:54 -07:00
} ;
static struct uffd_test_ops shmem_uffd_test_ops = {
. allocate_area = shmem_allocate_area ,
. release_pages = shmem_release_pages ,
2021-06-30 18:49:38 -07:00
. alias_mapping = shmem_alias_mapping ,
2017-05-03 14:54:54 -07:00
} ;
static struct uffd_test_ops hugetlb_uffd_test_ops = {
. allocate_area = hugetlb_allocate_area ,
. release_pages = hugetlb_release_pages ,
2017-09-06 16:23:46 -07:00
. alias_mapping = hugetlb_alias_mapping ,
2017-05-03 14:54:54 -07:00
} ;
static struct uffd_test_ops * uffd_test_ops ;
2017-02-22 15:43:46 -08:00
userfaultfd/selftests: fix feature support detection
Before any tests are run, in set_test_type, we decide what feature(s) we
are going to be testing, based upon our command line arguments.
However, the supported features are not just a function of the memory
type being used, so this is broken.
For instance, consider writeprotect support. It is "normally" supported
for anonymous memory, but furthermore it requires that the kernel has
CONFIG_HAVE_ARCH_USERFAULTFD_WP. So, it is *not* supported at all on
aarch64, for example.
So, this fixes this by querying the kernel for the set of features it
supports in set_test_type, by opening a userfaultfd and issuing a
UFFDIO_API ioctl. Based upon the reported features, we toggle what
tests are enabled.
Link: https://lkml.kernel.org/r/20210930212309.4001967-3-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-05 13:42:10 -07:00
static inline uint64_t uffd_minor_feature(void)
{
	if (test_type == TEST_HUGETLB && map_shared)
		return UFFD_FEATURE_MINOR_HUGETLBFS;
	else if (test_type == TEST_SHMEM)
		return UFFD_FEATURE_MINOR_SHMEM;
	else
		return 0;
}
userfaultfd/selftests: fix calculation of expected ioctls
Today, we assert that the ioctls the kernel reports as supported for a
registration match a precomputed list. We decide which ioctls are
supported by examining the memory type. Then, in several locations we
"fix up" this list by adding or removing things this initial decision
got wrong.
What ioctls the kernel reports is actually a function of several things:
- The memory type
- Kernel feature support (e.g., no writeprotect on aarch64)
- The registration type (e.g., CONTINUE only supported for MINOR mode)
So, we can't fully compute this at the start, in set_test_type. It
varies per test, depending on what registration mode(s) those tests use.
Instead, introduce a new function which computes the correct list. This
centralizes the add/remove of ioctls depending on these function inputs
in one place, so we don't have to repeat ourselves in various tests.
Not only is the resulting code a bit shorter, but it fixes a real bug in
the existing code: previously, we would incorrectly require the
writeprotect ioctl to be present on aarch64, where it isn't actually
supported.
Link: https://lkml.kernel.org/r/20210930212309.4001967-4-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-05 13:42:13 -07:00
static uint64_t get_expected_ioctls(uint64_t mode)
{
	uint64_t ioctls = UFFD_API_RANGE_IOCTLS;

	if (test_type == TEST_HUGETLB)
		ioctls &= ~(1 << _UFFDIO_ZEROPAGE);

	if (!((mode & UFFDIO_REGISTER_MODE_WP) && test_uffdio_wp))
		ioctls &= ~(1 << _UFFDIO_WRITEPROTECT);

	if (!((mode & UFFDIO_REGISTER_MODE_MINOR) && test_uffdio_minor))
		ioctls &= ~(1 << _UFFDIO_CONTINUE);

	return ioctls;
}
static void assert_expected_ioctls_present(uint64_t mode, uint64_t ioctls)
{
	uint64_t expected = get_expected_ioctls(mode);
	uint64_t actual = ioctls & expected;

	if (actual != expected) {
		err("missing ioctl(s): expected %"PRIx64" actual: %"PRIx64,
		    expected, actual);
	}
}
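The mask logic in get_expected_ioctls() and assert_expected_ioctls_present() can be tried in isolation. The sketch below mirrors it with made-up bit positions and mode flags (every DEMO_* name and both demo_* functions are hypothetical; the real values come from linux/userfaultfd.h). It starts from "all ioctls", clears the bits the memory type, kernel features, or registration mode rule out, and then checks that the reported set contains at least the expected set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
enum demo_ioctl_bit {
	DEMO_WAKE = 0,
	DEMO_COPY,
	DEMO_ZEROPAGE,
	DEMO_WRITEPROTECT,
	DEMO_CONTINUE,
};

/* Hypothetical registration mode flags. */
#define DEMO_MODE_WP	(1u << 0)
#define DEMO_MODE_MINOR	(1u << 1)

/* All five demo ioctls set: 0x1f. */
#define DEMO_ALL_IOCTLS	((UINT64_C(1) << (DEMO_CONTINUE + 1)) - 1)

/* Compute which ioctls a registration is expected to report. */
uint64_t demo_expected_ioctls(uint64_t mode, bool is_hugetlb,
			      bool have_wp, bool have_minor)
{
	uint64_t ioctls = DEMO_ALL_IOCTLS;

	if (is_hugetlb)
		ioctls &= ~(UINT64_C(1) << DEMO_ZEROPAGE);
	if (!((mode & DEMO_MODE_WP) && have_wp))
		ioctls &= ~(UINT64_C(1) << DEMO_WRITEPROTECT);
	if (!((mode & DEMO_MODE_MINOR) && have_minor))
		ioctls &= ~(UINT64_C(1) << DEMO_CONTINUE);

	return ioctls;
}

/* True if every expected ioctl bit is set in the reported mask. */
bool demo_ioctls_present(uint64_t reported, uint64_t expected)
{
	return (reported & expected) == expected;
}
```

Note that the check is a subset test: extra bits in the reported mask are tolerated, only missing ones are treated as an error.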
userfaultfd/selftests: reinitialize test context in each test
Currently, the context (fds, mmap-ed areas, etc.) are global. Each test
mutates this state in some way, in some cases really "clobbering it"
(e.g., the events test mremap-ing area_dst over the top of area_src, or
the minor faults tests overwriting the count_verify values in the test
areas). We run the tests in a particular order, each test is careful to
make the right assumptions about its starting state, etc.
But, this is fragile. It's better for a test's success or failure to not
depend on what some other prior test case did to the global state.
To that end, clear and reinitialize the test context at the start of each
test case, so whatever prior test cases did doesn't affect future tests.
This is particularly relevant to this series because the events test's
mremap of area_dst screws up assumptions the minor fault test was relying
on. This wasn't a problem for hugetlb, as we don't mremap in that case.
[peterx@redhat.com: fix conflict between this patch and the uffd pagemap series]
Link: https://lkml.kernel.org/r/YKQqKrl+/cQ1utrb@t490s
Link: https://lkml.kernel.org/r/20210503180737.2487560-10-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 18:49:41 -07:00
static void userfaultfd_open(uint64_t *features)
{
	struct uffdio_api uffdio_api;

	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
	if (uffd < 0)
		err("userfaultfd syscall not available in this kernel");
	uffd_flags = fcntl(uffd, F_GETFD, NULL);

	uffdio_api.api = UFFD_API;
	uffdio_api.features = *features;
	if (ioctl(uffd, UFFDIO_API, &uffdio_api))
		err("UFFDIO_API failed.\nPlease make sure to "
		    "run with either root or ptrace capability.");
	if (uffdio_api.api != UFFD_API)
		err("UFFDIO_API error: %" PRIu64, (uint64_t)uffdio_api.api);

	*features = uffdio_api.features;
}
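The same UFFDIO_API handshake can be used on its own to probe which features the running kernel supports, which is the approach the "fix feature support detection" commit describes. The following is a sketch under assumptions: query_uffd_features() is a hypothetical helper, it requests no features (features = 0) so that the kernel only reports what it supports, and it returns -1 instead of aborting when userfaultfd is unavailable or not permitted.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Older UAPI headers may lack this flag; the value matches the kernel's. */
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif

/*
 * Hypothetical helper: ask the kernel which userfaultfd features it
 * supports. Returns 0 and fills *features on success, -1 if the
 * syscall or the UFFDIO_API handshake fails.
 */
int query_uffd_features(uint64_t *features)
{
	struct uffdio_api api = { .api = UFFD_API, .features = 0 };
	int fd;

	fd = (int)syscall(__NR_userfaultfd,
			  O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
	if (fd < 0)
		return -1;

	if (ioctl(fd, UFFDIO_API, &api) || api.api != UFFD_API) {
		close(fd);
		return -1;
	}

	/* On success the kernel overwrites .features with the supported set. */
	*features = api.features;
	close(fd);
	return 0;
}
```

A caller could then test individual bits, for example (features & UFFD_FEATURE_MINOR_SHMEM), before enabling the corresponding test.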
static inline void munmap_area(void **area)
{
	if (*area)
		if (munmap(*area, nr_pages * page_size))
			err("munmap");

	*area = NULL;
}
static void uffd_test_ctx_clear(void)
{
	size_t i;

	if (pipefd) {
		for (i = 0; i < nr_cpus * 2; ++i) {
			if (close(pipefd[i]))
				err("close pipefd");
		}
		free(pipefd);
		pipefd = NULL;
	}

	if (count_verify) {
		free(count_verify);
		count_verify = NULL;
	}

	if (uffd != -1) {
		if (close(uffd))
			err("close uffd");
		uffd = -1;
	}

	huge_fd_off0 = NULL;
	munmap_area((void **)&area_src);
	munmap_area((void **)&area_src_alias);
	munmap_area((void **)&area_dst);
	munmap_area((void **)&area_dst_alias);
}
static void uffd_test_ctx_init(uint64_t features)
{
	unsigned long nr, cpu;

	uffd_test_ctx_clear();

	uffd_test_ops->allocate_area((void **)&area_src);
	uffd_test_ops->allocate_area((void **)&area_dst);
	userfaultfd_open(&features);
	count_verify = malloc(nr_pages * sizeof(unsigned long long));
	if (!count_verify)
		err("count_verify");

	for (nr = 0; nr < nr_pages; nr++) {
		*area_mutex(area_src, nr) =
			(pthread_mutex_t)PTHREAD_MUTEX_INITIALIZER;
		count_verify[nr] = *area_count(area_src, nr) = 1;
		/*
		 * In the transition from 255 to 256, powerpc will
		 * read out of order in my_bcmp and see both bytes as
		 * zero, so keep an always non-zero placeholder right
		 * after the count, so that my_bcmp does not report
		 * false positives.
		 */
		*(area_count(area_src, nr) + 1) = 1;
	}
mm/userfaultfd: selftests: fix memory corruption with thp enabled
In RHEL's gating selftests we've encountered memory corruption in the
uffd event test even with upstream kernel:
# ./userfaultfd anon 128 4
nr_pages: 32768, nr_pages_per_cpu: 32768
bounces: 3, mode: rnd racing read, userfaults: 6240 missing (6240) 14729 wp (14729)
bounces: 2, mode: racing read, userfaults: 1444 missing (1444) 28877 wp (28877)
bounces: 1, mode: rnd read, userfaults: 6055 missing (6055) 14699 wp (14699)
bounces: 0, mode: read, userfaults: 82 missing (82) 25196 wp (25196)
testing uffd-wp with pagemap (pgsize=4096): done
testing uffd-wp with pagemap (pgsize=2097152): done
testing events (fork, remap, remove): ERROR: nr 32427 memory corruption 0 1 (errno=0, line=963)
ERROR: faulting process failed (errno=0, line=1117)
It can be easily reproduced when global thp enabled, which is the
default for RHEL.
It's also known as a side effect of commit 0db282ba2c12 ("selftest: use
mmap instead of posix_memalign to allocate memory", 2021-07-23), which
is imho right itself on using mmap() to make sure the addresses will be
untagged even on arm.
The problem is, for each test we allocate buffers using two
allocate_area() calls. We assumed these two buffers won't affect each
other, however they could, because mmap() could have found that the two
buffers are near each other and having the same VMA flags, so they got
merged into one VMA.
It won't be a big problem if thp is not enabled, but when thp is
aggressively enabled it means when initializing the src buffer it could
accidentally setup part of the dest buffer too when there's a shared THP
that overlaps the two regions. Then some of the dest buffer won't be
able to be trapped by userfaultfd missing mode, then it'll cause memory
corruption as described.
To fix it, do release_pages() after initializing the src buffer.
Since the previous two release_pages() calls are after
uffd_test_ctx_clear() which will unmap all the buffers anyway (which is
stronger than release pages, as unmap() also tears down pgtables), drop
them as they shouldn't really be anything useful.
We can mark the Fixes tag upon 0db282ba2c12 as it's reported to only
happen there, however the real "Fixes" IMHO should be 8ba6e8640844, as
before that commit we'll always do explicit release_pages() before
registration of uffd, and 8ba6e8640844 changed that logic by adding
extra unmap/map and we didn't release the pages at the right place.
Meanwhile I don't have a solid clue anyway on whether posix_memalign()
could always avoid triggering this bug, hence it's safer to attach this
fix to commit 8ba6e8640844.
Link: https://lkml.kernel.org/r/20210923232512.210092-1-peterx@redhat.com
Fixes: 8ba6e8640844 ("userfaultfd/selftests: reinitialize test context in each test")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1994931
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Li Wang <liwan@redhat.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18 15:15:22 -07:00
	/*
	 * After initialization of area_src, we must explicitly release pages
	 * for area_dst to make sure it's fully empty. Otherwise some
	 * area_dst pages could be erroneously initialized with zero pages,
	 * and we could hit memory corruption later in the test.
	 *
	 * One example is when THP is globally enabled: the allocate_area()
	 * calls above could have the two areas merged into a single VMA (as
	 * they will have the same VMA flags so they're mergeable). When we
	 * initialize area_src above, it's possible that some part of
	 * area_dst is faulted in via one huge THP that is shared between
	 * area_src and area_dst. As a result, some of area_dst would not be
	 * trapped by missing userfaults.
	 *
	 * This release_pages() guarantees that, even if that happened, we
	 * proactively split the THP and drop any accidentally initialized
	 * pages within area_dst.
	 */
	uffd_test_ops->release_pages(area_dst);
pipefd = malloc ( sizeof ( int ) * nr_cpus * 2 ) ;
if ( ! pipefd )
err ( " pipefd " ) ;
for ( cpu = 0 ; cpu < nr_cpus ; cpu + + )
if ( pipe2 ( & pipefd [ cpu * 2 ] , O_CLOEXEC | O_NONBLOCK ) )
err ( " pipe " ) ;
}
2015-09-04 15:47:23 -07:00
static int my_bcmp ( char * str1 , char * str2 , size_t n )
{
unsigned long i ;
for ( i = 0 ; i < n ; i + + )
if ( str1 [ i ] ! = str2 [ i ] )
return 1 ;
return 0 ;
}
2020-04-06 20:06:36 -07:00
static void wp_range ( int ufd , __u64 start , __u64 len , bool wp )
{
userfaultfd: selftests: make __{s,u}64 format specifiers portable
On certain platforms (powerpcle is the one on which I ran into this),
"%Ld" and "%Lu" are unsuitable for printing __s64 and __u64, respectively,
resulting in build warnings. Cast to {u,}int64_t, and use the PRI{d,u}64
macros defined in inttypes.h to print them. This ought to be portable to
all platforms.
Splitting this off into a separate macro lets us remove some lines, and
get rid of some (I would argue) stylistically odd cases where we joined
printf() and exit() into a single statement with a ,.
Finally, this also fixes a "missing braces around initializer" warning
when we initialize prms in wp_range().
[axelrasmussen@google.com: v2]
Link: https://lkml.kernel.org/r/20201203180244.1811601-1-axelrasmussen@google.com
Link: https://lkml.kernel.org/r/20201202211542.1121189-1-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Alan Gilbert <dgilbert@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-14 19:13:58 -08:00
struct uffdio_writeprotect prms ;
2020-04-06 20:06:36 -07:00
/* Write protection page faults */
prms . range . start = start ;
prms . range . len = len ;
/* Set write-protect if wp is true; otherwise undo it and do wakeup after that */
prms . mode = wp ? UFFDIO_WRITEPROTECT_MODE_WP : 0 ;
2021-06-30 18:48:55 -07:00
if ( ioctl ( ufd , UFFDIO_WRITEPROTECT , & prms ) )
err ( " clear WP failed: address=0x% " PRIx64 , ( uint64_t ) start ) ;
2020-04-06 20:06:36 -07:00
}
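The portable-format approach from the commit message above (cast to a fixed-width type, then print with the PRI macros from inttypes.h) can be sketched as follows. This is only an illustration: format_addr() is a hypothetical helper that does not exist in the selftest, and it takes an unsigned long long instead of __u64 so that it builds without kernel headers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the selftest): format a 64-bit
 * address portably. Casting to uint64_t and using PRIx64 avoids
 * "%Lx"-style specifiers, which do not match 64-bit integer types on
 * every platform. Returns what snprintf() returns.
 */
static int format_addr(char *buf, size_t size, unsigned long long addr)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "address=0x%" PRIx64, (uint64_t)addr);
}
```

The same cast-plus-macro pattern applies to signed values with PRId64 and int64_t, which is what the err() calls in this file use for uffdio_copy.copy.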
userfaultfd/selftests: add test exercising minor fault handling
Fix a dormant bug in userfaultfd_events_test(), where we did `return
faulting_process(0)` instead of `exit(faulting_process(0))`. This
caused the forked process to keep running, trying to execute any further
test cases after the events test in parallel with the "real" process.
Add a simple test case which exercises minor faults. In short, it does
the following:
1. "Sets up" an area (area_dst) and a second shared mapping to the same
underlying pages (area_dst_alias).
2. Register one of these areas with userfaultfd, in minor fault mode.
3. Start a second thread to handle any minor faults.
4. Populate the underlying pages with the non-UFFD-registered side of
the mapping. Basically, memset() each page with some arbitrary
contents.
5. Then, using the UFFD-registered mapping, read all of the page
contents, asserting that the contents match expectations (we expect
the minor fault handling thread can modify the page contents before
resolving the fault).
The minor fault handling thread, upon receiving an event, flips all the
bits (~) in that page, just to prove that it can modify it in some
arbitrary way. Then it issues a UFFDIO_CONTINUE ioctl, to set up the
mapping and resolve the fault. The reading thread should wake up and
see this modification.
Currently the minor fault test is only enabled in hugetlb_shared mode,
as this is the only configuration the kernel feature supports.
Link: https://lkml.kernel.org/r/20210301222728.176417-7-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Adam Ruprecht <ruprecht@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Cannon Matthews <cannonmatthews@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shawn Anastasio <shawn@anastas.io>
Cc: Steven Price <steven.price@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
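Steps 1, 4 and 5 of the commit message above rely on two shared mappings of the same underlying pages. A minimal sketch of that aliasing, under loud assumptions: it involves no userfaultfd at all, it backs the mappings with tmpfile() rather than hugetlbfs, and alias_flip_check() is a hypothetical name. It only shows that bytes populated and bit-flipped through one mapping are observed through the other.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Sketch only, no userfaultfd involved: map one temporary file twice
 * with MAP_SHARED, populate and bit-flip it through the first mapping
 * (the role area_dst plays in the test), then read it back through the
 * second mapping (the role of area_dst_alias).
 * Returns 0 if every byte seen through the alias equals ~fill, else -1.
 */
static int alias_flip_check(size_t len, uint8_t fill)
{
	uint8_t *area = MAP_FAILED, *alias = MAP_FAILED;
	FILE *f;
	size_t i;
	int ret = -1;

	assert(len > 0);
	f = tmpfile();
	if (!f)
		return -1;
	if (ftruncate(fileno(f), (off_t)len))
		goto out;

	area = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
		    fileno(f), 0);
	alias = mmap(NULL, len, PROT_READ, MAP_SHARED, fileno(f), 0);
	if (area == MAP_FAILED || alias == MAP_FAILED)
		goto out;

	/* Populate, then flip every bit, through the first mapping. */
	memset(area, fill, len);
	for (i = 0; i < len; i++)
		area[i] = (uint8_t)~area[i];

	/* The alias must observe the flipped contents. */
	ret = 0;
	for (i = 0; i < len; i++)
		if (alias[i] != (uint8_t)~fill)
			ret = -1;
out:
	if (area != MAP_FAILED)
		munmap(area, len);
	if (alias != MAP_FAILED)
		munmap(alias, len);
	fclose(f);
	return ret;
}
```

In the real test the read through the alias is what raises the minor fault; here the read simply succeeds because nothing is registered with userfaultfd.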
2021-05-04 18:35:57 -07:00
static void continue_range ( int ufd , __u64 start , __u64 len )
{
struct uffdio_continue req ;
2021-06-30 18:49:45 -07:00
int ret ;
2021-05-04 18:35:57 -07:00
req . range . start = start ;
req . range . len = len ;
req . mode = 0 ;
2021-06-30 18:48:55 -07:00
if ( ioctl ( ufd , UFFDIO_CONTINUE , & req ) )
err ( " UFFDIO_CONTINUE failed for address 0x% " PRIx64 ,
( uint64_t ) start ) ;
2021-06-30 18:49:45 -07:00
/*
* Error handling within the kernel for continue is subtly different
* from copy or zeropage , so it may be a source of bugs . Trigger an
* error ( - EEXIST ) on purpose , to verify doing so doesn ' t cause a BUG .
*/
req . mapped = 0 ;
ret = ioctl ( ufd , UFFDIO_CONTINUE , & req ) ;
if ( ret > = 0 | | req . mapped ! = - EEXIST )
err ( " failed to exercise UFFDIO_CONTINUE error handling, ret=%d, mapped=% " PRId64 ,
ret , ( int64_t ) req . mapped ) ;
2021-05-04 18:35:57 -07:00
}
2015-09-04 15:47:23 -07:00
static void * locking_thread ( void * arg )
{
unsigned long cpu = ( unsigned long ) arg ;
unsigned long page_nr = * ( & ( page_nr ) ) ; /* uninitialized warning */
unsigned long long count ;
2021-11-05 13:42:07 -07:00
if ( ! ( bounces & BOUNCE_RANDOM ) ) {
2015-09-04 15:47:23 -07:00
page_nr = - bounces ;
if ( ! ( bounces & BOUNCE_RACINGFAULTS ) )
page_nr + = cpu * nr_pages_per_cpu ;
}
while ( ! finished ) {
if ( bounces & BOUNCE_RANDOM ) {
2021-11-05 13:42:07 -07:00
if ( getrandom ( & page_nr , sizeof ( page_nr ) , 0 ) ! = sizeof ( page_nr ) )
err ( " getrandom failed " ) ;
2015-09-04 15:47:23 -07:00
} else
page_nr + = 1 ;
page_nr % = nr_pages ;
pthread_mutex_lock ( area_mutex ( area_dst , page_nr ) ) ;
count = * area_count ( area_dst , page_nr ) ;
2021-06-30 18:48:55 -07:00
if ( count ! = count_verify [ page_nr ] )
err ( " page_nr %lu memory corruption %llu %llu " ,
page_nr , count , count_verify [ page_nr ] ) ;
2015-09-04 15:47:23 -07:00
count + + ;
* area_count ( area_dst , page_nr ) = count_verify [ page_nr ] = count ;
pthread_mutex_unlock ( area_mutex ( area_dst , page_nr ) ) ;
}
return NULL ;
}
2017-09-06 16:23:46 -07:00
static void retry_copy_page ( int ufd , struct uffdio_copy * uffdio_copy ,
unsigned long offset )
{
uffd_test_ops - > alias_mapping ( & uffdio_copy - > dst ,
uffdio_copy - > len ,
offset ) ;
if ( ioctl ( ufd , UFFDIO_COPY , uffdio_copy ) ) {
/* real retval in uffdio_copy->copy */
2021-06-30 18:48:55 -07:00
if ( uffdio_copy - > copy ! = - EEXIST )
err ( " UFFDIO_COPY retry error: % " PRId64 ,
( int64_t ) uffdio_copy - > copy ) ;
} else {
err ( " UFFDIO_COPY retry unexpected: % " PRId64 ,
( int64_t ) uffdio_copy - > copy ) ;
}
2017-09-06 16:23:46 -07:00
}
2021-09-02 14:59:02 -07:00
static void wake_range ( int ufd , unsigned long addr , unsigned long len )
{
struct uffdio_range uffdio_wake ;
uffdio_wake . start = addr ;
uffdio_wake . len = len ;
if ( ioctl ( ufd , UFFDIO_WAKE , & uffdio_wake ) )
fprintf ( stderr , " error waking %lu \n " ,
addr ) , exit ( 1 ) ;
}
2017-10-13 15:57:54 -07:00
static int __copy_page ( int ufd , unsigned long offset , bool retry )
2015-09-04 15:47:23 -07:00
{
struct uffdio_copy uffdio_copy ;
2021-06-30 18:48:55 -07:00
if ( offset > = nr_pages * page_size )
err ( " unexpected offset %lu \n " , offset ) ;
2015-09-04 15:47:23 -07:00
uffdio_copy . dst = ( unsigned long ) area_dst + offset ;
uffdio_copy . src = ( unsigned long ) area_src + offset ;
uffdio_copy . len = page_size ;
2020-04-06 20:06:36 -07:00
if ( test_uffdio_wp )
uffdio_copy . mode = UFFDIO_COPY_MODE_WP ;
else
uffdio_copy . mode = 0 ;
2015-09-04 15:47:23 -07:00
uffdio_copy . copy = 0 ;
2017-02-22 15:44:04 -08:00
if ( ioctl ( ufd , UFFDIO_COPY , & uffdio_copy ) ) {
2015-09-04 15:47:23 -07:00
/* real retval in uffdio_copy.copy */
2020-12-14 19:13:58 -08:00
if ( uffdio_copy . copy ! = - EEXIST )
2021-06-30 18:48:55 -07:00
err ( " UFFDIO_COPY error: % " PRId64 ,
( int64_t ) uffdio_copy . copy ) ;
2021-09-02 14:59:02 -07:00
wake_range ( ufd , uffdio_copy . dst , page_size ) ;
2015-09-04 15:47:23 -07:00
} else if ( uffdio_copy . copy ! = page_size ) {
2021-06-30 18:48:55 -07:00
err ( " UFFDIO_COPY error: % " PRId64 , ( int64_t ) uffdio_copy . copy ) ;
2017-09-06 16:23:46 -07:00
} else {
2017-10-13 15:57:54 -07:00
if ( test_uffdio_copy_eexist & & retry ) {
2017-09-06 16:23:46 -07:00
test_uffdio_copy_eexist = false ;
retry_copy_page ( ufd , & uffdio_copy , offset ) ;
}
2015-09-04 15:47:23 -07:00
return 1 ;
2017-09-06 16:23:46 -07:00
}
2015-09-04 15:47:23 -07:00
return 0 ;
}
2017-10-13 15:57:54 -07:00
static int copy_page_retry ( int ufd , unsigned long offset )
{
return __copy_page ( ufd , offset , true ) ;
}
static int copy_page ( int ufd , unsigned long offset )
{
return __copy_page ( ufd , offset , false ) ;
}
2018-10-26 15:09:13 -07:00
static int uffd_read_msg ( int ufd , struct uffd_msg * msg )
{
int ret = read ( ufd , msg , sizeof ( * msg ) ) ;
if ( ret ! = sizeof ( * msg ) ) {
if ( ret < 0 ) {
2022-01-14 14:08:01 -08:00
if ( errno = = EAGAIN | | errno = = EINTR )
2018-10-26 15:09:13 -07:00
return 1 ;
2021-06-30 18:48:55 -07:00
err ( " blocking read error " ) ;
2018-10-26 15:09:13 -07:00
} else {
2021-06-30 18:48:55 -07:00
err ( " short read " ) ;
2018-10-26 15:09:13 -07:00
}
}
return 0 ;
}
2020-04-06 20:06:32 -07:00
static void uffd_handle_page_fault ( struct uffd_msg * msg ,
struct uffd_stats * stats )
2018-10-26 15:09:13 -07:00
{
unsigned long offset ;
2021-06-30 18:48:55 -07:00
if ( msg - > event ! = UFFD_EVENT_PAGEFAULT )
err ( " unexpected msg event %u " , msg - > event ) ;
2018-10-26 15:09:13 -07:00
2020-04-06 20:06:36 -07:00
if ( msg - > arg . pagefault . flags & UFFD_PAGEFAULT_FLAG_WP ) {
2021-05-04 18:35:57 -07:00
/* Write protect page faults */
2020-04-06 20:06:36 -07:00
wp_range ( uffd , msg - > arg . pagefault . address , page_size , false ) ;
stats - > wp_faults + + ;
2021-05-04 18:35:57 -07:00
} else if ( msg - > arg . pagefault . flags & UFFD_PAGEFAULT_FLAG_MINOR ) {
uint8_t * area ;
int b ;
/*
* Minor page faults
*
* To prove we can modify the original range for testing
* purposes , we ' re going to bit flip this range before
* continuing .
*
* Note that this requires all minor page fault tests operate on
* area_dst ( non - UFFD - registered ) and area_dst_alias
* ( UFFD - registered ) .
*/
area = ( uint8_t * ) ( area_dst +
( ( char * ) msg - > arg . pagefault . address -
area_dst_alias ) ) ;
for ( b = 0 ; b < page_size ; + + b )
area [ b ] = ~ area [ b ] ;
continue_range ( uffd , msg - > arg . pagefault . address , page_size ) ;
stats - > minor_faults + + ;
2020-04-06 20:06:36 -07:00
} else {
/* Missing page faults */
2021-06-30 18:48:55 -07:00
if ( msg - > arg . pagefault . flags & UFFD_PAGEFAULT_FLAG_WRITE )
err ( " unexpected write fault " ) ;
2018-10-26 15:09:13 -07:00
2020-04-06 20:06:36 -07:00
offset = ( char * ) ( unsigned long ) msg - > arg . pagefault . address - area_dst ;
offset & = ~ ( page_size - 1 ) ;
2018-10-26 15:09:13 -07:00
2020-04-06 20:06:36 -07:00
if ( copy_page ( uffd , offset ) )
stats - > missing_faults + + ;
}
2018-10-26 15:09:13 -07:00
}
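The missing-fault branch above turns a faulting address into a page-aligned offset inside area_dst before calling copy_page(). The arithmetic can be sketched in isolation as below; fault_offset() is a hypothetical helper, and the power-of-two assumption on page_size is stated here, not checked anywhere in the original.

```c
#include <assert.h>

/*
 * Hypothetical helper (not in the selftest): offset of the page that
 * contains fault_addr, relative to area_start. Masking with
 * ~(page_size - 1) rounds down to a page boundary, which is only valid
 * when page_size is a power of two.
 */
static unsigned long fault_offset(unsigned long fault_addr,
				  unsigned long area_start,
				  unsigned long page_size)
{
	unsigned long offset;

	assert(page_size && !(page_size & (page_size - 1)));
	assert(fault_addr >= area_start);

	offset = fault_addr - area_start;
	return offset & ~(page_size - 1);
}
```

For example, with 4096-byte pages a fault 0x1234 bytes into the area resolves to offset 0x1000, so the whole second page is copied.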
2015-09-04 15:47:23 -07:00
static void * uffd_poll_thread ( void * arg )
{
2020-04-06 20:06:32 -07:00
struct uffd_stats * stats = ( struct uffd_stats * ) arg ;
unsigned long cpu = stats - > cpu ;
2015-09-04 15:47:23 -07:00
struct pollfd pollfd [ 2 ] ;
struct uffd_msg msg ;
2017-02-22 15:44:06 -08:00
struct uffdio_register uffd_reg ;
2015-09-04 15:47:23 -07:00
int ret ;
char tmp_chr ;
pollfd [ 0 ] . fd = uffd ;
pollfd [ 0 ] . events = POLLIN ;
pollfd [ 1 ] . fd = pipefd [ cpu * 2 ] ;
pollfd [ 1 ] . events = POLLIN ;
for ( ; ; ) {
ret = poll ( pollfd , 2 , - 1 ) ;
2022-01-14 14:08:01 -08:00
if ( ret < = 0 ) {
if ( errno = = EINTR | | errno = = EAGAIN )
continue ;
2021-06-30 18:48:55 -07:00
err ( " poll error: %d " , ret ) ;
2022-01-14 14:08:01 -08:00
}
2015-09-04 15:47:23 -07:00
if ( pollfd [ 1 ] . revents & POLLIN ) {
2021-06-30 18:48:55 -07:00
if ( read ( pollfd [ 1 ] . fd , & tmp_chr , 1 ) ! = 1 )
err ( " read pipefd error " ) ;
2015-09-04 15:47:23 -07:00
break ;
}
2021-06-30 18:48:55 -07:00
if ( ! ( pollfd [ 0 ] . revents & POLLIN ) )
err ( " pollfd[0].revents %d " , pollfd [ 0 ] . revents ) ;
2018-10-26 15:09:13 -07:00
if ( uffd_read_msg ( uffd , & msg ) )
continue ;
2017-02-22 15:44:06 -08:00
switch ( msg . event ) {
default :
2021-06-30 18:48:55 -07:00
err ( " unexpected msg event %u \n " , msg . event ) ;
2017-02-22 15:44:06 -08:00
break ;
case UFFD_EVENT_PAGEFAULT :
2020-04-06 20:06:32 -07:00
uffd_handle_page_fault ( & msg , stats ) ;
2017-02-22 15:44:06 -08:00
break ;
case UFFD_EVENT_FORK :
2017-09-06 16:23:43 -07:00
close ( uffd ) ;
2017-02-22 15:44:06 -08:00
uffd = msg . arg . fork . ufd ;
pollfd [ 0 ] . fd = uffd ;
break ;
2017-02-24 14:56:02 -08:00
case UFFD_EVENT_REMOVE :
uffd_reg . range . start = msg . arg . remove . start ;
uffd_reg . range . len = msg . arg . remove . end -
msg . arg . remove . start ;
2021-06-30 18:48:55 -07:00
if ( ioctl ( uffd , UFFDIO_UNREGISTER , & uffd_reg . range ) )
err ( " remove failure " ) ;
2017-02-22 15:44:06 -08:00
break ;
case UFFD_EVENT_REMAP :
area_dst = ( char * ) ( unsigned long ) msg . arg . remap . to ;
break ;
}
2015-09-04 15:47:23 -07:00
}
2020-04-06 20:06:32 -07:00
return NULL ;
2015-09-04 15:47:23 -07:00
}
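uffd_poll_thread() above waits on two descriptors at once: the userfaultfd and the read end of a per-CPU pipe that stress() writes to when the thread should exit. A reduced sketch of that pattern, with ordinary pipes standing in for both descriptors; wait_work_or_stop() is a hypothetical name and is not how the selftest structures its loop.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Block until work_fd or stop_fd is readable. The stop descriptor is
 * checked first, as in uffd_poll_thread().
 * Returns 1 if stop_fd was signalled (one byte is consumed),
 *         0 if work_fd is readable,
 *        -1 on error.
 */
static int wait_work_or_stop(int work_fd, int stop_fd)
{
	struct pollfd pfd[2];
	char c;
	int ret;

	assert(work_fd >= 0 && stop_fd >= 0);
	pfd[0].fd = work_fd;
	pfd[0].events = POLLIN;
	pfd[1].fd = stop_fd;
	pfd[1].events = POLLIN;

	for (;;) {
		ret = poll(pfd, 2, -1);
		if (ret < 0) {
			if (errno == EINTR || errno == EAGAIN)
				continue;
			return -1;
		}
		if (pfd[1].revents & POLLIN)
			return read(stop_fd, &c, 1) == 1 ? 1 : -1;
		if (pfd[0].revents & POLLIN)
			return 0;
		/* POLLERR, POLLHUP or similar on either descriptor. */
		return -1;
	}
}
```

Using a pipe for shutdown lets the thread leave poll() cleanly, without the pthread_cancel() that the blocking-read variant needs.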
pthread_mutex_t uffd_read_mutex = PTHREAD_MUTEX_INITIALIZER ;
static void * uffd_read_thread ( void * arg )
{
2020-04-06 20:06:32 -07:00
struct uffd_stats * stats = ( struct uffd_stats * ) arg ;
2015-09-04 15:47:23 -07:00
struct uffd_msg msg ;
pthread_mutex_unlock ( & uffd_read_mutex ) ;
/* from here cancellation is ok */
for ( ; ; ) {
2018-10-26 15:09:13 -07:00
if ( uffd_read_msg ( uffd , & msg ) )
continue ;
2020-04-06 20:06:32 -07:00
uffd_handle_page_fault ( & msg , stats ) ;
2015-09-04 15:47:23 -07:00
}
2020-04-06 20:06:32 -07:00
return NULL ;
2015-09-04 15:47:23 -07:00
}
static void * background_thread ( void * arg )
{
unsigned long cpu = ( unsigned long ) arg ;
2020-04-06 20:06:36 -07:00
unsigned long page_nr , start_nr , mid_nr , end_nr ;
start_nr = cpu * nr_pages_per_cpu ;
end_nr = ( cpu + 1 ) * nr_pages_per_cpu ;
mid_nr = ( start_nr + end_nr ) / 2 ;
/* Copy the first half of the pages */
for ( page_nr = start_nr ; page_nr < mid_nr ; page_nr + + )
copy_page_retry ( uffd , page_nr * page_size ) ;
2015-09-04 15:47:23 -07:00
2020-04-06 20:06:36 -07:00
/*
* If we need to test uffd - wp , set it up now . Then we ' ll have
* at least the first half of the pages mapped already which
* can be write - protected for testing
*/
if ( test_uffdio_wp )
wp_range ( uffd , ( unsigned long ) area_dst + start_nr * page_size ,
nr_pages_per_cpu * page_size , true ) ;
/*
* Continue the 2 nd half of the page copying , handling write
* protection faults if any
*/
for ( page_nr = mid_nr ; page_nr < end_nr ; page_nr + + )
2017-10-13 15:57:54 -07:00
copy_page_retry ( uffd , page_nr * page_size ) ;
2015-09-04 15:47:23 -07:00
return NULL ;
}
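background_thread() above splits each CPU's share of pages into two halves so that write protection can be applied between them. The index arithmetic alone can be sketched as follows; struct cpu_range and cpu_page_range() are hypothetical names introduced for illustration.

```c
#include <assert.h>

/* Hypothetical container for the three indices computed per CPU. */
struct cpu_range {
	unsigned long start_nr;	/* first page owned by this CPU */
	unsigned long mid_nr;	/* first page of the second half */
	unsigned long end_nr;	/* one past the last page owned */
};

/*
 * Same arithmetic as background_thread(): CPU n owns pages
 * [n * nr_pages_per_cpu, (n + 1) * nr_pages_per_cpu), and mid_nr is
 * the midpoint rounded down.
 */
static struct cpu_range cpu_page_range(unsigned long cpu,
				       unsigned long nr_pages_per_cpu)
{
	struct cpu_range r;

	assert(nr_pages_per_cpu > 0);
	r.start_nr = cpu * nr_pages_per_cpu;
	r.end_nr = (cpu + 1) * nr_pages_per_cpu;
	r.mid_nr = (r.start_nr + r.end_nr) / 2;
	return r;
}
```

With an odd nr_pages_per_cpu the first half is the smaller one, since the midpoint rounds down.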
2020-04-06 20:06:32 -07:00
static int stress ( struct uffd_stats * uffd_stats )
2015-09-04 15:47:23 -07:00
{
unsigned long cpu ;
pthread_t locking_threads [ nr_cpus ] ;
pthread_t uffd_threads [ nr_cpus ] ;
pthread_t background_threads [ nr_cpus ] ;
finished = 0 ;
for ( cpu = 0 ; cpu < nr_cpus ; cpu + + ) {
if ( pthread_create ( & locking_threads [ cpu ] , & attr ,
locking_thread , ( void * ) cpu ) )
return 1 ;
if ( bounces & BOUNCE_POLL ) {
if ( pthread_create ( & uffd_threads [ cpu ] , & attr ,
2020-04-06 20:06:32 -07:00
uffd_poll_thread ,
( void * ) & uffd_stats [ cpu ] ) )
2015-09-04 15:47:23 -07:00
return 1 ;
} else {
if ( pthread_create ( & uffd_threads [ cpu ] , & attr ,
uffd_read_thread ,
2020-04-06 20:06:32 -07:00
( void * ) & uffd_stats [ cpu ] ) )
2015-09-04 15:47:23 -07:00
return 1 ;
pthread_mutex_lock ( & uffd_read_mutex ) ;
}
if ( pthread_create ( & background_threads [ cpu ] , & attr ,
background_thread , ( void * ) cpu ) )
return 1 ;
}
for ( cpu = 0 ; cpu < nr_cpus ; cpu + + )
if ( pthread_join ( background_threads [ cpu ] , NULL ) )
return 1 ;
/*
* Be strict and immediately zap area_src , the whole area has
* been transferred already by the background threads . The
* area_src could then be faulted in in a racy way by still
* running uffdio_threads reading zeropages after we zapped
* area_src ( but they ' re guaranteed to get - EEXIST from
* UFFDIO_COPY without writing zero pages into area_dst
* because the background threads already completed ) .
*/
2021-06-30 18:48:55 -07:00
uffd_test_ops - > release_pages ( area_src ) ;
2018-10-26 15:09:17 -07:00
finished = 1 ;
for ( cpu = 0 ; cpu < nr_cpus ; cpu + + )
if ( pthread_join ( locking_threads [ cpu ] , NULL ) )
return 1 ;
2015-09-04 15:47:23 -07:00
for ( cpu = 0 ; cpu < nr_cpus ; cpu + + ) {
char c ;
if ( bounces & BOUNCE_POLL ) {
2021-06-30 18:48:55 -07:00
if ( write ( pipefd [ cpu * 2 + 1 ] , & c , 1 ) ! = 1 )
err ( " pipefd write error " ) ;
2020-04-06 20:06:32 -07:00
if ( pthread_join ( uffd_threads [ cpu ] ,
( void * ) & uffd_stats [ cpu ] ) )
2015-09-04 15:47:23 -07:00
return 1 ;
} else {
if ( pthread_cancel ( uffd_threads [ cpu ] ) )
return 1 ;
if ( pthread_join ( uffd_threads [ cpu ] , NULL ) )
return 1 ;
}
}
return 0 ;
}
2017-09-06 16:23:43 -07:00
sigjmp_buf jbuf, *sigbuf;

static void sighndl(int sig, siginfo_t *siginfo, void *ptr)
{
	if (sig == SIGBUS) {
		if (sigbuf)
			siglongjmp(*sigbuf, 1);
		abort();
	}
}
2017-02-22 15:44:06 -08:00
/*
 * For the non-cooperative userfaultfd test we fork() a process that will
 * generate pagefaults, mremap the area monitored by the userfaultfd and,
 * finally, release the monitored area.
 * For anonymous and shared memory the area is divided into two
 * parts: the first part is accessed before mremap, and the second
 * part is accessed after mremap. Since hugetlbfs does not support
 * mremap, the entire monitored area is accessed in a single pass for
 * HUGETLB_TEST.
2017-02-24 14:56:08 -08:00
 * The release of the pages currently generates an event for shmem and
2017-02-24 14:56:02 -08:00
 * anonymous memory (UFFD_EVENT_REMOVE), hence it is not checked
2017-02-24 14:56:08 -08:00
 * for hugetlb.
2017-09-06 16:23:43 -07:00
 * For the signal test (UFFD_FEATURE_SIGBUS) with signal_test = 1, we
 * register the monitored area, generate pagefaults and test that the
 * signal is delivered, then use UFFDIO_COPY to allocate the missing page
 * and retry. With signal_test = 2 we test the robustness use case: we
 * release the monitored area, fork a process that will generate pagefaults
 * and verify that the signal is generated. This also tests the
 * UFFD_FEATURE_EVENT_FORK event along with the signal feature. Using the
 * monitor thread, we verify that no userfault events are generated.
2017-02-22 15:44:06 -08:00
 */
2017-09-06 16:23:43 -07:00
static int faulting_process(int signal_test)
2017-02-22 15:44:06 -08:00
{
	unsigned long nr;
	unsigned long long count;
2017-05-03 14:54:54 -07:00
	unsigned long split_nr_pages;
2017-09-06 16:23:43 -07:00
	unsigned long lastnr;
	struct sigaction act;
	unsigned long signalled = 0;
2017-02-22 15:44:06 -08:00
2017-05-03 14:54:54 -07:00
	if (test_type != TEST_HUGETLB)
		split_nr_pages = (nr_pages + 1) / 2;
	else
		split_nr_pages = nr_pages;
2017-02-22 15:44:06 -08:00
2017-09-06 16:23:43 -07:00
	if (signal_test) {
		sigbuf = &jbuf;
		memset(&act, 0, sizeof(act));
		act.sa_sigaction = sighndl;
		act.sa_flags = SA_SIGINFO;
2021-06-30 18:48:55 -07:00
		if (sigaction(SIGBUS, &act, 0))
			err("sigaction");
2017-09-06 16:23:43 -07:00
		lastnr = (unsigned long)-1;
	}
2017-02-22 15:44:06 -08:00
	for (nr = 0; nr < split_nr_pages; nr++) {
2020-04-06 20:06:36 -07:00
		int steps = 1;
		unsigned long offset = nr * page_size;
2017-09-06 16:23:43 -07:00
		if (signal_test) {
			if (sigsetjmp(*sigbuf, 1) != 0) {
2021-06-30 18:48:55 -07:00
				if (steps == 1 && nr == lastnr)
					err("Signal repeated");
2017-09-06 16:23:43 -07:00
				lastnr = nr;
				if (signal_test == 1) {
2020-04-06 20:06:36 -07:00
					if (steps == 1) {
						/* This is a MISSING request */
						steps++;
						if (copy_page(uffd, offset))
							signalled++;
					} else {
						/* This is a WP request */
						assert(steps == 2);
						wp_range(uffd,
							 (__u64)area_dst +
							 offset,
							 page_size, false);
					}
2017-09-06 16:23:43 -07:00
				} else {
					signalled++;
					continue;
				}
			}
		}
2017-02-22 15:44:06 -08:00
		count = *area_count(area_dst, nr);
2021-06-30 18:48:55 -07:00
		if (count != count_verify[nr])
			err("nr %lu memory corruption %llu %llu\n",
			    nr, count, count_verify[nr]);
2020-04-06 20:06:36 -07:00
		/*
2020-11-07 17:19:35 +08:00
		 * Trigger write protection, if any, by writing
2020-04-06 20:06:36 -07:00
		 * the same value back.
		 */
		*area_count(area_dst, nr) = count;
2017-02-22 15:44:06 -08:00
	}
2017-09-06 16:23:43 -07:00
	if (signal_test)
		return signalled != split_nr_pages;
2017-05-03 14:54:54 -07:00
	if (test_type == TEST_HUGETLB)
		return 0;
2017-02-22 15:44:06 -08:00
	area_dst = mremap(area_dst, nr_pages * page_size, nr_pages * page_size,
			  MREMAP_MAYMOVE | MREMAP_FIXED, area_src);
2021-06-30 18:48:55 -07:00
	if (area_dst == MAP_FAILED)
		err("mremap");
userfaultfd/selftests: reinitialize test context in each test
Currently, the context (fds, mmap-ed areas, etc.) are global. Each test
mutates this state in some way, in some cases really "clobbering it"
(e.g., the events test mremap-ing area_dst over the top of area_src, or
the minor faults tests overwriting the count_verify values in the test
areas). We run the tests in a particular order, each test is careful to
make the right assumptions about its starting state, etc.
But, this is fragile. It's better for a test's success or failure to not
depend on what some other prior test case did to the global state.
To that end, clear and reinitialize the test context at the start of each
test case, so whatever prior test cases did doesn't affect future tests.
This is particularly relevant to this series because the events test's
mremap of area_dst screws up assumptions the minor fault test was relying
on. This wasn't a problem for hugetlb, as we don't mremap in that case.
[peterx@redhat.com: fix conflict between this patch and the uffd pagemap series]
Link: https://lkml.kernel.org/r/YKQqKrl+/cQ1utrb@t490s
Link: https://lkml.kernel.org/r/20210503180737.2487560-10-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 18:49:41 -07:00
	/* Reset area_src since we just clobbered it */
	area_src = NULL;
2017-02-22 15:44:06 -08:00
	for (; nr < nr_pages; nr++) {
		count = *area_count(area_dst, nr);
		if (count != count_verify[nr]) {
2021-06-30 18:48:55 -07:00
			err("nr %lu memory corruption %llu %llu\n",
			    nr, count, count_verify[nr]);
2017-02-22 15:44:06 -08:00
		}
2020-04-06 20:06:36 -07:00
		/*
2020-11-07 17:19:35 +08:00
		 * Trigger write protection, if any, by writing
2020-04-06 20:06:36 -07:00
		 * the same value back.
		 */
		*area_count(area_dst, nr) = count;
2017-02-22 15:44:06 -08:00
	}
2021-06-30 18:48:55 -07:00
	uffd_test_ops->release_pages(area_dst);
2017-02-22 15:44:06 -08:00
2021-06-30 18:48:55 -07:00
	for (nr = 0; nr < nr_pages; nr++)
		if (my_bcmp(area_dst + nr * page_size, zeropage, page_size))
			err("nr %lu is not zero", nr);
2017-02-22 15:44:06 -08:00
	return 0;
}
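/*
 * Illustration only, not part of the selftest: a minimal sketch of what the
 * mremap() call with MREMAP_MAYMOVE | MREMAP_FIXED in faulting_process()
 * does to the two areas. It assumes plain anonymous private mappings and no
 * userfaultfd registration; demo_mremap_over() is a hypothetical name.
 */

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 0 if data written through 'dst' is visible at 'src' after the
 * move, 1 if it is not, and -1 if a mapping call failed.
 */
static int demo_mremap_over(size_t nr_pages)
{
	size_t len = nr_pages * (size_t)sysconf(_SC_PAGESIZE);
	char *src, *dst, *moved;
	int ok;

	src = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (src == MAP_FAILED)
		return -1;
	dst = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (dst == MAP_FAILED) {
		munmap(src, len);
		return -1;
	}
	dst[0] = 42;

	/* MREMAP_FIXED unmaps whatever was at 'src' and moves 'dst' there. */
	moved = mremap(dst, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, src);
	if (moved == MAP_FAILED) {
		munmap(src, len);
		munmap(dst, len);
		return -1;
	}

	/* The old 'dst' range is gone; the pages now live at 'src'. */
	ok = (moved == src && moved[0] == 42);
	munmap(moved, len);
	return ok ? 0 : 1;
}
```

/*
 * This is why the test resets area_src to NULL after the mremap(): the
 * original source mapping no longer exists once area_dst is moved on top
 * of it.
 */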
2017-09-06 16:23:46 -07:00
static void retry_uffdio_zeropage(int ufd,
				  struct uffdio_zeropage *uffdio_zeropage,
				  unsigned long offset)
{
	uffd_test_ops->alias_mapping(&uffdio_zeropage->range.start,
				     uffdio_zeropage->range.len,
				     offset);
	if (ioctl(ufd, UFFDIO_ZEROPAGE, uffdio_zeropage)) {
2021-06-30 18:48:55 -07:00
		if (uffdio_zeropage->zeropage != -EEXIST)
			err("UFFDIO_ZEROPAGE error: %" PRId64,
			    (int64_t)uffdio_zeropage->zeropage);
2017-09-06 16:23:46 -07:00
	} else {
2021-06-30 18:48:55 -07:00
		err("UFFDIO_ZEROPAGE error: %" PRId64,
		    (int64_t)uffdio_zeropage->zeropage);
2017-09-06 16:23:46 -07:00
	}
}
2017-10-13 15:57:54 -07:00
static int __uffdio_zeropage(int ufd, unsigned long offset, bool retry)
2017-02-22 15:44:10 -08:00
{
	struct uffdio_zeropage uffdio_zeropage;
	int ret;
userfaultfd/selftests: fix calculation of expected ioctls
Today, we assert that the ioctls the kernel reports as supported for a
registration match a precomputed list. We decide which ioctls are
supported by examining the memory type. Then, in several locations we
"fix up" this list by adding or removing things this initial decision
got wrong.
What ioctls the kernel reports is actually a function of several things:
- The memory type
- Kernel feature support (e.g., no writeprotect on aarch64)
- The registration type (e.g., CONTINUE only supported for MINOR mode)
So, we can't fully compute this at the start, in set_test_type. It
varies per test, depending on what registration mode(s) those tests use.
Instead, introduce a new function which computes the correct list. This
centralizes the add/remove of ioctls depending on these function inputs
in one place, so we don't have to repeat ourselves in various tests.
Not only is the resulting code a bit shorter, but it fixes a real bug in
the existing code: previously, we would incorrectly require the
writeprotect ioctl to be present on aarch64, where it isn't actually
supported.
Link: https://lkml.kernel.org/r/20210930212309.4001967-4-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-05 13:42:13 -07:00
	bool has_zeropage = get_expected_ioctls(0) & (1 << _UFFDIO_ZEROPAGE);
userfaultfd: selftests: make __{s,u}64 format specifiers portable
On certain platforms (powerpcle is the one on which I ran into this),
"%Ld" and "%Lu" are unsuitable for printing __s64 and __u64, respectively,
resulting in build warnings. Cast to {u,}int64_t, and use the PRI{d,u}64
macros defined in inttypes.h to print them. This ought to be portable to
all platforms.
Splitting this off into a separate macro lets us remove some lines, and
get rid of some (I would argue) stylistically odd cases where we joined
printf() and exit() into a single statement with a ,.
Finally, this also fixes a "missing braces around initializer" warning
when we initialize prms in wp_range().
[axelrasmussen@google.com: v2]
Link: https://lkml.kernel.org/r/20201203180244.1811601-1-axelrasmussen@google.com
Link: https://lkml.kernel.org/r/20201202211542.1121189-1-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Alan Gilbert <dgilbert@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-14 19:13:58 -08:00
	__s64 res;
2017-05-03 14:54:54 -07:00
2021-06-30 18:48:55 -07:00
	if (offset >= nr_pages * page_size)
		err("unexpected offset %lu", offset);
2017-02-22 15:44:10 -08:00
	uffdio_zeropage.range.start = (unsigned long)area_dst + offset;
	uffdio_zeropage.range.len = page_size;
	uffdio_zeropage.mode = 0;
	ret = ioctl(ufd, UFFDIO_ZEROPAGE, &uffdio_zeropage);
2020-12-14 19:13:58 -08:00
	res = uffdio_zeropage.zeropage;
2017-02-22 15:44:10 -08:00
	if (ret) {
		/* real retval in uffdio_zeropage.zeropage */
2021-06-30 18:48:55 -07:00
		if (has_zeropage)
			err("UFFDIO_ZEROPAGE error: %" PRId64, (int64_t)res);
		else if (res != -EINVAL)
			err("UFFDIO_ZEROPAGE not -EINVAL");
2017-02-22 15:44:10 -08:00
	} else if (has_zeropage) {
2020-12-14 19:13:58 -08:00
		if (res != page_size) {
2021-06-30 18:48:55 -07:00
			err("UFFDIO_ZEROPAGE unexpected size");
2017-09-06 16:23:46 -07:00
		} else {
2017-10-13 15:57:54 -07:00
			if (test_uffdio_zeropage_eexist && retry) {
2017-09-06 16:23:46 -07:00
				test_uffdio_zeropage_eexist = false;
				retry_uffdio_zeropage(ufd, &uffdio_zeropage,
						      offset);
			}
2017-02-22 15:44:10 -08:00
			return 1;
2017-09-06 16:23:46 -07:00
		}
2020-12-14 19:13:58 -08:00
	} else
2021-06-30 18:48:55 -07:00
		err("UFFDIO_ZEROPAGE succeeded");
2017-02-22 15:44:10 -08:00
	return 0;
}
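/*
 * Illustration only, not part of the selftest: a minimal sketch of how an
 * expected-ioctl bitmask like the one has_zeropage is derived from could be
 * computed from the three inputs named in the commit message (memory type,
 * kernel feature support, registration mode). All DEMO_* names are
 * hypothetical stand-ins, and the rule that zeropage is only offered for
 * anonymous memory is an assumption made for the example, not something
 * taken from get_expected_ioctls().
 */

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the _UFFDIO_* bit numbers. */
enum demo_ioctl {
	DEMO_COPY,
	DEMO_WAKE,
	DEMO_ZEROPAGE,
	DEMO_WRITEPROTECT,
	DEMO_CONTINUE,
};

enum demo_mem { DEMO_MEM_ANON, DEMO_MEM_SHMEM, DEMO_MEM_HUGETLB };

#define DEMO_MODE_MISSING	(1u << 0)
#define DEMO_MODE_WP		(1u << 1)
#define DEMO_MODE_MINOR		(1u << 2)
#define DEMO_BIT(nr)		((uint64_t)1 << (nr))

static uint64_t demo_expected_ioctls(enum demo_mem mem, unsigned int mode,
				     bool kernel_has_wp)
{
	uint64_t ioctls = DEMO_BIT(DEMO_COPY) | DEMO_BIT(DEMO_WAKE);

	/* Memory type (assumed rule: zeropage only for anonymous memory). */
	if (mem == DEMO_MEM_ANON)
		ioctls |= DEMO_BIT(DEMO_ZEROPAGE);

	/* Kernel feature support combined with the registration mode. */
	if (kernel_has_wp && (mode & DEMO_MODE_WP))
		ioctls |= DEMO_BIT(DEMO_WRITEPROTECT);

	/* Registration mode: CONTINUE is only expected for MINOR mode. */
	if (mode & DEMO_MODE_MINOR)
		ioctls |= DEMO_BIT(DEMO_CONTINUE);

	return ioctls;
}
```

/*
 * A caller would test a single bit the same way has_zeropage does, for
 * example demo_expected_ioctls(mem, mode, wp) & DEMO_BIT(DEMO_ZEROPAGE).
 */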
2017-10-13 15:57:54 -07:00
static int uffdio_zeropage(int ufd, unsigned long offset)
{
	return __uffdio_zeropage(ufd, offset, false);
}
2017-02-22 15:44:10 -08:00
/* exercise UFFDIO_ZEROPAGE */
static int userfaultfd_zeropage_test(void)
{
	struct uffdio_register uffdio_register;

	printf("testing UFFDIO_ZEROPAGE: ");
	fflush(stdout);
2021-06-30 18:49:41 -07:00
	uffd_test_ctx_init(0);
2017-02-22 15:44:10 -08:00
	uffdio_register.range.start = (unsigned long)area_dst;
	uffdio_register.range.len = nr_pages * page_size;
	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
2020-04-06 20:06:36 -07:00
	if (test_uffdio_wp)
		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
2021-06-30 18:48:55 -07:00
	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
		err("register failure");
2017-02-22 15:44:10 -08:00
2021-11-05 13:42:13 -07:00
	assert_expected_ioctls_present(
		uffdio_register.mode, uffdio_register.ioctls);
2017-02-22 15:44:10 -08:00
2021-06-30 18:48:55 -07:00
	if (uffdio_zeropage(uffd, 0))
		if (my_bcmp(area_dst, zeropage, page_size))
			err("zeropage is not zero");
2017-02-22 15:44:10 -08:00
	printf("done.\n");
	return 0;
}
2017-02-22 15:44:06 -08:00
static int userfaultfd_events_test(void)
{
	struct uffdio_register uffdio_register;
	pthread_t uffd_mon;
	int err, features;
	pid_t pid;
	char c;
2020-04-06 20:06:32 -07:00
	struct uffd_stats stats = { 0 };
2017-02-22 15:44:06 -08:00
2017-02-24 14:56:02 -08:00
	printf("testing events (fork, remap, remove): ");
2017-02-22 15:44:06 -08:00
	fflush(stdout);
	features = UFFD_FEATURE_EVENT_FORK | UFFD_FEATURE_EVENT_REMAP |
2017-02-24 14:56:02 -08:00
		UFFD_FEATURE_EVENT_REMOVE;
2021-06-30 18:49:41 -07:00
	uffd_test_ctx_init(features);
2017-02-22 15:44:06 -08:00
	fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
	uffdio_register.range.start = (unsigned long)area_dst;
	uffdio_register.range.len = nr_pages * page_size;
	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
2020-04-06 20:06:36 -07:00
	if (test_uffdio_wp)
		uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
2021-06-30 18:48:55 -07:00
	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
		err("register failure");
2017-02-22 15:44:06 -08:00
2021-11-05 13:42:13 -07:00
	assert_expected_ioctls_present(
		uffdio_register.mode, uffdio_register.ioctls);
2017-02-22 15:44:06 -08:00
2021-06-30 18:48:55 -07:00
	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
		err("uffd_poll_thread create");
2017-02-22 15:44:06 -08:00
	pid = fork();
2021-06-30 18:48:55 -07:00
	if (pid < 0)
		err("fork");
2017-02-22 15:44:06 -08:00
	if (!pid)
userfaultfd/selftests: add test exercising minor fault handling
Fix a dormant bug in userfaultfd_events_test(), where we did `return
faulting_process(0)` instead of `exit(faulting_process(0))`. This
caused the forked process to keep running, trying to execute any further
test cases after the events test in parallel with the "real" process.
Add a simple test case which exercises minor faults. In short, it does
the following:
1. "Sets up" an area (area_dst) and a second shared mapping to the same
underlying pages (area_dst_alias).
2. Register one of these areas with userfaultfd, in minor fault mode.
3. Start a second thread to handle any minor faults.
4. Populate the underlying pages with the non-UFFD-registered side of
the mapping. Basically, memset() each page with some arbitrary
contents.
5. Then, using the UFFD-registered mapping, read all of the page
contents, asserting that the contents match expectations (we expect
the minor fault handling thread can modify the page contents before
resolving the fault).
The minor fault handling thread, upon receiving an event, flips all the
bits (~) in that page, just to prove that it can modify it in some
arbitrary way. Then it issues a UFFDIO_CONTINUE ioctl, to setup the
mapping and resolve the fault. The reading thread should wake up and
see this modification.
Currently the minor fault test is only enabled in hugetlb_shared mode,
as this is the only configuration the kernel feature supports.
Link: https://lkml.kernel.org/r/20210301222728.176417-7-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Adam Ruprecht <ruprecht@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Cannon Matthews <cannonmatthews@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michal Koutn" <mkoutny@suse.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shawn Anastasio <shawn@anastas.io>
Cc: Steven Price <steven.price@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-04 18:35:57 -07:00
		exit(faulting_process(0));
2017-02-22 15:44:06 -08:00
	waitpid(pid, &err, 0);
2021-06-30 18:48:55 -07:00
	if (err)
		err("faulting process failed");
	if (write(pipefd[1], &c, sizeof(c)) != sizeof(c))
		err("pipe write");
2020-04-06 20:06:32 -07:00
	if (pthread_join(uffd_mon, NULL))
2017-02-22 15:44:06 -08:00
		return 1;
2020-04-06 20:06:36 -07:00
	uffd_stats_report(&stats, 1);
2017-02-22 15:44:06 -08:00
2020-04-06 20:06:32 -07:00
	return stats.missing_faults != nr_pages;
2017-02-22 15:44:06 -08:00
}
2017-09-06 16:23:43 -07:00
static int userfaultfd_sig_test(void)
{
	struct uffdio_register uffdio_register;
	unsigned long userfaults;
	pthread_t uffd_mon;
	int err, features;
	pid_t pid;
	char c;
2020-04-06 20:06:32 -07:00
	struct uffd_stats stats = { 0 };
2017-09-06 16:23:43 -07:00
	printf("testing signal delivery: ");
	fflush(stdout);
	features = UFFD_FEATURE_EVENT_FORK | UFFD_FEATURE_SIGBUS;
2021-06-30 18:49:41 -07:00
uffd_test_ctx_init ( features ) ;
2017-09-06 16:23:43 -07:00
fcntl ( uffd , F_SETFL , uffd_flags | O_NONBLOCK ) ;
uffdio_register . range . start = ( unsigned long ) area_dst ;
uffdio_register . range . len = nr_pages * page_size ;
uffdio_register . mode = UFFDIO_REGISTER_MODE_MISSING ;
2020-04-06 20:06:36 -07:00
if ( test_uffdio_wp )
uffdio_register . mode | = UFFDIO_REGISTER_MODE_WP ;
2021-06-30 18:48:55 -07:00
if ( ioctl ( uffd , UFFDIO_REGISTER , & uffdio_register ) )
err ( " register failure " ) ;
2017-09-06 16:23:43 -07:00
userfaultfd/selftests: fix calculation of expected ioctls
Today, we assert that the ioctls the kernel reports as supported for a
registration match a precomputed list. We decide which ioctls are
supported by examining the memory type. Then, in several locations we
"fix up" this list by adding or removing things this initial decision
got wrong.
What ioctls the kernel reports is actually a function of several things:
- The memory type
- Kernel feature support (e.g., no writeprotect on aarch64)
- The registration type (e.g., CONTINUE only supported for MINOR mode)
So, we can't fully compute this at the start, in set_test_type. It
varies per test, depending on what registration mode(s) those tests use.
Instead, introduce a new function which computes the correct list. This
centralizes the add/remove of ioctls depending on these function inputs
in one place, so we don't have to repeat ourselves in various tests.
Not only is the resulting code a bit shorter, but it fixes a real bug in
the existing code: previously, we would incorrectly require the
writeprotect ioctl to be present on aarch64, where it isn't actually
supported.
Link: https://lkml.kernel.org/r/20210930212309.4001967-4-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
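The idea described above (derive the expected ioctl set from the memory type, kernel feature support, and registration mode together) can be sketched roughly as follows. This is a minimal illustration, not the selftest's actual helper: the sk_* names, the bit numbers, and the mode flags are hypothetical stand-ins for the real kernel constants, and the rule that hugetlb lacks ZEROPAGE is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's ioctl bit numbers (illustrative only). */
enum sk_ioctl {
	SK_WAKE = 0,
	SK_COPY,
	SK_ZEROPAGE,
	SK_WRITEPROTECT,
	SK_CONTINUE,
};

/* Hypothetical registration-mode flags (not the real UFFDIO_REGISTER_MODE_* values). */
#define SK_MODE_MISSING (1u << 0)
#define SK_MODE_WP      (1u << 1)
#define SK_MODE_MINOR   (1u << 2)

/* The inputs the commit message lists besides the registration mode. */
struct sk_env {
	bool hugetlb;          /* memory type (assumed: no ZEROPAGE on hugetlb) */
	bool kernel_has_wp;    /* kernel feature support, e.g. false on aarch64 */
	bool kernel_has_minor; /* kernel feature support for minor faults */
};

/* Compute the ioctls we expect the kernel to report for one registration. */
static uint64_t sk_expected_ioctls(const struct sk_env *env, unsigned int mode)
{
	uint64_t ioctls = (1ULL << SK_WAKE) | (1ULL << SK_COPY) |
			  (1ULL << SK_ZEROPAGE) | (1ULL << SK_WRITEPROTECT) |
			  (1ULL << SK_CONTINUE);

	if (env->hugetlb)
		ioctls &= ~(1ULL << SK_ZEROPAGE);
	/* WRITEPROTECT only if we registered in WP mode and the kernel has it. */
	if (!((mode & SK_MODE_WP) && env->kernel_has_wp))
		ioctls &= ~(1ULL << SK_WRITEPROTECT);
	/* CONTINUE only if we registered in MINOR mode and the kernel has it. */
	if (!((mode & SK_MODE_MINOR) && env->kernel_has_minor))
		ioctls &= ~(1ULL << SK_CONTINUE);

	return ioctls;
}

/* True if every expected ioctl bit is present in what the kernel reported. */
static bool sk_ioctls_present(uint64_t expected, uint64_t reported)
{
	return (reported & expected) == expected;
}
```

With this shape, a test on a kernel without writeprotect support no longer demands the writeprotect bit even when it registers in WP mode, which is the aarch64 case the message mentions.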
2021-11-05 13:42:13 -07:00
assert_expected_ioctls_present (
uffdio_register . mode , uffdio_register . ioctls ) ;
2017-09-06 16:23:43 -07:00
2021-06-30 18:48:55 -07:00
if ( faulting_process ( 1 ) )
err ( " faulting process failed " ) ;
2017-09-06 16:23:43 -07:00
2021-06-30 18:48:55 -07:00
uffd_test_ops - > release_pages ( area_dst ) ;
2017-09-06 16:23:43 -07:00
2021-06-30 18:48:55 -07:00
if ( pthread_create ( & uffd_mon , & attr , uffd_poll_thread , & stats ) )
err ( " uffd_poll_thread create " ) ;
2017-09-06 16:23:43 -07:00
pid = fork ( ) ;
2021-06-30 18:48:55 -07:00
if ( pid < 0 )
err ( " fork " ) ;
2017-09-06 16:23:43 -07:00
if ( ! pid )
exit ( faulting_process ( 2 ) ) ;
waitpid ( pid , & err , 0 ) ;
2021-06-30 18:48:55 -07:00
if ( err )
err ( " faulting process failed " ) ;
if ( write ( pipefd [ 1 ] , & c , sizeof ( c ) ) ! = sizeof ( c ) )
err ( " pipe write " ) ;
2017-09-06 16:23:43 -07:00
if ( pthread_join ( uffd_mon , ( void * * ) & userfaults ) )
return 1 ;
printf ( " done. \n " ) ;
2017-09-06 16:23:49 -07:00
if ( userfaults )
2021-06-30 18:48:55 -07:00
err ( " Signal test failed, userfaults: %ld " , userfaults ) ;
2017-09-06 16:23:43 -07:00
return userfaults ! = 0 ;
}
2020-04-06 20:06:32 -07:00
userfaultfd/selftests: add test exercising minor fault handling
Fix a dormant bug in userfaultfd_events_test(), where we did `return
faulting_process(0)` instead of `exit(faulting_process(0))`. This
caused the forked process to keep running, trying to execute any further
test cases after the events test in parallel with the "real" process.
Add a simple test case which exercises minor faults. In short, it does
the following:
1. "Sets up" an area (area_dst) and a second shared mapping to the same
underlying pages (area_dst_alias).
2. Register one of these areas with userfaultfd, in minor fault mode.
3. Start a second thread to handle any minor faults.
4. Populate the underlying pages with the non-UFFD-registered side of
the mapping. Basically, memset() each page with some arbitrary
contents.
5. Then, using the UFFD-registered mapping, read all of the page
contents, asserting that the contents match expectations (we expect
the minor fault handling thread can modify the page contents before
resolving the fault).
The minor fault handling thread, upon receiving an event, flips all the
bits (~) in that page, just to prove that it can modify it in some
arbitrary way. Then it issues a UFFDIO_CONTINUE ioctl, to set up the
mapping and resolve the fault. The reading thread should wake up and
see this modification.
Currently the minor fault test is only enabled in hugetlb_shared mode,
as this is the only configuration the kernel feature supports.
Link: https://lkml.kernel.org/r/20210301222728.176417-7-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Adam Ruprecht <ruprecht@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Cannon Matthews <cannonmatthews@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shawn Anastasio <shawn@anastas.io>
Cc: Steven Price <steven.price@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
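The populate / flip / verify arithmetic from steps 4 and 5 can be checked in isolation, without any userfaultfd involvement. The sketch below is only a plain-memory simulation: fill_byte(), flip_page() and page_matches_expected() are hypothetical helper names, and flip_page() merely stands in for what the fault-handling thread is described as doing before it issues UFFDIO_CONTINUE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill pattern for page index p: p % 255, matching p % ((uint8_t)-1). */
static uint8_t fill_byte(unsigned long p)
{
	return (uint8_t)(p % UINT8_MAX);
}

/* Stand-in for the minor fault handler: flip every bit in the page. */
static void flip_page(uint8_t *page, size_t page_size)
{
	for (size_t i = 0; i < page_size; i++)
		page[i] = (uint8_t)~page[i];
}

/* Returns 1 if the page holds the bit-flipped fill pattern for index p. */
static int page_matches_expected(const uint8_t *page, size_t page_size,
				 unsigned long p)
{
	uint8_t expected = (uint8_t)~fill_byte(p);

	for (size_t i = 0; i < page_size; i++) {
		if (page[i] != expected)
			return 0;
	}
	return 1;
}
```

For example, page 300 is filled with 300 % 255 = 45, so after the flip the reader should see 210 in every byte. Because the modulus is 255 rather than 256, the fill value never reaches 255 and the flipped value is never 0.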
2021-05-04 18:35:57 -07:00
static int userfaultfd_minor_test ( void )
{
struct uffdio_register uffdio_register ;
unsigned long p ;
pthread_t uffd_mon ;
uint8_t expected_byte ;
void * expected_page ;
char c ;
struct uffd_stats stats = { 0 } ;
if ( ! test_uffdio_minor )
return 0 ;
printf ( " testing minor faults: " ) ;
fflush ( stdout ) ;
userfaultfd/selftests: fix feature support detection
Before any tests are run, in set_test_type, we decide what feature(s) we
are going to be testing, based upon our command line arguments.
However, the supported features are not just a function of the memory
type being used, so this is broken.
For instance, consider writeprotect support. It is "normally" supported
for anonymous memory, but furthermore it requires that the kernel has
CONFIG_HAVE_ARCH_USERFAULTFD_WP. So, it is *not* supported at all on
aarch64, for example.
So, fix this by querying the kernel for the set of features it
supports in set_test_type, by opening a userfaultfd and issuing a
UFFDIO_API ioctl. Based upon the reported features, we toggle which
tests are enabled.
Link: https://lkml.kernel.org/r/20210930212309.4001967-3-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-05 13:42:10 -07:00
uffd_test_ctx_init ( uffd_minor_feature ( ) ) ;
2021-05-04 18:35:57 -07:00
uffdio_register . range . start = ( unsigned long ) area_dst_alias ;
uffdio_register . range . len = nr_pages * page_size ;
uffdio_register . mode = UFFDIO_REGISTER_MODE_MINOR ;
2021-06-30 18:48:55 -07:00
if ( ioctl ( uffd , UFFDIO_REGISTER , & uffdio_register ) )
err ( " register failure " ) ;
2021-05-04 18:35:57 -07:00
2021-11-05 13:42:13 -07:00
assert_expected_ioctls_present (
uffdio_register . mode , uffdio_register . ioctls ) ;
2021-05-04 18:35:57 -07:00
/*
* After registering with UFFD , populate the non - UFFD - registered side of
* the shared mapping . This should * not * trigger any UFFD minor faults .
*/
for ( p = 0 ; p < nr_pages ; + + p ) {
memset ( area_dst + ( p * page_size ) , p % ( ( uint8_t ) - 1 ) ,
page_size ) ;
}
2021-06-30 18:48:55 -07:00
if ( pthread_create ( & uffd_mon , & attr , uffd_poll_thread , & stats ) )
err ( " uffd_poll_thread create " ) ;
2021-05-04 18:35:57 -07:00
/*
* Read each of the pages back using the UFFD - registered mapping . We
* expect that the first time we touch a page , it will result in a minor
* fault . uffd_poll_thread will resolve the fault by bit - flipping the
* page ' s contents , and then issuing a CONTINUE ioctl .
*/
2021-06-30 18:48:55 -07:00
if ( posix_memalign ( & expected_page , page_size , page_size ) )
err ( " out of memory " ) ;
2021-05-04 18:35:57 -07:00
for ( p = 0 ; p < nr_pages ; + + p ) {
expected_byte = ~ ( ( uint8_t ) ( p % ( ( uint8_t ) - 1 ) ) ) ;
memset ( expected_page , expected_byte , page_size ) ;
if ( my_bcmp ( expected_page , area_dst_alias + ( p * page_size ) ,
2021-06-30 18:48:55 -07:00
page_size ) )
err ( " unexpected page contents after minor fault " ) ;
2021-05-04 18:35:57 -07:00
}
2021-06-30 18:48:55 -07:00
if ( write ( pipefd [ 1 ] , & c , sizeof ( c ) ) ! = sizeof ( c ) )
err ( " pipe write " ) ;
2021-05-04 18:35:57 -07:00
if ( pthread_join ( uffd_mon , NULL ) )
return 1 ;
uffd_stats_report ( & stats , 1 ) ;
return stats . missing_faults ! = 0 | | stats . minor_faults ! = nr_pages ;
}
2021-06-30 18:49:13 -07:00
# define BIT_ULL(nr) (1ULL << (nr))
# define PM_SOFT_DIRTY BIT_ULL(55)
# define PM_MMAP_EXCLUSIVE BIT_ULL(56)
# define PM_UFFD_WP BIT_ULL(57)
# define PM_FILE BIT_ULL(61)
# define PM_SWAP BIT_ULL(62)
# define PM_PRESENT BIT_ULL(63)
static int pagemap_open ( void )
{
int fd = open ( " /proc/self/pagemap " , O_RDONLY ) ;
if ( fd < 0 )
err ( " open pagemap " ) ;
return fd ;
}
static uint64_t pagemap_read_vaddr ( int fd , void * vaddr )
{
uint64_t value ;
int ret ;
ret = pread ( fd , & value , sizeof ( uint64_t ) ,
( ( uint64_t ) vaddr > > 12 ) * sizeof ( uint64_t ) ) ;
if ( ret ! = sizeof ( uint64_t ) )
err ( " pread() on pagemap failed " ) ;
return value ;
}
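The lookup above relies on /proc/self/pagemap holding one 64-bit entry per virtual page, so the entry for an address sits at (address / page size) * 8 bytes. The hard-coded shift by 12 assumes 4 KiB base pages. A minimal sketch of the same arithmetic with the page size passed in, plus a flag check, might look like this; `pagemap_offset` and `pagemap_flag_set` are hypothetical names for illustration, not helpers from this test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte offset of the pagemap entry that describes vaddr. */
static uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
	assert(page_size != 0);
	return (vaddr / page_size) * sizeof(uint64_t);
}

/* True if the given PM_* bit (e.g. 1ULL << 57 for uffd-wp) is set. */
static bool pagemap_flag_set(uint64_t entry, uint64_t flag)
{
	return (entry & flag) != 0;
}
```

With 4 KiB pages, address 0x7000 is page 7, so its entry starts at byte offset 56.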
/* Implemented as a macro so that __LINE__ in err() refers to the caller's line */
# define pagemap_check_wp(value, wp) do { \
if ( ! ! ( value & PM_UFFD_WP ) ! = wp ) \
err ( " pagemap uffd-wp bit error: 0x% " PRIx64 , value ) ; \
} while ( 0 )
static int pagemap_test_fork ( bool present )
{
pid_t child = fork ( ) ;
uint64_t value ;
int fd , result ;
if ( ! child ) {
/* Open the pagemap fd of the child itself */
fd = pagemap_open ( ) ;
value = pagemap_read_vaddr ( fd , area_dst ) ;
/*
* After fork ( ) uffd - wp bit should be gone as long as we ' re
* without UFFD_FEATURE_EVENT_FORK
*/
pagemap_check_wp ( value , false ) ;
/* Succeed */
exit ( 0 ) ;
}
waitpid ( child , & result , 0 ) ;
return result ;
}
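pagemap_test_fork() hands back the raw waitpid() status, which is sufficient for the zero/nonzero check its callers perform. Where the child's actual exit code matters, the status can be decoded with WIFEXITED/WEXITSTATUS. The following is a sketch only; `run_in_child` and `exit_with` are hypothetical names, and the child leaves through _exit() so it never continues into the parent's remaining code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run fn(arg) in a forked child and return its exit code, or -1 if
 * fork/waitpid failed or the child was terminated by a signal.
 */
static int run_in_child(int (*fn)(void *), void *arg)
{
	int status;
	pid_t child;

	assert(fn != NULL);
	child = fork();
	if (child < 0)
		return -1;
	if (child == 0)
		_exit(fn(arg));	/* child stops here */
	if (waitpid(child, &status, 0) != child)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Example callback: exit with the integer that arg points to. */
static int exit_with(void *arg)
{
	return *(int *)arg;
}
```

Only the low 8 bits of the callback's return value survive as the exit code, so callbacks should stay within 0..255.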
static void userfaultfd_pagemap_test ( unsigned int test_pgsize )
{
struct uffdio_register uffdio_register ;
int pagemap_fd ;
uint64_t value ;
/* Pagemap tests uffd-wp only */
if ( ! test_uffdio_wp )
return ;
/* Not enough memory to test this page size */
if ( test_pgsize > nr_pages * page_size )
return ;
printf ( " testing uffd-wp with pagemap (pgsize=%u): " , test_pgsize ) ;
/* Flush so it doesn't flush twice in parent/child later */
fflush ( stdout ) ;
userfaultfd/selftests: reinitialize test context in each test
Currently, the context (fds, mmap-ed areas, etc.) are global. Each test
mutates this state in some way, in some cases really "clobbering it"
(e.g., the events test mremap-ing area_dst over the top of area_src, or
the minor faults tests overwriting the count_verify values in the test
areas). We run the tests in a particular order, each test is careful to
make the right assumptions about its starting state, etc.
But, this is fragile. It's better for a test's success or failure to not
depend on what some other prior test case did to the global state.
To that end, clear and reinitialize the test context at the start of each
test case, so whatever prior test cases did doesn't affect future tests.
This is particularly relevant to this series because the events test's
mremap of area_dst screws up assumptions the minor fault test was relying
on. This wasn't a problem for hugetlb, as we don't mremap in that case.
[peterx@redhat.com: fix conflict between this patch and the uffd pagemap series]
Link: https://lkml.kernel.org/r/YKQqKrl+/cQ1utrb@t490s
Link: https://lkml.kernel.org/r/20210503180737.2487560-10-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 18:49:41 -07:00
uffd_test_ctx_init ( 0 ) ;
2021-06-30 18:49:13 -07:00
if ( test_pgsize > page_size ) {
/* This is a thp test */
if ( madvise ( area_dst , nr_pages * page_size , MADV_HUGEPAGE ) )
err ( " madvise(MADV_HUGEPAGE) failed " ) ;
} else if ( test_pgsize = = page_size ) {
/* This is normal page test; force no thp */
if ( madvise ( area_dst , nr_pages * page_size , MADV_NOHUGEPAGE ) )
err ( " madvise(MADV_NOHUGEPAGE) failed " ) ;
}
uffdio_register . range . start = ( unsigned long ) area_dst ;
uffdio_register . range . len = nr_pages * page_size ;
uffdio_register . mode = UFFDIO_REGISTER_MODE_WP ;
if ( ioctl ( uffd , UFFDIO_REGISTER , & uffdio_register ) )
err ( " register failed " ) ;
pagemap_fd = pagemap_open ( ) ;
/* Touch the page */
* area_dst = 1 ;
wp_range ( uffd , ( uint64_t ) area_dst , test_pgsize , true ) ;
value = pagemap_read_vaddr ( pagemap_fd , area_dst ) ;
pagemap_check_wp ( value , true ) ;
/* Make sure the uffd-wp bit is dropped on fork */
if ( pagemap_test_fork ( true ) )
err ( " Detected stale uffd-wp bit in child " ) ;
/* Exclusive required or PAGEOUT won't work */
if ( ! ( value & PM_MMAP_EXCLUSIVE ) )
err ( " multiple mapping detected: 0x% " PRIx64 , value ) ;
if ( madvise ( area_dst , test_pgsize , MADV_PAGEOUT ) )
err ( " madvise(MADV_PAGEOUT) failed " ) ;
/* Uffd-wp should persist even when the page is swapped out */
value = pagemap_read_vaddr ( pagemap_fd , area_dst ) ;
pagemap_check_wp ( value , true ) ;
/* Make sure the uffd-wp bit is dropped on fork */
if ( pagemap_test_fork ( false ) )
err ( " Detected stale uffd-wp bit in child " ) ;
/* Unprotect; this tests swap pte modifications */
wp_range ( uffd , ( uint64_t ) area_dst , page_size , false ) ;
value = pagemap_read_vaddr ( pagemap_fd , area_dst ) ;
pagemap_check_wp ( value , false ) ;
/* Fault in the page from disk */
* area_dst = 2 ;
value = pagemap_read_vaddr ( pagemap_fd , area_dst ) ;
pagemap_check_wp ( value , false ) ;
close ( pagemap_fd ) ;
printf ( " done \n " ) ;
}
2017-02-22 15:44:01 -08:00
static int userfaultfd_stress ( void )
{
void * area ;
2022-02-03 20:49:45 -08:00
char * tmp_area ;
2017-02-22 15:44:01 -08:00
unsigned long nr ;
struct uffdio_register uffdio_register ;
2020-04-06 20:06:32 -07:00
struct uffd_stats uffd_stats [ nr_cpus ] ;
2017-02-22 15:44:01 -08:00
2021-06-30 18:49:41 -07:00
uffd_test_ctx_init ( 0 ) ;
2015-09-04 15:47:23 -07:00
2021-06-30 18:48:55 -07:00
if ( posix_memalign ( & area , page_size , page_size ) )
err ( " out of memory " ) ;
2015-09-04 15:47:23 -07:00
zeropage = area ;
bzero ( zeropage , page_size ) ;
pthread_mutex_lock ( & uffd_read_mutex ) ;
pthread_attr_init ( & attr ) ;
pthread_attr_setstacksize ( & attr , 16 * 1024 * 1024 ) ;
while ( bounces - - ) {
printf ( " bounces: %d, mode: " , bounces ) ;
if ( bounces & BOUNCE_RANDOM )
printf ( " rnd " ) ;
if ( bounces & BOUNCE_RACINGFAULTS )
printf ( " racing " ) ;
if ( bounces & BOUNCE_VERIFY )
printf ( " ver " ) ;
if ( bounces & BOUNCE_POLL )
printf ( " poll " ) ;
2020-12-14 19:14:02 -08:00
else
printf ( " read " ) ;
2015-09-04 15:47:23 -07:00
printf ( " , " ) ;
fflush ( stdout ) ;
if ( bounces & BOUNCE_POLL )
fcntl ( uffd , F_SETFL , uffd_flags | O_NONBLOCK ) ;
else
fcntl ( uffd , F_SETFL , uffd_flags & ~ O_NONBLOCK ) ;
/* register */
uffdio_register . range . start = ( unsigned long ) area_dst ;
uffdio_register . range . len = nr_pages * page_size ;
uffdio_register . mode = UFFDIO_REGISTER_MODE_MISSING ;
2020-04-06 20:06:36 -07:00
if ( test_uffdio_wp )
uffdio_register . mode | = UFFDIO_REGISTER_MODE_WP ;
2021-06-30 18:48:55 -07:00
if ( ioctl ( uffd , UFFDIO_REGISTER , & uffdio_register ) )
err ( " register failure " ) ;
userfaultfd/selftests: fix calculation of expected ioctls
Today, we assert that the ioctls the kernel reports as supported for a
registration match a precomputed list. We decide which ioctls are
supported by examining the memory type. Then, in several locations we
"fix up" this list by adding or removing things this initial decision
got wrong.
What ioctls the kernel reports is actually a function of several things:
- The memory type
- Kernel feature support (e.g., no writeprotect on aarch64)
- The registration type (e.g., CONTINUE only supported for MINOR mode)
So, we can't fully compute this at the start, in set_test_type. It
varies per test, depending on what registration mode(s) those tests use.
Instead, introduce a new function which computes the correct list. This
centralizes the add/remove of ioctls depending on these function inputs
in one place, so we don't have to repeat ourselves in various tests.
Not only is the resulting code a bit shorter, but it fixes a real bug in
the existing code: previously, we would incorrectly require the
writeprotect ioctl to be present on aarch64, where it isn't actually
supported.
Link: https://lkml.kernel.org/r/20210930212309.4001967-4-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-05 13:42:13 -07:00
assert_expected_ioctls_present (
uffdio_register . mode , uffdio_register . ioctls ) ;
2015-09-04 15:47:23 -07:00
2017-09-06 16:23:46 -07:00
if ( area_dst_alias ) {
uffdio_register . range . start = ( unsigned long )
area_dst_alias ;
2021-06-30 18:48:55 -07:00
if ( ioctl ( uffd , UFFDIO_REGISTER , & uffdio_register ) )
err ( " register failure alias " ) ;
2017-09-06 16:23:46 -07:00
}
2015-09-04 15:47:23 -07:00
/*
* The madvise done previously isn ' t enough : some
* uffd_thread could have read userfaults ( one of
* those already resolved by the background thread )
* and it may be in the process of calling
* UFFDIO_COPY . UFFDIO_COPY will read the zapped
* area_src and it would map a zero page in it ( of
* course such a UFFDIO_COPY is perfectly safe as it ' d
* return - EEXIST ) . The problem comes at the next
* bounce though : that racing UFFDIO_COPY would
* generate zeropages in the area_src , so invalidating
* the previous MADV_DONTNEED . Without this additional
* MADV_DONTNEED those zeropages leftovers in the
* area_src would lead to - EEXIST failure during the
* next bounce , effectively leaving a zeropage in the
* area_dst .
*
* Try commenting out this madvise to see the memory
* corruption being caught pretty quickly .
*
* khugepaged is also inhibited from collapsing THP only
* after UFFDIO_REGISTER , so the MADV_DONTNEED has to be
* issued here , after registration .
*/
2021-06-30 18:48:55 -07:00
uffd_test_ops - > release_pages ( area_dst ) ;
2015-09-04 15:47:23 -07:00
2020-04-06 20:06:32 -07:00
uffd_stats_reset ( uffd_stats , nr_cpus ) ;
2015-09-04 15:47:23 -07:00
/* bounce pass */
2020-04-06 20:06:32 -07:00
if ( stress ( uffd_stats ) )
2015-09-04 15:47:23 -07:00
return 1 ;
2020-04-06 20:06:36 -07:00
/* Clear all the write protections if there are any */
if ( test_uffdio_wp )
wp_range ( uffd , ( unsigned long ) area_dst ,
nr_pages * page_size , false ) ;
2015-09-04 15:47:23 -07:00
/* unregister */
2021-06-30 18:48:55 -07:00
if ( ioctl ( uffd , UFFDIO_UNREGISTER , & uffdio_register . range ) )
err ( " unregister failure " ) ;
2017-09-06 16:23:46 -07:00
if ( area_dst_alias ) {
uffdio_register . range . start = ( unsigned long ) area_dst ;
if ( ioctl ( uffd , UFFDIO_UNREGISTER ,
2021-06-30 18:48:55 -07:00
& uffdio_register . range ) )
err ( " unregister failure alias " ) ;
2017-09-06 16:23:46 -07:00
}
2015-09-04 15:47:23 -07:00
/* verification */
2021-06-30 18:48:55 -07:00
if ( bounces & BOUNCE_VERIFY )
for ( nr = 0 ; nr < nr_pages ; nr + + )
if ( * area_count ( area_dst , nr ) ! = count_verify [ nr ] )
err ( " error area_count %llu %llu %lu \n " ,
* area_count ( area_dst , nr ) ,
count_verify [ nr ] , nr ) ;
2015-09-04 15:47:23 -07:00
/* prepare next bounce */
2022-02-03 20:49:45 -08:00
tmp_area = area_src ;
area_src = area_dst ;
area_dst = tmp_area ;
2015-09-04 15:47:23 -07:00
2022-02-03 20:49:45 -08:00
tmp_area = area_src_alias ;
area_src_alias = area_dst_alias ;
area_dst_alias = tmp_area ;
2017-09-06 16:23:46 -07:00
2020-04-06 20:06:36 -07:00
uffd_stats_report ( uffd_stats , nr_cpus ) ;
2015-09-04 15:47:23 -07:00
}
2021-06-30 18:49:13 -07:00
if ( test_type = = TEST_ANON ) {
/*
* shmem / hugetlb can ' t run this test because they behave
* differently on fork ( ) ( file - backed memory normally drops its
* ptes at fork time ) , whereas the pagemap test verifies the
* pgtable entry of the fork ( ) ed child .
*/
userfaultfd_pagemap_test ( page_size ) ;
/*
* Hard - code for x86_64 for now for 2 M THP , as x86_64 is
* currently the only one that supports uffd - wp
*/
userfaultfd_pagemap_test ( page_size * 512 ) ;
}
2017-09-06 16:23:43 -07:00
return userfaultfd_zeropage_test ( ) | | userfaultfd_sig_test ( )
2021-05-04 18:35:57 -07:00
| | userfaultfd_events_test ( ) | | userfaultfd_minor_test ( ) ;
2015-09-04 15:47:23 -07:00
}
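The stress loop treats the remaining bounce count as a bitmask, so successive iterations cycle through every combination of random, racing, verify and poll/read behaviour. The sketch below rebuilds the mode string printed at the top of each iteration. The BOUNCE_* values are assumed to be the low four bits (their definitions sit earlier in the file, outside this section), and `bounce_mode_name` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed flag values; the real definitions live earlier in the file. */
#define BOUNCE_RANDOM		(1 << 0)
#define BOUNCE_RACINGFAULTS	(1 << 1)
#define BOUNCE_VERIFY		(1 << 2)
#define BOUNCE_POLL		(1 << 3)

/* Write the mode string for one bounce counter value into buf. */
static const char *bounce_mode_name(int bounces, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "%s%s%s%s",
		 (bounces & BOUNCE_RANDOM) ? " rnd" : "",
		 (bounces & BOUNCE_RACINGFAULTS) ? " racing" : "",
		 (bounces & BOUNCE_VERIFY) ? " ver" : "",
		 (bounces & BOUNCE_POLL) ? " poll" : " read");
	return buf;
}
```

Under these assumed values, a counter of 5 (binary 0101) selects random access with verification and blocking reads, while 15 enables everything and uses poll().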
2017-02-22 15:43:07 -08:00
/*
* Copied from mlock2 - tests . c
*/
unsigned long default_huge_page_size ( void )
{
unsigned long hps = 0 ;
char * line = NULL ;
size_t linelen = 0 ;
FILE * f = fopen ( " /proc/meminfo " , " r " ) ;
if ( ! f )
return 0 ;
while ( getline ( & line , & linelen , f ) > 0 ) {
if ( sscanf ( line , " Hugepagesize: %lu kB " , & hps ) = = 1 ) {
hps < < = 10 ;
break ;
}
}
free ( line ) ;
fclose ( f ) ;
return hps ;
}
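The per-line parsing in default_huge_page_size() can be exercised without touching /proc/meminfo. This sketch isolates the sscanf() step and the kB-to-bytes shift; `parse_hugepagesize` is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one /proc/meminfo line. Returns the huge page size in bytes,
 * or 0 if the line is not the "Hugepagesize:" entry.
 */
static unsigned long parse_hugepagesize(const char *line)
{
	unsigned long kb = 0;

	assert(line != NULL);
	if (sscanf(line, "Hugepagesize: %lu kB", &kb) != 1)
		return 0;
	return kb << 10;	/* kB -> bytes */
}
```

A line reporting 2048 kB yields 2097152 bytes (2 MiB). Other meminfo lines fail the literal prefix match and yield 0.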
2017-05-03 14:54:54 -07:00
static void set_test_type ( const char * type )
2017-02-22 15:43:07 -08:00
{
userfaultfd/selftests: fix feature support detection
Before any tests are run, in set_test_type, we decide what feature(s) we
are going to be testing, based upon our command line arguments.
However, the supported features are not just a function of the memory
type being used, so this is broken.
For instance, consider writeprotect support. It is "normally" supported
for anonymous memory, but furthermore it requires that the kernel has
CONFIG_HAVE_ARCH_USERFAULTFD_WP. So, it is *not* supported at all on
aarch64, for example.
So, this fixes this by querying the kernel for the set of features it
supports in set_test_type, by opening a userfaultfd and issuing a
UFFDIO_API ioctl. Based upon the reported features, we toggle what
tests are enabled.
Link: https://lkml.kernel.org/r/20210930212309.4001967-3-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-05 13:42:10 -07:00
uint64_t features = UFFD_API_FEATURES ;
2017-05-03 14:54:54 -07:00
if ( ! strcmp ( type , " anon " ) ) {
test_type = TEST_ANON ;
uffd_test_ops = & anon_uffd_test_ops ;
2020-04-06 20:06:36 -07:00
/* Only enable write-protect test for anonymous test */
test_uffdio_wp = true ;
2017-05-03 14:54:54 -07:00
} else if ( ! strcmp ( type , " hugetlb " ) ) {
test_type = TEST_HUGETLB ;
uffd_test_ops = & hugetlb_uffd_test_ops ;
2017-09-06 16:23:46 -07:00
} else if ( ! strcmp ( type , " hugetlb_shared " ) ) {
map_shared = true ;
test_type = TEST_HUGETLB ;
uffd_test_ops = & hugetlb_uffd_test_ops ;
2021-05-04 18:35:57 -07:00
/* Minor faults require shared hugetlb; only enable here. */
test_uffdio_minor = true ;
2017-05-03 14:54:54 -07:00
} else if ( ! strcmp ( type , " shmem " ) ) {
2017-09-06 16:23:46 -07:00
map_shared = true ;
2017-05-03 14:54:54 -07:00
test_type = TEST_SHMEM ;
uffd_test_ops = & shmem_uffd_test_ops ;
2021-06-30 18:49:45 -07:00
test_uffdio_minor = true ;
2017-05-03 14:54:54 -07:00
} else {
2021-06-30 18:48:55 -07:00
err ( " Unknown test type: %s " , type ) ;
2017-05-03 14:54:54 -07:00
}
if ( test_type = = TEST_HUGETLB )
page_size = default_huge_page_size ( ) ;
else
page_size = sysconf ( _SC_PAGE_SIZE ) ;
2021-06-30 18:48:55 -07:00
if ( ! page_size )
err ( " Unable to determine page size " ) ;
2017-02-22 15:43:07 -08:00
if ( ( unsigned long ) area_count ( NULL , 0 ) + sizeof ( unsigned long long ) * 2
2021-06-30 18:48:55 -07:00
> page_size )
err ( " Impossible to run this test " ) ;
2021-11-05 13:42:10 -07:00
/*
* Whether we can test certain features depends not just on test type ,
* but also on whether or not this particular kernel supports the
* feature .
*/
userfaultfd_open ( & features ) ;
test_uffdio_wp = test_uffdio_wp & &
( features & UFFD_FEATURE_PAGEFAULT_FLAG_WP ) ;
test_uffdio_minor = test_uffdio_minor & &
( features & uffd_minor_feature ( ) ) ;
close ( uffd ) ;
uffd = - 1 ;
2017-05-03 14:54:54 -07:00
}
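The feature gating at the end of set_test_type() boils down to "the test type wants the feature, and the kernel reports it". The sketch below shows one way to obtain the kernel's feature mask with a bare UFFDIO_API handshake and then apply that rule. It is an illustration under assumptions: the test itself goes through its own userfaultfd_open() helper and passes UFFD_API_FEATURES, whereas this sketch requests no features and only reads back what the kernel reports. `query_uffd_features` and `uffd_feature_usable` are hypothetical names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/userfaultfd.h>

/* Store the kernel's feature mask; return 0 on success, -1 otherwise. */
static int query_uffd_features(uint64_t *features)
{
	struct uffdio_api api;
	int fd;

	assert(features != NULL);
	fd = (int)syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (fd < 0)
		return -1;	/* no userfaultfd support or not permitted */

	memset(&api, 0, sizeof(api));
	api.api = UFFD_API;	/* request no features, only report */
	if (ioctl(fd, UFFDIO_API, &api)) {
		close(fd);
		return -1;
	}
	*features = api.features;
	close(fd);
	return 0;
}

/* A test is enabled only if it is wanted and the kernel has the bit. */
static bool uffd_feature_usable(bool wanted, uint64_t features, uint64_t bit)
{
	return wanted && (features & bit) != 0;
}
```

On systems where unprivileged userfaultfd is disabled, the syscall can fail with EPERM, so callers should treat -1 as "skip" rather than as a test failure.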
2017-09-06 16:23:46 -07:00
static void sigalrm ( int sig )
{
if ( sig ! = SIGALRM )
abort ( ) ;
test_uffdio_copy_eexist = true ;
test_uffdio_zeropage_eexist = true ;
alarm ( ALARM_INTERVAL_SECS ) ;
}
2017-05-03 14:54:54 -07:00
int main ( int argc , char * * argv )
{
if ( argc < 4 )
2018-10-26 15:09:09 -07:00
usage ( ) ;
2017-05-03 14:54:54 -07:00
2021-06-30 18:48:55 -07:00
if ( signal ( SIGALRM , sigalrm ) = = SIG_ERR )
err ( " failed to arm SIGALRM " ) ;
2017-09-06 16:23:46 -07:00
alarm ( ALARM_INTERVAL_SECS ) ;
2017-05-03 14:54:54 -07:00
set_test_type ( argv [ 1 ] ) ;
nr_cpus = sysconf ( _SC_NPROCESSORS_ONLN ) ;
nr_pages_per_cpu = atol ( argv [ 2 ] ) * 1024 * 1024 / page_size /
2017-02-22 15:43:07 -08:00
nr_cpus ;
if ( ! nr_pages_per_cpu ) {
2021-06-30 18:48:55 -07:00
_err ( " invalid MiB " ) ;
2018-10-26 15:09:09 -07:00
usage ( ) ;
2017-02-22 15:43:07 -08:00
}
2017-05-03 14:54:54 -07:00
bounces = atoi ( argv [ 3 ] ) ;
2017-02-22 15:43:07 -08:00
if ( bounces < = 0 ) {
2021-06-30 18:48:55 -07:00
_err ( " invalid bounces " ) ;
2018-10-26 15:09:09 -07:00
usage ( ) ;
2017-02-22 15:43:07 -08:00
}
nr_pages = nr_pages_per_cpu * nr_cpus ;
2017-05-03 14:54:54 -07:00
if ( test_type = = TEST_HUGETLB ) {
if ( argc < 5 )
2018-10-26 15:09:09 -07:00
usage ( ) ;
2017-05-03 14:54:54 -07:00
huge_fd = open ( argv [ 4 ] , O_CREAT | O_RDWR , 0755 ) ;
2021-06-30 18:48:55 -07:00
if ( huge_fd < 0 )
err ( " Open of %s failed " , argv [ 4 ] ) ;
if ( ftruncate ( huge_fd , 0 ) )
err ( " ftruncate %s to size 0 failed " , argv [ 4 ] ) ;
userfaultfd/selftests: use memfd_create for shmem test type
This is a preparatory commit. In the future, we want to be able to setup
alias mappings for area_src and area_dst in the shmem test, like we do in
the hugetlb_shared test. With a VMA obtained via mmap(MAP_ANONYMOUS |
MAP_SHARED), it isn't clear how to do this.
So, mmap() with an fd, so we can create alias mappings. Use memfd_create
instead of actually passing in a tmpfs path like hugetlb does, since it's
more convenient / simpler to run, and works just as well.
Future commits will:
1. Setup the alias mappings.
2. Extend our tests to actually take advantage of this, to test new
userfaultfd behavior being introduced in this series.
Also, a small fix in the area we're changing: when the hugetlb setup fails
in main(), pass in the right argv[] so we actually print out the hugetlb
file path.
Link: https://lkml.kernel.org/r/20210503180737.2487560-8-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 18:49:34 -07:00
} else if ( test_type = = TEST_SHMEM ) {
shm_fd = memfd_create ( argv [ 0 ] , 0 ) ;
if ( shm_fd < 0 )
err ( " memfd_create " ) ;
if ( ftruncate ( shm_fd , nr_pages * page_size * 2 ) )
err ( " ftruncate " ) ;
if ( fallocate ( shm_fd ,
FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE , 0 ,
nr_pages * page_size * 2 ) )
err ( " fallocate " ) ;
2017-02-22 15:43:07 -08:00
}
printf ( " nr_pages: %lu, nr_pages_per_cpu: %lu \n " ,
nr_pages , nr_pages_per_cpu ) ;
return userfaultfd_stress ( ) ;
}
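The sizing arithmetic in main() converts the MiB argument into pages and splits them evenly across CPUs using integer division, so a region smaller than one page per CPU collapses to zero and is rejected as "invalid MiB". A small sketch of that calculation, with `pages_per_cpu` as a hypothetical helper name:

```c
#include <assert.h>

/* Pages each CPU handles for a region of mib MiB; 0 means "too small". */
static unsigned long pages_per_cpu(unsigned long mib, unsigned long page_size,
				   unsigned long nr_cpus)
{
	assert(page_size != 0 && nr_cpus != 0);
	return mib * 1024 * 1024 / page_size / nr_cpus;
}
```

For example, 128 MiB with 4 KiB pages on 8 CPUs is 32768 pages, or 4096 per CPU. With 3 CPUs and 10 MiB the division truncates 2560 / 3 to 853, so nr_pages becomes 2559 and the last page of the requested size is simply not used.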
2015-09-22 14:58:58 -07:00
# else /* __NR_userfaultfd */
# warning "missing __NR_userfaultfd definition"
int main ( void )
{
printf ( " skip: Skipping userfaultfd test (missing __NR_userfaultfd) \n " ) ;
2018-06-13 21:31:43 -06:00
return KSFT_SKIP ;
2015-09-22 14:58:58 -07:00
}
# endif /* __NR_userfaultfd */