IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
As we want to see the number of lost samples in the perf report, set the
LOST format when it configs evsel. On old kernels, it'd fallback to
disable it.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220901195739.668604-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make the header guard consistent with others.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: florian fischer <florian.fischer@muhq.space>
Link: http://lore.kernel.org/lkml/20220830164846.401143-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This updates the perf tool with arch specific branch type classification
used for BRBE on arm64 platform as added in the kernel earlier.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220824044822.70230-9-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This updates the perf tools with branch privilege information request flag
i.e PERF_SAMPLE_BRANCH_PRIV_SAVE that has been added earlier in the kernel.
This also updates 'perf record' documentation, branch_modes[], and generic
branch privilege level enumeration as added earlier in the kernel.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220824044822.70230-8-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This updates the perf tool with generic branch type classification with new
ABI extender place holder i.e PERF_BR_EXTEND_ABI, the new 4 bit branch type
field i.e perf_branch_entry.new_type, new generic page fault related branch
types and some arch specific branch types as added earlier in the kernel.
Committer note:
Add an extra entry to the branch_type_name array to cope with
PERF_BR_EXTEND_ABI, to address build warnings on some compiler/systems,
like:
75 8.89 ubuntu:20.04-x-powerpc64el : FAIL gcc version 10.3.0 (Ubuntu 10.3.0-1ubuntu1~20.04)
inlined from 'branch_type_stat_display' at util/branch.c:152:4:
/usr/powerpc64le-linux-gnu/include/bits/stdio2.h💯10: error: '%8s' directive argument is null [-Werror=format-overflow=]
100 | return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101 | __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220824044822.70230-7-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This updates the perf tool with generic branch type classification with
two new branch types i.e system error (PERF_BR_SERROR) and not in
transaction (PERF_BR_NO_TX) which got updated earlier in the kernel.
This also updates corresponding branch type strings in
branch_type_name().
Committer notes:
At perf tools merge time this is only on PeterZ's tree, at:
git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/core
So for testing one has to build a kernel with that branch, then test
the tooling side from:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220824044822.70230-6-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add annotations to describe lock behavior. Add unlocks so that mutexes
aren't conditionally held on exit from perf_sched__replay. Add an exit
variable so that thread_func can terminate, rather than leaving the
threads blocked on mutexes.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Colin Ian King <colin.king@intel.com>
Cc: Dario Petrillo <dario.pk1@gmail.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Fangrui Song <maskray@google.com>
Cc: Hewenliang <hewenliang4@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jason Wang <wangborong@cdjrlc.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Weiguo Li <liwg06@foxmail.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Zechuan Chen <chenzechuan1@huawei.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Cc: yaowenbin <yaowenbin1@huawei.com>
Link: https://lore.kernel.org/r/20220826164242.43412-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There may be threads racing to update dso->nsinfo:
https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
Holding the dso->lock avoids use-after-free, memory leaks and other such
bugs. Apply the fix in:
https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
of there being a missing nsinfo__put now that the accesses are data race
free. Fixes test "Lookup mmap thread" when compiled with address
sanitizer.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Colin Ian King <colin.king@intel.com>
Cc: Dario Petrillo <dario.pk1@gmail.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Fangrui Song <maskray@google.com>
Cc: Hewenliang <hewenliang4@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jason Wang <wangborong@cdjrlc.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Weiguo Li <liwg06@foxmail.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Zechuan Chen <chenzechuan1@huawei.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Cc: yaowenbin <yaowenbin1@huawei.com>
Link: https://lore.kernel.org/r/20220826164242.43412-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
pthread.h is being included for the side-effect of getting sched.h and
macros like CPU_CLR. Switch to directly using sched.h, or if that is
already present, just remove the pthread.h inclusion entirely.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Colin Ian King <colin.king@intel.com>
Cc: Dario Petrillo <dario.pk1@gmail.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Fangrui Song <maskray@google.com>
Cc: Hewenliang <hewenliang4@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jason Wang <wangborong@cdjrlc.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Weiguo Li <liwg06@foxmail.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Zechuan Chen <chenzechuan1@huawei.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Cc: yaowenbin <yaowenbin1@huawei.com>
Link: https://lore.kernel.org/r/20220826164242.43412-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Added a new header file mutex.h that wraps the usage of
pthread_mutex_t and pthread_cond_t. By abstracting these it is
possible to introduce error checking.
Signed-off-by: Pavithra Gurushankar <gpavithrasha@gmail.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Colin Ian King <colin.king@intel.com>
Cc: Dario Petrillo <dario.pk1@gmail.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Fangrui Song <maskray@google.com>
Cc: Hewenliang <hewenliang4@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jason Wang <wangborong@cdjrlc.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Weiguo Li <liwg06@foxmail.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Zechuan Chen <chenzechuan1@huawei.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Cc: yaowenbin <yaowenbin1@huawei.com>
Link: https://lore.kernel.org/r/20220826164242.43412-2-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
AUX area traces can produce too much data to record successfully or
analyze subsequently. Add another means to reduce data collection by
allowing multiple recording time ranges.
This is useful, for instance, in cases where a workload produces
predictably reproducible events in specific time ranges.
Today we only have perf record -D <msecs> to start at a specific region, or
some complicated approach using snapshot mode and external scripts sending
signals or using the fifos. But these approaches are difficult to set up
compared with simply having perf do it.
Extend perf record option -D/--delay option to specifying relative time
stamps for start stop controlled by perf with the right time offset, for
instance:
perf record -e intel_pt// -D 10-20,30-40
to record 10ms to 20ms into the trace and 30ms to 40ms.
Example:
The example workload is:
$ cat repeat-usleep.c
int usleep(useconds_t usec);
int usage(int ret, const char *msg)
{
if (msg)
fprintf(stderr, "%s\n", msg);
fprintf(stderr, "Usage is: repeat-usleep <microseconds>\n");
return ret;
}
int main(int argc, char *argv[])
{
unsigned long usecs;
char *end_ptr;
if (argc != 2)
return usage(1, "Error: Wrong number of arguments!");
errno = 0;
usecs = strtoul(argv[1], &end_ptr, 0);
if (errno || *end_ptr || usecs > UINT_MAX)
return usage(1, "Error: Invalid argument!");
while (1) {
int ret = usleep(usecs);
if (ret & errno != EINTR)
return usage(1, "Error: usleep() failed!");
}
return 0;
}
$ perf record -e intel_pt//u --delay 10-20,40-70,110-160 -- ./repeat-usleep 500
Events disabled
Events enabled
Events disabled
Events enabled
Events disabled
Events enabled
Events disabled
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 0.204 MB perf.data ]
Terminated
A dlfilter is used to determine continuous data collection (timestamps
less than 1ms apart):
$ cat dlfilter-show-delays.c
static __u64 start_time;
static __u64 last_time;
int start(void **data, void *ctx)
{
printf("%-17s\t%-9s\t%-6s\n", " Time", " Duration", " Delay");
return 0;
}
int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
__u64 delta;
if (!sample->time)
return 1;
if (!last_time)
goto out;
delta = sample->time - last_time;
if (delta < 1000000)
goto out2;;
printf("%17.9f\t%9.1f\t%6.1f\n", start_time / 1000000000.0, (last_time - start_time) / 1000000.0, delta / 1000000.0);
out:
start_time = sample->time;
out2:
last_time = sample->time;
return 1;
}
int stop(void *data, void *ctx)
{
printf("%17.9f\t%9.1f\n", start_time / 1000000000.0, (last_time - start_time) / 1000000.0);
return 0;
}
The result shows the times roughly match the --delay option:
$ perf script --itrace=qb --dlfilter dlfilter-show-delays.so
Time Duration Delay
39215.302317300 9.7 20.5
39215.332480217 30.4 40.9
39215.403837717 49.8
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220824072814.16422-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Dummy events are used to provide sideband information like MMAP events that
are always needed even when main events are disabled. Add functions that
take that into account.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220824072814.16422-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Patch "perf record: Fix way of handling non-perf-event pollfds" added a
generic way to handle non-perf-event file descriptors like evlist->ctl_fd.
Use it instead of handling evlist->ctl_fd separately.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220824072814.16422-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record __cmd_record() does not poll evlist pollfds. Instead it polls
thread_data[0].pollfd. That happens whether or not threads are being used.
perf record duplicates evlist mmap pollfds as needed for separate threads.
The non-perf-event represented by evlist->ctl_fd has to handled separately,
which is done explicitly, duplicating it into the thread_data[0] pollfds.
That approach neglects any other non-perf-event file descriptors. Currently
there is also done_fd which needs the same handling.
Add a new generalized approach.
Add fdarray_flag__non_perf_event to identify the file descriptors that
need the special handling. For those cases, also keep a mapping of the
evlist pollfd index and thread pollfd index, so that the evlist revents
can be updated.
Although this patch adds the new handling, it does not take it into use.
There is no functional change, but it is the precursor to a fix, so is
marked as a fix.
Fixes: 415ccb58f68a6beb ("perf record: Introduce thread specific data array")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220824072814.16422-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When libbpf is present the build uses definitions in libbpf hashmap.c,
however, libbpf's hashmap.h wasn't being used. Switch to using the
correct hashmap.h dependent on the define HAVE_LIBBPF_SUPPORT. This was
the original intent in:
https://lore.kernel.org/lkml/20200515221732.44078-8-irogers@google.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220824050604.352156-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'unsigned int' should be clearer than 'unsigned'.
Signed-off-by: Xin Gao <gaoxin@cdjrlc.com>
Cc: Ian Rogers <irogers@google.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220816173804.7539-1-gaoxin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'unsigned int' should be clearer than 'unsigned'.
Signed-off-by: Xin Gao <gaoxin@cdjrlc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220816174109.7718-1-gaoxin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sometimes, features are simply different flavors of another feature, to
properly detect the exact dependencies needed by different Linux
distributions.
For example, libbfd has three flavors: libbfd if the distro does not
require any additional dependency; libbfd-liberty if it requires libiberty;
libbfd-liberty-z if it requires libiberty and libz.
It might not be clear to the user whether a feature has been successfully
detected or not, given that some of its flavors will be set to OFF, others
to ON.
Instead, display only the feature main flavor if not in verbose mode
(VF != 1), and set it to ON if at least one of its flavors has been
successfully detected (logical OR), OFF otherwise. Omit the other flavors.
Accomplish that by declaring a FEATURE_GROUP_MEMBERS-<feature main flavor>
variable, with the list of the other flavors as variable value. For now, do
it just for libbfd.
In verbose mode, of if no group is defined for a feature, show the feature
detection result as before.
Committer testing:
Collecting the output from:
$ make -C tools/bpf/bpftool/ clean
$ make -C tools/bpf/bpftool/ |& grep "Auto-detecting system features" -A10
$ diff -u before after
--- before 2022-08-18 10:06:40.422086966 -0300
+++ after 2022-08-18 10:07:59.202138282 -0300
@@ -1,6 +1,4 @@
Auto-detecting system features:
... libbfd: [ on ]
-... libbfd-liberty: [ on ]
-... libbfd-liberty-z: [ on ]
... libcap: [ on ]
... clang-bpf-co-re: [ on ]
$
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220818120957.319995-3-roberto.sassu@huaweicloud.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since now there are features with a long name, increase the room for them,
so that fields are correctly aligned.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220818120957.319995-2-roberto.sassu@huaweicloud.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As the first eval expansion is used only to generate Makefile statements,
messages should not be displayed at this stage, as for example conditional
expressions are not evaluated.
It can be seen for example in the output of feature detection for bpftool,
where the number of detected features does not change, despite turning on
the verbose mode (VF = 1) and there are additional features to display.
Fix this issue by escaping the $ before $(info) statements, to ensure that
messages are printed only when the function containing them is actually
executed, and not when it is expanded.
In addition, move the $(info) statement out of feature_print_status, due to
the fact that is called both inside and outside an eval context, and place
it to the caller so that the $ can be escaped when necessary. For symmetry,
move the $(info) statement also out of feature_print_text, and place it to
the caller.
Force the TMP variable evaluation in verbose mode, to display the features
in FEATURE_TESTS that are not in FEATURE_DISPLAY.
Reorder perf feature detection messages (first non-verbose, then verbose
ones) by moving the call to feature_display_entries earlier, before the VF
environment variable check.
Also, remove the newline from that function, as perf might display
additional messages. Move the newline to perf Makefile, and display another
one if displaying the detection result is not deferred as in the case of
bpftool.
Committer testing:
Collecting the output from:
$ make VF=1 -C tools/bpf/bpftool/ |& grep "Auto-detecting system features" -A20
$ diff -u before after
--- before 2022-08-18 09:59:55.460529231 -0300
+++ after 2022-08-18 10:01:11.182517282 -0300
@@ -4,3 +4,5 @@
... libbfd-liberty-z: [ on ]
... libcap: [ on ]
... clang-bpf-co-re: [ on ]
+... disassembler-four-args: [ on ]
+... disassembler-init-styled: [ OFF ]
$
Fixes: 0afc5cad387db560 ("perf build: Separate feature make support into config/Makefile.feature")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: bpf@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20220818120957.319995-1-roberto.sassu@huaweicloud.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This commit adds the option --known-build-ids to perf inject.
It allows the user to explicitly specify the build id for a given
path, instead of retrieving it from the current system. This is
useful in cases where a perf.data file is processed on a different
system from where it was collected, or if some of the binaries are
no longer available.
The build ids and paths are specified in pairs in the command line.
Using the file:// specifier, build ids can be loaded from a file
directly generated by perf buildid-list. This is convenient to copy
build ids from one perf.data file to another.
** Example: In this example we use perf record to create two
perf.data files, one with build ids and another without, and use
perf buildid-list and perf inject to copy the build ids from the
first file to the second.
$ perf record ls /tmp
$ perf record --no-buildid -o perf.data.no-buildid ls /tmp
$ perf buildid-list > build-ids.txt
$ perf inject -b --known-build-ids='file://build-ids.txt' \
-i perf.data.no-buildid -o perf.data.buildid
Signed-off-by: Raul Silvera <rsilvera@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220815225922.2118745-1-rsilvera@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make statx() support reporting direct I/O (DIO) alignment information.
This provides a generic interface for userspace programs to determine
whether a file supports DIO, and if so with what alignment restrictions.
Specifically, STATX_DIOALIGN works on block devices, and on regular
files when their containing filesystem has implemented support.
An interface like this has been requested for years, since the
conditions for when DIO is supported in Linux have gotten increasingly
complex over time. Today, DIO support and alignment requirements can be
affected by various filesystem features such as multi-device support,
data journalling, inline data, encryption, verity, compression,
checkpoint disabling, log-structured mode, etc. Further complicating
things, Linux v6.0 relaxed the traditional rule of DIO needing to be
aligned to the block device's logical block size; now user buffers (but
not file offsets) only need to be aligned to the DMA alignment.
The approach of uplifting the XFS specific ioctl XFS_IOC_DIOINFO was
discarded in favor of creating a clean new interface with statx().
For more information, see the individual commits and the man page update
https://lore.kernel.org/r/20220722074229.148925-1-ebiggers@kernel.org.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCYzpV2xQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKwF1AQDetPX5hyuq0/mwikOywLTTJsoHgGY5
euO+dISqjH/InwD9HAQqfPRkdM1j4ml82BjjkAfrhzZXOOWPKJm0zOhMIQg=
=0Oav
-----END PGP SIGNATURE-----
Merge tag 'statx-dioalign-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull STATX_DIOALIGN support from Eric Biggers:
"Make statx() support reporting direct I/O (DIO) alignment information.
This provides a generic interface for userspace programs to determine
whether a file supports DIO, and if so with what alignment
restrictions. Specifically, STATX_DIOALIGN works on block devices, and
on regular files when their containing filesystem has implemented
support.
An interface like this has been requested for years, since the
conditions for when DIO is supported in Linux have gotten increasingly
complex over time. Today, DIO support and alignment requirements can
be affected by various filesystem features such as multi-device
support, data journalling, inline data, encryption, verity,
compression, checkpoint disabling, log-structured mode, etc.
Further complicating things, Linux v6.0 relaxed the traditional rule
of DIO needing to be aligned to the block device's logical block size;
now user buffers (but not file offsets) only need to be aligned to the
DMA alignment.
The approach of uplifting the XFS specific ioctl XFS_IOC_DIOINFO was
discarded in favor of creating a clean new interface with statx().
For more information, see the individual commits and the man page
update[1]"
Link: https://lore.kernel.org/r/20220722074229.148925-1-ebiggers@kernel.org [1]
* tag 'statx-dioalign-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
xfs: support STATX_DIOALIGN
f2fs: support STATX_DIOALIGN
f2fs: simplify f2fs_force_buffered_io()
f2fs: move f2fs_force_buffered_io() into file.c
ext4: support STATX_DIOALIGN
fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN
vfs: support STATX_DIOALIGN on block devices
statx: add direct I/O alignment information
Minor changes to convert uses of kmap() to kmap_local_page().
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCYzpKvRQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK9KSAP9PyrI3kDLBpiG9os3HIZtHyPHgt6OZ
FA978i0UuAxgHAD7BPiIT55oBdOrn6CVy2g8PkwmkcKQx0kvhVQq8Pyz5ww=
=49pm
-----END PGP SIGNATURE-----
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fsverity updates from Eric Biggers:
"Minor changes to convert uses of kmap() to kmap_local_page()"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fs-verity: use kmap_local_page() instead of kmap()
fs-verity: use memcpy_from_page()
This release contains some implementation changes, but no new features:
- Rework the implementation of the fscrypt filesystem-level keyring to
not be as tightly coupled to the keyrings subsystem. This resolves
several issues.
- Eliminate most direct uses of struct request_queue from fs/crypto/,
since struct request_queue is considered to be a block layer
implementation detail.
- Stop using the PG_error flag to track decryption failures. This is a
prerequisite for freeing up PG_error for other uses.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCYzpMMRQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKxYbAP0VrWjlqonO75gYkIxwX0aTxajoKC3m
awUDAC/feQ910gD6A4WbJivanLngJKgcxfbhN5paalZJEGNOBBrOUB1WLgs=
=CxSh
-----END PGP SIGNATURE-----
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fscrypt updates from Eric Biggers:
"This release contains some implementation changes, but no new
features:
- Rework the implementation of the fscrypt filesystem-level keyring
to not be as tightly coupled to the keyrings subsystem. This
resolves several issues.
- Eliminate most direct uses of struct request_queue from fs/crypto/,
since struct request_queue is considered to be a block layer
implementation detail.
- Stop using the PG_error flag to track decryption failures. This is
a prerequisite for freeing up PG_error for other uses"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fscrypt: work on block_devices instead of request_queues
fscrypt: stop holding extra request_queue references
fscrypt: stop using keyrings subsystem for fscrypt_master_key
fscrypt: stop using PG_error to track error status
fscrypt: remove fscrypt_set_test_dummy_encryption()
This set of commits includes:
. Fix a couple races found with a new torture test.
. Improve errors when api functions are used incorrectly.
. Improve tracing for lock requests from user space.
. Fix use after free in recently added tracing code.
. Small internal code cleanups.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJjOyfeAAoJEDgbc8f8gGmqHF4QALKGo+95JGzfXN37dNL2ve8L
DAKxESYIwaTEWuKxmD4AGogClEl55UoC8kxMB3dHwLZEd4U0v5ZDULR6NUYXMpos
6miaoF+pJfBnpNRqpCieWRW5dYXD4TwSdquv5rUSmUBrdOSy34s/nORWB4kL443K
hFPcbo5Mv1L0W70/+gdj1uBlBsenZxnXu6aEmrckONqwj9Q2SBjJTik9WuNwh+FF
tEcmUt8kDanGkbwtMCxnbT3HDOdfQyW+qq4IJ6MOYHlW9Cqbp9QUvAIho4DEpr7f
eGurQ/urSD3dltzuYQcZ81zGhaGxzaRt5d2AEHRrGugQ2ZvnsG74oSAmEINZTSw4
RV2EXyJ4hXcXK/yJXo3fGzFm2/5JFvYhnvddo6wts3vQZHwefExIRCHVz2cJL9eS
gFpfFu4uB8z7w7l9s9LJKv7cTriaDd1WHuIWZGonz3wlFSUOn7IxunDxM3Hc5YO3
okawhr6sWe03fFcKsw1WeWymfDUwmk/7OV15OSDanItAwX5vkBYDBvAcA/cwm8cj
P0Vb3c1/Sf1IjjHGGA13vHpD1JXJ7FHafg6jyWmjJNqaS+wtShvs2As9MqbtSWMb
o2OcYTEEzME4mMIXZzVlKP7hhkLMaVR5PwGmbPovlyAkEUX0soH7nefyLMAqP3JG
7VZYV46VCL7wm3yjrKYw
=sL1G
-----END PGP SIGNATURE-----
Merge tag 'dlm-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
- Fix a couple races found with a new torture test
- Improve errors when api functions are used incorrectly
- Improve tracing for lock requests from user space
- Fix use after free in recently added tracing cod.
- Small internal code cleanups
* tag 'dlm-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
fs: dlm: fix possible use after free if tracing
fs: dlm: const void resource name parameter
fs: dlm: LSFL_CB_DELAY only for kernel lockspaces
fs: dlm: remove DLM_LSFL_FS from uapi
fs: dlm: trace user space callbacks
fs: dlm: change ls_clear_proc_locks to spinlock
fs: dlm: remove dlm_del_ast prototype
fs: dlm: handle rcom in else if branch
fs: dlm: allow lockspaces have zero lvblen
fs: dlm: fix invalid derefence of sb_lvbptr
fs: dlm: handle -EINVAL as log_error()
fs: dlm: use __func__ for function name
fs: dlm: handle -EBUSY first in unlock validation
fs: dlm: handle -EBUSY first in lock arg validation
fs: dlm: fix race between test_bit() and queue_work()
fs: dlm: fix race in lowcomms
This release is mostly bug fixes, clean-ups, and optimizations.
One notable set of fixes addresses a subtle buffer overflow issue
that occurs if a small RPC Call message arrives in an oversized
RPC record. This is only possible on a framed RPC transport such
as TCP.
Because NFSD shares the receive and send buffers in one set of
pages, an oversized RPC record steals pages from the send buffer
that will be used to construct the RPC Reply message. NFSD must
not assume that a full-sized buffer is always available to it;
otherwise, it will walk off the end of the send buffer while
constructing its reply.
In this release, we also introduce the ability for the server to
wait a moment for clients to return delegations before it responds
with NFS4ERR_DELAY. This saves a retransmit and a network round-
trip when a delegation recall is needed. This work will be built
upon in future releases.
The NFS server adds another shrinker to its collection. Because
courtesy clients can linger for quite some time, they might be
freeable when the server host comes under memory pressure. A new
shrinker has been added that releases courtesy client resources
during low memory scenarios.
Lastly, of note: the maximum number of operations per NFSv4
COMPOUND that NFSD can handle is increased from 16 to 50. There
are NFSv4 client implementations that need more than 16 to
successfully perform a mount operation that uses a pathname
with many components.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmM66P4ACgkQM2qzM29m
f5fhMg//afS2mp4fgPz4MjoFIqD/Icep8qFEPA8Gy6I1dDGRxd9wNgjoN4JALFdr
NKX1oRVISBvDrOG/C84GbYnXEDlzY8q1HmyPoJA8VAR57hnJXfPZN6CBN//Bx4mU
nISPJeNGY9SMNVhS8916V/yzd41uWDQuD+H+i5mluBTJHONgSzwzc80sQ+eq+yZQ
PV6mlJN6hcm14LCaDOTXF7oY2Wm6dQc2rV87YChJWnc+vdXKnme/LWTMY1ABkePD
g88mSL6w3YDKEuKciWda5/QU1ETp/Q7XTjFGDKEQSnnNsvCLmUKogJTKVa2QqyLY
P1qlrj6XwukqAe414W4amlLL3q4NUFmJZPNWDxdf+qtTrQrBBEFrsKy/bSt27XoD
cTvBWcorMG2riSlYPViVeh8RpyC6qwhttPbvGAmflVF2KEyXpfgc5Pnn0/Xm1Ac9
XKzaCTJlUyRb/2wdqVtQbIpyh3sbhzp8zhv7sWKXgQOEXxKOO3ZAIrQXeL6oFN/b
HlXDty7wKhFRj8IbkZfQ9SvN1saTONwB3clYHbCXTetkw/nnrUgLYcu8NIDBK9ou
wkBcz1++XgVTqjRFwUwagb62cPJnRM6UiROYCVbQp7qcUe4/U+WP9t6dnZlnGZVZ
dtipKlH/LTGKW+d7ysZOqb4hsRza5Kaduz5a7lML7UIGQXmxjM0=
=hE6t
-----END PGP SIGNATURE-----
Merge tag 'nfsd-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
"This release is mostly bug fixes, clean-ups, and optimizations.
One notable set of fixes addresses a subtle buffer overflow issue that
occurs if a small RPC Call message arrives in an oversized RPC record.
This is only possible on a framed RPC transport such as TCP.
Because NFSD shares the receive and send buffers in one set of pages,
an oversized RPC record steals pages from the send buffer that will be
used to construct the RPC Reply message. NFSD must not assume that a
full-sized buffer is always available to it; otherwise, it will walk
off the end of the send buffer while constructing its reply.
In this release, we also introduce the ability for the server to wait
a moment for clients to return delegations before it responds with
NFS4ERR_DELAY. This saves a retransmit and a network round- trip when
a delegation recall is needed. This work will be built upon in future
releases.
The NFS server adds another shrinker to its collection. Because
courtesy clients can linger for quite some time, they might be
freeable when the server host comes under memory pressure. A new
shrinker has been added that releases courtesy client resources during
low memory scenarios.
Lastly, of note: the maximum number of operations per NFSv4 COMPOUND
that NFSD can handle is increased from 16 to 50. There are NFSv4
client implementations that need more than 16 to successfully perform
a mount operation that uses a pathname with many components"
* tag 'nfsd-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (53 commits)
nfsd: extra checks when freeing delegation stateids
nfsd: make nfsd4_run_cb a bool return function
nfsd: fix comments about spinlock handling with delegations
nfsd: only fill out return pointer on success in nfsd4_lookup_stateid
NFSD: fix use-after-free on source server when doing inter-server copy
NFSD: Cap rsize_bop result based on send buffer size
NFSD: Rename the fields in copy_stateid_t
nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_file_cache_stats_fops
nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_reply_cache_stats_fops
nfsd: use DEFINE_SHOW_ATTRIBUTE to define client_info_fops
nfsd: use DEFINE_SHOW_ATTRIBUTE to define export_features_fops and supported_enctypes_fops
nfsd: use DEFINE_PROC_SHOW_ATTRIBUTE to define nfsd_proc_ops
NFSD: Pack struct nfsd4_compoundres
NFSD: Remove unused nfsd4_compoundargs::cachetype field
NFSD: Remove "inline" directives on op_rsize_bop helpers
NFSD: Clean up nfs4svc_encode_compoundres()
SUNRPC: Fix typo in xdr_buf_subsegment's kdoc comment
NFSD: Clean up WRITE arg decoders
NFSD: Use xdr_inline_decode() to decode NFSv3 symlinks
NFSD: Refactor common code out of dirlist helpers
...
- Introduce fscache-based domain to share blobs between images;
- Support recording fragments in a special packed inode;
- Support partial-referenced pclusters for global compressed data
deduplication;
- Fix an order >= MAX_ORDER warning due to crafted negative i_size;
- Several cleanups.
-----BEGIN PGP SIGNATURE-----
iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCYzq3FxEceGlhbmdAa2Vy
bmVsLm9yZwAKCRA5NzHcH7XmBJbRAQDab/0DJu7iDktzupazfCibkg8vWzakXIi+
KE0y5O8VaQEAwn9bdPU4cp+raowoMt3z8eGsj4H9ZO9NM8NfPUX0uQQ=
=TNVH
-----END PGP SIGNATURE-----
Merge tag 'erofs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"In this cycle, for container use cases, fscache-based shared domain is
introduced [1] so that data blobs in the same domain will be storage
deduplicated and it will also be used for page cache sharing later.
Also, a special packed inode is now introduced to record inode
fragments which keep the tail part of files by Yue Hu [2]. You can
keep arbitary length or (at will) the whole file as a fragment and
then fragments can be optionally compressed in the packed inode
together and even deduplicated for smaller image sizes.
In addition to that, global compressed data deduplication by sharing
partial-referenced pclusters is also supported in this cycle.
Summary:
- Introduce fscache-based domain to share blobs between images
- Support recording fragments in a special packed inode
- Support partial-referenced pclusters for global compressed data
deduplication
- Fix an order >= MAX_ORDER warning due to crafted negative i_size
- Several cleanups"
Link: https://lore.kernel.org/r/20220916085940.89392-1-zhujia.zj@bytedance.com [1]
Link: https://lore.kernel.org/r/cover.1663065968.git.huyue2@coolpad.com [2]
* tag 'erofs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: clean up erofs_iget()
erofs: clean up unnecessary code and comments
erofs: fold in z_erofs_reload_indexes()
erofs: introduce partial-referenced pclusters
erofs: support on-disk compressed fragments data
erofs: support interlaced uncompressed data for compressed files
erofs: clean up .read_folio() and .readahead() in fscache mode
erofs: introduce 'domain_id' mount option
erofs: Support sharing cookies in the same domain
erofs: introduce a pseudo mnt to manage shared cookies
erofs: introduce fscache-based domain
erofs: code clean up for fscache
erofs: use kill_anon_super() to kill super in fscache mode
erofs: fix order >= MAX_ORDER warning due to crafted negative i_size
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYzqjGgAKCRCRxhvAZXjc
ovVFAQCbZXflZk/DGy1CVEWHwJDkMlmN62jCY+3gZP6UPsCeCQEA4uB3Hub7jihO
b2q9yhR+p6G7DHkeNAo2qUu4bI0lDgQ=
=zG6Z
-----END PGP SIGNATURE-----
Merge tag 'fs.vfsuid.fat.v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull fatfs vfsuid conversion from Christian Brauner:
"Last cycle we introduced the new vfs{g,u}id_t types that we had agreed
on. The most important parts of the vfs have been converted but there
are a few more places we need to switch before we can remove the old
helpers completely.
This cycle we converted all filesystems that called idmapped mount
helpers directly. The affected filesystems are f2fs, fat, fuse, ksmbd,
overlayfs, and xfs. We've sent patches for all of them. Looking at
-next f2fs, ksmbd, overlayfs, and xfs have all picked up these patches
and they should land in mainline during the v6.1 merge window.
So all filesystems that have a separate tree should send the vfsuid
conversion themselves. Onle the fat conversion is going through this
generic fs trees because there is no fat tree.
In order to change time settings on an inode fat checks that the
caller either is the owner of the inode or the inode's group is in the
caller's group list. If fat is on an idmapped mount we compare whether
the inode mapped into the mount is equivalent to the caller's fsuid.
If it isn't we compare whether the inode's group mapped into the mount
is in the caller's group list.
We now use the new vfsuid based helpers for that"
* tag 'fs.vfsuid.fat.v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
fat: port to vfs{g,u}id_t and associated helpers
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYzqi8gAKCRCRxhvAZXjc
orKNAQCGKPJ3Kc3LVVnh8qdjm9npP+j9UQAB7jDZi9q7RijIIAD/VYjj+z5XLg4V
k96ibCyir1+4EOF8ihY0WQi40MSWYws=
=S/Wf
-----END PGP SIGNATURE-----
Merge tag 'fs.acl.rework.prep.v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfs acl updates from Christian Brauner:
"These are general fixes and preparatory changes related to the ongoing
posix acl rework. The actual rework where we build a type safe posix
acl api wasn't ready for this merge window but we're hopeful for the
next merge window.
General fixes:
- Some filesystems like 9p and cifs have to implement custom posix
acl handlers because they require access to the dentry in order to
set and get posix acls while the set and get inode operations
currently don't. But the ntfs3 filesystem has no such requirement
and thus implemented custom posix acl xattr handlers when it really
didn't have to. So this pr contains patch that just implements set
and get inode operations for ntfs3 and switches it to rely on the
generic posix acl xattr handlers. (We would've appreciated reviews
from the ntfs3 maintainers but we didn't get any. But hey, if we
really broke it we'll fix it. But fstests for ntfs3 said it's
fine.)
- The posix_acl_fix_xattr_common() helper has been adapted so it can
be used by a few more callers and avoiding open-coding the same
checks over and over.
Other than the two general fixes this series introduces a new helper
vfs_set_acl_prepare(). The reason for this helper is so that we can
mitigate one of the source that change {g,u}id values directly in the
uapi struct. With the vfs_set_acl_prepare() helper we can move the
idmapped mount fixup into the generic posix acl set handler.
The advantage of this is that it allows us to remove the
posix_acl_setxattr_idmapped_mnt() helper which so far we had to call
in vfs_setxattr() to account for idmapped mounts. While semantically
correct the problem with this approach was that we had to keep the
value parameter of the generic vfs_setxattr() call as non-const. This
is rectified in this series.
Ultimately, we will get rid of all the extreme kludges and type
unsafety once we have merged the posix api - hopefully during the next
merge window - built solely around get and set inode operations. Which
incidentally will also improve handling of posix acls in security and
especially in integrity modesl. While this will come with temporarily
having two inode operation for posix acls that is nothing compared to
the problems we have right now and so well worth it. We'll end up with
something that we can actually reason about instead of needing to
write novels to explain what's going on"
* tag 'fs.acl.rework.prep.v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
xattr: always us is_posix_acl_xattr() helper
acl: fix the comments of posix_acl_xattr_set
xattr: constify value argument in vfs_setxattr()
ovl: use vfs_set_acl_prepare()
acl: move idmapping handling into posix_acl_xattr_set()
acl: add vfs_set_acl_prepare()
acl: return EOPNOTSUPP in posix_acl_fix_xattr_common()
ntfs3: rework xattr handlers and switch to POSIX ACL VFS helpers
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYzuCiQAKCRBZ7Krx/gZQ
65z6AQDbHgeZ3vXLyHdxs3VWsUkKWMUV1gb5MmzBs/eGq0K3hAD9FfGsOoT2ow9h
2ics8AGrvMZMHrFOwFAmolXjLW7qQws=
=y0rq
-----END PGP SIGNATURE-----
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull coredump fix from Al Viro:
"Brown paper bag bug fix for the coredumping fix late in the 6.0
release cycle"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
[brown paperbag] fix coredump breakage
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmM68YIUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOTbA//TR8i+Wy8iswUCmtfmYg91h1uebpl
/kjNsSmfgivAUTGamr3eN2WRlGhZfkFDPIHa25uybSA6Q+75p4lst83Rt3HDbjkv
Ga7grCXnHwSDwJoHOSeFh0pojV2u7Zvfmiib2U5hPZEmd3kBw3NCgAJVcSGN80B2
dct36fzZNXjvpWDbygmFtRRkmEseslSkft8bUVvNZBP+B0zvv3vcNY1QFuKuK+W2
8wWpvO/cCSmke5i2c2ktHSk2f8/Y6n26Ik/OTHcTVfoKZLRaFbXEzLyxzLrNWd6m
hujXgcxszTtHdmoXx+J6uBauju7TR8pi1x8mO2LSGrlpRc1cX0A5ED8WcH71+HVE
8L1fIOmZShccPZn8xRok7oYycAUm/gIfpmSLzmZA76JsZYAe+mp9Ze9FA6fZtSwp
7Q/rfw/Rlz25WcFBe4xypP078HkOmqutkCk2zy5liR+cWGrgy/WKX15vyC0TaPrX
tbsRKuCLkipgfXrTk0dX3kmhz+3bJYjqeZEt7sfPSZYpaOGkNXVmAW0wnCOTuLMU
+8pIVktvQxMmACEj2gBMz11iooR4DpWLxOcQQR/impgCpNdZ60nA0a6KPJoIXC+5
NfTa422FZkc99QRVblUZyWSgJBW78Z3ZAQcQlo1AGLlFydbfrSFTRLbmNJZo/Nkl
KwpGvWs5nB0rVw0=
=VZl5
-----END PGP SIGNATURE-----
Merge tag 'lsm-pr-20221003' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull LSM updates from Paul Moore:
"Seven patches for the LSM layer and we've got a mix of trivial and
significant patches. Highlights below, starting with the smaller bits
first so they don't get lost in the discussion of the larger items:
- Remove some redundant NULL pointer checks in the common LSM audit
code.
- Ratelimit the lockdown LSM's access denial messages.
With this change there is a chance that the last visible lockdown
message on the console is outdated/old, but it does help preserve
the initial series of lockdown denials that started the denial
message flood and my gut feeling is that these might be the more
valuable messages.
- Open userfaultfds as readonly instead of read/write.
While this code obviously lives outside the LSM, it does have a
noticeable impact on the LSMs with Ondrej explaining the situation
in the commit description. It is worth noting that this patch
languished on the VFS list for over a year without any comments
(objections or otherwise) so I took the liberty of pulling it into
the LSM tree after giving fair notice. It has been in linux-next
since the end of August without any noticeable problems.
- Add a LSM hook for user namespace creation, with implementations
for both the BPF LSM and SELinux.
Even though the changes are fairly small, this is the bulk of the
diffstat as we are also including BPF LSM selftests for the new
hook.
It's also the most contentious of the changes in this pull request
with Eric Biederman NACK'ing the LSM hook multiple times during its
development and discussion upstream. While I've never taken NACK's
lightly, I'm sending these patches to you because it is my belief
that they are of good quality, satisfy a long-standing need of
users and distros, and are in keeping with the existing nature of
the LSM layer and the Linux Kernel as a whole.
The patches in implement a LSM hook for user namespace creation
that allows for a granular approach, configurable at runtime, which
enables both monitoring and control of user namespaces. The general
consensus has been that this is far preferable to the other
solutions that have been adopted downstream including outright
removal from the kernel, disabling via system wide sysctls, or
various other out-of-tree mechanisms that users have been forced to
adopt since we haven't been able to provide them an upstream
solution for their requests. Eric has been steadfast in his
objections to this LSM hook, explaining that any restrictions on
the user namespace could have significant impact on userspace.
While there is the possibility of impacting userspace, it is
important to note that this solution only impacts userspace when it
is requested based on the runtime configuration supplied by the
distro/admin/user. Frederick (the pathset author), the LSM/security
community, and myself have tried to work with Eric during
development of this patchset to find a mutually acceptable
solution, but Eric's approach and unwillingness to engage in a
meaningful way have made this impossible. I have CC'd Eric directly
on this pull request so he has a chance to provide his side of the
story; there have been no objections outside of Eric's"
* tag 'lsm-pr-20221003' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lockdown: ratelimit denial messages
userfaultfd: open userfaultfds with O_RDONLY
selinux: Implement userns_create hook
selftests/bpf: Add tests verifying bpf lsm userns_create hook
bpf-lsm: Make bpf_lsm_userns_create() sleepable
security, lsm: Introduce security_create_user_ns()
lsm: clean up redundant NULL pointer check
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmM68ZsUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOAtRAAw/lcyPoyN8ia6+PPihRtAKGUFIf5
+IdEPYfCqkGghqB7BRDl5bXOLFgpY/m/41g+xFvzJ0fhVPLa7UWB//N7yTu3OnW/
vXz1wn0EJAeDlLbPzWd6V/SpcxJ1WPzjHj2B3YXNWnukfMjCnPIA8XlZc18zAWS1
/OOEBoOo/a/8Giw2l1bEXxfmDI20NrXNL3vWKQ+Bbhg2PJaH/FTk4DNxopt84o28
vA+cbfQcOOjeRjBuncnTp9/b244ojeM+lRSJZozGTogFIeDUp3KW1D7NHqNwyX12
seDooqLEP25vP+kQh8zH7gvacpoeDLz40bSpd+MKKj02IxKGikykWuvtlFWY3xNB
o1mT4SJhh3JcewS7gh6P5aESSSgLg9zb3zMGtjHhtz+HHi/Sq7PK7xJgrnKOBNgu
CLIu3L+5vJpAgrsze2tIcwRUySIzDKnfgw8Oz7zaS2lOTJ58emz00QwEioHMQufK
8gZXTvZykJAtLF19PJw+mHKu38hbdD/4vt8AFuIgJzFkjWKzaZAxUBT+3p/uaLHG
2PegjKzpCqH9vZ/HCdYI42OB8TKiPU3eBtYZ2eP3h7cdDu++tp1rf0hwHQrwE2AD
PRuoCaBYOTUedbR8CV07fSSGFnZvlPnuk9yB7/eztV2thBQG28ALGxVhWadn4ap/
UIFgCs5QDRj11u8=
=BQ+i
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20221003' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux updates from Paul Moore:
"Six SELinux patches, all are simple and easily understood, but a list
of the highlights is below:
- Use 'grep -E' instead of 'egrep' in the SELinux policy install
script.
Fun fact, this seems to be GregKH's *second* dedicated SELinux
patch since we transitioned to git (ignoring merges, the SPDX
stuff, and a trivial fs reference removal when lustre was yanked);
the first was back in 2011 when selinuxfs was placed in
/sys/fs/selinux. Oh, the memories ...
- Convert the SELinux policy boolean values to use signed integer
types throughout the SELinux kernel code.
Prior to this we were using a mix of signed and unsigned integers
which was probably okay in this particular case, but it is
definitely not a good idea in general.
- Remove a reference to the SELinux runtime disable functionality in
/etc/selinux/config as we are in the process of deprecating that.
See [1] for more background on this if you missed the previous
notes on the deprecation.
- Minor cleanups: remove unneeded variables and function parameter
constification"
Link: https://github.com/SELinuxProject/selinux-kernel/wiki/DEPRECATE-runtime-disable [1]
* tag 'selinux-pr-20221003' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: remove runtime disable message in the install_policy.sh script
selinux: use "grep -E" instead of "egrep"
selinux: remove the unneeded result variable
selinux: declare read-only parameters const
selinux: use int arrays for boolean values
selinux: remove an unneeded variable in sel_make_class_dir_entries()
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCYzrMqBQcem9oYXJAbGlu
dXguaWJtLmNvbQAKCRDLwZzRsCrn5aSaAP9Xk5xRyMgPjFV6SsaJU9jtbcHunyoC
29QyCaIa7z7znwEApO9xItOLHG34dVqGdrqxGDwNb9bgIpdCEkO3YMhNmww=
=RyKF
-----END PGP SIGNATURE-----
Merge tag 'integrity-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity updates from Mimi Zohar:
"Just two bug fixes"
* tag 'integrity-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
efi: Correct Macmini DMI match in uefi cert quirk
ima: fix blocking of security.ima xattrs of unsupported algorithms
-----BEGIN PGP SIGNATURE-----
iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmMzNOsXHGNhc2V5QHNj
aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBHSyg//XprfrAxU5Mk13fEKv1+L2TQ5
07510lqIevJObY9WwhzPwYW/3KZwlXDc8pcYnJZt5o6zV9YXipB4kRtdDVdew5k7
l+WJzwx+6uQjoHk6GrY7d50PhNFOpe+QPP68zs2iBJMairqpHEhEPbX81b2fhD2v
7VnWWGhKMS+iYR9SEGldA8NNnPpzz4+1xs7OlT6CEM3pnZFANlR1RCSsr1DvYFvZ
mJEXVWZNGQsLrwKLLesYGBzRRJeZtU47VMROyOqiXgSh+D2p9Z4ajVzdROSVNENY
8e2CRp2al9Ij0arUBaq1JaAIrvoO2P0YiOSa5wPU2yghj3McvAkIphQ8+c1PxkzM
r8Qk3hyZfjDMbh3jBFEugXt+UaQCCqELWnCrxoWZVflUdi5YXT1/7STifsQ1DhOw
okppOmAXsQ7rsr3+GW0249i7ySzvXCI/xtXfpvnT4aw0rjBML0uN7GeoEzPr84Pw
2vPM0lhULLifvfoaUwkySYVt0VHS2LVk1xaNFVikM80rkFagAjqU4ouzZw0JCa2U
VA45/h5/kWt+57uj8hdmaPZtfkw7saSl51kozwISltJS7ga6X6lCm1VwWZC6bjJF
QGUXWZlMC1hgwYK4DmMvjr9wWIwkxmEcVWSBMmsHiacr1Rl5N0Lnq0Rp8xD15u/R
TIdvYo9hHV6biX9+pkU=
=rKZK
-----END PGP SIGNATURE-----
Merge tag 'Smack-for-6.1' of https://github.com/cschaufler/smack-next
Pull smack updates from Casey Schaufler:
"Two minor code clean-ups: one removes constants left over from the old
mount API, while the other gets rid of an unneeded variable.
The other change fixes a flaw in handling IPv6 labeling"
* tag 'Smack-for-6.1' of https://github.com/cschaufler/smack-next:
smack: cleanup obsolete mount option flags
smack: lsm: remove the unneeded result variable
SMACK: Add sk_clone_security LSM hook