63d1cb53e2
25605 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Axel Rasmussen
|
f0fa943309 |
userfaultfd/selftests: add test exercising minor fault handling
Fix a dormant bug in userfaultfd_events_test(), where we did `return faulting_process(0)` instead of `exit(faulting_process(0))`. This caused the forked process to keep running, trying to execute any further test cases after the events test in parallel with the "real" process. Add a simple test case which exercises minor faults. In short, it does the following: 1. "Sets up" an area (area_dst) and a second shared mapping to the same underlying pages (area_dst_alias). 2. Register one of these areas with userfaultfd, in minor fault mode. 3. Start a second thread to handle any minor faults. 4. Populate the underlying pages with the non-UFFD-registered side of the mapping. Basically, memset() each page with some arbitrary contents. 5. Then, using the UFFD-registered mapping, read all of the page contents, asserting that the contents match expectations (we expect the minor fault handling thread can modify the page contents before resolving the fault). The minor fault handling thread, upon receiving an event, flips all the bits (~) in that page, just to prove that it can modify it in some arbitrary way. Then it issues a UFFDIO_CONTINUE ioctl, to setup the mapping and resolve the fault. The reading thread should wake up and see this modification. Currently the minor fault test is only enabled in hugetlb_shared mode, as this is the only configuration the kernel feature supports. Link: https://lkml.kernel.org/r/20210301222728.176417-7-axelrasmussen@google.com Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Adam Ruprecht <ruprecht@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Cannon Matthews <cannonmatthews@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chinwen Chang <chinwen.chang@mediatek.com> Cc: David Rientjes <rientjes@google.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michal Koutn" <mkoutny@suse.com> Cc: Michel Lespinasse <walken@google.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oliver Upton <oupton@google.com> Cc: Shaohua Li <shli@fb.com> Cc: Shawn Anastasio <shawn@anastas.io> Cc: Steven Price <steven.price@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Zi Yan
|
fbe37501b2 |
mm: huge_memory: debugfs for file-backed THP split
Further extend <debugfs>/split_huge_pages to accept "<path>,<pgoff_start>,<pgoff_end>" for file-backed THP split tests since tmpfs may have file backed by THP that mapped nowhere. Update selftest program to test file-backed THP split too. Link: https://lkml.kernel.org/r/20210331235309.332292-2-zi.yan@sent.com Signed-off-by: Zi Yan <ziy@nvidia.com> Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mika Penttila <mika.penttila@nextfour.com> Cc: David Rientjes <rientjes@google.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Zi Yan
|
fa6c02315f |
mm: huge_memory: a new debugfs interface for splitting THP tests
We did not have a direct user interface of splitting the compound page backing a THP and there is no need unless we want to expose the THP implementation details to users. Make <debugfs>/split_huge_pages accept a new command to do that. By writing "<pid>,<vaddr_start>,<vaddr_end>" to <debugfs>/split_huge_pages, THPs within the given virtual address range from the process with the given pid are split. It is used to test split_huge_page function. In addition, a selftest program is added to tools/testing/selftests/vm to utilize the interface by splitting PMD THPs and PTE-mapped THPs. This does not change the old behavior, i.e., writing 1 to the interface to split all THPs in the system. Link: https://lkml.kernel.org/r/20210331235309.332292-1-zi.yan@sent.com Signed-off-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mika Penttila <mika.penttila@nextfour.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Len Brown
|
3c070b2abf |
tools/power turbostat: version 2021.05.04
Signed-off-by: Len Brown <len.brown@intel.com> |
||
Len Brown
|
b60c573dc2 |
tools/power turbostat: Support "turbostat --hide idle"
As idle, in particular, can have many columns on some machines... Make it easy to ignore them all at once. Signed-off-by: Len Brown <len.brown@intel.com> |
||
Len Brown
|
38c6663a68 |
tools/power turbostat: elevate priority of interval mode
This makes interval mode less likely to see delayed results on a heavily loaded system. Signed-off-by: Len Brown <len.brown@intel.com> |
||
Len Brown
|
1b439f01b6 |
tools/power turbostat: formatting
Spring is here... run a long overdue Lendent on turbostat.c no functional change Signed-off-by: Len Brown <len.brown@intel.com> |
||
Zhang Rui
|
55279aef75 |
tools/power turbostat: rename tcc variables
There are two TCC activation temeprature. One is the default TCC activation temperature, also known as TJ_MAX. Another one is the effective TCC activation temperature, which is the subtraction of default TCC activation temperature and TCC offset. The name of variable tcc_activation_temp might be misleading here. Thus rename tcc_activation_temp to tj_max, and use tcc_default and tcc_offset to calculate the effective TCC activation temperature. No functional change in this patch. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Zhang Rui
|
0b9a0b9be9 |
tools/power turbostat: add TCC Offset support
The length of TCC Offset bits varies on different platforms. Decode TCC Offset bits only for the platforms that we have verified. For the others, only show default TCC activation temperature. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Zhang Rui
|
e9d3092f6d |
tools/power turbostat: save original CPU model
CPU model may get changed in intel_model_duplicates() for code reuse. But there are still some cases we need the original CPU model to handle minor differences between generations. Thus save the original CPU model. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Zhang Rui
|
7ab5ff4937 |
tools/power turbostat: Fix Core C6 residency on Atom CPUs
For Atom CPUs that have core cstate deeper than C6, MSR_CORE_C6_RESIDENCY actually returns the residency for both CC6 and deeper Core cstates. Thus, the real Core C6 residency should be the subtraction of MSR_CORE_C6_RESIDENCY return value and MSR_CORE_C6_RESIDENCY return value. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Chen Yu
|
aeb01e6d71 |
tools/power turbostat: Print the C-state Pre-wake settings
C-state pre-wake setting[1] is an optimization for some Intel CPUs to be woken up from deep C-states in order to reduce latency. According to the spec, the BIT30 is the C-state Pre-wake Disable. Expose this setting accordingly. Sample output from turbostat: ... cpu51: MSR_IA32_POWER_CTL: 0x1a00a40059 (C1E auto-promotion: DISabled) C-state Pre-wake: ENabled cpu51: MSR_TURBO_RATIO_LIMIT: 0x2021212121212224 ... [1] https://intel.github.io/wult/#c-state-pre-wake Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Chen Yu
|
8c69da2930 |
tools/power turbostat: Enable tsc_tweak for Elkhart Lake and Jasper Lake
It was found that on Elkhart Lake the TSC frequency is driven by a separate crystal-clock domain, which is different from the BCLK domain which includes mperf. This has result in small different speed thus inconsistence between TSC and the mperf, which caused the Busy% to be higher than 100%. On this platform it seems that the mperf runs faster than tsc when the CPU is 100% utilized: delta tsc(18815473183) < delta mperf(18958403680) for 10 seconds. To align TSC with mperf, leverage the tsc_tweak mechanism introduced for cores newer than Skylake, so that TSC and mperf would be calculated in the same domain. Reported-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Randy Dunlap
|
1e3ec5cdfb |
tools/power turbostat: unmark non-kernel-doc comment
Do not mark a comment as kernel-doc notation when it is not meant to be in kernel-doc notation. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Chen Yu
|
25368d7cef |
tools/power/turbostat: Remove Package C6 Retention on Ice Lake Server
Currently the turbostat treats ICX the same way as SKX and shares the code among them. But one difference is that ICX does not support Package C6 Retention, unlike SKX and CLX. So this patch: 1. Splitting SKX and ICX in turbostat. 2. Removing Package C6 Rentention for ICX. And after this split, it would be easier to cutomize Ice Lake Server in turbostat in the future. Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Calvin Walton
|
13a779de41 |
tools/power turbostat: Fix offset overflow issue in index converting
The idx_to_offset() function returns type int (32-bit signed), but
MSR_PKG_ENERGY_STAT is u32 and would be interpreted as a negative number.
The end result is that it hits the if (offset < 0) check in update_msr_sum()
which prevents the timer callback from updating the stat in the background when
long durations are used. The similar issue exists in offset_to_idx() and
update_msr_sum(). Fix this issue by converting the 'int' to 'off_t' accordingly.
Fixes:
|
||
Bas Nieuwenhuizen
|
301b1d3a91 |
tools/power/turbostat: Fix turbostat for AMD Zen CPUs
It was reported that on Zen+ system turbostat started exiting,
which was tracked down to the MSR_PKG_ENERGY_STAT read failing because
offset_to_idx wasn't returning a non-negative index.
This patch combined the modification from Bingsong Si and
Bas Nieuwenhuizen and addd the MSR to the index system as alternative for
MSR_PKG_ENERGY_STATUS.
Fixes:
|
||
Len Brown
|
ba58ecde5e | tools/power turbostat: update version number | ||
Zhang Rui
|
abdc75ab53 |
tools/power turbostat: Fix DRAM Energy Unit on SKX
SKX uses fixed DRAM Energy Unit, just like HSX and BDX. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Len Brown
|
b2b94be787 |
Revert "tools/power turbostat: adjust for temperature offset"
This reverts commit |
||
Chen Yu
|
6c5c656006 |
tools/power turbostat: Support Ice Lake D
Ice Lake D is low-end server version of Ice Lake X, reuse the code accordingly. Tested-by: Wendy Wang <wendy.wang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Chen Yu
|
5683460b85 |
tools/power turbostat: Support Alder Lake Mobile
Share the code between Alder Lake Mobile and Alder Lake Desktop. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> |
||
Len Brown
|
ed0757b83a |
tools/power turbostat: print microcode patch level
(also available via "grep microcode /proc/cpuinfo") Signed-off-by: Len Brown <len.brown@intel.com> |
||
Len Brown
|
2af4f9b859 |
tools/power turbostat: add built-in-counter for IPC -- Instructions per Cycle
Use linux-perf to access the hardware instructions-retired counter. This is necessary because the counter is not enabled by default, and also the counter is prone to roll-over -- both of which perf manages. It is not necessary to use perf for the cycle counter, because turbostat already needs to collect delta-aperf to calcuate frequency. Signed-off-by: Len Brown <len.brown@intel.com> |
||
Linus Torvalds
|
954b720705 |
dma-mapping updates for Linux 5.13:
- add a new dma_alloc_noncontiguous API (me, Ricardo Ribalda) - fix a copyright noice (Hao Fang) - add an unlikely annotation to dma_mapping_error (Heiner Kallweit) - remove a pointless empty line (Wang Qing) - add support for multi-pages map/unmap bencharking (Xiang Chen) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmCQ/foLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYMKLRAAyNxmtSSgU9FlzcCWIwLlOj5IMDR0bWkxMYr0oQOo e/e3ZLv5N6bF2tkLRE8YLy14LlmGYDdpV/VsuObsRMEJXUYs+wbWJOLuNFRZnkqH WjrXgCz0ijhynmxwkQCc7lsTZbMvwq0f2Jf+v0OWZ4D2v/raem3ESs2NW/Ld8lAe uPYK0mXywCPe0eqnRbJL53hah2y0PP+NKATPj18BtI32evPQcSL4g0r//VxPXb7G q4cT6D51xqo280h+fW2DqODJeQQsWb6tYrWDdg7y/JkGZU6aOKTCyCLrZ4Dz+MqA 7wb0PII3ldZxb0Z4FP9Ij/Tt8r+FuWS6uT8epwzFFuURrVj3M7Uuw2bKHJuFmHI4 R6wJGy6I5tIbgbg7EtqLr6LhwGtbdhhd5ehlXDA+x4VcXdQ3+9aE3IvGS0YGCon8 lDQH0EyC3FIAW6BJbBf1NySM3jtySSxrn2GMD3d70gOiS6O0Xiktj6jBY68cXYzS zCCSSY7KCZm65KbcDe71dv/RIgZePAIBGQJPPmSsRcKgnBXB+AVuUl0AoZu3yGwJ hY49eP55XX9Vm746GBwdwciuVDaTDScDzDtrw/GN3FQMBKrwtJs9MXTzbJHsdWVt G4p5qXa5VwH2xl1GsuRLAItKEglXzsX5GDZ0IvwjcbWlZokVhBlMOCv1ZfJRVkwc 0c0= =DJoy -----END PGP SIGNATURE----- Merge tag 'dma-mapping-5.13' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - add a new dma_alloc_noncontiguous API (me, Ricardo Ribalda) - fix a copyright notice (Hao Fang) - add an unlikely annotation to dma_mapping_error (Heiner Kallweit) - remove a pointless empty line (Wang Qing) - add support for multi-pages map/unmap bencharking (Xiang Chen) * tag 'dma-mapping-5.13' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: add unlikely hint to error path in dma_mapping_error dma-mapping: benchmark: Add support for multi-pages map/unmap dma-mapping: benchmark: use the correct HiSilicon copyright dma-mapping: remove a pointless empty line in dma_alloc_coherent media: uvcvideo: Use dma_alloc_noncontiguous API dma-iommu: implement ->alloc_noncontiguous dma-iommu: refactor iommu_dma_alloc_remap dma-mapping: add a dma_alloc_noncontiguous API dma-mapping: refactor dma_{alloc,free}_pages dma-mapping: add a dma_mmap_pages helper |
||
John 'Warthog9' Hawley (VMware)
|
6a0f365295 |
ktest: Re-arrange the code blocks for better discoverability
Perl, as with most scripting languages, is fairly flexible in how / where you can define things, and it will (for the most part) do what you would expect it to do. This however can lead to situations, like with ktest, where things get muddled over time. This pushes the variable definitions back up to the top, followed by functions, with the main script executables down at the bottom, INSTEAD of being somewhat mish-mashed together in certain places. This mostly has the advantage of making it more obvious where things are initially defined, what functions are there, and ACTUALLY where the main script starts executing, and should make this a little more approachable. Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
John 'Warthog9' Hawley (VMware)
|
c043ccbfc6 |
ktest: Further consistency cleanups
This cleans up some additional whitespace pieces that to be more consistent, as well as moving a curly brace around, and some 'or' statements to match the rest of the file (usually or goes at the end of the line vs. at the beginning) Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
John 'Warthog9' Hawley (VMware)
|
12d4cddda2 |
ktest: Fixing indentation to match expected pattern
This is a followup to "ktest: Adding editor hints to improve consistency" to actually adjust the existing indentation to match the, now, expected pattern (first column 4 spaces, 2nd tab, 3rd tab + 4 spaces, etc). This should, at least help, keep things consistent going forward now. Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
John 'Warthog9' Hawley (VMware)
|
becdd17b5a |
ktest: Adding editor hints to improve consistency
Emacs and Vi(m) have different styles of dealing with perl syntax which can lead to slightly inconsistent indentation, and makes the code slightly harder to read. Emacs assumes a more perl recommended standard of 4 spaces (1 column) or tab (two column) indentation. Vi(m) tends to favor just normal spaces or tabs depending on what was being used. This gives the basic hinting to Emacs and Vim to do what is expected to be basically consistent. Emacs: - Explicitly flip into perl mode, cperl would require more adjustments Vi(m): - Set softtabs=4 which will flip it over to doing indentation the way you would expect from Emacs Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
John 'Warthog9' Hawley (VMware)
|
2676eb4bfc |
ktest: Add example config for using VMware VMs
This duplicates the KVM/Qemu config with specific notes for how to use it with VMware VMs on Workstation, Player, or Fusion. The main thing to be aware of is how the serial port is exposed which is a unix pipe, and will need something like ncat to get into ktest's monitoring Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
John 'Warthog9' Hawley (VMware)
|
da2e56634b |
ktest: Minor cleanup with uninitialized variable $build_options
Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
Linus Torvalds
|
9b1f61d5d7 |
tracing updates for 5.13
New feature: The "func-no-repeats" option in tracefs/options directory. When set the function tracer will detect if the current function being traced is the same as the previous one, and instead of recording it, it will keep track of the number of times that the function is repeated in a row. And when another function is recorded, it will write a new event that shows the function that repeated, the number of times it repeated and the time stamp of when the last repeated function occurred. Enhancements: In order to implement the above "func-no-repeats" option, the ring buffer timestamp can now give the accurate timestamp of the event as it is being recorded, instead of having to record an absolute timestamp for all events. This helps the histogram code which no longer needs to waste ring buffer space. New validation logic to make sure all trace events that access dereferenced pointers do so in a safe way, and will warn otherwise. Fixes: No longer limit the PIDs of tasks that are recorded for "saved_cmdlines" to PID_MAX_DEFAULT (32768), as systemd now allows for a much larger range. This caused the mapping of PIDs to the task names to be dropped for all tasks with a PID greater than 32768. Change trace_clock_global() to never block. This caused a deadlock. Clean ups: Typos, prototype fixes, and removing of duplicate or unused code. Better management of ftrace_page allocations. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYI/1vBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qiL0AP9EemIC5TDh2oihqLRNeUjdTu0ryEoM HRFqxozSF985twD/bfkt86KQC8rLHwxTbxQZ863bmdaC6cMGFhWiF+H/MAs= =psYt -----END PGP SIGNATURE----- Merge tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New feature: - A new "func-no-repeats" option in tracefs/options directory. When set the function tracer will detect if the current function being traced is the same as the previous one, and instead of recording it, it will keep track of the number of times that the function is repeated in a row. And when another function is recorded, it will write a new event that shows the function that repeated, the number of times it repeated and the time stamp of when the last repeated function occurred. Enhancements: - In order to implement the above "func-no-repeats" option, the ring buffer timestamp can now give the accurate timestamp of the event as it is being recorded, instead of having to record an absolute timestamp for all events. This helps the histogram code which no longer needs to waste ring buffer space. - New validation logic to make sure all trace events that access dereferenced pointers do so in a safe way, and will warn otherwise. Fixes: - No longer limit the PIDs of tasks that are recorded for "saved_cmdlines" to PID_MAX_DEFAULT (32768), as systemd now allows for a much larger range. This caused the mapping of PIDs to the task names to be dropped for all tasks with a PID greater than 32768. - Change trace_clock_global() to never block. This caused a deadlock. Clean ups: - Typos, prototype fixes, and removing of duplicate or unused code. - Better management of ftrace_page allocations" * tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (32 commits) tracing: Restructure trace_clock_global() to never block tracing: Map all PIDs to command lines ftrace: Reuse the output of the function tracer for func_repeats tracing: Add "func_no_repeats" option for function tracing tracing: Unify the logic for function tracing options tracing: Add method for recording "func_repeats" events tracing: Add "last_func_repeats" to struct trace_array tracing: Define new ftrace event "func_repeats" tracing: Define static void trace_print_time() ftrace: Simplify the calculation of page number for ftrace_page->records some more ftrace: Store the order of pages allocated in ftrace_page tracing: Remove unused argument from "ring_buffer_time_stamp() tracing: Remove duplicate struct declaration in trace_events.h tracing: Update create_system_filter() kernel-doc comment tracing: A minor cleanup for create_system_filter() kernel: trace: Mundane typo fixes in the file trace_events_filter.c tracing: Fix various typos in comments scripts/recordmcount.pl: Make vim and emacs indent the same scripts/recordmcount.pl: Make indent spacing consistent tracing: Add a verifier to check string pointers for trace events ... |
||
Brendan Jackman
|
2a30f94406 |
libbpf: Fix signed overflow in ringbuf_process_ring
One of our benchmarks running in (Google-internal) CI pushes data
through the ringbuf faster htan than userspace is able to consume
it. In this case it seems we're actually able to get >INT_MAX entries
in a single ring_buffer__consume() call. ASAN detected that cnt
overflows in this case.
Fix by using 64-bit counter internally and then capping the result to
INT_MAX before converting to the int return type. Do the same for
the ring_buffer__poll().
Fixes:
|
||
Linus Torvalds
|
17ae69aba8 |
Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com>
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAmCInP4ACgkQrZhLv9lQ BTza0g//dTeb9woC9H7qlEhK4l9yk62lTss60Q8X7m7ZSNfdL4tiEbi64SgK+iOW OOegbrOEb8Kzh4KJJYmVlVZ5YUWyH4szgmee1wnylBdsWiWaPLPF3Cflz77apy6T TiiBsJd7rRE29FKheaMt34B41BMh8QHESN+DzjzJWsFoi/uNxjgSs2W16XuSupKu bpRmB1pYNXMlrkzz7taL05jndZYE5arVriqlxgAsuLOFOp/ER7zecrjImdCM/4kL W6ej0R1fz2Geh6CsLBJVE+bKWSQ82q5a4xZEkSYuQHXgZV5eywE5UKu8ssQcRgQA VmGUY5k73rfY9Ofupf2gCaf/JSJNXKO/8Xjg0zAdklKtmgFjtna5Tyg9I90j7zn+ 5swSpKuRpilN8MQH+6GWAnfqQlNoviTOpFeq3LwBtNVVOh08cOg6lko/bmebBC+R TeQPACKS0Q0gCDPm9RYoU1pMUuYgfOwVfVRZK1prgi2Co7ZBUMOvYbNoKYoPIydr ENBYljlU1OYwbzgR2nE+24fvhU8xdNOVG1xXYPAEHShu+p7dLIWRLhl8UCtRQpSR 1ofeVaJjgjrp29O+1OIQjB2kwCaRdfv/Gq1mztE/VlMU/r++E62OEzcH0aS+mnrg yzfyUdI8IFv1q6FGT9yNSifWUWxQPmOKuC8kXsKYfqfJsFwKmHM= =uCN4 -----END PGP SIGNATURE----- Merge tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull Landlock LSM from James Morris: "Add Landlock, a new LSM from Mickaël Salaün. Briefly, Landlock provides for unprivileged application sandboxing. From Mickaël's cover letter: "The goal of Landlock is to enable to restrict ambient rights (e.g. global filesystem access) for a set of processes. Because Landlock is a stackable LSM [1], it makes possible to create safe security sandboxes as new security layers in addition to the existing system-wide access-controls. This kind of sandbox is expected to help mitigate the security impact of bugs or unexpected/malicious behaviors in user-space applications. Landlock empowers any process, including unprivileged ones, to securely restrict themselves. Landlock is inspired by seccomp-bpf but instead of filtering syscalls and their raw arguments, a Landlock rule can restrict the use of kernel objects like file hierarchies, according to the kernel semantic. Landlock also takes inspiration from other OS sandbox mechanisms: XNU Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil. In this current form, Landlock misses some access-control features. This enables to minimize this patch series and ease review. This series still addresses multiple use cases, especially with the combined use of seccomp-bpf: applications with built-in sandboxing, init systems, security sandbox tools and security-oriented APIs [2]" The cover letter and v34 posting is here: https://lore.kernel.org/linux-security-module/20210422154123.13086-1-mic@digikod.net/ See also: https://landlock.io/ This code has had extensive design discussion and review over several years" Link: https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/ [1] Link: https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.net/ [2] * tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: landlock: Enable user space to infer supported features landlock: Add user and kernel documentation samples/landlock: Add a sandbox manager example selftests/landlock: Add user space tests landlock: Add syscall implementations arch: Wire up Landlock syscalls fs,security: Add sb_delete hook landlock: Support filesystem access-control LSM: Infrastructure management of the superblock landlock: Add ptrace restrictions landlock: Set up the security framework and manage credentials landlock: Add ruleset and domain management landlock: Add object management |
||
Linus Torvalds
|
10a3efd0fe |
perf tools changes for v5.13: 1st batch
perf stat: - Add support for hybrid PMUs to support systems such as Intel Alderlake and its BIG/little core/atom cpus. - Introduce 'bperf' to share hardware PMCs with BPF. - New --iostat option to collect and present IO stats on Intel hardware. This functionality is based on recently introduced sysfs attributes for Intel® Xeon® Scalable processor family (code name Skylake-SP): commit |
||
Linus Torvalds
|
152d32aa84 |
ARM:
- Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler x86: - Optimizations and cleanup of nested SVM code - AMD: Support for virtual SPEC_CTRL - Optimizations of the new MMU code: fast invalidation, zap under read lock, enable/disably dirty page logging under read lock - /dev/kvm API for AMD SEV live migration (guest API coming soon) - support SEV virtual machines sharing the same encryption context - support SGX in virtual machines - add a few more statistics - improved directed yield heuristics - Lots and lots of cleanups Generic: - Rework of MMU notifier interface, simplifying and optimizing the architecture-specific code - Some selftests improvements -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmCJ13kUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroM1HAgAqzPxEtiTPTFeFJV5cnPPJ3dFoFDK y/juZJUQ1AOtvuWzzwuf175ewkv9vfmtG6rVohpNSkUlJYeoc6tw7n8BTTzCVC1b c/4Dnrjeycr6cskYlzaPyV6MSgjSv5gfyj1LA5UEM16LDyekmaynosVWY5wJhju+ Bnyid8l8Utgz+TLLYogfQJQECCrsU0Wm//n+8TWQgLf1uuiwshU5JJe7b43diJrY +2DX+8p9yWXCTz62sCeDWNahUv8AbXpMeJ8uqZPYcN1P0gSEUGu8xKmLOFf9kR7b M4U1Gyz8QQbjd2lqnwiWIkvRLX6gyGVbq2zH0QbhUe5gg3qGUX7JjrhdDQ== =AXUi -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "This is a large update by KVM standards, including AMD PSP (Platform Security Processor, aka "AMD Secure Technology") and ARM CoreSight (debug and trace) changes. ARM: - CoreSight: Add support for ETE and TRBE - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler x86: - AMD PSP driver changes - Optimizations and cleanup of nested SVM code - AMD: Support for virtual SPEC_CTRL - Optimizations of the new MMU code: fast invalidation, zap under read lock, enable/disably dirty page logging under read lock - /dev/kvm API for AMD SEV live migration (guest API coming soon) - support SEV virtual machines sharing the same encryption context - support SGX in virtual machines - add a few more statistics - improved directed yield heuristics - Lots and lots of cleanups Generic: - Rework of MMU notifier interface, simplifying and optimizing the architecture-specific code - a handful of "Get rid of oprofile leftovers" patches - Some selftests improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits) KVM: selftests: Speed up set_memory_region_test selftests: kvm: Fix the check of return value KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt() KVM: SVM: Skip SEV cache flush if no ASIDs have been used KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids() KVM: SVM: Drop redundant svm_sev_enabled() helper KVM: SVM: Move SEV VMCB tracking allocation to sev.c KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup() KVM: SVM: Unconditionally invoke sev_hardware_teardown() KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported) KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features KVM: SVM: Move SEV module params/variables to sev.c KVM: SVM: Disable SEV/SEV-ES if NPT is disabled KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails KVM: SVM: Zero out the VMCB array used to track SEV ASID association x86/sev: Drop redundant and potentially misleading 'sev_enabled' KVM: x86: Move reverse CPUID helpers to separate header file KVM: x86: Rename GPR accessors to make mode-aware variants the defaults ... |
||
Masahiro Yamada
|
77a88274dc |
kbuild: replace LANG=C with LC_ALL=C
LANG gives a weak default to each LC_* in case it is not explicitly defined. LC_ALL, if set, overrides all other LC_* variables. LANG < LC_CTYPE, LC_COLLATE, LC_MONETARY, LC_NUMERIC, ... < LC_ALL This is why documentation such as [1] suggests to set LC_ALL in build scripts to get the deterministic result. LANG=C is not strong enough to override LC_* that may be set by end users. [1]: https://reproducible-builds.org/docs/locales/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Reviewed-by: Matthias Maennich <maennich@google.com> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> (mptcp) Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Linus Torvalds
|
d42f323a7d |
Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton: "A few misc subsystems and some of MM. 175 patches. Subsystems affected by this patch series: ia64, kbuild, scripts, sh, ocfs2, kfifo, vfs, kernel/watchdog, and mm (slab-generic, slub, kmemleak, debug, pagecache, msync, gup, memremap, memcg, pagemap, mremap, dma, sparsemem, vmalloc, documentation, kasan, initialization, pagealloc, and memory-failure)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (175 commits) mm/memory-failure: unnecessary amount of unmapping mm/mmzone.h: fix existing kernel-doc comments and link them to core-api mm: page_alloc: ignore init_on_free=1 for debug_pagealloc=1 net: page_pool: use alloc_pages_bulk in refill code path net: page_pool: refactor dma_map into own function page_pool_dma_map SUNRPC: refresh rq_pages using a bulk page allocator SUNRPC: set rq_page_end differently mm/page_alloc: inline __rmqueue_pcplist mm/page_alloc: optimize code layout for __alloc_pages_bulk mm/page_alloc: add an array-based interface to the bulk page allocator mm/page_alloc: add a bulk page allocator mm/page_alloc: rename alloced to allocated mm/page_alloc: duplicate include linux/vmalloc.h mm, page_alloc: avoid page_to_pfn() in move_freepages() mm/Kconfig: remove default DISCONTIGMEM_MANUAL mm: page_alloc: dump migrate-failed pages mm/mempolicy: fix mpol_misplaced kernel-doc mm/mempolicy: rewrite alloc_pages_vma documentation mm/mempolicy: rewrite alloc_pages documentation mm/mempolicy: rename alloc_pages_current to alloc_pages ... |
||
Linus Torvalds
|
c70a4be130 |
powerpc updates for 5.13
- Enable KFENCE for 32-bit. - Implement EBPF for 32-bit. - Convert 32-bit to do interrupt entry/exit in C. - Convert 64-bit BookE to do interrupt entry/exit in C. - Changes to our signal handling code to use user_access_begin/end() more extensively. - Add support for time namespaces (CONFIG_TIME_NS) - A series of fixes that allow us to reenable STRICT_KERNEL_RWX. - Other smaller features, fixes & cleanups. Thanks to: Alexey Kardashevskiy, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Bixuan Cui, Cédric Le Goater, Chen Huang, Chris Packham, Christophe Leroy, Christopher M. Riedl, Colin Ian King, Dan Carpenter, Daniel Axtens, Daniel Henrique Barboza, David Gibson, Davidlohr Bueso, Denis Efremov, dingsenjie, Dmitry Safonov, Dominic DeMarco, Fabiano Rosas, Ganesh Goudar, Geert Uytterhoeven, Geetika Moolchandani, Greg Kurz, Guenter Roeck, Haren Myneni, He Ying, Jiapeng Chong, Jordan Niethe, Laurent Dufour, Lee Jones, Leonardo Bras, Li Huafei, Madhavan Srinivasan, Mahesh Salgaonkar, Masahiro Yamada, Nathan Chancellor, Nathan Lynch, Nicholas Piggin, Oliver O'Halloran, Paul Menzel, Pu Lehui, Randy Dunlap, Ravi Bangoria, Rosen Penev, Russell Currey, Santosh Sivaraj, Sebastian Andrzej Siewior, Segher Boessenkool, Shivaprasad G Bhat, Srikar Dronamraju, Stephen Rothwell, Thadeu Lima de Souza Cascardo, Thomas Gleixner, Tony Ambardar, Tyrel Datwyler, Vaibhav Jain, Vincenzo Frascino, Xiongwei Song, Yang Li, Yu Kuai, Zhang Yunkai. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmCLV1kTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLUyD/4jrTolG4sVec211hYO+0VuJzoqN4Cf j2CA2Ju39butnSMiq4LJUPRB7QRZY1OofkoNFpZeDQspjfZXPz2ulpYAz+SxHWE2 ReHPmWH1rOABlUPXFboePF4OLwmAs9eR5mN2z9HpKXbT3k78HaToLqiONyB4fVCr Q5TkJeRn/Y7ZJLdyPLTpczHHleQ8KoM6kT7ncXnTm6p97JOBJSrGaJ5N/8X5a4+e 6jtgB7Pvw8jNDShSr8BDLBgBZZcmoTiuG8KfgwRZ+m+mKB1yI2X8S/a54w/lDi9g UcSv3jQcFLJuW+T/pYe4R330uWDYa0cwjJOtMmsJ98S4EYOevoe9fZuL97qNshme xtBr4q1i03G1icYOJJ8dXtvabG2rUzj8t1SCDpwYfrynzTWVRikiQYTXUBhRSFoK nsoklvKd2IZa485XYJ2ljSyClMy8S4yJJ9RuzZ94DTXDSJUesKuyRWGnso4mhkcl wvl4wwMTJvnCMKVo6dsJyV24QWfd6dABxzm04uPA94CKhG33UwK8252jXVeaohSb WSO7qWBONgDXQLJ0mXRcEYa9NHvFS4Jnp6APbxnHr1gS+K+PNkD4gPBf34FoyN0E 9s27kvEYk5vr8APUclETF6+FkbGUD5bFbusjt3hYloFpAoHQ/k5pFVDsOZNPA8sW fDIRp05KunDojw== =dfKL -----END PGP SIGNATURE----- Merge tag 'powerpc-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Enable KFENCE for 32-bit. - Implement EBPF for 32-bit. - Convert 32-bit to do interrupt entry/exit in C. - Convert 64-bit BookE to do interrupt entry/exit in C. - Changes to our signal handling code to use user_access_begin/end() more extensively. - Add support for time namespaces (CONFIG_TIME_NS) - A series of fixes that allow us to reenable STRICT_KERNEL_RWX. - Other smaller features, fixes & cleanups. Thanks to Alexey Kardashevskiy, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Bixuan Cui, Cédric Le Goater, Chen Huang, Chris Packham, Christophe Leroy, Christopher M. Riedl, Colin Ian King, Dan Carpenter, Daniel Axtens, Daniel Henrique Barboza, David Gibson, Davidlohr Bueso, Denis Efremov, dingsenjie, Dmitry Safonov, Dominic DeMarco, Fabiano Rosas, Ganesh Goudar, Geert Uytterhoeven, Geetika Moolchandani, Greg Kurz, Guenter Roeck, Haren Myneni, He Ying, Jiapeng Chong, Jordan Niethe, Laurent Dufour, Lee Jones, Leonardo Bras, Li Huafei, Madhavan Srinivasan, Mahesh Salgaonkar, Masahiro Yamada, Nathan Chancellor, Nathan Lynch, Nicholas Piggin, Oliver O'Halloran, Paul Menzel, Pu Lehui, Randy Dunlap, Ravi Bangoria, Rosen Penev, Russell Currey, Santosh Sivaraj, Sebastian Andrzej Siewior, Segher Boessenkool, Shivaprasad G Bhat, Srikar Dronamraju, Stephen Rothwell, Thadeu Lima de Souza Cascardo, Thomas Gleixner, Tony Ambardar, Tyrel Datwyler, Vaibhav Jain, Vincenzo Frascino, Xiongwei Song, Yang Li, Yu Kuai, and Zhang Yunkai. * tag 'powerpc-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (302 commits) powerpc/signal32: Fix erroneous SIGSEGV on RT signal return powerpc: Avoid clang uninitialized warning in __get_user_size_allowed powerpc/papr_scm: Mark nvdimm as unarmed if needed during probe powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n powerpc/kasan: Fix shadow start address with modules powerpc/kernel/iommu: Use largepool as a last resort when !largealloc powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs powerpc/44x: fix spelling mistake in Kconfig "varients" -> "variants" powerpc/iommu: Annotate nested lock for lockdep powerpc/iommu: Do not immediately panic when failed IOMMU table allocation powerpc/iommu: Allocate it_map by vmalloc selftests/powerpc: remove unneeded semicolon powerpc/64s: remove unneeded semicolon powerpc/eeh: remove unneeded semicolon powerpc/selftests: Add selftest to test concurrent perf/ptrace events powerpc/selftests/perf-hwbreak: Add testcases for 2nd DAWR powerpc/selftests/perf-hwbreak: Coalesce event creation code powerpc/selftests/ptrace-hwbreak: Add testcases for 2nd DAWR powerpc/configs: Add IBMVNIC to some 64-bit configs selftests/powerpc: Add uaccess flush test ... |
||
Uladzislau Rezki (Sony)
|
7bc4ca3ea9 |
vm/test_vmalloc.sh: adapt for updated driver interface
A 'single_cpu_test' parameter is odd and it does not exist anymore. Instead there was introduced a 'nr_threads' one. If it is not set it behaves as the former parameter. That is why update a "stress mode" according to this change specifying number of workers which are equal to number of CPUs. Also update an output of help message based on a new interface. Link: https://lkml.kernel.org/r/20210402202237.20334-3-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Brian Geffon
|
8593100444 |
selftests: add a MREMAP_DONTUNMAP selftest for shmem
This test extends the current mremap tests to validate that the MREMAP_DONTUNMAP operation can be performed on shmem mappings. Link: https://lkml.kernel.org/r/20210323182520.2712101-3-bgeffon@google.com Signed-off-by: Brian Geffon <bgeffon@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Peter Xu <peterx@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Michael S . Tsirkin" <mst@redhat.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Sonny Rao <sonnyrao@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Dmitry Safonov <dima@arista.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Alejandro Colomar <alx.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Johannes Weiner
|
4bbcc5a41c |
kselftests: cgroup: update kmem test for new vmstat implementation
With memcg having switched to rstat, memory.stat output is precise. Update the cgroup selftest to reflect the expectations and error tolerances of the new implementation. Also add newly tracked types of memory to the memory.stat side of the equation, since they're included in memory.current and could throw false positives. Link: https://lkml.kernel.org/r/20210209163304.77088-9-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Florent Revest
|
f80f88f0e2 |
selftests/bpf: Fix the snprintf test
The BPF program for the snprintf selftest runs on all syscall entries.
On busy multicore systems this can cause concurrency issues.
For example it was observed that sometimes the userspace part of the
test reads " 4 0000" instead of " 4 000" (extra '0' at the end)
which seems to happen just before snprintf on another core sets
end[-1] = '\0'.
This patch adds a pid filter to the test to ensure that no
bpf_snprintf() will write over the test's output buffers while the
userspace reads the values.
Fixes:
|
||
Linus Torvalds
|
b0030af53a |
Kbuild updates for v5.13
- Evaluate $(call cc-option,...) etc. only for build targets - Add CONFIG_VMLINUX_MAP to generate .map file when linking vmlinux - Remove unnecessary --gcc-toolchains Clang flag because the --prefix flag finds the toolchains - Do not pass Clang's --prefix flag when using the integrated as - Check the assembler version in Kconfig time - Add new CONFIG options, AS_VERSION, AS_IS_GNU, AS_IS_LLVM to clean up some dependencies in Kconfig - Fix invalid Module.symvers creation when building only modules without vmlinux - Fix false-positive modpost warnings when CONFIG_TRIM_UNUSED_KSYMS is set, but there is no module to build - Refactor module installation Makefile - Support zstd for module compression - Convert alpha and ia64 to use generic shell scripts to generate the syscall headers - Add a new elfnote to indicate if the kernel was built with LTO, which will be used by pahole - Flatten the directory structure under include/config/ so CONFIG options and filenames match - Change the deb source package name from linux-$(KERNELRELEASE) to linux-upstream -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmCKOLUVHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGdq8P/2z+saxIWGXVWt0ggavR0vimcY4e NQIKGu9uZpo/lfoC78UG8HO+XvzvPUrcRuOX+WIVr2GfScgVnweDukexUAY0/2oi 4UvqhndJ0sjEwRj8mXXJ0O+PED+OtgrqrbhkLq9wHQd/jpSD4XEWXwn1g1XVrTZu WbwP6b1G/Rnjp2lz3HKC017rPkmfsCFQB7r+hbJGKhT0rCaceheUuBvGa/XqLknr IOyaUAY76u3Gtj6fVY1rk70kQgDMF8+LJPgdSSZ/XPCvbNJQAeop36EeRNfmxGIh vQhFJRJeqy+K5MhCpdGtTGYDawlmQVn/f/99SkDw9F04S4ZL2Xnaaqw4L1QDhjTh xBlckbPvmq36F4xSqWd5kYF3iwS+LsEJROwZKFLEVDb3zMsRQPEGQM/556QmrBi2 5KXzwOYEJKuobWr1hQ3PwLumJKTPGLvGEFB3Bq2eG8LrgpOAHPI4ejC2EBu0vCez QbskP2lPlMj3MbL5iZg+6ZRlOChZ7RUrSDj6+iTeOcinmXHqQONCL6qy+um4Rfcb zUkfwTlqM9d88u6AbO2VvQMOobMjvp4bvmqi/Xv8IiTukLHco4tc8zTuySmZwSyI rd3RKYn367qWztX5YyaoGRPVmlMG7ssbRc4fkXiV13vfeZebNfVwlX/CHv9+IWwN RVnMhYBhUZR68h6z =ti9L -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Evaluate $(call cc-option,...) etc. only for build targets - Add CONFIG_VMLINUX_MAP to generate .map file when linking vmlinux - Remove unnecessary --gcc-toolchains Clang flag because the --prefix flag finds the toolchains - Do not pass Clang's --prefix flag when using the integrated as - Check the assembler version in Kconfig time - Add new CONFIG options, AS_VERSION, AS_IS_GNU, AS_IS_LLVM to clean up some dependencies in Kconfig - Fix invalid Module.symvers creation when building only modules without vmlinux - Fix false-positive modpost warnings when CONFIG_TRIM_UNUSED_KSYMS is set, but there is no module to build - Refactor module installation Makefile - Support zstd for module compression - Convert alpha and ia64 to use generic shell scripts to generate the syscall headers - Add a new elfnote to indicate if the kernel was built with LTO, which will be used by pahole - Flatten the directory structure under include/config/ so CONFIG options and filenames match - Change the deb source package name from linux-$(KERNELRELEASE) to linux-upstream * tag 'kbuild-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (42 commits) kbuild: Add $(KBUILD_HOSTLDFLAGS) to 'has_libelf' test kbuild: deb-pkg: change the source package name to linux-upstream tools: do not include scripts/Kbuild.include kbuild: redo fake deps at include/config/*.h kbuild: remove TMPO from try-run MAINTAINERS: add pattern for dummy-tools kbuild: add an elfnote for whether vmlinux is built with lto ia64: syscalls: switch to generic syscallhdr.sh ia64: syscalls: switch to generic syscalltbl.sh alpha: syscalls: switch to generic syscallhdr.sh alpha: syscalls: switch to generic syscalltbl.sh sysctl: use min() helper for namecmp() kbuild: add support for zstd compressed modules kbuild: remove CONFIG_MODULE_COMPRESS kbuild: merge scripts/Makefile.modsign to scripts/Makefile.modinst kbuild: move module strip/compression code into scripts/Makefile.modinst kbuild: refactor scripts/Makefile.modinst kbuild: rename extmod-prefix to extmod_prefix kbuild: check module name conflict for external modules as well kbuild: show the target directory for depmod log ... |
||
Linus Torvalds
|
9d31d23389 |
Networking changes for 5.13.
Core: - bpf: - allow bpf programs calling kernel functions (initially to reuse TCP congestion control implementations) - enable task local storage for tracing programs - remove the need to store per-task state in hash maps, and allow tracing programs access to task local storage previously added for BPF_LSM - add bpf_for_each_map_elem() helper, allowing programs to walk all map elements in a more robust and easier to verify fashion - sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT redirection - lpm: add support for batched ops in LPM trie - add BTF_KIND_FLOAT support - mostly to allow use of BTF on s390 which has floats in its headers files - improve BPF syscall documentation and extend the use of kdoc parsing scripts we already employ for bpf-helpers - libbpf, bpftool: support static linking of BPF ELF files - improve support for encapsulation of L2 packets - xdp: restructure redirect actions to avoid a runtime lookup, improving performance by 4-8% in microbenchmarks - xsk: build skb by page (aka generic zerocopy xmit) - improve performance of software AF_XDP path by 33% for devices which don't need headers in the linear skb part (e.g. virtio) - nexthop: resilient next-hop groups - improve path stability on next-hops group changes (incl. offload for mlxsw) - ipv6: segment routing: add support for IPv4 decapsulation - icmp: add support for RFC 8335 extended PROBE messages - inet: use bigger hash table for IP ID generation - tcp: deal better with delayed TX completions - make sure we don't give up on fast TCP retransmissions only because driver is slow in reporting that it completed transmitting the original - tcp: reorder tcp_congestion_ops for better cache locality - mptcp: - add sockopt support for common TCP options - add support for common TCP msg flags - include multiple address ids in RM_ADDR - add reset option support for resetting one subflow - udp: GRO L4 improvements - improve 'forward' / 'frag_list' co-existence with UDP tunnel GRO, allowing the first to take place correctly even for encapsulated UDP traffic - micro-optimize dev_gro_receive() and flow dissection, avoid retpoline overhead on VLAN and TEB GRO - use less memory for sysctls, add a new sysctl type, to allow using u8 instead of "int" and "long" and shrink networking sysctls - veth: allow GRO without XDP - this allows aggregating UDP packets before handing them off to routing, bridge, OvS, etc. - allow specifing ifindex when device is moved to another namespace - netfilter: - nft_socket: add support for cgroupsv2 - nftables: add catch-all set element - special element used to define a default action in case normal lookup missed - use net_generic infra in many modules to avoid allocating per-ns memory unnecessarily - xps: improve the xps handling to avoid potential out-of-bound accesses and use-after-free when XPS change race with other re-configuration under traffic - add a config knob to turn off per-cpu netdev refcnt to catch underflows in testing Device APIs: - add WWAN subsystem to organize the WWAN interfaces better and hopefully start driving towards more unified and vendor- -independent APIs - ethtool: - add interface for reading IEEE MIB stats (incl. mlx5 and bnxt support) - allow network drivers to dump arbitrary SFP EEPROM data, current offset+length API was a poor fit for modern SFP which define EEPROM in terms of pages (incl. mlx5 support) - act_police, flow_offload: add support for packet-per-second policing (incl. offload for nfp) - psample: add additional metadata attributes like transit delay for packets sampled from switch HW (and corresponding egress and policy-based sampling in the mlxsw driver) - dsa: improve support for sandwiched LAGs with bridge and DSA - netfilter: - flowtable: use direct xmit in topologies with IP forwarding, bridging, vlans etc. - nftables: counter hardware offload support - Bluetooth: - improvements for firmware download w/ Intel devices - add support for reading AOSP vendor capabilities - add support for virtio transport driver - mac80211: - allow concurrent monitor iface and ethernet rx decap - set priority and queue mapping for injected frames - phy: add support for Clause-45 PHY Loopback - pci/iov: add sysfs MSI-X vector assignment interface to distribute MSI-X resources to VFs (incl. mlx5 support) New hardware/drivers: - dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit interfaces. - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and BCM63xx switches - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches - ath11k: support for QCN9074 a 802.11ax device - Bluetooth: Broadcom BCM4330 and BMC4334 - phy: Marvell 88X2222 transceiver support - mdio: add BCM6368 MDIO mux bus controller - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips - mana: driver for Microsoft Azure Network Adapter (MANA) - Actions Semi Owl Ethernet MAC - can: driver for ETAS ES58X CAN/USB interfaces Pure driver changes: - add XDP support to: enetc, igc, stmmac - add AF_XDP support to: stmmac - virtio: - page_to_skb() use build_skb when there's sufficient tailroom (21% improvement for 1000B UDP frames) - support XDP even without dedicated Tx queues - share the Tx queues with the stack when necessary - mlx5: - flow rules: add support for mirroring with conntrack, matching on ICMP, GTP, flex filters and more - support packet sampling with flow offloads - persist uplink representor netdev across eswitch mode changes - allow coexistence of CQE compression and HW time-stamping - add ethtool extended link error state reporting - ice, iavf: support flow filters, UDP Segmentation Offload - dpaa2-switch: - move the driver out of staging - add spanning tree (STP) support - add rx copybreak support - add tc flower hardware offload on ingress traffic - ionic: - implement Rx page reuse - support HW PTP time-stamping - octeon: support TC hardware offloads - flower matching on ingress and egress ratelimitting. - stmmac: - add RX frame steering based on VLAN priority in tc flower - support frame preemption (FPE) - intel: add cross time-stamping freq difference adjustment - ocelot: - support forwarding of MRP frames in HW - support multiple bridges - support PTP Sync one-step timestamping - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like learning, flooding etc. - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350, SC7280 SoCs) - mt7601u: enable TDLS support - mt76: - add support for 802.3 rx frames (mt7915/mt7615) - mt7915 flash pre-calibration support - mt7921/mt7663 runtime power management fixes Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmCKFPIACgkQMUZtbf5S Irtw0g/+NA8bWdHNgG4H5rya0pv2z3IieLRmSdDfKRQQXcJpklawc5MKVVaTee/Q 5/QqgPdCsu1LAU6JXBKsKmyDDaMlQKdWuKbOqDSiAQKoMesZStTEHf9d851ZzgxA Cdb6O7BD3lBl/IN+oxNG+KcmD1LKquTPKGySq2mQtEdLO12ekAsranzmj4voKffd q9tBShpXQ7Dq77DLYfiQXVCvsizNcbbJFuxX0o9Lpb9+61ZyYAbogZSa9ypiZZwR I/9azRBtJg7UV1aD/cLuAfy66Qh7t63+rCxVazs5Os8jVO26P/jQdisnnOe/x+p9 wYEmKm3GSu0V4SAPxkWW+ooKusflCeqDoMIuooKt6kbP6BRj540veGw3Ww/m5YFr 7pLQkTSP/tSjuGQIdBE1LOP5LBO8DZeC8Kiop9V0fzAW9hFSZbEq25WW0bPj8QQO zA4Z7yWlslvxcfY2BdJX3wD8klaINkl/8fDWZFFsBdfFX2VeLtm7Xfduw34BJpvU rYT3oWr6PhtkPAKR32SUcemSfeWgIVU41eSshzRz3kez1NngBUuLlSGGSEaKbes5 pZVt6pYFFVByyf6MTHFEoQvafZfEw04JILZpo4R5V8iTHzom0kD3Py064sBiXEw2 B6t+OW4qgcxGblpFkK2lD4kR2s1TPUs0ckVO6sAy1x8q60KKKjY= =vcbA -----END PGP SIGNATURE----- Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - bpf: - allow bpf programs calling kernel functions (initially to reuse TCP congestion control implementations) - enable task local storage for tracing programs - remove the need to store per-task state in hash maps, and allow tracing programs access to task local storage previously added for BPF_LSM - add bpf_for_each_map_elem() helper, allowing programs to walk all map elements in a more robust and easier to verify fashion - sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT redirection - lpm: add support for batched ops in LPM trie - add BTF_KIND_FLOAT support - mostly to allow use of BTF on s390 which has floats in its headers files - improve BPF syscall documentation and extend the use of kdoc parsing scripts we already employ for bpf-helpers - libbpf, bpftool: support static linking of BPF ELF files - improve support for encapsulation of L2 packets - xdp: restructure redirect actions to avoid a runtime lookup, improving performance by 4-8% in microbenchmarks - xsk: build skb by page (aka generic zerocopy xmit) - improve performance of software AF_XDP path by 33% for devices which don't need headers in the linear skb part (e.g. virtio) - nexthop: resilient next-hop groups - improve path stability on next-hops group changes (incl. offload for mlxsw) - ipv6: segment routing: add support for IPv4 decapsulation - icmp: add support for RFC 8335 extended PROBE messages - inet: use bigger hash table for IP ID generation - tcp: deal better with delayed TX completions - make sure we don't give up on fast TCP retransmissions only because driver is slow in reporting that it completed transmitting the original - tcp: reorder tcp_congestion_ops for better cache locality - mptcp: - add sockopt support for common TCP options - add support for common TCP msg flags - include multiple address ids in RM_ADDR - add reset option support for resetting one subflow - udp: GRO L4 improvements - improve 'forward' / 'frag_list' co-existence with UDP tunnel GRO, allowing the first to take place correctly even for encapsulated UDP traffic - micro-optimize dev_gro_receive() and flow dissection, avoid retpoline overhead on VLAN and TEB GRO - use less memory for sysctls, add a new sysctl type, to allow using u8 instead of "int" and "long" and shrink networking sysctls - veth: allow GRO without XDP - this allows aggregating UDP packets before handing them off to routing, bridge, OvS, etc. - allow specifing ifindex when device is moved to another namespace - netfilter: - nft_socket: add support for cgroupsv2 - nftables: add catch-all set element - special element used to define a default action in case normal lookup missed - use net_generic infra in many modules to avoid allocating per-ns memory unnecessarily - xps: improve the xps handling to avoid potential out-of-bound accesses and use-after-free when XPS change race with other re-configuration under traffic - add a config knob to turn off per-cpu netdev refcnt to catch underflows in testing Device APIs: - add WWAN subsystem to organize the WWAN interfaces better and hopefully start driving towards more unified and vendor- independent APIs - ethtool: - add interface for reading IEEE MIB stats (incl. mlx5 and bnxt support) - allow network drivers to dump arbitrary SFP EEPROM data, current offset+length API was a poor fit for modern SFP which define EEPROM in terms of pages (incl. mlx5 support) - act_police, flow_offload: add support for packet-per-second policing (incl. offload for nfp) - psample: add additional metadata attributes like transit delay for packets sampled from switch HW (and corresponding egress and policy-based sampling in the mlxsw driver) - dsa: improve support for sandwiched LAGs with bridge and DSA - netfilter: - flowtable: use direct xmit in topologies with IP forwarding, bridging, vlans etc. - nftables: counter hardware offload support - Bluetooth: - improvements for firmware download w/ Intel devices - add support for reading AOSP vendor capabilities - add support for virtio transport driver - mac80211: - allow concurrent monitor iface and ethernet rx decap - set priority and queue mapping for injected frames - phy: add support for Clause-45 PHY Loopback - pci/iov: add sysfs MSI-X vector assignment interface to distribute MSI-X resources to VFs (incl. mlx5 support) New hardware/drivers: - dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit interfaces. - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and BCM63xx switches - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches - ath11k: support for QCN9074 a 802.11ax device - Bluetooth: Broadcom BCM4330 and BMC4334 - phy: Marvell 88X2222 transceiver support - mdio: add BCM6368 MDIO mux bus controller - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips - mana: driver for Microsoft Azure Network Adapter (MANA) - Actions Semi Owl Ethernet MAC - can: driver for ETAS ES58X CAN/USB interfaces Pure driver changes: - add XDP support to: enetc, igc, stmmac - add AF_XDP support to: stmmac - virtio: - page_to_skb() use build_skb when there's sufficient tailroom (21% improvement for 1000B UDP frames) - support XDP even without dedicated Tx queues - share the Tx queues with the stack when necessary - mlx5: - flow rules: add support for mirroring with conntrack, matching on ICMP, GTP, flex filters and more - support packet sampling with flow offloads - persist uplink representor netdev across eswitch mode changes - allow coexistence of CQE compression and HW time-stamping - add ethtool extended link error state reporting - ice, iavf: support flow filters, UDP Segmentation Offload - dpaa2-switch: - move the driver out of staging - add spanning tree (STP) support - add rx copybreak support - add tc flower hardware offload on ingress traffic - ionic: - implement Rx page reuse - support HW PTP time-stamping - octeon: support TC hardware offloads - flower matching on ingress and egress ratelimitting. - stmmac: - add RX frame steering based on VLAN priority in tc flower - support frame preemption (FPE) - intel: add cross time-stamping freq difference adjustment - ocelot: - support forwarding of MRP frames in HW - support multiple bridges - support PTP Sync one-step timestamping - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like learning, flooding etc. - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350, SC7280 SoCs) - mt7601u: enable TDLS support - mt76: - add support for 802.3 rx frames (mt7915/mt7615) - mt7915 flash pre-calibration support - mt7921/mt7663 runtime power management fixes" * tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits) net: selftest: fix build issue if INET is disabled net: netrom: nr_in: Remove redundant assignment to ns net: tun: Remove redundant assignment to ret net: phy: marvell: add downshift support for M88E1240 net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255 net/sched: act_ct: Remove redundant ct get and check icmp: standardize naming of RFC 8335 PROBE constants bpf, selftests: Update array map tests for per-cpu batched ops bpf: Add batched ops support for percpu array bpf: Implement formatted output helpers with bstr_printf seq_file: Add a seq_bprintf function sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues net:nfc:digital: Fix a double free in digital_tg_recv_dep_req net: fix a concurrency bug in l2tp_tunnel_register() net/smc: Remove redundant assignment to rc mpls: Remove redundant assignment to err llc2: Remove redundant assignment to rc net/tls: Remove redundant initialization of record rds: Remove redundant assignment to nr_sig dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0 ... |
||
Arnaldo Carvalho de Melo
|
c6e3bf4371 |
perf build: Defer printing detected features to the end of all feature checks
We were doing it in tools/build/Makefile.feature, after running the feature checks, but then in tools/perf/Makefile.config we can call more feature checks when we notice that some feature check failed, like when libbfd wasn't detected and we add libraries to the LDFLAGS of its feature check to try again, etc. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
19177bc3da |
tools build: Allow deferring printing the results of feature detection
By setting FEATURE_DISPLAY_DEFERRED=1 a tool may ask for the printout of the detected features in tools/build/Makefile.feature to be done later adter extra feature checks are done that are tool specific. The perf tool will do it via its tools/perf/Makefile.config, as it performs such extra feature checks. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jiri Olsa
|
fbed59f844 |
perf build: Regenerate the FEATURE_DUMP file after extra feature checks
Feature detection is done in tools/build/Makefile.feature, we may exit there with some features not detected and then, in tools/perf/Makefile.config try adding extra libraries to link and then do extra feature checks to see if we now find the feature. This is the case with the disassembler-four-args that checks if the diassembler() function in libopcodes (binutils) has a signature with one or with four arguments, as this is not ABI and they changed it at some point. This is not a problem when doing normal builds, for instance: $ make -C tools/perf O=/tmp/build/perf As we don't use what is in FEATURE-DUMP at that point, but is a problem if we pass FEATURE_DUMP=/previously-detected-features as we do in 'make -C tools/perf build-test' to reuse the feature detection in the many build combinations we test there. When that is done feature-disassembler-four-args will be set to 0, but opensuse 15.1 has the four arguments function signature in disassembler(). The build thus fails. Fix it by rewriting the FEATURE-DUMP file at the end of tools/perf/Makefile.config to register features we retested in that make file. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Leo Yan
|
81e70d7ee4 |
perf session: Dump PERF_RECORD_TIME_CONV event
Now perf tool uses the common stub function process_event_op2_stub() for dumping TIME_CONV event, thus it doesn't output the clock parameters contained in the event. This patch adds the callback function for dumping the hardware clock parameters in TIME_CONV event. Before: # perf report -D 0x978 [0x38]: event: 79 . . ... raw event: size 56 bytes . 0000: 4f 00 00 00 00 00 38 00 15 00 00 00 00 00 00 00 O.....8......... . 0010: 00 00 40 01 00 00 00 00 86 89 0b bf df ff ff ff ..@........<BF><DF><FF><FF><FF> . 0020: d1 c1 b2 39 03 00 00 00 ff ff ff ff ff ff ff 00 <D1><C1><B2>9....<FF><FF><FF><FF><FF><FF><FF>. . 0030: 01 01 00 00 00 00 00 00 ........ 0 0 0x978 [0x38]: PERF_RECORD_TIME_CONV : unhandled! [...] After: # perf report -D 0x978 [0x38]: event: 79 . . ... raw event: size 56 bytes . 0000: 4f 00 00 00 00 00 38 00 15 00 00 00 00 00 00 00 O.....8......... . 0010: 00 00 40 01 00 00 00 00 86 89 0b bf df ff ff ff ..@........<BF><DF><FF><FF><FF> . 0020: d1 c1 b2 39 03 00 00 00 ff ff ff ff ff ff ff 00 <D1><C1><B2>9....<FF><FF><FF><FF><FF><FF><FF>. . 0030: 01 01 00 00 00 00 00 00 ........ 0 0 0x978 [0x38]: PERF_RECORD_TIME_CONV ... Time Shift 21 ... Time Muliplier 20971520 ... Time Zero 18446743935180835206 ... Time Cycles 13852918225 ... Time Mask 0xffffffffffffff ... Cap Time Zero 1 ... Cap Time Short 1 : unhandled! [...] Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve MacLean <Steve.MacLean@Microsoft.com> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/r/20210428120915.7123-5-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Leo Yan
|
050ffc4490 |
perf session: Add swap operation for event TIME_CONV
Since commit |
||
Leo Yan
|
aa616f5a8a |
perf jit: Let convert_timestamp() to be backwards-compatible
Commit |
||
Leo Yan
|
e1d380ea8b |
perf tools: Change fields type in perf_record_time_conv
C standard claims "An object declared as type _Bool is large enough to
store the values 0 and 1", bool type size can be 1 byte or larger than
1 byte. Thus it's uncertian for bool type size with different
compilers.
This patch changes the bool type in structure perf_record_time_conv to
__u8 type, and pads extra bytes for 8-byte alignment; this can give
reliable structure size.
Fixes:
|
||
Michael Petlan
|
56d32d4cac |
perf tools: Enable libtraceevent dynamic linking
Currently we support only static linking with kernel's libtraceevent (tools/lib/traceevent). This patch adds libtraceevent package detection and support to link perf with it dynamically. The libtraceevent package status is displayed with: $ make VF=1 LIBTRACEEVENT_DYNAMIC=1 ... ... libtraceevent: [ on ] Default behavior remains the same (static linking). Committer testing: $ make LIBTRACEEVENT_DYNAMIC=1 VF=1 O=/tmp/build/perf -C tools/perf install-bin |& grep traceevent Makefile.config:1090: *** Error: No libtraceevent devel library found, please install libtraceevent-devel. Stop. $ Signed-off-by: Michael Petlan <mpetlan@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> LPU-Reference: 20210428092023.4009-1-mpetlan@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
2750ce1d4d |
perf Documentation: Document intel-hybrid support
Add some words and examples to help understanding of Intel hybrid perf support. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-27-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
a37f3b8856 |
perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid
Currently we don't support shadow stat for hybrid. root@ssp-pwrt-002:~# ./perf stat -e cycles,instructions -a -- sleep 1 Performance counter stats for 'system wide': 12,883,109,591 cpu_core/cycles/ 6,405,163,221 cpu_atom/cycles/ 555,553,778 cpu_core/instructions/ 841,158,734 cpu_atom/instructions/ 1.002644773 seconds time elapsed Now there is no shadow stat 'insn per cycle' reported. We will support it later and now just skip the 'perf stat metrics (shadow stat) test'. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-26-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
d9da6f70eb |
perf tests: Support 'Convert perf time to TSC' test for hybrid
Since for "cycles:u' on hybrid platform, it creates two "cycles". So the second evsel in evlist also needs initialization. With this patch, # ./perf test 71 71: Convert perf time to TSC : Ok Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-25-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
c102038892 |
perf tests: Support 'Session topology' test for hybrid
Force to create one event "cpu_core/cycles/" by default, otherwise in evlist__valid_sample_type, the checking of 'if (evlist->core.nr_entries == 1)' would be failed. # ./perf test 41 41: Session topology : Ok Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-24-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
6081e876ed |
perf tests: Support 'Parse and process metrics' test for hybrid
Some events are not supported. Only pick up some cases for hybrid. # ./perf test 68 68: Parse and process metrics : Ok Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-23-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
43eb05d066 |
perf tests: Support 'Track with sched_switch' test for hybrid
Since for "cycles:u' on hybrid platform, it creates two "cycles". So the number of events in evlist is not expected in next test steps. Now we just use one event "cpu_core/cycles:u/" for hybrid. # ./perf test 35 35: Track with sched_switch : Ok Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-22-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
f15da0b1fb |
perf tests: Skip 'Setup struct perf_event_attr' test for hybrid
For hybrid, the attr.type consists of pmu type id + original type. There will be much changes for this test. Now we temporarily skip this test case and TODO in future. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-21-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
afff9f312e |
perf tests: Add hybrid cases for 'Roundtrip evsel->name' test
Since for one hw event, two hybrid events are created. For example, evsel->idx evsel__name(evsel) 0 cycles 1 cycles 2 instructions 3 instructions ... So for comparing the evsel name on hybrid, the evsel->idx needs to be divided by 2. # ./perf test 14 14: Roundtrip evsel->name : Ok Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-20-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
2541cb63ac |
perf tests: Add hybrid cases for 'Parse event definition strings' test
Add basic hybrid test cases for 'Parse event definition strings' test. # perf test 6 6: Parse event definition strings : Ok Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-19-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
91c0f5ec81 |
perf record: Uniquify hybrid event name
For perf-record, it would be useful to tell user the pmu which the event belongs to. For example, # perf record -a -- sleep 1 # perf report # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 106 of event 'cpu_core/cycles/' # Event count (approx.): 22043448 # # Overhead Command Shared Object Symbol # ........ ............ ....................... ............................ # ... Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-18-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
660e533e87 |
perf stat: Warn group events from different hybrid PMU
If a group has events which are from different hybrid PMUs, shows a warning: "WARNING: events in group from different hybrid PMUs!" This is to remind the user not to put the core event and atom event into one group. Next, just disable grouping. # perf stat -e "{cpu_core/cycles/,cpu_atom/cycles/}" -a -- sleep 1 WARNING: events in group from different hybrid PMUs! WARNING: grouped events cpus do not match, disabling group: anon group { cpu_core/cycles/, cpu_atom/cycles/ } Performance counter stats for 'system wide': 5,438,125 cpu_core/cycles/ 3,914,586 cpu_atom/cycles/ 1.004250966 seconds time elapsed Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-17-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
92637cc729 |
perf stat: Filter out unmatched aggregation for hybrid event
perf-stat has supported some aggregation modes, such as --per-core, --per-socket and etc. While for hybrid event, it may only available on part of cpus. So for --per-core, we need to filter out the unavailable cores, for --per-socket, filter out the unavailable sockets, and so on. Before: # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1 Performance counter stats for 'system wide': S0-D0-C0 2 479,530 cpu_core/cycles/ S0-D0-C4 2 175,007 cpu_core/cycles/ S0-D0-C8 2 166,240 cpu_core/cycles/ S0-D0-C12 2 704,673 cpu_core/cycles/ S0-D0-C16 2 865,835 cpu_core/cycles/ S0-D0-C20 2 2,958,461 cpu_core/cycles/ S0-D0-C24 2 163,988 cpu_core/cycles/ S0-D0-C28 2 164,729 cpu_core/cycles/ S0-D0-C32 0 <not counted> cpu_core/cycles/ S0-D0-C33 0 <not counted> cpu_core/cycles/ S0-D0-C34 0 <not counted> cpu_core/cycles/ S0-D0-C35 0 <not counted> cpu_core/cycles/ S0-D0-C36 0 <not counted> cpu_core/cycles/ S0-D0-C37 0 <not counted> cpu_core/cycles/ S0-D0-C38 0 <not counted> cpu_core/cycles/ S0-D0-C39 0 <not counted> cpu_core/cycles/ 1.003597211 seconds time elapsed After: # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1 Performance counter stats for 'system wide': S0-D0-C0 2 210,428 cpu_core/cycles/ S0-D0-C4 2 444,830 cpu_core/cycles/ S0-D0-C8 2 435,241 cpu_core/cycles/ S0-D0-C12 2 423,976 cpu_core/cycles/ S0-D0-C16 2 859,350 cpu_core/cycles/ S0-D0-C20 2 1,559,589 cpu_core/cycles/ S0-D0-C24 2 163,924 cpu_core/cycles/ S0-D0-C28 2 376,610 cpu_core/cycles/ 1.003621290 seconds time elapsed Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Co-developed-by: Jiri Olsa <jolsa@redhat.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-16-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
ac2dc29edd |
perf stat: Add default hybrid events
Previously if '-e' is not specified in perf stat, some software events and hardware events are added to evlist by default. Before: # perf stat -a -- sleep 1 Performance counter stats for 'system wide': 24,044.40 msec cpu-clock # 23.946 CPUs utilized 99 context-switches # 4.117 /sec 24 cpu-migrations # 0.998 /sec 3 page-faults # 0.125 /sec 7,000,244 cycles # 0.000 GHz 2,955,024 instructions # 0.42 insn per cycle 608,941 branches # 25.326 K/sec 31,991 branch-misses # 5.25% of all branches 1.004106859 seconds time elapsed Among the events, cycles, instructions, branches and branch-misses are hardware events. One hybrid platform, two hardware events are created for one hardware event. cpu_core/cycles/, cpu_atom/cycles/, cpu_core/instructions/, cpu_atom/instructions/, cpu_core/branches/, cpu_atom/branches/, cpu_core/branch-misses/, cpu_atom/branch-misses/ These events would be added to evlist on hybrid platform. Since parse_events() has been supported to create two hardware events for one event on hybrid platform, so we just use parse_events(evlist, "cycles,instructions,branches,branch-misses") to create the default events and add them to evlist. After: # perf stat -a -- sleep 1 Performance counter stats for 'system wide': 24,043.99 msec cpu-clock # 23.991 CPUs utilized 139 context-switches # 5.781 /sec 25 cpu-migrations # 1.040 /sec 6 page-faults # 0.250 /sec 10,381,751 cpu_core/cycles/ # 431.782 K/sec 1,264,216 cpu_atom/cycles/ # 52.579 K/sec 3,406,958 cpu_core/instructions/ # 141.697 K/sec 414,588 cpu_atom/instructions/ # 17.243 K/sec 705,149 cpu_core/branches/ # 29.327 K/sec 82,358 cpu_atom/branches/ # 3.425 K/sec 40,821 cpu_core/branch-misses/ # 1.698 K/sec 9,086 cpu_atom/branch-misses/ # 377.891 /sec 1.002228863 seconds time elapsed We can see two events are created for one hardware event. One TODO is, the shadow stats looks a bit different, now it's just 'M/sec'. The perf_stat__update_shadow_stats and perf_stat__print_shadow_stats need to be improved in future if we want to get the original shadow stats. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-15-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
b53a0755d5 |
perf record: Create two hybrid 'cycles' events by default
When evlist is empty, for example no '-e' specified in perf record, one default 'cycles' event is added to evlist. While on hybrid platform, it needs to create two default 'cycles' events. One is for cpu_core, the other is for cpu_atom. This patch actually calls evsel__new_cycles() two times to create two 'cycles' events. # ./perf record -vv -a -- sleep 1 ... ------------------------------------------------------------ perf_event_attr: size 120 config 0x400000000 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 inherit 1 freq 1 precise_ip 3 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 6 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 7 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 9 sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 10 sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 11 sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 12 sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 13 sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 14 sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 15 sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 16 sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 17 sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 18 sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 19 sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 20 sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 21 ------------------------------------------------------------ perf_event_attr: size 120 config 0x800000000 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 inherit 1 freq 1 precise_ip 3 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 22 sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 23 sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 24 sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 25 sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 26 sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 27 sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 28 sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 29 ------------------------------------------------------------ We have to create evlist-hybrid.c otherwise due to the symbol dependency the perf test python would be failed. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-14-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
5e4edd1f73 |
perf parse-events: Support event inside hybrid pmu
On hybrid platform, user may want to enable events on one pmu. Following syntax are supported: cpu_core/<event>/ cpu_atom/<event>/ But the syntax doesn't work for cache event. Before: # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1 event syntax error: 'cpu_core/LLC-loads/' \___ unknown term 'LLC-loads' for pmu 'cpu_core' Cache events are a bit complex. We can't create aliases for them. We use another solution. For example, if we use "cpu_core/LLC-loads/", in parse_events_add_pmu(), term->config is "LLC-loads". Then we create a new parser to scan "LLC-loads". The parse_events_add_cache() would be called during parsing. The parse_state->hybrid_pmu_name is used to identify the pmu where the event should be enabled on. After: # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1 Performance counter stats for 'system wide': 24,593 cpu_core/LLC-loads/ 1.003911601 seconds time elapsed Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-13-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
c93afadc92 |
perf parse-events: Compare with hybrid pmu name
On hybrid platform, user may want to enable event only on one pmu. Following syntax will be supported: cpu_core/<event>/ cpu_atom/<event>/ For hardware event, hardware cache event and raw event, two events are created by default. We pass the specified pmu name in parse_state and it would be checked before event creation. So next only the event with the specified pmu would be created. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-12-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
94da591b1c |
perf parse-events: Create two hybrid raw events
On hybrid platform, same raw event is possible to be available on both cpu_core pmu and cpu_atom pmu. It's supported to create two raw events for one event encoding. For raw events, the attr.type is PMU type. # perf stat -e r3c -a -vv -- sleep 1 Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 4 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 4 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19 ------------------------------------------------------------ perf_event_attr: type 8 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 8 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27 r3c: 0: 434449 1001412521 1001412521 r3c: 1: 173162 1001482031 1001482031 r3c: 2: 231710 1001524974 1001524974 r3c: 3: 110012 1001563523 1001563523 r3c: 4: 191517 1001593221 1001593221 r3c: 5: 956458 1001628147 1001628147 r3c: 6: 416969 1001715626 1001715626 r3c: 7: 1047527 1001596650 1001596650 r3c: 8: 103877 1001633520 1001633520 r3c: 9: 70571 1001637898 1001637898 r3c: 10: 550284 1001714398 1001714398 r3c: 11: 1257274 1001738349 1001738349 r3c: 12: 107797 1001801432 1001801432 r3c: 13: 67471 1001836281 1001836281 r3c: 14: 286782 1001923161 1001923161 r3c: 15: 815509 1001952550 1001952550 r3c: 0: 95994 1002071117 1002071117 r3c: 1: 105570 1002142438 1002142438 r3c: 2: 115921 1002189147 1002189147 r3c: 3: 72747 1002238133 1002238133 r3c: 4: 103519 1002276753 1002276753 r3c: 5: 121382 1002315131 1002315131 r3c: 6: 80298 1002248050 1002248050 r3c: 7: 466790 1002278221 1002278221 r3c: 6821369 16026754282 16026754282 r3c: 1162221 8017758990 8017758990 Performance counter stats for 'system wide': 6,821,369 cpu_core/r3c/ 1,162,221 cpu_atom/r3c/ 1.002289965 seconds time elapsed Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-11-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
30def61f64 |
perf parse-events: Create two hybrid cache events
For cache events, they have pre-defined configs. The kernel needs to know where the cache event comes from (e.g. from cpu_core pmu or from cpu_atom pmu). But the perf type PERF_TYPE_HW_CACHE can't carry pmu information. Now the type PERF_TYPE_HW_CACHE is extended to be PMU aware type. The PMU type ID is stored at attr.config[63:32]. When enabling a hybrid cache event without specified pmu, such as, 'perf stat -e LLC-loads -a', two events are created automatically. One is for atom, the other is for core. # perf stat -e LLC-loads -a -vv -- sleep 1 Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x400000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x400000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19 ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x800000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x800000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27 LLC-loads: 0: 1507 1001800280 1001800280 LLC-loads: 1: 666 1001812250 1001812250 LLC-loads: 2: 3353 1001813453 1001813453 LLC-loads: 3: 514 1001848795 1001848795 LLC-loads: 4: 627 1001952832 1001952832 LLC-loads: 5: 4399 1001451154 1001451154 LLC-loads: 6: 1240 1001481052 1001481052 LLC-loads: 7: 478 1001520348 1001520348 LLC-loads: 8: 691 1001551236 1001551236 LLC-loads: 9: 310 1001578945 1001578945 LLC-loads: 10: 1018 1001594354 1001594354 LLC-loads: 11: 3656 1001622355 1001622355 LLC-loads: 12: 882 1001661416 1001661416 LLC-loads: 13: 506 1001693963 1001693963 LLC-loads: 14: 3547 1001721013 1001721013 LLC-loads: 15: 1399 1001734818 1001734818 LLC-loads: 0: 1314 1001793826 1001793826 LLC-loads: 1: 2857 1001752764 1001752764 LLC-loads: 2: 646 1001830694 1001830694 LLC-loads: 3: 1612 1001864861 1001864861 LLC-loads: 4: 2244 1001912381 1001912381 LLC-loads: 5: 1255 1001943889 1001943889 LLC-loads: 6: 4624 1002021109 1002021109 LLC-loads: 7: 2703 1001959302 1001959302 LLC-loads: 24793 16026838264 16026838264 LLC-loads: 17255 8015078826 8015078826 Performance counter stats for 'system wide': 24,793 cpu_core/LLC-loads/ 17,255 cpu_atom/LLC-loads/ 1.001970988 seconds time elapsed 0x4 in 0x400000002 indicates the cpu_core pmu. 0x8 in 0x800000002 indicates the cpu_atom pmu. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-10-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
9cbfa2f64c |
perf parse-events: Create two hybrid hardware events
Current hardware events has special perf types PERF_TYPE_HARDWARE.
But it doesn't pass the PMU type in the user interface. For a hybrid
system, the perf kernel doesn't know which PMU the events belong to.
So now this type is extended to be PMU aware type. The PMU type ID
is stored at attr.config[63:32].
PMU type ID is retrieved from sysfs.
root@lkp-adl-d01:/sys/devices/cpu_atom# cat type
8
root@lkp-adl-d01:/sys/devices/cpu_core# cat type
4
When enabling a hybrid hardware event without specified pmu, such as,
'perf stat -e cycles -a', two events are created automatically. One
is for atom, the other is for core.
# perf stat -e cycles -a -vv -- sleep 1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
size 120
config 0x400000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
...
------------------------------------------------------------
perf_event_attr:
size 120
config 0x400000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
------------------------------------------------------------
perf_event_attr:
size 120
config 0x800000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
------------------------------------------------------------
...
------------------------------------------------------------
perf_event_attr:
size 120
config 0x800000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
cycles: 0: 836272 1001525722 1001525722
cycles: 1: 628564 1001580453 1001580453
cycles: 2: 872693 1001605997 1001605997
cycles: 3: 70417 1001641369 1001641369
cycles: 4: 88593 1001726722 1001726722
cycles: 5: 470495 1001752993 1001752993
cycles: 6: 484733 1001840440 1001840440
cycles: 7:
|
||
Jin Yao
|
12279429d8 |
perf stat: Uniquify hybrid event name
It would be useful to let user know the pmu which the event belongs to. perf-stat has supported '--no-merge' option and it can print the pmu name after the event name, such as: "cycles [cpu_core]" Now this option is enabled by default for hybrid platform but change the format to: "cpu_core/cycles/" If user configs the name, we still use the user specified name. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> ink: https://lore.kernel.org/r/20210427070139.25256-8-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
c5a26ea490 |
perf pmu: Add hybrid helper functions
The functions perf_pmu__is_hybrid and perf_pmu__find_hybrid_pmu can be used to identify the hybrid platform and return the found hybrid cpu pmu. All the detected hybrid pmus have been saved in 'perf_pmu__hybrid_pmus' list. So we just need to search this list. perf_pmu__hybrid_type_to_pmu converts the user specified string to hybrid pmu name. This is used to support the '--cputype' option in next patches. perf_pmu__has_hybrid checks the existing of hybrid pmu. Note that, we have to define it in pmu.c (make pmu-hybrid.c no more symbol dependency), otherwise perf test python would be failed. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-7-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
444624307c |
perf pmu: Save detected hybrid pmus to a global pmu list
We identify the cpu_core pmu and cpu_atom pmu by explicitly checking following files: For cpu_core, checks: "/sys/bus/event_source/devices/cpu_core/cpus" For cpu_atom, checks: "/sys/bus/event_source/devices/cpu_atom/cpus" If the 'cpus' file exists and it has data, the pmu exists. But in order not to hardcode the "cpu_core" and "cpu_atom", and make the code in a generic way. So if the path "/sys/bus/event_source/devices/cpu_xxx/cpus" exists, the hybrid pmu exists. All the detected hybrid pmus are linked to a global list 'perf_pmu__hybrid_pmus' and then next we just need to iterate the list to get all hybrid pmu by using perf_pmu__for_each_hybrid_pmu. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-6-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
32705de7d4 |
perf pmu: Save pmu name
On hybrid platform, one event is available on one pmu (such as, available on cpu_core or on cpu_atom). This patch saves the pmu name to the pmu field of struct perf_pmu_alias. Then next we can know the pmu which the event can be enabled on. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-5-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
eab35953e6 |
perf pmu: Simplify arguments of __perf_pmu__new_alias
Simplify the arguments of __perf_pmu__new_alias() by passing the whole 'struct pme_event' pointer. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-4-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
6b64833b9e |
perf jevents: Support unit value "cpu_core" and "cpu_atom"
For some Intel platforms, such as Alderlake, which is a hybrid platform and it consists of atom cpu and core cpu. Each cpu has dedicated event list. Part of events are available on core cpu, part of events are available on atom cpu. The kernel exports new cpu pmus: cpu_core and cpu_atom. The event in json is added with a new field "Unit" to indicate which pmu the event is available on. For example, one event in cache.json, { "BriefDescription": "Counts the number of load ops retired that", "CollectPEBSRecord": "2", "Counter": "0,1,2,3", "EventCode": "0xd2", "EventName": "MEM_LOAD_UOPS_RETIRED_MISC.MMIO", "PEBScounters": "0,1,2,3", "SampleAfterValue": "1000003", "UMask": "0x80", "Unit": "cpu_atom" }, The unit "cpu_atom" indicates this event is only available on "cpu_atom". In generated pmu-events.c, we can see: { .name = "mem_load_uops_retired_misc.mmio", .event = "period=1000003,umask=0x80,event=0xd2", .desc = "Counts the number of load ops retired that. Unit: cpu_atom ", .topic = "cache", .pmu = "cpu_atom", }, But if without this patch, the "uncore_" prefix is added before "cpu_atom", such as: .pmu = "uncore_cpu_atom" That would be a wrong pmu. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-3-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
4127361191 |
tools headers uapi: Update tools's copy of linux/perf_event.h
To get the changes in:
Liang Kan's patch
|
||
Namhyung Kim
|
462f57dbf9 |
perf report: Print percentage of each event statistics
It's sometimes useful to see how many samples vs other events in the data file with percent values. $ perf report --stat Aggregated stats: TOTAL events: 20064 MMAP events: 239 ( 1.2%) COMM events: 1518 ( 7.6%) EXIT events: 1 ( 0.0%) FORK events: 1517 ( 7.6%) SAMPLE events: 4015 (20.0%) MMAP2 events: 12769 (63.6%) FINISHED_ROUND events: 2 ( 0.0%) THREAD_MAP events: 1 ( 0.0%) CPU_MAP events: 1 ( 0.0%) TIME_CONV events: 1 ( 0.0%) cycles stats: SAMPLE events: 2475 instructions stats: SAMPLE events: 1540 Suggested-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427013717.1651674-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
8f08cf3330 |
perf report: Make --skip-empty as default
so that the compact output is shown by default. Also add 'report.skip-empty' config option to override the default. Users can also use --no-skip-empty command line option to change the behavior anytime. Committer testing: $ perf report --stat Aggregated stats: TOTAL events: 19 COMM events: 2 EXIT events: 1 SAMPLE events: 8 MMAP2 events: 4 FINISHED_ROUND events: 1 THREAD_MAP events: 1 CPU_MAP events: 1 TIME_CONV events: 1 cycles:u stats: SAMPLE events: 8 $ perf config report.skip-empty=false $ perf report --stat Aggregated stats: TOTAL events: 19 MMAP events: 0 LOST events: 0 COMM events: 2 EXIT events: 1 THROTTLE events: 0 UNTHROTTLE events: 0 FORK events: 0 READ events: 0 SAMPLE events: 8 MMAP2 events: 4 AUX events: 0 ITRACE_START events: 0 LOST_SAMPLES events: 0 SWITCH events: 0 SWITCH_CPU_WIDE events: 0 NAMESPACES events: 0 KSYMBOL events: 0 BPF_EVENT events: 0 CGROUP events: 0 TEXT_POKE events: 0 ATTR events: 0 EVENT_TYPE events: 0 TRACING_DATA events: 0 BUILD_ID events: 0 FINISHED_ROUND events: 1 ID_INDEX events: 0 AUXTRACE_INFO events: 0 AUXTRACE events: 0 AUXTRACE_ERROR events: 0 THREAD_MAP events: 1 CPU_MAP events: 1 STAT_CONFIG events: 0 STAT events: 0 STAT_ROUND events: 0 EVENT_UPDATE events: 0 TIME_CONV events: 1 FEATURE events: 0 COMPRESSED events: 0 cycles:u stats: SAMPLE events: 8 $ perf config report.skip-empty report.skip-empty=false $ Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427013717.1651674-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
2775de0b11 |
perf report: Add --skip-empty option to suppress 0 event stat
To make the output more readable, I think it's better to remove 0's in the output. Also the dummy event has no event stats so it just wasts the space. Let's use the --skip-empty option to suppress it. $ perf report --stat --skip-empty Aggregated stats: TOTAL events: 16530 MMAP events: 226 COMM events: 1596 EXIT events: 2 THROTTLE events: 121 UNTHROTTLE events: 117 FORK events: 1595 SAMPLE events: 719 MMAP2 events: 12147 CGROUP events: 2 FINISHED_ROUND events: 2 THREAD_MAP events: 1 CPU_MAP events: 1 TIME_CONV events: 1 cycles stats: SAMPLE events: 719 Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
55f7544438 |
perf report: Show event sample counts in --stat output
To make the output identical with perf report -D, it needs to show per-event sample counts along with the aggregated stat at the end. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427013717.1651674-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
0f0abbace3 |
perf hists: Split hists_stats from events_stats
Each struct hists have events_stats but most of the fields were not used. It's to count number of samples and periods whether filtered or not. And other fields are used only by evlist. So it'd be better to split hists_stats and events_stats to reduce wasted memory in the struct hists. This makes the output of event statistics in the perf report compact by skipping 0 events in each evsel/hists. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427013717.1651674-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
bf8f8587bf |
perf top: Use evlist->events_stat to count events
It's mainly to count lost events for the warning so it should be ok to use the evlist->stats instead. This is needed for changes in the next commit. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427013717.1651674-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Nicholas Fraser
|
d0713d4ca3 |
perf data: Add JSON export
This adds a feature to export perf data to JSON. The resolved symbols are exported into the JSON so that external tools don't need to load the dsos themselves (or even have access to them at all.) This makes it easy to load and analyze perf data with standalone tools where direct perf or libbabeltrace integration is impractical. The exporter uses a minimal inline JSON encoding without any external dependencies. Currently it only outputs some headers and sample metadata but it's easily extensible. Use it like this: $ perf data convert --to-json out.json Committer notes: Fixup a __printf() bug that broke the build: util/data-convert-json.c:103:11: error: expected ‘)’ before numeric constant 103 | __(printf, 5, 6) | ^~ | ) util/data-convert-json.c: In function ‘output_sample_callchain_entry’: util/data-convert-json.c:124:2: error: implicit declaration of function ‘output_json_key_format’; did you mean ‘output_json_format’? [-Werror=implicit-function-declaration] 124 | output_json_key_format(out, false, 5, "ip", "\"0x%" PRIx64 "\"", ip); | ^~~~~~~~~~~~~~~~~~~~~~ | output_json_format Also had to add this patch to fix errors reported by various versions of clang: - if (al && al->sym && al->sym->name && strlen(al->sym->name) > 0) { + if (al && al->sym && al->sym->namelen) { al->sym->name is a zero sized array, to avoid one extra alloc in the symbol__new() constructor, sym->namelen carries its strlen. Committer testing: $ ls -la out.json ls: cannot access 'out.json': No such file or directory $ perf record sleep 0.1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ] $ perf report --stats | grep -w SAMPLE SAMPLE events: 8 $ perf data convert --to-json out.json [ perf data convert: Converted 'perf.data' into JSON data 'out.json' ] [ perf data convert: Converted and wrote 0.002 MB (8 samples) ] $ ls -la out.json -rw-rw-r--. 1 acme acme 2017 Apr 26 17:29 out.json $ cat out.json { "linux-perf-json-version": 1, "headers": { "header-version": 1, "captured-on": "2021-04-26T20:28:57Z", "data-offset": 432, "data-size": 1016, "feat-offset": 1448, "hostname": "five", "os-release": "5.11.14-200.fc33.x86_64", "arch": "x86_64", "cpu-desc": "AMD Ryzen 9 3900X 12-Core Processor", "cpuid": "AuthenticAMD,23,113,0", "nrcpus-online": 24, "nrcpus-avail": 24, "perf-version": "5.12.gee134f3189bd", "cmdline": [ "/home/acme/bin/perf", "record", "sleep", "0.1" ] }, "samples": [ { "timestamp": 170517539043684, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa6268827" } ] }, { "timestamp": 170517539048443, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa661359d" } ] }, { "timestamp": 170517539051018, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa6311e18" } ] }, { "timestamp": 170517539053652, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0x7fdb77b4812b", "symbol": "_dl_start", "dso": "ld-2.32.so" } ] }, { "timestamp": 170517539055306, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa6269286" } ] }, { "timestamp": 170517539057590, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa62abd8b" } ] }, { "timestamp": 170517539067559, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0x7fdb77b5e9e9", "symbol": "__GI___tunables_init", "dso": "ld-2.32.so" } ] }, { "timestamp": 170517539282452, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0x7fdb779978d2", "symbol": "getenv", "dso": "libc-2.32.so" } ] } ] } $ Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tan Xiaojun <tanxiaojun@huawei.com> Cc: Ulrich Czekalla <uczekalla@codeweavers.com> Link: http://lore.kernel.org/lkml/3884969f-804d-2f53-c648-e2b0bd85edff@codeweavers.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Song Liu
|
5508c9dae2 |
perf stat: Introduce bpf_counter_ops->disable()
Introduce bpf_counter_ops->disable(), which is used stop counting the event. Committer notes: Added a dummy bpf_counter__disable() to the python binding to avoid having 'perf test python' failing. bpf_counter isn't supported in the python binding. Signed-off-by: Song Liu <song@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: kernel-team@fb.com Link: https://lore.kernel.org/r/20210425214333.1090950-6-song@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Song Liu
|
01bd8efcec |
perf stat: Introduce ':b' modifier
Introduce 'b' modifier to event parser, which means use BPF program to manage this event. This is the same as --bpf-counters option, but only applies to this event. For example, perf stat -e cycles:b,cs # use bpf for cycles, but not cs perf stat -e cycles,cs --bpf-counters # use bpf for both cycles and cs Suggested-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/r/20210425214333.1090950-5-song@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Song Liu
|
112cb56164 |
perf stat: Introduce config stat.bpf-counter-events
Currently, to use BPF to aggregate perf event counters, the user uses --bpf-counters option. Enable "use bpf by default" events with a config option, stat.bpf-counter-events. Events with name in the option will use BPF. This also enables mixed BPF event and regular event in the same sesssion. For example: perf config stat.bpf-counter-events=instructions perf stat -e instructions,cs The second command will use BPF for "instructions" but not "cs". Signed-off-by: Song Liu <song@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/r/20210425214333.1090950-4-song@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Song Liu
|
fe3dd8263b |
perf bpf: check perf_attr_map is compatible with the perf binary
perf_attr_map could be shared among different version of perf binary. Add bperf_attr_map_compatible() to check whether the existing attr_map is compatible with current perf binary. Signed-off-by: Song Liu <song@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: kernel-team@fb.com Link: https://lore.kernel.org/r/20210425214333.1090950-3-song@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Song Liu
|
ec8149fba6 |
perf util: Move bpf_perf definitions to a libperf header
By following the same protocol, other tools can share hardware PMCs with perf. Move perf_event_attr_map_entry and BPF_PERF_DEFAULT_ATTR_MAP_PATH to bpf_perf.h for other tools to use. Signed-off-by: Song Liu <song@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: kernel-team@fb.com Link: https://lore.kernel.org/r/20210425214333.1090950-2-song@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Linus Torvalds
|
42dec9a936 |
Perf events changes in this cycle were:
- Improve Intel uncore PMU support: - Parse uncore 'discovery tables' - a new hardware capability enumeration method introduced on the latest Intel platforms. This table is in a well-defined PCI namespace location and is read via MMIO. It is organized in an rbtree. These uncore tables will allow the discovery of standard counter blocks, but fancier counters still need to be enumerated explicitly. - Add Alder Lake support - Improve IIO stacks to PMON mapping support on Skylake servers - Add Intel Alder Lake PMU support - which requires the introduction of 'hybrid' CPUs and PMUs. Alder Lake is a mix of Golden Cove ('big') and Gracemont ('small' - Atom derived) cores. The CPU-side feature set is entirely symmetrical - but on the PMU side there's core type dependent PMU functionality. - Reduce data loss with CPU level hardware tracing on Intel PT / AUX profiling, by fixing the AUX allocation watermark logic. - Improve ring buffer allocation on NUMA systems - Put 'struct perf_event' into their separate kmem_cache pool - Add support for synchronous signals for select perf events. The immediate motivation is to support low-overhead sampling-based race detection for user-space code. The feature consists of the following main changes: - Add thread-only event inheritance via perf_event_attr::inherit_thread, which limits inheritance of events to CLONE_THREAD. - Add the ability for events to not leak through exec(), via perf_event_attr::remove_on_exec. - Allow the generation of SIGTRAP via perf_event_attr::sigtrap, extend siginfo with an u64 ::si_perf, and add the breakpoint information to ::si_addr and ::si_perf if the event is PERF_TYPE_BREAKPOINT. The siginfo support is adequate for breakpoints right now - but the new field can be used to introduce support for other types of metadata passed over siginfo as well. - Misc fixes, cleanups and smaller updates. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJGpERHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1j9zBAAuVbG2snV6SBSdXLhQcM66N3NckOXvSY5 QjjhQcuwJQEK/NJB3266K5d8qSmdyRBsWf3GCsrmyBT67P1V28K44Pu7oCV0UDtf mpVRjEP0oR7hNsANSSgo8Fa4ZD7H5waX7dK7925Tvw8By3mMoZoddiD/84WJHhxO NDF+GRFaRj+/dpbhV8cdCoXTjYdkC36vYuZs3b9lu0tS9D/AJgsNy7TinLvO02Cs 5peP+2y29dgvCXiGBiuJtEA6JyGnX3nUJCvfOZZ/DWDc3fdduARlRrc5Aiq4n/wY UdSkw1VTZBlZ1wMSdmHQVeC5RIH3uWUtRoNqy0Yc90lBm55AQ0EENwIfWDUDC5zy USdBqWTNWKMBxlEilUIyqKPQK8LW/31TRzqy8BWKPNcZt5yP5YS1SjAJRDDjSwL/ I+OBw1vjLJamYh8oNiD5b+VLqNQba81jFASfv+HVWcULumnY6ImECCpkg289Fkpi BVR065boifJDlyENXFbvTxyMBXQsZfA+EhtxG7ju2Ni+TokBbogyCb3L2injPt9g 7jjtTOqmfad4gX1WSc+215iYZMkgECcUd9E+BfOseEjBohqlo7yNKIfYnT8mE/Xq nb7eHjyvLiE8tRtZ+7SjsujOMHv9LhWFAbSaxU/kEVzpkp0zyd6mnnslDKaaHLhz goUMOL/D0lg= =NhQ7 -----END PGP SIGNATURE----- Merge tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event updates from Ingo Molnar: - Improve Intel uncore PMU support: - Parse uncore 'discovery tables' - a new hardware capability enumeration method introduced on the latest Intel platforms. This table is in a well-defined PCI namespace location and is read via MMIO. It is organized in an rbtree. These uncore tables will allow the discovery of standard counter blocks, but fancier counters still need to be enumerated explicitly. - Add Alder Lake support - Improve IIO stacks to PMON mapping support on Skylake servers - Add Intel Alder Lake PMU support - which requires the introduction of 'hybrid' CPUs and PMUs. Alder Lake is a mix of Golden Cove ('big') and Gracemont ('small' - Atom derived) cores. The CPU-side feature set is entirely symmetrical - but on the PMU side there's core type dependent PMU functionality. - Reduce data loss with CPU level hardware tracing on Intel PT / AUX profiling, by fixing the AUX allocation watermark logic. - Improve ring buffer allocation on NUMA systems - Put 'struct perf_event' into their separate kmem_cache pool - Add support for synchronous signals for select perf events. The immediate motivation is to support low-overhead sampling-based race detection for user-space code. The feature consists of the following main changes: - Add thread-only event inheritance via perf_event_attr::inherit_thread, which limits inheritance of events to CLONE_THREAD. - Add the ability for events to not leak through exec(), via perf_event_attr::remove_on_exec. - Allow the generation of SIGTRAP via perf_event_attr::sigtrap, extend siginfo with an u64 ::si_perf, and add the breakpoint information to ::si_addr and ::si_perf if the event is PERF_TYPE_BREAKPOINT. The siginfo support is adequate for breakpoints right now - but the new field can be used to introduce support for other types of metadata passed over siginfo as well. - Misc fixes, cleanups and smaller updates. * tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) signal, perf: Add missing TRAP_PERF case in siginfo_layout() signal, perf: Fix siginfo_t by avoiding u64 on 32-bit architectures perf/x86: Allow for 8<num_fixed_counters<16 perf/x86/rapl: Add support for Intel Alder Lake perf/x86/cstate: Add Alder Lake CPU support perf/x86/msr: Add Alder Lake CPU support perf/x86/intel/uncore: Add Alder Lake support perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE perf/x86/intel: Add Alder Lake Hybrid support perf/x86: Support filter_match callback perf/x86/intel: Add attr_update for Hybrid PMUs perf/x86: Add structures for the attributes of Hybrid PMUs perf/x86: Register hybrid PMUs perf/x86: Factor out x86_pmu_show_pmu_cap perf/x86: Remove temporary pmu assignment in event_init perf/x86/intel: Factor out intel_pmu_check_extra_regs perf/x86/intel: Factor out intel_pmu_check_event_constraints perf/x86/intel: Factor out intel_pmu_check_num_counters perf/x86: Hybrid PMU support for extra_regs perf/x86: Hybrid PMU support for event constraints ... |
||
Linus Torvalds
|
03b2cd72aa |
Objtool updates in this cycle were:
- Standardize the crypto asm code so that it looks like compiler-generated code to objtool - so that it can understand it. This enables unwinding from crypto asm code - and also fixes the last known remaining objtool warnings for LTO and more. - x86 decoder fixes: clean up and fix the decoder, and also extend it a bit - Misc fixes and cleanups Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJEOQRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1jN0A//dZIR9GPW1cHkPD3Na+lxb+RWgCVnFUgw meLEOum389zWr7S8YmcFpKWLy94f3l24i/e7ufKn6/RMdaQuT6pZUa6teiNPqDKN Qq1v6EX/LT49Q1zh/zCCnKdlmF1t7wDyA1/+HBLnB4YfYEtteEt+p2Apyv4xIHOl xWqaTMFcVR/El9FXSyqRWRR4zuqY0Uatz0fmfo5jmi2xq460k53fQlTLA/0w5Jw0 V3omyA3AYMUW6YlW5TGUINOhyDeAJm4PWl3siSUnSd6t8A/TVs5zpZX15BtseCle 0FRp2SbxOoVkiyo3N3XmkfYYns9+4wK7cr9qja9U9MsSBZJZwaBm2LO/t2WFrAhq 5dkOsoPmpIsjutsQnIhQgtVT9I/A4/u5m5Zi3trlXsBS0XAt/q+2GPfEngFmgb3q nae4rhGUsQ3NTGBiqNuMHQF4yeEvQZ8DCf3ytTz7DjBeiQ9nAtwzbUUGQjYl2mj1 ZPOnl7Xmq/Nyw+AmdpffFPiEUJxqEg9HWjDo7DQATXb3Hw2VJ3cU8jwPRqDDlO10 OB81vysXNGTmhOngHXexxncpmU9gDOIC1imZZpw5lNx4W9Qn20AlGaGAIbqzlfx0 p5VuhkIWCySe1bOZx03xuk7Gq7GBIPPy/a2m204Ftipetlo1HBYwT3KB/wVpHmh7 CSjWgdiW3+k= =poAZ -----END PGP SIGNATURE----- Merge tag 'objtool-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Standardize the crypto asm code so that it looks like compiler- generated code to objtool - so that it can understand it. This enables unwinding from crypto asm code - and also fixes the last known remaining objtool warnings for LTO and more. - x86 decoder fixes: clean up and fix the decoder, and also extend it a bit - Misc fixes and cleanups * tag 'objtool-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/crypto: Enable objtool in crypto code x86/crypto/sha512-ssse3: Standardize stack alignment prologue x86/crypto/sha512-avx2: Standardize stack alignment prologue x86/crypto/sha512-avx: Standardize stack alignment prologue x86/crypto/sha256-avx2: Standardize stack alignment prologue x86/crypto/sha1_avx2: Standardize stack alignment prologue x86/crypto/sha_ni: Standardize stack alignment prologue x86/crypto/crc32c-pcl-intel: Standardize jump table x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer x86/crypto/aesni-intel_avx: Standardize stack alignment prologue x86/crypto/aesni-intel_avx: Fix register usage comments x86/crypto/aesni-intel_avx: Remove unused macros objtool: Support asm jump tables objtool: Parse options from OBJTOOL_ARGS objtool: Collate parse_options() users objtool: Add --backup objtool,x86: More ModRM sugar objtool,x86: Rewrite ADD/SUB/AND objtool,x86: Support %riz encodings objtool,x86: Simplify register decode ... |
||
Linus Torvalds
|
0ff0edb550 |
Locking changes for this cycle were:
- rtmutex cleanup & spring cleaning pass that removes ~400 lines of code - Futex simplifications & cleanups - Add debugging to the CSD code, to help track down a tenacious race (or hw problem) - Add lockdep_assert_not_held(), to allow code to require a lock to not be held, and propagate this into the ath10k driver - Misc LKMM documentation updates - Misc KCSAN updates: cleanups & documentation updates - Misc fixes and cleanups - Fix locktorture bugs with ww_mutexes Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJDn0RHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1hPrRAAryS4zPnuDsfkVk0smxo7a0lK5ljbH2Xo 28QUZXOl6upnEV8dzbjwG7eAjt5ZJVI5tKIeG0PV0NUJH2nsyHwESdtULGGYuPf/ 4YUzNwZJa+nI/jeBnVsXCimLVxxnNCRdR7yOVOHm4ukEwa+YTNt1pvlYRmUd4YyH Q5cCrpb3THvLka3AAamEbqnHnAdGxHKuuHYVRkODpMQ+zrQvtN8antYsuk8kJsqM m+GZg/dVCuLEPah5k+lOACtcq/w7HCmTlxS8t4XLvD52jywFZLcCPvi1rk0+JR+k Vd9TngC09GJ4jXuDpr42YKkU9/X6qy2Es39iA/ozCvc1Alrhspx/59XmaVSuWQGo XYuEPx38Yuo/6w16haSgp0k4WSay15A4uhCTQ75VF4vli8Bqgg9PaxLyQH1uG8e2 xk8U90R7bDzLlhKYIx1Vu5Z0t7A1JtB5CJtgpcfg/zQLlzygo75fHzdAiU5fDBDm 3QQXSU2Oqzt7c5ZypioHWazARk7tL6th38KGN1gZDTm5zwifpaCtHi7sml6hhZ/4 ATH6zEPzIbXJL2UqumSli6H4ye5ORNjOu32r7YPqLI4IDbzpssfoSwfKYlQG4Tvn 4H1Ukirzni0gz5+wbleItzf2aeo1rocs4YQTnaT02j8NmUHUz4AzOHGOQFr5Tvh0 wk/P4MIoSb0= =cOOk -----END PGP SIGNATURE----- Merge tag 'locking-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - rtmutex cleanup & spring cleaning pass that removes ~400 lines of code - Futex simplifications & cleanups - Add debugging to the CSD code, to help track down a tenacious race (or hw problem) - Add lockdep_assert_not_held(), to allow code to require a lock to not be held, and propagate this into the ath10k driver - Misc LKMM documentation updates - Misc KCSAN updates: cleanups & documentation updates - Misc fixes and cleanups - Fix locktorture bugs with ww_mutexes * tag 'locking-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits) kcsan: Fix printk format string static_call: Relax static_call_update() function argument type static_call: Fix unused variable warn w/o MODULE locking/rtmutex: Clean up signal handling in __rt_mutex_slowlock() locking/rtmutex: Restrict the trylock WARN_ON() to debug locking/rtmutex: Fix misleading comment in rt_mutex_postunlock() locking/rtmutex: Consolidate the fast/slowpath invocation locking/rtmutex: Make text section and inlining consistent locking/rtmutex: Move debug functions as inlines into common header locking/rtmutex: Decrapify __rt_mutex_init() locking/rtmutex: Remove pointless CONFIG_RT_MUTEXES=n stubs locking/rtmutex: Inline chainwalk depth check locking/rtmutex: Move rt_mutex_debug_task_free() to rtmutex.c locking/rtmutex: Remove empty and unused debug stubs locking/rtmutex: Consolidate rt_mutex_init() locking/rtmutex: Remove output from deadlock detector locking/rtmutex: Remove rtmutex deadlock tester leftovers locking/rtmutex: Remove rt_mutex_timed_lock() MAINTAINERS: Add myself as futex reviewer locking/mutex: Remove repeated declaration ... |
||
Linus Torvalds
|
9a45da9270 |
RCU changes for this cycle were:
- Bitmap support for "N" as alias for last bit - kvfree_rcu updates - mm_dump_obj() updates. (One of these is to mm, but was suggested by Andrew Morton.) - RCU callback offloading update - Polling RCU grace-period interfaces - Realtime-related RCU updates - Tasks-RCU updates - Torture-test updates - Torture-test scripting updates - Miscellaneous fixes Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJCZERHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1hRjw/+Jkb9KvR9odPt/zqN/KPtIlburCUWgsFb 2zAlWN4uMocPAiXT2Xq58/8gqMkpyn7ZVZtL1tD8fZSvlwEr0U8Z74+/NdoQvYE+ kMXIYIuhIAGRyAupmzkriqN33iY+BSZPacX3u6ziPj57/0OZzbWVN/DAhbuvyLqG J/oL4PHCa7XAqXbf95rd5Zjs680QJ3CbTRh4nA8uHArzJmKZOaaHJ05Pxd1LpULe SJ+5p1GQnnwxd1HqmlHMDu/dW+2hE35BGykF8zi78je9OJXualDoM/6JpIYGhMNY 5qlhU55QYP1jzjuNGVZZUS4L77eS2/W7SpPAaTmMEy/SsVB59G8Kf22oNDpVaEqQ m+2ErqwaHvlkMjqnsx+JQbsOP0yCi2NZBoEPFdfk1H23E2deVlSDbxPso4Zb1oUD E12769kN+SWDytuLSOAe1PY/KXqmNUKjPZl1GDCGXL7HlCnWyggUDschTsKJa19O XXl+yCTGMUH4XAPSqavAKQbBjurqpT6i4zfooSH4TBtOHm1ExgZOUS8gglZ1JuJd q+uJdZIgS8BcGkGw/k1bYDWY5TA4Rjv3sAOKQL1PgYBl1t/yLK441mE7LI9gWOwz Crz7vlSxD6Jc2cYQeUVW0KPGt5aVd63Gd9HjpXxGkqYQSDRqYMCebHEAGagz+jj7 Nv/nOnf34Uc= =mpNt -----END PGP SIGNATURE----- Merge tag 'core-rcu-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: - Support for "N" as alias for last bit in bitmap parsing library (eg using syntax like "nohz_full=2-N") - kvfree_rcu updates - mm_dump_obj() updates. (One of these is to mm, but was suggested by Andrew Morton.) - RCU callback offloading update - Polling RCU grace-period interfaces - Realtime-related RCU updates - Tasks-RCU updates - Torture-test updates - Torture-test scripting updates - Miscellaneous fixes * tag 'core-rcu-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits) rcutorture: Test start_poll_synchronize_rcu() and poll_state_synchronize_rcu() rcu: Provide polling interfaces for Tiny RCU grace periods torture: Fix kvm.sh --datestamp regex check torture: Consolidate qemu-cmd duration editing into kvm-transform.sh torture: Print proper vmlinux path for kvm-again.sh runs torture: Make TORTURE_TRUST_MAKE available in kvm-again.sh environment torture: Make kvm-transform.sh update jitter commands torture: Add --duration argument to kvm-again.sh torture: Add kvm-again.sh to rerun a previous torture-test torture: Create a "batches" file for build reuse torture: De-capitalize TORTURE_SUITE torture: Make upper-case-only no-dot no-slash scenario names official torture: Rename SRCU-t and SRCU-u to avoid lowercase characters torture: Remove no-mpstat error message torture: Record kvm-test-1-run.sh and kvm-test-1-run-qemu.sh PIDs torture: Record jitter start/stop commands torture: Extract kvm-test-1-run-qemu.sh from kvm-test-1-run.sh torture: Record TORTURE_KCONFIG_GDB_ARG in qemu-cmd torture: Abstract jitter.sh start/stop into scripts rcu: Provide polling interfaces for Tree RCU grace periods ... |
||
Linus Torvalds
|
3aa139aa9f |
media updates for v5.13-rc1
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmCH7/AACgkQCF8+vY7k 4RVWVg//bcGubc1Sq68r0C+RMI1ETWMthtlkz9DTigDnht8uTmO+DohsE9R1mq4I szwwcwfMnMdbLB+zP7v1JNicT8GNMeFu7STurwj1w6YeQ5/IHcQc4P39+/8EAwqB hrUSIsY+RuYuv7dROwi45Yn8/mmWH4vdP3zrm3/k/EdlEBrf6C0e0KbGLqCr9Zrx pWVMB22RGfzh+1qQJhA43rDYceXs5/b5/Y1Dc/W97lJXyv87Hy0g333R+G1KzKLf 3fhluSaLHC1j6Mm7Vneowy3mDjeyPZBiRajvNbbApQMzKIa78rJGOjMEBynHW1Rl Np7N0cGIq2a/yai7fCQk1SO9MdsLzE8rG+X/KxS3LSgb8c1RT31VaZhRfD9Pn4Gr asJ/JO7SX3YHXDO8F9WtsD/Fi9AREPz2o3I1760zk5KNQf9uvxeWkQPHQkKLlO/7 6JWtmu5KbzUMqXZyekhcTDfqJE3xgUg36xLa7wDy8FlgDfE9MZUBaRwktHyfNgi0 6tDdv1/w0iEvglUGVLMlQRh2ysx3lgBuAmI8Jg1vSt9DhZLsj8DY89/JE3B+CCBX uT9Zwc/Ug/HEpYMEhptP26z/TgVIkxs5iCvKripJ5GKl/tEIqsHSjqW5lzgI/6O+ hKjxgZdeQ8+F9TJ5AMIBDU82trwDj77lIcHsRFRI6JLs/9L2PRo= =fvc5 -----END PGP SIGNATURE----- Merge tag 'media/v5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - addition of a maintainer's profile for the media subsystem - addition of i.MX8 IP support - qcom/camss gained support for hardware version Titan 170 - new RC keymaps - Lots of other improvements, cleanups and bug fixes * tag 'media/v5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (488 commits) media: coda: fix macroblocks count control usage media: rkisp1: params: fix wrong bits settings media: cedrus: Fix H265 status definitions media: meson-ge2d: fix rotation parameters media: v4l2-ctrls: fix reference to freed memory media: venus : hfi: add venus image info into smem media: venus: Fix internal buffer size calculations for v6. media: venus: helpers: keep max bandwidth when mbps exceeds the supported range media: venus: fix hw overload error log condition media: venus: core: correct firmware name for sm8250 media: venus: core,pm: fix potential infinite loop media: venus: core: Fix kerneldoc warnings media: gscpa/stv06xx: fix memory leak media: cx25821: remove unused including <linux/version.h> media: staging: media/meson: remove redundant dev_err call media: adv7842: support 1 block EDIDs, fix clearing EDID media: adv7842: configure all pads media: allegro: change kernel-doc comment blocks to normal comments media: camss: ispif: Remove redundant dev_err call in msm_ispif_subdev_init() media: i2c: rdamc21: Fix warning on u8 cast ... |
||
Linus Torvalds
|
1e9599dfc4 |
linux-kselftest-kunit-5.13-rc1
This KUnit update for Linux 5.13-rc1 consists of several fixes and new feature to support failure from dynamic analysis tools such as UBSAN and fake ops for testing. - a fake ops struct for testing a "free" function to complain if it was called with an invalid argument, or caught a double-free. Most return void and have no normal means of signalling failure (e.g. super_operations, iommu_ops, etc.). -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmCIpAEACgkQCwJExA0N QxyPtRAAgA9/X6zSvT1LftnTkgl0Zn+4Krbr7eNyvxcVdJkYrmW31Pk8PaTK02Z/ OVDzw0ESHaAN2SwDOmKFUyiob9qBZjf4lEHDkGWLnSkV3GHd9JrM+kceyX/0yr8E bpYvGZHBKe+SGtAr1H9DcUcfzCZlDUhUI2HOMwm/h29mFmmJOlrjvNUAGY4pnmti 6Mqh4GAoFxqQDDUANUIWinWU56rWrcIPQgxPD6/r2VvrC8gHtQ+2dBR8+3i2qSQO Ydzgno/8OmXfl606NMKh5DPS6E8uRrLw67fBB+YoEGK6E9w5yyXKKAbV3M3ekB/Y 3ze/SUi6IKrvdF6OEH4mX3rMNEKgRQoJLpQ8KyMvDMuIAdmq3xQQFDlSc2+gEnTj VnEREBhrIOh24GxZFTM6VvNmjNqdAq4/BMTt/LoSHEKwOASZi9udAnKjo67P/LB1 1+rcoKdn/OA7p9/Zo/ETpTkFvEDyF2TscaEgYz2aRiV0YgcncPc1RSDAzbghdlFP y10Dk3uARXdzqrd0Hb1B3HL+cAPZEINerqqAUl0ggWejcJjfUDMi7sQuKEK8I5yU 6sSXyVhCHDNmOUjuJjt5JdqeadLUDCkZqEnMMvwqKz00WKioQ2pqYy/UJ2rpWy+G +pymI17nRcOVfIIqnXitSCgch7cS0FZCjiOqZBBCCs9Axuv7mJU= =aP5e -----END PGP SIGNATURE----- Merge tag 'linux-kselftest-kunit-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull KUnit updates from Shuah Khan: "Several fixes and a new feature to support failure from dynamic analysis tools such as UBSAN and fake ops for testing. - a fake ops struct for testing a "free" function to complain if it was called with an invalid argument, or caught a double-free. Most return void and have no normal means of signalling failure (e.g. super_operations, iommu_ops, etc.)" * tag 'linux-kselftest-kunit-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: Documentation: kunit: add tips for using current->kunit_test kunit: fix -Wunused-function warning for __kunit_fail_current_test kunit: support failure from dynamic analysis tools kunit: tool: make --kunitconfig accept dirs, add lib/kunit fragment kunit: make KUNIT_EXPECT_STREQ() quote values, don't print literals kunit: Match parenthesis alignment to improve code readability |
||
Linus Torvalds
|
2a68c268a1 |
linux-kselftest-next-5.13-rc1
This Kselftest update for Linux 5.13-rc1 consists of: - fixes and updates to resctrl test from Fenghua Yu and Reinette Chatre - fixes to Kselftest documentation, framework - minor spelling correction in timers test -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmCIls8ACgkQCwJExA0N QxzrDg/9E2+KrNsqT/bVZIDZgPsLOPtkIaNd+94wsGKgHUSekoHRYKcmeRZLLA3x m6s4Jc8o84rztRgjjttWCOQD3ICsFC5Dp4eu2f8YowlPqPRn0MMJEUwQAPhxnFq0 44KQ2v7bJhXYZRwhZXcv1Gu1o3o6cx59X9pLFo/Yf/OeTHj7ulegWtjCvBcS2uuT bhI0YbiCKDE4gIXYLPWKD96JjLRVo5zYnMIRqDJrgf7xSr+xoKmsZKSgkt6ca+My KSYtkaXDEB1DFNoovDQyhmAwImeqWgEKPMZIblLyfoUJNRyBQg9flRvguBzgR3TM J1lvavNZSC7qgx9xQI4DjsHtpn9y9C5/k9vXauhVtdMpMGY6zrz2zN5/xOohXjzN vlonhp6G/wkfxuo0Dcr++Oqlw5wWt55hxFJm84rIQ/2IYUfRBKWV5c2mUKRUJzrr pT3fcIpN1WTEBaxvC4/aL5oLvF/RSArSKs7StX1uzkedy7IwPsiCJa5OgT2iNSbH tpDS9KNOiLIkwpr8dBF9O9WRBo8ZtUoB1OPqQWuc0PMa0RDT4i/oTwe0ulu5rWma 5G3yQIilTsRAxpFWihklBiV0pB9bT8O/d8kMlagj/znl8GmKyiXEMDwrCPfVmp16 zNMskcarXWgL+BdhnSX5j53tLL6MEAWS1RzweR62U20eyPJF34Y= =0VC+ -----END PGP SIGNATURE----- Merge tag 'linux-kselftest-next-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest updates from Shuah Khan: - fixes and updates to resctrl test from Fenghua Yu and Reinette Chatre - fixes to Kselftest documentation, framework - minor spelling correction in timers test * tag 'linux-kselftest-next-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (25 commits) selftests/resctrl: Change a few printed messages Documentation: kselftest: fix path to test module files selftests/resctrl: Create .gitignore to include resctrl_tests selftests/resctrl: Fix checking for < 0 for unsigned values selftests/resctrl: Fix incorrect parsing of iMC counters selftests/resctrl: Fix unmount resctrl FS selftests/resctrl: Skip the test if requested resctrl feature is not supported selftests/resctrl: Modularize resctrl test suite main() function selftests/resctrl: Don't hard code value of "no_of_bits" variable selftests/resctrl: Fix MBA/MBM results reporting format selftests/resctrl: Use resctrl/info for feature detection selftests/resctrl: Check for resctrl mount point only if resctrl FS is supported selftests/resctrl: Add config dependencies selftests/resctrl: Fix a printed message selftests/resctrl: Share show_cache_info() by CAT and CMT tests selftests/resctrl: Call kselftest APIs to log test results selftests/resctrl: Rename CQM test as CMT test selftests/resctrl: Fix missing options "-n" and "-p" selftests/resctrl: Ensure sibling CPU is not same as original CPU selftests/resctrl: Clean up resctrl features check ... |
||
Linus Torvalds
|
c6536676c7 |
- turn the stack canary into a normal __percpu variable on 32-bit which
gets rid of the LAZY_GS stuff and a lot of code. - Add an insn_decode() API which all users of the instruction decoder should preferrably use. Its goal is to keep the details of the instruction decoder away from its users and simplify and streamline how one decodes insns in the kernel. Convert its users to it. - kprobes improvements and fixes - Set the maximum DIE per package variable on Hygon - Rip out the dynamic NOP selection and simplify all the machinery around selecting NOPs. Use the simplified NOPs in objtool now too. - Add Xeon Sapphire Rapids to list of CPUs that support PPIN - Simplify the retpolines by folding the entire thing into an alternative now that objtool can handle alternatives with stack ops. Then, have objtool rewrite the call to the retpoline with the alternative which then will get patched at boot time. - Document Intel uarch per models in intel-family.h - Make Sub-NUMA Clustering topology the default and Cluster-on-Die the exception on Intel. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmCHyJQACgkQEsHwGGHe VUpjiRAAwPZdwwp08ypZuMHR4EhLNru6gYhbAoALGgtYnQjLtn5onQhIeieK+R4L cmZpxHT9OFp5dXHk4kwygaQBsD4pPOiIpm60kye1dN3cSbOORRdkwEoQMpKMZ+5Y kvVsmn7lrwRbp600KdE4G6L5+N6gEgr0r6fMFWWGK3mgVAyCzPexVHgydcp131ch iYMo6/pPDcNkcV/hboVKgx7GISdQ7L356L1MAIW/Sxtw6uD/X4qGYW+kV2OQg9+t nQDaAo7a8Jqlop5W5TQUdMLKQZ1xK8SFOSX/nTS15DZIOBQOGgXR7Xjywn1chBH/ PHLwM5s4XF6NT5VlIA8tXNZjWIZTiBdldr1kJAmdDYacrtZVs2LWSOC0ilXsd08Z EWtvcpHfHEqcuYJlcdALuXY8xDWqf6Q2F7BeadEBAxwnnBg+pAEoLXI/1UwWcmsj wpaZTCorhJpYo2pxXckVdHz2z0LldDCNOXOjjaWU8tyaOBKEK6MgAaYU7e0yyENv mVc9n5+WuvXuivC6EdZ94Pcr/KQsd09ezpJYcVfMDGv58YZrb6XIEELAJIBTu2/B Ua8QApgRgetx+1FKb8X6eGjPl0p40qjD381TADb4rgETPb1AgKaQflmrSTIik+7p O+Eo/4x/GdIi9jFk3K+j4mIznRbUX0cheTJgXoiI4zXML9Jv94w= =bm4S -----END PGP SIGNATURE----- Merge tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 updates from Borislav Petkov: - Turn the stack canary into a normal __percpu variable on 32-bit which gets rid of the LAZY_GS stuff and a lot of code. - Add an insn_decode() API which all users of the instruction decoder should preferrably use. Its goal is to keep the details of the instruction decoder away from its users and simplify and streamline how one decodes insns in the kernel. Convert its users to it. - kprobes improvements and fixes - Set the maximum DIE per package variable on Hygon - Rip out the dynamic NOP selection and simplify all the machinery around selecting NOPs. Use the simplified NOPs in objtool now too. - Add Xeon Sapphire Rapids to list of CPUs that support PPIN - Simplify the retpolines by folding the entire thing into an alternative now that objtool can handle alternatives with stack ops. Then, have objtool rewrite the call to the retpoline with the alternative which then will get patched at boot time. - Document Intel uarch per models in intel-family.h - Make Sub-NUMA Clustering topology the default and Cluster-on-Die the exception on Intel. * tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) x86, sched: Treat Intel SNC topology as default, COD as exception x86/cpu: Comment Skylake server stepping too x86/cpu: Resort and comment Intel models objtool/x86: Rewrite retpoline thunk calls objtool: Skip magical retpoline .altinstr_replacement objtool: Cache instruction relocs objtool: Keep track of retpoline call sites objtool: Add elf_create_undef_symbol() objtool: Extract elf_symbol_add() objtool: Extract elf_strtab_concat() objtool: Create reloc sections implicitly objtool: Add elf_create_reloc() helper objtool: Rework the elf_rebuild_reloc_section() logic objtool: Fix static_call list generation objtool: Handle per arch retpoline naming objtool: Correctly handle retpoline thunk calls x86/retpoline: Simplify retpolines x86/alternatives: Optimize optimize_nops() x86: Add insn_decode_kernel() x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration ... |
||
Pedro Tammela
|
3733bfbbdd |
bpf, selftests: Update array map tests for per-cpu batched ops
Follows the same logic as the hashtable tests. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210424214510.806627-3-pctammela@mojatatu.com |