Commit Graph

25605 Commits

Author SHA1 Message Date
Axel Rasmussen
f0fa943309 userfaultfd/selftests: add test exercising minor fault handling
Fix a dormant bug in userfaultfd_events_test(), where we did `return
faulting_process(0)` instead of `exit(faulting_process(0))`.  This
caused the forked process to keep running, trying to execute any further
test cases after the events test in parallel with the "real" process.

Add a simple test case which exercises minor faults.  In short, it does
the following:

1. "Sets up" an area (area_dst) and a second shared mapping to the same
   underlying pages (area_dst_alias).

2. Register one of these areas with userfaultfd, in minor fault mode.

3. Start a second thread to handle any minor faults.

4. Populate the underlying pages with the non-UFFD-registered side of
   the mapping. Basically, memset() each page with some arbitrary
   contents.

5. Then, using the UFFD-registered mapping, read all of the page
   contents, asserting that the contents match expectations (we expect
   the minor fault handling thread can modify the page contents before
   resolving the fault).

The minor fault handling thread, upon receiving an event, flips all the
bits (~) in that page, just to prove that it can modify it in some
arbitrary way.  Then it issues a UFFDIO_CONTINUE ioctl, to setup the
mapping and resolve the fault.  The reading thread should wake up and
see this modification.

Currently the minor fault test is only enabled in hugetlb_shared mode,
as this is the only configuration the kernel feature supports.

Link: https://lkml.kernel.org/r/20210301222728.176417-7-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Adam Ruprecht <ruprecht@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Cannon Matthews <cannonmatthews@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michal Koutn" <mkoutny@suse.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shawn Anastasio <shawn@anastas.io>
Cc: Steven Price <steven.price@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-05 11:27:23 -07:00
Zi Yan
fbe37501b2 mm: huge_memory: debugfs for file-backed THP split
Further extend <debugfs>/split_huge_pages to accept
"<path>,<pgoff_start>,<pgoff_end>" for file-backed THP split tests since
tmpfs may have file backed by THP that mapped nowhere.

Update selftest program to test file-backed THP split too.

Link: https://lkml.kernel.org/r/20210331235309.332292-2-zi.yan@sent.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mika Penttila <mika.penttila@nextfour.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-05 11:27:21 -07:00
Zi Yan
fa6c02315f mm: huge_memory: a new debugfs interface for splitting THP tests
We did not have a direct user interface of splitting the compound page
backing a THP and there is no need unless we want to expose the THP
implementation details to users.  Make <debugfs>/split_huge_pages accept a
new command to do that.

By writing "<pid>,<vaddr_start>,<vaddr_end>" to
<debugfs>/split_huge_pages, THPs within the given virtual address range
from the process with the given pid are split. It is used to test
split_huge_page function. In addition, a selftest program is added to
tools/testing/selftests/vm to utilize the interface by splitting
PMD THPs and PTE-mapped THPs.

This does not change the old behavior, i.e., writing 1 to the interface
to split all THPs in the system.

Link: https://lkml.kernel.org/r/20210331235309.332292-1-zi.yan@sent.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mika Penttila <mika.penttila@nextfour.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-05 11:27:21 -07:00
Len Brown
3c070b2abf tools/power turbostat: version 2021.05.04
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 20:01:09 -04:00
Len Brown
b60c573dc2 tools/power turbostat: Support "turbostat --hide idle"
As idle, in particular, can have many columns on some machines...
Make it easy to ignore them all at once.

Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 20:01:09 -04:00
Len Brown
38c6663a68 tools/power turbostat: elevate priority of interval mode
This makes interval mode less likely to see delayed
results on a heavily loaded system.

Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 20:01:09 -04:00
Len Brown
1b439f01b6 tools/power turbostat: formatting
Spring is here...
run a long overdue Lendent on turbostat.c

no functional change

Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 20:01:09 -04:00
Zhang Rui
55279aef75 tools/power turbostat: rename tcc variables
There are two TCC activation temeprature.
One is the default TCC activation temperature, also known as TJ_MAX.
Another one is the effective TCC activation temperature, which is the
subtraction of default TCC activation temperature and TCC offset.

The name of variable tcc_activation_temp might be misleading here.
Thus rename tcc_activation_temp to tj_max, and use tcc_default and
tcc_offset to calculate the effective TCC activation temperature.

No functional change in this patch.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 20:01:09 -04:00
Zhang Rui
0b9a0b9be9 tools/power turbostat: add TCC Offset support
The length of TCC Offset bits varies on different platforms.
Decode TCC Offset bits only for the platforms that we have verified.
For the others, only show default TCC activation temperature.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 20:01:08 -04:00
Zhang Rui
e9d3092f6d tools/power turbostat: save original CPU model
CPU model may get changed in intel_model_duplicates() for code reuse.
But there are still some cases we need the original CPU model to handle
minor differences between generations.

Thus save the original CPU model.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 20:01:08 -04:00
Zhang Rui
7ab5ff4937 tools/power turbostat: Fix Core C6 residency on Atom CPUs
For Atom CPUs that have core cstate deeper than C6,
MSR_CORE_C6_RESIDENCY actually returns the residency for both CC6 and
deeper Core cstates.
Thus, the real Core C6 residency should be the subtraction of
MSR_CORE_C6_RESIDENCY return value and MSR_CORE_C6_RESIDENCY return value.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 20:01:00 -04:00
Chen Yu
aeb01e6d71 tools/power turbostat: Print the C-state Pre-wake settings
C-state pre-wake setting[1] is an optimization for some Intel CPUs to
be woken up from deep C-states in order to reduce latency. According to
the spec, the BIT30 is the C-state Pre-wake Disable. Expose this setting
accordingly.
Sample output from turbostat:
...
cpu51: MSR_IA32_POWER_CTL: 0x1a00a40059 (C1E auto-promotion: DISabled)
C-state Pre-wake: ENabled
cpu51: MSR_TURBO_RATIO_LIMIT: 0x2021212121212224
...

[1] https://intel.github.io/wult/#c-state-pre-wake

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 19:10:32 -04:00
Chen Yu
8c69da2930 tools/power turbostat: Enable tsc_tweak for Elkhart Lake and Jasper Lake
It was found that on Elkhart Lake the TSC frequency is driven by
a separate crystal-clock domain, which is different from the
BCLK domain which includes mperf. This has result in small different
speed thus inconsistence between TSC and the mperf, which caused the
Busy% to be higher than 100%. On this platform it seems that the mperf
runs faster than tsc when the CPU is 100% utilized:
delta tsc(18815473183) < delta mperf(18958403680) for 10 seconds.

To align TSC with mperf, leverage the tsc_tweak mechanism introduced for
cores newer than Skylake, so that TSC and mperf would be calculated in
the same domain.

Reported-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 19:09:00 -04:00
Randy Dunlap
1e3ec5cdfb tools/power turbostat: unmark non-kernel-doc comment
Do not mark a comment as kernel-doc notation when it is not
meant to be in kernel-doc notation.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 18:44:32 -04:00
Chen Yu
25368d7cef tools/power/turbostat: Remove Package C6 Retention on Ice Lake Server
Currently the turbostat treats ICX the same way as SKX and shares the
code among them. But one difference is that ICX does not support Package
C6 Retention, unlike SKX and CLX.

So this patch:

1. Splitting SKX and ICX in turbostat.
2. Removing Package C6 Rentention for ICX.

And after this split, it would be easier to cutomize Ice Lake Server
in turbostat in the future.

Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 18:35:43 -04:00
Calvin Walton
13a779de41 tools/power turbostat: Fix offset overflow issue in index converting
The idx_to_offset() function returns type int (32-bit signed), but
MSR_PKG_ENERGY_STAT is u32 and would be interpreted as a negative number.
The end result is that it hits the if (offset < 0) check in update_msr_sum()
which prevents the timer callback from updating the stat in the background when
long durations are used. The similar issue exists in offset_to_idx() and
update_msr_sum(). Fix this issue by converting the 'int' to 'off_t' accordingly.

Fixes: 9972d5d84d ("tools/power turbostat: Enable accumulate RAPL display")
Signed-off-by: Calvin Walton <calvin.walton@kepstin.ca>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 18:33:41 -04:00
Bas Nieuwenhuizen
301b1d3a91 tools/power/turbostat: Fix turbostat for AMD Zen CPUs
It was reported that on Zen+ system turbostat started exiting,
which was tracked down to the MSR_PKG_ENERGY_STAT read failing because
offset_to_idx wasn't returning a non-negative index.

This patch combined the modification from Bingsong Si and
Bas Nieuwenhuizen and addd the MSR to the index system as alternative for
MSR_PKG_ENERGY_STATUS.

Fixes: 9972d5d84d ("tools/power turbostat: Enable accumulate RAPL display")
Reported-by: youling257 <youling257@gmail.com>
Tested-by: youling257 <youling257@gmail.com>
Tested-by: Kurt Garloff <kurt@garloff.de>
Tested-by: Bingsong Si <owen.si@ucloud.cn>
Tested-by: Artem S. Tashkinov <aros@gmx.com>
Co-developed-by: Bingsong Si <owen.si@ucloud.cn>
Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 18:31:35 -04:00
Len Brown
ba58ecde5e tools/power turbostat: update version number 2021-05-04 18:23:15 -04:00
Zhang Rui
abdc75ab53 tools/power turbostat: Fix DRAM Energy Unit on SKX
SKX uses fixed DRAM Energy Unit, just like HSX and BDX.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 18:23:14 -04:00
Len Brown
b2b94be787 Revert "tools/power turbostat: adjust for temperature offset"
This reverts commit 6ff7cb371c.

Apparently the TCC offset should not be used to adjust what temperature
we show the user after all.

(on most systems, TCC offset is 0, FWIW)

Fixes: 6ff7cb371c

Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 18:23:14 -04:00
Chen Yu
6c5c656006 tools/power turbostat: Support Ice Lake D
Ice Lake D is low-end server version of Ice Lake X, reuse
the code accordingly.

Tested-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 18:23:14 -04:00
Chen Yu
5683460b85 tools/power turbostat: Support Alder Lake Mobile
Share the code between Alder Lake Mobile and Alder Lake Desktop.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 18:23:14 -04:00
Len Brown
ed0757b83a tools/power turbostat: print microcode patch level
(also available via "grep microcode /proc/cpuinfo")

Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 18:23:14 -04:00
Len Brown
2af4f9b859 tools/power turbostat: add built-in-counter for IPC -- Instructions per Cycle
Use linux-perf to access the hardware instructions-retired counter.
This is necessary because the counter is not enabled by default,
and also the counter is prone to roll-over -- both of which
perf manages.

It is not necessary to use perf for the cycle counter,
because turbostat already needs to collect delta-aperf
to calcuate frequency.

Signed-off-by: Len Brown <len.brown@intel.com>
2021-05-04 18:23:14 -04:00
Linus Torvalds
954b720705 dma-mapping updates for Linux 5.13:
- add a new dma_alloc_noncontiguous API (me, Ricardo Ribalda)
  - fix a copyright noice (Hao Fang)
  - add an unlikely annotation to dma_mapping_error (Heiner Kallweit)
  - remove a pointless empty line (Wang Qing)
  - add support for multi-pages map/unmap bencharking (Xiang Chen)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmCQ/foLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYMKLRAAyNxmtSSgU9FlzcCWIwLlOj5IMDR0bWkxMYr0oQOo
 e/e3ZLv5N6bF2tkLRE8YLy14LlmGYDdpV/VsuObsRMEJXUYs+wbWJOLuNFRZnkqH
 WjrXgCz0ijhynmxwkQCc7lsTZbMvwq0f2Jf+v0OWZ4D2v/raem3ESs2NW/Ld8lAe
 uPYK0mXywCPe0eqnRbJL53hah2y0PP+NKATPj18BtI32evPQcSL4g0r//VxPXb7G
 q4cT6D51xqo280h+fW2DqODJeQQsWb6tYrWDdg7y/JkGZU6aOKTCyCLrZ4Dz+MqA
 7wb0PII3ldZxb0Z4FP9Ij/Tt8r+FuWS6uT8epwzFFuURrVj3M7Uuw2bKHJuFmHI4
 R6wJGy6I5tIbgbg7EtqLr6LhwGtbdhhd5ehlXDA+x4VcXdQ3+9aE3IvGS0YGCon8
 lDQH0EyC3FIAW6BJbBf1NySM3jtySSxrn2GMD3d70gOiS6O0Xiktj6jBY68cXYzS
 zCCSSY7KCZm65KbcDe71dv/RIgZePAIBGQJPPmSsRcKgnBXB+AVuUl0AoZu3yGwJ
 hY49eP55XX9Vm746GBwdwciuVDaTDScDzDtrw/GN3FQMBKrwtJs9MXTzbJHsdWVt
 G4p5qXa5VwH2xl1GsuRLAItKEglXzsX5GDZ0IvwjcbWlZokVhBlMOCv1ZfJRVkwc
 0c0=
 =DJoy
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.13' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - add a new dma_alloc_noncontiguous API (me, Ricardo Ribalda)

 - fix a copyright notice (Hao Fang)

 - add an unlikely annotation to dma_mapping_error (Heiner Kallweit)

 - remove a pointless empty line (Wang Qing)

 - add support for multi-pages map/unmap bencharking (Xiang Chen)

* tag 'dma-mapping-5.13' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: add unlikely hint to error path in dma_mapping_error
  dma-mapping: benchmark: Add support for multi-pages map/unmap
  dma-mapping: benchmark: use the correct HiSilicon copyright
  dma-mapping: remove a pointless empty line in dma_alloc_coherent
  media: uvcvideo: Use dma_alloc_noncontiguous API
  dma-iommu: implement ->alloc_noncontiguous
  dma-iommu: refactor iommu_dma_alloc_remap
  dma-mapping: add a dma_alloc_noncontiguous API
  dma-mapping: refactor dma_{alloc,free}_pages
  dma-mapping: add a dma_mmap_pages helper
2021-05-04 10:52:09 -07:00
John 'Warthog9' Hawley (VMware)
6a0f365295 ktest: Re-arrange the code blocks for better discoverability
Perl, as with most scripting languages, is fairly flexible in how /
where you can define things, and it will (for the most part) do what you
would expect it to do.  This however can lead to situations, like with
ktest, where things get muddled over time.

This pushes the variable definitions back up to the top, followed by
functions, with the main script executables down at the bottom, INSTEAD
of being somewhat mish-mashed together in certain places.  This mostly
has the advantage of making it more obvious where things are initially
defined, what functions are there, and ACTUALLY where the main script
starts executing, and should make this a little more approachable.

Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-05-03 18:57:03 -04:00
John 'Warthog9' Hawley (VMware)
c043ccbfc6 ktest: Further consistency cleanups
This cleans up some additional whitespace pieces that to be more
consistent, as well as moving a curly brace around, and some 'or'
statements to match the rest of the file (usually or goes at the
end of the line vs. at the beginning)

Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-05-03 18:57:03 -04:00
John 'Warthog9' Hawley (VMware)
12d4cddda2 ktest: Fixing indentation to match expected pattern
This is a followup to "ktest: Adding editor hints to improve
consistency" to actually adjust the existing indentation to match
the, now, expected pattern (first column 4 spaces, 2nd tab, 3rd
tab + 4 spaces, etc).  This should, at least help, keep things
consistent going forward now.

Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-05-03 18:57:03 -04:00
John 'Warthog9' Hawley (VMware)
becdd17b5a ktest: Adding editor hints to improve consistency
Emacs and Vi(m) have different styles of dealing with perl syntax
which can lead to slightly inconsistent indentation, and makes the
code slightly harder to read.  Emacs assumes a more perl recommended
standard of 4 spaces (1 column) or tab (two column) indentation.

Vi(m) tends to favor just normal spaces or tabs depending on what
was being used.

This gives the basic hinting to Emacs and Vim to do what is
expected to be basically consistent.

Emacs:
	- Explicitly flip into perl mode, cperl would require
	  more adjustments

Vi(m):
	- Set softtabs=4 which will flip it over to doing
	  indentation the way you would expect from Emacs

Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-05-03 18:57:03 -04:00
John 'Warthog9' Hawley (VMware)
2676eb4bfc ktest: Add example config for using VMware VMs
This duplicates the KVM/Qemu config with specific notes for how
to use it with VMware VMs on Workstation, Player, or Fusion.
The main thing to be aware of is how the serial port is exposed
which is a unix pipe, and will need something like ncat to get
into ktest's monitoring

Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-05-03 18:57:02 -04:00
John 'Warthog9' Hawley (VMware)
da2e56634b ktest: Minor cleanup with uninitialized variable $build_options
Signed-off-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-05-03 18:57:02 -04:00
Linus Torvalds
9b1f61d5d7 tracing updates for 5.13
New feature:
 
  The "func-no-repeats" option in tracefs/options directory. When set
  the function tracer will detect if the current function being traced
  is the same as the previous one, and instead of recording it, it will
  keep track of the number of times that the function is repeated in a row.
  And when another function is recorded, it will write a new event that
  shows the function that repeated, the number of times it repeated and
  the time stamp of when the last repeated function occurred.
 
 Enhancements:
 
  In order to implement the above "func-no-repeats" option, the ring
  buffer timestamp can now give the accurate timestamp of the event
  as it is being recorded, instead of having to record an absolute
  timestamp for all events. This helps the histogram code which no longer
  needs to waste ring buffer space.
 
  New validation logic to make sure all trace events that access
  dereferenced pointers do so in a safe way, and will warn otherwise.
 
 Fixes:
 
  No longer limit the PIDs of tasks that are recorded for "saved_cmdlines"
  to PID_MAX_DEFAULT (32768), as systemd now allows for a much larger
  range. This caused the mapping of PIDs to the task names to be dropped
  for all tasks with a PID greater than 32768.
 
  Change trace_clock_global() to never block. This caused a deadlock.
 
 Clean ups:
 
  Typos, prototype fixes, and removing of duplicate or unused code.
 
  Better management of ftrace_page allocations.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYI/1vBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qiL0AP9EemIC5TDh2oihqLRNeUjdTu0ryEoM
 HRFqxozSF985twD/bfkt86KQC8rLHwxTbxQZ863bmdaC6cMGFhWiF+H/MAs=
 =psYt
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "New feature:

   - A new "func-no-repeats" option in tracefs/options directory.

     When set the function tracer will detect if the current function
     being traced is the same as the previous one, and instead of
     recording it, it will keep track of the number of times that the
     function is repeated in a row. And when another function is
     recorded, it will write a new event that shows the function that
     repeated, the number of times it repeated and the time stamp of
     when the last repeated function occurred.

  Enhancements:

   - In order to implement the above "func-no-repeats" option, the ring
     buffer timestamp can now give the accurate timestamp of the event
     as it is being recorded, instead of having to record an absolute
     timestamp for all events. This helps the histogram code which no
     longer needs to waste ring buffer space.

   - New validation logic to make sure all trace events that access
     dereferenced pointers do so in a safe way, and will warn otherwise.

  Fixes:

   - No longer limit the PIDs of tasks that are recorded for
     "saved_cmdlines" to PID_MAX_DEFAULT (32768), as systemd now allows
     for a much larger range. This caused the mapping of PIDs to the
     task names to be dropped for all tasks with a PID greater than
     32768.

   - Change trace_clock_global() to never block. This caused a deadlock.

  Clean ups:

   - Typos, prototype fixes, and removing of duplicate or unused code.

   - Better management of ftrace_page allocations"

* tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (32 commits)
  tracing: Restructure trace_clock_global() to never block
  tracing: Map all PIDs to command lines
  ftrace: Reuse the output of the function tracer for func_repeats
  tracing: Add "func_no_repeats" option for function tracing
  tracing: Unify the logic for function tracing options
  tracing: Add method for recording "func_repeats" events
  tracing: Add "last_func_repeats" to struct trace_array
  tracing: Define new ftrace event "func_repeats"
  tracing: Define static void trace_print_time()
  ftrace: Simplify the calculation of page number for ftrace_page->records some more
  ftrace: Store the order of pages allocated in ftrace_page
  tracing: Remove unused argument from "ring_buffer_time_stamp()
  tracing: Remove duplicate struct declaration in trace_events.h
  tracing: Update create_system_filter() kernel-doc comment
  tracing: A minor cleanup for create_system_filter()
  kernel: trace: Mundane typo fixes in the file trace_events_filter.c
  tracing: Fix various typos in comments
  scripts/recordmcount.pl: Make vim and emacs indent the same
  scripts/recordmcount.pl: Make indent spacing consistent
  tracing: Add a verifier to check string pointers for trace events
  ...
2021-05-03 11:19:54 -07:00
Brendan Jackman
2a30f94406 libbpf: Fix signed overflow in ringbuf_process_ring
One of our benchmarks running in (Google-internal) CI pushes data
through the ringbuf faster htan than userspace is able to consume
it. In this case it seems we're actually able to get >INT_MAX entries
in a single ring_buffer__consume() call. ASAN detected that cnt
overflows in this case.

Fix by using 64-bit counter internally and then capping the result to
INT_MAX before converting to the int return type. Do the same for
the ring_buffer__poll().

Fixes: bf99c936f9 (libbpf: Add BPF ring buffer support)
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210429130510.1621665-1-jackmanb@google.com
2021-05-03 09:54:12 -07:00
Linus Torvalds
17ae69aba8 Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com>
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAmCInP4ACgkQrZhLv9lQ
 BTza0g//dTeb9woC9H7qlEhK4l9yk62lTss60Q8X7m7ZSNfdL4tiEbi64SgK+iOW
 OOegbrOEb8Kzh4KJJYmVlVZ5YUWyH4szgmee1wnylBdsWiWaPLPF3Cflz77apy6T
 TiiBsJd7rRE29FKheaMt34B41BMh8QHESN+DzjzJWsFoi/uNxjgSs2W16XuSupKu
 bpRmB1pYNXMlrkzz7taL05jndZYE5arVriqlxgAsuLOFOp/ER7zecrjImdCM/4kL
 W6ej0R1fz2Geh6CsLBJVE+bKWSQ82q5a4xZEkSYuQHXgZV5eywE5UKu8ssQcRgQA
 VmGUY5k73rfY9Ofupf2gCaf/JSJNXKO/8Xjg0zAdklKtmgFjtna5Tyg9I90j7zn+
 5swSpKuRpilN8MQH+6GWAnfqQlNoviTOpFeq3LwBtNVVOh08cOg6lko/bmebBC+R
 TeQPACKS0Q0gCDPm9RYoU1pMUuYgfOwVfVRZK1prgi2Co7ZBUMOvYbNoKYoPIydr
 ENBYljlU1OYwbzgR2nE+24fvhU8xdNOVG1xXYPAEHShu+p7dLIWRLhl8UCtRQpSR
 1ofeVaJjgjrp29O+1OIQjB2kwCaRdfv/Gq1mztE/VlMU/r++E62OEzcH0aS+mnrg
 yzfyUdI8IFv1q6FGT9yNSifWUWxQPmOKuC8kXsKYfqfJsFwKmHM=
 =uCN4
 -----END PGP SIGNATURE-----

Merge tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull Landlock LSM from James Morris:
 "Add Landlock, a new LSM from Mickaël Salaün.

  Briefly, Landlock provides for unprivileged application sandboxing.

  From Mickaël's cover letter:
    "The goal of Landlock is to enable to restrict ambient rights (e.g.
     global filesystem access) for a set of processes. Because Landlock
     is a stackable LSM [1], it makes possible to create safe security
     sandboxes as new security layers in addition to the existing
     system-wide access-controls. This kind of sandbox is expected to
     help mitigate the security impact of bugs or unexpected/malicious
     behaviors in user-space applications. Landlock empowers any
     process, including unprivileged ones, to securely restrict
     themselves.

     Landlock is inspired by seccomp-bpf but instead of filtering
     syscalls and their raw arguments, a Landlock rule can restrict the
     use of kernel objects like file hierarchies, according to the
     kernel semantic. Landlock also takes inspiration from other OS
     sandbox mechanisms: XNU Sandbox, FreeBSD Capsicum or OpenBSD
     Pledge/Unveil.

     In this current form, Landlock misses some access-control features.
     This enables to minimize this patch series and ease review. This
     series still addresses multiple use cases, especially with the
     combined use of seccomp-bpf: applications with built-in sandboxing,
     init systems, security sandbox tools and security-oriented APIs [2]"

  The cover letter and v34 posting is here:

      https://lore.kernel.org/linux-security-module/20210422154123.13086-1-mic@digikod.net/

  See also:

      https://landlock.io/

  This code has had extensive design discussion and review over several
  years"

Link: https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/ [1]
Link: https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.net/ [2]

* tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  landlock: Enable user space to infer supported features
  landlock: Add user and kernel documentation
  samples/landlock: Add a sandbox manager example
  selftests/landlock: Add user space tests
  landlock: Add syscall implementations
  arch: Wire up Landlock syscalls
  fs,security: Add sb_delete hook
  landlock: Support filesystem access-control
  LSM: Infrastructure management of the superblock
  landlock: Add ptrace restrictions
  landlock: Set up the security framework and manage credentials
  landlock: Add ruleset and domain management
  landlock: Add object management
2021-05-01 18:50:44 -07:00
Linus Torvalds
10a3efd0fe perf tools changes for v5.13: 1st batch
perf stat:
 
 - Add support for hybrid PMUs to support systems such as Intel Alderlake
   and its BIG/little core/atom cpus.
 
 - Introduce 'bperf' to share hardware PMCs with BPF.
 
 - New --iostat option to collect and present IO stats on Intel hardware.
 
   This functionality is based on recently introduced sysfs attributes
   for Intel® Xeon® Scalable processor family (code name Skylake-SP):
 
     commit bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
 
   It is intended to provide four I/O performance metrics in MB per each
   PCIe root port:
 
    - Inbound Read: I/O devices below root port read from the host memory
    - Inbound Write: I/O devices below root port write to the host memory
    - Outbound Read: CPU reads from I/O devices below root port
    - Outbound Write: CPU writes to I/O devices below root port
 
 - Align CSV output for summary.
 
 - Clarify --null use cases: Assess raw overhead of 'perf stat' or
   measure just wall clock time.
 
 - Improve readability of shadow stats.
 
 perf record:
 
 - Change the COMM when starting tha workload so that --exclude-perf
   doesn't seem to be not honoured.
 
 - Improve 'Workload failed' message printing events + what was exec'ed.
 
 - Fix cross-arch support for TIME_CONV.
 
 perf report:
 
 - Add option to disable raw event ordering.
 
 - Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'.
 
 - Improvements to --stat output, that shows information about PERF_RECORD_ events.
 
 - Preserve identifier id in OCaml demangler.
 
 perf annotate:
 
 - Show full source location with 'l' hotkey in the 'perf annotate' TUI.
 
 - Add line number like in TUI and source location at EOL to the 'perf annotate' --stdio mode.
 
 - Add --demangle and --demangle-kernel to 'perf annotate'.
 
 - Allow configuring annotate.demangle{,_kernel} in 'perf config'.
 
 - Fix sample events lost in stdio mode.
 
 perf data:
 
 - Allow converting a perf.data file to JSON.
 
 libperf:
 
 - Add support for user space counter access.
 
 - Update topdown documentation to permit rdpmc calls.
 
 perf test:
 
 - Add 'perf test' for 'perf stat' CSV output.
 
 - Add 'perf test' entries to test the hybrid PMU support.
 
 - Cleanup 'perf test daemon' if its 'perf test' is interrupted.
 
 - Handle metric reuse in pmu-events parsing 'perf test' entry.
 
 - Add test for PE executable support.
 
 - Add timeout for wait for daemon start in its 'perf test' entries.
 
 Build:
 
 - Enable libtraceevent dynamic linking.
 
 - Improve feature detection output.
 
 - Fix caching of feature checks caching.
 
 - First round of updates for tools copies of kernel headers.
 
 - Enable warnings when compiling BPF programs.
 
 Vendor specific events:
 
 Intel:
 
 - Add missing skylake & icelake model numbers.
 
 arm64:
 
 - Add Hisi hip08 L1, L2 and L3 metrics.
 
 - Add Fujitsu A64FX PMU events.
 
 PowerPC:
 
 - Initial JSON/events list for power10 platform.
 
 - Remove unsupported power9 metrics.
 
 AMD:
 
 - Add Zen3 events.
 
 - Fix broken L2 Cache Hits from L2 HWPF metric.
 
 - Use lowercases for all the eventcodes and umasks.
 
 Hardware tracing:
 
 arm64:
 
 - Update CoreSight ETM metadata format.
 
 - Fix bitmap for CS-ETM option.
 
 - Support PID tracing in config.
 
 - Detect pid in VMID for kernel running at EL2.
 
 Arch specific:
 
 MIPS:
 
 - Support MIPS unwinding and dwarf-regs.
 
 - Generate mips syscalls_n64.c syscall table.
 
 PowerPC:
 
 - Add support for PERF_SAMPLE_WEIGH_STRUCT on PowerPC.
 
 - Support pipeline stage cycles for powerpc.
 
 libbeauty:
 
 - Fix fsconfig generator.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYIshAwAKCRCyPKLppCJ+
 J8oWAP9c1POclDQ7AZDe5/t/InZYSQKJFIku1sE1SNCSOupy7wEAuPBtaN7wDaRj
 BFBibfUGd4MNzLPvMMHneIhSY3DgJwg=
 =FLLr
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool updates from Arnaldo Carvalho de Melo:
 "perf stat:

   - Add support for hybrid PMUs to support systems such as Intel
     Alderlake and its BIG/little core/atom cpus.

   - Introduce 'bperf' to share hardware PMCs with BPF.

   - New --iostat option to collect and present IO stats on Intel
     hardware.

     This functionality is based on recently introduced sysfs attributes
     for Intel® Xeon® Scalable processor family (code name Skylake-SP)
     in commit bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore
     unit to IIO PMON mapping")

     It is intended to provide four I/O performance metrics in MB per
     each PCIe root port:

       - Inbound Read: I/O devices below root port read from the host memory
       - Inbound Write: I/O devices below root port write to the host memory
       - Outbound Read: CPU reads from I/O devices below root port
       - Outbound Write: CPU writes to I/O devices below root port

   - Align CSV output for summary.

   - Clarify --null use cases: Assess raw overhead of 'perf stat' or
     measure just wall clock time.

   - Improve readability of shadow stats.

  perf record:

   - Change the COMM when starting tha workload so that --exclude-perf
     doesn't seem to be not honoured.

   - Improve 'Workload failed' message printing events + what was
     exec'ed.

   - Fix cross-arch support for TIME_CONV.

  perf report:

   - Add option to disable raw event ordering.

   - Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'.

   - Improvements to --stat output, that shows information about
     PERF_RECORD_ events.

   - Preserve identifier id in OCaml demangler.

  perf annotate:

   - Show full source location with 'l' hotkey in the 'perf annotate'
     TUI.

   - Add line number like in TUI and source location at EOL to the 'perf
     annotate' --stdio mode.

   - Add --demangle and --demangle-kernel to 'perf annotate'.

   - Allow configuring annotate.demangle{,_kernel} in 'perf config'.

   - Fix sample events lost in stdio mode.

  perf data:

   - Allow converting a perf.data file to JSON.

  libperf:

   - Add support for user space counter access.

   - Update topdown documentation to permit rdpmc calls.

  perf test:

   - Add 'perf test' for 'perf stat' CSV output.

   - Add 'perf test' entries to test the hybrid PMU support.

   - Cleanup 'perf test daemon' if its 'perf test' is interrupted.

   - Handle metric reuse in pmu-events parsing 'perf test' entry.

   - Add test for PE executable support.

   - Add timeout for wait for daemon start in its 'perf test' entries.

  Build:

   - Enable libtraceevent dynamic linking.

   - Improve feature detection output.

   - Fix caching of feature checks caching.

   - First round of updates for tools copies of kernel headers.

   - Enable warnings when compiling BPF programs.

  Vendor specific events:

   - Intel:
      - Add missing skylake & icelake model numbers.

   - arm64:
      - Add Hisi hip08 L1, L2 and L3 metrics.
      - Add Fujitsu A64FX PMU events.

   - PowerPC:
      - Initial JSON/events list for power10 platform.
      - Remove unsupported power9 metrics.

   - AMD:
      - Add Zen3 events.
      - Fix broken L2 Cache Hits from L2 HWPF metric.
      - Use lowercases for all the eventcodes and umasks.

  Hardware tracing:

   - arm64:
      - Update CoreSight ETM metadata format.
      - Fix bitmap for CS-ETM option.
      - Support PID tracing in config.
      - Detect pid in VMID for kernel running at EL2.

  Arch specific updates:

   - MIPS:
      - Support MIPS unwinding and dwarf-regs.
      - Generate mips syscalls_n64.c syscall table.

   - PowerPC:
      - Add support for PERF_SAMPLE_WEIGH_STRUCT on PowerPC.
      - Support pipeline stage cycles for powerpc.

  libbeauty:

   - Fix fsconfig generator"

* tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (132 commits)
  perf build: Defer printing detected features to the end of all feature checks
  tools build: Allow deferring printing the results of feature detection
  perf build: Regenerate the FEATURE_DUMP file after extra feature checks
  perf session: Dump PERF_RECORD_TIME_CONV event
  perf session: Add swap operation for event TIME_CONV
  perf jit: Let convert_timestamp() to be backwards-compatible
  perf tools: Change fields type in perf_record_time_conv
  perf tools: Enable libtraceevent dynamic linking
  perf Documentation: Document intel-hybrid support
  perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid
  perf tests: Support 'Convert perf time to TSC' test for hybrid
  perf tests: Support 'Session topology' test for hybrid
  perf tests: Support 'Parse and process metrics' test for hybrid
  perf tests: Support 'Track with sched_switch' test for hybrid
  perf tests: Skip 'Setup struct perf_event_attr' test for hybrid
  perf tests: Add hybrid cases for 'Roundtrip evsel->name' test
  perf tests: Add hybrid cases for 'Parse event definition strings' test
  perf record: Uniquify hybrid event name
  perf stat: Warn group events from different hybrid PMU
  perf stat: Filter out unmatched aggregation for hybrid event
  ...
2021-05-01 12:22:38 -07:00
Linus Torvalds
152d32aa84 ARM:
- Stage-2 isolation for the host kernel when running in protected mode
 
 - Guest SVE support when running in nVHE mode
 
 - Force W^X hypervisor mappings in nVHE mode
 
 - ITS save/restore for guests using direct injection with GICv4.1
 
 - nVHE panics now produce readable backtraces
 
 - Guest support for PTP using the ptp_kvm driver
 
 - Performance improvements in the S2 fault handler
 
 x86:
 
 - Optimizations and cleanup of nested SVM code
 
 - AMD: Support for virtual SPEC_CTRL
 
 - Optimizations of the new MMU code: fast invalidation,
   zap under read lock, enable/disably dirty page logging under
   read lock
 
 - /dev/kvm API for AMD SEV live migration (guest API coming soon)
 
 - support SEV virtual machines sharing the same encryption context
 
 - support SGX in virtual machines
 
 - add a few more statistics
 
 - improved directed yield heuristics
 
 - Lots and lots of cleanups
 
 Generic:
 
 - Rework of MMU notifier interface, simplifying and optimizing
 the architecture-specific code
 
 - Some selftests improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmCJ13kUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM1HAgAqzPxEtiTPTFeFJV5cnPPJ3dFoFDK
 y/juZJUQ1AOtvuWzzwuf175ewkv9vfmtG6rVohpNSkUlJYeoc6tw7n8BTTzCVC1b
 c/4Dnrjeycr6cskYlzaPyV6MSgjSv5gfyj1LA5UEM16LDyekmaynosVWY5wJhju+
 Bnyid8l8Utgz+TLLYogfQJQECCrsU0Wm//n+8TWQgLf1uuiwshU5JJe7b43diJrY
 +2DX+8p9yWXCTz62sCeDWNahUv8AbXpMeJ8uqZPYcN1P0gSEUGu8xKmLOFf9kR7b
 M4U1Gyz8QQbjd2lqnwiWIkvRLX6gyGVbq2zH0QbhUe5gg3qGUX7JjrhdDQ==
 =AXUi
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "This is a large update by KVM standards, including AMD PSP (Platform
  Security Processor, aka "AMD Secure Technology") and ARM CoreSight
  (debug and trace) changes.

  ARM:

   - CoreSight: Add support for ETE and TRBE

   - Stage-2 isolation for the host kernel when running in protected
     mode

   - Guest SVE support when running in nVHE mode

   - Force W^X hypervisor mappings in nVHE mode

   - ITS save/restore for guests using direct injection with GICv4.1

   - nVHE panics now produce readable backtraces

   - Guest support for PTP using the ptp_kvm driver

   - Performance improvements in the S2 fault handler

  x86:

   - AMD PSP driver changes

   - Optimizations and cleanup of nested SVM code

   - AMD: Support for virtual SPEC_CTRL

   - Optimizations of the new MMU code: fast invalidation, zap under
     read lock, enable/disably dirty page logging under read lock

   - /dev/kvm API for AMD SEV live migration (guest API coming soon)

   - support SEV virtual machines sharing the same encryption context

   - support SGX in virtual machines

   - add a few more statistics

   - improved directed yield heuristics

   - Lots and lots of cleanups

  Generic:

   - Rework of MMU notifier interface, simplifying and optimizing the
     architecture-specific code

   - a handful of "Get rid of oprofile leftovers" patches

   - Some selftests improvements"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits)
  KVM: selftests: Speed up set_memory_region_test
  selftests: kvm: Fix the check of return value
  KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt()
  KVM: SVM: Skip SEV cache flush if no ASIDs have been used
  KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids()
  KVM: SVM: Drop redundant svm_sev_enabled() helper
  KVM: SVM: Move SEV VMCB tracking allocation to sev.c
  KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup()
  KVM: SVM: Unconditionally invoke sev_hardware_teardown()
  KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported)
  KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y
  KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables
  KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features
  KVM: SVM: Move SEV module params/variables to sev.c
  KVM: SVM: Disable SEV/SEV-ES if NPT is disabled
  KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails
  KVM: SVM: Zero out the VMCB array used to track SEV ASID association
  x86/sev: Drop redundant and potentially misleading 'sev_enabled'
  KVM: x86: Move reverse CPUID helpers to separate header file
  KVM: x86: Rename GPR accessors to make mode-aware variants the defaults
  ...
2021-05-01 10:14:08 -07:00
Masahiro Yamada
77a88274dc kbuild: replace LANG=C with LC_ALL=C
LANG gives a weak default to each LC_* in case it is not explicitly
defined. LC_ALL, if set, overrides all other LC_* variables.

  LANG  <  LC_CTYPE, LC_COLLATE, LC_MONETARY, LC_NUMERIC, ...  <  LC_ALL

This is why documentation such as [1] suggests to set LC_ALL in build
scripts to get the deterministic result.

LANG=C is not strong enough to override LC_* that may be set by end
users.

[1]: https://reproducible-builds.org/docs/locales/

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Matthias Maennich <maennich@google.com>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> (mptcp)
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-02 00:43:35 +09:00
Linus Torvalds
d42f323a7d Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "A few misc subsystems and some of MM.

  175 patches.

  Subsystems affected by this patch series: ia64, kbuild, scripts, sh,
  ocfs2, kfifo, vfs, kernel/watchdog, and mm (slab-generic, slub,
  kmemleak, debug, pagecache, msync, gup, memremap, memcg, pagemap,
  mremap, dma, sparsemem, vmalloc, documentation, kasan, initialization,
  pagealloc, and memory-failure)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (175 commits)
  mm/memory-failure: unnecessary amount of unmapping
  mm/mmzone.h: fix existing kernel-doc comments and link them to core-api
  mm: page_alloc: ignore init_on_free=1 for debug_pagealloc=1
  net: page_pool: use alloc_pages_bulk in refill code path
  net: page_pool: refactor dma_map into own function page_pool_dma_map
  SUNRPC: refresh rq_pages using a bulk page allocator
  SUNRPC: set rq_page_end differently
  mm/page_alloc: inline __rmqueue_pcplist
  mm/page_alloc: optimize code layout for __alloc_pages_bulk
  mm/page_alloc: add an array-based interface to the bulk page allocator
  mm/page_alloc: add a bulk page allocator
  mm/page_alloc: rename alloced to allocated
  mm/page_alloc: duplicate include linux/vmalloc.h
  mm, page_alloc: avoid page_to_pfn() in move_freepages()
  mm/Kconfig: remove default DISCONTIGMEM_MANUAL
  mm: page_alloc: dump migrate-failed pages
  mm/mempolicy: fix mpol_misplaced kernel-doc
  mm/mempolicy: rewrite alloc_pages_vma documentation
  mm/mempolicy: rewrite alloc_pages documentation
  mm/mempolicy: rename alloc_pages_current to alloc_pages
  ...
2021-04-30 14:38:01 -07:00
Linus Torvalds
c70a4be130 powerpc updates for 5.13
- Enable KFENCE for 32-bit.
 
  - Implement EBPF for 32-bit.
 
  - Convert 32-bit to do interrupt entry/exit in C.
 
  - Convert 64-bit BookE to do interrupt entry/exit in C.
 
  - Changes to our signal handling code to use user_access_begin/end() more extensively.
 
  - Add support for time namespaces (CONFIG_TIME_NS)
 
  - A series of fixes that allow us to reenable STRICT_KERNEL_RWX.
 
  - Other smaller features, fixes & cleanups.
 
 Thanks to: Alexey Kardashevskiy, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V,
   Athira Rajeev, Bhaskar Chowdhury, Bixuan Cui, Cédric Le Goater, Chen Huang, Chris
   Packham, Christophe Leroy, Christopher M. Riedl, Colin Ian King, Dan Carpenter, Daniel
   Axtens, Daniel Henrique Barboza, David Gibson, Davidlohr Bueso, Denis Efremov,
   dingsenjie, Dmitry Safonov, Dominic DeMarco, Fabiano Rosas, Ganesh Goudar, Geert
   Uytterhoeven, Geetika Moolchandani, Greg Kurz, Guenter Roeck, Haren Myneni, He Ying,
   Jiapeng Chong, Jordan Niethe, Laurent Dufour, Lee Jones, Leonardo Bras, Li Huafei,
   Madhavan Srinivasan, Mahesh Salgaonkar, Masahiro Yamada, Nathan Chancellor, Nathan
   Lynch, Nicholas Piggin, Oliver O'Halloran, Paul Menzel, Pu Lehui, Randy Dunlap, Ravi
   Bangoria, Rosen Penev, Russell Currey, Santosh Sivaraj, Sebastian Andrzej Siewior,
   Segher Boessenkool, Shivaprasad G Bhat, Srikar Dronamraju, Stephen Rothwell, Thadeu Lima
   de Souza Cascardo, Thomas Gleixner, Tony Ambardar, Tyrel Datwyler, Vaibhav Jain,
   Vincenzo Frascino, Xiongwei Song, Yang Li, Yu Kuai, Zhang Yunkai.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmCLV1kTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLUyD/4jrTolG4sVec211hYO+0VuJzoqN4Cf
 j2CA2Ju39butnSMiq4LJUPRB7QRZY1OofkoNFpZeDQspjfZXPz2ulpYAz+SxHWE2
 ReHPmWH1rOABlUPXFboePF4OLwmAs9eR5mN2z9HpKXbT3k78HaToLqiONyB4fVCr
 Q5TkJeRn/Y7ZJLdyPLTpczHHleQ8KoM6kT7ncXnTm6p97JOBJSrGaJ5N/8X5a4+e
 6jtgB7Pvw8jNDShSr8BDLBgBZZcmoTiuG8KfgwRZ+m+mKB1yI2X8S/a54w/lDi9g
 UcSv3jQcFLJuW+T/pYe4R330uWDYa0cwjJOtMmsJ98S4EYOevoe9fZuL97qNshme
 xtBr4q1i03G1icYOJJ8dXtvabG2rUzj8t1SCDpwYfrynzTWVRikiQYTXUBhRSFoK
 nsoklvKd2IZa485XYJ2ljSyClMy8S4yJJ9RuzZ94DTXDSJUesKuyRWGnso4mhkcl
 wvl4wwMTJvnCMKVo6dsJyV24QWfd6dABxzm04uPA94CKhG33UwK8252jXVeaohSb
 WSO7qWBONgDXQLJ0mXRcEYa9NHvFS4Jnp6APbxnHr1gS+K+PNkD4gPBf34FoyN0E
 9s27kvEYk5vr8APUclETF6+FkbGUD5bFbusjt3hYloFpAoHQ/k5pFVDsOZNPA8sW
 fDIRp05KunDojw==
 =dfKL
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Enable KFENCE for 32-bit.

 - Implement EBPF for 32-bit.

 - Convert 32-bit to do interrupt entry/exit in C.

 - Convert 64-bit BookE to do interrupt entry/exit in C.

 - Changes to our signal handling code to use user_access_begin/end()
   more extensively.

 - Add support for time namespaces (CONFIG_TIME_NS)

 - A series of fixes that allow us to reenable STRICT_KERNEL_RWX.

 - Other smaller features, fixes & cleanups.

Thanks to Alexey Kardashevskiy, Andreas Schwab, Andrew Donnellan, Aneesh
Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Bixuan Cui, Cédric Le
Goater, Chen Huang, Chris Packham, Christophe Leroy, Christopher M.
Riedl, Colin Ian King, Dan Carpenter, Daniel Axtens, Daniel Henrique
Barboza, David Gibson, Davidlohr Bueso, Denis Efremov, dingsenjie,
Dmitry Safonov, Dominic DeMarco, Fabiano Rosas, Ganesh Goudar, Geert
Uytterhoeven, Geetika Moolchandani, Greg Kurz, Guenter Roeck, Haren
Myneni, He Ying, Jiapeng Chong, Jordan Niethe, Laurent Dufour, Lee
Jones, Leonardo Bras, Li Huafei, Madhavan Srinivasan, Mahesh Salgaonkar,
Masahiro Yamada, Nathan Chancellor, Nathan Lynch, Nicholas Piggin,
Oliver O'Halloran, Paul Menzel, Pu Lehui, Randy Dunlap, Ravi Bangoria,
Rosen Penev, Russell Currey, Santosh Sivaraj, Sebastian Andrzej Siewior,
Segher Boessenkool, Shivaprasad G Bhat, Srikar Dronamraju, Stephen
Rothwell, Thadeu Lima de Souza Cascardo, Thomas Gleixner, Tony Ambardar,
Tyrel Datwyler, Vaibhav Jain, Vincenzo Frascino, Xiongwei Song, Yang Li,
Yu Kuai, and Zhang Yunkai.

* tag 'powerpc-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (302 commits)
  powerpc/signal32: Fix erroneous SIGSEGV on RT signal return
  powerpc: Avoid clang uninitialized warning in __get_user_size_allowed
  powerpc/papr_scm: Mark nvdimm as unarmed if needed during probe
  powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n
  powerpc/kasan: Fix shadow start address with modules
  powerpc/kernel/iommu: Use largepool as a last resort when !largealloc
  powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs
  powerpc/44x: fix spelling mistake in Kconfig "varients" -> "variants"
  powerpc/iommu: Annotate nested lock for lockdep
  powerpc/iommu: Do not immediately panic when failed IOMMU table allocation
  powerpc/iommu: Allocate it_map by vmalloc
  selftests/powerpc: remove unneeded semicolon
  powerpc/64s: remove unneeded semicolon
  powerpc/eeh: remove unneeded semicolon
  powerpc/selftests: Add selftest to test concurrent perf/ptrace events
  powerpc/selftests/perf-hwbreak: Add testcases for 2nd DAWR
  powerpc/selftests/perf-hwbreak: Coalesce event creation code
  powerpc/selftests/ptrace-hwbreak: Add testcases for 2nd DAWR
  powerpc/configs: Add IBMVNIC to some 64-bit configs
  selftests/powerpc: Add uaccess flush test
  ...
2021-04-30 12:22:28 -07:00
Uladzislau Rezki (Sony)
7bc4ca3ea9 vm/test_vmalloc.sh: adapt for updated driver interface
A 'single_cpu_test' parameter is odd and it does not exist anymore.
Instead there was introduced a 'nr_threads' one.  If it is not set it
behaves as the former parameter.

That is why update a "stress mode" according to this change specifying
number of workers which are equal to number of CPUs.  Also update an
output of help message based on a new interface.

Link: https://lkml.kernel.org/r/20210402202237.20334-3-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-30 11:20:40 -07:00
Brian Geffon
8593100444 selftests: add a MREMAP_DONTUNMAP selftest for shmem
This test extends the current mremap tests to validate that the
MREMAP_DONTUNMAP operation can be performed on shmem mappings.

Link: https://lkml.kernel.org/r/20210323182520.2712101-3-bgeffon@google.com
Signed-off-by: Brian Geffon <bgeffon@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Michael S . Tsirkin" <mst@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Sonny Rao <sonnyrao@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Alejandro Colomar <alx.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-30 11:20:39 -07:00
Johannes Weiner
4bbcc5a41c kselftests: cgroup: update kmem test for new vmstat implementation
With memcg having switched to rstat, memory.stat output is precise.
Update the cgroup selftest to reflect the expectations and error
tolerances of the new implementation.

Also add newly tracked types of memory to the memory.stat side of the
equation, since they're included in memory.current and could throw false
positives.

Link: https://lkml.kernel.org/r/20210209163304.77088-9-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-30 11:20:38 -07:00
Florent Revest
f80f88f0e2 selftests/bpf: Fix the snprintf test
The BPF program for the snprintf selftest runs on all syscall entries.
On busy multicore systems this can cause concurrency issues.

For example it was observed that sometimes the userspace part of the
test reads "    4 0000" instead of "    4 000" (extra '0' at the end)
which seems to happen just before snprintf on another core sets
end[-1] = '\0'.

This patch adds a pid filter to the test to ensure that no
bpf_snprintf() will write over the test's output buffers while the
userspace reads the values.

Fixes: c2e39c6bdc ("selftests/bpf: Add a series of tests for bpf_snprintf")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210428152501.1024509-1-revest@chromium.org
2021-04-30 10:36:59 -07:00
Linus Torvalds
b0030af53a Kbuild updates for v5.13
- Evaluate $(call cc-option,...) etc. only for build targets
 
  - Add CONFIG_VMLINUX_MAP to generate .map file when linking vmlinux
 
  - Remove unnecessary --gcc-toolchains Clang flag because the --prefix
    flag finds the toolchains
 
  - Do not pass Clang's --prefix flag when using the integrated as
 
  - Check the assembler version in Kconfig time
 
  - Add new CONFIG options, AS_VERSION, AS_IS_GNU, AS_IS_LLVM to clean up
    some dependencies in Kconfig
 
  - Fix invalid Module.symvers creation when building only modules without
    vmlinux
 
  - Fix false-positive modpost warnings when CONFIG_TRIM_UNUSED_KSYMS is
    set, but there is no module to build
 
  - Refactor module installation Makefile
 
  - Support zstd for module compression
 
  - Convert alpha and ia64 to use generic shell scripts to generate the
    syscall headers
 
  - Add a new elfnote to indicate if the kernel was built with LTO, which
    will be used by pahole
 
  - Flatten the directory structure under include/config/ so CONFIG options
    and filenames match
 
  - Change the deb source package name from linux-$(KERNELRELEASE) to
    linux-upstream
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmCKOLUVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGdq8P/2z+saxIWGXVWt0ggavR0vimcY4e
 NQIKGu9uZpo/lfoC78UG8HO+XvzvPUrcRuOX+WIVr2GfScgVnweDukexUAY0/2oi
 4UvqhndJ0sjEwRj8mXXJ0O+PED+OtgrqrbhkLq9wHQd/jpSD4XEWXwn1g1XVrTZu
 WbwP6b1G/Rnjp2lz3HKC017rPkmfsCFQB7r+hbJGKhT0rCaceheUuBvGa/XqLknr
 IOyaUAY76u3Gtj6fVY1rk70kQgDMF8+LJPgdSSZ/XPCvbNJQAeop36EeRNfmxGIh
 vQhFJRJeqy+K5MhCpdGtTGYDawlmQVn/f/99SkDw9F04S4ZL2Xnaaqw4L1QDhjTh
 xBlckbPvmq36F4xSqWd5kYF3iwS+LsEJROwZKFLEVDb3zMsRQPEGQM/556QmrBi2
 5KXzwOYEJKuobWr1hQ3PwLumJKTPGLvGEFB3Bq2eG8LrgpOAHPI4ejC2EBu0vCez
 QbskP2lPlMj3MbL5iZg+6ZRlOChZ7RUrSDj6+iTeOcinmXHqQONCL6qy+um4Rfcb
 zUkfwTlqM9d88u6AbO2VvQMOobMjvp4bvmqi/Xv8IiTukLHco4tc8zTuySmZwSyI
 rd3RKYn367qWztX5YyaoGRPVmlMG7ssbRc4fkXiV13vfeZebNfVwlX/CHv9+IWwN
 RVnMhYBhUZR68h6z
 =ti9L
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Evaluate $(call cc-option,...) etc. only for build targets

 - Add CONFIG_VMLINUX_MAP to generate .map file when linking vmlinux

 - Remove unnecessary --gcc-toolchains Clang flag because the --prefix
   flag finds the toolchains

 - Do not pass Clang's --prefix flag when using the integrated as

 - Check the assembler version in Kconfig time

 - Add new CONFIG options, AS_VERSION, AS_IS_GNU, AS_IS_LLVM to clean up
   some dependencies in Kconfig

 - Fix invalid Module.symvers creation when building only modules
   without vmlinux

 - Fix false-positive modpost warnings when CONFIG_TRIM_UNUSED_KSYMS is
   set, but there is no module to build

 - Refactor module installation Makefile

 - Support zstd for module compression

 - Convert alpha and ia64 to use generic shell scripts to generate the
   syscall headers

 - Add a new elfnote to indicate if the kernel was built with LTO, which
   will be used by pahole

 - Flatten the directory structure under include/config/ so CONFIG
   options and filenames match

 - Change the deb source package name from linux-$(KERNELRELEASE) to
   linux-upstream

* tag 'kbuild-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (42 commits)
  kbuild: Add $(KBUILD_HOSTLDFLAGS) to 'has_libelf' test
  kbuild: deb-pkg: change the source package name to linux-upstream
  tools: do not include scripts/Kbuild.include
  kbuild: redo fake deps at include/config/*.h
  kbuild: remove TMPO from try-run
  MAINTAINERS: add pattern for dummy-tools
  kbuild: add an elfnote for whether vmlinux is built with lto
  ia64: syscalls: switch to generic syscallhdr.sh
  ia64: syscalls: switch to generic syscalltbl.sh
  alpha: syscalls: switch to generic syscallhdr.sh
  alpha: syscalls: switch to generic syscalltbl.sh
  sysctl: use min() helper for namecmp()
  kbuild: add support for zstd compressed modules
  kbuild: remove CONFIG_MODULE_COMPRESS
  kbuild: merge scripts/Makefile.modsign to scripts/Makefile.modinst
  kbuild: move module strip/compression code into scripts/Makefile.modinst
  kbuild: refactor scripts/Makefile.modinst
  kbuild: rename extmod-prefix to extmod_prefix
  kbuild: check module name conflict for external modules as well
  kbuild: show the target directory for depmod log
  ...
2021-04-29 14:24:39 -07:00
Linus Torvalds
9d31d23389 Networking changes for 5.13.
Core:
 
  - bpf:
 	- allow bpf programs calling kernel functions (initially to
 	  reuse TCP congestion control implementations)
 	- enable task local storage for tracing programs - remove the
 	  need to store per-task state in hash maps, and allow tracing
 	  programs access to task local storage previously added for
 	  BPF_LSM
 	- add bpf_for_each_map_elem() helper, allowing programs to
 	  walk all map elements in a more robust and easier to verify
 	  fashion
 	- sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
 	  redirection
 	- lpm: add support for batched ops in LPM trie
 	- add BTF_KIND_FLOAT support - mostly to allow use of BTF
 	  on s390 which has floats in its headers files
 	- improve BPF syscall documentation and extend the use of kdoc
 	  parsing scripts we already employ for bpf-helpers
 	- libbpf, bpftool: support static linking of BPF ELF files
 	- improve support for encapsulation of L2 packets
 
  - xdp: restructure redirect actions to avoid a runtime lookup,
 	improving performance by 4-8% in microbenchmarks
 
  - xsk: build skb by page (aka generic zerocopy xmit) - improve
 	performance of software AF_XDP path by 33% for devices
 	which don't need headers in the linear skb part (e.g. virtio)
 
  - nexthop: resilient next-hop groups - improve path stability
 	on next-hops group changes (incl. offload for mlxsw)
 
  - ipv6: segment routing: add support for IPv4 decapsulation
 
  - icmp: add support for RFC 8335 extended PROBE messages
 
  - inet: use bigger hash table for IP ID generation
 
  - tcp: deal better with delayed TX completions - make sure we don't
 	give up on fast TCP retransmissions only because driver is
 	slow in reporting that it completed transmitting the original
 
  - tcp: reorder tcp_congestion_ops for better cache locality
 
  - mptcp:
 	- add sockopt support for common TCP options
 	- add support for common TCP msg flags
 	- include multiple address ids in RM_ADDR
 	- add reset option support for resetting one subflow
 
  - udp: GRO L4 improvements - improve 'forward' / 'frag_list'
 	co-existence with UDP tunnel GRO, allowing the first to take
 	place correctly	even for encapsulated UDP traffic
 
  - micro-optimize dev_gro_receive() and flow dissection, avoid
 	retpoline overhead on VLAN and TEB GRO
 
  - use less memory for sysctls, add a new sysctl type, to allow using
 	u8 instead of "int" and "long" and shrink networking sysctls
 
  - veth: allow GRO without XDP - this allows aggregating UDP
 	packets before handing them off to routing, bridge, OvS, etc.
 
  - allow specifing ifindex when device is moved to another namespace
 
  - netfilter:
 	- nft_socket: add support for cgroupsv2
 	- nftables: add catch-all set element - special element used
 	  to define a default action in case normal lookup missed
 	- use net_generic infra in many modules to avoid allocating
 	  per-ns memory unnecessarily
 
  - xps: improve the xps handling to avoid potential out-of-bound
 	accesses and use-after-free when XPS change race with other
 	re-configuration under traffic
 
  - add a config knob to turn off per-cpu netdev refcnt to catch
 	underflows in testing
 
 Device APIs:
 
  - add WWAN subsystem to organize the WWAN interfaces better and
    hopefully start driving towards more unified and vendor-
    -independent APIs
 
  - ethtool:
 	- add interface for reading IEEE MIB stats (incl. mlx5 and
 	  bnxt support)
 	- allow network drivers to dump arbitrary SFP EEPROM data,
 	  current offset+length API was a poor fit for modern SFP
 	  which define EEPROM in terms of pages (incl. mlx5 support)
 
  - act_police, flow_offload: add support for packet-per-second
 	policing (incl. offload for nfp)
 
  - psample: add additional metadata attributes like transit delay
 	for packets sampled from switch HW (and corresponding egress
 	and policy-based sampling in the mlxsw driver)
 
  - dsa: improve support for sandwiched LAGs with bridge and DSA
 
  - netfilter:
 	- flowtable: use direct xmit in topologies with IP
 	  forwarding, bridging, vlans etc.
 	- nftables: counter hardware offload support
 
  - Bluetooth:
 	- improvements for firmware download w/ Intel devices
 	- add support for reading AOSP vendor capabilities
 	- add support for virtio transport driver
 
  - mac80211:
 	- allow concurrent monitor iface and ethernet rx decap
 	- set priority and queue mapping for injected frames
 
  - phy: add support for Clause-45 PHY Loopback
 
  - pci/iov: add sysfs MSI-X vector assignment interface
 	to distribute MSI-X resources to VFs (incl. mlx5 support)
 
 New hardware/drivers:
 
  - dsa: mv88e6xxx: add support for Marvell mv88e6393x -
 	11-port Ethernet switch with 8x 1-Gigabit Ethernet
 	and 3x 10-Gigabit interfaces.
 
  - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365
 	and BCM63xx switches
 
  - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches
 
  - ath11k: support for QCN9074 a 802.11ax device
 
  - Bluetooth: Broadcom BCM4330 and BMC4334
 
  - phy: Marvell 88X2222 transceiver support
 
  - mdio: add BCM6368 MDIO mux bus controller
 
  - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips
 
  - mana: driver for Microsoft Azure Network Adapter (MANA)
 
  - Actions Semi Owl Ethernet MAC
 
  - can: driver for ETAS ES58X CAN/USB interfaces
 
 Pure driver changes:
 
  - add XDP support to: enetc, igc, stmmac
  - add AF_XDP support to: stmmac
 
  - virtio:
 	- page_to_skb() use build_skb when there's sufficient tailroom
 	  (21% improvement for 1000B UDP frames)
 	- support XDP even without dedicated Tx queues - share the Tx
 	  queues with the stack when necessary
 
  - mlx5:
 	- flow rules: add support for mirroring with conntrack,
 	  matching on ICMP, GTP, flex filters and more
 	- support packet sampling with flow offloads
 	- persist uplink representor netdev across eswitch mode
 	  changes
 	- allow coexistence of CQE compression and HW time-stamping
 	- add ethtool extended link error state reporting
 
  - ice, iavf: support flow filters, UDP Segmentation Offload
 
  - dpaa2-switch:
 	- move the driver out of staging
 	- add spanning tree (STP) support
 	- add rx copybreak support
 	- add tc flower hardware offload on ingress traffic
 
  - ionic:
 	- implement Rx page reuse
 	- support HW PTP time-stamping
 
  - octeon: support TC hardware offloads - flower matching on ingress
 	and egress ratelimitting.
 
  - stmmac:
 	- add RX frame steering based on VLAN priority in tc flower
 	- support frame preemption (FPE)
 	- intel: add cross time-stamping freq difference adjustment
 
  - ocelot:
 	- support forwarding of MRP frames in HW
 	- support multiple bridges
 	- support PTP Sync one-step timestamping
 
  - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
 	learning, flooding etc.
 
  - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
 	SC7280 SoCs)
 
  - mt7601u: enable TDLS support
 
  - mt76:
 	- add support for 802.3 rx frames (mt7915/mt7615)
 	- mt7915 flash pre-calibration support
 	- mt7921/mt7663 runtime power management fixes
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmCKFPIACgkQMUZtbf5S
 Irtw0g/+NA8bWdHNgG4H5rya0pv2z3IieLRmSdDfKRQQXcJpklawc5MKVVaTee/Q
 5/QqgPdCsu1LAU6JXBKsKmyDDaMlQKdWuKbOqDSiAQKoMesZStTEHf9d851ZzgxA
 Cdb6O7BD3lBl/IN+oxNG+KcmD1LKquTPKGySq2mQtEdLO12ekAsranzmj4voKffd
 q9tBShpXQ7Dq77DLYfiQXVCvsizNcbbJFuxX0o9Lpb9+61ZyYAbogZSa9ypiZZwR
 I/9azRBtJg7UV1aD/cLuAfy66Qh7t63+rCxVazs5Os8jVO26P/jQdisnnOe/x+p9
 wYEmKm3GSu0V4SAPxkWW+ooKusflCeqDoMIuooKt6kbP6BRj540veGw3Ww/m5YFr
 7pLQkTSP/tSjuGQIdBE1LOP5LBO8DZeC8Kiop9V0fzAW9hFSZbEq25WW0bPj8QQO
 zA4Z7yWlslvxcfY2BdJX3wD8klaINkl/8fDWZFFsBdfFX2VeLtm7Xfduw34BJpvU
 rYT3oWr6PhtkPAKR32SUcemSfeWgIVU41eSshzRz3kez1NngBUuLlSGGSEaKbes5
 pZVt6pYFFVByyf6MTHFEoQvafZfEw04JILZpo4R5V8iTHzom0kD3Py064sBiXEw2
 B6t+OW4qgcxGblpFkK2lD4kR2s1TPUs0ckVO6sAy1x8q60KKKjY=
 =vcbA
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - bpf:
        - allow bpf programs calling kernel functions (initially to
          reuse TCP congestion control implementations)
        - enable task local storage for tracing programs - remove the
          need to store per-task state in hash maps, and allow tracing
          programs access to task local storage previously added for
          BPF_LSM
        - add bpf_for_each_map_elem() helper, allowing programs to walk
          all map elements in a more robust and easier to verify fashion
        - sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
          redirection
        - lpm: add support for batched ops in LPM trie
        - add BTF_KIND_FLOAT support - mostly to allow use of BTF on
          s390 which has floats in its headers files
        - improve BPF syscall documentation and extend the use of kdoc
          parsing scripts we already employ for bpf-helpers
        - libbpf, bpftool: support static linking of BPF ELF files
        - improve support for encapsulation of L2 packets

   - xdp: restructure redirect actions to avoid a runtime lookup,
     improving performance by 4-8% in microbenchmarks

   - xsk: build skb by page (aka generic zerocopy xmit) - improve
     performance of software AF_XDP path by 33% for devices which don't
     need headers in the linear skb part (e.g. virtio)

   - nexthop: resilient next-hop groups - improve path stability on
     next-hops group changes (incl. offload for mlxsw)

   - ipv6: segment routing: add support for IPv4 decapsulation

   - icmp: add support for RFC 8335 extended PROBE messages

   - inet: use bigger hash table for IP ID generation

   - tcp: deal better with delayed TX completions - make sure we don't
     give up on fast TCP retransmissions only because driver is slow in
     reporting that it completed transmitting the original

   - tcp: reorder tcp_congestion_ops for better cache locality

   - mptcp:
        - add sockopt support for common TCP options
        - add support for common TCP msg flags
        - include multiple address ids in RM_ADDR
        - add reset option support for resetting one subflow

   - udp: GRO L4 improvements - improve 'forward' / 'frag_list'
     co-existence with UDP tunnel GRO, allowing the first to take place
     correctly even for encapsulated UDP traffic

   - micro-optimize dev_gro_receive() and flow dissection, avoid
     retpoline overhead on VLAN and TEB GRO

   - use less memory for sysctls, add a new sysctl type, to allow using
     u8 instead of "int" and "long" and shrink networking sysctls

   - veth: allow GRO without XDP - this allows aggregating UDP packets
     before handing them off to routing, bridge, OvS, etc.

   - allow specifing ifindex when device is moved to another namespace

   - netfilter:
        - nft_socket: add support for cgroupsv2
        - nftables: add catch-all set element - special element used to
          define a default action in case normal lookup missed
        - use net_generic infra in many modules to avoid allocating
          per-ns memory unnecessarily

   - xps: improve the xps handling to avoid potential out-of-bound
     accesses and use-after-free when XPS change race with other
     re-configuration under traffic

   - add a config knob to turn off per-cpu netdev refcnt to catch
     underflows in testing

  Device APIs:

   - add WWAN subsystem to organize the WWAN interfaces better and
     hopefully start driving towards more unified and vendor-
     independent APIs

   - ethtool:
        - add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
          support)
        - allow network drivers to dump arbitrary SFP EEPROM data,
          current offset+length API was a poor fit for modern SFP which
          define EEPROM in terms of pages (incl. mlx5 support)

   - act_police, flow_offload: add support for packet-per-second
     policing (incl. offload for nfp)

   - psample: add additional metadata attributes like transit delay for
     packets sampled from switch HW (and corresponding egress and
     policy-based sampling in the mlxsw driver)

   - dsa: improve support for sandwiched LAGs with bridge and DSA

   - netfilter:
        - flowtable: use direct xmit in topologies with IP forwarding,
          bridging, vlans etc.
        - nftables: counter hardware offload support

   - Bluetooth:
        - improvements for firmware download w/ Intel devices
        - add support for reading AOSP vendor capabilities
        - add support for virtio transport driver

   - mac80211:
        - allow concurrent monitor iface and ethernet rx decap
        - set priority and queue mapping for injected frames

   - phy: add support for Clause-45 PHY Loopback

   - pci/iov: add sysfs MSI-X vector assignment interface to distribute
     MSI-X resources to VFs (incl. mlx5 support)

  New hardware/drivers:

   - dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
     Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
     interfaces.

   - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
     BCM63xx switches

   - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches

   - ath11k: support for QCN9074 a 802.11ax device

   - Bluetooth: Broadcom BCM4330 and BMC4334

   - phy: Marvell 88X2222 transceiver support

   - mdio: add BCM6368 MDIO mux bus controller

   - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips

   - mana: driver for Microsoft Azure Network Adapter (MANA)

   - Actions Semi Owl Ethernet MAC

   - can: driver for ETAS ES58X CAN/USB interfaces

  Pure driver changes:

   - add XDP support to: enetc, igc, stmmac

   - add AF_XDP support to: stmmac

   - virtio:
        - page_to_skb() use build_skb when there's sufficient tailroom
          (21% improvement for 1000B UDP frames)
        - support XDP even without dedicated Tx queues - share the Tx
          queues with the stack when necessary

   - mlx5:
        - flow rules: add support for mirroring with conntrack, matching
          on ICMP, GTP, flex filters and more
        - support packet sampling with flow offloads
        - persist uplink representor netdev across eswitch mode changes
        - allow coexistence of CQE compression and HW time-stamping
        - add ethtool extended link error state reporting

   - ice, iavf: support flow filters, UDP Segmentation Offload

   - dpaa2-switch:
        - move the driver out of staging
        - add spanning tree (STP) support
        - add rx copybreak support
        - add tc flower hardware offload on ingress traffic

   - ionic:
        - implement Rx page reuse
        - support HW PTP time-stamping

   - octeon: support TC hardware offloads - flower matching on ingress
     and egress ratelimitting.

   - stmmac:
        - add RX frame steering based on VLAN priority in tc flower
        - support frame preemption (FPE)
        - intel: add cross time-stamping freq difference adjustment

   - ocelot:
        - support forwarding of MRP frames in HW
        - support multiple bridges
        - support PTP Sync one-step timestamping

   - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
     learning, flooding etc.

   - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
     SC7280 SoCs)

   - mt7601u: enable TDLS support

   - mt76:
        - add support for 802.3 rx frames (mt7915/mt7615)
        - mt7915 flash pre-calibration support
        - mt7921/mt7663 runtime power management fixes"

* tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
  net: selftest: fix build issue if INET is disabled
  net: netrom: nr_in: Remove redundant assignment to ns
  net: tun: Remove redundant assignment to ret
  net: phy: marvell: add downshift support for M88E1240
  net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
  net/sched: act_ct: Remove redundant ct get and check
  icmp: standardize naming of RFC 8335 PROBE constants
  bpf, selftests: Update array map tests for per-cpu batched ops
  bpf: Add batched ops support for percpu array
  bpf: Implement formatted output helpers with bstr_printf
  seq_file: Add a seq_bprintf function
  sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
  net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
  net: fix a concurrency bug in l2tp_tunnel_register()
  net/smc: Remove redundant assignment to rc
  mpls: Remove redundant assignment to err
  llc2: Remove redundant assignment to rc
  net/tls: Remove redundant initialization of record
  rds: Remove redundant assignment to nr_sig
  dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
  ...
2021-04-29 11:57:23 -07:00
Arnaldo Carvalho de Melo
c6e3bf4371 perf build: Defer printing detected features to the end of all feature checks
We were doing it in tools/build/Makefile.feature, after running the
feature checks, but then in tools/perf/Makefile.config we can call more
feature checks when we notice that some feature check failed, like when
libbfd wasn't detected and we add libraries to the LDFLAGS of its
feature check to try again, etc.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 11:22:33 -03:00
Arnaldo Carvalho de Melo
19177bc3da tools build: Allow deferring printing the results of feature detection
By setting FEATURE_DISPLAY_DEFERRED=1 a tool may ask for the printout
of the detected features in tools/build/Makefile.feature to be done
later adter extra feature checks are done that are tool specific.

The perf tool will do it via its tools/perf/Makefile.config, as it
performs such extra feature checks.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jiri Olsa
fbed59f844 perf build: Regenerate the FEATURE_DUMP file after extra feature checks
Feature detection is done in tools/build/Makefile.feature, we may exit
there with some features not detected and then, in
tools/perf/Makefile.config try adding extra libraries to link and then
do extra feature checks to see if we now find the feature.

This is the case with the disassembler-four-args that checks if the
diassembler() function in libopcodes (binutils) has a signature with
one or with four arguments, as this is not ABI and they changed it at
some point.

This is not a problem when doing normal builds, for instance:

  $ make -C tools/perf O=/tmp/build/perf

As we don't use what is in FEATURE-DUMP at that point, but is a problem
if we pass FEATURE_DUMP=/previously-detected-features as we do in
'make -C tools/perf build-test' to reuse the feature detection in the
many build combinations we test there.

When that is done feature-disassembler-four-args will be set to 0, but
opensuse 15.1 has the four arguments function signature in
disassembler(). The build thus fails.

Fix it by rewriting the FEATURE-DUMP file at the end of
tools/perf/Makefile.config to register features we retested in that make
file.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Leo Yan
81e70d7ee4 perf session: Dump PERF_RECORD_TIME_CONV event
Now perf tool uses the common stub function process_event_op2_stub() for
dumping TIME_CONV event, thus it doesn't output the clock parameters
contained in the event.

This patch adds the callback function for dumping the hardware clock
parameters in TIME_CONV event.

Before:

  # perf report -D

  0x978 [0x38]: event: 79
  .
  . ... raw event: size 56 bytes
  .  0000:  4f 00 00 00 00 00 38 00 15 00 00 00 00 00 00 00  O.....8.........
  .  0010:  00 00 40 01 00 00 00 00 86 89 0b bf df ff ff ff  ..@........<BF><DF><FF><FF><FF>
  .  0020:  d1 c1 b2 39 03 00 00 00 ff ff ff ff ff ff ff 00  <D1><C1><B2>9....<FF><FF><FF><FF><FF><FF><FF>.
  .  0030:  01 01 00 00 00 00 00 00                          ........

  0 0 0x978 [0x38]: PERF_RECORD_TIME_CONV
  : unhandled!

  [...]

After:

  # perf report -D

  0x978 [0x38]: event: 79
  .
  . ... raw event: size 56 bytes
  .  0000:  4f 00 00 00 00 00 38 00 15 00 00 00 00 00 00 00  O.....8.........
  .  0010:  00 00 40 01 00 00 00 00 86 89 0b bf df ff ff ff  ..@........<BF><DF><FF><FF><FF>
  .  0020:  d1 c1 b2 39 03 00 00 00 ff ff ff ff ff ff ff 00  <D1><C1><B2>9....<FF><FF><FF><FF><FF><FF><FF>.
  .  0030:  01 01 00 00 00 00 00 00                          ........

  0 0 0x978 [0x38]: PERF_RECORD_TIME_CONV
  ... Time Shift      21
  ... Time Muliplier  20971520
  ... Time Zero       18446743935180835206
  ... Time Cycles     13852918225
  ... Time Mask       0xffffffffffffff
  ... Cap Time Zero   1
  ... Cap Time Short  1
  : unhandled!

  [...]

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20210428120915.7123-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Leo Yan
050ffc4490 perf session: Add swap operation for event TIME_CONV
Since commit d110162caf ("perf tsc: Support cap_user_time_short for
event TIME_CONV"), the event PERF_RECORD_TIME_CONV has extended the data
structure for clock parameters.

To be backwards-compatible, this patch adds a dedicated swap operation
for the event PERF_RECORD_TIME_CONV, based on checking if the event
contains field "time_cycles", it can support both for the old and new
event formats.

Fixes: d110162caf ("perf tsc: Support cap_user_time_short for event TIME_CONV")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20210428120915.7123-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Leo Yan
aa616f5a8a perf jit: Let convert_timestamp() to be backwards-compatible
Commit d110162caf ("perf tsc: Support cap_user_time_short for
event TIME_CONV") supports the extended parameters for event TIME_CONV,
but it broke the backwards compatibility, so any perf data file with old
event format fails to convert timestamp.

This patch introduces a helper event_contains() to check if an event
contains a specific member or not.  For the backwards-compatibility, if
the event size confirms the extended parameters are supported in the
event TIME_CONV, then copies these parameters.

Committer notes:

To make this compiler backwards compatible add this patch:

  -       struct perf_tsc_conversion tc = { 0 };
  +       struct perf_tsc_conversion tc = { .time_shift = 0, };

Fixes: d110162caf ("perf tsc: Support cap_user_time_short for event TIME_CONV")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20210428120915.7123-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Leo Yan
e1d380ea8b perf tools: Change fields type in perf_record_time_conv
C standard claims "An object declared as type _Bool is large enough to
store the values 0 and 1", bool type size can be 1 byte or larger than
1 byte.  Thus it's uncertian for bool type size with different
compilers.

This patch changes the bool type in structure perf_record_time_conv to
__u8 type, and pads extra bytes for 8-byte alignment; this can give
reliable structure size.

Fixes: d110162caf ("perf tsc: Support cap_user_time_short for event TIME_CONV")
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20210428120915.7123-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Michael Petlan
56d32d4cac perf tools: Enable libtraceevent dynamic linking
Currently we support only static linking with kernel's libtraceevent
(tools/lib/traceevent). This patch adds libtraceevent package detection
and support to link perf with it dynamically.

  The libtraceevent package status is displayed with:
  $ make VF=1 LIBTRACEEVENT_DYNAMIC=1
  ...
  ...                 libtraceevent: [ on  ]

Default behavior remains the same (static linking).

Committer testing:

  $ make LIBTRACEEVENT_DYNAMIC=1 VF=1 O=/tmp/build/perf -C tools/perf install-bin |& grep traceevent
  Makefile.config:1090: *** Error: No libtraceevent devel library found, please install libtraceevent-devel.  Stop.
  $

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
LPU-Reference: 20210428092023.4009-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jin Yao
2750ce1d4d perf Documentation: Document intel-hybrid support
Add some words and examples to help understanding of
Intel hybrid perf support.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-27-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jin Yao
a37f3b8856 perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid
Currently we don't support shadow stat for hybrid.

  root@ssp-pwrt-002:~# ./perf stat -e cycles,instructions -a -- sleep 1

   Performance counter stats for 'system wide':

      12,883,109,591      cpu_core/cycles/
       6,405,163,221      cpu_atom/cycles/
         555,553,778      cpu_core/instructions/
         841,158,734      cpu_atom/instructions/

         1.002644773 seconds time elapsed

Now there is no shadow stat 'insn per cycle' reported. We will support
it later and now just skip the 'perf stat metrics (shadow stat) test'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-26-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jin Yao
d9da6f70eb perf tests: Support 'Convert perf time to TSC' test for hybrid
Since for "cycles:u' on hybrid platform, it creates two "cycles".  So
the second evsel in evlist also needs initialization.

With this patch,

  # ./perf test 71
  71: Convert perf time to TSC                                        : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-25-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jin Yao
c102038892 perf tests: Support 'Session topology' test for hybrid
Force to create one event "cpu_core/cycles/" by default, otherwise in
evlist__valid_sample_type, the checking of 'if (evlist->core.nr_entries
== 1)' would be failed.

  # ./perf test 41
  41: Session topology                                                : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-24-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jin Yao
6081e876ed perf tests: Support 'Parse and process metrics' test for hybrid
Some events are not supported. Only pick up some cases for hybrid.

  # ./perf test 68
  68: Parse and process metrics                                       : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-23-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jin Yao
43eb05d066 perf tests: Support 'Track with sched_switch' test for hybrid
Since for "cycles:u' on hybrid platform, it creates two "cycles".
So the number of events in evlist is not expected in next test
steps. Now we just use one event "cpu_core/cycles:u/" for hybrid.

  # ./perf test 35
  35: Track with sched_switch                                         : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-22-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jin Yao
f15da0b1fb perf tests: Skip 'Setup struct perf_event_attr' test for hybrid
For hybrid, the attr.type consists of pmu type id + original type.
There will be much changes for this test. Now we temporarily
skip this test case and TODO in future.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-21-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jin Yao
afff9f312e perf tests: Add hybrid cases for 'Roundtrip evsel->name' test
Since for one hw event, two hybrid events are created.

For example,

evsel->idx      evsel__name(evsel)
0               cycles
1               cycles
2               instructions
3               instructions
...

So for comparing the evsel name on hybrid, the evsel->idx
needs to be divided by 2.

  # ./perf test 14
  14: Roundtrip evsel->name                                           : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-20-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jin Yao
2541cb63ac perf tests: Add hybrid cases for 'Parse event definition strings' test
Add basic hybrid test cases for 'Parse event definition strings' test.

  # perf test 6
   6: Parse event definition strings                                  : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-19-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:31:00 -03:00
Jin Yao
91c0f5ec81 perf record: Uniquify hybrid event name
For perf-record, it would be useful to tell user the pmu which the
event belongs to.

For example,

  # perf record -a -- sleep 1
  # perf report

  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 106  of event 'cpu_core/cycles/'
  # Event count (approx.): 22043448
  #
  # Overhead  Command       Shared Object            Symbol
  # ........  ............  .......................  ............................
  #
  ...

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-18-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
660e533e87 perf stat: Warn group events from different hybrid PMU
If a group has events which are from different hybrid PMUs,
shows a warning:

"WARNING: events in group from different hybrid PMUs!"

This is to remind the user not to put the core event and atom
event into one group.

Next, just disable grouping.

  # perf stat -e "{cpu_core/cycles/,cpu_atom/cycles/}" -a -- sleep 1
  WARNING: events in group from different hybrid PMUs!
  WARNING: grouped events cpus do not match, disabling group:
    anon group { cpu_core/cycles/, cpu_atom/cycles/ }

   Performance counter stats for 'system wide':

           5,438,125      cpu_core/cycles/
           3,914,586      cpu_atom/cycles/

         1.004250966 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-17-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
92637cc729 perf stat: Filter out unmatched aggregation for hybrid event
perf-stat has supported some aggregation modes, such as --per-core,
--per-socket and etc. While for hybrid event, it may only available
on part of cpus. So for --per-core, we need to filter out the
unavailable cores, for --per-socket, filter out the unavailable
sockets, and so on.

Before:

  # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1

   Performance counter stats for 'system wide':

  S0-D0-C0           2            479,530      cpu_core/cycles/
  S0-D0-C4           2            175,007      cpu_core/cycles/
  S0-D0-C8           2            166,240      cpu_core/cycles/
  S0-D0-C12          2            704,673      cpu_core/cycles/
  S0-D0-C16          2            865,835      cpu_core/cycles/
  S0-D0-C20          2          2,958,461      cpu_core/cycles/
  S0-D0-C24          2            163,988      cpu_core/cycles/
  S0-D0-C28          2            164,729      cpu_core/cycles/
  S0-D0-C32          0      <not counted>      cpu_core/cycles/
  S0-D0-C33          0      <not counted>      cpu_core/cycles/
  S0-D0-C34          0      <not counted>      cpu_core/cycles/
  S0-D0-C35          0      <not counted>      cpu_core/cycles/
  S0-D0-C36          0      <not counted>      cpu_core/cycles/
  S0-D0-C37          0      <not counted>      cpu_core/cycles/
  S0-D0-C38          0      <not counted>      cpu_core/cycles/
  S0-D0-C39          0      <not counted>      cpu_core/cycles/

         1.003597211 seconds time elapsed

After:

  # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1

   Performance counter stats for 'system wide':

  S0-D0-C0           2            210,428      cpu_core/cycles/
  S0-D0-C4           2            444,830      cpu_core/cycles/
  S0-D0-C8           2            435,241      cpu_core/cycles/
  S0-D0-C12          2            423,976      cpu_core/cycles/
  S0-D0-C16          2            859,350      cpu_core/cycles/
  S0-D0-C20          2          1,559,589      cpu_core/cycles/
  S0-D0-C24          2            163,924      cpu_core/cycles/
  S0-D0-C28          2            376,610      cpu_core/cycles/

         1.003621290 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Co-developed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-16-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
ac2dc29edd perf stat: Add default hybrid events
Previously if '-e' is not specified in perf stat, some software events
and hardware events are added to evlist by default.

Before:

  # perf stat -a -- sleep 1

   Performance counter stats for 'system wide':

           24,044.40 msec cpu-clock                 #   23.946 CPUs utilized
                  99      context-switches          #    4.117 /sec
                  24      cpu-migrations            #    0.998 /sec
                   3      page-faults               #    0.125 /sec
           7,000,244      cycles                    #    0.000 GHz
           2,955,024      instructions              #    0.42  insn per cycle
             608,941      branches                  #   25.326 K/sec
              31,991      branch-misses             #    5.25% of all branches

         1.004106859 seconds time elapsed

Among the events, cycles, instructions, branches and branch-misses
are hardware events.

One hybrid platform, two hardware events are created for one
hardware event.

cpu_core/cycles/,
cpu_atom/cycles/,
cpu_core/instructions/,
cpu_atom/instructions/,
cpu_core/branches/,
cpu_atom/branches/,
cpu_core/branch-misses/,
cpu_atom/branch-misses/

These events would be added to evlist on hybrid platform.

Since parse_events() has been supported to create two hardware events
for one event on hybrid platform, so we just use parse_events(evlist,
"cycles,instructions,branches,branch-misses") to create the default
events and add them to evlist.

After:

  # perf stat -a -- sleep 1

   Performance counter stats for 'system wide':

           24,043.99 msec cpu-clock                 #   23.991 CPUs utilized
                 139      context-switches          #    5.781 /sec
                  25      cpu-migrations            #    1.040 /sec
                   6      page-faults               #    0.250 /sec
          10,381,751      cpu_core/cycles/          #  431.782 K/sec
           1,264,216      cpu_atom/cycles/          #   52.579 K/sec
           3,406,958      cpu_core/instructions/    #  141.697 K/sec
             414,588      cpu_atom/instructions/    #   17.243 K/sec
             705,149      cpu_core/branches/        #   29.327 K/sec
              82,358      cpu_atom/branches/        #    3.425 K/sec
              40,821      cpu_core/branch-misses/   #    1.698 K/sec
               9,086      cpu_atom/branch-misses/   #  377.891 /sec

         1.002228863 seconds time elapsed

We can see two events are created for one hardware event.

One TODO is, the shadow stats looks a bit different, now it's just
'M/sec'.

The perf_stat__update_shadow_stats and perf_stat__print_shadow_stats
need to be improved in future if we want to get the original shadow
stats.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-15-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
b53a0755d5 perf record: Create two hybrid 'cycles' events by default
When evlist is empty, for example no '-e' specified in perf record,
one default 'cycles' event is added to evlist.

While on hybrid platform, it needs to create two default 'cycles'
events. One is for cpu_core, the other is for cpu_atom.

This patch actually calls evsel__new_cycles() two times to create
two 'cycles' events.

  # ./perf record -vv -a -- sleep 1
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|ID|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    freq                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 13
  sys_perf_event_open: pid -1  cpu 8  group_fd -1  flags 0x8 = 14
  sys_perf_event_open: pid -1  cpu 9  group_fd -1  flags 0x8 = 15
  sys_perf_event_open: pid -1  cpu 10  group_fd -1  flags 0x8 = 16
  sys_perf_event_open: pid -1  cpu 11  group_fd -1  flags 0x8 = 17
  sys_perf_event_open: pid -1  cpu 12  group_fd -1  flags 0x8 = 18
  sys_perf_event_open: pid -1  cpu 13  group_fd -1  flags 0x8 = 19
  sys_perf_event_open: pid -1  cpu 14  group_fd -1  flags 0x8 = 20
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 21
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|ID|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    freq                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 22
  sys_perf_event_open: pid -1  cpu 17  group_fd -1  flags 0x8 = 23
  sys_perf_event_open: pid -1  cpu 18  group_fd -1  flags 0x8 = 24
  sys_perf_event_open: pid -1  cpu 19  group_fd -1  flags 0x8 = 25
  sys_perf_event_open: pid -1  cpu 20  group_fd -1  flags 0x8 = 26
  sys_perf_event_open: pid -1  cpu 21  group_fd -1  flags 0x8 = 27
  sys_perf_event_open: pid -1  cpu 22  group_fd -1  flags 0x8 = 28
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 29
  ------------------------------------------------------------

We have to create evlist-hybrid.c otherwise due to the symbol
dependency the perf test python would be failed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-14-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
5e4edd1f73 perf parse-events: Support event inside hybrid pmu
On hybrid platform, user may want to enable events on one pmu.

Following syntax are supported:

cpu_core/<event>/
cpu_atom/<event>/

But the syntax doesn't work for cache event.

Before:

  # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1
  event syntax error: 'cpu_core/LLC-loads/'
                                \___ unknown term 'LLC-loads' for pmu 'cpu_core'

Cache events are a bit complex. We can't create aliases for them.
We use another solution. For example, if we use "cpu_core/LLC-loads/",
in parse_events_add_pmu(), term->config is "LLC-loads".

Then we create a new parser to scan "LLC-loads". The
parse_events_add_cache() would be called during parsing.
The parse_state->hybrid_pmu_name is used to identify the pmu
where the event should be enabled on.

After:

  # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1

   Performance counter stats for 'system wide':

              24,593      cpu_core/LLC-loads/

         1.003911601 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-13-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
c93afadc92 perf parse-events: Compare with hybrid pmu name
On hybrid platform, user may want to enable event only on one pmu.
Following syntax will be supported:

cpu_core/<event>/
cpu_atom/<event>/

For hardware event, hardware cache event and raw event, two events
are created by default. We pass the specified pmu name in parse_state
and it would be checked before event creation. So next only the
event with the specified pmu would be created.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-12-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
94da591b1c perf parse-events: Create two hybrid raw events
On hybrid platform, same raw event is possible to be available
on both cpu_core pmu and cpu_atom pmu. It's supported to create
two raw events for one event encoding. For raw events, the
attr.type is PMU type.

  # perf stat -e r3c -a -vv -- sleep 1
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             120
    config                           0x3c
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             120
    config                           0x3c
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
  ------------------------------------------------------------
  perf_event_attr:
    type                             8
    size                             120
    config                           0x3c
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             8
    size                             120
    config                           0x3c
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
  r3c: 0: 434449 1001412521 1001412521
  r3c: 1: 173162 1001482031 1001482031
  r3c: 2: 231710 1001524974 1001524974
  r3c: 3: 110012 1001563523 1001563523
  r3c: 4: 191517 1001593221 1001593221
  r3c: 5: 956458 1001628147 1001628147
  r3c: 6: 416969 1001715626 1001715626
  r3c: 7: 1047527 1001596650 1001596650
  r3c: 8: 103877 1001633520 1001633520
  r3c: 9: 70571 1001637898 1001637898
  r3c: 10: 550284 1001714398 1001714398
  r3c: 11: 1257274 1001738349 1001738349
  r3c: 12: 107797 1001801432 1001801432
  r3c: 13: 67471 1001836281 1001836281
  r3c: 14: 286782 1001923161 1001923161
  r3c: 15: 815509 1001952550 1001952550
  r3c: 0: 95994 1002071117 1002071117
  r3c: 1: 105570 1002142438 1002142438
  r3c: 2: 115921 1002189147 1002189147
  r3c: 3: 72747 1002238133 1002238133
  r3c: 4: 103519 1002276753 1002276753
  r3c: 5: 121382 1002315131 1002315131
  r3c: 6: 80298 1002248050 1002248050
  r3c: 7: 466790 1002278221 1002278221
  r3c: 6821369 16026754282 16026754282
  r3c: 1162221 8017758990 8017758990

   Performance counter stats for 'system wide':

           6,821,369      cpu_core/r3c/
           1,162,221      cpu_atom/r3c/

         1.002289965 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-11-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
30def61f64 perf parse-events: Create two hybrid cache events
For cache events, they have pre-defined configs. The kernel needs
to know where the cache event comes from (e.g. from cpu_core pmu
or from cpu_atom pmu). But the perf type PERF_TYPE_HW_CACHE
can't carry pmu information.

Now the type PERF_TYPE_HW_CACHE is extended to be PMU aware type.
The PMU type ID is stored at attr.config[63:32].

When enabling a hybrid cache event without specified pmu, such as,
'perf stat -e LLC-loads -a', two events are created
automatically. One is for atom, the other is for core.

  # perf stat -e LLC-loads -a -vv -- sleep 1
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x400000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x400000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x800000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x800000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
  LLC-loads: 0: 1507 1001800280 1001800280
  LLC-loads: 1: 666 1001812250 1001812250
  LLC-loads: 2: 3353 1001813453 1001813453
  LLC-loads: 3: 514 1001848795 1001848795
  LLC-loads: 4: 627 1001952832 1001952832
  LLC-loads: 5: 4399 1001451154 1001451154
  LLC-loads: 6: 1240 1001481052 1001481052
  LLC-loads: 7: 478 1001520348 1001520348
  LLC-loads: 8: 691 1001551236 1001551236
  LLC-loads: 9: 310 1001578945 1001578945
  LLC-loads: 10: 1018 1001594354 1001594354
  LLC-loads: 11: 3656 1001622355 1001622355
  LLC-loads: 12: 882 1001661416 1001661416
  LLC-loads: 13: 506 1001693963 1001693963
  LLC-loads: 14: 3547 1001721013 1001721013
  LLC-loads: 15: 1399 1001734818 1001734818
  LLC-loads: 0: 1314 1001793826 1001793826
  LLC-loads: 1: 2857 1001752764 1001752764
  LLC-loads: 2: 646 1001830694 1001830694
  LLC-loads: 3: 1612 1001864861 1001864861
  LLC-loads: 4: 2244 1001912381 1001912381
  LLC-loads: 5: 1255 1001943889 1001943889
  LLC-loads: 6: 4624 1002021109 1002021109
  LLC-loads: 7: 2703 1001959302 1001959302
  LLC-loads: 24793 16026838264 16026838264
  LLC-loads: 17255 8015078826 8015078826

   Performance counter stats for 'system wide':

              24,793      cpu_core/LLC-loads/
              17,255      cpu_atom/LLC-loads/

         1.001970988 seconds time elapsed

0x4 in 0x400000002 indicates the cpu_core pmu.
0x8 in 0x800000002 indicates the cpu_atom pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-10-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
9cbfa2f64c perf parse-events: Create two hybrid hardware events
Current hardware events has special perf types PERF_TYPE_HARDWARE.
But it doesn't pass the PMU type in the user interface. For a hybrid
system, the perf kernel doesn't know which PMU the events belong to.

So now this type is extended to be PMU aware type. The PMU type ID
is stored at attr.config[63:32].

PMU type ID is retrieved from sysfs.

  root@lkp-adl-d01:/sys/devices/cpu_atom# cat type
  8

  root@lkp-adl-d01:/sys/devices/cpu_core# cat type
  4

When enabling a hybrid hardware event without specified pmu, such as,
'perf stat -e cycles -a', two events are created automatically. One
is for atom, the other is for core.

  # perf stat -e cycles -a -vv -- sleep 1
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
  cycles: 0: 836272 1001525722 1001525722
  cycles: 1: 628564 1001580453 1001580453
  cycles: 2: 872693 1001605997 1001605997
  cycles: 3: 70417 1001641369 1001641369
  cycles: 4: 88593 1001726722 1001726722
  cycles: 5: 470495 1001752993 1001752993
  cycles: 6: 484733 1001840440 1001840440
  cycles: 7: 1272477 1001593105 1001593105
  cycles: 8: 209185 1001608616 1001608616
  cycles: 9: 204391 1001633962 1001633962
  cycles: 10: 264121 1001661745 1001661745
  cycles: 11: 826104 1001689904 1001689904
  cycles: 12: 89935 1001728861 1001728861
  cycles: 13: 70639 1001756757 1001756757
  cycles: 14: 185266 1001784810 1001784810
  cycles: 15: 171094 1001825466 1001825466
  cycles: 0: 129624 1001854843 1001854843
  cycles: 1: 122533 1001840421 1001840421
  cycles: 2: 90055 1001882506 1001882506
  cycles: 3: 139607 1001896463 1001896463
  cycles: 4: 141791 1001907838 1001907838
  cycles: 5: 530927 1001883880 1001883880
  cycles: 6: 143246 1001852529 1001852529
  cycles: 7: 667769 1001872626 1001872626
  cycles: 6744979 16026956922 16026956922
  cycles: 1965552 8014991106 8014991106

   Performance counter stats for 'system wide':

           6,744,979      cpu_core/cycles/
           1,965,552      cpu_atom/cycles/

         1.001882711 seconds time elapsed

0x4 in 0x400000000 indicates the cpu_core pmu.
0x8 in 0x800000000 indicates the cpu_atom pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-9-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
12279429d8 perf stat: Uniquify hybrid event name
It would be useful to let user know the pmu which the event belongs to.
perf-stat has supported '--no-merge' option and it can print the pmu
name after the event name, such as:

"cycles [cpu_core]"

Now this option is enabled by default for hybrid platform but change
the format to:

"cpu_core/cycles/"

If user configs the name, we still use the user specified name.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
ink: https://lore.kernel.org/r/20210427070139.25256-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
c5a26ea490 perf pmu: Add hybrid helper functions
The functions perf_pmu__is_hybrid and perf_pmu__find_hybrid_pmu
can be used to identify the hybrid platform and return the found
hybrid cpu pmu. All the detected hybrid pmus have been saved in
'perf_pmu__hybrid_pmus' list. So we just need to search this list.

perf_pmu__hybrid_type_to_pmu converts the user specified string
to hybrid pmu name. This is used to support the '--cputype' option
in next patches.

perf_pmu__has_hybrid checks the existing of hybrid pmu. Note that,
we have to define it in pmu.c (make pmu-hybrid.c no more symbol
dependency), otherwise perf test python would be failed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-7-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
444624307c perf pmu: Save detected hybrid pmus to a global pmu list
We identify the cpu_core pmu and cpu_atom pmu by explicitly
checking following files:

For cpu_core, checks:
"/sys/bus/event_source/devices/cpu_core/cpus"

For cpu_atom, checks:
"/sys/bus/event_source/devices/cpu_atom/cpus"

If the 'cpus' file exists and it has data, the pmu exists.

But in order not to hardcode the "cpu_core" and "cpu_atom",
and make the code in a generic way.

So if the path "/sys/bus/event_source/devices/cpu_xxx/cpus" exists, the
hybrid pmu exists. All the detected hybrid pmus are linked to a global
list 'perf_pmu__hybrid_pmus' and then next we just need to iterate the
list to get all hybrid pmu by using perf_pmu__for_each_hybrid_pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
32705de7d4 perf pmu: Save pmu name
On hybrid platform, one event is available on one pmu
(such as, available on cpu_core or on cpu_atom).

This patch saves the pmu name to the pmu field of struct perf_pmu_alias.
Then next we can know the pmu which the event can be enabled on.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
eab35953e6 perf pmu: Simplify arguments of __perf_pmu__new_alias
Simplify the arguments of __perf_pmu__new_alias() by passing the whole
'struct pme_event' pointer.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
6b64833b9e perf jevents: Support unit value "cpu_core" and "cpu_atom"
For some Intel platforms, such as Alderlake, which is a hybrid platform
and it consists of atom cpu and core cpu. Each cpu has dedicated event
list. Part of events are available on core cpu, part of events are
available on atom cpu.

The kernel exports new cpu pmus: cpu_core and cpu_atom. The event in
json is added with a new field "Unit" to indicate which pmu the event
is available on.

For example, one event in cache.json,

    {
        "BriefDescription": "Counts the number of load ops retired that",
        "CollectPEBSRecord": "2",
        "Counter": "0,1,2,3",
        "EventCode": "0xd2",
        "EventName": "MEM_LOAD_UOPS_RETIRED_MISC.MMIO",
        "PEBScounters": "0,1,2,3",
        "SampleAfterValue": "1000003",
        "UMask": "0x80",
        "Unit": "cpu_atom"
    },

The unit "cpu_atom" indicates this event is only available on "cpu_atom".

In generated pmu-events.c, we can see:

{
        .name = "mem_load_uops_retired_misc.mmio",
        .event = "period=1000003,umask=0x80,event=0xd2",
        .desc = "Counts the number of load ops retired that. Unit: cpu_atom ",
        .topic = "cache",
        .pmu = "cpu_atom",
},

But if without this patch, the "uncore_" prefix is added before "cpu_atom",
such as:
        .pmu = "uncore_cpu_atom"

That would be a wrong pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao
4127361191 tools headers uapi: Update tools's copy of linux/perf_event.h
To get the changes in:

Liang Kan's patch

  55bcf6ef31 ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")

Kan's patch is in the tip/perf/core branch.

So the next perf tool patches need this interface for hybrid support.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Namhyung Kim
462f57dbf9 perf report: Print percentage of each event statistics
It's sometimes useful to see how many samples vs other events in the
data file with percent values.

  $ perf report --stat

  Aggregated stats:
             TOTAL events:      20064
              MMAP events:        239  ( 1.2%)
              COMM events:       1518  ( 7.6%)
              EXIT events:          1  ( 0.0%)
              FORK events:       1517  ( 7.6%)
            SAMPLE events:       4015  (20.0%)
             MMAP2 events:      12769  (63.6%)
    FINISHED_ROUND events:          2  ( 0.0%)
        THREAD_MAP events:          1  ( 0.0%)
           CPU_MAP events:          1  ( 0.0%)
         TIME_CONV events:          1  ( 0.0%)
  cycles stats:
            SAMPLE events:       2475
  instructions stats:
            SAMPLE events:       1540

Suggested-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Namhyung Kim
8f08cf3330 perf report: Make --skip-empty as default
so that the compact output is shown by default.  Also add 'report.skip-empty'
config option to override the default.  Users can also use --no-skip-empty
command line option to change the behavior anytime.

Committer testing:

  $ perf report --stat

  Aggregated stats:
             TOTAL events:         19
              COMM events:          2
              EXIT events:          1
            SAMPLE events:          8
             MMAP2 events:          4
    FINISHED_ROUND events:          1
        THREAD_MAP events:          1
           CPU_MAP events:          1
         TIME_CONV events:          1
  cycles:u stats:
            SAMPLE events:          8
  $ perf config report.skip-empty=false
  $ perf report --stat

  Aggregated stats:
             TOTAL events:         19
              MMAP events:          0
              LOST events:          0
              COMM events:          2
              EXIT events:          1
          THROTTLE events:          0
        UNTHROTTLE events:          0
              FORK events:          0
              READ events:          0
            SAMPLE events:          8
             MMAP2 events:          4
               AUX events:          0
      ITRACE_START events:          0
      LOST_SAMPLES events:          0
            SWITCH events:          0
   SWITCH_CPU_WIDE events:          0
        NAMESPACES events:          0
           KSYMBOL events:          0
         BPF_EVENT events:          0
            CGROUP events:          0
         TEXT_POKE events:          0
              ATTR events:          0
        EVENT_TYPE events:          0
      TRACING_DATA events:          0
          BUILD_ID events:          0
    FINISHED_ROUND events:          1
          ID_INDEX events:          0
     AUXTRACE_INFO events:          0
          AUXTRACE events:          0
    AUXTRACE_ERROR events:          0
        THREAD_MAP events:          1
           CPU_MAP events:          1
       STAT_CONFIG events:          0
              STAT events:          0
        STAT_ROUND events:          0
      EVENT_UPDATE events:          0
         TIME_CONV events:          1
           FEATURE events:          0
        COMPRESSED events:          0
  cycles:u stats:
            SAMPLE events:          8
  $ perf config report.skip-empty
  report.skip-empty=false
  $

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Namhyung Kim
2775de0b11 perf report: Add --skip-empty option to suppress 0 event stat
To make the output more readable, I think it's better to remove 0's in
the output.  Also the dummy event has no event stats so it just wasts
the space.  Let's use the --skip-empty option to suppress it.

  $ perf report --stat --skip-empty

  Aggregated stats:
             TOTAL events:      16530
              MMAP events:        226
              COMM events:       1596
              EXIT events:          2
          THROTTLE events:        121
        UNTHROTTLE events:        117
              FORK events:       1595
            SAMPLE events:        719
             MMAP2 events:      12147
            CGROUP events:          2
    FINISHED_ROUND events:          2
        THREAD_MAP events:          1
           CPU_MAP events:          1
         TIME_CONV events:          1
  cycles stats:
            SAMPLE events:        719

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Namhyung Kim
55f7544438 perf report: Show event sample counts in --stat output
To make the output identical with perf report -D, it needs to show
per-event sample counts along with the aggregated stat  at the end.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:58 -03:00
Namhyung Kim
0f0abbace3 perf hists: Split hists_stats from events_stats
Each struct hists have events_stats but most of the fields were not
used.  It's to count number of samples and periods whether filtered or
not.  And other fields are used only by evlist.

So it'd be better to split hists_stats and events_stats to reduce
wasted memory in the struct hists.  This makes the output of event
statistics in the perf report compact by skipping 0 events in each
evsel/hists.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:58 -03:00
Namhyung Kim
bf8f8587bf perf top: Use evlist->events_stat to count events
It's mainly to count lost events for the warning so it should be ok
to use the evlist->stats instead.  This is needed for changes in the
next commit.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:58 -03:00
Nicholas Fraser
d0713d4ca3 perf data: Add JSON export
This adds a feature to export perf data to JSON.

The resolved symbols are exported into the JSON so that external tools
don't need to load the dsos themselves (or even have access to them at
all.) This makes it easy to load and analyze perf data with standalone
tools where direct perf or libbabeltrace integration is impractical.

The exporter uses a minimal inline JSON encoding without any external
dependencies. Currently it only outputs some headers and sample metadata
but it's easily extensible.

Use it like this:

  $ perf data convert --to-json out.json

Committer notes:

Fixup a __printf() bug that broke the build:

  util/data-convert-json.c:103:11: error: expected ‘)’ before numeric constant
    103 | __(printf, 5, 6)
        |           ^~
        |           )
  util/data-convert-json.c: In function ‘output_sample_callchain_entry’:
  util/data-convert-json.c:124:2: error: implicit declaration of function ‘output_json_key_format’; did you mean ‘output_json_format’? [-Werror=implicit-function-declaration]
    124 |  output_json_key_format(out, false, 5, "ip", "\"0x%" PRIx64 "\"", ip);
        |  ^~~~~~~~~~~~~~~~~~~~~~
        |  output_json_format

Also had to add this patch to fix errors reported by various versions of
clang:

  -       if (al && al->sym && al->sym->name && strlen(al->sym->name) > 0) {
  +       if (al && al->sym && al->sym->namelen) {

al->sym->name is a zero sized array, to avoid one extra alloc in the
symbol__new() constructor, sym->namelen carries its strlen.

Committer testing:

  $ ls -la out.json
  ls: cannot access 'out.json': No such file or directory
  $ perf record sleep 0.1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
  $ perf report --stats | grep -w SAMPLE
            SAMPLE events:          8
  $ perf data convert --to-json out.json
  [ perf data convert: Converted 'perf.data' into JSON data 'out.json' ]
  [ perf data convert: Converted and wrote 0.002 MB (8 samples) ]
  $ ls -la out.json
  -rw-rw-r--. 1 acme acme 2017 Apr 26 17:29 out.json
  $ cat out.json
  {
  	"linux-perf-json-version": 1,
  	"headers": {
  		"header-version": 1,
  		"captured-on": "2021-04-26T20:28:57Z",
  		"data-offset": 432,
  		"data-size": 1016,
  		"feat-offset": 1448,
  		"hostname": "five",
  		"os-release": "5.11.14-200.fc33.x86_64",
  		"arch": "x86_64",
  		"cpu-desc": "AMD Ryzen 9 3900X 12-Core Processor",
  		"cpuid": "AuthenticAMD,23,113,0",
  		"nrcpus-online": 24,
  		"nrcpus-avail": 24,
  		"perf-version": "5.12.gee134f3189bd",
  		"cmdline": [
  			"/home/acme/bin/perf",
  			"record",
  			"sleep",
  			"0.1"
  		]
  	},
  	"samples": [
  		{
  			"timestamp": 170517539043684,
  			"pid": 375844,
  			"tid": 375844,
  			"comm": "sleep",
  			"callchain": [
  				{
  					"ip": "0xffffffffa6268827"
  				}
  			]
  		},
  		{
  			"timestamp": 170517539048443,
  			"pid": 375844,
  			"tid": 375844,
  			"comm": "sleep",
  			"callchain": [
  				{
  					"ip": "0xffffffffa661359d"
  				}
  			]
  		},
  		{
  			"timestamp": 170517539051018,
  			"pid": 375844,
  			"tid": 375844,
  			"comm": "sleep",
  			"callchain": [
  				{
  					"ip": "0xffffffffa6311e18"
  				}
  			]
  		},
  		{
  			"timestamp": 170517539053652,
  			"pid": 375844,
  			"tid": 375844,
  			"comm": "sleep",
  			"callchain": [
  				{
  					"ip": "0x7fdb77b4812b",
  					"symbol": "_dl_start",
  					"dso": "ld-2.32.so"
  				}
  			]
  		},
  		{
  			"timestamp": 170517539055306,
  			"pid": 375844,
  			"tid": 375844,
  			"comm": "sleep",
  			"callchain": [
  				{
  					"ip": "0xffffffffa6269286"
  				}
  			]
  		},
  		{
  			"timestamp": 170517539057590,
  			"pid": 375844,
  			"tid": 375844,
  			"comm": "sleep",
  			"callchain": [
  				{
  					"ip": "0xffffffffa62abd8b"
  				}
  			]
  		},
  		{
  			"timestamp": 170517539067559,
  			"pid": 375844,
  			"tid": 375844,
  			"comm": "sleep",
  			"callchain": [
  				{
  					"ip": "0x7fdb77b5e9e9",
  					"symbol": "__GI___tunables_init",
  					"dso": "ld-2.32.so"
  				}
  			]
  		},
  		{
  			"timestamp": 170517539282452,
  			"pid": 375844,
  			"tid": 375844,
  			"comm": "sleep",
  			"callchain": [
  				{
  					"ip": "0x7fdb779978d2",
  					"symbol": "getenv",
  					"dso": "libc-2.32.so"
  				}
  			]
  		}
  	]
  }
  $

Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Ulrich Czekalla <uczekalla@codeweavers.com>
Link: http://lore.kernel.org/lkml/3884969f-804d-2f53-c648-e2b0bd85edff@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:58 -03:00
Song Liu
5508c9dae2 perf stat: Introduce bpf_counter_ops->disable()
Introduce bpf_counter_ops->disable(), which is used stop counting the
event.

Committer notes:

Added a dummy bpf_counter__disable() to the python binding to avoid
having 'perf test python' failing.

bpf_counter isn't supported in the python binding.

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210425214333.1090950-6-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:58 -03:00
Song Liu
01bd8efcec perf stat: Introduce ':b' modifier
Introduce 'b' modifier to event parser, which means use BPF program to
manage this event. This is the same as --bpf-counters option, but only
applies to this event. For example,

  perf stat -e cycles:b,cs               # use bpf for cycles, but not cs
  perf stat -e cycles,cs --bpf-counters  # use bpf for both cycles and cs

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20210425214333.1090950-5-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:58 -03:00
Song Liu
112cb56164 perf stat: Introduce config stat.bpf-counter-events
Currently, to use BPF to aggregate perf event counters, the user uses
--bpf-counters option. Enable "use bpf by default" events with a config
option, stat.bpf-counter-events. Events with name in the option will use
BPF.

This also enables mixed BPF event and regular event in the same sesssion.
For example:

   perf config stat.bpf-counter-events=instructions
   perf stat -e instructions,cs

The second command will use BPF for "instructions" but not "cs".

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20210425214333.1090950-4-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:58 -03:00
Song Liu
fe3dd8263b perf bpf: check perf_attr_map is compatible with the perf binary
perf_attr_map could be shared among different version of perf binary. Add
bperf_attr_map_compatible() to check whether the existing attr_map is
compatible with current perf binary.

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210425214333.1090950-3-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:58 -03:00
Song Liu
ec8149fba6 perf util: Move bpf_perf definitions to a libperf header
By following the same protocol, other tools can share hardware PMCs with
perf. Move perf_event_attr_map_entry and BPF_PERF_DEFAULT_ATTR_MAP_PATH to
bpf_perf.h for other tools to use.

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210425214333.1090950-2-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:58 -03:00
Linus Torvalds
42dec9a936 Perf events changes in this cycle were:
- Improve Intel uncore PMU support:
 
      - Parse uncore 'discovery tables' - a new hardware capability enumeration method
        introduced on the latest Intel platforms. This table is in a well-defined PCI
        namespace location and is read via MMIO. It is organized in an rbtree.
 
        These uncore tables will allow the discovery of standard counter blocks, but
        fancier counters still need to be enumerated explicitly.
 
      - Add Alder Lake support
 
      - Improve IIO stacks to PMON mapping support on Skylake servers
 
  - Add Intel Alder Lake PMU support - which requires the introduction of 'hybrid' CPUs
    and PMUs. Alder Lake is a mix of Golden Cove ('big') and Gracemont ('small' - Atom derived)
    cores.
 
    The CPU-side feature set is entirely symmetrical - but on the PMU side there's
    core type dependent PMU functionality.
 
  - Reduce data loss with CPU level hardware tracing on Intel PT / AUX profiling, by
    fixing the AUX allocation watermark logic.
 
  - Improve ring buffer allocation on NUMA systems
 
  - Put 'struct perf_event' into their separate kmem_cache pool
 
  - Add support for synchronous signals for select perf events. The immediate motivation
    is to support low-overhead sampling-based race detection for user-space code. The
    feature consists of the following main changes:
 
     - Add thread-only event inheritance via perf_event_attr::inherit_thread, which limits
       inheritance of events to CLONE_THREAD.
 
     - Add the ability for events to not leak through exec(), via perf_event_attr::remove_on_exec.
 
     - Allow the generation of SIGTRAP via perf_event_attr::sigtrap, extend siginfo with an u64
       ::si_perf, and add the breakpoint information to ::si_addr and ::si_perf if the event is
       PERF_TYPE_BREAKPOINT.
 
    The siginfo support is adequate for breakpoints right now - but the new field can be used
    to introduce support for other types of metadata passed over siginfo as well.
 
  - Misc fixes, cleanups and smaller updates.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJGpERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j9zBAAuVbG2snV6SBSdXLhQcM66N3NckOXvSY5
 QjjhQcuwJQEK/NJB3266K5d8qSmdyRBsWf3GCsrmyBT67P1V28K44Pu7oCV0UDtf
 mpVRjEP0oR7hNsANSSgo8Fa4ZD7H5waX7dK7925Tvw8By3mMoZoddiD/84WJHhxO
 NDF+GRFaRj+/dpbhV8cdCoXTjYdkC36vYuZs3b9lu0tS9D/AJgsNy7TinLvO02Cs
 5peP+2y29dgvCXiGBiuJtEA6JyGnX3nUJCvfOZZ/DWDc3fdduARlRrc5Aiq4n/wY
 UdSkw1VTZBlZ1wMSdmHQVeC5RIH3uWUtRoNqy0Yc90lBm55AQ0EENwIfWDUDC5zy
 USdBqWTNWKMBxlEilUIyqKPQK8LW/31TRzqy8BWKPNcZt5yP5YS1SjAJRDDjSwL/
 I+OBw1vjLJamYh8oNiD5b+VLqNQba81jFASfv+HVWcULumnY6ImECCpkg289Fkpi
 BVR065boifJDlyENXFbvTxyMBXQsZfA+EhtxG7ju2Ni+TokBbogyCb3L2injPt9g
 7jjtTOqmfad4gX1WSc+215iYZMkgECcUd9E+BfOseEjBohqlo7yNKIfYnT8mE/Xq
 nb7eHjyvLiE8tRtZ+7SjsujOMHv9LhWFAbSaxU/kEVzpkp0zyd6mnnslDKaaHLhz
 goUMOL/D0lg=
 =NhQ7
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event updates from Ingo Molnar:

 - Improve Intel uncore PMU support:

     - Parse uncore 'discovery tables' - a new hardware capability
       enumeration method introduced on the latest Intel platforms. This
       table is in a well-defined PCI namespace location and is read via
       MMIO. It is organized in an rbtree.

       These uncore tables will allow the discovery of standard counter
       blocks, but fancier counters still need to be enumerated
       explicitly.

     - Add Alder Lake support

     - Improve IIO stacks to PMON mapping support on Skylake servers

 - Add Intel Alder Lake PMU support - which requires the introduction of
   'hybrid' CPUs and PMUs. Alder Lake is a mix of Golden Cove ('big')
   and Gracemont ('small' - Atom derived) cores.

   The CPU-side feature set is entirely symmetrical - but on the PMU
   side there's core type dependent PMU functionality.

 - Reduce data loss with CPU level hardware tracing on Intel PT / AUX
   profiling, by fixing the AUX allocation watermark logic.

 - Improve ring buffer allocation on NUMA systems

 - Put 'struct perf_event' into their separate kmem_cache pool

 - Add support for synchronous signals for select perf events. The
   immediate motivation is to support low-overhead sampling-based race
   detection for user-space code. The feature consists of the following
   main changes:

     - Add thread-only event inheritance via
       perf_event_attr::inherit_thread, which limits inheritance of
       events to CLONE_THREAD.

     - Add the ability for events to not leak through exec(), via
       perf_event_attr::remove_on_exec.

     - Allow the generation of SIGTRAP via perf_event_attr::sigtrap,
       extend siginfo with an u64 ::si_perf, and add the breakpoint
       information to ::si_addr and ::si_perf if the event is
       PERF_TYPE_BREAKPOINT.

   The siginfo support is adequate for breakpoints right now - but the
   new field can be used to introduce support for other types of
   metadata passed over siginfo as well.

 - Misc fixes, cleanups and smaller updates.

* tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  signal, perf: Add missing TRAP_PERF case in siginfo_layout()
  signal, perf: Fix siginfo_t by avoiding u64 on 32-bit architectures
  perf/x86: Allow for 8<num_fixed_counters<16
  perf/x86/rapl: Add support for Intel Alder Lake
  perf/x86/cstate: Add Alder Lake CPU support
  perf/x86/msr: Add Alder Lake CPU support
  perf/x86/intel/uncore: Add Alder Lake support
  perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE
  perf/x86/intel: Add Alder Lake Hybrid support
  perf/x86: Support filter_match callback
  perf/x86/intel: Add attr_update for Hybrid PMUs
  perf/x86: Add structures for the attributes of Hybrid PMUs
  perf/x86: Register hybrid PMUs
  perf/x86: Factor out x86_pmu_show_pmu_cap
  perf/x86: Remove temporary pmu assignment in event_init
  perf/x86/intel: Factor out intel_pmu_check_extra_regs
  perf/x86/intel: Factor out intel_pmu_check_event_constraints
  perf/x86/intel: Factor out intel_pmu_check_num_counters
  perf/x86: Hybrid PMU support for extra_regs
  perf/x86: Hybrid PMU support for event constraints
  ...
2021-04-28 13:03:44 -07:00
Linus Torvalds
03b2cd72aa Objtool updates in this cycle were:
- Standardize the crypto asm code so that it looks like compiler-generated
    code to objtool - so that it can understand it. This enables unwinding
    from crypto asm code - and also fixes the last known remaining objtool
    warnings for LTO and more.
 
  - x86 decoder fixes: clean up and fix the decoder, and also extend it a bit
 
  - Misc fixes and cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJEOQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jN0A//dZIR9GPW1cHkPD3Na+lxb+RWgCVnFUgw
 meLEOum389zWr7S8YmcFpKWLy94f3l24i/e7ufKn6/RMdaQuT6pZUa6teiNPqDKN
 Qq1v6EX/LT49Q1zh/zCCnKdlmF1t7wDyA1/+HBLnB4YfYEtteEt+p2Apyv4xIHOl
 xWqaTMFcVR/El9FXSyqRWRR4zuqY0Uatz0fmfo5jmi2xq460k53fQlTLA/0w5Jw0
 V3omyA3AYMUW6YlW5TGUINOhyDeAJm4PWl3siSUnSd6t8A/TVs5zpZX15BtseCle
 0FRp2SbxOoVkiyo3N3XmkfYYns9+4wK7cr9qja9U9MsSBZJZwaBm2LO/t2WFrAhq
 5dkOsoPmpIsjutsQnIhQgtVT9I/A4/u5m5Zi3trlXsBS0XAt/q+2GPfEngFmgb3q
 nae4rhGUsQ3NTGBiqNuMHQF4yeEvQZ8DCf3ytTz7DjBeiQ9nAtwzbUUGQjYl2mj1
 ZPOnl7Xmq/Nyw+AmdpffFPiEUJxqEg9HWjDo7DQATXb3Hw2VJ3cU8jwPRqDDlO10
 OB81vysXNGTmhOngHXexxncpmU9gDOIC1imZZpw5lNx4W9Qn20AlGaGAIbqzlfx0
 p5VuhkIWCySe1bOZx03xuk7Gq7GBIPPy/a2m204Ftipetlo1HBYwT3KB/wVpHmh7
 CSjWgdiW3+k=
 =poAZ
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:

 - Standardize the crypto asm code so that it looks like compiler-
   generated code to objtool - so that it can understand it. This
   enables unwinding from crypto asm code - and also fixes the last
   known remaining objtool warnings for LTO and more.

 - x86 decoder fixes: clean up and fix the decoder, and also extend it a
   bit

 - Misc fixes and cleanups

* tag 'objtool-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/crypto: Enable objtool in crypto code
  x86/crypto/sha512-ssse3: Standardize stack alignment prologue
  x86/crypto/sha512-avx2: Standardize stack alignment prologue
  x86/crypto/sha512-avx: Standardize stack alignment prologue
  x86/crypto/sha256-avx2: Standardize stack alignment prologue
  x86/crypto/sha1_avx2: Standardize stack alignment prologue
  x86/crypto/sha_ni: Standardize stack alignment prologue
  x86/crypto/crc32c-pcl-intel: Standardize jump table
  x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer
  x86/crypto/aesni-intel_avx: Standardize stack alignment prologue
  x86/crypto/aesni-intel_avx: Fix register usage comments
  x86/crypto/aesni-intel_avx: Remove unused macros
  objtool: Support asm jump tables
  objtool: Parse options from OBJTOOL_ARGS
  objtool: Collate parse_options() users
  objtool: Add --backup
  objtool,x86: More ModRM sugar
  objtool,x86: Rewrite ADD/SUB/AND
  objtool,x86: Support %riz encodings
  objtool,x86: Simplify register decode
  ...
2021-04-28 12:53:24 -07:00
Linus Torvalds
0ff0edb550 Locking changes for this cycle were:
- rtmutex cleanup & spring cleaning pass that removes ~400 lines of code
  - Futex simplifications & cleanups
  - Add debugging to the CSD code, to help track down a tenacious race (or hw problem)
  - Add lockdep_assert_not_held(), to allow code to require a lock to not be held,
    and propagate this into the ath10k driver
  - Misc LKMM documentation updates
  - Misc KCSAN updates: cleanups & documentation updates
  - Misc fixes and cleanups
  - Fix locktorture bugs with ww_mutexes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJDn0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hPrRAAryS4zPnuDsfkVk0smxo7a0lK5ljbH2Xo
 28QUZXOl6upnEV8dzbjwG7eAjt5ZJVI5tKIeG0PV0NUJH2nsyHwESdtULGGYuPf/
 4YUzNwZJa+nI/jeBnVsXCimLVxxnNCRdR7yOVOHm4ukEwa+YTNt1pvlYRmUd4YyH
 Q5cCrpb3THvLka3AAamEbqnHnAdGxHKuuHYVRkODpMQ+zrQvtN8antYsuk8kJsqM
 m+GZg/dVCuLEPah5k+lOACtcq/w7HCmTlxS8t4XLvD52jywFZLcCPvi1rk0+JR+k
 Vd9TngC09GJ4jXuDpr42YKkU9/X6qy2Es39iA/ozCvc1Alrhspx/59XmaVSuWQGo
 XYuEPx38Yuo/6w16haSgp0k4WSay15A4uhCTQ75VF4vli8Bqgg9PaxLyQH1uG8e2
 xk8U90R7bDzLlhKYIx1Vu5Z0t7A1JtB5CJtgpcfg/zQLlzygo75fHzdAiU5fDBDm
 3QQXSU2Oqzt7c5ZypioHWazARk7tL6th38KGN1gZDTm5zwifpaCtHi7sml6hhZ/4
 ATH6zEPzIbXJL2UqumSli6H4ye5ORNjOu32r7YPqLI4IDbzpssfoSwfKYlQG4Tvn
 4H1Ukirzni0gz5+wbleItzf2aeo1rocs4YQTnaT02j8NmUHUz4AzOHGOQFr5Tvh0
 wk/P4MIoSb0=
 =cOOk
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - rtmutex cleanup & spring cleaning pass that removes ~400 lines of
   code

 - Futex simplifications & cleanups

 - Add debugging to the CSD code, to help track down a tenacious race
   (or hw problem)

 - Add lockdep_assert_not_held(), to allow code to require a lock to not
   be held, and propagate this into the ath10k driver

 - Misc LKMM documentation updates

 - Misc KCSAN updates: cleanups & documentation updates

 - Misc fixes and cleanups

 - Fix locktorture bugs with ww_mutexes

* tag 'locking-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  kcsan: Fix printk format string
  static_call: Relax static_call_update() function argument type
  static_call: Fix unused variable warn w/o MODULE
  locking/rtmutex: Clean up signal handling in __rt_mutex_slowlock()
  locking/rtmutex: Restrict the trylock WARN_ON() to debug
  locking/rtmutex: Fix misleading comment in rt_mutex_postunlock()
  locking/rtmutex: Consolidate the fast/slowpath invocation
  locking/rtmutex: Make text section and inlining consistent
  locking/rtmutex: Move debug functions as inlines into common header
  locking/rtmutex: Decrapify __rt_mutex_init()
  locking/rtmutex: Remove pointless CONFIG_RT_MUTEXES=n stubs
  locking/rtmutex: Inline chainwalk depth check
  locking/rtmutex: Move rt_mutex_debug_task_free() to rtmutex.c
  locking/rtmutex: Remove empty and unused debug stubs
  locking/rtmutex: Consolidate rt_mutex_init()
  locking/rtmutex: Remove output from deadlock detector
  locking/rtmutex: Remove rtmutex deadlock tester leftovers
  locking/rtmutex: Remove rt_mutex_timed_lock()
  MAINTAINERS: Add myself as futex reviewer
  locking/mutex: Remove repeated declaration
  ...
2021-04-28 12:37:53 -07:00
Linus Torvalds
9a45da9270 RCU changes for this cycle were:
- Bitmap support for "N" as alias for last bit
  - kvfree_rcu updates
  - mm_dump_obj() updates.  (One of these is to mm, but was suggested by Andrew Morton.)
  - RCU callback offloading update
  - Polling RCU grace-period interfaces
  - Realtime-related RCU updates
  - Tasks-RCU updates
  - Torture-test updates
  - Torture-test scripting updates
  - Miscellaneous fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJCZERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hRjw/+Jkb9KvR9odPt/zqN/KPtIlburCUWgsFb
 2zAlWN4uMocPAiXT2Xq58/8gqMkpyn7ZVZtL1tD8fZSvlwEr0U8Z74+/NdoQvYE+
 kMXIYIuhIAGRyAupmzkriqN33iY+BSZPacX3u6ziPj57/0OZzbWVN/DAhbuvyLqG
 J/oL4PHCa7XAqXbf95rd5Zjs680QJ3CbTRh4nA8uHArzJmKZOaaHJ05Pxd1LpULe
 SJ+5p1GQnnwxd1HqmlHMDu/dW+2hE35BGykF8zi78je9OJXualDoM/6JpIYGhMNY
 5qlhU55QYP1jzjuNGVZZUS4L77eS2/W7SpPAaTmMEy/SsVB59G8Kf22oNDpVaEqQ
 m+2ErqwaHvlkMjqnsx+JQbsOP0yCi2NZBoEPFdfk1H23E2deVlSDbxPso4Zb1oUD
 E12769kN+SWDytuLSOAe1PY/KXqmNUKjPZl1GDCGXL7HlCnWyggUDschTsKJa19O
 XXl+yCTGMUH4XAPSqavAKQbBjurqpT6i4zfooSH4TBtOHm1ExgZOUS8gglZ1JuJd
 q+uJdZIgS8BcGkGw/k1bYDWY5TA4Rjv3sAOKQL1PgYBl1t/yLK441mE7LI9gWOwz
 Crz7vlSxD6Jc2cYQeUVW0KPGt5aVd63Gd9HjpXxGkqYQSDRqYMCebHEAGagz+jj7
 Nv/nOnf34Uc=
 =mpNt
 -----END PGP SIGNATURE-----

Merge tag 'core-rcu-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:

 - Support for "N" as alias for last bit in bitmap parsing library (eg
   using syntax like "nohz_full=2-N")

 - kvfree_rcu updates

 - mm_dump_obj() updates. (One of these is to mm, but was suggested by
   Andrew Morton.)

 - RCU callback offloading update

 - Polling RCU grace-period interfaces

 - Realtime-related RCU updates

 - Tasks-RCU updates

 - Torture-test updates

 - Torture-test scripting updates

 - Miscellaneous fixes

* tag 'core-rcu-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
  rcutorture: Test start_poll_synchronize_rcu() and poll_state_synchronize_rcu()
  rcu: Provide polling interfaces for Tiny RCU grace periods
  torture: Fix kvm.sh --datestamp regex check
  torture: Consolidate qemu-cmd duration editing into kvm-transform.sh
  torture: Print proper vmlinux path for kvm-again.sh runs
  torture: Make TORTURE_TRUST_MAKE available in kvm-again.sh environment
  torture: Make kvm-transform.sh update jitter commands
  torture: Add --duration argument to kvm-again.sh
  torture: Add kvm-again.sh to rerun a previous torture-test
  torture: Create a "batches" file for build reuse
  torture: De-capitalize TORTURE_SUITE
  torture: Make upper-case-only no-dot no-slash scenario names official
  torture: Rename SRCU-t and SRCU-u to avoid lowercase characters
  torture: Remove no-mpstat error message
  torture: Record kvm-test-1-run.sh and kvm-test-1-run-qemu.sh PIDs
  torture: Record jitter start/stop commands
  torture: Extract kvm-test-1-run-qemu.sh from kvm-test-1-run.sh
  torture: Record TORTURE_KCONFIG_GDB_ARG in qemu-cmd
  torture: Abstract jitter.sh start/stop into scripts
  rcu: Provide polling interfaces for Tree RCU grace periods
  ...
2021-04-28 12:00:13 -07:00
Linus Torvalds
3aa139aa9f media updates for v5.13-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmCH7/AACgkQCF8+vY7k
 4RVWVg//bcGubc1Sq68r0C+RMI1ETWMthtlkz9DTigDnht8uTmO+DohsE9R1mq4I
 szwwcwfMnMdbLB+zP7v1JNicT8GNMeFu7STurwj1w6YeQ5/IHcQc4P39+/8EAwqB
 hrUSIsY+RuYuv7dROwi45Yn8/mmWH4vdP3zrm3/k/EdlEBrf6C0e0KbGLqCr9Zrx
 pWVMB22RGfzh+1qQJhA43rDYceXs5/b5/Y1Dc/W97lJXyv87Hy0g333R+G1KzKLf
 3fhluSaLHC1j6Mm7Vneowy3mDjeyPZBiRajvNbbApQMzKIa78rJGOjMEBynHW1Rl
 Np7N0cGIq2a/yai7fCQk1SO9MdsLzE8rG+X/KxS3LSgb8c1RT31VaZhRfD9Pn4Gr
 asJ/JO7SX3YHXDO8F9WtsD/Fi9AREPz2o3I1760zk5KNQf9uvxeWkQPHQkKLlO/7
 6JWtmu5KbzUMqXZyekhcTDfqJE3xgUg36xLa7wDy8FlgDfE9MZUBaRwktHyfNgi0
 6tDdv1/w0iEvglUGVLMlQRh2ysx3lgBuAmI8Jg1vSt9DhZLsj8DY89/JE3B+CCBX
 uT9Zwc/Ug/HEpYMEhptP26z/TgVIkxs5iCvKripJ5GKl/tEIqsHSjqW5lzgI/6O+
 hKjxgZdeQ8+F9TJ5AMIBDU82trwDj77lIcHsRFRI6JLs/9L2PRo=
 =fvc5
 -----END PGP SIGNATURE-----

Merge tag 'media/v5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - addition of a maintainer's profile for the media subsystem

 - addition of i.MX8 IP support

 - qcom/camss gained support for hardware version Titan 170

 - new RC keymaps

 - Lots of other improvements, cleanups and bug fixes

* tag 'media/v5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (488 commits)
  media: coda: fix macroblocks count control usage
  media: rkisp1: params: fix wrong bits settings
  media: cedrus: Fix H265 status definitions
  media: meson-ge2d: fix rotation parameters
  media: v4l2-ctrls: fix reference to freed memory
  media: venus : hfi: add venus image info into smem
  media: venus: Fix internal buffer size calculations for v6.
  media: venus: helpers: keep max bandwidth when mbps exceeds the supported range
  media: venus: fix hw overload error log condition
  media: venus: core: correct firmware name for sm8250
  media: venus: core,pm: fix potential infinite loop
  media: venus: core: Fix kerneldoc warnings
  media: gscpa/stv06xx: fix memory leak
  media: cx25821: remove unused including <linux/version.h>
  media: staging: media/meson: remove redundant dev_err call
  media: adv7842: support 1 block EDIDs, fix clearing EDID
  media: adv7842: configure all pads
  media: allegro: change kernel-doc comment blocks to normal comments
  media: camss: ispif: Remove redundant dev_err call in msm_ispif_subdev_init()
  media: i2c: rdamc21: Fix warning on u8 cast
  ...
2021-04-28 09:24:36 -07:00
Linus Torvalds
1e9599dfc4 linux-kselftest-kunit-5.13-rc1
This KUnit update for Linux 5.13-rc1 consists of several fixes and
 new feature to support failure from dynamic analysis tools such as
 UBSAN and fake ops for testing.
 
 - a fake ops struct for testing a "free" function to complain if it
   was called with an invalid argument, or caught a double-free. Most
   return void and have no normal means of signalling failure
   (e.g. super_operations, iommu_ops, etc.).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmCIpAEACgkQCwJExA0N
 QxyPtRAAgA9/X6zSvT1LftnTkgl0Zn+4Krbr7eNyvxcVdJkYrmW31Pk8PaTK02Z/
 OVDzw0ESHaAN2SwDOmKFUyiob9qBZjf4lEHDkGWLnSkV3GHd9JrM+kceyX/0yr8E
 bpYvGZHBKe+SGtAr1H9DcUcfzCZlDUhUI2HOMwm/h29mFmmJOlrjvNUAGY4pnmti
 6Mqh4GAoFxqQDDUANUIWinWU56rWrcIPQgxPD6/r2VvrC8gHtQ+2dBR8+3i2qSQO
 Ydzgno/8OmXfl606NMKh5DPS6E8uRrLw67fBB+YoEGK6E9w5yyXKKAbV3M3ekB/Y
 3ze/SUi6IKrvdF6OEH4mX3rMNEKgRQoJLpQ8KyMvDMuIAdmq3xQQFDlSc2+gEnTj
 VnEREBhrIOh24GxZFTM6VvNmjNqdAq4/BMTt/LoSHEKwOASZi9udAnKjo67P/LB1
 1+rcoKdn/OA7p9/Zo/ETpTkFvEDyF2TscaEgYz2aRiV0YgcncPc1RSDAzbghdlFP
 y10Dk3uARXdzqrd0Hb1B3HL+cAPZEINerqqAUl0ggWejcJjfUDMi7sQuKEK8I5yU
 6sSXyVhCHDNmOUjuJjt5JdqeadLUDCkZqEnMMvwqKz00WKioQ2pqYy/UJ2rpWy+G
 +pymI17nRcOVfIIqnXitSCgch7cS0FZCjiOqZBBCCs9Axuv7mJU=
 =aP5e
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:
 "Several fixes and a new feature to support failure from dynamic
  analysis tools such as UBSAN and fake ops for testing.

   - a fake ops struct for testing a "free" function to complain if it
     was called with an invalid argument, or caught a double-free. Most
     return void and have no normal means of signalling failure (e.g.
     super_operations, iommu_ops, etc.)"

* tag 'linux-kselftest-kunit-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  Documentation: kunit: add tips for using current->kunit_test
  kunit: fix -Wunused-function warning for __kunit_fail_current_test
  kunit: support failure from dynamic analysis tools
  kunit: tool: make --kunitconfig accept dirs, add lib/kunit fragment
  kunit: make KUNIT_EXPECT_STREQ() quote values, don't print literals
  kunit: Match parenthesis alignment to improve code readability
2021-04-27 18:56:29 -07:00
Linus Torvalds
2a68c268a1 linux-kselftest-next-5.13-rc1
This Kselftest update for Linux 5.13-rc1 consists of:
 
 - fixes and updates to resctrl test from Fenghua Yu and Reinette Chatre
 - fixes to Kselftest documentation, framework
 - minor spelling correction in timers test
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmCIls8ACgkQCwJExA0N
 QxzrDg/9E2+KrNsqT/bVZIDZgPsLOPtkIaNd+94wsGKgHUSekoHRYKcmeRZLLA3x
 m6s4Jc8o84rztRgjjttWCOQD3ICsFC5Dp4eu2f8YowlPqPRn0MMJEUwQAPhxnFq0
 44KQ2v7bJhXYZRwhZXcv1Gu1o3o6cx59X9pLFo/Yf/OeTHj7ulegWtjCvBcS2uuT
 bhI0YbiCKDE4gIXYLPWKD96JjLRVo5zYnMIRqDJrgf7xSr+xoKmsZKSgkt6ca+My
 KSYtkaXDEB1DFNoovDQyhmAwImeqWgEKPMZIblLyfoUJNRyBQg9flRvguBzgR3TM
 J1lvavNZSC7qgx9xQI4DjsHtpn9y9C5/k9vXauhVtdMpMGY6zrz2zN5/xOohXjzN
 vlonhp6G/wkfxuo0Dcr++Oqlw5wWt55hxFJm84rIQ/2IYUfRBKWV5c2mUKRUJzrr
 pT3fcIpN1WTEBaxvC4/aL5oLvF/RSArSKs7StX1uzkedy7IwPsiCJa5OgT2iNSbH
 tpDS9KNOiLIkwpr8dBF9O9WRBo8ZtUoB1OPqQWuc0PMa0RDT4i/oTwe0ulu5rWma
 5G3yQIilTsRAxpFWihklBiV0pB9bT8O/d8kMlagj/znl8GmKyiXEMDwrCPfVmp16
 zNMskcarXWgL+BdhnSX5j53tLL6MEAWS1RzweR62U20eyPJF34Y=
 =0VC+
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-next-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest updates from Shuah Khan:

 - fixes and updates to resctrl test from Fenghua Yu and Reinette Chatre

 - fixes to Kselftest documentation, framework

 - minor spelling correction in timers test

* tag 'linux-kselftest-next-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (25 commits)
  selftests/resctrl: Change a few printed messages
  Documentation: kselftest: fix path to test module files
  selftests/resctrl: Create .gitignore to include resctrl_tests
  selftests/resctrl: Fix checking for < 0 for unsigned values
  selftests/resctrl: Fix incorrect parsing of iMC counters
  selftests/resctrl: Fix unmount resctrl FS
  selftests/resctrl: Skip the test if requested resctrl feature is not supported
  selftests/resctrl: Modularize resctrl test suite main() function
  selftests/resctrl: Don't hard code value of "no_of_bits" variable
  selftests/resctrl: Fix MBA/MBM results reporting format
  selftests/resctrl: Use resctrl/info for feature detection
  selftests/resctrl: Check for resctrl mount point only if resctrl FS is supported
  selftests/resctrl: Add config dependencies
  selftests/resctrl: Fix a printed message
  selftests/resctrl: Share show_cache_info() by CAT and CMT tests
  selftests/resctrl: Call kselftest APIs to log test results
  selftests/resctrl: Rename CQM test as CMT test
  selftests/resctrl: Fix missing options "-n" and "-p"
  selftests/resctrl: Ensure sibling CPU is not same as original CPU
  selftests/resctrl: Clean up resctrl features check
  ...
2021-04-27 18:54:01 -07:00
Linus Torvalds
c6536676c7 - turn the stack canary into a normal __percpu variable on 32-bit which
gets rid of the LAZY_GS stuff and a lot of code.
 
 - Add an insn_decode() API which all users of the instruction decoder
 should preferrably use. Its goal is to keep the details of the
 instruction decoder away from its users and simplify and streamline how
 one decodes insns in the kernel. Convert its users to it.
 
 - kprobes improvements and fixes
 
 - Set the maximum DIE per package variable on Hygon
 
 - Rip out the dynamic NOP selection and simplify all the machinery around
 selecting NOPs. Use the simplified NOPs in objtool now too.
 
 - Add Xeon Sapphire Rapids to list of CPUs that support PPIN
 
 - Simplify the retpolines by folding the entire thing into an
 alternative now that objtool can handle alternatives with stack
 ops. Then, have objtool rewrite the call to the retpoline with the
 alternative which then will get patched at boot time.
 
 - Document Intel uarch per models in intel-family.h
 
 - Make Sub-NUMA Clustering topology the default and Cluster-on-Die the
 exception on Intel.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmCHyJQACgkQEsHwGGHe
 VUpjiRAAwPZdwwp08ypZuMHR4EhLNru6gYhbAoALGgtYnQjLtn5onQhIeieK+R4L
 cmZpxHT9OFp5dXHk4kwygaQBsD4pPOiIpm60kye1dN3cSbOORRdkwEoQMpKMZ+5Y
 kvVsmn7lrwRbp600KdE4G6L5+N6gEgr0r6fMFWWGK3mgVAyCzPexVHgydcp131ch
 iYMo6/pPDcNkcV/hboVKgx7GISdQ7L356L1MAIW/Sxtw6uD/X4qGYW+kV2OQg9+t
 nQDaAo7a8Jqlop5W5TQUdMLKQZ1xK8SFOSX/nTS15DZIOBQOGgXR7Xjywn1chBH/
 PHLwM5s4XF6NT5VlIA8tXNZjWIZTiBdldr1kJAmdDYacrtZVs2LWSOC0ilXsd08Z
 EWtvcpHfHEqcuYJlcdALuXY8xDWqf6Q2F7BeadEBAxwnnBg+pAEoLXI/1UwWcmsj
 wpaZTCorhJpYo2pxXckVdHz2z0LldDCNOXOjjaWU8tyaOBKEK6MgAaYU7e0yyENv
 mVc9n5+WuvXuivC6EdZ94Pcr/KQsd09ezpJYcVfMDGv58YZrb6XIEELAJIBTu2/B
 Ua8QApgRgetx+1FKb8X6eGjPl0p40qjD381TADb4rgETPb1AgKaQflmrSTIik+7p
 O+Eo/4x/GdIi9jFk3K+j4mIznRbUX0cheTJgXoiI4zXML9Jv94w=
 =bm4S
 -----END PGP SIGNATURE-----

Merge tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 updates from Borislav Petkov:

 - Turn the stack canary into a normal __percpu variable on 32-bit which
   gets rid of the LAZY_GS stuff and a lot of code.

 - Add an insn_decode() API which all users of the instruction decoder
   should preferrably use. Its goal is to keep the details of the
   instruction decoder away from its users and simplify and streamline
   how one decodes insns in the kernel. Convert its users to it.

 - kprobes improvements and fixes

 - Set the maximum DIE per package variable on Hygon

 - Rip out the dynamic NOP selection and simplify all the machinery
   around selecting NOPs. Use the simplified NOPs in objtool now too.

 - Add Xeon Sapphire Rapids to list of CPUs that support PPIN

 - Simplify the retpolines by folding the entire thing into an
   alternative now that objtool can handle alternatives with stack ops.
   Then, have objtool rewrite the call to the retpoline with the
   alternative which then will get patched at boot time.

 - Document Intel uarch per models in intel-family.h

 - Make Sub-NUMA Clustering topology the default and Cluster-on-Die the
   exception on Intel.

* tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  x86, sched: Treat Intel SNC topology as default, COD as exception
  x86/cpu: Comment Skylake server stepping too
  x86/cpu: Resort and comment Intel models
  objtool/x86: Rewrite retpoline thunk calls
  objtool: Skip magical retpoline .altinstr_replacement
  objtool: Cache instruction relocs
  objtool: Keep track of retpoline call sites
  objtool: Add elf_create_undef_symbol()
  objtool: Extract elf_symbol_add()
  objtool: Extract elf_strtab_concat()
  objtool: Create reloc sections implicitly
  objtool: Add elf_create_reloc() helper
  objtool: Rework the elf_rebuild_reloc_section() logic
  objtool: Fix static_call list generation
  objtool: Handle per arch retpoline naming
  objtool: Correctly handle retpoline thunk calls
  x86/retpoline: Simplify retpolines
  x86/alternatives: Optimize optimize_nops()
  x86: Add insn_decode_kernel()
  x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration
  ...
2021-04-27 17:45:09 -07:00
Pedro Tammela
3733bfbbdd bpf, selftests: Update array map tests for per-cpu batched ops
Follows the same logic as the hashtable tests.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210424214510.806627-3-pctammela@mojatatu.com
2021-04-28 01:18:12 +02:00