Commit Graph

28372 Commits

Author SHA1 Message Date
036a05f50b bpf/selftests: Add check for updating XDP bpf_link with wrong program type
Add a check to the xdp_link selftest that the kernel rejects replacing an
XDP program with a different program type on link update.

v2:
- Split this out into its own patch.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20220107221115.326171-3-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-11 09:44:06 -08:00
4b27480dca bpf/selftests: convert xdp_link test to ASSERT_* macros
Convert the selftest to use the preferred ASSERT_* macros instead of the
deprecated CHECK().

v2:
- Don't add if statements around checks if they weren't there before.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20220107221115.326171-2-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-11 09:44:06 -08:00
1c824bf768 Merge tag 'lkmm.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull memory model documentation updates from Paul McKenney:
 "This series contains documentation and litmus tests for locking,
  courtesy of Boqun Feng"

* tag 'lkmm.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  tools/memory-model: litmus: Add two tests for unlock(A)+lock(B) ordering
  tools/memory-model: doc: Describe the requirement of the litmus-tests directory
  tools/memory-model: Provide extra ordering for unlock+lock pair on the same CPU
2022-01-11 09:38:03 -08:00
e7d38f16c2 Merge tag 'rcu.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:

 - Documentation updates, perhaps most notably Neil Brown's writeup of
   the reference-counting analogy to RCU.

 - Expedited grace-period cleanups.

 - Remove CONFIG_RCU_FAST_NO_HZ due to lack of valid users. I have asked
   around, posted a blog entry, and sent this series to LKML without
   result.

 - Miscellaneous fixes.

 - RCU callback offloading updates, perhaps most notably Frederic
   Weisbecker's updates allowing CPUs booted in the de-offloaded state
   to be offloaded at runtime.

 - nolibc fixes from Willy Tarreau and Anmar Faizi, but also including
   Mark Brown's addition of gettid().

 - RCU Tasks Trace fixes, including changes that increase the
   scalability of call_rcu_tasks_trace() for the BPF folks (Martin Lau
   and KP Singh).

 - Various fixes including those from Wander Lairson Costa and Li
   Zhijian.

 - Fixes plus addition of tests for the increased call_rcu_tasks_trace()
   scalability.

* tag 'rcu.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (87 commits)
  rcu/nocb: Merge rcu_spawn_cpu_nocb_kthread() and rcu_spawn_one_nocb_kthread()
  rcu/nocb: Allow empty "rcu_nocbs" kernel parameter
  rcu/nocb: Create kthreads on all CPUs if "rcu_nocbs=" or "nohz_full=" are passed
  rcu/nocb: Optimize kthreads and rdp initialization
  rcu/nocb: Prepare nocb_cb_wait() to start with a non-offloaded rdp
  rcu/nocb: Remove rcu_node structure from nocb list when de-offloaded
  rcu-tasks: Use fewer callbacks queues if callback flood ends
  rcu-tasks: Use separate ->percpu_dequeue_lim for callback dequeueing
  rcu-tasks: Use more callback queues if contention encountered
  rcu-tasks: Avoid raw-spinlocked wakeups from call_rcu_tasks_generic()
  rcu-tasks: Count trylocks to estimate call_rcu_tasks() contention
  rcu-tasks: Add rcupdate.rcu_task_enqueue_lim to set initial queueing
  rcu-tasks: Make rcu_barrier_tasks*() handle multiple callback queues
  rcu-tasks: Use workqueues for multiple rcu_tasks_invoke_cbs() invocations
  rcu-tasks: Abstract invocations of callbacks
  rcu-tasks: Abstract checking of callback lists
  rcu-tasks: Add a ->percpu_enqueue_lim to the rcu_tasks structure
  rcu-tasks: Inspect stalled task's trc state in locked state
  rcu-tasks: Use spin_lock_rcu_node() and friends
  rcutorture: Combine n_max_cbs from all kthreads in a callback flood
  ...
2022-01-11 09:29:44 -08:00
fe2437ccbd Merge tag 'thermal-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
 "These add a new driver for Renesas RZ/G2L TSU, update a few existing
  thermal control drivers and clean up the tmon utility.

  Specifics:

   - Add new TSU driver and DT bindings for the Renesas RZ/G2L platform
     (Biju Das).

   - Fix missing check when calling reset_control_deassert() in the
     rz2gl thermal driver (Biju Das).

   - In preparation for FORTIFY_SOURCE performing compile-time and
     run-time field bounds checking for memcpy(), avoid intentionally
     writing across neighboring fields in the int340x thermal control
     driver (Kees Cook).

   - Fix RFIM mailbox write commands handling in the int340x thermal
     control driver (Sumeet Pawnikar).

   - Fix PM issue occurring in the iMX thermal control driver during
     suspend/resume by implementing PM runtime support in it (Oleksij
     Rempel).

   - Add 'const' annotation to thermal_cooling_ops in the Intel
     powerclamp driver (Rikard Falkeborn).

   - Fix missing ADC bit set in the iMX8MP thermal driver to enable the
     sensor (Paul Gerber).

   - Drop unused local variable definition from tmon (ran jianping)"

* tag 'thermal-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal/drivers/int340x: Fix RFIM mailbox write commands
  thermal/drivers/rz2gl: Add error check for reset_control_deassert()
  thermal/drivers/imx8mm: Enable ADC when enabling monitor
  thermal/drivers: Add TSU driver for RZ/G2L
  dt-bindings: thermal: Document Renesas RZ/G2L TSU
  thermal/drivers/intel_powerclamp: Constify static thermal_cooling_device_ops
  thermal/drivers/imx: Implement runtime PM support
  thermal: tools: tmon: remove unneeded local variable
  thermal: int340x: Use struct_group() for memcpy() region
2022-01-10 20:43:54 -08:00
8efd0d9c31 Merge tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "Core
  ----

   - Defer freeing TCP skbs to the BH handler, whenever possible, or at
     least perform the freeing outside of the socket lock section to
     decrease cross-CPU allocator work and improve latency.

   - Add netdevice refcount tracking to locate sources of netdevice and
     net namespace refcount leaks.

   - Make Tx watchdog less intrusive - avoid pausing Tx and restarting
     all queues from a single CPU removing latency spikes.

   - Various small optimizations throughout the stack from Eric Dumazet.

   - Make netdev->dev_addr[] constant, force modifications to go via
     appropriate helpers to allow us to keep addresses in ordered data
     structures.

   - Replace unix_table_lock with per-hash locks, improving performance
     of bind() calls.

   - Extend skb drop tracepoint with a drop reason.

   - Allow SO_MARK and SO_PRIORITY setsockopt under CAP_NET_RAW.

  BPF
  ---

   - New helpers:
      - bpf_find_vma(), find and inspect VMAs for profiling use cases
      - bpf_loop(), runtime-bounded loop helper trading some execution
        time for much faster (if at all converging) verification
      - bpf_strncmp(), improve performance, avoid compiler flakiness
      - bpf_get_func_arg(), bpf_get_func_ret(), bpf_get_func_arg_cnt()
        for tracing programs, all inlined by the verifier

   - Support BPF relocations (CO-RE) in the kernel loader.

   - Further the support for BTF_TYPE_TAG annotations.

   - Allow access to local storage in sleepable helpers.

   - Convert verifier argument types to a composable form with different
     attributes which can be shared across types (ro, maybe-null).

   - Prepare libbpf for upcoming v1.0 release by cleaning up APIs,
     creating new, extensible ones where missing and deprecating those
     to be removed.

  Protocols
  ---------

   - WiFi (mac80211/cfg80211):
      - notify user space about long "come back in N" AP responses,
        allow it to react to such temporary rejections
      - allow non-standard VHT MCS 10/11 rates
      - use coarse time in airtime fairness code to save CPU cycles

   - Bluetooth:
      - rework of HCI command execution serialization to use a common
        queue and work struct, and improve handling errors reported in
        the middle of a batch of commands
      - rework HCI event handling to use skb_pull_data, avoiding packet
        parsing pitfalls
      - support AOSP Bluetooth Quality Report

   - SMC:
      - support net namespaces, following the RDMA model
      - improve connection establishment latency by pre-clearing buffers
      - introduce TCP ULP for automatic redirection to SMC

   - Multi-Path TCP:
      - support ioctls: SIOCINQ, OUTQ, and OUTQNSD
      - support socket options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT,
        IPV6_FREEBIND, and IPV6_TRANSPARENT, TCP_CORK and TCP_NODELAY
      - support cmsgs: TCP_INQ
      - improvements in the data scheduler (assigning data to subflows)
      - support fastclose option (quick shutdown of the full MPTCP
        connection, similar to TCP RST in regular TCP)

   - MCTP (Management Component Transport) over serial, as defined by
     DMTF spec DSP0253 - "MCTP Serial Transport Binding".

  Driver API
  ----------

   - Support timestamping on bond interfaces in active/passive mode.

   - Introduce generic phylink link mode validation for drivers which
     don't have any quirks and where MAC capability bits fully express
     what's supported. Allow PCS layer to participate in the validation.
     Convert a number of drivers.

   - Add support to set/get size of buffers on the Rx rings and size of
     the tx copybreak buffer via ethtool.

   - Support offloading TC actions as first-class citizens rather than
     only as attributes of filters, improve sharing and device resource
     utilization.

   - WiFi (mac80211/cfg80211):
      - support forwarding offload (ndo_fill_forward_path)
      - support for background radar detection hardware
      - SA Query Procedures offload on the AP side

  New hardware / drivers
  ----------------------

   - tsnep - FPGA based TSN endpoint Ethernet MAC used in PLCs with
     real-time requirements for isochronous communication with protocols
     like OPC UA Pub/Sub.

   - Qualcomm BAM-DMUX WWAN - driver for data channels of modems
     integrated into many older Qualcomm SoCs, e.g. MSM8916 or MSM8974
     (qcom_bam_dmux).

   - Microchip LAN966x multi-port Gigabit AVB/TSN Ethernet Switch driver
     with support for bridging, VLANs and multicast forwarding
     (lan966x).

   - iwlmei driver for co-operating between Intel's WiFi driver and
     Intel's Active Management Technology (AMT) devices.

   - mse102x - Vertexcom MSE102x Homeplug GreenPHY chips

   - Bluetooth:
      - MediaTek MT7921 SDIO devices
      - Foxconn MT7922A
      - Realtek RTL8852AE

  Drivers
  -------

   - Significantly improve performance in the datapaths of: lan78xx,
     ax88179_178a, lantiq_xrx200, bnxt.

   - Intel Ethernet NICs:
      - igb: support PTP/time PEROUT and EXTTS SDP functions on
        82580/i354/i350 adapters
      - ixgbevf: new PF -> VF mailbox API which avoids the risk of
        mailbox corruption with ESXi
      - iavf: support configuration of VLAN features of finer
        granularity, stacked tags and filtering
      - ice: PTP support for new E822 devices with sub-ns precision
      - ice: support firmware activation without reboot

   - Mellanox Ethernet NICs (mlx5):
      - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool
      - support TC forwarding when tunnel encap and decap happen between
        two ports of the same NIC
      - dynamically size and allow disabling various features to save
        resources for running in embedded / SmartNIC scenarios

   - Broadcom Ethernet NICs (bnxt):
      - use page frag allocator to improve Rx performance
      - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool

   - Other Ethernet NICs:
      - amd-xgbe: add Ryzen 6000 (Yellow Carp) Ethernet support

   - Microsoft cloud/virtual NIC (mana):
      - add XDP support (PASS, DROP, TX)

   - Mellanox Ethernet switches (mlxsw):
      - initial support for Spectrum-4 ASICs
      - VxLAN with IPv6 underlay

   - Marvell Ethernet switches (prestera):
      - support flower flow templates
      - add basic IP forwarding support

   - NXP embedded Ethernet switches (ocelot & felix):
      - support Per-Stream Filtering and Policing (PSFP)
      - enable cut-through forwarding between ports by default
      - support FDMA to improve packet Rx/Tx to CPU

   - Other embedded switches:
      - hellcreek: improve trapping management (STP and PTP) packets
      - qca8k: support link aggregation and port mirroring

   - Qualcomm 802.11ax WiFi (ath11k):
      - qca6390, wcn6855: enable 802.11 power save mode in station mode
      - BSS color change support
      - WCN6855 hw2.1 support
      - 11d scan offload support
      - scan MAC address randomization support
      - full monitor mode, only supported on QCN9074
      - qca6390/wcn6855: report signal and tx bitrate
      - qca6390: rfkill support
      - qca6390/wcn6855: regdb.bin support

   - Intel WiFi (iwlwifi):
      - support SAR GEO Offset Mapping (SGOM) and Time-Aware-SAR (TAS)
        in cooperation with the BIOS
      - support for Optimized Connectivity Experience (OCE) scan
      - support firmware API version 68
      - lots of preparatory work for the upcoming Bz device family

   - MediaTek WiFi (mt76):
      - Specific Absorption Rate (SAR) support
      - mt7921: 160 MHz channel support

   - RealTek WiFi (rtw88):
      - Specific Absorption Rate (SAR) support
      - scan offload

   - Other WiFi NICs
      - ath10k: support fetching (pre-)calibration data from nvmem
      - brcmfmac: configure keep-alive packet on suspend
      - wcn36xx: beacon filter support"

* tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2048 commits)
  tcp: tcp_send_challenge_ack delete useless param `skb`
  net/qla3xxx: Remove useless DMA-32 fallback configuration
  rocker: Remove useless DMA-32 fallback configuration
  hinic: Remove useless DMA-32 fallback configuration
  lan743x: Remove useless DMA-32 fallback configuration
  net: enetc: Remove useless DMA-32 fallback configuration
  cxgb4vf: Remove useless DMA-32 fallback configuration
  cxgb4: Remove useless DMA-32 fallback configuration
  cxgb3: Remove useless DMA-32 fallback configuration
  bnx2x: Remove useless DMA-32 fallback configuration
  et131x: Remove useless DMA-32 fallback configuration
  be2net: Remove useless DMA-32 fallback configuration
  vmxnet3: Remove useless DMA-32 fallback configuration
  bna: Simplify DMA setting
  net: alteon: Simplify DMA setting
  myri10ge: Simplify DMA setting
  qlcnic: Simplify DMA setting
  net: allwinner: Fix print format
  page_pool: remove spinlock in page_pool_refill_alloc_cache()
  amt: fix wrong return type of amt_send_membership_update()
  ...
2022-01-10 19:06:09 -08:00
bf4eebf8cf Merge tag 'linux-kselftest-kunit-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit updates from Shuah Khan:
 "This consists of several fixes and enhancements. A few highlights:

   - Option --kconfig_add option allows easily tweaking kunitconfigs

   - make build subcommand can reconfigure if needed

   - doesn't error on tests without test plans

   - doesn't crash if no parameters are generated

   - defaults --jobs to # of cups

   - reports test parameter results as (K)TAP subtests"

* tag 'linux-kselftest-kunit-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: Default --jobs to number of CPUs
  kunit: tool: fix newly introduced typechecker errors
  kunit: tool: make `build` subcommand also reconfigure if needed
  kunit: tool: delete kunit_parser.TestResult type
  kunit: tool: use dataclass instead of collections.namedtuple
  kunit: tool: suggest using decode_stacktrace.sh on kernel crash
  kunit: tool: reconfigure when the used kunitconfig changes
  kunit: tool: revamp message for invalid kunitconfig
  kunit: tool: add --kconfig_add to allow easily tweaking kunitconfigs
  kunit: tool: move Kconfig read_from_file/parse_from_string to package-level
  kunit: tool: print parsed test results fully incrementally
  kunit: Report test parameter results as (K)TAP subtests
  kunit: Don't crash if no parameters are generated
  kunit: tool: Report an error if any test has no subtests
  kunit: tool: Do not error on tests without test plans
  kunit: add run_checks.py script to validate kunit changes
  Documentation: kunit: remove claims that kunit is a mocking framework
  kunit: tool: fix --json output for skipped tests
2022-01-10 12:16:48 -08:00
4369b3cec2 Merge tag 'linux-kselftest-next-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest update from Shuah Khan:
 "Fixes to build errors, false negatives, and several code cleanups,
  including the ARRAY_SIZE cleanup that removes 25+ duplicates
  ARRAY_SIZE defines from individual tests"

* tag 'linux-kselftest-next-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/vm: remove ARRAY_SIZE define from individual tests
  selftests/timens: remove ARRAY_SIZE define from individual tests
  selftests/sparc64: remove ARRAY_SIZE define from adi-test
  selftests/seccomp: remove ARRAY_SIZE define from seccomp_benchmark
  selftests/rseq: remove ARRAY_SIZE define from individual tests
  selftests/net: remove ARRAY_SIZE define from individual tests
  selftests/landlock: remove ARRAY_SIZE define from common.h
  selftests/ir: remove ARRAY_SIZE define from ir_loopback.c
  selftests/core: remove ARRAY_SIZE define from close_range_test.c
  selftests/cgroup: remove ARRAY_SIZE define from cgroup_util.h
  selftests/arm64: remove ARRAY_SIZE define from vec-syscfg.c
  tools: fix ARRAY_SIZE defines in tools and selftests hdrs
  selftests: cgroup: build error multiple outpt files
  selftests/move_mount_set_group remove unneeded conversion to bool
  selftests/mount: remove unneeded conversion to bool
  selftests: harness: avoid false negatives if test has no ASSERTs
  selftests/ftrace: make kprobe profile testcase description unique
  selftests: clone3: clone3: add case CLONE3_ARGS_NO_TEST
  selftests: timers: Remove unneeded semicolon
  kselftests: timers:Remove unneeded semicolon
2022-01-10 12:08:12 -08:00
9d3a1e0a88 Merge tag 'seccomp-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook:
 "The core seccomp code hasn't changed for this cycle, but the selftests
  were improved while helping to debug the recent signal handling
  refactoring work Eric did.

  Summary:

   - Improve seccomp selftests in support of signal handler refactoring
     (Kees Cook)"

* tag 'seccomp-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  selftests/seccomp: Report event mismatches more clearly
  selftests/seccomp: Stop USER_NOTIF test if kcmp() fails
2022-01-10 11:50:57 -08:00
d5962fb7d6 perf annotate: Avoid TUI crash when navigating in the annotation of recursive functions
In 'perf report', entering a recursive function from inside of itself
(either directly of indirectly through some other function) results in
calling symbol__annotate2 multiple() times, and freeing the whole
disassembly when exiting from the innermost instance.

The first issue causes the function's disassembly to be duplicated, and
the latter a heap use-after-free (and crash) when trying to access the
disassembly again.

I reproduced the bug on perf 5.11.22 (Ubuntu 20.04.3 LTS) and 5.16.rc8
with the following testcase (compile with gcc recursive.c -o recursive).
To reproduce:

- perf record ./recursive
- perf report
- enter fibonacci and annotate it
- move the cursor on one of the "callq fibonacci" instructions and press enter
  - at this point there will be two copies of the function in the disassembly
- go back by pressing q, and perf will crash

  #include <stdio.h>

  int fibonacci(int n)
  {
      if(n <= 2) return 1;
      return fibonacci(n-1) + fibonacci(n-2);
  }

  int main()
  {
      printf("%d\n", fibonacci(40));
  }

This patch addresses the issue by annotating a function and freeing the
associated memory on exit only if no annotation is already present, so
that a recursive function is only annotated on entry.

Signed-off-by: Dario Petrillo <dario.pk1@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org
Link: http://lore.kernel.org/lkml/20220109234441.325106-1-dario.pk1@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10 15:47:30 -03:00
befee3775b perf powerpc: Update global/local variants for p_stage_cyc
Update the arch_support_sort_key() function in powerpc to enable
presenting local and global variants of sort key 'p_stage_cyc'.

Update the "se_header" strings for these in arch_perf_header_entry()
along with instruction latency.

Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20211203022038.48240-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10 15:39:00 -03:00
e3304c2135 perf sort: Include global and local variants for p_stage_cyc sort key
Sort key 'p_stage_cyc' is used to present the latency cycles spent in
pipeline stages.

perf has local 'p_stage_cyc' sort key to display this info. There is no
global variant available for this sort key. The local variant shows
latency in a single sample, whereas the global value will be useful to
present the total latency (sum of latencies) in the hist entry. It
represents the latency number multiplied by the number of samples.

Add global ('p_stage_cyc') and local variant ('local_p_stage_cyc') for
this sort key. Use 'local_p_stage_cyc' as default option for "mem" sort
mode.

Also add this to the list of dynamic sort keys and made the
"dynamic_headers" and "arch_specific_sort_keys" as static.

Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20211203022038.48240-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10 15:39:00 -03:00
debe70e488 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10 15:35:41 -03:00
bfed6efb8e Merge tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SGX updates from Borislav Petkov:

 - Add support for handling hw errors in SGX pages: poisoning,
   recovering from poison memory and error injection into SGX pages

 - A bunch of changes to the SGX selftests to simplify and allow of SGX
   features testing without the need of a whole SGX software stack

 - Add a sysfs attribute which is supposed to show the amount of SGX
   memory in a NUMA node, similar to what /proc/meminfo is to normal
   memory

 - The usual bunch of fixes and cleanups too

* tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/sgx: Fix NULL pointer dereference on non-SGX systems
  selftests/sgx: Fix corrupted cpuid macro invocation
  x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node
  x86/sgx: Fix minor documentation issues
  selftests/sgx: Add test for multiple TCS entry
  selftests/sgx: Enable multiple thread support
  selftests/sgx: Add page permission and exception test
  selftests/sgx: Rename test properties in preparation for more enclave tests
  selftests/sgx: Provide per-op parameter structs for the test enclave
  selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed
  selftests/sgx: Move setup_test_encl() to each TEST_F()
  selftests/sgx: Encpsulate the test enclave creation
  selftests/sgx: Dump segments and /proc/self/maps only on failure
  selftests/sgx: Create a heap for the test enclave
  selftests/sgx: Make data measurement for an enclave segment optional
  selftests/sgx: Assign source for each segment
  selftests/sgx: Fix a benign linker warning
  x86/sgx: Add check for SGX pages to ghes_do_memory_failure()
  x86/sgx: Add hook to error injection address validation
  x86/sgx: Hook arch_memory_failure() into mainline code
  ...
2022-01-10 09:44:09 -08:00
9b9e211360 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:

 - KCSAN enabled for arm64.

 - Additional kselftests to exercise the syscall ABI w.r.t. SVE/FPSIMD.

 - Some more SVE clean-ups and refactoring in preparation for SME
   support (scalable matrix extensions).

 - BTI clean-ups (SYM_FUNC macros etc.)

 - arm64 atomics clean-up and codegen improvements.

 - HWCAPs for FEAT_AFP (alternate floating point behaviour) and
   FEAT_RPRESS (increased precision of reciprocal estimate and
   reciprocal square root estimate).

 - Use SHA3 instructions to speed-up XOR.

 - arm64 unwind code refactoring/unification.

 - Avoid DC (data cache maintenance) instructions when DCZID_EL0.DZP ==
   1 (potentially set by a hypervisor; user-space already does this).

 - Perf updates for arm64: support for CI-700, HiSilicon PCIe PMU,
   Marvell CN10K LLC-TAD PMU, miscellaneous clean-ups.

 - Other fixes and clean-ups; highlights: fix the handling of erratum
   1418040, correct the calculation of the nomap region boundaries,
   introduce io_stop_wc() mapped to the new DGH instruction (data
   gathering hint).

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits)
  arm64: Use correct method to calculate nomap region boundaries
  arm64: Drop outdated links in comments
  arm64: perf: Don't register user access sysctl handler multiple times
  drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check
  perf/smmuv3: Fix unused variable warning when CONFIG_OF=n
  arm64: errata: Fix exec handling in erratum 1418040 workaround
  arm64: Unhash early pointer print plus improve comment
  asm-generic: introduce io_stop_wc() and add implementation for ARM64
  arm64: Ensure that the 'bti' macro is defined where linkage.h is included
  arm64: remove __dma_*_area() aliases
  docs/arm64: delete a space from tagged-address-abi
  arm64: Enable KCSAN
  kselftest/arm64: Add pidbench for floating point syscall cases
  arm64/fp: Add comments documenting the usage of state restore functions
  kselftest/arm64: Add a test program to exercise the syscall ABI
  kselftest/arm64: Allow signal tests to trigger from a function
  kselftest/arm64: Parameterise ptrace vector length information
  arm64/sve: Minor clarification of ABI documentation
  arm64/sve: Generalise vector length configuration prctl() for SME
  arm64/sve: Make sysctl interface for SVE reusable by SME
  ...
2022-01-10 08:49:37 -08:00
8aaaf2f3af Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in fixes directly in prep for the 5.17 merge window.
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-09 17:00:17 -08:00
77bbcb60f7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next. This
includes one patch to update ovs and act_ct to use nf_ct_put() instead
of nf_conntrack_put().

1) Add netns_tracker to nfnetlink_log and masquerade, from Eric Dumazet.

2) Remove redundant rcu read-size lock in nf_tables packet path.

3) Replace BUG() by WARN_ON_ONCE() in nft_payload.

4) Consolidate rule verdict tracing.

5) Replace WARN_ON() by WARN_ON_ONCE() in nf_tables core.

6) Make counter support built-in in nf_tables.

7) Add new field to conntrack object to identify locally generated
   traffic, from Florian Westphal.

8) Prevent NAT from shadowing well-known ports, from Florian Westphal.

9) Merge nf_flow_table_{ipv4,ipv6} into nf_flow_table_inet, also from
   Florian.

10) Remove redundant pointer in nft_pipapo AVX2 support, from Colin Ian King.

11) Replace opencoded max() in conntrack, from Jiapeng Chong.

12) Update conntrack to use refcount_t API, from Florian Westphal.

13) Move ip_ct_attach indirection into the nf_ct_hook structure.

14) Constify several pointer object in the netfilter codebase,
    from Florian Westphal.

15) Tree-wide replacement of nf_conntrack_put() by nf_ct_put(), also
    from Florian.

16) Fix egress splat due to incorrect rcu notation, from Florian.

17) Move stateful fields of connlimit, last, quota, numgen and limit
    out of the expression data area.

18) Build a blob to represent the ruleset in nf_tables, this is a
    requirement of the new register tracking infrastructure.

19) Add NFT_REG32_NUM to define the maximum number of 32-bit registers.

20) Add register tracking infrastructure to skip redundant
    store-to-register operations, this includes support for payload,
    meta and bitwise expresssions.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next: (32 commits)
  netfilter: nft_meta: cancel register tracking after meta update
  netfilter: nft_payload: cancel register tracking after payload update
  netfilter: nft_bitwise: track register operations
  netfilter: nft_meta: track register operations
  netfilter: nft_payload: track register operations
  netfilter: nf_tables: add register tracking infrastructure
  netfilter: nf_tables: add NFT_REG32_NUM
  netfilter: nf_tables: add rule blob layout
  netfilter: nft_limit: move stateful fields out of expression data
  netfilter: nft_limit: rename stateful structure
  netfilter: nft_numgen: move stateful fields out of expression data
  netfilter: nft_quota: move stateful fields out of expression data
  netfilter: nft_last: move stateful fields out of expression data
  netfilter: nft_connlimit: move stateful fields out of expression data
  netfilter: egress: avoid a lockdep splat
  net: prefer nf_ct_put instead of nf_conntrack_put
  netfilter: conntrack: avoid useless indirection during conntrack destruction
  netfilter: make function op structures const
  netfilter: core: move ip_ct_attach indirection to struct nf_ct_hook
  netfilter: conntrack: convert to refcount_t api
  ...
====================

Link: https://lore.kernel.org/r/20220109231640.104123-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-09 15:59:23 -08:00
9a12a5aa17 Merge tag 'perf-tools-fixes-for-v5.16-2022-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Revert "libtraceevent: Increase libtraceevent logging when verbose",
   breaks the build with libtraceevent-1.3.0, i.e. when building with
   'LIBTRACEEVENT_DYNAMIC=1'.

 - Avoid early exit in 'perf trace' due to running SIGCHLD handler
   before it makes sense to. It can happen when using a BPF source code
   event that have to be first built into an object file.

* tag 'perf-tools-fixes-for-v5.16-2022-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  Revert "libtraceevent: Increase libtraceevent logging when verbose"
  perf trace: Avoid early exit due to running SIGCHLD handler before it makes sense to
2022-01-09 10:37:07 -08:00
327b9a94e2 selftests: mptcp: more stable join tests-cases
MPTCP join self-tests are a bit fragile as they reply on
delays instead of events to catch-up with the expected
sockets states.

Replace the delay with state checking where possible and
reduce the number of sleeps in the most complex scenarios.

This will both reduce the tests run-time and will improve
stability.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07 19:00:43 -08:00
dc9f2dd5de Revert "libtraceevent: Increase libtraceevent logging when verbose"
This reverts commit 08efcb4a63.

This breaks the build as it will prefer using libbpf-devel header files,
even when not using LIBBPF_DYNAMIC=1, breaking the build.

This was detected on OpenSuSE Tumbleweed with libtraceevent-devel 1.3.0,
as described by Jiri Slaby:

=======================================================================
It breaks build with LIBTRACEEVENT_DYNAMIC and version 1.3.0:
> util/debug.c: In function ‘perf_debug_option’:
> util/debug.c:243:17: error: implicit declaration of function
‘tep_set_loglevel’ [-Werror=implicit-function-declaration]
>   243 |                 tep_set_loglevel(TEP_LOG_INFO);
>       |                 ^~~~~~~~~~~~~~~~
> util/debug.c:243:34: error: ‘TEP_LOG_INFO’ undeclared (first use in this
function); did you mean ‘TEP_PRINT_INFO’?
>   243 |                 tep_set_loglevel(TEP_LOG_INFO);
>       |                                  ^~~~~~~~~~~~
>       |                                  TEP_PRINT_INFO
> util/debug.c:243:34: note: each undeclared identifier is reported only once
for each function it appears in
> util/debug.c:245:34: error: ‘TEP_LOG_DEBUG’ undeclared (first use in this
function)
>   245 |                 tep_set_loglevel(TEP_LOG_DEBUG);
>       |                                  ^~~~~~~~~~~~~
> util/debug.c:247:34: error: ‘TEP_LOG_ALL’ undeclared (first use in this
function)
>   247 |                 tep_set_loglevel(TEP_LOG_ALL);
>       |                                  ^~~~~~~~~~~

It is because the gcc's command line looks like:
gcc
...
-I/home/abuild/rpmbuild/BUILD/tools/lib/
...
-DLIBTRACEEVENT_VERSION=65790
...
=======================================================================

The proper way to fix this is more involved and so not suitable for this
late in the 5.16-rc stage.

Reported-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/lkml/bc2b0786-8965-1bcd-2316-9d9bb37b9c31@kernel.org
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/lkml/YddGjjmlMZzxUZbN@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-07 16:08:50 -03:00
f06a82f9d3 perf trace: Avoid early exit due to running SIGCHLD handler before it makes sense to
When running 'perf trace' with an BPF object like:

  # perf trace -e openat,tools/perf/examples/bpf/hello.c

the event parsing eventually calls llvm__get_kbuild_opts() that runs a
script and that ends up with SIGCHLD delivered to the 'perf trace'
handler, which assumes the workload process is done and quits 'perf
trace'.

Move the SIGCHLD handler setup directly to trace__run(), where the event
is parsed and the object is already compiled.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Christy Lee <christyc.y.lee@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220106222030.227499-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-07 15:45:19 -03:00
96c1a62855 kvm: selftests: move ucall declarations into ucall_common.h
Now that core kvm_util declarations have special home in
kvm_util_base.h, move ucall-related declarations out into a separate
header.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20211210164620.11636-3-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 13:25:33 -05:00
7d9a662ed9 kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h
Between helper macros and interfaces that will be introduced in
subsequent patches, much of kvm_util.h would end up being declarations
specific to ucall. Ideally these could be separated out into a separate
header since they are not strictly required for writing guest tests and
are mostly self-contained interfaces other than a reliance on a few
core declarations like struct kvm_vm. This doesn't make a big
difference as far as how tests will be compiled/written since all these
interfaces will still be packaged up into a single/common libkvm.a used
by all tests, but it is still nice to be able to compartmentalize to
improve readabilty and reduce merge conflicts in the future for common
tasks like adding new interfaces to kvm_util.h.

Furthermore, some of the ucall declarations will be arch-specific,
requiring various #ifdef'ery in kvm_util.h. Ideally these declarations
could live in separate arch-specific headers, e.g.
include/<arch>/ucall.h, which would handle arch-specific declarations
as well as pulling in common ucall-related declarations shared by all
archs.

One simple way to do this would be to #include ucall.h at the bottom of
kvm_util.h, after declarations it relies upon like struct kvm_vm.
This is brittle however, and doesn't scale easily to other sets of
interfaces that may be added in the future.

Instead, move all declarations currently in kvm_util.h into
kvm_util_base.h, then have kvm_util.h #include it. With this change,
non-base declarations can be selectively moved/introduced into separate
headers, which can then be included in kvm_util.h so that individual
tests don't need to be touched. Subsequent patches will then move
ucall-related declarations into a separate header to meet the above
goals.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20211210164620.11636-2-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 13:25:33 -05:00
14243b3871 KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery
This adds basic support for delivering 2 level event channels to a guest.

Initially, it only supports delivery via the IRQ routing table, triggered
by an eventfd. In order to do so, it has a kvm_xen_set_evtchn_fast()
function which will use the pre-mapped shared_info page if it already
exists and is still valid, while the slow path through the irqfd_inject
workqueue will remap the shared_info page if necessary.

It sets the bits in the shared_info page but not the vcpu_info; that is
deferred to __kvm_xen_has_interrupt() which raises the vector to the
appropriate vCPU.

Add a 'verbose' mode to xen_shinfo_test while adding test cases for this.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20211210163625.2886-5-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:45 -05:00
a6fec53947 selftests: KVM: sev_migrate_tests: Add mirror command tests
Add tests to confirm mirror vms can only run correct subset of commands.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Signed-off-by: Peter Gonda <pgonda@google.com>
Message-Id: <20211208191642.3792819-4-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:38 -05:00
427d046a41 selftests: KVM: sev_migrate_tests: Fix sev_ioctl()
TEST_ASSERT in SEV ioctl was allowing errors because it checked return
value was good OR the FW error code was OK. This TEST_ASSERT should
require both (aka. AND) values are OK. Removes the LAUNCH_START from the
mirror VM because this call correctly fails because mirror VMs cannot
call this command. Currently issues with the PSP driver functions mean
the firmware error is not always reset to SEV_RET_SUCCESS when a call is
successful. Mainly sev_platform_init() doesn't correctly set the fw
error if the platform has already been initialized.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Signed-off-by: Peter Gonda <pgonda@google.com>
Message-Id: <20211208191642.3792819-3-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:38 -05:00
4c66b56781 selftests: KVM: sev_migrate_tests: Fix test_sev_mirror()
Mirrors should not be able to call LAUNCH_START. Remove the call on the
mirror to correct the test before fixing sev_ioctl() to correctly assert
on this failed ioctl.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Signed-off-by: Peter Gonda <pgonda@google.com>
Message-Id: <20211208191642.3792819-2-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:38 -05:00
1b0c9d00aa Merge tag 'kvm-riscv-5.17-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 5.17, take #1

- Use common KVM implementation of MMU memory caches
- SBI v0.2 support for Guest
- Initial KVM selftests support
- Fix to avoid spurious virtual interrupts after clearing hideleg CSR
- Update email address for Anup and Atish
2022-01-07 10:43:02 -05:00
7fd55a02a4 Merge tag 'kvmarm-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for Linux 5.16

- Simplification of the 'vcpu first run' by integrating it into
  KVM's 'pid change' flow

- Refactoring of the FP and SVE state tracking, also leading to
  a simpler state and less shared data between EL1 and EL2 in
  the nVHE case

- Tidy up the header file usage for the nvhe hyp object

- New HYP unsharing mechanism, finally allowing pages to be
  unmapped from the Stage-1 EL2 page-tables

- Various pKVM cleanups around refcounting and sharing

- A couple of vgic fixes for bugs that would trigger once
  the vcpu xarray rework is merged, but not sooner

- Add minimal support for ARMv8.7's PMU extension

- Rework kvm_pgtable initialisation ahead of the NV work

- New selftest for IRQ injection

- Teach selftests about the lack of default IPA space and
  page sizes

- Expand sysreg selftest to deal with Pointer Authentication

- The usual bunch of cleanups and doc update
2022-01-07 10:42:19 -05:00
46e967d187 selftests: mptcp: add tests for subflow creation failure
Verify that, when multiple endpoints are available, subflows
creation proceed even when the first additional subflow creation
fails - due to packet drop on the relevant link

Co-developed-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07 11:27:07 +00:00
86e39e0448 mptcp: keep track of local endpoint still available for each msk
Include into the path manager status a bitmap tracking the list
of local endpoints still available - not yet used - for the
relevant mptcp socket.

Keep such map updated at endpoint creation/deletion time, so
that we can easily skip already used endpoint at local address
selection time.

The endpoint used by the initial subflow is lazyly accounted at
subflow creation time: the usage bitmap is be up2date before
endpoint selection and we avoid such unneeded task in some relevant
scenarios - e.g. busy servers accepting incoming subflows but
not creating any additional ones nor annuncing additional addresses.

Overall this allows for fair local endpoints usage in case of
subflow failure.

As a side effect, this patch also enforces that each endpoint
is used at most once for each mptcp connection.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07 11:27:07 +00:00
05be5e273c selftests: mptcp: add disconnect tests
Performs several disconnect/reconnect on the same socket,
ensuring the overall transfer is succesful.

The new test leverages ioctl(SIOCOUTQ) to ensure all the
pending data is acked before disconnecting.

Additionally order alphabetically the test program arguments list
for better maintainability.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07 11:27:07 +00:00
29507144c9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Refcount leak in ipt_CLUSTERIP rule loading path, from Xin Xiong.

2) Use socat in netfilter selftests, from Hangbin Liu.

3) Skip layer checksum 4 update for IP fragments.

4) Missing allocation of pcpu scratch maps on clone in
   nft_set_pipapo, from Florian Westphal.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
  netfilter: nft_set_pipapo: allocate pcpu scratch maps on clone
  netfilter: nft_payload: do not update layer 4 checksum when mangling fragments
  selftests: netfilter: switch to socat for tests using -q option
  netfilter: ipt_CLUSTERIP: fix refcount leak in clusterip_tg_check()
====================

Link: https://lore.kernel.org/r/20220106215139.170824-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-06 18:37:45 -08:00
257367c0c9 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2022-01-06

We've added 41 non-merge commits during the last 2 day(s) which contain
a total of 36 files changed, 1214 insertions(+), 368 deletions(-).

The main changes are:

1) Various fixes in the verifier, from Kris and Daniel.

2) Fixes in sockmap, from John.

3) bpf_getsockopt fix, from Kuniyuki.

4) INET_POST_BIND fix, from Menglong.

5) arm64 JIT fix for bpf pseudo funcs, from Hou.

6) BPF ISA doc improvements, from Christoph.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits)
  bpf: selftests: Add bind retry for post_bind{4, 6}
  bpf: selftests: Use C99 initializers in test_sock.c
  net: bpf: Handle return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND()
  bpf/selftests: Test bpf_d_path on rdonly_mem.
  libbpf: Add documentation for bpf_map batch operations
  selftests/bpf: Don't rely on preserving volatile in PT_REGS macros in loop3
  xdp: Add xdp_do_redirect_frame() for pre-computed xdp_frames
  xdp: Move conversion to xdp_frame out of map functions
  page_pool: Store the XDP mem id
  page_pool: Add callback to init pages when they are allocated
  xdp: Allow registering memory model without rxq reference
  samples/bpf: xdpsock: Add timestamp for Tx-only operation
  samples/bpf: xdpsock: Add time-out for cleaning Tx
  samples/bpf: xdpsock: Add sched policy and priority support
  samples/bpf: xdpsock: Add cyclic TX operation capability
  samples/bpf: xdpsock: Add clockid selection support
  samples/bpf: xdpsock: Add Dest and Src MAC setting for Tx-only operation
  samples/bpf: xdpsock: Add VLAN support for Tx-only operation
  libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API
  libbpf 1.0: Deprecate bpf_map__is_offload_neutral()
  ...
====================

Link: https://lore.kernel.org/r/20220107013626.53943-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-06 18:07:26 -08:00
f734248174 bpf: selftests: Add bind retry for post_bind{4, 6}
With previous patch, kernel is able to 'put_port' after sys_bind()
fails. Add the test for that case: rebind another port after
sys_bind() fails. If the bind success, it means previous bind
operation is already undoed.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220106132022.3470772-4-imagedong@tencent.com
2022-01-06 17:08:35 -08:00
6fd92c7f0c bpf: selftests: Use C99 initializers in test_sock.c
Use C99 initializers for the initialization of 'tests' in test_sock.c.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220106132022.3470772-3-imagedong@tencent.com
2022-01-06 17:08:35 -08:00
44bab87d8c bpf/selftests: Test bpf_d_path on rdonly_mem.
The second parameter of bpf_d_path() can only accept writable
memories. Rdonly_mem obtained from bpf_per_cpu_ptr() can not
be passed into bpf_d_path for modification. This patch adds
a selftest to verify this behavior.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220106205525.2116218-1-haoluo@google.com
2022-01-06 15:20:49 -08:00
e59618f0f4 libbpf: Add documentation for bpf_map batch operations
This adds documention for:

- bpf_map_delete_batch()
- bpf_map_lookup_batch()
- bpf_map_lookup_and_delete_batch()
- bpf_map_update_batch()

This also updates the public API for the `keys` parameter
of `bpf_map_delete_batch()`, and both the
`keys` and `values` parameters of `bpf_map_update_batch()`
to be constants.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220106201304.112675-1-grantseltzer@gmail.com
2022-01-06 15:12:42 -08:00
70bc793382 selftests/bpf: Don't rely on preserving volatile in PT_REGS macros in loop3
PT_REGS*() macro on some architectures force-cast struct pt_regs to
other types (user_pt_regs, etc) and might drop volatile modifiers, if any.
Volatile isn't really required as pt_regs value isn't supposed to change
during the BPF program run, so this is correct behavior.

But progs/loop3.c relies on that volatile modifier to ensure that loop
is preserved. Fix loop3.c by declaring i and sum variables as volatile
instead. It preserves the loop and makes the test pass on all
architectures (including s390x which is currently broken).

Fixes: 3cc31d7940 ("libbpf: Normalize PT_REGS_xxx() macro definitions")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220106205156.955373-1-andrii@kernel.org
2022-01-06 22:25:53 +01:00
bf35a7879f selftests: cgroup: Test open-time cgroup namespace usage for migration checks
When a task is writing to an fd opened by a different task, the perm check
should use the cgroup namespace of the latter task. Add a test for it.

Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
613e040e4d selftests: cgroup: Test open-time credential usage for migration checks
When a task is writing to an fd opened by a different task, the perm check
should use the credentials of the latter task. Add a test for it.

Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
b09c2baa56 selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644
0644 is an odd perm to create a cgroup which is a directory. Use the regular
0755 instead. This is necessary for euid switching test case.

Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
eac1b93c14 gro: add ability to control gro max packet size
Eric Dumazet suggested to allow users to modify max GRO packet size.

We have seen GRO being disabled by users of appliances (such as
wifi access points) because of claimed bufferbloat issues,
or some work arounds in sch_cake, to split GRO/GSO packets.

Instead of disabling GRO completely, one can chose to limit
the maximum packet size of GRO packets, depending on their
latency constraints.

This patch adds a per device gro_max_size attribute
that can be changed with ip link command.

ip link set dev eth0 gro_max_size 16000

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Coco Li <lixiaoyan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06 12:27:05 +00:00
3e06cdf105 KVM: selftests: Add initial support for RISC-V 64-bit
We add initial support for RISC-V 64-bit in KVM selftests using
which we can cross-compile and run arch independent tests such as:
demand_paging_test
dirty_log_test
kvm_create_max_vcpus,
kvm_page_table_test
set_memory_region_test
kvm_binary_stats_test

All VM guest modes defined in kvm_util.h require at least 48-bit
guest virtual address so to use KVM RISC-V selftests hardware
need to support at least Sv48 MMU for guest (i.e. VS-mode).

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-and-tested-by: Atish Patra <atishp@rivosinc.com>
2022-01-06 15:17:50 +05:30
788490e798 KVM: selftests: Add EXTRA_CFLAGS in top-level Makefile
We add EXTRA_CFLAGS to the common CFLAGS of top-level Makefile which will
allow users to pass additional compile-time flags such as "-static".

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-and-tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-and-tested-by: Sean Christopherson <seanjc@google.com>
2022-01-06 15:17:46 +05:30
1585f590a2 selftests: netfilter: switch to socat for tests using -q option
The nc cmd(nmap-ncat) that distributed with Fedora/Red Hat does not have
option -q. This make some tests failed with:

	nc: invalid option -- 'q'

Let's switch to socat which is far more dependable.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-06 10:43:18 +01:00
87eee9c558 testptp: set pin function before other requests
When the -L option of the testptp utility is specified with other
options (e.g. -p to enable PPS output), the user probably wants to
apply it to the pin configured by the -L option.

Reorder the code to set the pin function before other function requests
to avoid confusing users.

Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20220105152506.3256026-1-mlichvar@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05 17:08:58 -08:00
5f60826428 libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API
API created with simplistic assumptions about BPF map definitions.
It hasn’t worked for a while, deprecate it in preparation for
libbpf 1.0.

  [0] Closes: https://github.com/libbpf/libbpf/issues/302

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220105003120.2222673-1-christylee@fb.com
2022-01-05 16:11:32 -08:00
9855c131b9 libbpf 1.0: Deprecate bpf_map__is_offload_neutral()
Deprecate bpf_map__is_offload_neutral(). It’s most probably broken
already. PERF_EVENT_ARRAY isn’t the only map that’s not suitable
for hardware offloading. Applications can directly check map type
instead.

  [0] Closes: https://github.com/libbpf/libbpf/issues/306

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220105000601.2090044-1-christylee@fb.com
2022-01-05 16:09:06 -08:00
51a33c60f1 libbpf: Support repeated legacy kprobes on same function
If repeated legacy kprobes on same function in one process,
libbpf will register using the same probe name and got -EBUSY
error. So append index to the probe name format to fix this
problem.

Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Qiang Wang <wangqiang.wq.frank@bytedance.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211227130713.66933-2-wangqiang.wq.frank@bytedance.com
2022-01-05 15:38:21 -08:00