8503 Commits

Author SHA1 Message Date
Linus Torvalds
7e062cda7d Networking changes for 5.19.
Core
 ----
 
  - Support TCPv6 segmentation offload with super-segments larger than
    64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP).
 
  - Generalize skb freeing deferral to per-cpu lists, instead of
    per-socket lists.
 
  - Add a netdev statistic for packets dropped due to L2 address
    mismatch (rx_otherhost_dropped).
 
  - Continue work annotating skb drop reasons.
 
  - Accept alternative netdev names (ALT_IFNAME) in more netlink
    requests.
 
  - Add VLAN support for AF_PACKET SOCK_RAW GSO.
 
  - Allow receiving skb mark from the socket as a cmsg.
 
  - Enable memcg accounting for veth queues, sysctl tables and IPv6.
 
 BPF
 ---
 
  - Add libbpf support for User Statically-Defined Tracing (USDTs).
 
  - Speed up symbol resolution for kprobes multi-link attachments.
 
  - Support storing typed pointers to referenced and unreferenced
    objects in BPF maps.
 
  - Add support for BPF link iterator.
 
  - Introduce access to remote CPU map elements in BPF per-cpu map.
 
  - Allow middle-of-the-road settings for the
    kernel.unprivileged_bpf_disabled sysctl.
 
  - Implement basic types of dynamic pointers e.g. to allow for
    dynamically sized ringbuf reservations without extra memory copies.
 
 Protocols
 ---------
 
  - Retire port only listening_hash table, add a second bind table
    hashed by port and address. Avoid linear list walk when binding
    to very popular ports (e.g. 443).
 
  - Add bridge FDB bulk flush filtering support allowing user space
    to remove all FDB entries matching a condition.
 
  - Introduce accept_unsolicited_na sysctl for IPv6 to implement
    router-side changes for RFC9131.
 
  - Support for MPTCP path manager in user space.
 
  - Add MPTCP support for fallback to regular TCP for connections
    that have never connected additional subflows or transmitted
    out-of-sequence data (partial support for RFC8684 fallback).
 
  - Avoid races in MPTCP-level window tracking, stabilize and improve
    throughput.
 
  - Support lockless operation of GRE tunnels with seq numbers enabled.
 
  - WiFi support for host based BSS color collision detection.
 
  - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets.
 
  - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2).
 
  - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile).
 
  - Allow matching on the number of VLAN tags via tc-flower.
 
  - Add tracepoint for tcp_set_ca_state().
 
 Driver API
 ----------
 
  - Improve error reporting from classifier and action offload.
 
  - Add support for listing line cards in switches (devlink).
 
  - Add helpers for reporting page pool statistics with ethtool -S.
 
  - Add support for reading clock cycles when using PTP virtual clocks,
    instead of having the driver convert to time before reporting.
    This makes it possible to report time from different vclocks.
 
  - Support configuring low-latency Tx descriptor push via ethtool.
 
  - Separate Clause 22 and Clause 45 MDIO accesses more explicitly.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Marvell's Octeon NIC PCI Endpoint support (octeon_ep)
    - Sunplus SP7021 SoC (sp7021_emac)
    - Add support for Renesas RZ/V2M (in ravb)
    - Add support for MediaTek mt7986 switches (in mtk_eth_soc)
 
  - Ethernet PHYs:
    - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting)
    - TI DP83TD510 PHY
    - Microchip LAN8742/LAN88xx PHYs
 
  - WiFi:
    - Driver for pureLiFi X, XL, XC devices (plfxlc)
    - Driver for Silicon Labs devices (wfx)
    - Support for WCN6750 (in ath11k)
    - Support Realtek 8852ce devices (in rtw89)
 
  - Mobile:
    - MediaTek T700 modems (Intel 5G 5000 M.2 cards)
 
  - CAN:
   - ctucanfd: add support for CTU CAN FD open-source IP core
     from Czech Technical University in Prague
 
 Drivers
 -------
 
  - Delete a number of old drivers still using virt_to_bus().
 
  - Ethernet NICs:
    - intel: support TSO on tunnels MPLS
    - broadcom: support multi-buffer XDP
    - nfp: support VF rate limiting
    - sfc: use hardware tx timestamps for more than PTP
    - mlx5: multi-port eswitch support
    - hyper-v: add support for XDP_REDIRECT
    - atlantic: XDP support (including multi-buffer)
    - macb: improve real-time perf by deferring Tx processing to NAPI
 
  - High-speed Ethernet switches:
    - mlxsw: implement basic line card information querying
    - prestera: add support for traffic policing on ingress and egress
 
  - Embedded Ethernet switches:
    - lan966x: add support for packet DMA (FDMA)
    - lan966x: add support for PTP programmable pins
    - ti: cpsw_new: enable bc/mc storm prevention
 
  - Qualcomm 802.11ax WiFi (ath11k):
    - Wake-on-WLAN support for QCA6390 and WCN6855
    - device recovery (firmware restart) support
    - support setting Specific Absorption Rate (SAR) for WCN6855
    - read country code from SMBIOS for WCN6855/QCA6390
    - enable keep-alive during WoWLAN suspend
    - implement remain-on-channel support
 
  - MediaTek WiFi (mt76):
    - support Wireless Ethernet Dispatch offloading packet movement
      between the Ethernet switch and WiFi interfaces
    - non-standard VHT MCS10-11 support
    - mt7921 AP mode support
    - mt7921 IPv6 NS offload support
 
  - Ethernet PHYs:
    - micrel: ksz9031/ksz9131: cabletest support
    - lan87xx: SQI support for T1 PHYs
    - lan937x: add interrupt support for link detection
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmKNMPQACgkQMUZtbf5S
 IrsRARAAuDyYs6jFYB3p+xazZdOnbF4iAgVv71+DQGvmsCl6CB9OrsNZMlvE85OL
 Q3gjcRbgjrkN4lhgI8DmiGYbsUJnAvVjFdNjccz1Z/vTLYvuIM0ol54MUp5S+9WY
 StncOJkOGJxxR/Gi5gzVmejPDsysU3Jik+hm/fpIcz8pybXxAsFKU5waY5qfl+/T
 TZepfV0VCfqRDjqcF1qA5+jJZNU8pdodQlZ1+mh8bwu6Jk1ZkWkj6Ov8MWdwQldr
 LnPeK/9hIGzkdJYHZfajxA3t8D0K5CHzSuih2bJ9ry8ZXgVBkXEThew778/R5izW
 uB0YZs9COFlrIP7XHjtRTy/2xHOdYIPlj2nWhVdfuQDX8Crvt4VRN6EZ1rjko1ZJ
 WanfG6WHF8NH5pXBRQbh3kIMKBnYn6OIzuCfCQSqd+niHcxFIM4vRiggeXI5C5TW
 vJgEWfK6X+NfDiFVa3xyCrEmp5ieA/pNecpwd8rVkql+MtFAAw4vfsotLKOJEAru
 J/XL6UE+YuLqIJV9ACZ9x1AFXXAo661jOxBunOo4VXhXVzWS9lYYz5r5ryIkgT/8
 /Fr0zjANJWgfIuNdIBtYfQ4qG+LozGq038VA06RhFUAZ5tF9DzhqJs2Q2AFuWWBC
 ewCePJVqo1j2Ceq2mGonXRt47OEnlePoOxTk9W+cKZb7ZWE+zEo=
 =Wjii
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core
  ----

   - Support TCPv6 segmentation offload with super-segments larger than
     64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP).

   - Generalize skb freeing deferral to per-cpu lists, instead of
     per-socket lists.

   - Add a netdev statistic for packets dropped due to L2 address
     mismatch (rx_otherhost_dropped).

   - Continue work annotating skb drop reasons.

   - Accept alternative netdev names (ALT_IFNAME) in more netlink
     requests.

   - Add VLAN support for AF_PACKET SOCK_RAW GSO.

   - Allow receiving skb mark from the socket as a cmsg.

   - Enable memcg accounting for veth queues, sysctl tables and IPv6.

  BPF
  ---

   - Add libbpf support for User Statically-Defined Tracing (USDTs).

   - Speed up symbol resolution for kprobes multi-link attachments.

   - Support storing typed pointers to referenced and unreferenced
     objects in BPF maps.

   - Add support for BPF link iterator.

   - Introduce access to remote CPU map elements in BPF per-cpu map.

   - Allow middle-of-the-road settings for the
     kernel.unprivileged_bpf_disabled sysctl.

   - Implement basic types of dynamic pointers e.g. to allow for
     dynamically sized ringbuf reservations without extra memory copies.

  Protocols
  ---------

   - Retire port only listening_hash table, add a second bind table
     hashed by port and address. Avoid linear list walk when binding to
     very popular ports (e.g. 443).

   - Add bridge FDB bulk flush filtering support allowing user space to
     remove all FDB entries matching a condition.

   - Introduce accept_unsolicited_na sysctl for IPv6 to implement
     router-side changes for RFC9131.

   - Support for MPTCP path manager in user space.

   - Add MPTCP support for fallback to regular TCP for connections that
     have never connected additional subflows or transmitted
     out-of-sequence data (partial support for RFC8684 fallback).

   - Avoid races in MPTCP-level window tracking, stabilize and improve
     throughput.

   - Support lockless operation of GRE tunnels with seq numbers enabled.

   - WiFi support for host based BSS color collision detection.

   - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets.

   - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2).

   - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile).

   - Allow matching on the number of VLAN tags via tc-flower.

   - Add tracepoint for tcp_set_ca_state().

  Driver API
  ----------

   - Improve error reporting from classifier and action offload.

   - Add support for listing line cards in switches (devlink).

   - Add helpers for reporting page pool statistics with ethtool -S.

   - Add support for reading clock cycles when using PTP virtual clocks,
     instead of having the driver convert to time before reporting. This
     makes it possible to report time from different vclocks.

   - Support configuring low-latency Tx descriptor push via ethtool.

   - Separate Clause 22 and Clause 45 MDIO accesses more explicitly.

  New hardware / drivers
  ----------------------

   - Ethernet:
      - Marvell's Octeon NIC PCI Endpoint support (octeon_ep)
      - Sunplus SP7021 SoC (sp7021_emac)
      - Add support for Renesas RZ/V2M (in ravb)
      - Add support for MediaTek mt7986 switches (in mtk_eth_soc)

   - Ethernet PHYs:
      - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting)
      - TI DP83TD510 PHY
      - Microchip LAN8742/LAN88xx PHYs

   - WiFi:
      - Driver for pureLiFi X, XL, XC devices (plfxlc)
      - Driver for Silicon Labs devices (wfx)
      - Support for WCN6750 (in ath11k)
      - Support Realtek 8852ce devices (in rtw89)

   - Mobile:
      - MediaTek T700 modems (Intel 5G 5000 M.2 cards)

   - CAN:
      - ctucanfd: add support for CTU CAN FD open-source IP core from
        Czech Technical University in Prague

  Drivers
  -------

   - Delete a number of old drivers still using virt_to_bus().

   - Ethernet NICs:
      - intel: support TSO on tunnels MPLS
      - broadcom: support multi-buffer XDP
      - nfp: support VF rate limiting
      - sfc: use hardware tx timestamps for more than PTP
      - mlx5: multi-port eswitch support
      - hyper-v: add support for XDP_REDIRECT
      - atlantic: XDP support (including multi-buffer)
      - macb: improve real-time perf by deferring Tx processing to NAPI

   - High-speed Ethernet switches:
      - mlxsw: implement basic line card information querying
      - prestera: add support for traffic policing on ingress and egress

   - Embedded Ethernet switches:
      - lan966x: add support for packet DMA (FDMA)
      - lan966x: add support for PTP programmable pins
      - ti: cpsw_new: enable bc/mc storm prevention

   - Qualcomm 802.11ax WiFi (ath11k):
      - Wake-on-WLAN support for QCA6390 and WCN6855
      - device recovery (firmware restart) support
      - support setting Specific Absorption Rate (SAR) for WCN6855
      - read country code from SMBIOS for WCN6855/QCA6390
      - enable keep-alive during WoWLAN suspend
      - implement remain-on-channel support

   - MediaTek WiFi (mt76):
      - support Wireless Ethernet Dispatch offloading packet movement
        between the Ethernet switch and WiFi interfaces
      - non-standard VHT MCS10-11 support
      - mt7921 AP mode support
      - mt7921 IPv6 NS offload support

   - Ethernet PHYs:
      - micrel: ksz9031/ksz9131: cabletest support
      - lan87xx: SQI support for T1 PHYs
      - lan937x: add interrupt support for link detection"

* tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits)
  ptp: ocp: Add firmware header checks
  ptp: ocp: fix PPS source selector debugfs reporting
  ptp: ocp: add .init function for sma_op vector
  ptp: ocp: vectorize the sma accessor functions
  ptp: ocp: constify selectors
  ptp: ocp: parameterize input/output sma selectors
  ptp: ocp: revise firmware display
  ptp: ocp: add Celestica timecard PCI ids
  ptp: ocp: Remove #ifdefs around PCI IDs
  ptp: ocp: 32-bit fixups for pci start address
  Revert "net/smc: fix listen processing for SMC-Rv2"
  ath6kl: Use cc-disable-warning to disable -Wdangling-pointer
  selftests/bpf: Dynptr tests
  bpf: Add dynptr data slices
  bpf: Add bpf_dynptr_read and bpf_dynptr_write
  bpf: Dynptr support for ring buffers
  bpf: Add bpf_dynptr_from_mem for local dynptrs
  bpf: Add verifier support for dynptrs
  bpf: Suppress 'passing zero to PTR_ERR' warning
  bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack
  ...
2022-05-25 12:22:58 -07:00
Linus Torvalds
ac2ab99072 Random number generator updates for Linux 5.19-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmKKpM8ACgkQSfxwEqXe
 A6726w/+OJimGd4arvpSmdn+vxepSyDLgKfwM0x5zprRVd16xg8CjJr4eMonTesq
 YvtJRqpetb53MB+sMhutlvQqQzrjtf2MBkgPwF4I2gUrk7vLD45Q+AGdGhi/rUwz
 wHGA7xg1FHLHia2M/9idSqi8QlZmUP4u4l5ZnMyTUHiwvRD6XOrWKfqvUSawNzyh
 hCWlTUxDrjizsW5YpsJX/MkRadSC8loJEk5ByZebow6nRPfurJvqfrcOMgHyNrbY
 pOZ/CGPxcetMqotL2TuuJt5wKmenqYhIWGAp3YM2SWWgU2ueBZekW8AYeMfgUcvh
 LWV93RpSuAnE5wsdjIULvjFnEDJBf8ihfMnMrd9G5QjQu44tuKWfY2MghLSpYzaR
 V6UFbRmhrqhqiStHQXOvk1oqxtpbHlc9zzJLmvPmDJcbvzXQ9Opk5GVXAmdtnHnj
 M/ty3wGWxucY6mHqT8MkCShSSslbgEtc1pEIWHdrUgnaiSVoCVBEO+9LqLbjvOTm
 XA/6YtoiCE5FasK51pir1zVb2GORQn0v8HnuAOsusD/iPAlRQ/G5jZkaXbwRQI6j
 atYL1svqvSKn5POnzqAlMUXfMUr19K5xqJdp7i6qmlO1Vq6Z+tWbCQgD1JV+Wjkb
 CMyvXomFCFu4aYKGRE2SBRnWLRghG3kYHqEQ15yTPMQerxbUDNg=
 =SUr3
 -----END PGP SIGNATURE-----

Merge tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
 "These updates continue to refine the work began in 5.17 and 5.18 of
  modernizing the RNG's crypto and streamlining and documenting its
  code.

  New for 5.19, the updates aim to improve entropy collection methods
  and make some initial decisions regarding the "premature next" problem
  and our threat model. The cloc utility now reports that random.c is
  931 lines of code and 466 lines of comments, not that basic metrics
  like that mean all that much, but at the very least it tells you that
  this is very much a manageable driver now.

  Here's a summary of the various updates:

   - The random_get_entropy() function now always returns something at
     least minimally useful. This is the primary entropy source in most
     collectors, which in the best case expands to something like RDTSC,
     but prior to this change, in the worst case it would just return 0,
     contributing nothing. For 5.19, additional architectures are wired
     up, and architectures that are entirely missing a cycle counter now
     have a generic fallback path, which uses the highest resolution
     clock available from the timekeeping subsystem.

     Some of those clocks can actually be quite good, despite the CPU
     not having a cycle counter of its own, and going off-core for a
     stamp is generally thought to increase jitter, something positive
     from the perspective of entropy gathering. Done very early on in
     the development cycle, this has been sitting in next getting some
     testing for a while now and has relevant acks from the archs, so it
     should be pretty well tested and fine, but is nonetheless the thing
     I'll be keeping my eye on most closely.

   - Of particular note with the random_get_entropy() improvements is
     MIPS, which, on CPUs that lack the c0 count register, will now
     combine the high-speed but short-cycle c0 random register with the
     lower-speed but long-cycle generic fallback path.

   - With random_get_entropy() now always returning something useful,
     the interrupt handler now collects entropy in a consistent
     construction.

   - Rather than comparing two samples of random_get_entropy() for the
     jitter dance, the algorithm now tests many samples, and uses the
     amount of differing ones to determine whether or not jitter entropy
     is usable and how laborious it must be. The problem with comparing
     only two samples was that if the cycle counter was extremely slow,
     but just so happened to be on the cusp of a change, the slowness
     wouldn't be detected. Taking many samples fixes that to some
     degree.

     This, combined with the other improvements to random_get_entropy(),
     should make future unification of /dev/random and /dev/urandom
     maybe more possible. At the very least, were we to attempt it again
     today (we're not), it wouldn't break any of Guenter's test rigs
     that broke when we tried it with 5.18. So, not today, but perhaps
     down the road, that's something we can revisit.

   - We attempt to reseed the RNG immediately upon waking up from system
     suspend or hibernation, making use of the various timestamps about
     suspend time and such available, as well as the usual inputs such
     as RDRAND when available.

   - Batched randomness now falls back to ordinary randomness before the
     RNG is initialized. This provides more consistent guarantees to the
     types of random numbers being returned by the various accessors.

   - The "pre-init injection" code is now gone for good. I suspect you
     in particular will be happy to read that, as I recall you
     expressing your distaste for it a few months ago. Instead, to avoid
     a "premature first" issue, while still allowing for maximal amount
     of entropy availability during system boot, the first 128 bits of
     estimated entropy are used immediately as it arrives, with the next
     128 bits being buffered. And, as before, after the RNG has been
     fully initialized, it winds up reseeding anyway a few seconds later
     in most cases. This resulted in a pretty big simplification of the
     initialization code and let us remove various ad-hoc mechanisms
     like the ugly crng_pre_init_inject().

   - The RNG no longer pretends to handle the "premature next" security
     model, something that various academics and other RNG designs have
     tried to care about in the past. After an interesting mailing list
     thread, these issues are thought to be a) mainly academic and not
     practical at all, and b) actively harming the real security of the
     RNG by delaying new entropy additions after a potential compromise,
     making a potentially bad situation even worse. As well, in the
     first place, our RNG never even properly handled the premature next
     issue, so removing an incomplete solution to a fake problem was
     particularly nice.

     This allowed for numerous other simplifications in the code, which
     is a lot cleaner as a consequence. If you didn't see it before,
     https://lore.kernel.org/lkml/YmlMGx6+uigkGiZ0@zx2c4.com/ may be a
     thread worth skimming through.

   - While the interrupt handler received a separate code path years ago
     that avoids locks by using per-cpu data structures and a faster
     mixing algorithm, in order to reduce interrupt latency, input and
     disk events that are triggered in hardirq handlers were still
     hitting locks and more expensive algorithms. Those are now
     redirected to use the faster per-cpu data structures.

   - Rather than having the fake-crypto almost-siphash-based random32
     implementation be used right and left, and in many places where
     cryptographically secure randomness is desirable, the batched
     entropy code is now fast enough to replace that.

   - As usual, numerous code quality and documentation cleanups. For
     example, the initialization state machine now uses enum symbolic
     constants instead of just hard coding numbers everywhere.

   - Since the RNG initializes once, and then is always initialized
     thereafter, a pretty heavy amount of code used during that
     initialization is never used again. It is now completely cordoned
     off using static branches and it winds up in the .text.unlikely
     section so that it doesn't reduce cache compactness after the RNG
     is ready.

   - A variety of functions meant for waiting on the RNG to be
     initialized were only used by vsprintf, and in not a particularly
     optimal way. Replacing that usage with a more ordinary setup made
     it possible to remove those functions.

   - A cleanup of how we warn userspace about the use of uninitialized
     /dev/urandom and uninitialized get_random_bytes() usage.
     Interestingly, with the change you merged for 5.18 that attempts to
     use jitter (but does not block if it can't), the majority of users
     should never see those warnings for /dev/urandom at all now, and
     the one for in-kernel usage is mainly a debug thing.

   - The file_operations struct for /dev/[u]random now implements
     .read_iter and .write_iter instead of .read and .write, allowing it
     to also implement .splice_read and .splice_write, which makes
     splice(2) work again after it was broken here (and in many other
     places in the tree) during the set_fs() removal. This was a bit of
     a last minute arrival from Jens that hasn't had as much time to
     bake, so I'll be keeping my eye on this as well, but it seems
     fairly ordinary. Unfortunately, read_iter() is around 3% slower
     than read() in my tests, which I'm not thrilled about. But Jens and
     Al, spurred by this observation, seem to be making progress in
     removing the bottlenecks on the iter paths in the VFS layer in
     general, which should remove the performance gap for all drivers.

   - Assorted other bug fixes, cleanups, and optimizations.

   - A small SipHash cleanup"

* tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (49 commits)
  random: check for signals after page of pool writes
  random: wire up fops->splice_{read,write}_iter()
  random: convert to using fops->write_iter()
  random: convert to using fops->read_iter()
  random: unify batched entropy implementations
  random: move randomize_page() into mm where it belongs
  random: remove mostly unused async readiness notifier
  random: remove get_random_bytes_arch() and add rng_has_arch_random()
  random: move initialization functions out of hot pages
  random: make consistent use of buf and len
  random: use proper return types on get_random_{int,long}_wait()
  random: remove extern from functions in header
  random: use static branch for crng_ready()
  random: credit architectural init the exact amount
  random: handle latent entropy and command line from random_init()
  random: use proper jiffies comparison macro
  random: remove ratelimiting for in-kernel unseeded randomness
  random: move initialization out of reseeding hot path
  random: avoid initializing twice in credit race
  random: use symbolic constants for crng_init states
  ...
2022-05-24 11:58:10 -07:00
Linus Torvalds
95fbef17e8 s390 updates for 5.19 merge window
- Make use of the IBM z16 processor activity instrumentation facility
   to count cryptography operations: add a new PMU device driver so
   that perf can make use of this.
 
 - Add new IBM z16 extended counter set to cpumf support.
 
 - Add vdso randomization support.
 
 - Add missing KCSAN instrumentation to barriers and spinlocks, which
   should make s390's KCSAN support complete.
 
 - Add support for IPL-complete-control facility: notify the hypervisor
   that kexec finished work and the kernel starts.
 
 - Improve error logging for PCI.
 
 - Various small changes to workaround llvm's integrated assembler
   limitations, and one bug, to make it finally possible to compile the
   kernel with llvm's integrated assembler. This also requires to raise
   the minimum clang version to 14.0.0.
 
 - Various other small enhancements, bug fixes, and cleanups all over
   the place.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmKLedYACgkQIg7DeRsp
 bsKDfA//TR/8jyyrNs75VDUPiS0UgMgHfjinQqLa8qwaQxCxA0J31I9nYiDxSfp/
 E8hTCLyARnPX0YpcLCEI0ChC6Ad+LElGr6kctdV0FTQopRVreVRKYe2bmrsvXNqs
 4OzFNGZ8mnvMMSi1IQ/A7Yq/DZjbEON5VfY3iJv8djyC7qVNDgngdiQxtIJ+3eq/
 77pw3VEgtuI2lVC3O9fEsdqRUyB5UHS3GSknmc8+KuRmOorir0JwMvxQ9xARZJYE
 6FbTnSDW1YGI6TBoa/zFberqsldU/qJzo40JmPr27a2qbEmysc8kw60r+cIFsxgC
 H432/aS9102CnsocaY7CtOvs+TLAK8dYeU31enxUGXnICMJ0MuuqnNnAfHrJziVs
 ZnK3iUfPmMMewYfSefn8Sk87kJR5ggGePF++44GEqd87lRwZUnC+hd19dNtzzgSx
 Br4dRYrdQl+w2nqBHGCGW2288svtiPHslnhaQqy343fS9q0o3Mebqx1e9be7t9/K
 IDFQ00Cd3FS2jhphCbCrq2vJTmByhTQqCiNoEJ6vZK2B3ksrJUotfdwI+5etE2Kj
 8sOPwOPyIAI9HnXFVknGIl/u5kaPuHazkZu6u3Or0miVZYw01pov1am0ArcFjeMX
 /4Js/lI4O/wXvRzVk0rILrAZFDirAHvqqx+aI20cegTQU2C8mHY=
 =W+1k
 -----END PGP SIGNATURE-----

Merge tag 's390-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - Make use of the IBM z16 processor activity instrumentation facility
   to count cryptography operations: add a new PMU device driver so that
   perf can make use of this.

 - Add new IBM z16 extended counter set to cpumf support.

 - Add vdso randomization support.

 - Add missing KCSAN instrumentation to barriers and spinlocks, which
   should make s390's KCSAN support complete.

 - Add support for IPL-complete-control facility: notify the hypervisor
   that kexec finished work and the kernel starts.

 - Improve error logging for PCI.

 - Various small changes to workaround llvm's integrated assembler
   limitations, and one bug, to make it finally possible to compile the
   kernel with llvm's integrated assembler. This also requires to raise
   the minimum clang version to 14.0.0.

 - Various other small enhancements, bug fixes, and cleanups all over
   the place.

* tag 's390-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits)
  s390/head: get rid of 31 bit leftovers
  scripts/min-tool-version.sh: raise minimum clang version to 14.0.0 for s390
  s390/boot: do not emit debug info for assembly with llvm's IAS
  s390/boot: workaround llvm IAS bug
  s390/purgatory: workaround llvm's IAS limitations
  s390/entry: workaround llvm's IAS limitations
  s390/alternatives: remove padding generation code
  s390/alternatives: provide identical sized orginal/alternative sequences
  s390/cpumf: add new extended counter set for IBM z16
  s390/preempt: disable __preempt_count_add() optimization for PROFILE_ALL_BRANCHES
  s390/stp: clock_delta should be signed
  s390/stp: fix todoff size
  s390/pai: add support for cryptography counters
  entry: Rename arch_check_user_regs() to arch_enter_from_user_mode()
  s390/compat: cleanup compat_linux.h header file
  s390/entry: remove broken and not needed code
  s390/boot: convert parmarea to C
  s390/boot: convert initial lowcore to C
  s390/ptrace: move short psw definitions to ptrace header file
  s390/head: initialize all new psws
  ...
2022-05-23 21:01:30 -07:00
Linus Torvalds
de8ac81747 - Remove all the code around GS switching on 32-bit now that it is not
needed anymore
 
 - Other misc improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmKLp74ACgkQEsHwGGHe
 VUpqrhAAgNdNw/vNTTzeOH5ZSNxyIoTQapmrSNev0cXRW4tV2hxuYSa2wPZPJZXx
 aYhnFxwL7rVy0er7jG/5KaOyzHmrh6PcmqgFdPVo8+yVrfcsPIUqg/4L5peFZh7T
 ETV2pvFIiB4njkL/pR3mU5uAtTjyO89tD/LclKmc4ndv19vI8maj+k/dCDOnNnEz
 m4wJMXYWh4bG47/izU5TcTYU7ttTLEiVQ/mC5kEuj7PQeUR0kXKvvLo4rX+lOI2v
 dQRHgHg/qoNM7uVLd7vV/YdMWwcHchmKG5Y7+a/ogdlwR7a/X9e+lklFSeuxNvyH
 8dOHIyzcb6lKTijpqhisZ3o9150ax3Q5FlSWuE3F/9Rcuc1T5eY82kTW2RTOTdV9
 xsjob4y+hlpsUfuImupxJLHn685xsYAdqyiG/SPkcnJL++tNBlWiGHX9NqXF5cgw
 bq4/94Aouxevl0OBxnFBeoQOJvOnf60OY3LHcYR78yEEJyi4iWsC0/TEmD+9IE+r
 EpC1wz9bHCYbSwZ+yv8u2tNPd/rKxdspPL/6SxT9a+WAVrOZbQAN3VmlOIon6W9O
 bW5ye6suqBbl/Q1FACVU1xxSNjLTJUTFsB1X3QKGm8E+Kr7/zD1ZtT0WQNvyLMfT
 p/I4VRcdIxV3eDiYqeTfJ3sTS7IjKHSaZVBnpkZvRh869mMdqCg=
 =CfX1
 -----END PGP SIGNATURE-----

Merge tag 'x86_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core x86 updates from Borislav Petkov:

 - Remove all the code around GS switching on 32-bit now that it is not
   needed anymore

 - Other misc improvements

* tag 'x86_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  bug: Use normal relative pointers in 'struct bug_entry'
  x86/nmi: Make register_nmi_handler() more robust
  x86/asm: Merge load_gs_index()
  x86/32: Remove lazy GS macros
  ELF: Remove elf_core_copy_kernel_regs()
  x86/32: Simplify ELF_CORE_COPY_REGS
2022-05-23 18:42:07 -07:00
Jakub Kicinski
1ef0736c07 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-05-23

We've added 113 non-merge commits during the last 26 day(s) which contain
a total of 121 files changed, 7425 insertions(+), 1586 deletions(-).

The main changes are:

1) Speed up symbol resolution for kprobes multi-link attachments, from Jiri Olsa.

2) Add BPF dynamic pointer infrastructure e.g. to allow for dynamically sized ringbuf
   reservations without extra memory copies, from Joanne Koong.

3) Big batch of libbpf improvements towards libbpf 1.0 release, from Andrii Nakryiko.

4) Add BPF link iterator to traverse links via seq_file ops, from Dmitrii Dolgov.

5) Add source IP address to BPF tunnel key infrastructure, from Kaixi Fan.

6) Refine unprivileged BPF to disable only object-creating commands, from Alan Maguire.

7) Fix JIT blinding of ld_imm64 when they point to subprogs, from Alexei Starovoitov.

8) Add BPF access to mptcp_sock structures and their meta data, from Geliang Tang.

9) Add new BPF helper for access to remote CPU's BPF map elements, from Feng Zhou.

10) Allow attaching 64-bit cookie to BPF link of fentry/fexit/fmod_ret, from Kui-Feng Lee.

11) Follow-ups to typed pointer support in BPF maps, from Kumar Kartikeya Dwivedi.

12) Add busy-poll test cases to the XSK selftest suite, from Magnus Karlsson.

13) Improvements in BPF selftest test_progs subtest output, from Mykola Lysenko.

14) Fill bpf_prog_pack allocator areas with illegal instructions, from Song Liu.

15) Add generic batch operations for BPF map-in-map cases, from Takshak Chahande.

16) Make bpf_jit_enable more user friendly when permanently on 1, from Tiezhu Yang.

17) Fix an array overflow in bpf_trampoline_get_progs(), from Yuntao Wang.

====================

Link: https://lore.kernel.org/r/20220523223805.27931-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-23 16:07:14 -07:00
Julia Lawall
ff2095976c s390/bpf: Fix typo in comment
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/20220521111145.81697-84-Julia.Lawall@inria.fr
2022-05-23 11:25:53 -07:00
Josh Poimboeuf
69505e3d9a bug: Use normal relative pointers in 'struct bug_entry'
With CONFIG_GENERIC_BUG_RELATIVE_POINTERS, the addr/file relative
pointers are calculated weirdly: based on the beginning of the bug_entry
struct address, rather than their respective pointer addresses.

Make the relative pointers less surprising to both humans and tools by
calculating them the normal way.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Sven Schnelle <svens@linux.ibm.com> # s390
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Link: https://lkml.kernel.org/r/f0e05be797a16f4fc2401eeb88c8450dcbe61df6.1652362951.git.jpoimboe@kernel.org
2022-05-19 23:46:10 +02:00
Heiko Carstens
94d3477897 s390/head: get rid of 31 bit leftovers
Get rid of old 31 bit leftovers within ipl code:

- convert everything to pc relative code
- use 64 bit addressing mode as early as possible
- use 64 bit arithmetics wherever possible

This way the code doesn't look as odd as before anymore.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-18 13:31:07 +02:00
Heiko Carstens
bb31074db9 s390/boot: do not emit debug info for assembly with llvm's IAS
Commit ee6d777d3e93 ("s390/decompressor: support extra debug flags")
added extra debug flags, in particular debug info is created,
depending on config options.

With llvm's IAS this causes this compile warning:

arch/s390/boot/head.S:38:1: warning: DWARF2 only supports one section per compilation unit
.section ".head.text","ax"
^

This is a known problem and was addressed with commit b8a9092330da
("Kbuild: do not emit debug info for assembly with LLVM_IAS=1").
Just do the same for s390 to get rid of this warning.

Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20220511120532.2228616-8-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-17 15:16:29 +02:00
Heiko Carstens
e9953b729b s390/boot: workaround llvm IAS bug
For at least the mvc and clc instructions llvm's integrated assembler can
generate incorrect code. In particular this happens with decompressor boot
code. The reason seems to be that relocations for the second displacement
of each instruction are at incorrect locations (-/+: gas vs llvm IAS):

mvc     __LC_IO_NEW_PSW(16),.Lnewpsw

results in

        4:      d2 0f 01 f0 00 00       mvc     496(16,%r0),0
-                       8: R_390_12     .head.text+0x10
+		       6: R_390_12     .head.text+0x10

and
clc     0(3,%r4),.L_hdr
results in

      258:      d5 02 40 00 00 00       clc     0(3,%r4),0
-                       25c: R_390_12   .head.text+0x324
+		       25a: R_390_12   .head.text+0x324

Workaround this by writing the code in a different way.

Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/llvm/llvm-project/issues/55411
Link: https://lore.kernel.org/r/20220511120532.2228616-7-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-17 15:16:29 +02:00
Heiko Carstens
adda746629 s390/purgatory: workaround llvm's IAS limitations
llvm's integrated assembler cannot handle immediate values which are
calculated with two local labels:

arch/s390/purgatory/head.S:139:11: error: invalid operand for instruction
 aghi %r8,-(.base_crash-purgatory_start)

Workaround this by partially rewriting the code.

Link: https://lore.kernel.org/r/20220511120532.2228616-6-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-17 15:16:28 +02:00
Heiko Carstens
4c25f0ff63 s390/entry: workaround llvm's IAS limitations
llvm's integrated assembler cannot handle immediate values which are
calculated with two local labels:

<instantiation>:3:13: error: invalid operand for instruction
 clgfi %r14,.Lsie_done - .Lsie_gmap

Workaround this by adding clang specific code which reads the specific
value from memory. Since this code is within the hot paths of the kernel
and adds an additional memory reference, keep the original code, and add
ifdef'ed code.

Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/r/20220511120532.2228616-5-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-17 15:16:28 +02:00
Heiko Carstens
e6ed91fd07 s390/alternatives: remove padding generation code
clang fails to handle ".if" statements in inline assembly which are heavily
used in the alternatives code.

To work around this remove this code, and enforce that users of
alternatives must specify original and alternative instruction sequences
which have identical sizes. Add a compile time check with two ".org"
statements similar to arm64.

In result not only clang can handle this, but also quite a lot of code can
be removed.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1356
Link: https://lore.kernel.org/r/20220511120532.2228616-3-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-17 15:16:28 +02:00
Heiko Carstens
fad442d3ab s390/alternatives: provide identical sized orginal/alternative sequences
Explicitly provide identical sized original/alternative instruction
sequences. This way there is no need for the s390 specific alternatives
infrastructure to generate padding sequences.
The code which generates such sequences will be removed with a follow on
patch.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20220511120532.2228616-2-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-17 15:16:28 +02:00
Thomas Richter
c9311de716 s390/cpumf: add new extended counter set for IBM z16
Export the extended counter set counters of the IBM z16 via sysfs.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-16 10:58:33 +02:00
Jason A. Donenfeld
2e3df52325 s390: define get_cycles macro for arch-override
S390x defines a get_cycles() function, but it does not do the usual
`#define get_cycles get_cycles` dance, making it impossible for generic
code to see if an arch-specific function was defined. While the
get_cycles() ifdef is not currently used, the following timekeeping
patch in this series will depend on the macro existing (or not existing)
when defining random_get_entropy().

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13 23:59:23 +02:00
Heiko Carstens
63678eecec s390/preempt: disable __preempt_count_add() optimization for PROFILE_ALL_BRANCHES
gcc 12 does not (always) optimize away code that should only be generated
if parameters are constant and within in a certain range. This depends on
various obscure kernel config options, however in particular
PROFILE_ALL_BRANCHES can trigger this compile error:

In function ‘__atomic_add_const’,
    inlined from ‘__preempt_count_add.part.0’ at ./arch/s390/include/asm/preempt.h:50:3:
./arch/s390/include/asm/atomic_ops.h:80:9: error: impossible constraint in ‘asm’
   80 |         asm volatile(                                                   \
      |         ^~~

Workaround this by simply disabling the optimization for
PROFILE_ALL_BRANCHES, since the kernel will be so slow, that this
optimization won't matter at all.

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-11 14:40:58 +02:00
Sven Schnelle
5ace65ebb5 s390/stp: clock_delta should be signed
clock_delta is declared as unsigned long in various places. However,
the clock sync delta can be negative. This would add a huge positive
offset in clock_sync_global where clock_delta is added to clk.eitod
which is a 72 bit integer. Declare it as signed long to fix this.

Cc: stable@vger.kernel.org
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-11 14:40:57 +02:00
Sven Schnelle
03780c83c7 s390/stp: fix todoff size
The size of the TOD offset field in the stp info response is 64 bits.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-11 14:40:57 +02:00
Thomas Richter
39d62336f5 s390/pai: add support for cryptography counters
PMU device driver perf_pai_crypto supports Processor Activity
Instrumentation (PAI), available with IBM z16:
- maps a full page to lowcore address 0x1500.
- uses CR0 bit 13 to turn PAI crypto counting on and off.
- creates a sample with raw data on each context switch out when
  at context switch some mapped counters have a value of nonzero.
This device driver only supports CPU wide context, no task context
is allowed.

Support for counting:
- one or more counters can be specified using
  perf stat -e pai_crypto/xxx/
  where xxx stands for the counter event name. Multiple invocation
  of this command is possible. The counter names are listed in
  /sys/devices/pai_crypto/events directory.
- one special counters can be specified using
  perf stat -e pai_crypto/CRYPTO_ALL/
  which returns the sum of all incremented crypto counters.
- one event pai_crypto/CRYPTO_ALL/ is reserved for sampling.
  No multiple invocations are possible. The event collects data at
  context switch out and saves them in the ring buffer.

Add qpaci assembly instruction to query supported memory mapped crypto
counters. It returns the number of counters (no holes allowed in that
range).

The PAI crypto counter events are system wide and can not be executed
in parallel. Therefore some restrictions documented in function
paicrypt_busy apply.
In particular event CRYPTO_ALL for sampling must run exclusive.
Only counting events can run in parallel.

PAI crypto counter events can not be created when a CPU hot plug
add is processed. This means a CPU hot plug add does not get
the necessary PAI event to record PAI cryptography counter increments
on the newly added CPU. CPU hot plug remove removes the event and
terminates the counting of PAI counters immediately.

Co-developed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20220504062351.2954280-3-tmricht@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-09 11:50:01 +02:00
Sven Schnelle
6d97af487d entry: Rename arch_check_user_regs() to arch_enter_from_user_mode()
arch_check_user_regs() is used at the moment to verify that struct pt_regs
contains valid values when entering the kernel from userspace. s390 needs
a place in the generic entry code to modify a cpu data structure when
switching from userspace to kernel mode. As arch_check_user_regs() is
exactly this, rename it to arch_enter_from_user_mode().

When entering the kernel from userspace, arch_check_user_regs() is
used to verify that struct pt_regs contains valid values. Note that
the NMI codepath doesn't call this function. s390 needs a place in the
generic entry code to modify a cpu data structure when switching from
userspace to kernel mode. As arch_check_user_regs() is exactly this,
rename it to arch_enter_from_user_mode().

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20220504062351.2954280-2-tmricht@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-09 11:33:38 +02:00
Heiko Carstens
fcdc03f78d s390/compat: cleanup compat_linux.h header file
Remove various declarations from former s390 specific compat system
calls which have been removed with commit fef747bab3c0 ("s390: use
generic UID16 implementation"). While at it clean up the whole small
header file.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:16 +02:00
Heiko Carstens
29b06ad7e8 s390/entry: remove broken and not needed code
LLVM's integrated assembler reports the following error when compiling
entry.S:

<instantiation>:38:5: error: unknown token in expression
 tm %r8,0x0001 # coming from user space?

The correct instruction would have been tmhh instead of tm.
The current code is doing nothing, since (with gas) it get's
translated to a tm instruction which reads from real address 8, which
again contains always zero, and therefore the conditional code is
never executed.
Note that due to the missing displacement gas translates "%r8" into
"8(%r0)".

Also code inspection reveals that this conditional code is not needed.
Therefore remove it.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
f84d88ed3b s390/boot: convert parmarea to C
Convert parmarea to C, which makes it much easier to initialize it. No need
to keep offsets in assembler code in sync with struct parmarea anymore.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
834979c27f s390/boot: convert initial lowcore to C
Convert initial lowcore to C and use proper defines and structures to
initialize it. This should make the z/VM ipl procedure a bit less magic.

Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
67a9c428ef s390/ptrace: move short psw definitions to ptrace header file
The short psw definitions are contained in compat header files, however
short psws are not compat specific. Therefore move the definitions to
ptrace header file. This also gets rid of a compat header include in kvm
code.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
aceb06d1e8 s390/head: initialize all new psws
Initialize all new psws with disabled wait psws, except for the restart new
psw. This way every unexpected exception, svc, machine check, or interrupt
is handled properly.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
84f4e1dfb2 s390/boot: change initial program check handler to disabled wait psw
The program check handler of the kernel image points to
startup_pgm_check_handler. However an early program check which happens
while loading the kernel image will jump to potentially random code, since
the code of the program check handler is not yet loaded; leading to a
program check loop.

Therefore initialize it to a disabled wait psw and let the startup code set
the proper psw when everything is in memory.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
734757976e s390/head: adjust iplstart entry point
Move iplstart entry point to 0x200 again, instead of the middle of the ipl
code. This way even the comment describing the ccw program is correct
again.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:14 +02:00
Heiko Carstens
edd4a86673 s390/boot: get rid of startup archive
The final kernel image is created by linking decompressor object files with
a startup archive. The startup archive file however does not contain only
optional code and data which can be discarded if not referenced. It also
contains mandatory object data like head.o which must never be discarded,
even if not referenced.

Move the decompresser code and linker script to the boot directory and get
rid of the startup archive so everything is kept during link time.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:14 +02:00
Heiko Carstens
964bc5dbe6 s390/vx: remove comments from macros which break LLVM's IAS
LLVM's integrated assembler does not like comments within macros:

<instantiation>:3:19: error: too many positional arguments
        GR_NUM  b2, 1       /* Base register */
                            ^
Remove them, since they are obvious anyway.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:14 +02:00
Heiko Carstens
68a971acc9 s390/extable: prefer local labels in .set directives
Use local labels in .set directives to avoid potential compile errors
with LTO + clang. See commit 334865b2915c ("x86/extable: Prefer local
labels in .set directives") for further details.

Since s390 doesn't support LTO currently this doesn't fix a real bug
for now, but helps to avoid problems as soon as required pieces have
been added to llvm.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:14 +02:00
Heiko Carstens
f9a3099f79 s390/nospec: prefer local labels in .set directives
Use local labels in .set directives to avoid potential compile errors
with LTO + clang. See commit 334865b2915c ("x86/extable: Prefer local
labels in .set directives") for further details.

Since s390 doesn't support LTO currently this doesn't fix a real bug
for now, but helps to avoid problems as soon as required pieces have
been added to llvm.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:13 +02:00
Julia Lawall
108ab40fc1 s390/hypfs: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/20220430191122.8667-5-Julia.Lawall@inria.fr
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:13 +02:00
Julia Lawall
4b03b3ee60 s390/crypto: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/20220430191122.8667-2-Julia.Lawall@inria.fr
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:13 +02:00
Christian Borntraeger
a06afe8383 KVM: s390: vsie/gmap: reduce gmap_rmap overhead
there are cases that trigger a 2nd shadow event for the same
vmaddr/raddr combination. (prefix changes, reboots, some known races)
This will increase memory usages and it will result in long latencies
when cleaning up, e.g. on shutdown. To avoid cases with a list that has
hundreds of identical raddrs we check existing entries at insert time.
As this measurably reduces the list length this will be faster than
traversing the list at shutdown time.

In the long run several places will be optimized to create less entries
and a shrinker might be necessary.

Fixes: 4be130a08420 ("s390/mm: add shadow gmap support")
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20220429151526.1560-1-borntraeger@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-03 15:15:44 +02:00
Janis Schoetterl-Glausch
b5d1274409 KVM: s390: Fix lockdep issue in vm memop
Issuing a memop on a protected vm does not make sense,
neither is the memory readable/writable, nor does it make sense to check
storage keys. This is why the ioctl will return -EINVAL when it detects
the vm to be protected. However, in order to ensure that the vm cannot
become protected during the memop, the kvm->lock would need to be taken
for the duration of the ioctl. This is also required because
kvm_s390_pv_is_protected asserts that the lock must be held.
Instead, don't try to prevent this. If user space enables secure
execution concurrently with a memop it must accecpt the possibility of
the memop failing.
Still check if the vm is currently protected, but without locking and
consider it a heuristic.

Fixes: ef11c9463ae0 ("KVM: s390: Add vm IOCTL for key checked guest absolute memory access")
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20220322153204.2637400-1-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-02 19:45:03 +02:00
Pingfan Liu
6260f6427c s390/irq: utilize RCU instead of irq_lock_sparse() in show_msi_interrupt()
As demonstrated by commit 74bdf7815dfb ("genirq: Speedup
show_interrupts()"), irq_desc can be accessed safely in RCU read section.

Hence here resorting to rcu read lock to get rid of irq_lock_sparse().

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Link: https://lore.kernel.org/r/20220422100212.22666-1-kernelfans@gmail.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-27 12:53:34 +02:00
Sven Schnelle
8b202ee218 s390: disable -Warray-bounds
gcc-12 shows a lot of array bound warnings on s390. This is caused
by the S390_lowcore macro which uses a hardcoded address of 0.

Wrapping that with absolute_pointer() works, but gcc no longer knows
that a 12 bit displacement is sufficient to access lowcore. So it
emits instructions like 'lghi %r1,0; l %rx,xxx(%r1)' instead of a
single load/store instruction. As s390 stores variables often
read/written in lowcore, this is considered problematic. Therefore
disable -Warray-bounds on s390 for gcc-12 for the time being, until
there is a better solution.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Link: https://lore.kernel.org/r/yt9dzgkelelc.fsf@linux.ibm.com
Link: https://lore.kernel.org/r/20220422134308.1613610-1-svens@linux.ibm.com
Link: https://lore.kernel.org/r/20220425121742.3222133-1-svens@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-27 12:53:07 +02:00
Ilya Leoshkevich
9a07731702 s390: add KCSAN instrumentation to barriers and spinlocks
test_barrier fails on s390 because of the missing KCSAN instrumentation
for several synchronization primitives.

Add it to barriers by defining __mb(), __rmb(), __wmb(), __dma_rmb()
and __dma_wmb(), and letting the common code in asm-generic/barrier.h
do the rest.

Spinlocks require instrumentation only on the unlock path; notify KCSAN
that the CPU cannot move memory accesses outside of the spin lock. In
reality it also cannot move stores inside of it, but this is not
important and can be omitted.

Reported-by: Tobias Huschle <huschle@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:16 +02:00
Niklas Schnelle
34fb0e7034 s390/pci: add error record for CC 2 retries
Currently it is not detectable from within Linux when PCI instructions
are retried because of a busy condition. Detecting such conditions and
especially how long they lasted can however be quite useful in problem
determination. This patch enables this by adding an s390dbf error log
when a CC 2 is first encountered as well as after the retried
instruction.

Despite being unlikely it may be possible that these added debug
messages drown out important other messages so allow setting the debug
level in zpci_err_insn*() and set their level to 1 so they can be
filtered out if need be.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Niklas Schnelle
cde8833e40 s390/pci: add PCI access type and length to error records
Currently when a PCI instruction returns a non-zero condition code it
can be very hard to tell from the s390dbf logs what kind of instruction
was executed. In case of PCI memory I/O (MIO) instructions it is even
impossible to tell if we attempted a load, store or block store or how
large the access was because only the address is logged.

Improve this by adding an indicator byte for the instruction type to the
error record and also store the length of the access for MIO
instructions where this can not be deduced from the request.

We use the following indicator values:
 - 'l': PCI load
 - 's': PCI store
 - 'b': PCI store block
 - 'L': PCI load (MIO)
 - 'S': PCI store (MIO)
 - 'B': PCI store block (MIO)
 - 'M': MPCIFC
 - 'R': RPCIT

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Niklas Schnelle
723b5a9d2b s390/pci: don't log availability events as errors
Availability events are logged in s390dbf in s390dbf/pci_error/hex_ascii
even though they don't indicate an error condition.

They have also become redundant as commit 6526a597a2e85 ("s390/pci: add
simpler s390dbf traces for events") added an s390dbf/pci_msg/sprintf log
entry for availability events which contains all non reserved fields of
struct zpci_ccdf_avail. On the other hand the availability entries in
the error log make it easy to miss actual errors and may even overwrite
error entries if the message buffer wraps.

Thus simply remove the availability events from the error log thereby
establishing the rule that any content in s390dbf/pci_error indicates
some kind of error.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Niklas Schnelle
52c79e636a s390/pci: make better use of zpci_dbg() levels
While the zpci_dbg() macro offers a level parameter this is currently
largely unused. The only instance with higher importance than 3 is the
UID checking change debug message which is not actually more important
as the UID uniqueness guarantee is already exposed in sysfs so this
should rather be 3 as well.

On the other hand the "add ..." message which shows what devices are
visible at the lowest level is essential during problem determination.
By setting its level to 1, lowering the debug level can act as a filter
to only show the available functions.

On the error side the default level is set to 6 while all existing
messages are printed at level 0. This is inconsistent and means there is
no room for having messages be invisible on the default level so instead
set the default level to 3 like for errors matching the default for
debug messages.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Sven Schnelle
41cd81abaf s390/vdso: add vdso randomization
Randomize the address of vdso if randomize_va_space is enabled.
Note that this keeps the vdso address on the same PMD as the stack
to avoid allocating an extra page table just for vdso.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Sven Schnelle
9e37a2e854 s390/vdso: map vdso above stack
In the current code vdso is mapped below the stack. This is
problematic when programs mapped to the top of the address space
are allocating a lot of memory, because the heap will clash with
the vdso. To avoid this map the vdso above the stack and move
STACK_TOP so that it all fits into three level paging.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:14 +02:00
Sven Schnelle
57761da4dc s390/vdso: move vdso mapping to its own function
This is a preparation patch for adding vdso randomization to s390.
It adds a function vdso_size(), which will be used later in calculating
the STACK_TOP value. It also moves the vdso mapping into a new function
vdso_map(), to keep the code similar to other architectures.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:14 +02:00
Sven Schnelle
f2f47d0ef7 s390/mmap: increase stack/mmap gap to 128MB
This basically reverts commit 9e78a13bfb16 ("[S390] reduce miminum
gap between stack and mmap_base"). 32MB is not enough space
between stack and mmap for some programs. Given that compat
task aren't common these days, lets revert back to 128MB.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:14 +02:00
Harald Freudenberger
2004b57cde s390/zcrypt: code cleanup
This patch tries to fix as much as possible of the
checkpatch.pl --strict findings:
  CHECK: Logical continuations should be on the previous line
  CHECK: No space is necessary after a cast
  CHECK: Alignment should match open parenthesis
  CHECK: 'useable' may be misspelled - perhaps 'usable'?
  WARNING: Possible repeated word: 'is'
  CHECK: spaces preferred around that '*' (ctx:VxV)
  CHECK: Comparison to NULL could be written "!msg"
  CHECK: Prefer kzalloc(sizeof(*zc)...) over kzalloc(sizeof(struct...)...)
  CHECK: Unnecessary parentheses around resp_type->work
  CHECK: Avoid CamelCase: <xcRB>

There is no functional change comming with this patch, only
code cleanup, renaming, whitespaces, indenting, ... but no
semantic change in any way. Also the API (zcrypt and pkey
header file) is semantically unchanged.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:14 +02:00
Harald Freudenberger
6acb086d9f s390/zcrypt: cleanup CPRB struct definitions
This patch does a little cleanup on the CPRBX struct
in zcrypt.h and the redundant CPRB struct definition in
zcrypt_msgtype6.c. Especially some of the misleading
fields from the CPRBX struct have been removed.

There is no semantic change coming with this patch.
The field names changed in the XCRB struct are only related
to reserved fields which should never been used.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:13 +02:00