35268 Commits

Author SHA1 Message Date
Rafael J. Wysocki
e1f1320fc0 Merge branch 'pm-cpufreq'
* pm-cpufreq: (31 commits)
  cpufreq: Fix cpufreq_online() return value on errors
  cpufreq: Fix up several kerneldoc comments
  cpufreq: stats: Use local_clock() instead of jiffies
  cpufreq: schedutil: Simplify sugov_update_next_freq()
  cpufreq: intel_pstate: Simplify intel_cpufreq_update_pstate()
  cpufreq: arm_scmi: Discover the power scale in performance protocol
  firmware: arm_scmi: Add power_scale_mw_get() interface
  cpufreq: tegra194: Rename tegra194_get_speed_common function
  cpufreq: tegra194: Remove unnecessary frequency calculation
  cpufreq: tegra186: Simplify cluster information lookup
  cpufreq: tegra186: Fix sparse 'incorrect type in assignment' warning
  cpufreq: imx: fix NVMEM_IMX_OCOTP dependency
  cpufreq: vexpress-spc: Add missing MODULE_ALIAS
  cpufreq: scpi: Add missing MODULE_ALIAS
  cpufreq: loongson1: Add missing MODULE_ALIAS
  cpufreq: sun50i: Add missing MODULE_DEVICE_TABLE
  cpufreq: st: Add missing MODULE_DEVICE_TABLE
  cpufreq: qcom: Add missing MODULE_DEVICE_TABLE
  cpufreq: mediatek: Add missing MODULE_DEVICE_TABLE
  cpufreq: highbank: Add missing MODULE_DEVICE_TABLE
  ...
2020-12-15 15:24:52 +01:00
Peter Zijlstra
ae79270232 sched: Optimize finish_lock_switch()
The kernel test robot measured a -1.6% performance regression on
will-it-scale/sched_yield due to commit:

  2558aacff858 ("sched/hotplug: Ensure only per-cpu kthreads run during hotplug")

Even though we were careful to replace a single load with another
single load from the same cacheline.

Restore finish_lock_switch() to the exact state before the offending
patch and solve the problem differently.

Fixes: 2558aacff858 ("sched/hotplug: Ensure only per-cpu kthreads run during hotplug")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201210161408.GX3021@hirez.programming.kicks-ass.net
2020-12-15 11:27:53 +01:00
Thomas Gleixner
3c41e57a1e irqchip updates for Linux 5.11
- Preliminary support for managed interrupts on platform devices
 - Correctly identify allocation of MSIs proxyied by another device
 - Remove the fasteoi IPI flow which has been proved useless
 - Generalise the Ocelot support to new SoCs
 - Improve GICv4.1 vcpu entry, matching the corresponding KVM optimisation
 - Work around spurious interrupts on Qualcomm PDC
 - Random fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl/Uxq8PHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDoW0P/0ZMDvFPxrfnJD46exUgUOPuuFF8jZxAlxD8
 7UExqar7u6yX7bbq394jPgtOOxldDagfCx/jCXgb9ja7DK5EHKRcrfjaDT8knHi2
 Keg5RaRMRi9TVltvWQTxAkXwSv0Atl881qqsndPeZCez0GNZp+HB34s+rNkZwBOu
 MBrWihMQOSv5QE6milsNc7HXLSHM1eLZ7Y2XgumNtKrIGEX9yZI7qwdMofwP8Za3
 ayMOvc1WAWaTJI7Mg5ac1yTCVbqLmRHhCtws6c6DMgaRu6SI0itmbpQzkDuJJIe3
 k9h4KQPaKAFcQsoo3GV0MKTMm63eq82XT3CAdv+htYRY1z95D2+nzNK+mJtsGptX
 gJ2zeJkUb4u+yVtNguL9qjo5ssCXV/6IybJxv6baaEFnSwQMUwqa066NdxmtqfIe
 1BOWnc153a7SRbQ34M9/llje+v8YJbueGMS2RFR2LQ6IjjpaHsXh+YCZokfA/kdk
 zGbOUD5WWFtFD1T3UoaJ4gFt+pzHjNqym4CcEj4S1Vf5y+POUkNmC+GYK+xfm2Fp
 WJMbdIUxJhHFRD9L1ShtfAVUSbp712VOOdILp9rYAkOdqfb51BVUiMUP++s2dGp1
 ZIT78qt7kTKT1CxbDdFAjzsi7RoMqdSGYgKmG4sVprELeZnFwq47nBkBr8XEQ1TT
 0ccEUOY8
 =7Z24
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates for 5.11 from Marc Zyngier:

  - Preliminary support for managed interrupts on platform devices
  - Correctly identify allocation of MSIs proxyied by another device
  - Remove the fasteoi IPI flow which has been proved useless
  - Generalise the Ocelot support to new SoCs
  - Improve GICv4.1 vcpu entry, matching the corresponding KVM optimisation
  - Work around spurious interrupts on Qualcomm PDC
  - Random fixes and cleanups

Link: https://lore.kernel.org/r/20201212135626.1479884-1-maz@kernel.org
2020-12-15 10:48:07 +01:00
Linus Torvalds
148842c98a Yet another large set of x86 interrupt management updates:
- Simplification and distangling of the MSI related functionality
 
    - Let IO/APIC construct the RTE entries from an MSI message instead of
      having IO/APIC specific code in the interrupt remapping drivers
 
    - Make the retrieval of the parent interrupt domain (vector or remap
      unit) less hardcoded and use the relevant irqdomain callbacks for
      selection.
 
    - Allow the handling of more than 255 CPUs without a virtualized IOMMU
      when the hypervisor supports it. This has made been possible by the
      above modifications and also simplifies the existing workaround in the
      HyperV specific virtual IOMMU.
 
    - Cleanup of the historical timer_works() irq flags related
      inconsistencies.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/Xxd8THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYpOD/9C5TppNlPMUyx2SflH6bxt37pJEpln
 +hYTKsk+jSThntr5mfj+GifGvgmHOVBTGnlDUnUnrpN7TQmLFBzwTOtnBLW53AO2
 16/u0+Xci4LNCtEkaymf0Rq4MfsfriXHPJr0A/CnZ0tpHSf5QKHAiitSiGujdMlb
 gbq43+zXd+jNkH7vkOLPX/7dZVI1hNASQEevJu2tRR4xYTuXFdBxvLgYkHtYKKrK
 R1sbs6nI6yIzye2u4m4xGu29SxgUft+zdUf+UehJKM3yFmf51d9qpkX+kLaTWuaL
 VPsMItbn0kdvxwXQWO6DYnIAAnVKCklyHQJTZCoNq9Fe91OoByak1CEVspSOa1av
 JmycNSch4IYWasR4vVCB1gbb+V9SejcKu5SV3CDrEDqwkOIpfiqpriUXSCJTLlFd
 QOEDOLuuk/79Qs//J/tb/nJ4IuKv8WPudDfIlMro8wUsAr67DjD4mnXprZ+svwWx
 Ct/0/Memk+BSa0cw6pvg24BUZGN6zrufkBu2HKT9GOXRUdNkdLkiPhT8mK4T/O0l
 f90QCLjPSOJ/K/pLEWdUHEPmgC5Q9RsXOmwVGqX+RbjfP7mYTJXlmWnBb+cFNch0
 xFIH3SxVGylxxT06NX3SkvinrHj10CoAlmneefBlLtx6dF+2P84DAMZSF0OFToVI
 c2KMg5zoesI4bg==
 =8Gfs
 -----END PGP SIGNATURE-----

Merge tag 'x86-apic-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 apic updates from Thomas Gleixner:
 "Yet another large set of x86 interrupt management updates:

   - Simplification and distangling of the MSI related functionality

   - Let IO/APIC construct the RTE entries from an MSI message instead
     of having IO/APIC specific code in the interrupt remapping drivers

   - Make the retrieval of the parent interrupt domain (vector or remap
     unit) less hardcoded and use the relevant irqdomain callbacks for
     selection.

   - Allow the handling of more than 255 CPUs without a virtualized
     IOMMU when the hypervisor supports it. This has made been possible
     by the above modifications and also simplifies the existing
     workaround in the HyperV specific virtual IOMMU.

   - Cleanup of the historical timer_works() irq flags related
     inconsistencies"

* tag 'x86-apic-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/ioapic: Cleanup the timer_works() irqflags mess
  iommu/hyper-v: Remove I/O-APIC ID check from hyperv_irq_remapping_select()
  iommu/amd: Fix IOMMU interrupt generation in X2APIC mode
  iommu/amd: Don't register interrupt remapping irqdomain when IR is disabled
  iommu/amd: Fix union of bitfields in intcapxt support
  x86/ioapic: Correct the PCI/ISA trigger type selection
  x86/ioapic: Use I/O-APIC ID for finding irqdomain, not index
  x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
  x86/kvm: Enable 15-bit extension when KVM_FEATURE_MSI_EXT_DEST_ID detected
  iommu/hyper-v: Disable IRQ pseudo-remapping if 15 bit APIC IDs are available
  x86/apic: Support 15 bits of APIC ID in MSI where available
  x86/ioapic: Handle Extended Destination ID field in RTE
  iommu/vt-d: Simplify intel_irq_remapping_select()
  x86: Kill all traces of irq_remapping_get_irq_domain()
  x86/ioapic: Use irq_find_matching_fwspec() to find remapping irqdomain
  x86/hpet: Use irq_find_matching_fwspec() to find remapping irqdomain
  iommu/hyper-v: Implement select() method on remapping irqdomain
  iommu/vt-d: Implement select() method on remapping irqdomain
  iommu/amd: Implement select() method on remapping irqdomain
  x86/apic: Add select() method on vector irqdomain
  ...
2020-12-14 18:59:53 -08:00
Linus Torvalds
edd7ab7684 The new preemtible kmap_local() implementation:
- Consolidate all kmap_atomic() internals into a generic implementation
     which builds the base for the kmap_local() API and make the
     kmap_atomic() interface wrappers which handle the disabling/enabling of
     preemption and pagefaults.
 
   - Switch the storage from per-CPU to per task and provide scheduler
     support for clearing mapping when scheduling out and restoring them
     when scheduling back in.
 
   - Merge the migrate_disable/enable() code, which is also part of the
     scheduler pull request. This was required to make the kmap_local()
     interface available which does not disable preemption when a mapping
     is established. It has to disable migration instead to guarantee that
     the virtual address of the mapped slot is the same accross preemption.
 
   - Provide better debug facilities: guard pages and enforced utilization
     of the mapping mechanics on 64bit systems when the architecture allows
     it.
 
   - Provide the new kmap_local() API which can now be used to cleanup the
     kmap_atomic() usage sites all over the place. Most of the usage sites
     do not require the implicit disabling of preemption and pagefaults so
     the penalty on 64bit and 32bit non-highmem systems is removed and quite
     some of the code can be simplified. A wholesale conversion is not
     possible because some usage depends on the implicit side effects and
     some need to be cleaned up because they work around these side effects.
 
     The migrate disable side effect is only effective on highmem systems
     and when enforced debugging is enabled. On 64bit and 32bit non-highmem
     systems the overhead is completely avoided.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XyQwTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoUolD/9+R+BX96fGir+I8rG9dc3cbLw5meSi
 0I/Nq3PToZMs2Iqv50DsoaPYHHz/M6fcAO9LRIgsE9jRbnY93GnsBM0wU9Y8yQaT
 4wUzOG5WHaLDfqIkx/CN9coUl458oEiwOEbn79A2FmPXFzr7IpkufnV3ybGDwzwP
 p73bjMJMPPFrsa9ig87YiYfV/5IAZHi82PN8Cq1v4yNzgXRP3Tg6QoAuCO84ZnWF
 RYlrfKjcJ2xPdn+RuYyXolPtxr1hJQ0bOUpe4xu/UfeZjxZ7i1wtwLN9kWZe8CKH
 +x4Lz8HZZ5QMTQ9sCHOLtKzu2MceMcpISzoQH4/aFQCNMgLn1zLbS790XkYiQCuR
 ne9Cua+IqgYfGMG8cq8+bkU9HCNKaXqIBgPEKE/iHYVmqzCOqhW5Cogu4KFekf6V
 Wi7pyyUdX2en8BAWpk5NHc8de9cGcc+HXMq2NIcgXjVWvPaqRP6DeITERTZLJOmz
 XPxq5oPLGl7wdm7z+ICIaNApy8zuxpzb6sPLNcn7l5OeorViORlUu08AN8587wAj
 FiVjp6ZYomg+gyMkiNkDqFOGDH5TMENpOFoB0hNNEyJwwS0xh6CgWuwZcv+N8aPO
 HuS/P+tNANbD8ggT4UparXYce7YCtgOf3IG4GA3JJYvYmJ6pU+AZOWRoDScWq4o+
 +jlfoJhMbtx5Gg==
 =n71I
 -----END PGP SIGNATURE-----

Merge tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull kmap updates from Thomas Gleixner:
 "The new preemtible kmap_local() implementation:

   - Consolidate all kmap_atomic() internals into a generic
     implementation which builds the base for the kmap_local() API and
     make the kmap_atomic() interface wrappers which handle the
     disabling/enabling of preemption and pagefaults.

   - Switch the storage from per-CPU to per task and provide scheduler
     support for clearing mapping when scheduling out and restoring them
     when scheduling back in.

   - Merge the migrate_disable/enable() code, which is also part of the
     scheduler pull request. This was required to make the kmap_local()
     interface available which does not disable preemption when a
     mapping is established. It has to disable migration instead to
     guarantee that the virtual address of the mapped slot is the same
     across preemption.

   - Provide better debug facilities: guard pages and enforced
     utilization of the mapping mechanics on 64bit systems when the
     architecture allows it.

   - Provide the new kmap_local() API which can now be used to cleanup
     the kmap_atomic() usage sites all over the place. Most of the usage
     sites do not require the implicit disabling of preemption and
     pagefaults so the penalty on 64bit and 32bit non-highmem systems is
     removed and quite some of the code can be simplified. A wholesale
     conversion is not possible because some usage depends on the
     implicit side effects and some need to be cleaned up because they
     work around these side effects.

     The migrate disable side effect is only effective on highmem
     systems and when enforced debugging is enabled. On 64bit and 32bit
     non-highmem systems the overhead is completely avoided"

* tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  ARM: highmem: Fix cache_is_vivt() reference
  x86/crashdump/32: Simplify copy_oldmem_page()
  io-mapping: Provide iomap_local variant
  mm/highmem: Provide kmap_local*
  sched: highmem: Store local kmaps in task struct
  x86: Support kmap_local() forced debugging
  mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP
  mm/highmem: Provide and use CONFIG_DEBUG_KMAP_LOCAL
  microblaze/mm/highmem: Add dropped #ifdef back
  xtensa/mm/highmem: Make generic kmap_atomic() work correctly
  mm/highmem: Take kmap_high_get() properly into account
  highmem: High implementation details and document API
  Documentation/io-mapping: Remove outdated blurb
  io-mapping: Cleanup atomic iomap
  mm/highmem: Remove the old kmap_atomic cruft
  highmem: Get rid of kmap_types.h
  xtensa/mm/highmem: Switch to generic kmap atomic
  sparc/mm/highmem: Switch to generic kmap atomic
  powerpc/mm/highmem: Switch to generic kmap atomic
  nds32/mm/highmem: Switch to generic kmap atomic
  ...
2020-12-14 18:35:53 -08:00
Linus Torvalds
adb35e8dc9 Scheduler updates:
- migrate_disable/enable() support which originates from the RT tree and
    is now a prerequisite for the new preemptible kmap_local() API which aims
    to replace kmap_atomic().
 
  - A fair amount of topology and NUMA related improvements
 
  - Improvements for the frequency invariant calculations
 
  - Enhanced robustness for the global CPU priority tracking and decision
    making
 
  - The usual small fixes and enhancements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XwK4THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoX28D/9cVrvziSQGfBfuQWnUiw8iOIq1QBa2
 Me+Tvenhfrlt7xU6rbP9ciFu7eTN+fS06m5uQPGI+t22WuJmHzbmw1bJVXfkvYfI
 /QoU+Hg7DkDAn1p7ZKXh0dRkV0nI9ixxSHl0E+Zf1ATBxCUMV2SO85flg6z/4qJq
 3VWUye0dmR7/bhtkIjv5rwce9v2JB2g1AbgYXYTW9lHVoUdGoMSdiZAF4tGyHLnx
 sJ6DMqQ+k+dmPyYO0z5MTzjW/fXit4n9w2e3z9TvRH/uBu58WSW1RBmQYX6aHBAg
 dhT9F4lvTs6lJY23x5RSFWDOv6xAvKF5a0xfb8UZcyH5EoLYrPRvm42a0BbjdeRa
 u0z7LbwIlKA+RFdZzFZWz8UvvO0ljyMjmiuqZnZ5dY9Cd80LSBuxrWeQYG0qg6lR
 Y2povhhCepEG+q8AXIe2YjHKWKKC1s/l/VY3CNnCzcd21JPQjQ4Z5eWGmHif5IED
 CntaeFFhZadR3w02tkX35zFmY3w4soKKrbI4EKWrQwd+cIEQlOSY7dEPI/b5BbYj
 MWAb3P4EG9N77AWTNmbhK4nN0brEYb+rBbCA+5dtNBVhHTxAC7OTWElJOC2O66FI
 e06dREjvwYtOkRUkUguWwErbIai2gJ2MH0VILV3hHoh64oRk7jjM8PZYnjQkdptQ
 Gsq0rJW5iiu/OQ==
 =Oz1V
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Thomas Gleixner:

 - migrate_disable/enable() support which originates from the RT tree
   and is now a prerequisite for the new preemptible kmap_local() API
   which aims to replace kmap_atomic().

 - A fair amount of topology and NUMA related improvements

 - Improvements for the frequency invariant calculations

 - Enhanced robustness for the global CPU priority tracking and decision
   making

 - The usual small fixes and enhancements all over the place

* tag 'sched-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits)
  sched/fair: Trivial correction of the newidle_balance() comment
  sched/fair: Clear SMT siblings after determining the core is not idle
  sched: Fix kernel-doc markup
  x86: Print ratio freq_max/freq_base used in frequency invariance calculations
  x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC
  x86, sched: Calculate frequency invariance for AMD systems
  irq_work: Optimize irq_work_single()
  smp: Cleanup smp_call_function*()
  irq_work: Cleanup
  sched: Limit the amount of NUMA imbalance that can exist at fork time
  sched/numa: Allow a floating imbalance between NUMA nodes
  sched: Avoid unnecessary calculation of load imbalance at clone time
  sched/numa: Rename nr_running and break out the magic number
  sched: Make migrate_disable/enable() independent of RT
  sched/topology: Condition EAS enablement on FIE support
  arm64: Rebuild sched domains on invariance status changes
  sched/topology,schedutil: Wrap sched domains rebuild
  sched/uclamp: Allow to reset a task uclamp constraint value
  sched/core: Fix typos in comments
  Documentation: scheduler: fix information on arch SD flags, sched_domain and sched_debug
  ...
2020-12-14 18:29:11 -08:00
Linus Torvalds
533369b145 timers and timekeeping updates:
Core:
 
   - Robustness improvements for the NOHZ tick management
 
   - Fixes and consolidation of the NTP/RTC synchronization code
 
   - Small fixes and improvements in various places
 
   - A set of function documentation udpates and fixes
 
  Drivers:
 
   - Cleanups and improvements in various clocksoure/event drivers
 
   - Removal of the EZChip NPS clocksource driver as the platfrom support
     was removed from ARC
 
   - The usual set of new device tree binding and json conversions
 
   - The RTC driver which have been acked by the RTC maintainer:
 
     - Fix a long standing bug in the MC146818 library code which can cause
       reading garbage during the RTC internal update.
 
     - The changes related to the NTP/RTC consolidation work.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/Xw1wTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYof7SD/4iIjuP5HoY7ec0z9wSFQ5U5nUwJnpW
 Sre13SUXpW+wOa/RcjAaHiD2G4MGtQyUIBibuL18Q5GMtGOvlIueEniuYP57p1XU
 ipr1UMnFvRkAaFNOnySzLiQyuliteBcNSDHrLYsSWW2BwjLbNzX46zG5kILrt31i
 IsseHZdD9+7SXBLvCjO6FAYkVH8FeIaFKv+3ZmroWOxPBOXi4wn02K86HrXs/6Wu
 9SCUIMcewhvSx3xCURzyMv6S2hgKSzywRNc5WcYIE8OPlKbnAE0IC370r3o2uL1B
 4dZPv4H1y7F7M4G+/XlIv0l2DTp9RuiWut9QcYmHtlFCKkrEO3ZGlcgPU6y5+mNc
 AwwG0J51yJYqg42aifdDNJ18B9GUNVCfVAKZcOYHLXOBgSvshd2WkPJkXsGaHd3z
 KrK3kZUnx+/QUWZB7dMuq+HQG2PJTvKkEwu4VGReWPGmubXbsIqBZ0vH5jYHjuEo
 t4QCUc5BpNlXOUJxal5wzVmDWnoqfKqbmnPky/f/cmNEfQNY6nA9hC3vo781j532
 Z5snFXhbITqIkaHoN86wMuuDCjKBKBJGQvejZKgPvh3oIg9d5yaj9P0UAhoYtv+M
 jMus4QDb6eBirgnZIVpgBC3kVZOxNOEHNsPeCcVfvPa7QOQnY4Cmb0GWnpZ2SZOz
 KYSjTIXKgZnHiQ==
 =eWC0
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timers and timekeeping updates from Thomas Gleixner:
 "Core:

   - Robustness improvements for the NOHZ tick management

   - Fixes and consolidation of the NTP/RTC synchronization code

   - Small fixes and improvements in various places

   - A set of function documentation udpates and fixes

   Drivers:

   - Cleanups and improvements in various clocksoure/event drivers

   - Removal of the EZChip NPS clocksource driver as the platfrom
     support was removed from ARC

   - The usual set of new device tree binding and json conversions

   - The RTC driver which have been acked by the RTC maintainer:

       * fix a long standing bug in the MC146818 library code which can
         cause reading garbage during the RTC internal update.

       * changes related to the NTP/RTC consolidation work"

* tag 'timers-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  ntp: Fix prototype in the !CONFIG_GENERIC_CMOS_UPDATE case
  tick/sched: Make jiffies update quick check more robust
  ntp: Consolidate the RTC update implementation
  ntp: Make the RTC sync offset less obscure
  ntp, rtc: Move rtc_set_ntp_time() to ntp code
  ntp: Make the RTC synchronization more reliable
  rtc: core: Make the sync offset default more realistic
  rtc: cmos: Make rtc_cmos sync offset correct
  rtc: mc146818: Reduce spinlock section in mc146818_set_time()
  rtc: mc146818: Prevent reading garbage
  clocksource/drivers/sh_cmt: Fix potential deadlock when calling runtime PM
  clocksource/drivers/arm_arch_timer: Correct fault programming of CNTKCTL_EL1.EVNTI
  clocksource/drivers/arm_arch_timer: Use stable count reader in erratum sne
  clocksource/drivers/dw_apb_timer_of: Add error handling if no clock available
  clocksource/drivers/riscv: Make RISCV_TIMER depends on RISCV_SBI
  clocksource/drivers/ingenic: Fix section mismatch
  clocksource/drivers/cadence_ttc: Fix memory leak in ttc_setup_clockevent()
  dt-bindings: timer: renesas: tmu: Convert to json-schema
  dt-bindings: timer: renesas: tmu: Document r8a774e1 bindings
  clocksource/drivers/orion: Add missing clk_disable_unprepare() on error path
  ...
2020-12-14 18:21:14 -08:00
Linus Torvalds
76d4acf22b perf/kprobes updates:
- Make kretprobes lockless to avoid the rp->lock performance and potential
    lock ordering issues.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XvuYTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoV3XEADA3yp4ApabrdSMK+JpTM053mM3NCCk
 VLZEdh5+ydvPfgTWZcgLDfL4P4MVySDKf40pSVgZOA73uDWhdO4jcMoJgl9Du4Nq
 qfvz6Atj0a8XEgAFNh1IWGGAHydIwKOQZJyjFT5Kh94QNOErF2PJGAMnoMYpdJsj
 E7kgDM+vmWJk0GE+OYTzsAYQ99XhLfUAO9f8WoRirxyNgga6bu0arRYWZSX3Sg/h
 oDUHeizyrrURUBgxJBewCxvCsy4TTfefwZFUBLK5gm3zRJLKDT2O8wiy+KzlRQqA
 kYV3fSx8fYETlSOJWJC8S01MLpxslGdenIdRgNc63C021DtwMGM83FCl0DLnPMeg
 iX5u+0Qg77rnJ8zh0cgSxyP6EgZzrUW8+DjZagge3PAnTXwYRv95pOJahJifDVmF
 mo2RJ2Me+XbqeB4BYoLivvWpXdsWOvtXl3BTA6ZLV+K823lMPYcZO/cXHIUYHhtu
 ExrZ+aw3opt43KT5sNQmPll7d1UsMD4/761L7gysIYK0RthunmlWpAnnfLTbRdPe
 ELKIHcuSCGkGfRs07/oPbbOpMorhel+3alW0B6Vzar0/0nw3fPX/yPIkCh7s941o
 G0UIPquvBGk3u0bZKZZ7QJPjT0ktdQpQs69+J2ARXWvApAGKnkOlPsNSI9TbPE3D
 ZIguKqSyzqJwuA==
 =PDBa
 -----END PGP SIGNATURE-----

Merge tag 'perf-kprobes-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf/kprobes updates from Thomas Gleixner:
 "Make kretprobes lockless to avoid the rp->lock performance and
  potential lock ordering issues"

* tag 'perf-kprobes-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomics: Regenerate the atomics-check SHA1's
  kprobes: Replace rp->free_instance with freelist
  freelist: Implement lockless freelist
  asm-generic/atomic: Add try_cmpxchg() fallbacks
  kprobes: Remove kretprobe hash
  llist: Add nonatomic __llist_add() and __llist_dell_all()
2020-12-14 17:41:38 -08:00
Linus Torvalds
8a8ca83ec3 Perf updates:
Core:
 
    - Better handling of page table leaves on archictectures which have
      architectures have non-pagetable aligned huge/large pages.  For such
      architectures a leaf can actually be part of a larger entry.
 
    - Prevent a deadlock vs. exec_update_mutex
 
  Architectures:
 
    - The related updates for page size calculation of leaf entries
 
    - The usual churn to support new CPUs
 
    - Small fixes and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XvgATHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoUrdEACatdr93wv75vnm5tCZM4EsFvB2PzVJ
 ck4K4+hHiMVV4802qf+kW5plF+rckAU4TAai/L7wkTntKHvjD/0/o1epoIStb+dS
 SCpVkQMCLT/8xT242iHPOfgsQpVpJnIiBwVRjn8HXu82nXdgMJhKnBjTe634UfxW
 o2OCFiyJzpRi5l86gVp67ueqgvl34NPI2JaSLc0g80QfZ8akzdePPpED35CzYjZh
 41k+7ssvt6qch3vMUySHAhkX4gQl0nc80YAaF/XZbCfvdyY7D03PtfBjfvphTSK0
 l54z9aWh0ciK9P1aPfvkHDXBJUR2VtUAx2GiURK+XU3jNk3KMrz9CcBl1D/exIAg
 07IsiYVoB38YAUOZoR9K8p+p+5EuwYRRUMAgfQfBALCuaLQV477Cne82b2KmNCus
 1izUQvcDDf0s74OyYTHWFXRGla95COJvNLzkrZ1oU3mX4HgdKdOAUbf/2XTLWeKO
 3HOIS+jsg5cp82tRe4X5r51h73pONYlo9lLo/CjQXz25vMcXKtE/MZGq2gkRff4p
 N4k88eQ5LOsRqUaU46GcHozXRCfcpW7SPI9AaN5I/fKGIZvHP7uMdMb+g5DV8yHI
 dNZ8u5uLPHwdg80C3fJ3Pnp7VsVNHliPXMwv0vib7BCp7aUVZWeFnOntw3PdYFRk
 XKEbfl36IuAadg==
 =rZ99
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Thomas Gleixner:
 "Core:

   - Better handling of page table leaves on archictectures which have
     architectures have non-pagetable aligned huge/large pages. For such
     architectures a leaf can actually be part of a larger entry.

   - Prevent a deadlock vs exec_update_mutex

  Architectures:

   - The related updates for page size calculation of leaf entries

   - The usual churn to support new CPUs

   - Small fixes and improvements all over the place"

* tag 'perf-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  perf/x86/intel: Add Tremont Topdown support
  uprobes/x86: Fix fall-through warnings for Clang
  perf/x86: Fix fall-through warnings for Clang
  kprobes/x86: Fix fall-through warnings for Clang
  perf/x86/intel/lbr: Fix the return type of get_lbr_cycles()
  perf/x86/intel: Fix rtm_abort_event encoding on Ice Lake
  x86/kprobes: Restore BTF if the single-stepping is cancelled
  perf: Break deadlock involving exec_update_mutex
  sparc64/mm: Implement pXX_leaf_size() support
  powerpc/8xx: Implement pXX_leaf_size() support
  arm64/mm: Implement pXX_leaf_size() support
  perf/core: Fix arch_perf_get_page_size()
  mm: Introduce pXX_leaf_size()
  mm/gup: Provide gup_get_pte() more generic
  perf/x86/intel: Add event constraint for CYCLE_ACTIVITY.STALLS_MEM_ANY
  perf/x86/intel/uncore: Add Rocket Lake support
  perf/x86/msr: Add Rocket Lake CPU support
  perf/x86/cstate: Add Rocket Lake CPU support
  perf/x86/intel: Add Rocket Lake CPU support
  perf,mm: Handle non-page-table-aligned hugetlbfs
  ...
2020-12-14 17:34:12 -08:00
Linus Torvalds
e857b6fcc5 A moderate set of locking updates:
- A few extensions to the rwsem API and support for opportunistic
     spinning and lock stealing
 
   - lockdep selftest improvements
 
   - Documentation updates
 
   - Cleanups and small fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XvCgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYodbkD/9kmXCablxzCG+IGdRU0KfSvbalHkoS
 hUW7sJ8qYdoysOVMdvImPwqxLDy/P6D8Nk6z+hdaPmfWvIDQQECd7Mg/UhZLkRzI
 BGNgpatnzX4PK5sm/IFExCisPCkkbkjprocnk//TGjdwTiMMDxrndsEpwVwcucDp
 TwOjPPxoAbfWHUmnv2SUOD7mWMqMH/ISTQlKUaz+UCQicPNuHumdsQKvZx3eu7Cv
 KvucTso5Qjmyy0HwpmJO/IEyZs7Ibrb5Ocw5wds3yo2PFTjYTvo3JlJ16g8IvaZW
 ckk+o+3QKp29oFAPQ+dFGEG10w4JQI3AZkDVouFR4BDD0sbOm7BvWCsVq/J8vk3i
 xnmaHT3zB5F4T97O+osBj2KS4zLliOHohWzDNv1+JVBCfniYbPo5hqa/n7OO2oot
 M3xXY3ddgfTEUOtvOPPfZwfG5XmPrgwj8iiyywlTQU4BR5rWYj2ehvhWOwugQJ6x
 g56nQzuf3KmyoI2S+1GZoxtgWSLwoXbUAPL8p4lyvy6jKKFV84BOJeVac803BBUo
 yLFBSvTfZ95iNc84XHjJOJ/MGE8e2hOGa2KEdxuh1qE5FPazBg5e2cQh2j125PLz
 uyhelQn7SgAHSKSXSAOPq0JFsrmxRmkzIgG9zLSEqo+6g6uKdWgGYVCbEzOB+9gB
 2tNEgP6Mfh+ARg==
 =uqcN
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:
 "A moderate set of locking updates:

   - A few extensions to the rwsem API and support for opportunistic
     spinning and lock stealing

   - lockdep selftest improvements

   - Documentation updates

   - Cleanups and small fixes all over the place"

* tag 'locking-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  seqlock: kernel-doc: Specify when preemption is automatically altered
  seqlock: Prefix internal seqcount_t-only macros with a "do_"
  Documentation: seqlock: s/LOCKTYPE/LOCKNAME/g
  locking/rwsem: Remove reader optimistic spinning
  locking/rwsem: Enable reader optimistic lock stealing
  locking/rwsem: Prevent potential lock starvation
  locking/rwsem: Pass the current atomic count to rwsem_down_read_slowpath()
  locking/rwsem: Fold __down_{read,write}*()
  locking/rwsem: Introduce rwsem_write_trylock()
  locking/rwsem: Better collate rwsem_read_trylock()
  rwsem: Implement down_read_interruptible
  rwsem: Implement down_read_killable_nested
  refcount: Fix a kernel-doc markup
  completion: Drop init_completion define
  atomic: Update MAINTAINERS
  atomic: Delete obsolete documentation
  seqlock: Rename __seqprop() users
  lockdep/selftest: Add spin_nest_lock test
  lockdep/selftests: Fix PROVE_RAW_LOCK_NESTING
  seqlock: avoid -Wshadow warnings
  ...
2020-12-14 17:27:47 -08:00
Linus Torvalds
8c1dccc803 RCU, LKMM and KCSAN updates collected by Paul McKenney:
RCU:
 
     - Avoid cpuinfo-induced IPI pileups and idle-CPU IPIs.
 
     - Lockdep-RCU updates reducing the need for __maybe_unused.
 
     - Tasks-RCU updates.
 
     - Miscellaneous fixes.
 
     - Documentation updates.
 
     - Torture-test updates.
 
   KCSAN:
 
     - updates for selftests, avoiding setting watchpoints on NULL pointers
 
     - fix to watchpoint encoding
 
   LKMM:
 
     - updates for documentation along with some updates to example-code
       litmus tests
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/Xon4THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobXUD/92LJTI/TMgK6Z6EEQBiJZO/2mNKjK8
 FEKc6AqTNMlZNsWCfQ5UgqtHpn+MkBZsX1x4u22gehE1qaCB8gnQ5wXgbXon8tQm
 exxVk6vvQZjseeqCMqrsUYQlD7dNgHnf1qAmWXJvji4sA/1Opo6n2M74tqfE2ueV
 S5hpQwSuK/6Zu2Hrr62HD8+Fx0in6ZuKRZxHGp1392l++DGbniJM3dzntRXB+JbZ
 w3PDHFCQuGzTytyeKuQV48ot9IK+2YzmjIp/+4tHL6mvU38xeSu6gcYtqKPcfYWw
 D6HXvDa965h5IrFdSA2JWSzjJ+VYgZVElk2HyXDNIae0fM/8GidgoIDQipT1WAur
 sxW/Ke4U6Jm5MMqXqV8iMNduktkGD1/h6G/iB1Yis29xFdthorNpbHVAP+8cKXgf
 1cR6RorOuBYv6XpyzygHtE7qfLY5ST352pJ4+UqNzboujOcuEnGaygttt0F/F8sA
 ZH8NT5dyUfbGeqepdZWkbj116Hjeg3fyV3CZeyBhDeqpjf1Nn3nbJ1xRksPLfa3i
 IKvN7HSzEg+vKnsJNnQeFlAmQ/W3n2bedzRqfaCg77pNhKI6jPuavY5f2YGFUj0y
 yx0UzOYoI1Cln0keBMmynbyUKgJ7zstLkrt/JenjhtD3B+0df5BmYjkL+nqkP6ax
 +XTCu7Xg+B061g==
 =N/iO
 -----END PGP SIGNATURE-----

Merge tag 'core-rcu-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Thomas Gleixner:
 "RCU, LKMM and KCSAN updates collected by Paul McKenney.

  RCU:
   - Avoid cpuinfo-induced IPI pileups and idle-CPU IPIs

   - Lockdep-RCU updates reducing the need for __maybe_unused

   - Tasks-RCU updates

   - Miscellaneous fixes

   - Documentation updates

   - Torture-test updates

  KCSAN:
   - updates for selftests, avoiding setting watchpoints on NULL pointers

   - fix to watchpoint encoding

  LKMM:
   - updates for documentation along with some updates to example-code
     litmus tests"

* tag 'core-rcu-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
  srcu: Take early exit on memory-allocation failure
  rcu/tree: Defer kvfree_rcu() allocation to a clean context
  rcu: Do not report strict GPs for outgoing CPUs
  rcu: Fix a typo in rcu_blocking_is_gp() header comment
  rcu: Prevent lockdep-RCU splats on lock acquisition/release
  rcu/tree: nocb: Avoid raising softirq for offloaded ready-to-execute CBs
  rcu,ftrace: Fix ftrace recursion
  rcu/tree: Make struct kernel_param_ops definitions const
  rcu/tree: Add a warning if CPU being onlined did not report QS already
  rcu: Clarify nocb kthreads naming in RCU_NOCB_CPU config
  rcu: Fix single-CPU check in rcu_blocking_is_gp()
  rcu: Implement rcu_segcblist_is_offloaded() config dependent
  list.h: Update comment to explicitly note circular lists
  rcu: Panic after fixed number of stalls
  x86/smpboot:  Move rcu_cpu_starting() earlier
  rcu: Allow rcu_irq_enter_check_tick() from NMI
  tools/memory-model: Label MP tests' producers and consumers
  tools/memory-model: Use "buf" and "flag" for message-passing tests
  tools/memory-model: Add types to litmus tests
  tools/memory-model: Add a glossary of LKMM terms
  ...
2020-12-14 17:21:16 -08:00
Linus Torvalds
1ac0884d54 A set of updates for entry/exit handling:
- More generalization of entry/exit functionality
 
  - The consolidation work to reclaim TIF flags on x86 and also for non-x86
    specific TIF flags which are solely relevant for syscall related work
    and have been moved into their own storage space. The x86 specific part
    had to be merged in to avoid a major conflict.
 
  - The TIF_NOTIFY_SIGNAL work which replaces the inefficient signal
    delivery mode of task work and results in an impressive performance
    improvement for io_uring. The non-x86 consolidation of this is going to
    come seperate via Jens.
 
  - The selective syscall redirection facility which provides a clean and
    efficient way to support the non-Linux syscalls of WINE by catching them
    at syscall entry and redirecting them to the user space emulation. This
    can be utilized for other purposes as well and has been designed
    carefully to avoid overhead for the regular fastpath. This includes the
    core changes and the x86 support code.
 
  - Simplification of the context tracking entry/exit handling for the users
    of the generic entry code which guarantee the proper ordering and
    protection.
 
  - Preparatory changes to make the generic entry code accomodate S390
    specific requirements which are mostly related to their syscall restart
    mechanism.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XoPoTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoe0tD/4jSKHIogVM9kVpiYfwjDGS1NluaBXn
 71ZoASbX9GZebyGandMyF2QP1iJ24ZO0RztBwHEVH6fyomKB2iFNedssCpO9yfWV
 3eFRpOvMpbszY2W2bd0QG3GrqaTttjVfB4ahkGLzqeSbchdob6hZpNDYtBZnujA6
 GSnrrurfJkCGoQny+yJQYdQJXQU+BIX90B2a2Q+jW123Luy/iHXC1f/krZSA1m14
 fC9xYLSUjPphTzh2ZOW+C3DgdjOL5PfAm/6F+DArt4GtLgrEGD7R74aLSFhvetky
 dn5QtG+yAsz1i0cc5Wu/JBcT9tOkY92rPYSyLI9bYQUSQ/bMyuprz6oYKj3dubsu
 ZSsKPdkNFPIniL4fLdCMWZcIXX5xgnrxKjdgXZXW3gtrcxSns8w8uED3Sh7dgE08
 pgIeq67E5g/OB8kJXH1VxdewmeQb9cOmnzzHwNO7TrrGbBKjDTYHNdYOKf1dUTTK
 ZX1UjLfGwxTkMYAbQD1k0JGZ2OLRshzSaH5BW/ZKa3bvJW6yYOq+/YT8B8hbJ8U3
 vThlO75/55IJxS5r5Y3vZd/IHdsYbPuETD+TA8tNYtPqNZasW8nnk4TYctWqzDuO
 /Ka1wvWYid3c6ySznQn4zSyRjr968AfHeZ9YTUMhWufy5waXVmdBMG41u3IKfsVt
 osyzNc4EK19/Mg==
 =hsjV
 -----END PGP SIGNATURE-----

Merge tag 'core-entry-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core entry/exit updates from Thomas Gleixner:
 "A set of updates for entry/exit handling:

   - More generalization of entry/exit functionality

   - The consolidation work to reclaim TIF flags on x86 and also for
     non-x86 specific TIF flags which are solely relevant for syscall
     related work and have been moved into their own storage space. The
     x86 specific part had to be merged in to avoid a major conflict.

   - The TIF_NOTIFY_SIGNAL work which replaces the inefficient signal
     delivery mode of task work and results in an impressive performance
     improvement for io_uring. The non-x86 consolidation of this is
     going to come seperate via Jens.

   - The selective syscall redirection facility which provides a clean
     and efficient way to support the non-Linux syscalls of WINE by
     catching them at syscall entry and redirecting them to the user
     space emulation. This can be utilized for other purposes as well
     and has been designed carefully to avoid overhead for the regular
     fastpath. This includes the core changes and the x86 support code.

   - Simplification of the context tracking entry/exit handling for the
     users of the generic entry code which guarantee the proper ordering
     and protection.

   - Preparatory changes to make the generic entry code accomodate S390
     specific requirements which are mostly related to their syscall
     restart mechanism"

* tag 'core-entry-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  entry: Add syscall_exit_to_user_mode_work()
  entry: Add exit_to_user_mode() wrapper
  entry_Add_enter_from_user_mode_wrapper
  entry: Rename exit_to_user_mode()
  entry: Rename enter_from_user_mode()
  docs: Document Syscall User Dispatch
  selftests: Add benchmark for syscall user dispatch
  selftests: Add kselftest for syscall user dispatch
  entry: Support Syscall User Dispatch on common syscall entry
  kernel: Implement selective syscall userspace redirection
  signal: Expose SYS_USER_DISPATCH si_code type
  x86: vdso: Expose sigreturn address on vdso to the kernel
  MAINTAINERS: Add entry for common entry code
  entry: Fix boot for !CONFIG_GENERIC_ENTRY
  x86: Support HAVE_CONTEXT_TRACKING_OFFSTACK
  context_tracking: Only define schedule_user() on !HAVE_CONTEXT_TRACKING_OFFSTACK archs
  sched: Detect call to schedule from critical entry code
  context_tracking: Don't implement exception_enter/exit() on CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK
  context_tracking: Introduce HAVE_CONTEXT_TRACKING_OFFSTACK
  x86: Reclaim unused x86 TI flags
  ...
2020-12-14 17:13:53 -08:00
Linus Torvalds
f9b4240b07 fixes-v5.11
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCX9daOgAKCRCRxhvAZXjc
 ohPkAQChXUB2BAjtIzXlCkZoDBbzHHblm5DZ37oy/4xYFmAcEwEA5sw6dQqyGHnF
 GEP9def51HvXLpBV2BzNUGggo1SoGgQ=
 =w/cO
 -----END PGP SIGNATURE-----

Merge tag 'fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull misc fixes from Christian Brauner:
 "This contains several fixes which felt worth being combined into a
  single branch:

   - Use put_nsproxy() instead of open-coding it switch_task_namespaces()

   - Kirill's work to unify lifecycle management for all namespaces. The
     lifetime counters are used identically for all namespaces types.
     Namespaces may of course have additional unrelated counters and
     these are not altered. This work allows us to unify the type of the
     counters and reduces maintenance cost by moving the counter in one
     place and indicating that basic lifetime management is identical
     for all namespaces.

   - Peilin's fix adding three byte padding to Dmitry's
     PTRACE_GET_SYSCALL_INFO uapi struct to prevent an info leak.

   - Two smal patches to convert from the /* fall through */ comment
     annotation to the fallthrough keyword annotation which I had taken
     into my branch and into -next before df561f6688fe ("treewide: Use
     fallthrough pseudo-keyword") made it upstream which fixed this
     tree-wide.

     Since I didn't want to invalidate all testing for other commits I
     didn't rebase and kept them"

* tag 'fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  nsproxy: use put_nsproxy() in switch_task_namespaces()
  sys: Convert to the new fallthrough notation
  signal: Convert to the new fallthrough notation
  time: Use generic ns_common::count
  cgroup: Use generic ns_common::count
  mnt: Use generic ns_common::count
  user: Use generic ns_common::count
  pid: Use generic ns_common::count
  ipc: Use generic ns_common::count
  uts: Use generic ns_common::count
  net: Use generic ns_common::count
  ns: Add a common refcount into ns_common
  ptrace: Prevent kernel-infoleak in ptrace_get_syscall_info()
2020-12-14 16:40:27 -08:00
Linus Torvalds
6d93a1971a time-namespace-v5.11
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCX9cwgAAKCRCRxhvAZXjc
 onViAP9CDMQct0RfdpdKOrh4NkxWiheBp7CzVSP1Xfy8KHBslgD/X7kilcthT8PC
 JTJmngrVWoehX+s49kl2PSuuLsGElAo=
 =llnx
 -----END PGP SIGNATURE-----

Merge tag 'time-namespace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull time namespace updates from Christian Brauner:
 "When time namespaces were introduced we missed to virtualize the
  'btime' field in /proc/stat. This confuses tasks which are in another
  time namespace with a virtualized boottime which is common in some
  container workloads. This contains Michael's series to fix 'btime'
  which Thomas asked me to take through my tree.

  To fix 'btime' virtualization we simply subtract the offset of the
  time namespace's boottime from btime before printing the stats. Note
  that since start_boottime of processes are seconds since boottime and
  the boottime stamp is now shifted according to the time namespace's
  offset, the offset of the time namespace also needs to be applied
  before the process stats are given to userspace. This avoids that
  processes shown by tools such as 'ps' appear as time travelers in the
  corresponding time namespace.

  Selftests are included to verify that btime virtualization in
  /proc/stat works as expected"

* tag 'time-namespace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  namespace: make timens_on_fork() return nothing
  selftests/timens: added selftest for /proc/stat btime
  fs/proc: apply the time namespace offset to /proc/stat btime
  timens: additional helper functions for boottime offset handling
2020-12-14 16:35:39 -08:00
Linus Torvalds
0ca2ce81eb arm64 updates for 5.11:
- Expose tag address bits in siginfo. The original arm64 ABI did not
   expose any of the bits 63:56 of a tagged address in siginfo. In the
   presence of user ASAN or MTE, this information may be useful. The
   implementation is generic to other architectures supporting tags (like
   SPARC ADI, subject to wiring up the arch code). The user will have to
   opt in via sigaction(SA_EXPOSE_TAGBITS) so that the extra bits, if
   available, become visible in si_addr.
 
 - Default to 32-bit wide ZONE_DMA. Previously, ZONE_DMA was set to the
   lowest 1GB to cope with the Raspberry Pi 4 limitations, to the
   detriment of other platforms. With these changes, the kernel scans the
   Device Tree dma-ranges and the ACPI IORT information before deciding
   on a smaller ZONE_DMA.
 
 - Strengthen READ_ONCE() to acquire when CONFIG_LTO=y. When building
   with LTO, there is an increased risk of the compiler converting an
   address dependency headed by a READ_ONCE() invocation into a control
   dependency and consequently allowing for harmful reordering by the
   CPU.
 
 - Add CPPC FFH support using arm64 AMU counters.
 
 - set_fs() removal on arm64. This renders the User Access Override (UAO)
   ARMv8 feature unnecessary.
 
 - Perf updates: PMU driver for the ARM DMC-620 memory controller, sysfs
   identifier file for SMMUv3, stop event counters support for i.MX8MP,
   enable the perf events-based hard lockup detector.
 
 - Reorganise the kernel VA space slightly so that 52-bit VA
   configurations can use more virtual address space.
 
 - Improve the robustness of the arm64 memory offline event notifier.
 
 - Pad the Image header to 64K following the EFI header definition
   updated recently to increase the section alignment to 64K.
 
 - Support CONFIG_CMDLINE_EXTEND on arm64.
 
 - Do not use tagged PC in the kernel (TCR_EL1.TBID1==1), freeing up 8
   bits for PtrAuth.
 
 - Switch to vmapped shadow call stacks.
 
 - Miscellaneous clean-ups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl/XcSgACgkQa9axLQDI
 XvGkwg//SLknimELD/cphf2UzZm5RFuCU0x1UnIXs9XYo5BrOpgVLLA//+XkCrKN
 0GLAdtBDfw1axWJudzgMBiHrv6wSGh4p3YWjLIW06u/PJu3m3U8oiiolvvF8d7Yq
 UKDseKGQnQkrl97J0SyA+Da/u8D11GEzp52SWL5iRxzt6vInEC27iTOp9n1yoaoP
 f3y7qdp9kv831ryUM3rXFYpc8YuMWXk+JpBSNaxqmjlvjMzipA5PhzBLmNzfc657
 XcrRX5qsgjEeJW8UUnWUVNB42j7tVzN77yraoUpoVVCzZZeWOQxqq5EscKPfIhRt
 AjtSIQNOs95ZVE0SFCTjXnUUb823coUs4dMCdftqlE62JNRwdR+3bkfa+QjPTg1F
 O9ohW1AzX0/JB19QBxMaOgbheB8GFXh3DVJ6pizTgxJgyPvQQtFuEhT1kq8Cst0U
 Pe+pEWsg9t41bUXNz+/l9tUWKWpeCfFNMTrBXLmXrNlTLeOvDh/0UiF0+2lYJYgf
 YAboibQ5eOv2wGCcSDEbNMJ6B2/6GtubDJxH4du680F6Emb6pCSw0ntPwB7mSGLG
 5dXz+9FJxDLjmxw7BXxQgc5MoYIrt5JQtaOQ6UxU8dPy53/+py4Ck6tXNkz0+Ap7
 gPPaGGy1GqobQFu3qlHtOK1VleQi/sWcrpmPHrpiiFUf6N7EmcY=
 =zXFk
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Expose tag address bits in siginfo. The original arm64 ABI did not
   expose any of the bits 63:56 of a tagged address in siginfo. In the
   presence of user ASAN or MTE, this information may be useful. The
   implementation is generic to other architectures supporting tags
   (like SPARC ADI, subject to wiring up the arch code). The user will
   have to opt in via sigaction(SA_EXPOSE_TAGBITS) so that the extra
   bits, if available, become visible in si_addr.

 - Default to 32-bit wide ZONE_DMA. Previously, ZONE_DMA was set to the
   lowest 1GB to cope with the Raspberry Pi 4 limitations, to the
   detriment of other platforms. With these changes, the kernel scans
   the Device Tree dma-ranges and the ACPI IORT information before
   deciding on a smaller ZONE_DMA.

 - Strengthen READ_ONCE() to acquire when CONFIG_LTO=y. When building
   with LTO, there is an increased risk of the compiler converting an
   address dependency headed by a READ_ONCE() invocation into a control
   dependency and consequently allowing for harmful reordering by the
   CPU.

 - Add CPPC FFH support using arm64 AMU counters.

 - set_fs() removal on arm64. This renders the User Access Override
   (UAO) ARMv8 feature unnecessary.

 - Perf updates: PMU driver for the ARM DMC-620 memory controller, sysfs
   identifier file for SMMUv3, stop event counters support for i.MX8MP,
   enable the perf events-based hard lockup detector.

 - Reorganise the kernel VA space slightly so that 52-bit VA
   configurations can use more virtual address space.

 - Improve the robustness of the arm64 memory offline event notifier.

 - Pad the Image header to 64K following the EFI header definition
   updated recently to increase the section alignment to 64K.

 - Support CONFIG_CMDLINE_EXTEND on arm64.

 - Do not use tagged PC in the kernel (TCR_EL1.TBID1==1), freeing up 8
   bits for PtrAuth.

 - Switch to vmapped shadow call stacks.

 - Miscellaneous clean-ups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (78 commits)
  perf/imx_ddr: Add system PMU identifier for userspace
  bindings: perf: imx-ddr: add compatible string
  arm64: Fix build failure when HARDLOCKUP_DETECTOR_PERF is enabled
  arm64: mte: fix prctl(PR_GET_TAGGED_ADDR_CTRL) if TCF0=NONE
  arm64: mark __system_matches_cap as __maybe_unused
  arm64: uaccess: remove vestigal UAO support
  arm64: uaccess: remove redundant PAN toggling
  arm64: uaccess: remove addr_limit_user_check()
  arm64: uaccess: remove set_fs()
  arm64: uaccess cleanup macro naming
  arm64: uaccess: split user/kernel routines
  arm64: uaccess: refactor __{get,put}_user
  arm64: uaccess: simplify __copy_user_flushcache()
  arm64: uaccess: rename privileged uaccess routines
  arm64: sdei: explicitly simulate PAN/UAO entry
  arm64: sdei: move uaccess logic to arch/arm64/
  arm64: head.S: always initialize PSTATE
  arm64: head.S: cleanup SCTLR_ELx initialization
  arm64: head.S: rename el2_setup -> init_kernel_el
  arm64: add C wrappers for SET_PSTATE_*()
  ...
2020-12-14 16:24:30 -08:00
Jakub Kicinski
a6b5e026e6 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-12-14

1) Expose bpf_sk_storage_*() helpers to iterator programs, from Florent Revest.

2) Add AF_XDP selftests based on veth devs to BPF selftests, from Weqaar Janjua.

3) Support for finding BTF based kernel attach targets through libbpf's
   bpf_program__set_attach_target() API, from Andrii Nakryiko.

4) Permit pointers on stack for helper calls in the verifier, from Yonghong Song.

5) Fix overflows in hash map elem size after rlimit removal, from Eric Dumazet.

6) Get rid of direct invocation of llc in BPF selftests, from Andrew Delgadillo.

7) Fix xsk_recvmsg() to reorder socket state check before access, from Björn Töpel.

8) Add new libbpf API helper to retrieve ring buffer epoll fd, from Brendan Jackman.

9) Batch of minor BPF selftest improvements all over the place, from Florian Lehner,
   KP Singh, Jiri Olsa and various others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (31 commits)
  selftests/bpf: Add a test for ptr_to_map_value on stack for helper access
  bpf: Permits pointers on stack for helper calls
  libbpf: Expose libbpf ring_buffer epoll_fd
  selftests/bpf: Add set_attach_target() API selftest for module target
  libbpf: Support modules in bpf_program__set_attach_target() API
  selftests/bpf: Silence ima_setup.sh when not running in verbose mode.
  selftests/bpf: Drop the need for LLVM's llc
  selftests/bpf: fix bpf_testmod.ko recompilation logic
  samples/bpf: Fix possible hang in xdpsock with multiple threads
  selftests/bpf: Make selftest compilation work on clang 11
  selftests/bpf: Xsk selftests - adding xdpxceiver to .gitignore
  selftests/bpf: Drop tcp-{client,server}.py from Makefile
  selftests/bpf: Xsk selftests - Bi-directional Sockets - SKB, DRV
  selftests/bpf: Xsk selftests - Socket Teardown - SKB, DRV
  selftests/bpf: Xsk selftests - DRV POLL, NOPOLL
  selftests/bpf: Xsk selftests - SKB POLL, NOPOLL
  selftests/bpf: Xsk selftests framework
  bpf: Only provide bpf_sock_from_file with CONFIG_NET
  bpf: Return -ENOTSUPP when attaching to non-kernel BTF
  xsk: Validate socket state in xsk_recvmsg, prior touching socket members
  ...
====================

Link: https://lore.kernel.org/r/20201214214316.20642-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-14 15:34:36 -08:00
Uladzislau Rezki (Sony)
1b04fa9900 rcu-tasks: Move RCU-tasks initialization to before early_initcall()
PowerPC testing encountered boot failures due to RCU Tasks not being
fully initialized until core_initcall() time.  This commit therefore
initializes RCU Tasks (along with Rude RCU and RCU Tasks Trace) just
before early_initcall() time, thus allowing waiting on RCU Tasks grace
periods from early_initcall() handlers.

Link: https://lore.kernel.org/rcu/87eekfh80a.fsf@dja-thinkpad.axtens.net/
Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
Tested-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-12-14 15:31:13 -08:00
Yonghong Song
cd17d38f8b bpf: Permits pointers on stack for helper calls
Currently, when checking stack memory accessed by helper calls,
for spills, only PTR_TO_BTF_ID and SCALAR_VALUE are
allowed.

Song discovered an issue where the below bpf program
  int dump_task(struct bpf_iter__task *ctx)
  {
    struct seq_file *seq = ctx->meta->seq;
    static char[] info = "abc";
    BPF_SEQ_PRINTF(seq, "%s\n", info);
    return 0;
  }
may cause a verifier failure.

The verifier output looks like:
  ; struct seq_file *seq = ctx->meta->seq;
  1: (79) r1 = *(u64 *)(r1 +0)
  ; BPF_SEQ_PRINTF(seq, "%s\n", info);
  2: (18) r2 = 0xffff9054400f6000
  4: (7b) *(u64 *)(r10 -8) = r2
  5: (bf) r4 = r10
  ;
  6: (07) r4 += -8
  ; BPF_SEQ_PRINTF(seq, "%s\n", info);
  7: (18) r2 = 0xffff9054400fe000
  9: (b4) w3 = 4
  10: (b4) w5 = 8
  11: (85) call bpf_seq_printf#126
   R1_w=ptr_seq_file(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=4,imm=0)
  R3_w=inv4 R4_w=fp-8 R5_w=inv8 R10=fp0 fp-8_w=map_value
  last_idx 11 first_idx 0
  regs=8 stack=0 before 10: (b4) w5 = 8
  regs=8 stack=0 before 9: (b4) w3 = 4
  invalid indirect read from stack off -8+0 size 8

Basically, the verifier complains the map_value pointer at "fp-8" location.
To fix the issue, if env->allow_ptr_leaks is true, let us also permit
pointers on the stack to be accessible by the helper.

Reported-by: Song Liu <songliubraving@fb.com>
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201210013349.943719-1-yhs@fb.com
2020-12-14 21:50:10 +01:00
Linus Torvalds
9e4b0d55d8 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Add speed testing on 1420-byte blocks for networking

  Algorithms:
   - Improve performance of chacha on ARM for network packets
   - Improve performance of aegis128 on ARM for network packets

  Drivers:
   - Add support for Keem Bay OCS AES/SM4
   - Add support for QAT 4xxx devices
   - Enable crypto-engine retry mechanism in caam
   - Enable support for crypto engine on sdm845 in qce
   - Add HiSilicon PRNG driver support"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (161 commits)
  crypto: qat - add capability detection logic in qat_4xxx
  crypto: qat - add AES-XTS support for QAT GEN4 devices
  crypto: qat - add AES-CTR support for QAT GEN4 devices
  crypto: atmel-i2c - select CONFIG_BITREVERSE
  crypto: hisilicon/trng - replace atomic_add_return()
  crypto: keembay - Add support for Keem Bay OCS AES/SM4
  dt-bindings: Add Keem Bay OCS AES bindings
  crypto: aegis128 - avoid spurious references crypto_aegis128_update_simd
  crypto: seed - remove trailing semicolon in macro definition
  crypto: x86/poly1305 - Use TEST %reg,%reg instead of CMP $0,%reg
  crypto: x86/sha512 - Use TEST %reg,%reg instead of CMP $0,%reg
  crypto: aesni - Use TEST %reg,%reg instead of CMP $0,%reg
  crypto: cpt - Fix sparse warnings in cptpf
  hwrng: ks-sa - Add dependency on IOMEM and OF
  crypto: lib/blake2s - Move selftest prototype into header file
  crypto: arm/aes-ce - work around Cortex-A57/A72 silion errata
  crypto: ecdh - avoid unaligned accesses in ecdh_set_secret()
  crypto: ccree - rework cache parameters handling
  crypto: cavium - Use dma_set_mask_and_coherent to simplify code
  crypto: marvell/octeontx - Use dma_set_mask_and_coherent to simplify code
  ...
2020-12-14 12:18:19 -08:00
Rafael J. Wysocki
30c768829a Merge branch 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull ARM cpufreq updates for 5.11-rc1 from Viresh Kumar:

"This contains the following updates:

 - Fix imx's NVMEM_IMX_OCOTP dependency (Arnd Bergmann).

 - Add support for mt8167 and blacklist mt8516 (Fabien Parent).

 - Some ->get() callback related cleanups to the tegra194 driver and
   some optimizations in tegra186 driver (Jon Hunter and Sumit Gupta).

 - Power scale improvements to arm_scmi driver (Lukasz Luba).

 - Add missing MODULE_DEVICE_TABLE and MODULE_ALIAS to several drivers
   (Pali Rohár).

 - Fix error path in mediatek driver (Qinglang Miao).

 - Fix memleak in ST's cpufreq driver (Yangtao Li)."

* 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: (22 commits)
  cpufreq: arm_scmi: Discover the power scale in performance protocol
  firmware: arm_scmi: Add power_scale_mw_get() interface
  cpufreq: tegra194: Rename tegra194_get_speed_common function
  cpufreq: tegra194: Remove unnecessary frequency calculation
  cpufreq: tegra186: Simplify cluster information lookup
  cpufreq: tegra186: Fix sparse 'incorrect type in assignment' warning
  cpufreq: imx: fix NVMEM_IMX_OCOTP dependency
  cpufreq: vexpress-spc: Add missing MODULE_ALIAS
  cpufreq: scpi: Add missing MODULE_ALIAS
  cpufreq: loongson1: Add missing MODULE_ALIAS
  cpufreq: sun50i: Add missing MODULE_DEVICE_TABLE
  cpufreq: st: Add missing MODULE_DEVICE_TABLE
  cpufreq: qcom: Add missing MODULE_DEVICE_TABLE
  cpufreq: mediatek: Add missing MODULE_DEVICE_TABLE
  cpufreq: highbank: Add missing MODULE_DEVICE_TABLE
  cpufreq: ap806: Add missing MODULE_DEVICE_TABLE
  cpufreq: mediatek: add missing platform_driver_unregister() on error in mtk_cpufreq_driver_init
  cpufreq: tegra194: get consistent cpuinfo_cur_freq
  cpufreq: blacklist mt8516 in cpufreq-dt-platdev
  cpufreq: mediatek: Add support for mt8167
  ...
2020-12-14 20:29:50 +01:00
Steven Rostedt (VMware)
adab66b71a Revert: "ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS"
It was believed that metag was the only architecture that required the ring
buffer to keep 8 byte words aligned on 8 byte architectures, and with its
removal, it was assumed that the ring buffer code did not need to handle
this case. It appears that sparc64 also requires this.

The following was reported on a sparc64 boot up:

   kernel: futex hash table entries: 65536 (order: 9, 4194304 bytes, linear)
   kernel: Running postponed tracer tests:
   kernel: Testing tracer function:
   kernel: Kernel unaligned access at TPC[552a20] trace_function+0x40/0x140
   kernel: Kernel unaligned access at TPC[552a24] trace_function+0x44/0x140
   kernel: Kernel unaligned access at TPC[552a20] trace_function+0x40/0x140
   kernel: Kernel unaligned access at TPC[552a24] trace_function+0x44/0x140
   kernel: Kernel unaligned access at TPC[552a20] trace_function+0x40/0x140
   kernel: PASSED

Need to put back the 64BIT aligned code for the ring buffer.

Link: https://lore.kernel.org/r/CADxRZqzXQRYgKc=y-KV=S_yHL+Y8Ay2mh5ezeZUnpRvg+syWKw@mail.gmail.com

Cc: stable@vger.kernel.org
Fixes: 86b3de60a0b6 ("ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS")
Reported-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-12-14 12:33:51 -05:00
Qiujun Huang
74e2afc6df ring-buffer: Add rb_check_bpage in __rb_allocate_pages
It may be better to check each page is aligned by 4 bytes. The 2
least significant bits of the address will be used as flags.

Link: https://lkml.kernel.org/r/20201015113842.2921-1-hqjagain@gmail.com

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-12-14 12:26:32 -05:00
Qiujun Huang
82db909e6b ring-buffer: Fix two typos in comments
s/inerrupting/interrupting/
s/beween/between/

Link: https://lkml.kernel.org/r/20201014152749.29986-1-hqjagain@gmail.com

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-12-14 12:21:27 -05:00
Lukas Bulwahn
3b3493531c tracing: Drop unneeded assignment in ring_buffer_resize()
Since commit 0a1754b2a97e ("ring-buffer: Return 0 on success from
ring_buffer_resize()"), computing the size is not needed anymore.

Drop unneeded assignment in ring_buffer_resize().

Link: https://lkml.kernel.org/r/20201214084503.3079-1-lukas.bulwahn@gmail.com

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-12-14 12:09:54 -05:00
Masami Hiramatsu
60efe21e59 tracing: Disable ftrace selftests when any tracer is running
Disable ftrace selftests when any tracer (kernel command line options
like ftrace=, trace_events=, kprobe_events=, and boot-time tracing)
starts running because selftest can disturb it.

Currently ftrace= and trace_events= are checked, but kprobe_events
has a different flag, and boot-time tracing didn't checked. This unifies
the disabled flag and all of those boot-time tracing features sets
the flag.

This also fixes warnings on kprobe-event selftest
(CONFIG_FTRACE_STARTUP_TEST=y and CONFIG_KPROBE_EVENTS=y) with boot-time
tracing (ftrace.event.kprobes.EVENT.probes) like below;

[   59.803496] trace_kprobe: Testing kprobe tracing:
[   59.804258] ------------[ cut here ]------------
[   59.805682] WARNING: CPU: 3 PID: 1 at kernel/trace/trace_kprobe.c:1987 kprobe_trace_self_tests_ib
[   59.806944] Modules linked in:
[   59.807335] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc7+ #172
[   59.808029] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/204
[   59.808999] RIP: 0010:kprobe_trace_self_tests_init+0x5f/0x42b
[   59.809696] Code: e8 03 00 00 48 c7 c7 30 8e 07 82 e8 6d 3c 46 ff 48 c7 c6 00 b2 1a 81 48 c7 c7 7
[   59.812439] RSP: 0018:ffffc90000013e78 EFLAGS: 00010282
[   59.813038] RAX: 00000000ffffffef RBX: 0000000000000000 RCX: 0000000000049443
[   59.813780] RDX: 0000000000049403 RSI: 0000000000049403 RDI: 000000000002deb0
[   59.814589] RBP: ffffc90000013e90 R08: 0000000000000001 R09: 0000000000000001
[   59.815349] R10: 0000000000000001 R11: 0000000000000000 R12: 00000000ffffffef
[   59.816138] R13: ffff888004613d80 R14: ffffffff82696940 R15: ffff888004429138
[   59.816877] FS:  0000000000000000(0000) GS:ffff88807dcc0000(0000) knlGS:0000000000000000
[   59.817772] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.818395] CR2: 0000000001a8dd38 CR3: 0000000002222000 CR4: 00000000000006a0
[   59.819144] Call Trace:
[   59.819469]  ? init_kprobe_trace+0x6b/0x6b
[   59.819948]  do_one_initcall+0x5f/0x300
[   59.820392]  ? rcu_read_lock_sched_held+0x4f/0x80
[   59.820916]  kernel_init_freeable+0x22a/0x271
[   59.821416]  ? rest_init+0x241/0x241
[   59.821841]  kernel_init+0xe/0x10f
[   59.822251]  ret_from_fork+0x22/0x30
[   59.822683] irq event stamp: 16403349
[   59.823121] hardirqs last  enabled at (16403359): [<ffffffff810db81e>] console_unlock+0x48e/0x580
[   59.824074] hardirqs last disabled at (16403368): [<ffffffff810db786>] console_unlock+0x3f6/0x580
[   59.825036] softirqs last  enabled at (16403200): [<ffffffff81c0033a>] __do_softirq+0x33a/0x484
[   59.825982] softirqs last disabled at (16403087): [<ffffffff81a00f02>] asm_call_irq_on_stack+0x10
[   59.827034] ---[ end trace 200c544775cdfeb3 ]---
[   59.827635] trace_kprobe: error on probing function entry.

Link: https://lkml.kernel.org/r/160741764955.3448999.3347769358299456915.stgit@devnote2

Fixes: 4d655281eb1b ("tracing/boot Add kprobe event support")
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-12-14 12:05:03 -05:00
Petr Mladek
5ed37174e6 Merge branch 'for-5.11' into for-linus 2020-12-14 15:15:07 +01:00
Petr Mladek
5f3b8d3986 Merge branch 'for-5.11-null-console' into for-linus 2020-12-14 15:14:57 +01:00
Linus Torvalds
ec6f5e0e5c A set of x86 and membarrier fixes:
- Correct a few problems in the x86 and the generic membarrier
     implementation. Small corrections for assumptions about visibility
     which have turned out not to be true.
 
   - Make the PAT bits for memory encryption correct vs. 4K and 2M/1G page
     table entries as they are at a different location.
 
   - Fix a concurrency issue in the the local bandwidth readout of resource
     control leading to incorrect values
 
   - Fix the ordering of allocating a vector for an interrupt. The order
     missed to respect the provided cpumask when the first attempt of
     allocating node local in the mask fails. It then tries the node instead
     of trying the full provided mask first. This leads to erroneous error
     messages and breaking the (user) supplied affinity request. Reorder it.
 
   - Make the INT3 padding detection in optprobe work correctly.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/WOw0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYodUhEACW02Al+mlpZjOhzIGwNUALGX7YGEOi
 wJoiIDTV2vOtKhcJ/AOVn05CznoWXa6PN1+7pm0nR2M1XaiXZ3B6HMjy1+0rpSqK
 5dCCEqr+QowwMRsKd4hnuJowR/7Og66FiYIcknDJg/1egg+RXy7yKds577W6/KW0
 2VV/x35xDywERiQ28qIftQ4L7NREyiioomN/GFpeoQf8tQF06Rb4t12T5pdA6D8S
 /C1wj0IKVRMJo/7HbB/X6skxPOK0PWAEBpaT0+Q66VKSNnFVN5ap+rGTKdUoTmuM
 NdxHOukpHyFCQtbRPOjRUeeSZJLKSX5oXsBO1GUvyyxcT2XNlTOdNamYLWyO+RfV
 uP97qhzYDKL+cgDkwypJ0WOzOb9EIXOh4P9BTnJFBGhc4EQwen3cpb3CyWWftjnv
 /obXiRDnAOE6P6H2AGiwrK3Pny+SvgrFYKMe+iy+ntToz1yrDh7ZrZ0DQKVoFWEr
 n3qUnlPZmVvRzHIRVYoK69nS/UgmNN0LssavzRBzab3BcK93f23QkW86P42kNCLa
 9kL+ZgGwlpqUZZR/p3pHq9Mv2ZXGEdUYY99h2vy1fgMwOw/RQZnPDAj/131UXlsU
 4DL0mAlTK1/0/81H6/V1GDBkMkO+hN4x6Y3asi7bPHEKEzlLe+P1tUt3YELd3CEC
 dc8ebICG1InXsw==
 =t31y
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of x86 and membarrier fixes:

   - Correct a few problems in the x86 and the generic membarrier
     implementation. Small corrections for assumptions about visibility
     which have turned out not to be true.

   - Make the PAT bits for memory encryption correct vs 4K and 2M/1G
     page table entries as they are at a different location.

   - Fix a concurrency issue in the the local bandwidth readout of
     resource control leading to incorrect values

   - Fix the ordering of allocating a vector for an interrupt. The order
     missed to respect the provided cpumask when the first attempt of
     allocating node local in the mask fails. It then tries the node
     instead of trying the full provided mask first. This leads to
     erroneous error messages and breaking the (user) supplied affinity
     request. Reorder it.

   - Make the INT3 padding detection in optprobe work correctly"

* tag 'x86-urgent-2020-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kprobes: Fix optprobe to detect INT3 padding correctly
  x86/apic/vector: Fix ordering in vector assignment
  x86/resctrl: Fix incorrect local bandwidth when mba_sc is enabled
  x86/mm/mem_encrypt: Fix definition of PMD_FLAGS_DEC_WP
  membarrier: Execute SYNC_CORE on the calling thread
  membarrier: Explicitly sync remote cores when SYNC_CORE is requested
  membarrier: Add an actual barrier before rseq_preempt()
  x86/membarrier: Get rid of a dubious optimization
2020-12-13 11:31:19 -08:00
Jens Axboe
e296dc4996 kernel: remove checking for TIF_NOTIFY_SIGNAL
It's available everywhere now, no need to check or add dummy defines.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-12 09:17:38 -07:00
Jens Axboe
98b89b649f signal: kill JOBCTL_TASK_WORK
It's no longer used, get rid of it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-12 09:17:38 -07:00
Jens Axboe
03941ccfda task_work: remove legacy TWA_SIGNAL path
All archs now support TIF_NOTIFY_SIGNAL.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-12 09:17:38 -07:00
Jakub Kicinski
46d5e62dd3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
xdp_return_frame_bulk() needs to pass a xdp_buff
to __xdp_return().

strlcpy got converted to strscpy but here it makes no
functional difference, so just keep the right code.

Conflicts:
	net/netfilter/nf_tables_api.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-11 22:29:38 -08:00
Thomas Gleixner
aa3b66f401 tick/sched: Make jiffies update quick check more robust
The quick check in tick_do_update_jiffies64() whether jiffies need to be
updated is not really correct under all circumstances and on all
architectures, especially not on 32bit systems.

The quick check does:

    if (now < READ_ONCE(tick_next_period))
    	return;

and the counterpart in the update is:

    WRITE_ONCE(tick_next_period, next_update_time);

This has two problems:

  1) On weakly ordered architectures there is no guarantee that the stores
     before the WRITE_ONCE() are visible which means that other CPUs can
     operate on a stale jiffies value.

  2) On 32bit the store of tick_next_period which is an u64 is split into
     two 32bit stores. If the first 32bit store advances tick_next_period
     far out and the second 32bit store is delayed (virt, NMI ...) then
     jiffies will become stale until the second 32bit store happens.

Address this by seperating the handling for 32bit and 64bit.

On 64bit problem #1 is addressed by replacing READ_ONCE() / WRITE_ONCE()
with smp_load_acquire() / smp_store_release().

On 32bit problem #2 is addressed by protecting the quick check with the
jiffies sequence counter. The load and stores can be plain because the
sequence count mechanics provides the required barriers already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/87czzpc02w.fsf@nanos.tec.linutronix.de
2020-12-11 23:19:10 +01:00
Andrii Nakryiko
b7906b70a2 bpf: Fix enum names for bpf_this_cpu_ptr() and bpf_per_cpu_ptr() helpers
Remove bpf_ prefix, which causes these helpers to be reported in verifier
dump as bpf_bpf_this_cpu_ptr() and bpf_bpf_per_cpu_ptr(), respectively. Lets
fix it as long as it is still possible before UAPI freezes on these helpers.

Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11 14:19:07 -08:00
Arnd Bergmann
6e7b64b9dd elfcore: fix building with clang
kernel/elfcore.c only contains weak symbols, which triggers a bug with
clang in combination with recordmcount:

  Cannot find symbol for section 2: .text.
  kernel/elfcore.o: failed

Move the empty stubs into linux/elfcore.h as inline functions.  As only
two architectures use these, just use the architecture specific Kconfig
symbols to key off the declaration.

Link: https://lkml.kernel.org/r/20201204165742.3815221-2-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Barret Rhoden <brho@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11 14:02:14 -08:00
Ashish Kalra
e998879d4f x86,swiotlb: Adjust SWIOTLB bounce buffer size for SEV guests
For SEV, all DMA to and from guest has to use shared (un-encrypted) pages.
SEV uses SWIOTLB to make this happen without requiring changes to device
drivers.  However, depending on the workload being run, the default 64MB
of it might not be enough and it may run out of buffers to use for DMA,
resulting in I/O errors and/or performance degradation for high
I/O workloads.

Adjust the default size of SWIOTLB for SEV guests using a
percentage of the total memory available to guest for the SWIOTLB buffers.

Adds a new sev_setup_arch() function which is invoked from setup_arch()
and it calls into a new swiotlb generic code function swiotlb_adjust_size()
to do the SWIOTLB buffer adjustment.

v5 fixed build errors and warnings as
Reported-by: kbuild test robot <lkp@intel.com>

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Co-developed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2020-12-11 15:43:41 -05:00
Rafael J. Wysocki
90ac908a41 cpufreq: schedutil: Simplify sugov_update_next_freq()
Rearrange a conditional to make it more straightforward.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-12-11 19:53:58 +01:00
John Garry
1d3aec8928 genirq/affinity: Add irq_update_affinity_desc()
Add a function to allow the affinity of an interrupt be switched to
managed, such that interrupts allocated for platform devices may be
managed.

This new interface has certain limitations, and attempts to use it in the
following circumstances will fail:
- For when the kernel is configured for generic IRQ reservation mode (in
  config GENERIC_IRQ_RESERVATION_MODE). The reason being that it could
  conflict with managed vs. non-managed interrupt accounting.
- The interrupt is already started, which should not be the case during
  init
- The interrupt is already configured as managed, which means double init

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1606905417-183214-2-git-send-email-john.garry@huawei.com
2020-12-11 14:47:50 +00:00
Valentin Schneider
b388fa5014 Revert "genirq: Add fasteoi IPI flow"
handle_percpu_devid_fasteoi_ipi() has no more users, and
handle_percpu_devid_irq() can do all that it was supposed to do. Get rid of
it.

This reverts commit c5e5ec033c4ab25c53f1fd217849e75deb0bf7bf.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201109094121.29975-6-valentin.schneider@arm.com
2020-12-11 14:47:50 +00:00
Thomas Gleixner
76e87d96b3 ntp: Consolidate the RTC update implementation
The code for the legacy RTC and the RTC class based update are pretty much
the same. Consolidate the common parts into one function and just invoke
the actual setter functions.

For RTC class based devices the update code checks whether the offset is
valid for the device, which is usually not the case for the first
invocation. If it's not the same it stores the correct offset and lets the
caller try again. That's not much different from the previous approach
where the first invocation had a pretty low probability to actually hit the
allowed window.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.355743355@linutronix.de
2020-12-11 10:40:53 +01:00
Thomas Gleixner
69eca258c8 ntp: Make the RTC sync offset less obscure
The current RTC set_offset_nsec value is not really intuitive to
understand. 

  tsched       twrite(t2.tv_sec - 1) 	 t2 (seconds increment)

The offset is calculated from twrite based on the assumption that t2 -
twrite == 1s. That means for the MC146818 RTC the offset needs to be
negative so that the write happens 500ms before t2.

It's easier to understand when the whole calculation is based on t2. That
avoids negative offsets and the meaning is obvious:

 t2 - twrite:     The time defined by the chip when seconds increment
      		  after the write.

 twrite - tsched: The time for the transport to the point where the chip
 	  	  is updated. 

==> set_offset_nsec =  t2 - tsched
    ttransport      =  twrite - tsched
    tRTCinc         =  t2 - twrite
==> set_offset_nsec =  ttransport + tRTCinc

tRTCinc is a chip property and can be obtained from the data sheet.

ttransport depends on how the RTC is connected. It is close to 0 for
directly accessible RTCs. For RTCs behind a slow bus, e.g. i2c, it's the
time required to send the update over the bus. This can be estimated or
even calibrated, but that's a different problem.

Adjust the implementation and update comments accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.263204937@linutronix.de
2020-12-11 10:40:53 +01:00
Thomas Gleixner
33e62e8323 ntp, rtc: Move rtc_set_ntp_time() to ntp code
rtc_set_ntp_time() is not really RTC functionality as the code is just a
user of RTC. Move it into the NTP code which allows further cleanups.

Requested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.166871172@linutronix.de
2020-12-11 10:40:52 +01:00
Thomas Gleixner
c9e6189fb0 ntp: Make the RTC synchronization more reliable
Miroslav reported that the periodic RTC synchronization in the NTP code
fails more often than not to hit the specified update window.

The reason is that the code uses delayed_work to schedule the update which
needs to be in thread context as the underlying RTC might be connected via
a slow bus, e.g. I2C. In the update function it verifies whether the
current time is correct vs. the requirements of the underlying RTC.

But delayed_work is using the timer wheel for scheduling which is
inaccurate by design. Depending on the distance to the expiry the wheel
gets less granular to allow batching and to avoid the cascading of the
original timer wheel. See 500462a9de65 ("timers: Switch to a non-cascading
wheel") and the code for further details.

The code already deals with this by splitting the 660 seconds period into a
long 659 seconds timer and then retrying with a smaller delta.

But looking at the actual granularities of the timer wheel (which depend on
the HZ configuration) the 659 seconds timer ends up in an outer wheel level
and is affected by a worst case granularity of:

HZ          Granularity
1000        32s
 250        16s
 100        40s

So the initial timer can be already off by max 12.5% which is not a big
issue as the period of the sync is defined as ~11 minutes.

The fine grained second attempt schedules to the desired update point with
a timer expiring less than a second from now. Depending on the actual delta
and the HZ setting even the second attempt can end up in outer wheel levels
which have a large enough granularity to make the correctness check fail.

As this is a fundamental property of the timer wheel there is no way to
make this more accurate short of iterating in one jiffies steps towards the
update point.

Switch it to an hrtimer instead which schedules the actual update work. The
hrtimer will expire precisely (max 1 jiffie delay when high resolution
timers are not available). The actual scheduling delay of the work is the
same as before.

The update is triggered from do_adjtimex() which is a bit racy but not much
more racy than it was before:

     if (ntp_synced())
     	queue_delayed_work(system_power_efficient_wq, &sync_work, 0);

which is racy when the work is currently executed and has not managed to
reschedule itself.

This becomes now:

     if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
     	queue_work(system_power_efficient_wq, &sync_work, 0);

which is racy when the hrtimer has expired and the work is currently
executed and has not yet managed to rearm the hrtimer.

Not a big problem as it just schedules work for nothing.

The new implementation has a safe guard in place to catch the case where
the hrtimer is queued on entry to the work function and avoids an extra
update attempt of the RTC that way.

Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Miroslav Lichvar <mlichvar@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.062910520@linutronix.de
2020-12-11 10:40:52 +01:00
Barry Song
5b78f2dc31 sched/fair: Trivial correction of the newidle_balance() comment
idle_balance() has been renamed to newidle_balance(). To differentiate
with nohz_idle_balance, it seems refining the comment will be helpful
for the readers of the code.

Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20201202220641.22752-1-song.bao.hua@hisilicon.com
2020-12-11 10:30:44 +01:00
Mel Gorman
13d5a5e9f9 sched/fair: Clear SMT siblings after determining the core is not idle
The clearing of SMT siblings from the SIS mask before checking for an idle
core is a small but unnecessary cost. Defer the clearing of the siblings
until the scan moves to the next potential target. The cost of this was
not measured as it is borderline noise but it should be self-evident.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20201130144020.GS3371@techsingularity.net
2020-12-11 10:30:38 +01:00
Mauro Carvalho Chehab
59a74b1544 sched: Fix kernel-doc markup
Kernel-doc requires that a kernel-doc markup to be immediately
below the function prototype, as otherwise it will rename it.
So, move sys_sched_yield() markup to the right place.

Also fix the cpu_util() markup: Kernel-doc markups
should use this format:
        identifier - description

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/50cd6f460aeb872ebe518a8e9cfffda2df8bdb0a.1606823973.git.mchehab+huawei@kernel.org
2020-12-11 10:30:31 +01:00
Linus Torvalds
4d31058b82 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) IPsec compat fixes, from Dmitry Safonov.

 2) Fix memory leak in xfrm_user_policy(). Fix from Yu Kuai.

 3) Fix polling in xsk sockets by using sk_poll_wait() instead of
    datagram_poll() which keys off of sk_wmem_alloc and such which xsk
    sockets do not update. From Xuan Zhuo.

 4) Missing init of rekey_data in cfgh80211, from Sara Sharon.

 5) Fix destroy of timer before init, from Davide Caratti.

 6) Missing CRYPTO_CRC32 selects in ethernet driver Kconfigs, from Arnd
    Bergmann.

 7) Missing error return in rtm_to_fib_config() switch case, from Zhang
    Changzhong.

 8) Fix some src/dest address handling in vrf and add a testcase. From
    Stephen Suryaputra.

 9) Fix multicast handling in Seville switches driven by mscc-ocelot
    driver. From Vladimir Oltean.

10) Fix proto value passed to skb delivery demux in udp, from Xin Long.

11) HW pkt counters not reported correctly in enetc driver, from Claudiu
    Manoil.

12) Fix deadlock in bridge, from Joseph Huang.

13) Missing of_node_pur() in dpaa2 driver, fromn Christophe JAILLET.

14) Fix pid fetching in bpftool when there are a lot of results, from
    Andrii Nakryiko.

15) Fix long timeouts in nft_dynset, from Pablo Neira Ayuso.

16) Various stymmac fixes, from Fugang Duan.

17) Fix null deref in tipc, from Cengiz Can.

18) When mss is biog, coose more resonable rcvq_space in tcp, fromn Eric
    Dumazet.

19) Revert a geneve change that likely isnt necessary, from Jakub
    Kicinski.

20) Avoid premature rx buffer reuse in various Intel driversm from Björn
    Töpel.

21) retain EcT bits during TIS reflection in tcp, from Wei Wang.

22) Fix Tso deferral wrt. cwnd limiting in tcp, from Neal Cardwell.

23) MPLS_OPT_LSE_LABEL attribute is 342 ot 8 bits, from Guillaume Nault

24) Fix propagation of 32-bit signed bounds in bpf verifier and add test
    cases, from Alexei Starovoitov.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
  selftests: fix poll error in udpgro.sh
  selftests/bpf: Fix "dubious pointer arithmetic" test
  selftests/bpf: Fix array access with signed variable test
  selftests/bpf: Add test for signed 32-bit bound check bug
  bpf: Fix propagation of 32-bit signed bounds from 64-bit bounds.
  MAINTAINERS: Add entry for Marvell Prestera Ethernet Switch driver
  net: sched: Fix dump of MPLS_OPT_LSE_LABEL attribute in cls_flower
  net/mlx4_en: Handle TX error CQE
  net/mlx4_en: Avoid scheduling restart task if it is already running
  tcp: fix cwnd-limited bug for TSO deferral where we send nothing
  net: flow_offload: Fix memory leak for indirect flow block
  tcp: Retain ECT bits for tos reflection
  ethtool: fix stack overflow in ethnl_parse_bitset()
  e1000e: fix S0ix flow to allow S0i3.2 subset entry
  ice: avoid premature Rx buffer reuse
  ixgbe: avoid premature Rx buffer reuse
  i40e: avoid premature Rx buffer reuse
  igb: avoid transmit queue timeout in xdp path
  igb: use xdp_do_flush
  igb: skb add metasize for xdp
  ...
2020-12-10 15:30:13 -08:00
David S. Miller
d9838b1d39 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2020-12-10

The following pull-request contains BPF updates for your *net* tree.

We've added 21 non-merge commits during the last 12 day(s) which contain
a total of 21 files changed, 163 insertions(+), 88 deletions(-).

The main changes are:

1) Fix propagation of 32-bit signed bounds from 64-bit bounds, from Alexei.

2) Fix ring_buffer__poll() return value, from Andrii.

3) Fix race in lwt_bpf, from Cong.

4) Fix test_offload, from Toke.

5) Various xsk fixes.

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Thanks a lot!

Also thanks to reporters, reviewers and testers of commits in this pull-request:

Cong Wang, Hulk Robot, Jakub Kicinski, Jean-Philippe Brucker, John
Fastabend, Magnus Karlsson, Maxim Mikityanskiy, Yonghong Song
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 14:29:30 -08:00
Alexei Starovoitov
b02709587e bpf: Fix propagation of 32-bit signed bounds from 64-bit bounds.
The 64-bit signed bounds should not affect 32-bit signed bounds unless the
verifier knows that upper 32-bits are either all 1s or all 0s. For example the
register with smin_value==1 doesn't mean that s32_min_value is also equal to 1,
since smax_value could be larger than 32-bit subregister can hold.
The verifier refines the smax/s32_max return value from certain helpers in
do_refine_retval_range(). Teach the verifier to recognize that smin/s32_min
value is also bounded. When both smin and smax bounds fit into 32-bit
subregister the verifier can propagate those bounds.

Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-12-10 13:02:53 -08:00
Eric W. Biederman
f7cfd871ae exec: Transform exec_update_mutex into a rw_semaphore
Recently syzbot reported[0] that there is a deadlock amongst the users
of exec_update_mutex.  The problematic lock ordering found by lockdep
was:

   perf_event_open  (exec_update_mutex -> ovl_i_mutex)
   chown            (ovl_i_mutex       -> sb_writes)
   sendfile         (sb_writes         -> p->lock)
     by reading from a proc file and writing to overlayfs
   proc_pid_syscall (p->lock           -> exec_update_mutex)

While looking at possible solutions it occured to me that all of the
users and possible users involved only wanted to state of the given
process to remain the same.  They are all readers.  The only writer is
exec.

There is no reason for readers to block on each other.  So fix
this deadlock by transforming exec_update_mutex into a rw_semaphore
named exec_update_lock that only exec takes for writing.

Cc: Jann Horn <jannh@google.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christopher Yeoh <cyeoh@au1.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Fixes: eea9673250db ("exec: Add exec_update_mutex to replace cred_guard_mutex")
[0] https://lkml.kernel.org/r/00000000000063640c05ade8e3de@google.com
Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/87ft4mbqen.fsf@x220.int.ebiederm.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-12-10 13:13:32 -06:00