34108 Commits

Author SHA1 Message Date
Linus Torvalds
ab5c60b79a Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Add support for allocating transforms on a specific NUMA Node
   - Introduce the flag CRYPTO_ALG_ALLOCATES_MEMORY for storage users

  Algorithms:
   - Drop PMULL based ghash on arm64
   - Fixes for building with clang on x86
   - Add sha256 helper that does the digest in one go
   - Add SP800-56A rev 3 validation checks to dh

  Drivers:
   - Permit users to specify NUMA node in hisilicon/zip
   - Add support for i.MX6 in imx-rngc
   - Add sa2ul crypto driver
   - Add BA431 hwrng driver
   - Add Ingenic JZ4780 and X1000 hwrng driver
   - Spread IRQ affinity in inside-secure and marvell/cesa"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (157 commits)
  crypto: sa2ul - Fix inconsistent IS_ERR and PTR_ERR
  hwrng: core - remove redundant initialization of variable ret
  crypto: x86/curve25519 - Remove unused carry variables
  crypto: ingenic - Add hardware RNG for Ingenic JZ4780 and X1000
  dt-bindings: RNG: Add Ingenic RNG bindings.
  crypto: caam/qi2 - add module alias
  crypto: caam - add more RNG hw error codes
  crypto: caam/jr - remove incorrect reference to caam_jr_register()
  crypto: caam - silence .setkey in case of bad key length
  crypto: caam/qi2 - create ahash shared descriptors only once
  crypto: caam/qi2 - fix error reporting for caam_hash_alloc
  crypto: caam - remove deadcode on 32-bit platforms
  crypto: ccp - use generic power management
  crypto: xts - Replace memcpy() invocation with simple assignment
  crypto: marvell/cesa - irq balance
  crypto: inside-secure - irq balance
  crypto: ecc - SP800-56A rev 3 local public key validation
  crypto: dh - SP800-56A rev 3 local public key validation
  crypto: dh - check validity of Z before export
  lib/mpi: Add mpi_sub_ui()
  ...
2020-08-03 10:40:14 -07:00
Zhaoyang Huang
0f69dae4d1 trace : Have tracing buffer info use kvzalloc instead of kzalloc
High-order allocations within tracing can trigger the OOM killer; use kvzalloc instead.

Below is the call stack we ran across on an Android system. The scenario
happens when traced_probes is woken up to fetch a large quantity of trace
data, even though free memory is still above watermark_low.

traced_probes invoked oom-killer: gfp_mask=0x140c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),  order=2, oom_score_adj=-1

traced_probes cpuset=system-background mems_allowed=0
CPU: 3 PID: 588 Comm: traced_probes Tainted: G        W  O    4.14.181 #1
Hardware name: Generic DT based system
(unwind_backtrace) from [<c010d824>] (show_stack+0x20/0x24)
(show_stack) from [<c0b2e174>] (dump_stack+0xa8/0xec)
(dump_stack) from [<c027d584>] (dump_header+0x9c/0x220)
(dump_header) from [<c027cfe4>] (oom_kill_process+0xc0/0x5c4)
(oom_kill_process) from [<c027cb94>] (out_of_memory+0x220/0x310)
(out_of_memory) from [<c02816bc>] (__alloc_pages_nodemask+0xff8/0x13a4)
(__alloc_pages_nodemask) from [<c02a6a1c>] (kmalloc_order+0x30/0x48)
(kmalloc_order) from [<c02a6a64>] (kmalloc_order_trace+0x30/0x118)
(kmalloc_order_trace) from [<c0223d7c>] (tracing_buffers_open+0x50/0xfc)
(tracing_buffers_open) from [<c02e6f58>] (do_dentry_open+0x278/0x34c)
(do_dentry_open) from [<c02e70d0>] (vfs_open+0x50/0x70)
(vfs_open) from [<c02f7c24>] (path_openat+0x5fc/0x169c)
(path_openat) from [<c02f75c4>] (do_filp_open+0x94/0xf8)
(do_filp_open) from [<c02e7650>] (do_sys_open+0x168/0x26c)
(do_sys_open) from [<c02e77bc>] (SyS_openat+0x34/0x38)
(SyS_openat) from [<c0108bc0>] (ret_fast_syscall+0x0/0x28)

Link: https://lkml.kernel.org/r/1596155265-32365-1-git-send-email-zhaoyang.huang@unisoc.com

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-08-03 11:52:20 -04:00
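
The `order=2` in the OOM report of 0f69dae4d1 means the page allocator was asked for 2^2 = 4 physically contiguous pages, which can fail on a fragmented system even when plenty of memory is free. kvzalloc() sidesteps this by falling back to virtually contiguous memory when the contiguous attempt fails. The following is a minimal userspace sketch of how a request size maps to an order. It assumes 4 KiB pages (the report does not state the page size), and `SKETCH_PAGE_SIZE` and `sketch_alloc_order()` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size; the OOM report does not state it. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Return the smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size.
 * An order-N request needs 2^N physically contiguous pages.
 */
static unsigned int sketch_alloc_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under the 4 KiB assumption, any request larger than 8 KiB and up to 16 KiB lands in order 2, which matches the order shown in the report.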
Thomas Gleixner
3d5128c1de irqchip updates for Linux 5.9
 - Add infrastructure to allow DT irqchip platform drivers to
   be built as modules
 - Allow qcom-pdc, mtk-cirq and mtk-sysirq to be built as module
 - Fix ACPI probing to avoid abusing function pointer casting
 - Allow bcm7120-l2 and brcmstb-l2 to be used as wake-up sources
 - Teach NXP's IMX INTMUX some power management
 - Allow stm32-exti to be used as a hierarchical irqchip
 - Let stm32-exti use the hw spinlock API in its full glory
 - A couple of GICv4.1 fixes
 - Tons of cleanups (mtk-sysirq, aic5, bcm7038-l1, imx-intmux,
   brcmstb-l2, ativic32, ti-sci-inta, loongson, MIPS GIC, GICv3)

Merge tag 'irqchip-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

 - Add infrastructure to allow DT irqchip platform drivers to
   be built as modules
 - Allow qcom-pdc, mtk-cirq and mtk-sysirq to be built as module
 - Fix ACPI probing to avoid abusing function pointer casting
 - Allow bcm7120-l2 and brcmstb-l2 to be used as wake-up sources
 - Teach NXP's IMX INTMUX some power management
 - Allow stm32-exti to be used as a hierarchical irqchip
 - Let stm32-exti use the hw spinlock API in its full glory
 - A couple of GICv4.1 fixes
 - Tons of cleanups (mtk-sysirq, aic5, bcm7038-l1, imx-intmux,
   brcmstb-l2, ativic32, ti-sci-inta, loongson, MIPS GIC, GICv3)
2020-08-03 14:33:23 +02:00
Rafael J. Wysocki
86ba54fb08 Merge branches 'pm-sleep', 'pm-domains', 'powercap' and 'pm-tools'
* pm-sleep:
  PM: sleep: spread "const char *" correctness
  PM: hibernate: fix white space in a few places
  freezer: Add unsafe version of freezable_schedule_timeout_interruptible() for NFS
  PM: sleep: core: Emit changed uevent on wakeup_sysfs_add/remove

* pm-domains:
  PM: domains: Restore comment indentation for generic_pm_domain.child_links
  PM: domains: Fix up terminology with parent/child

* powercap:
  powercap: Add Power Limit4 support
  powercap: idle_inject: Replace play_idle() with play_idle_precise() in comments
  powercap: intel_rapl: add support for Sapphire Rapids

* pm-tools:
  pm-graph v5.7 - important s2idle fixes
  cpupower: Replace HTTP links with HTTPS ones
  cpupower: Fix NULL but dereferenced coccicheck errors
  cpupower: Fix comparing pointer to 0 coccicheck warns
2020-08-03 13:12:44 +02:00
Rafael J. Wysocki
c81b30c895 Merge branch 'pm-cpufreq'
* pm-cpufreq: (24 commits)
  cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode
  cpufreq: intel_pstate: Rearrange the storing of new EPP values
  cpufreq: intel_pstate: Avoid enabling HWP if EPP is not supported
  cpufreq: intel_pstate: Clean up aperf_mperf_shift description
  cpufreq: powernv: Make some symbols static
  cpufreq: amd_freq_sensitivity: Mark sometimes used ID structs as __maybe_unused
  cpufreq: intel_pstate: Supply struct attribute description for get_aperf_mperf_shift()
  cpufreq: pcc-cpufreq: Mark sometimes used ID structs as __maybe_unused
  cpufreq: powernow-k8: Mark 'hi' and 'lo' dummy variables as __always_unused
  cpufreq: acpi-cpufreq: Mark sometimes used ID structs as __maybe_unused
  cpufreq: acpi-cpufreq: Mark 'dummy' variable as __always_unused
  cpufreq: powernv-cpufreq: Fix a bunch of kerneldoc related issues
  cpufreq: pasemi: Include header file for {check,restore}_astate prototypes
  cpufreq: cpufreq_governor: Demote store_sampling_rate() header to standard comment block
  cpufreq: cpufreq: Demote lots of function headers unworthy of kerneldoc status
  cpufreq: freq_table: Demote obvious misuse of kerneldoc to standard comment blocks
  cpufreq: Replace HTTP links with HTTPS ones
  cpufreq: intel_pstate: Fix static checker warning for epp variable
  cpufreq: Remove the weakly defined cpufreq_default_governor()
  cpufreq: Specify default governor on command line
  ...
2020-08-03 13:12:36 +02:00
Rafael J. Wysocki
5b5642075c Merge branches 'pm-em' and 'pm-core'
* pm-em:
  OPP: refactor dev_pm_opp_of_register_em() and update related drivers
  Documentation: power: update Energy Model description
  PM / EM: change name of em_pd_energy to em_cpu_energy
  PM / EM: remove em_register_perf_domain
  PM / EM: add support for other devices than CPUs in Energy Model
  PM / EM: update callback structure and add device pointer
  PM / EM: introduce em_dev_register_perf_domain function
  PM / EM: change naming convention from 'capacity' to 'performance'

* pm-core:
  mmc: jz4740: Use pm_ptr() macro
  PM: Make *_DEV_PM_OPS macros use __maybe_unused
  PM: core: introduce pm_ptr() macro
2020-08-03 13:11:39 +02:00
Ingo Molnar
992414a18c Merge branch 'locking/nmi' into locking/core, to pick up completed topic branch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-08-03 13:00:27 +02:00
Linus Torvalds
c6fe44d96f list: add "list_del_init_careful()" to go with "list_empty_careful()"
That gives us ordering guarantees around the pair.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-02 20:39:44 -07:00
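
The ordering guarantee mentioned in c6fe44d96f can be illustrated in userspace. This is a minimal sketch that assumes the pairing is a release store on the deleting side and an acquire load on the checking side, modeled here with C11 atomics rather than kernel primitives. All `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_list_head {
	_Atomic(struct sketch_list_head *) next;
	struct sketch_list_head *prev;
};

/* Make a node point at itself, i.e. an empty list. */
static void sketch_list_init(struct sketch_list_head *head)
{
	head->prev = head;
	atomic_store_explicit(&head->next, head, memory_order_relaxed);
}

/* Insert entry right after head (no concurrency assumed here). */
static void sketch_list_add(struct sketch_list_head *entry,
			    struct sketch_list_head *head)
{
	struct sketch_list_head *first =
		atomic_load_explicit(&head->next, memory_order_relaxed);

	assert(entry != head);
	entry->prev = head;
	atomic_store_explicit(&entry->next, first, memory_order_relaxed);
	first->prev = entry;
	atomic_store_explicit(&head->next, entry, memory_order_relaxed);
}

/*
 * Unlink entry and reinitialize it. The final store to entry->next is
 * a release, so all earlier writes are visible to a reader that sees
 * the entry as empty through the acquire load below.
 */
static void sketch_list_del_init_careful(struct sketch_list_head *entry)
{
	struct sketch_list_head *next =
		atomic_load_explicit(&entry->next, memory_order_relaxed);
	struct sketch_list_head *prev = entry->prev;

	next->prev = prev;
	atomic_store_explicit(&prev->next, next, memory_order_relaxed);

	entry->prev = entry;
	atomic_store_explicit(&entry->next, entry, memory_order_release);
}

/* Acquire-load next, then check that both links point back at head. */
static bool sketch_list_empty_careful(struct sketch_list_head *head)
{
	struct sketch_list_head *next =
		atomic_load_explicit(&head->next, memory_order_acquire);

	return next == head && next == head->prev;
}
```

A waiter that polls `sketch_list_empty_careful()` on its own entry can rely on seeing everything the waker wrote before calling `sketch_list_del_init_careful()`, which is the point of pairing the two helpers.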
David S. Miller
bd0b33b248 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Resolved kernel/bpf/btf.c using instructions from merge commit
69138b34a7248d2396ab85c8652e20c0c39beaba

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-02 01:02:12 -07:00
Andrii Nakryiko
73b11c2ab0 bpf: Add support for forced LINK_DETACH command
Add LINK_DETACH command to force-detach bpf_link without destroying it. It has
the same behavior as auto-detaching of bpf_link due to cgroup dying for
bpf_cgroup_link or net_device being destroyed for bpf_xdp_link. In such a case,
bpf_link is still a valid kernel object, but is defunct and doesn't hold a BPF
program attached to the corresponding BPF hook. This functionality allows users
with enough access rights to manually force-detach an attached bpf_link without
killing the respective owner process.

This patch implements LINK_DETACH for cgroup, xdp, and netns links, mostly
re-using existing link release handling code.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200731182830.286260-2-andriin@fb.com
2020-08-01 20:38:28 -07:00
Linus Torvalds
ac3a0c8472 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Encap offset calculation is incorrect in esp6, from Sabrina Dubroca.

 2) Better parameter validation in pfkey_dump(), from Mark Salyzyn.

 3) Fix several clang issues on powerpc in selftests, from Tanner Love.

 4) cmsghdr_from_user_compat_to_kern() uses the wrong length, from Al
    Viro.

 5) Out of bounds access in mlx5e driver, from Raed Salem.

 6) Fix transfer buffer memleak in lan78xx, from Johan Hovold.

 7) RCU fixups in rhashtable, from Herbert Xu.

 8) Fix ipv6 nexthop refcnt leak, from Xiyu Yang.

 9) vxlan FDB dump must be done under RCU, from Ido Schimmel.

10) Fix use after free in mlxsw, from Ido Schimmel.

11) Fix map leak in HASH_OF_MAPS bpf code, from Andrii Nakryiko.

12) Fix bug in mac80211 Tx ack status reporting, from Vasanthakumar
    Thiagarajan.

13) Fix memory leaks in IPV6_ADDRFORM code, from Cong Wang.

14) Fix bpf program reference count leaks in mlx5 during
    mlx5e_alloc_rq(), from Xin Xiong.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits)
  vxlan: fix memleak of fdb
  rds: Prevent kernel-infoleak in rds_notify_queue_get()
  net/sched: The error lable position is corrected in ct_init_module
  net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq
  net/mlx5e: E-Switch, Specify flow_source for rule with no in_port
  net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring
  net/mlx5e: CT: Support restore ipv6 tunnel
  net: gemini: Fix missing clk_disable_unprepare() in error path of gemini_ethernet_port_probe()
  ionic: unlock queue mutex in error path
  atm: fix atm_dev refcnt leaks in atmtcp_remove_persistent
  net: ethernet: mtk_eth_soc: fix MTU warnings
  net: nixge: fix potential memory leak in nixge_probe()
  devlink: ignore -EOPNOTSUPP errors on dumpit
  rxrpc: Fix race between recvmsg and sendmsg on immediate call failure
  MAINTAINERS: Replace Thor Thayer as Altera Triple Speed Ethernet maintainer
  selftests/bpf: fix netdevsim trap_flow_action_cookie read
  ipv6: fix memory leaks on IPV6_ADDRFORM path
  net/bpfilter: Initialize pos in __bpfilter_process_sockopt
  igb: reinit_locked() should be called with rtnl_lock
  e1000e: continue to init PHY even when failed to disable ULP
  ...
2020-08-01 16:47:24 -07:00
Linus Torvalds
0ae3495b65 for-linus-2020-08-01

Merge tag 'for-linus-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull thread fix from Christian Brauner:
 "A simple spelling fix for dequeue_synchronous_signal()"

* tag 'for-linus-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  signal: fix typo in dequeue_synchronous_signal()
2020-08-01 16:40:59 -07:00
Christoph Hellwig
ef1dac6021 modules: return licensing information from find_symbol
Report the GPLONLY status through a new argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-08-01 16:05:02 +02:00
Christoph Hellwig
cd8732cdcc modules: rename the licence field in struct symsearch to license
Use the same spelling variant as the rest of the file.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-08-01 16:05:02 +02:00
Christoph Hellwig
34e64705ad modules: unexport __module_address
__module_address is only used by built-in code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-08-01 16:05:01 +02:00
Christoph Hellwig
3fe1e56d0e modules: unexport __module_text_address
__module_text_address is only used by built-in code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-08-01 16:05:00 +02:00
Christoph Hellwig
a54e04914c modules: mark each_symbol_section static
each_symbol_section is only used inside of module.c.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-08-01 16:05:00 +02:00
Christoph Hellwig
773110470e modules: mark find_symbol static
find_symbol is only used in module.c.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-08-01 16:04:59 +02:00
Christoph Hellwig
7ef5264de7 modules: mark ref_module static
ref_module isn't used anywhere outside of module.c.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-08-01 16:04:55 +02:00
Ingo Molnar
63722bbca6 Merge branch 'kcsan' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into locking/core
Pull v5.9 KCSAN bits from Paul E. McKenney.

Perhaps the most important change is that GCC 11 now has all fixes in place
to support KCSAN, so GCC support can be enabled again.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-08-01 09:26:27 +02:00
Valentin Schneider
f4470cdf10 sched: Document arch_scale_*_capacity()
Rather than hide their purpose in some dark, damp corner of Documentation/,
add some documentation to the default implementations.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200731192016.7484-2-valentin.schneider@arm.com
2020-08-01 09:19:43 +02:00
David S. Miller
69138b34a7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2020-07-31

The following pull-request contains BPF updates for your *net* tree.

We've added 5 non-merge commits during the last 21 day(s) which contain
a total of 5 files changed, 126 insertions(+), 18 deletions(-).

The main changes are:

1) Fix a map element leak in HASH_OF_MAPS map type, from Andrii Nakryiko.

2) Fix a NULL pointer dereference in __btf_resolve_helper_id() when no
   btf_vmlinux is available, from Peilin Ye.

3) Init pos variable in __bpfilter_process_sockopt(), from Christoph Hellwig.

4) Fix a cgroup sockopt verifier test by specifying expected attach type,
   from Jean-Philippe Brucker.

Note that when net gets merged into net-next later on, there is a small
merge conflict in kernel/bpf/btf.c between commit 5b801dfb7feb ("bpf: Fix
NULL pointer dereference in __btf_resolve_helper_id()") from the bpf tree
and commit 138b9a0511c7 ("bpf: Remove btf_id helpers resolving") from the
net-next tree.

Resolve as follows: remove the old hunk with the __btf_resolve_helper_id()
function. Change the btf_resolve_helper_id() so it actually tests for a
NULL btf_vmlinux and bails out:

int btf_resolve_helper_id(struct bpf_verifier_log *log,
                          const struct bpf_func_proto *fn, int arg)
{
        int id;

        if (fn->arg_type[arg] != ARG_PTR_TO_BTF_ID || !btf_vmlinux)
                return -EINVAL;
        id = fn->btf_id[arg];
        if (!id || id > btf_vmlinux->nr_types)
                return -EINVAL;
        return id;
}

Let me know if you run into any other issues (CC'ing Jiri Olsa so he's in
the loop with regards to merge conflict resolution).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31 17:19:47 -07:00
Catalin Marinas
4557062da7 Merge branches 'for-next/misc', 'for-next/vmcoreinfo', 'for-next/cpufeature', 'for-next/acpi', 'for-next/perf', 'for-next/timens', 'for-next/msi-iommu' and 'for-next/trivial' into for-next/core
* for-next/misc:
  : Miscellaneous fixes and cleanups
  arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack
  arm64/mm: save memory access in check_and_switch_context() fast switch path
  recordmcount: only record relocation of type R_AARCH64_CALL26 on arm64.
  arm64: Reserve HWCAP2_MTE as (1 << 18)
  arm64/entry: deduplicate SW PAN entry/exit routines
  arm64: s/AMEVTYPE/AMEVTYPER
  arm64/hugetlb: Reserve CMA areas for gigantic pages on 16K and 64K configs
  arm64: stacktrace: Move export for save_stack_trace_tsk()
  smccc: Make constants available to assembly
  arm64/mm: Redefine CONT_{PTE, PMD}_SHIFT
  arm64/defconfig: Enable CONFIG_KEXEC_FILE
  arm64: Document sysctls for emulated deprecated instructions
  arm64/panic: Unify all three existing notifier blocks
  arm64/module: Optimize module load time by optimizing PLT counting

* for-next/vmcoreinfo:
  : Export the virtual and physical address sizes in vmcoreinfo
  arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
  crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo

* for-next/cpufeature:
  : CPU feature handling cleanups
  arm64/cpufeature: Validate feature bits spacing in arm64_ftr_regs[]
  arm64/cpufeature: Replace all open bits shift encodings with macros
  arm64/cpufeature: Add remaining feature bits in ID_AA64MMFR2 register
  arm64/cpufeature: Add remaining feature bits in ID_AA64MMFR1 register
  arm64/cpufeature: Add remaining feature bits in ID_AA64MMFR0 register

* for-next/acpi:
  : ACPI updates for arm64
  arm64/acpi: disallow writeable AML opregion mapping for EFI code regions
  arm64/acpi: disallow AML memory opregions to access kernel memory

* for-next/perf:
  : perf updates for arm64
  arm64: perf: Expose some new events via sysfs
  tools headers UAPI: Update tools's copy of linux/perf_event.h
  arm64: perf: Add cap_user_time_short
  perf: Add perf_event_mmap_page::cap_user_time_short ABI
  arm64: perf: Only advertise cap_user_time for arch_timer
  arm64: perf: Implement correct cap_user_time
  time/sched_clock: Use raw_read_seqcount_latch()
  sched_clock: Expose struct clock_read_data
  arm64: perf: Correct the event index in sysfs
  perf/smmuv3: To simplify code for ioremap page in pmcg

* for-next/timens:
  : Time namespace support for arm64
  arm64: enable time namespace support
  arm64/vdso: Restrict splitting VVAR VMA
  arm64/vdso: Handle faults on timens page
  arm64/vdso: Add time namespace page
  arm64/vdso: Zap vvar pages when switching to a time namespace
  arm64/vdso: use the fault callback to map vvar pages

* for-next/msi-iommu:
  : Make the MSI/IOMMU input/output ID translation PCI agnostic, augment the
  : MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID bus-specific parameter
  : and apply the resulting changes to the device ID space provided by the
  : Freescale FSL bus
  bus: fsl-mc: Add ACPI support for fsl-mc
  bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver
  of/irq: Make of_msi_map_rid() PCI bus agnostic
  of/irq: make of_msi_map_get_device_domain() bus agnostic
  dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus
  of/device: Add input id to of_dma_configure()
  of/iommu: Make of_map_rid() PCI agnostic
  ACPI/IORT: Add an input ID to acpi_dma_configure()
  ACPI/IORT: Remove useless PCI bus walk
  ACPI/IORT: Make iort_msi_map_rid() PCI agnostic
  ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic
  ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC

* for-next/trivial:
  : Trivial fixes
  arm64: sigcontext.h: delete duplicated word
  arm64: ptrace.h: delete duplicated word
  arm64: pgtable-hwdef.h: delete duplicated words
2020-07-31 18:09:39 +01:00
Ingo Molnar
28cff52eae Merge branch 'linus' into locking/core, to resolve conflict
Conflicts:
	arch/arm/include/asm/percpu.h

As Stephen Rothwell noted, there's a conflict between this commit
in locking/core:

  a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")

and this fresh upstream commit:

  aa54ea903abb ("ARM: percpu.h: fix build error")

a21ee6055c30 is a simpler solution to the dependency problem and doesn't
further increase header hell - so this conflict resolution effectively
reverts aa54ea903abb and uses the a21ee6055c30 solution.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-31 12:16:09 +02:00
Marco Elver
92c209ac6d kcsan: Improve IRQ state trace reporting
To improve the general usefulness of the IRQ state trace events with
KCSAN enabled, save and restore the trace information when entering and
exiting the KCSAN runtime as well as when generating a KCSAN report.

Without this, reporting the IRQ trace events (whether via a KCSAN report
or outside of KCSAN via a lockdep report) is rather useless due to
continuously being touched by KCSAN. This is because if KCSAN is
enabled, every instrumented memory access causes changes to IRQ trace
events (either by KCSAN disabling/enabling interrupts or taking
report_lock when generating a report).

Before "lockdep: Prepare for NMI IRQ state tracking", KCSAN avoided
touching the IRQ trace events via raw_local_irq_save/restore() and
lockdep_off/on().

Fixes: 248591f5d257 ("kcsan: Make KCSAN compatible with new IRQ state tracking")
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200729110916.3920464-2-elver@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-31 12:12:03 +02:00
Marco Elver
0584df9c12 lockdep: Refactor IRQ trace events fields into struct
Refactor the IRQ trace events fields, used for printing information
about the IRQ trace events, into a separate struct 'irqtrace_events'.

This improves readability by separating the information only used in
reporting, as well as enables (simplified) storing/restoring of
irqtrace_events snapshots.

No functional change intended.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200729110916.3920464-1-elver@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-31 12:11:58 +02:00
Vincent Whitchurch
ee896ee805 tracing: Remove outdated comment in stack handling
This comment describes the behaviour before commit 2a820bf74918
("tracing: Use percpu stack trace buffer more intelligently").  Since
that commit, interrupts and NMIs do use the per-cpu stacks so the
comment is no longer correct.  Remove it.

(Note that the FTRACE_STACK_SIZE mentioned in the comment has never
existed, it probably should have said FTRACE_STACK_ENTRIES.)

Link: https://lkml.kernel.org/r/20200727092840.18659-1-vincent.whitchurch@axis.com

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-07-30 22:54:50 -04:00
Chengming Zhou
c5f51572a7 ftrace: Do not let direct or IPMODIFY ftrace_ops be added to module and set trampolines
When inserting a module, we find all ftrace_ops referencing it on the
ftrace_ops_list. But the FTRACE_OPS_FL_DIRECT and FTRACE_OPS_FL_IPMODIFY
flags are special, and should not be set automatically. So warn about and
skip any ftrace_ops that has either of these flags set when new code is added.
Also check if only one ftrace_ops references the module, in which case
we can use a trampoline as an optimization.

Link: https://lkml.kernel.org/r/20200728180554.65203-2-zhouchengming@bytedance.com

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-07-30 22:45:31 -04:00
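
The filtering rule described in c5f51572a7 can be sketched in plain C. This is an illustration only: the flag values, `struct sketch_ops`, and both helper functions are hypothetical and do not mirror the kernel's ftrace data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the kernel's FTRACE_OPS_FL_* values differ. */
enum {
	SKETCH_OPS_FL_SAVE_REGS = 1 << 0,
	SKETCH_OPS_FL_DIRECT    = 1 << 1,
	SKETCH_OPS_FL_IPMODIFY  = 1 << 2,
};

struct sketch_ops {
	unsigned int flags;
};

/*
 * Count the ops that may be attached to newly added code
 * automatically. Ops carrying DIRECT or IPMODIFY are skipped and
 * tallied in *skipped (the kernel would also emit a warning).
 */
static size_t sketch_count_auto_refs(const struct sketch_ops *ops, size_t n,
				     size_t *skipped)
{
	const unsigned int special =
		SKETCH_OPS_FL_DIRECT | SKETCH_OPS_FL_IPMODIFY;
	size_t count = 0;
	size_t i;

	assert(skipped != NULL);
	*skipped = 0;
	for (i = 0; i < n; i++) {
		if (ops[i].flags & special) {
			(*skipped)++;
			continue;
		}
		count++;
	}
	return count;
}

/* A trampoline is only an option when exactly one ops references the code. */
static bool sketch_can_use_trampoline(size_t refs)
{
	return refs == 1;
}
```

With three ops of which one is DIRECT and one is IPMODIFY, only the remaining ops counts as a reference, so the trampoline optimization applies.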
Chengming Zhou
8a224ffb3f ftrace: Setup correct FTRACE_FL_REGS flags for module
When a module is loaded and enabled, __ftrace_replace_code is used for the
module if any ftrace_ops referencing it is found. But ftrace_get_addr_new
returns the wrong ftrace_addr for the module's rec, because rec->flags has
not been set up correctly. This can cause the callback function of an
ftrace_ops that has FTRACE_OPS_FL_SAVE_REGS set to be called with pt_regs
set to NULL.
So set up the correct FTRACE_FL_REGS flag for rec when referenced_filters
is called to find the ftrace_ops that reference it.

Link: https://lkml.kernel.org/r/20200728180554.65203-1-zhouchengming@bytedance.com

Cc: stable@vger.kernel.org
Fixes: 8c4f3c3fa9681 ("ftrace: Check module functions being traced on reload")
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-07-30 19:35:19 -04:00
Kevin Hao
96b4833b68 tracing/hwlat: Honor the tracing_cpumask
When calculating the CPU mask for the hwlat kernel thread, the wrong
CPU mask is used instead of tracing_cpumask, which makes
tracing/tracing_cpumask useless for the hwlat tracer. Fix it.

Link: https://lkml.kernel.org/r/20200730082318.42584-2-haokexin@gmail.com

Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 0330f7aa8ee6 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-07-30 19:35:04 -04:00
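
The idea behind 96b4833b68 is that the hwlat thread should only migrate among CPUs that are both online and allowed by tracing_cpumask. A minimal sketch of that selection with plain bitmasks follows. It assumes at most 8 CPUs and uses hypothetical names throughout; it does not use the kernel's cpumask API.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 8u

/*
 * Return the next CPU after cur whose bit is set in both masks,
 * wrapping around to the lowest allowed CPU. Returns -1 if no CPU
 * is allowed at all.
 */
static int sketch_next_cpu(unsigned int cur, unsigned int online_mask,
			   unsigned int tracing_mask)
{
	unsigned int allowed = online_mask & tracing_mask;
	unsigned int i;

	assert(cur < SKETCH_NR_CPUS);
	for (i = 1; i <= SKETCH_NR_CPUS; i++) {
		unsigned int cpu = (cur + i) % SKETCH_NR_CPUS;

		if (allowed & (1u << cpu))
			return (int)cpu;
	}
	return -1;
}
```

Passing any mask other than the tracing mask as the second operand of the intersection would let the thread wander onto CPUs the user excluded, which is the symptom the commit describes.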
Kevin Hao
a9d0ba6772 tracing/hwlat: Drop the duplicate assignment in start_kthread()
We have set 'current_mask' to '&save_cpumask' in its declaration,
so there is no need to assign again.

Link: https://lkml.kernel.org/r/20200730082318.42584-1-haokexin@gmail.com

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-07-30 19:35:04 -04:00
Yonghong Song
4fc00b79b8 bpf: Add missing newline characters in verifier error messages
Newline characters are added in two verifier error messages,
refactored in Commit afbf21dce668 ("bpf: Support readonly/readwrite
buffers in verifier"). This way, they do not mix with
messages afterwards.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200728221801.1090349-1-yhs@fb.com
2020-07-31 00:43:49 +02:00
Ingo Molnar
c1cc4784ce Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull the v5.9 RCU bits from Paul E. McKenney:

 - Documentation updates
 - Miscellaneous fixes
 - kfree_rcu updates
 - RCU tasks updates
 - Read-side scalability tests
 - SRCU updates
 - Torture-test updates

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-31 00:15:53 +02:00
Romain Perier
12cc923f1c tasklet: Introduce new initialization API
Nowadays, modern kernel subsystems that use callbacks pass the data
structure associated with a given callback as argument to the callback.
The tasklet subsystem remains one which passes an arbitrary unsigned
long to the callback function. This has several problems:

- This keeps an extra field for storing the argument in each tasklet
  data structure, it bloats the tasklet_struct structure with a redundant
  .data field

- No type checking can be performed on this argument. Instead of
  using container_of() like other callback subsystems, it forces callbacks
  to do explicit type cast of the unsigned long argument into the required
  object type.

- Buffer overflows can overwrite the .func and the .data field, so
  an attacker can easily overwrite the function and its first argument
  to whatever it wants.

Add a new tasklet initialization API, via DECLARE_TASKLET() and
tasklet_setup(), which will replace the existing ones.

This work is greatly inspired by the timer_struct conversion series,
see commit e99e88a9d2b0 ("treewide: setup_timer() -> timer_setup()")

To avoid problems with both -Wcast-function-type (which is enabled in
the kernel via -Wextra in several subsystems), and with mismatched
function prototypes when built with Control Flow Integrity enabled,
this adds the "use_callback" member to let the tasklet caller choose
which union member to call through. Once all old API uses are removed,
this and the .data member will be removed as well. (On 64-bit this does
not grow the struct size as the new member fills the hole after atomic_t,
which is also "int" sized.)

Signed-off-by: Romain Perier <romain.perier@gmail.com>
Co-developed-by: Allen Pais <allen.lkml@gmail.com>
Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-30 11:16:01 -07:00
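
The difference between the two callback styles described in 12cc923f1c can be shown with a small userspace sketch. It assumes a struct layout with a `use_callback` flag and a union of the two function pointer types, as the commit message describes; every `sketch_*` name is hypothetical and the real tasklet_struct has more fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_tasklet {
	bool use_callback;	/* selects which union member is called */
	union {
		void (*func)(unsigned long data);		/* old style */
		void (*callback)(struct sketch_tasklet *t);	/* new style */
	};
	unsigned long data;	/* only needed by the old style */
};

/* Recover the enclosing object from a pointer to one of its members. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Old-style init: an untyped unsigned long is stored for the callback. */
static void sketch_tasklet_init(struct sketch_tasklet *t,
				void (*func)(unsigned long),
				unsigned long data)
{
	t->use_callback = false;
	t->func = func;
	t->data = data;
}

/* New-style setup: the callback receives the tasklet itself. */
static void sketch_tasklet_setup(struct sketch_tasklet *t,
				 void (*callback)(struct sketch_tasklet *))
{
	t->use_callback = true;
	t->callback = callback;
	t->data = 0;
}

/* Call through the union member that matches how the tasklet was set up. */
static void sketch_tasklet_run(struct sketch_tasklet *t)
{
	if (t->use_callback)
		t->callback(t);
	else
		t->func(t->data);
}

struct sketch_device {
	int irq_count;
	struct sketch_tasklet bh;
};

/* Old style: explicit cast from unsigned long, no type checking. */
static void sketch_old_handler(unsigned long data)
{
	struct sketch_device *dev = (struct sketch_device *)data;

	dev->irq_count++;
}

/* New style: typed argument, owner recovered via container_of. */
static void sketch_new_handler(struct sketch_tasklet *t)
{
	struct sketch_device *dev =
		sketch_container_of(t, struct sketch_device, bh);

	dev->irq_count++;
}
```

Because the new-style handler derives its context from the tasklet's own address, the separate `data` field carries no information and can eventually be dropped, as the commit message notes.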
Kees Cook
b13fecb1c3 treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD()
This replaces all the existing DECLARE_TASKLET() (and ...DISABLED)
macros with DECLARE_TASKLET_OLD() in preparation for refactoring the
tasklet callback type. All existing DECLARE_TASKLET() users had a "0"
data argument; it has been removed here as well.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-30 11:15:58 -07:00
Andrii Nakryiko
1d4e1eab45 bpf: Fix map leak in HASH_OF_MAPS map
Fix a HASH_OF_MAPS bug where the inner map pointer was not put on a
bpf_map_elem_update() operation. This is due to the per-cpu extra_elems
optimization, which bypassed the free_htab_elem() logic that does the
proper cleanup. Make sure that the inner map is put properly in the
optimized case as well.

Fixes: 8c290e60fa2a ("bpf: fix hashmap extra_elems logic")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200729040913.2815687-1-andriin@fb.com
2020-07-30 01:30:22 +02:00
Linus Torvalds
d3590ebf6f audit/stable-5.8 PR 20200729

Merge tag 'audit-pr-20200729' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit fixes from Paul Moore:
 "One small audit fix that you can hopefully merge before v5.8 is
  released. Unfortunately it is a revert of a patch that went in during
  the v5.7 window and we just recently started to see some bug reports
  relating to that commit.

  We are working on a proper fix, but I'm not yet clear on when that
  will be ready and we need to fix the v5.7 kernels anyway, so in the
  interest of time a revert seemed like the best solution right now"

* tag 'audit-pr-20200729' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  revert: 1320a4052ea1 ("audit: trigger accompanying records when no rules present")
2020-07-29 12:35:36 -07:00
Willy Tarreau
f227e3ec3b random32: update the net random state on interrupt and activity
This modifies the first 32 bits out of the 128 bits of a random CPU's
net_rand_state on interrupt or CPU activity to complicate remote
observations that could lead to guessing the network RNG's internal
state.

Note that depending on some network devices' interrupt rate moderation
or binding, this re-seeding might happen on every packet or even almost
never.

In addition, with NOHZ some CPUs might not even get timer interrupts,
leaving their local state rarely updated, while they are running
networked processes making use of the random state.  For this reason, we
also perform this update in update_process_times() in order to at least
update the state when there is user or system activity, since it's the
only case we care about.
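
A minimal userspace sketch of the idea, assuming a 128-bit state held
as four 32-bit words: only the first word is disturbed and the other
96 bits are left alone. The names rnd_state_model and
perturb_net_rand_state are hypothetical, and folding the noise in by
addition is an assumption of this sketch rather than something the
message above specifies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit PRNG state as four 32-bit words. */
struct rnd_state_model {
	uint32_t s1, s2, s3, s4;
};

/*
 * Fold some hard-to-observe value (for example, timing noise sampled
 * on an interrupt) into the first 32 bits only. Unsigned addition
 * wraps, so this is well defined for any input.
 */
void perturb_net_rand_state(struct rnd_state_model *st, uint32_t noise)
{
	assert(st != 0);
	st->s1 += noise;
}
```

A remote observer who had reconstructed the state would have to account
for every such perturbation to keep predicting outputs, which is the
complication the change is aiming for.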

Reported-by: Amit Klein <aksecurity@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-29 10:35:37 -07:00
Ahmed S. Darwish
af5a06b582 hrtimer: Use sequence counter with associated raw spinlock
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_raw_spinlock_t data type, which allows a raw
spinlock to be associated with the sequence counter. This enables
lockdep to verify that the raw spinlock used for writer serialization
is held when the write side critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.
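
The lock association can be pictured with the following single-threaded
userspace model. It is only a sketch: lock_model,
seqcount_locked_model and the model_* functions are hypothetical names,
there are no memory barriers or atomics, and a plain assert() stands in
for the lockdep check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock; it only tracks "held". */
struct lock_model {
	bool held;
};

/* Sequence counter that remembers which lock serializes its writers. */
struct seqcount_locked_model {
	unsigned int sequence;
	struct lock_model *lock;	/* the association described above */
};

void model_lock(struct lock_model *l)
{
	assert(!l->held);
	l->held = true;
}

void model_unlock(struct lock_model *l)
{
	assert(l->held);
	l->held = false;
}

/* Entering the write side verifies that the associated lock is held. */
void model_write_begin(struct seqcount_locked_model *s)
{
	assert(s->lock->held);	/* stands in for the lockdep assertion */
	s->sequence++;		/* odd value: a write is in progress */
}

void model_write_end(struct seqcount_locked_model *s)
{
	s->sequence++;		/* even value: the write is complete */
}

/* Readers take a snapshot with the low bit cleared ... */
unsigned int model_read_begin(const struct seqcount_locked_model *s)
{
	return s->sequence & ~1u;
}

/* ... and must retry if a write started or finished in the meantime. */
bool model_read_retry(const struct seqcount_locked_model *s,
		      unsigned int start)
{
	return s->sequence != start;
}
```

Calling model_write_begin() without first taking the associated lock
trips the assertion, which is the kind of mistake the lock association
is meant to expose.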

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-25-a.darwish@linutronix.de
2020-07-29 16:14:29 +02:00
Ahmed S. Darwish
025e82bcbc timekeeping: Use sequence counter with associated raw spinlock
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_raw_spinlock_t data type, which allows a raw
spinlock to be associated with the sequence counter. This enables
lockdep to verify that the raw spinlock used for writer serialization
is held when the write side critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-18-a.darwish@linutronix.de
2020-07-29 16:14:27 +02:00
Ahmed S. Darwish
b75058614f sched: tasks: Use sequence counter with associated spinlock
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_spinlock_t data type, which allows a spinlock to
be associated with the sequence counter. This enables lockdep to verify
that the spinlock used for writer serialization is held when the write
side critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-14-a.darwish@linutronix.de
2020-07-29 16:14:26 +02:00
Paul Moore
8ac68dc455 revert: 1320a4052ea1 ("audit: trigger accompanying records when no rules present")
Unfortunately the commit listed in the subject line above failed
to ensure that the task's audit_context was properly initialized/set
before enabling the "accompanying records".  Depending on the
situation, the resulting audit_context could have invalid values in
some of its fields which could cause a kernel panic/oops when the
task/syscall exits and the audit records are generated.

We will revisit the original patch, with the necessary fixes, in a
future kernel but right now we just want to fix the kernel panic
with the least amount of added risk.

Cc: stable@vger.kernel.org
Fixes: 1320a4052ea1 ("audit: trigger accompanying records when no rules present")
Reported-by: j2468h@googlemail.com
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-07-29 10:00:36 -04:00
Hari Bathini
f891f19736 kexec_file: Allow archs to handle special regions while locating memory hole
Some architectures may have special memory regions, within the given
memory range, which can't be used for the buffer in a kexec segment.
Implement a weak arch_kexec_locate_mem_hole() definition which arch code
may override, to take care of special regions, while trying to locate
a memory hole.

Also, add the missing declarations for arch overridable functions and
drop the __weak descriptors in the declarations to prevent non-weak
definitions from becoming weak.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Tested-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/159602273603.575379.17665852963340380839.stgit@hbathini
2020-07-29 23:47:53 +10:00
Qais Yousef
13685c4a08 sched/uclamp: Add a new sysctl to control RT default boost value
RT tasks by default run at the highest capacity/performance level. When
uclamp is selected this default behavior is retained by enforcing the
requested uclamp.min (p->uclamp_req[UCLAMP_MIN]) of the RT tasks to be
uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
value.

This is also referred to as 'the default boost value of RT tasks'.

See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").

On battery powered devices, it is desired to control this default
(currently hardcoded) behavior at runtime to reduce energy consumed by
RT tasks.

For example, among mobile device manufacturers, where the big.LITTLE
architecture is dominant, the performance of the little cores varies
across SoCs, and on high end ones the big cores could be too power
hungry.

Given the diversity of SoCs, the new knob allows manufacturers to tune
the best performance/power for RT tasks for the particular hardware they
run on.

They could opt to further tune the value when the user selects
a different power saving mode or when the device is actively charging.

The runtime aspect of it further helps in creating a single kernel image
that can be run on multiple devices that require different tuning.

Keep in mind that a lot of RT tasks in the system are created by the
kernel. On Android for instance I can see over 50 RT tasks, only
a handful of which are created by the Android framework.

To let system admins and device integrators control the default
behavior globally, introduce the new
sysctl_sched_uclamp_util_min_rt_default to change the default boost
value of the RT tasks.

I anticipate this to be mostly in the form of modifying the init script
of a particular device.

To avoid polluting the fast path with unnecessary code, the approach
taken is to synchronously do the update by traversing all the existing
tasks in the system. This could race with a concurrent fork(), which is
dealt with by introducing the sched_post_fork() function, which ensures
that the racy fork gets the right update applied.

Tested on Juno-r2 in combination with the RT capacity awareness [1].
By default an RT task will go to the highest capacity CPU and run at the
maximum frequency, which is particularly energy inefficient on high end
mobile devices because the biggest core[s] are 'huge' and power hungry.

With this patch the RT task can be controlled to run anywhere by
default, and doesn't cause the frequency to be maximum all the time.
Yet any task that really needs to be boosted can easily escape this
default behavior by modifying its requested uclamp.min value
(p->uclamp_req[UCLAMP_MIN]) via sched_setattr() syscall.
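
The update policy described above can be sketched in userspace as
follows. This is a simplified model, not the scheduler's code:
task_model, the model_* functions and the "user_defined" flag (used
here to mark a task that set its own uclamp.min request) are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CAPACITY_SCALE 1024u	/* maximum clamp value in this model */

/* Hypothetical, heavily reduced view of a task. */
struct task_model {
	bool is_rt;			/* task runs under an RT policy */
	bool user_defined;		/* task requested its own uclamp.min */
	unsigned int uclamp_min_req;	/* requested uclamp.min */
};

/* Apply the tunable default, unless the task opted out or is not RT. */
void model_apply_rt_default(struct task_model *p, unsigned int rt_default)
{
	assert(rt_default <= MODEL_CAPACITY_SCALE);
	if (!p->is_rt || p->user_defined)
		return;
	p->uclamp_min_req = rt_default;
}

/* Slow path: on a change of the knob, walk every existing task once. */
void model_update_all_tasks(struct task_model *tasks, size_t n,
			    unsigned int rt_default)
{
	for (size_t i = 0; i < n; i++)
		model_apply_rt_default(&tasks[i], rt_default);
}

/* A task forked during the walk gets the value applied afterwards. */
void model_post_fork(struct task_model *child, unsigned int rt_default)
{
	model_apply_rt_default(child, rt_default);
}
```

Doing the work in the walk and in the post-fork hook keeps the per-task
decision out of the scheduler's hot path, at the cost of a full
traversal whenever the knob changes.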

[1] 804d402fb6f6: ("sched/rt: Make RT capacity-aware")

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200716110347.19553-2-qais.yousef@arm.com
2020-07-29 13:51:47 +02:00
Qais Yousef
e65855a52b sched/uclamp: Fix a deadlock when enabling uclamp static key
The following splat was caught when setting uclamp value of a task:

  BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:49

   cpus_read_lock+0x68/0x130
   static_key_enable+0x1c/0x38
   __sched_setscheduler+0x900/0xad8

Fix by ensuring we enable the key outside of the critical section in
__sched_setscheduler().
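
The shape of the fix can be illustrated with a small userspace model in
which an operation that may sleep is performed before the critical
section is entered rather than inside it. All names here
(model_in_critical_section, model_static_key_enable,
model_setscheduler) are hypothetical, and a boolean flag stands in for
the real locking.

```c
#include <assert.h>
#include <stdbool.h>

/* True while the (modelled) non-sleepable critical section is held. */
bool model_in_critical_section;
bool model_key_enabled;

/* Stands in for an operation that may sleep. */
void model_static_key_enable(void)
{
	/* Sleeping inside the critical section would be the bug. */
	assert(!model_in_critical_section);
	model_key_enabled = true;
}

int model_setscheduler(bool wants_uclamp, unsigned int *slot,
		       unsigned int value)
{
	/* Do the possibly sleeping work first, outside the section. */
	if (wants_uclamp)
		model_static_key_enable();

	model_in_critical_section = true;
	*slot = value;		/* the update that needs protection */
	model_in_critical_section = false;

	return 0;
}
```

Moving the model_static_key_enable() call between the two flag
assignments would trip the assertion, mirroring the splat shown above.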

Fixes: 46609ce22703 ("sched/uclamp: Protect uclamp fast path code with static key")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200716110347.19553-4-qais.yousef@arm.com
2020-07-29 13:51:47 +02:00
Peter Zijlstra
4fd5750af0 sched,tracing: Convert to sched_set_fifo()
One module user of sched_setscheduler() was overlooked and is
obviously causing build failures.

Convert ring_buffer_benchmark to use sched_set_fifo_low() when fifo==1
and sched_set_fifo() when fifo==2. This is a bit of an abuse, but it
makes the thing 'work' again.

Specifically, it enables all combinations that were previously
possible:

  producer higher than consumer
  consumer higher than producer

Fixes: 616d91b68cd5 ("sched: Remove sched_setscheduler*() EXPORTs")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20200720214918.GM5523@worktop.programming.kicks-ass.net
2020-07-29 11:43:53 +02:00
Dan Williams
48001ea50d PM, libnvdimm: Add runtime firmware activation support
Abstract platform specific mechanics for nvdimm firmware activation
behind a handful of generic ops. At the bus level ->activate_state()
indicates the unified state (idle, busy, armed) of all DIMMs on the bus,
and ->capability() indicates the system state expectations for activate.
At the DIMM level ->activate_state() indicates the per-DIMM state,
->activate_result() indicates the outcome of the last activation
attempt, and ->arm() attempts to transition the DIMM from 'idle' to
'armed'.

A new hibernate_quiet_exec() facility is added to support firmware
activation in an OS defined system quiesce state. It leverages the fact
that the hibernate-freeze state wants to assert that a memory
hibernation snapshot can be taken. This is in contrast to a platform
firmware defined quiesce state that may forcefully quiet the memory
controller independent of whether an individual device-driver properly
supports hibernate-freeze.

The libnvdimm sysfs interface is extended to support detection of a
firmware activate capability. The mechanism supports enumeration and
triggering of firmware activate, optionally in the
hibernate_quiet_exec() context.

[rafael: hibernate_quiet_exec() proposal]
[vishal: fix up sparse warning, grammar in Documentation/]

Cc: Pavel Machek <pavel@ucw.cz>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Co-developed-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-28 19:28:32 -06:00
Andrii Nakryiko
310ad7970a bpf: Fix build without CONFIG_NET when using BPF XDP link
Entire net/core subsystem is not built without CONFIG_NET. linux/netdevice.h
just assumes that it's always there, so the easiest way to fix this is to
conditionally compile out bpf_xdp_link_attach() use in bpf/syscall.c.
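
A minimal sketch of the pattern, assuming a hypothetical
MODEL_CONFIG_NET macro in place of CONFIG_NET: the networking-only
helper and the switch case that calls it are both compiled out, so the
request falls through to the generic error path. The function and enum
names are invented for this example.

```c
#include <assert.h>
#include <errno.h>

enum model_prog_type {
	MODEL_PROG_TRACING,
	MODEL_PROG_XDP,
};

#ifdef MODEL_CONFIG_NET
/* Only exists when the (modelled) networking core is built. */
int model_xdp_link_attach(int ifindex)
{
	assert(ifindex > 0);
	return 0;
}
#endif

int model_link_create(enum model_prog_type type, int ifindex)
{
	(void)ifindex;	/* unused when networking is compiled out */

	switch (type) {
	case MODEL_PROG_TRACING:
		return 0;
#ifdef MODEL_CONFIG_NET
	case MODEL_PROG_XDP:
		return model_xdp_link_attach(ifindex);
#endif
	default:
		/* Without networking, XDP requests end up here. */
		return -EINVAL;
	}
}
```

Built without MODEL_CONFIG_NET, the translation unit never references
model_xdp_link_attach(), so it links even though the helper is absent.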

Fixes: aa8d3a716b59 ("bpf, xdp: Add bpf_link-based XDP attachment API")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200728190527.110830-1-andriin@fb.com
2020-07-29 00:29:00 +02:00
Christoph Hellwig
274b3f7bf3 dma-contiguous: cleanup dma_alloc_contiguous
Split out a cma_alloc_aligned helper to deal with the "interesting"
calling conventions for cma_alloc, which then allows the main
function to be written in a straightforward way.  This also takes
advantage of the fact that NULL dev arguments have been gone from the
DMA API for a while.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nicolin Chen <nicoleotsuka@gmail.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
2020-07-28 13:42:15 +02:00
Miaohe Lin
21a6ee14a8 sched: Remove duplicated tick_nohz_full_enabled() check
In sched_update_tick_dependency() there are two calls that check
whether nohz_full is enabled: tick_nohz_full_cpu() does it
implicitly, while there's also an explicit call to tick_nohz_full_enabled().

Remove the duplicated, open coded check.

[ mingo: Amended the changelog. ]

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/1595935075-14223-1-git-send-email-linmiaohe@huawei.com
2020-07-28 13:27:54 +02:00