45586c7078
32198 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Alexey Dobriyan
|
97a32539b9 |
proc: convert everything to "struct proc_ops"
The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in seq_file.h.
Conversion rule is:
- llseek => proc_lseek
- unlocked_ioctl => proc_ioctl
- xxx => proc_xxx
- delete ".owner = THIS_MODULE" line
[akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c]
[sfr@canb.auug.org.au: fix kernel/sched/psi.c]
Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au
Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
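The conversion rule in the "struct proc_ops" commit can be pictured with a small before/after pair. This is only a userspace sketch: `sketch_file_operations`, `sketch_proc_ops` and the `demo_*` names are hypothetical stand-ins invented here so the fragment builds outside the kernel, and they carry just the members the rule mentions rather than the real kernel definitions.

```c
#include <stddef.h>

/* Stand-in for the old table type (assumption: only the members that the
 * conversion rule mentions are modelled here). */
struct sketch_file_operations {
	const void *owner;	/* plays the role of ".owner = THIS_MODULE" */
	long long (*llseek)(void *file, long long off, int whence);
	long (*unlocked_ioctl)(void *file, unsigned int cmd, unsigned long arg);
	int (*open)(void *inode, void *file);
};

/* Stand-in for the new table type: same callbacks, "proc_" prefix, no owner. */
struct sketch_proc_ops {
	long long (*proc_lseek)(void *file, long long off, int whence);
	long (*proc_ioctl)(void *file, unsigned int cmd, unsigned long arg);
	int (*proc_open)(void *inode, void *file);
};

/* Trivial callbacks so both tables have something to point at. */
static long long demo_llseek(void *file, long long off, int whence)
{
	(void)file; (void)whence;
	return off;
}

static long demo_ioctl(void *file, unsigned int cmd, unsigned long arg)
{
	(void)file; (void)cmd; (void)arg;
	return 0;
}

static int demo_open(void *inode, void *file)
{
	(void)inode; (void)file;
	return 0;
}

/* Before the conversion. */
static const struct sketch_file_operations demo_fops = {
	.owner		= NULL,
	.llseek		= demo_llseek,
	.unlocked_ioctl	= demo_ioctl,
	.open		= demo_open,
};

/* After: llseek => proc_lseek, unlocked_ioctl => proc_ioctl,
 * xxx => proc_xxx, and the owner line is deleted. */
static const struct sketch_proc_ops demo_proc_ops = {
	.proc_lseek	= demo_llseek,
	.proc_ioctl	= demo_ioctl,
	.proc_open	= demo_open,
};
```

The callbacks themselves do not change in this sketch; only the table type and the member names do.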
Linus Torvalds
|
e17ac02b18 |
kgdb patches for 5.6-rc1
Everything for kgdb this time around is either simplifications or clean ups. In particular Douglas Anderson's modifications to the backtrace machine in the *last* dev cycle have enabled Doug to tidy up some MIPS specific backtrace code and stop sharing certain data structures across the kernel. Note that the MIPS folks were on Cc: for the MIPS patch and reacted positively (but without an explicit Acked-by). Doug also got rid of the implicit switching between tasks and register sets during some but not all of kdb's backtrace actions (because the implicit switching was either confusing for users, pointless or both). Finally there is a coverity fix and a patch to replace open coded console traversal with the proper helper function.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Merge tag 'kgdb-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
Pull kgdb updates from Daniel Thompson.
* tag 'kgdb-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
- kdb: Use for_each_console() helper
- kdb: remove redundant assignment to pointer bp
- kdb: Get rid of confusing diag msg from "rd" if current task has no regs
- kdb: Gid rid of implicit setting of the current task / regs
- kdb: kdb_current_task shouldn't be exported
- kdb: kdb_current_regs should be private
- MIPS: kdb: Remove old workaround for backtracing on other CPUs |
||
Linus Torvalds
|
7eec11d3a7 |
Merge branch 'akpm' (patches from Andrew)
Pull updates from Andrew Morton: "Most of -mm and quite a number of other subsystems: hotfixes, scripts, ocfs2, misc, lib, binfmt, init, reiserfs, exec, dma-mapping, kcov. MM is fairly quiet this time. Holidays, I assume"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
- kcov: ignore fault-inject and stacktrace
- include/linux/io-mapping.h-mapping: use PHYS_PFN() macro in io_mapping_map_atomic_wc()
- execve: warn if process starts with executable stack
- reiserfs: prevent NULL pointer dereference in reiserfs_insert_item()
- init/main.c: fix misleading "This architecture does not have kernel memory protection" message
- init/main.c: fix quoted value handling in unknown_bootoption
- init/main.c: remove unnecessary repair_env_string in do_initcall_level
- init/main.c: log arguments and environment passed to init
- fs/binfmt_elf.c: coredump: allow process with empty address space to coredump
- fs/binfmt_elf.c: coredump: delete duplicated overflow check
- fs/binfmt_elf.c: coredump: allocate core ELF header on stack
- fs/binfmt_elf.c: make BAD_ADDR() unlikely
- fs/binfmt_elf.c: better codegen around current->mm
- fs/binfmt_elf.c: don't copy ELF header around
- fs/binfmt_elf.c: fix ->start_code calculation
- fs/binfmt_elf.c: smaller code generation around auxv vector fill
- lib/find_bit.c: uninline helper _find_next_bit()
- lib/find_bit.c: join _find_next_bit{_le}
- uapi: rename ext2_swab() to swab() and share globally in swab.h
- lib/scatterlist.c: adjust indentation in __sg_alloc_table
- ... |
||
Linus Torvalds
|
ddaefe8947 |
Modules updates for v5.6
Summary of modules changes for the 5.6 merge window:
- Add "MS" (SHF_MERGE|SHF_STRINGS) section flags to __ksymtab_strings to indicate to the linker that it can perform string deduplication (i.e., duplicate strings are reduced to a single copy in the string table). This means any repeated namespace string would be merged to just one entry in __ksymtab_strings.
- Various code cleanups and small fixes (fix small memleak in error path, improve moduleparam docs, silence rcu warnings, improve error logging)
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Merge tag 'modules-for-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull module updates from Jessica Yu.
* tag 'modules-for-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
- module.h: Annotate mod_kallsyms with __rcu
- module: avoid setting info->name early in case we can fall back to info->mod->name
- modsign: print module name along with error message
- kernel/module: Fix memleak in module_add_modinfo_attrs()
- export.h: reduce __ksymtab_strings string duplication by using "MS" section flags
- moduleparam: fix kerneldoc
- modules: lockdep: Suppress suspicious RCU usage warning |
||
Dmitry Vyukov
|
43e76af85f |
kcov: ignore fault-inject and stacktrace
Don't instrument 3 more files that contain debugging facilities and produce large amounts of uninteresting coverage for every syscall. The following snippets are sprinkled all over the place in kcov traces in a debugging kernel. We already try to disable instrumentation of stack unwinding code and of most debug facilities. I guess we did not use fault-inject.c at the time, and stacktrace.c was somehow missed (or something has changed in kernel/configs). This change both speeds up kcov (kernel doesn't need to store these PCs, user-space doesn't need to process them) and frees trace buffer capacity for more useful coverage.
should_fail lib/fault-inject.c:149
fail_dump lib/fault-inject.c:45
stack_trace_save kernel/stacktrace.c:124
stack_trace_consume_entry kernel/stacktrace.c:86
stack_trace_consume_entry kernel/stacktrace.c:89
... a hundred frames skipped ...
stack_trace_consume_entry kernel/stacktrace.c:93
stack_trace_consume_entry kernel/stacktrace.c:86
Link: http://lkml.kernel.org/r/20200116111449.217744-1-dvyukov@gmail.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Andy Shevchenko
|
dc2c733e65 |
kdb: Use for_each_console() helper
Replace the open-coded iteration loop over the singly linked console list with the for_each_console() helper. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> |
||
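To make the difference between an open-coded list walk and an iteration helper concrete, here is a userspace sketch. The names `console_sketch`, `console_drivers_sketch` and `for_each_console_sketch` are hypothetical stand-ins chosen for this example; the macro only assumes that the helper expands to a plain walk over a NULL-terminated `next` chain.

```c
#include <stddef.h>

/* Hypothetical stand-in for a registered console: a name plus a link. */
struct console_sketch {
	const char *name;
	struct console_sketch *next;
};

/* Head of the singly linked list of consoles (assumed layout). */
static struct console_sketch *console_drivers_sketch;

/* Iteration helper: hides the list head and the "next" bookkeeping. */
#define for_each_console_sketch(con) \
	for (con = console_drivers_sketch; con != NULL; con = con->next)

/* Push a console onto the front of the list. */
static void register_console_sketch(struct console_sketch *con)
{
	con->next = console_drivers_sketch;
	console_drivers_sketch = con;
}

/* Open-coded walk, the style the commit replaces. */
static int count_consoles_open_coded(void)
{
	struct console_sketch *c;
	int n = 0;

	for (c = console_drivers_sketch; c; c = c->next)
		n++;
	return n;
}

/* The same walk written with the helper. */
static int count_consoles_with_helper(void)
{
	struct console_sketch *c;
	int n = 0;

	for_each_console_sketch(c)
		n++;
	return n;
}
```

Both functions visit the same nodes in the same order; the helper version just keeps the traversal details in one place.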
Colin Ian King
|
a4f8a7fb19 |
kdb: remove redundant assignment to pointer bp
The pointer bp is assigned a value that is never read; it is re-assigned later to bp = &kdb_breakpoints[lowbp] in a for-loop. Remove the redundant assignment. Addresses-Coverity ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20191128130753.181246-1-colin.king@canonical.com Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> |
||
Douglas Anderson
|
bbfceba15f |
kdb: Get rid of confusing diag msg from "rd" if current task has no regs
If you switch to a sleeping task with the "pid" command and then type "rd", kdb tells you this:
No current kdb registers. You may need to select another task
diag: -17: Invalid register name
The first message makes sense, but not the second. Fix it by just returning 0 after commands accessing the current registers finish if we've already printed the "No current kdb registers" error. While fixing kdb_rd(), change the function to use "if" rather than "ifdef". This cleans the function up a bit, and any modern compiler will have no trouble still producing good code. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20191109111624.5.I121f4c6f0c19266200bf6ef003de78841e5bfc3d@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> |
||
Douglas Anderson
|
9441d5f6b7 |
kdb: Gid rid of implicit setting of the current task / regs
Some (but not all?) of the kdb backtrace paths would cause the
kdb_current_task and kdb_current_regs to remain changed. As discussed
in a review of a previous patch [1], this doesn't seem intuitive, so
let's fix that.
...but, it turns out that there's actually no longer any reason to set
the current task / current regs while backtracing anymore anyway. As
of commit
|
||
Douglas Anderson
|
a8649fb0a8 |
kdb: kdb_current_task shouldn't be exported
The kdb_current_task variable has been declared in "kernel/debug/kdb/kdb_private.h" since 2010 when kdb was added to the mainline kernel. This is not a public header. There should be no reason that kdb_current_task should be exported and there are no in-kernel users that need it. Remove the export. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20191109111623.3.I14b22b5eb15ca8f3812ab33e96621231304dc1f7@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> |
||
Douglas Anderson
|
c67c10a67f |
kdb: kdb_current_regs should be private
As of the patch ("MIPS: kdb: Remove old workaround for backtracing on other CPUs") there is no reason for kdb_current_regs to be in the public "kdb.h". Let's move it next to kdb_current_task. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20191109111623.2.Iadbfb484e90b557cc4b5ac9890bfca732cd99d77@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> |
||
Linus Torvalds
|
39bed42de2 |
hmm related patches for 5.6
This small series revises the names in mmu_notifier to make the code clearer and more readable.
Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull mmu_notifier updates from Jason Gunthorpe.
* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
- mm/mmu_notifiers: Use 'interval_sub' as the variable for mmu_interval_notifier
- mm/mmu_notifiers: Use 'subscription' as the variable name for mmu_notifier
- mm/mmu_notifier: Rename struct mmu_notifier_mm to mmu_notifier_subscriptions |
||
Linus Torvalds
|
83fa805bcb |
threads-v5.6
Merge tag 'threads-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull thread management updates from Christian Brauner: "Sargun Dhillon over the last cycle has worked on the pidfd_getfd() syscall. This syscall allows for the retrieval of file descriptors of a process based on its pidfd. A task needs to have ptrace_may_access() permissions with PTRACE_MODE_ATTACH_REALCREDS (suggested by Oleg and Andy) on the target. One of the main use-cases is in combination with seccomp's user notification feature. As a reminder, seccomp's user notification feature was made available in v5.0. It allows a task to retrieve a file descriptor for its seccomp filter. The file descriptor is usually handed off to a more privileged supervising process. The supervisor can then listen for syscall events caught by the seccomp filter of the supervisee and perform actions in lieu of the supervisee, usually emulating syscalls. pidfd_getfd() is needed to expand its uses. There are currently two major users that wait on pidfd_getfd() and one future user:
- Netflix, Sargun said, is working on a service mesh where users should be able to connect to a dns-based VIP. When a user connects to e.g. 1.2.3.4:80 that runs e.g. service "foo" they will be redirected to an envoy process. This service mesh uses seccomp user notifications and pidfd to intercept all connect calls and instead of connecting them to 1.2.3.4:80 connects them to e.g. 127.0.0.1:8080.
- LXD uses the seccomp notifier heavily to intercept and emulate mknod() and mount() syscalls for unprivileged containers/processes. With pidfd_getfd() more use-cases e.g. bridging socket connections will be possible.
- The patchset has also seen some interest from the browser corner. Right now, Firefox is using a SECCOMP_RET_TRAP sandbox managed by a broker process. In the future glibc will start blocking all signals during dlopen() rendering this type of sandbox impossible. Hence, in the future Firefox will switch to a seccomp-user-notification based sandbox which also makes use of file descriptor retrieval. The thread for this can be found at https://sourceware.org/ml/libc-alpha/2019-12/msg00079.html
With pidfd_getfd() it is e.g. possible to bridge socket connections for the supervisee (binding to a privileged port) and taking actions on file descriptors on behalf of the supervisee in general. Sargun's first version was using an ioctl on pidfds but various people pushed for it to be a proper syscall which he duly implemented as well over various review cycles. Selftests are of course included. I've also added instructions how to deal with merge conflicts below. There's also a small fix coming from the kernel mentee project to correctly annotate struct sighand_struct with __rcu to fix various sparse warnings. We've received a few more such fixes and even though they are mostly trivial I've decided to postpone them until after -rc1 since they came in rather late and I don't want to risk introducing build warnings. Finally, there's a new prctl() command PR_{G,S}ET_IO_FLUSHER which is needed to avoid allocation recursions triggerable by storage drivers that have userspace parts that run in the IO path (e.g. dm-multipath, iscsi, etc). These allocation recursions deadlock the device. The new prctl() allows such privileged userspace components to avoid allocation recursions by setting the PF_MEMALLOC_NOIO and PF_LESS_THROTTLE flags. The patch carries the necessary acks from the relevant maintainers and is routed here as part of prctl() thread-management."
* tag 'threads-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
- prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim
- sched.h: Annotate sighand_struct with __rcu
- test: Add test for pidfd getfd
- arch: wire up pidfd_getfd syscall
- pid: Implement pidfd_getfd syscall
- vfs, fdtable: Add fget_task helper |
||
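The pidfd_getfd() flow described in the threads pull (obtain a pidfd for the target, then ask for a duplicate of one of its descriptors) might look like this from userspace. It is a hedged sketch rather than the selftest from the series: `fetch_fd_from_pid` is a hypothetical helper name, the raw `syscall()` route is used on the assumption that libc wrappers may be missing, and the fallback syscall numbers (434 and 438) are the values used by the unified syscall tables on common architectures and should be checked against the installed headers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older headers; assumed values, verify for your architecture. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_pidfd_getfd
#define SYS_pidfd_getfd 438
#endif

/*
 * Duplicate descriptor `targetfd` of process `pid` into the caller.
 * Returns the new descriptor, or -1 with errno set (ENOSYS on kernels
 * without these syscalls, EPERM when ptrace access is denied, ...).
 */
static int fetch_fd_from_pid(pid_t pid, int targetfd)
{
	int pidfd, fd, saved_errno;

	pidfd = (int)syscall(SYS_pidfd_open, pid, 0U);
	if (pidfd < 0)
		return -1;

	/* The flags argument must currently be 0. */
	fd = (int)syscall(SYS_pidfd_getfd, pidfd, targetfd, 0U);

	/* Close the pidfd without clobbering the errno of the call above. */
	saved_errno = errno;
	close(pidfd);
	errno = saved_errno;

	return fd;
}
```

A supervisor would typically call this with the pid and descriptor number taken from a seccomp notification; calling it on the current process is only useful as a smoke test.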
Linus Torvalds
|
08a3ef8f6b |
linux-kselftest-5.6-rc1-kunit
This kunit update for Linux 5.6-rc1 consists of:
-- Support for building kunit as a module from Alan Maguire
-- AppArmor KUnit tests for policy unpack from Mike Salvatore
Merge tag 'linux-kselftest-5.6-rc1-kunit' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest kunit updates from Shuah Khan.
* tag 'linux-kselftest-5.6-rc1-kunit' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
- kunit: building kunit as a module breaks allmodconfig
- kunit: update documentation to describe module-based build
- kunit: allow kunit to be loaded as a module
- kunit: remove timeout dependence on sysctl_hung_task_timeout_seconds
- kunit: allow kunit tests to be loaded as a module
- kunit: hide unexported try-catch interface in try-catch-impl.h
- kunit: move string-stream.h to lib/kunit
- apparmor: add AppArmor KUnit tests for policy unpack |
||
Linus Torvalds
|
22b17db4ea |
y2038: core, driver and file system changes
These are updates to device drivers and file systems that for some reason or another were not included in the kernel in the previous y2038 series. I've gone through all users of time_t again to make sure the kernel is in a long-term maintainable state, replacing all remaining references to time_t with safe alternatives. Some related parts of the series were picked up into the nfsd, xfs, alsa and v4l2 trees. A final set of patches in linux-mm removes the now unused time_t/timeval/timespec types and helper functions after all five branches are merged for linux-5.6, ensuring that no new users get merged. As a result, linux-5.6, or my backport of the patches to 5.4 [1], should be the first release that can serve as a base for a 32-bit system designed to run beyond year 2038, with a few remaining caveats:
- All user space must be compiled with a 64-bit time_t, which will be supported in the coming musl-1.2 and glibc-2.32 releases, along with installed kernel headers from linux-5.6 or higher.
- Applications that use the system call interfaces directly need to be ported to use the time64 syscalls added in linux-5.1 in place of the existing system calls. This impacts most users of futex() and seccomp() as well as programming languages that have their own runtime environment not based on libc.
- Applications that use a private copy of kernel uapi header files or their contents may need to update to the linux-5.6 version, in particular for sound/asound.h, xfs/xfs_fs.h, linux/input.h, linux/elfcore.h, linux/sockios.h, linux/timex.h and linux/can/bcm.h.
- A few remaining interfaces cannot be changed to pass a 64-bit time_t in a compatible way, so they must be configured to use CLOCK_MONOTONIC times or (with a y2106 problem) unsigned 32-bit timestamps. Most importantly this impacts all users of 'struct input_event'.
- All y2038 problems that are present on 64-bit machines also apply to 32-bit machines. In particular this affects file systems with on-disk timestamps using signed 32-bit seconds: ext4 with ext3-style small inodes, ext2, xfs (to be fixed soon) and ufs.
Changes since v1 [2]:
- Add Acks I received
- Rebase to v5.5-rc1, dropping patches that got merged already
- Add NFS, XFS and the final three patches from another series
- Rewrite etnaviv patches
- Add one late revert to avoid an etnaviv regression
[1] https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/log/?h=y2038-endgame
[2] https://lore.kernel.org/lkml/20191108213257.3097633-1-arnd@arndb.de/
Merge tag 'y2038-drivers-for-v5.6-signed' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground
Pull y2038 updates from Arnd Bergmann.
* tag 'y2038-drivers-for-v5.6-signed' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (21 commits)
- Revert "drm/etnaviv: reject timeouts with tv_nsec >= NSEC_PER_SEC"
- y2038: sh: remove timeval/timespec usage from headers
- y2038: sparc: remove use of struct timex
- y2038: rename itimerval to __kernel_old_itimerval
- y2038: remove obsolete jiffies conversion functions
- nfs: fscache: use timespec64 in inode auxdata
- nfs: fix timstamp debug prints
- nfs: use time64_t internally
- sunrpc: convert to time64_t for expiry
- drm/etnaviv: avoid deprecated timespec
- drm/etnaviv: reject timeouts with tv_nsec >= NSEC_PER_SEC
- drm/msm: avoid using 'timespec'
- hfs/hfsplus: use 64-bit inode timestamps
- hostfs: pass 64-bit timestamps to/from user space
- packet: clarify timestamp overflow
- tsacct: add 64-bit btime field
- acct: stop using get_seconds()
- um: ubd: use 64-bit time_t where possible
- xtensa: ISS: avoid struct timeval
- dlm: use SO_SNDTIMEO_NEW instead of SO_SNDTIMEO_OLD
- ... |
||
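The y2038 and y2106 limits mentioned in the pull message follow directly from the width of the seconds counter. The sketch below works the arithmetic in userspace; `wrap_to_s32` and `format_utc` are hypothetical helper names, and the code assumes a host whose own `time_t` is 64 bits wide so that the later dates can be formatted at all.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * Model what a signed 32-bit seconds field would hold for a given
 * 64-bit second count: keep the low 32 bits and reinterpret them as
 * two's complement, using only well-defined unsigned arithmetic.
 */
static int64_t wrap_to_s32(int64_t secs)
{
	uint32_t low = (uint32_t)secs;

	if (low >= 0x80000000u)
		return (int64_t)low - 4294967296LL;
	return (int64_t)low;
}

/*
 * Format seconds since the epoch as "YYYY-MM-DD HH:MM:SS" in UTC.
 * Returns 0 on success, -1 if the host time_t cannot represent the
 * value or the conversion fails. Uses gmtime(), so not thread-safe.
 */
static int format_utc(int64_t secs, char *buf, size_t len)
{
	time_t t = (time_t)secs;
	struct tm *tm;

	if ((int64_t)t != secs)
		return -1;
	tm = gmtime(&t);
	if (tm == NULL)
		return -1;
	return strftime(buf, len, "%Y-%m-%d %H:%M:%S", tm) ? 0 : -1;
}
```

Formatting `INT32_MAX` gives the last second a signed 32-bit counter can represent (2038-01-19 03:14:07 UTC), `wrap_to_s32((int64_t)INT32_MAX + 1)` lands in December 1901, and `UINT32_MAX` marks the y2106 limit of the unsigned variant.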
Linus Torvalds
|
a4fe2b4d87 |
Printk changes for 5.6
Merge tag 'printk-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk update from Petr Mladek: "Prevent replaying log on all consoles"
* tag 'printk-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
- printk: fix exclusive_console replaying |
||
Linus Torvalds
|
6aee4badd8 |
Merge branch 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull openat2 support from Al Viro: "This is the openat2() series from Aleksa Sarai. I'm afraid that the rest of namei stuff will have to wait - it got zero review the last time I'd posted #work.namei, and there had been a leak in the posted series I'd caught only last weekend. I was going to repost it on Monday, but the window opened and the odds of getting any review during that... Oh, well. Anyway, openat2 part should be ready; that _did_ get sane amount of review and public testing, so here it comes"
From Aleksa's description of the series: "For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1].
This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2).
Furthermore, the need for some sort of control over VFS's path resolution (to avoid malicious paths resulting in inadvertent breakouts) has been a very long-standing desire of many userspace applications.
This patchset is a revival of Al Viro's old AT_NO_JUMPS[3] patchset (which was a variant of David Drysdale's O_BENEATH patchset[4] which was a spin-off of the Capsicum project[5]) with a few additions and changes made based on the previous discussion within [6] as well as others I felt were useful.
In line with the conclusions of the original discussion of AT_NO_JUMPS, the flag has been split up into separate flags. However, instead of being an openat(2) flag it is provided through a new syscall openat2(2) which provides several other improvements to the openat(2) interface (see the patch description for more details). The following new LOOKUP_* flags are added:
LOOKUP_NO_XDEV: Blocks all mountpoint crossings (upwards, downwards, or through absolute links). Absolute pathnames alone in openat(2) do not trigger this. Magic-link traversal which implies a vfsmount jump is also blocked (though magic-link jumps on the same vfsmount are permitted).
LOOKUP_NO_MAGICLINKS: Blocks resolution through /proc/$pid/fd-style links. This is done by blocking the usage of nd_jump_link() during resolution in a filesystem. The term "magic-links" is used to match with the only reference to these links in Documentation/, but I'm happy to change the name. It should be noted that this is different to the scope of ~LOOKUP_FOLLOW in that it applies to all path components. However, you can do openat2(NO_FOLLOW|NO_MAGICLINKS) on a magic-link and it will *not* fail (assuming that no parent component was a magic-link), and you will have an fd for the magic-link. In order to correctly detect magic-links, the introduction of a new LOOKUP_MAGICLINK_JUMPED state flag was required.
LOOKUP_BENEATH: Disallows escapes to outside the starting dirfd's tree, using techniques such as ".." or absolute links. Absolute paths in openat(2) are also disallowed. Conceptually this flag is to ensure you "stay below" a certain point in the filesystem tree -- but this requires some additional work to protect against various races that would allow escape using "..". Currently LOOKUP_BENEATH implies LOOKUP_NO_MAGICLINKS, because it can trivially beam you around the filesystem (breaking the protection). In future, there might be similar safety checks done as in LOOKUP_IN_ROOT, but that requires more discussion.
In addition, two new flags are added that expand on the above ideas:
LOOKUP_NO_SYMLINKS: Does what it says on the tin. No symlink resolution is allowed at all, including magic-links. Just as with LOOKUP_NO_MAGICLINKS this can still be used with NOFOLLOW to open an fd for the symlink as long as no parent path had a symlink component.
LOOKUP_IN_ROOT: This is an extension of LOOKUP_BENEATH that, rather than blocking attempts to move past the root, forces all such movements to be scoped to the starting point. This provides chroot(2)-like protection but without the cost of a chroot(2) for each filesystem operation, as well as being safe against race attacks that chroot(2) is not. If a race is detected (as with LOOKUP_BENEATH) then an error is generated, and similar to LOOKUP_BENEATH it is not permitted to cross magic-links with LOOKUP_IN_ROOT.
The primary need for this is from container runtimes, which currently need to do symlink scoping in userspace[7] when opening paths in a potentially malicious container. There is a long list of CVEs that could have been mitigated by having RESOLVE_THIS_ROOT (such as CVE-2017-1002101, CVE-2017-1002102, CVE-2018-15664, and CVE-2019-5736, just to name a few).
In order to make all of the above more usable, I'm working on libpathrs[8] which is a C-friendly library for safe path resolution. It features a userspace-emulated backend if the kernel doesn't support openat2(2). Hopefully we can get userspace to switch to using it, and thus get openat2(2) support for free once it's ready.
Future work would include implementing things like RESOLVE_NO_AUTOMOUNT and possibly a RESOLVE_NO_REMOTE (to allow programs to be sure they don't hit DoSes through stale NFS handles)"
* 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Documentation: path-lookup: include new LOOKUP flags
  selftests: add openat2(2) selftests
  open: introduce openat2(2) syscall
  namei: LOOKUP_{IN_ROOT,BENEATH}: permit limited ".." resolution
  namei: LOOKUP_IN_ROOT: chroot-like scoped resolution
  namei: LOOKUP_BENEATH: O_BENEATH-like scoped resolution
  namei: LOOKUP_NO_XDEV: block mountpoint crossing
  namei: LOOKUP_NO_MAGICLINKS: block magic-link resolution
  namei: LOOKUP_NO_SYMLINKS: block symlink resolution
  namei: allow set_root() to produce errors
  namei: allow nd_jump_link() to produce errors
  nsfs: clean-up ns_get_path() signature to return int
  namei: only return -ECHILD from follow_dotdot_rcu() |
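Note: the scoped-resolution behaviour described for LOOKUP_BENEATH can be tried from userspace with a small wrapper around the raw syscall. This is a minimal sketch, not code from the series: the `sketch_` names are hypothetical, the struct is a local mirror of the three-field `struct open_how`, and the value 0x08 for the userspace-facing RESOLVE_BENEATH flag and the fallback syscall number 437 are assumptions based on the uapi header as merged, not something stated in the message above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: openat2 is syscall 437 where the libc headers predate it. */
#ifndef SYS_openat2
#define SYS_openat2 437
#endif

/* Hypothetical local mirror of the uapi struct open_how (three u64 fields). */
struct sketch_open_how {
    uint64_t flags;   /* O_* flags, as for openat(2) */
    uint64_t mode;    /* only meaningful with O_CREAT / O_TMPFILE */
    uint64_t resolve; /* RESOLVE_* path-resolution restrictions */
};

/* Assumption: userspace value of RESOLVE_BENEATH in the merged header. */
#define SKETCH_RESOLVE_BENEATH 0x08

/*
 * Open `path` read-only relative to `dirfd`, refusing any resolution that
 * would leave the tree below `dirfd`. Returns an fd, or -1 with errno set
 * (also when the running kernel has no openat2 at all).
 */
static int sketch_open_beneath(int dirfd, const char *path)
{
    struct sketch_open_how how = {
        .flags = O_RDONLY | O_CLOEXEC,
        .mode = 0,
        .resolve = SKETCH_RESOLVE_BENEATH,
    };

    return (int)syscall(SYS_openat2, dirfd, path, &how, sizeof(how));
}
```

With this wrapper, both `sketch_open_beneath(AT_FDCWD, "../escape")` and an absolute path such as `"/etc"` are expected to fail, either because the kernel rejects the escape or because it does not implement the syscall.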
||
Linus Torvalds
|
15d6632496 |
Merge branch 'urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU warning removal from Paul McKenney: "A single commit that fixes an embarrassing bug discussed here: https://lore.kernel.org/lkml/20200125131425.GB16136@zn.tnic/ which apparently also affects smaller systems" [ This was sent to Ingo, but since I see the issue on the laptop I use for testing during the merge window, I'm doing the pull directly - Linus ] * 'urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: rcu: Forgive slow expedited grace periods at boot time |
||
Linus Torvalds
|
fad7bdc9b0 |
Merge tag 'for-linus-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Anton Ivanov: "I am sending this on behalf of Richard who is traveling. This contains the following changes for UML:
- Fix for time travel mode
- Disable CONFIG_CONSTRUCTORS again
- A new command line option to have a non-raw serial line
- Preparations to remove obsolete UML network drivers"
* tag 'for-linus-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Fix time-travel=inf-cpu with xor/raid6
  Revert "um: Enable CONFIG_CONSTRUCTORS"
  um: Mark non-vector net transports as obsolete
  um: Add an option to make serial driver non-raw |
||
Linus Torvalds
|
a78416d974 |
Merge tag 'trace-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt: "Kprobe events added 'ustring' to distinguish reading strings from kernel space or user space. But the creating of the event format file only checks for 'string' to display string formats. 'ustring' must also be handled"
* tag 'trace-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/kprobes: Have uname use __get_str() in print_fmt |
||
Linus Torvalds
|
bd2463ac7d |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller: 1) Add WireGuard 2) Add HE and TWT support to ath11k driver, from John Crispin. 3) Add ESP in TCP encapsulation support, from Sabrina Dubroca. 4) Add variable window congestion control to TIPC, from Jon Maloy. 5) Add BCM84881 PHY driver, from Russell King. 6) Start adding netlink support for ethtool operations, from Michal Kubecek. 7) Add XDP drop and TX action support to ena driver, from Sameeh Jubran. 8) Add new ipv4 route notifications so that mlxsw driver does not have to handle identical routes itself. From Ido Schimmel. 9) Add BPF dynamic program extensions, from Alexei Starovoitov. 10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes. 11) Add support for macsec HW offloading, from Antoine Tenart. 12) Add initial support for MPTCP protocol, from Christoph Paasch, Matthieu Baerts, Florian Westphal, Peter Krystad, and many others. 13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu Cherian, and others. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits) net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC udp: segment looped gso packets correctly netem: change mailing list qed: FW 8.42.2.0 debug features qed: rt init valid initialization changed qed: Debug feature: ilt and mdump qed: FW 8.42.2.0 Add fw overlay feature qed: FW 8.42.2.0 HSI changes qed: FW 8.42.2.0 iscsi/fcoe changes qed: Add abstraction for different hsi values per chip qed: FW 8.42.2.0 Additional ll2 type qed: Use dmae to write to widebus registers in fw_funcs qed: FW 8.42.2.0 Parser offsets modified qed: FW 8.42.2.0 Queue Manager changes qed: FW 8.42.2.0 Expose new registers and change windows qed: FW 8.42.2.0 Internal ram offsets modifications MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver Documentation: net: octeontx2: Add RVU HW and drivers overview octeontx2-pf: ethtool RSS config support octeontx2-pf: Add basic ethtool support ... |
||
Linus Torvalds
|
a78208e243 |
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu: "API: - Removed CRYPTO_TFM_RES flags - Extended spawn grabbing to all algorithm types - Moved hash descsize verification into API code Algorithms: - Fixed recursive pcrypt dead-lock - Added new 32 and 64-bit generic versions of poly1305 - Added cryptogams implementation of x86/poly1305 Drivers: - Added support for i.MX8M Mini in caam - Added support for i.MX8M Nano in caam - Added support for i.MX8M Plus in caam - Added support for A33 variant of SS in sun4i-ss - Added TEE support for Raven Ridge in ccp - Added in-kernel API to submit TEE commands in ccp - Added AMD-TEE driver - Added support for BCM2711 in iproc-rng200 - Added support for AES256-GCM based ciphers for chtls - Added aead support on SEC2 in hisilicon" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (244 commits) crypto: arm/chacha - fix build failured when kernel mode NEON is disabled crypto: caam - add support for i.MX8M Plus crypto: x86/poly1305 - emit does base conversion itself crypto: hisilicon - fix spelling mistake "disgest" -> "digest" crypto: chacha20poly1305 - add back missing test vectors and test chunking crypto: x86/poly1305 - fix .gitignore typo tee: fix memory allocation failure checks on drv_data and amdtee crypto: ccree - erase unneeded inline funcs crypto: ccree - make cc_pm_put_suspend() void crypto: ccree - split overloaded usage of irq field crypto: ccree - fix PM race condition crypto: ccree - fix FDE descriptor sequence crypto: ccree - cc_do_send_request() is void func crypto: ccree - fix pm wrongful error reporting crypto: ccree - turn errors to debug msgs crypto: ccree - fix AEAD decrypt auth fail crypto: ccree - fix typo in comment crypto: ccree - fix typos in error msgs crypto: atmel-{aes,sha,tdes} - Retire crypto_platform_data crypto: x86/sha - Eliminate casts on asm implementations ... |
||
Linus Torvalds
|
c677124e63 |
Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar: "These were the main changes in this cycle: - More -rt motivated separation of CONFIG_PREEMPT and CONFIG_PREEMPTION. - Add more low level scheduling topology sanity checks and warnings to filter out nonsensical topologies that break scheduling. - Extend uclamp constraints to influence wakeup CPU placement - Make the RT scheduler more aware of asymmetric topologies and CPU capacities, via uclamp metrics, if CONFIG_UCLAMP_TASK=y - Make idle CPU selection more consistent - Various fixes, smaller cleanups, updates and enhancements - please see the git log for details" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits) sched/fair: Define sched_idle_cpu() only for SMP configurations sched/topology: Assert non-NUMA topology masks don't (partially) overlap idle: fix spelling mistake "iterrupts" -> "interrupts" sched/fair: Remove redundant call to cpufreq_update_util() sched/psi: create /proc/pressure and /proc/pressure/{io|memory|cpu} only when psi enabled sched/fair: Fix sgc->{min,max}_capacity calculation for SD_OVERLAP sched/fair: calculate delta runnable load only when it's needed sched/cputime: move rq parameter in irqtime_account_process_tick stop_machine: Make stop_cpus() static sched/debug: Reset watchdog on all CPUs while processing sysrq-t sched/core: Fix size of rq::uclamp initialization sched/uclamp: Fix a bug in propagating uclamp value in new cgroups sched/fair: Load balance aggressively for SCHED_IDLE CPUs sched/fair : Improve update_sd_pick_busiest for spare capacity case watchdog: Remove soft_lockup_hrtimer_cnt and related code sched/rt: Make RT capacity-aware sched/fair: Make EAS wakeup placement consider uclamp restrictions sched/fair: Make task_fits_capacity() consider uclamp restrictions sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with() sched/uclamp: Make uclamp util helpers use and return UL values ... |
||
Linus Torvalds
|
c0e809e244 |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "Kernel side changes: - Ftrace is one of the last W^X violators (after this only KLP is left). These patches move it over to the generic text_poke() interface and thereby get rid of this oddity. This requires a surprising amount of surgery, by Peter Zijlstra. - x86/AMD PMUs: add support for 'Large Increment per Cycle Events' to count certain types of events that have a special, quirky hw ABI (by Kim Phillips) - kprobes fixes by Masami Hiramatsu Lots of tooling updates as well, the following subcommands were updated: annotate/report/top, c2c, clang, record, report/top TUI, sched timehist, tests; plus updates were done to the gtk ui, libperf, headers and the parser" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits) perf/x86/amd: Add support for Large Increment per Cycle Events perf/x86/amd: Constrain Large Increment per Cycle events perf/x86/intel/rapl: Add Comet Lake support tracing: Initialize ret in syscall_enter_define_fields() perf header: Use last modification time for timestamp perf c2c: Fix return type for histogram sorting comparision functions perf beauty sockaddr: Fix augmented syscall format warning perf/ui/gtk: Fix gtk2 build perf ui gtk: Add missing zalloc object perf tools: Use %define api.pure full instead of %pure-parser libperf: Setup initial evlist::all_cpus value perf report: Fix no libunwind compiled warning break s390 issue perf tools: Support --prefix/--prefix-strip perf report: Clarify in help that --children is default tools build: Fix test-clang.cpp with Clang 8+ perf clang: Fix build with Clang 9 kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic tools lib: Fix builds when glibc contains strlcpy() perf report/top: Make 'e' visible in the help and make it toggle showing callchains perf report/top: Do not offer annotation for symbols without samples ... |
||
Linus Torvalds
|
2180f214f4 |
Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar: "Just a handful of changes in this cycle: an ARM64 performance optimization, a comment fix and a debug output fix" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/osq: Use optimized spinning loop for arm64 locking/qspinlock: Fix inaccessible URL of MCS lock paper locking/lockdep: Fix lockdep_stats indentation problem |
||
Linus Torvalds
|
d99391ec2b |
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar: "The RCU changes in this cycle were: - Expedited grace-period updates - kfree_rcu() updates - RCU list updates - Preemptible RCU updates - Torture-test updates - Miscellaneous fixes - Documentation updates" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits) rcu: Remove unused stop-machine #include powerpc: Remove comment about read_barrier_depends() .mailmap: Add entries for old paulmck@kernel.org addresses srcu: Apply *_ONCE() to ->srcu_last_gp_end rcu: Switch force_qs_rnp() to for_each_leaf_node_cpu_mask() rcu: Move rcu_{expedited,normal} definitions into rcupdate.h rcu: Move gp_state_names[] and gp_state_getname() to tree_stall.h rcu: Remove the declaration of call_rcu() in tree.h rcu: Fix tracepoint tracking RCU CPU kthread utilization rcu: Fix harmless omission of "CONFIG_" from #if condition rcu: Avoid tick_dep_set_cpu() misordering rcu: Provide wrappers for uses of ->rcu_read_lock_nesting rcu: Use READ_ONCE() for ->expmask in rcu_read_unlock_special() rcu: Clear ->rcu_read_unlock_special only once rcu: Clear .exp_hint only when deferred quiescent state has been reported rcu: Rename some instance of CONFIG_PREEMPTION to CONFIG_PREEMPT_RCU rcu: Remove kfree_call_rcu_nobatch() rcu: Remove kfree_rcu() special casing and lazy-callback handling rcu: Add support for debug_objects debugging for kfree_rcu() rcu: Add multiple in-flight batches of kfree_rcu() work ... |
||
Mike Christie
|
8d19f1c8e1
|
prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim
There are several storage drivers like dm-multipath, iscsi, tcmu-runner, and nbd that have userspace components that can run in the IO path. For example, iscsi and nbd's userspace daemons may need to recreate a socket and/or send IO on it, and dm-multipath's daemon multipathd may need to send SG IO or read/write IO to figure out the state of paths and re-set them up.
In the kernel these drivers have access to GFP_NOIO/GFP_NOFS and the memalloc_*_save/restore functions to control the allocation behavior, but for userspace we would end up hitting an allocation that ended up writing data back to the same device we are trying to allocate for. The device is then in a state of deadlock, because to execute IO the device needs to allocate memory, but to allocate memory the memory layers want to execute IO to the device.
Here is an example with nbd using a local userspace daemon that performs network IO to a remote server. We are using XFS on top of the nbd device, but it can happen with any FS or other modules layered on top of the nbd device that can write out data to free memory. Here an nbd daemon helper thread, msgr-worker-1, is performing a write/sendmsg on a socket to execute a request. This kicks off a reclaim operation which results in a WRITE to the nbd device and the nbd thread calling back into the mm layer.
[ 1626.609191] msgr-worker-1 D 0 1026 1 0x00004000
[ 1626.609193] Call Trace:
[ 1626.609195] ? __schedule+0x29b/0x630
[ 1626.609197] ? wait_for_completion+0xe0/0x170
[ 1626.609198] schedule+0x30/0xb0
[ 1626.609200] schedule_timeout+0x1f6/0x2f0
[ 1626.609202] ? blk_finish_plug+0x21/0x2e
[ 1626.609204] ? _xfs_buf_ioapply+0x2e6/0x410
[ 1626.609206] ? wait_for_completion+0xe0/0x170
[ 1626.609208] wait_for_completion+0x108/0x170
[ 1626.609210] ? wake_up_q+0x70/0x70
[ 1626.609212] ? __xfs_buf_submit+0x12e/0x250
[ 1626.609214] ? xfs_bwrite+0x25/0x60
[ 1626.609215] xfs_buf_iowait+0x22/0xf0
[ 1626.609218] __xfs_buf_submit+0x12e/0x250
[ 1626.609220] xfs_bwrite+0x25/0x60
[ 1626.609222] xfs_reclaim_inode+0x2e8/0x310
[ 1626.609224] xfs_reclaim_inodes_ag+0x1b6/0x300
[ 1626.609227] xfs_reclaim_inodes_nr+0x31/0x40
[ 1626.609228] super_cache_scan+0x152/0x1a0
[ 1626.609231] do_shrink_slab+0x12c/0x2d0
[ 1626.609233] shrink_slab+0x9c/0x2a0
[ 1626.609235] shrink_node+0xd7/0x470
[ 1626.609237] do_try_to_free_pages+0xbf/0x380
[ 1626.609240] try_to_free_pages+0xd9/0x1f0
[ 1626.609245] __alloc_pages_slowpath+0x3a4/0xd30
[ 1626.609251] ? ___slab_alloc+0x238/0x560
[ 1626.609254] __alloc_pages_nodemask+0x30c/0x350
[ 1626.609259] skb_page_frag_refill+0x97/0xd0
[ 1626.609274] sk_page_frag_refill+0x1d/0x80
[ 1626.609279] tcp_sendmsg_locked+0x2bb/0xdd0
[ 1626.609304] tcp_sendmsg+0x27/0x40
[ 1626.609307] sock_sendmsg+0x54/0x60
[ 1626.609308] ___sys_sendmsg+0x29f/0x320
[ 1626.609313] ? sock_poll+0x66/0xb0
[ 1626.609318] ? ep_item_poll.isra.15+0x40/0xc0
[ 1626.609320] ? ep_send_events_proc+0xe6/0x230
[ 1626.609322] ? hrtimer_try_to_cancel+0x54/0xf0
[ 1626.609324] ? ep_read_events_proc+0xc0/0xc0
[ 1626.609326] ? _raw_write_unlock_irq+0xa/0x20
[ 1626.609327] ? ep_scan_ready_list.constprop.19+0x218/0x230
[ 1626.609329] ? __hrtimer_init+0xb0/0xb0
[ 1626.609331] ? _raw_spin_unlock_irq+0xa/0x20
[ 1626.609334] ? ep_poll+0x26c/0x4a0
[ 1626.609337] ? tcp_tsq_write.part.54+0xa0/0xa0
[ 1626.609339] ? release_sock+0x43/0x90
[ 1626.609341] ? _raw_spin_unlock_bh+0xa/0x20
[ 1626.609342] __sys_sendmsg+0x47/0x80
[ 1626.609347] do_syscall_64+0x5f/0x1c0
[ 1626.609349] ? prepare_exit_to_usermode+0x75/0xa0
[ 1626.609351] entry_SYSCALL_64_after_hwframe+0x44/0xa9
This patch adds a new prctl command that daemons can use after they have done their initial setup, and before they start to do allocations that are in the IO path. It sets the PF_MEMALLOC_NOIO and PF_LESS_THROTTLE flags so both userspace block and FS threads can use it to avoid the allocation recursion and to try to avoid being throttled while writing out data to free up memory.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Masato Suzuki <masato.suzuki@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Link: https://lore.kernel.org/r/20191112001900.9206-1-mchristi@redhat.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
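Note: a daemon in the IO path would issue the new command once its setup is done. The following is a minimal sketch rather than code from the patch: the `sketch_` helper names are hypothetical, and the fallback values 57 and 58 for PR_SET_IO_FLUSHER and PR_GET_IO_FLUSHER are assumptions taken from the uapi header as merged, for building against older libc headers. The call is expected to fail for callers without the required privilege (CAP_SYS_RESOURCE, to the best of my knowledge) and on kernels that predate the command.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Assumption: command numbers as merged, for older userspace headers. */
#ifndef PR_SET_IO_FLUSHER
#define PR_SET_IO_FLUSHER 57
#define PR_GET_IO_FLUSHER 58
#endif

/*
 * Mark (enable != 0) or unmark the calling thread as an IO flusher.
 * Returns 0 on success or a negative errno value.
 */
static int sketch_set_io_flusher(int enable)
{
    if (prctl(PR_SET_IO_FLUSHER, enable ? 1UL : 0UL, 0UL, 0UL, 0UL) == -1)
        return -errno;
    return 0;
}

/* Returns 1 or 0 for the current state, or a negative errno value. */
static int sketch_get_io_flusher(void)
{
    int state = prctl(PR_GET_IO_FLUSHER, 0UL, 0UL, 0UL, 0UL);

    return state == -1 ? -errno : state;
}
```

A daemon such as the nbd helper in the trace above would call `sketch_set_io_flusher(1)` after initial setup and before entering its request loop, and treat a negative return as "keep running without the protection".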
||
Ingo Molnar
|
0cc4bd8f70 |
Merge branch 'core/kprobes' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Linus Torvalds
|
3d3b44a61a |
Merge tag 'irq-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner: "The interrupt department provides:
- A mechanism to shield isolated tasks from managed interrupts: The affinity of managed interrupts is completely controlled by the kernel and user space has no influence on them. The reason is that the automatically assigned affinity correlates to the multi-queue CPU handling of block devices. If the generated affinity mask spans both housekeeping and isolated CPUs the interrupt could be routed to an isolated CPU which would then be disturbed by I/O submitted by a housekeeping CPU. The new mechanism ensures that as long as one housekeeping CPU is online in the assigned affinity mask the interrupt is routed to a housekeeping CPU. If there is no online housekeeping CPU in the affinity mask, then the interrupt is routed to an isolated CPU to keep the device queue intact, but unless the isolated CPU submits I/O by itself these interrupts are not raised.
- A small addon to the device tree irqdomain core code to avoid duplication in irq chip drivers
- Conversion of the SiFive PLIC to hierarchical domains
- The usual pile of new irq chip drivers: SiFive GPIO, Aspeed SCI, NXP INTMUX, Meson A1 GPIO
- The first cut of support for the new ARM GICv4.1
- The usual pile of fixes and improvements in core and driver code"
* tag 'irq-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  genirq, sched/isolation: Isolate from handling managed interrupts
  irqchip/gic-v4.1: Allow direct invalidation of VLPIs
  irqchip/gic-v4.1: Suppress per-VLPI doorbell
  irqchip/gic-v4.1: Add VPE INVALL callback
  irqchip/gic-v4.1: Add VPE eviction callback
  irqchip/gic-v4.1: Add VPE residency callback
  irqchip/gic-v4.1: Add mask/unmask doorbell callbacks
  irqchip/gic-v4.1: Plumb skeletal VPE irqchip
  irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP
  irqchip/gic-v4.1: Don't use the VPE proxy if RVPEID is set
  irqchip/gic-v4.1: Implement the v4.1 flavour of VMAPP
  irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation
  irqchip/gic-v3: Add GICv4.1 VPEID size discovery
  irqchip/gic-v3: Detect GICv4.1 supporting RVPEID
  irqchip/gic-v3-its: Fix get_vlpi_map() breakage with doorbells
  irqdomain: Fix a memory leak in irq_domain_push_irq()
  irqchip: Add NXP INTMUX interrupt multiplexer support
  dt-bindings: interrupt-controller: Add binding for NXP INTMUX interrupt multiplexer
  irqchip: Define EXYNOS_IRQ_COMBINER
  irqchip/meson-gpio: Add support for meson a1 SoCs
  ... |
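Note: the routing rule for managed interrupts described above can be modelled with plain bitmasks. This is an illustrative sketch under stated assumptions, not the kernel's implementation: CPUs are bits in a 64-bit word, and `sketch_cpumask_t` and `sketch_pick_irq_target()` are hypothetical names introduced here.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the set. */
typedef uint64_t sketch_cpumask_t;

/*
 * Pick the CPUs a managed interrupt may be routed to.
 *
 * If at least one housekeeping CPU in the assigned affinity mask is online,
 * restrict the interrupt to the housekeeping part of the mask. Otherwise
 * fall back to the full affinity mask, so the interrupt lands on an isolated
 * CPU and the device queue stays usable.
 */
static sketch_cpumask_t sketch_pick_irq_target(sketch_cpumask_t affinity,
                                               sketch_cpumask_t housekeeping,
                                               sketch_cpumask_t online)
{
    sketch_cpumask_t hk_part = affinity & housekeeping;

    if (hk_part & online)
        return hk_part;
    return affinity;
}
```

For an affinity mask covering CPUs 0-3 with CPUs 0-1 as housekeeping, the result is CPUs 0-1 while either of them is online, and the whole mask once both are offline.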
||
Linus Torvalds
|
ab67f60025 |
Merge tag 'smp-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core SMP updates from Thomas Gleixner: "A small set of SMP core code changes:
- Rework the smp function call core code to avoid the allocation of an additional cpumask
- Remove the no longer required GFP argument from on_each_cpu_cond() and on_each_cpu_cond_mask() and fix up the callers"
* tag 'smp-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smp: Remove allocation mask from on_each_cpu_cond.*()
  smp: Add a smp_cond_func_t argument to smp_call_function_many()
  smp: Use smp_cond_func_t as type for the conditional function |
||
Linus Torvalds
|
e279160f49 |
Merge tag 'timers-core-2020-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner: "The timekeeping and timers department provides:
- Time namespace support: If a container migrates from one host to another then it expects that clocks based on MONOTONIC and BOOTTIME are not subject to disruption. Due to different boot time and non-suspended runtime these clocks can differ significantly on two hosts, in the worst case time goes backwards which is a violation of the POSIX requirements. The time namespace addresses this problem. It allows setting offsets for clock MONOTONIC and BOOTTIME once after creation and before tasks are associated with the namespace. These offsets are taken into account by timers and timekeeping including the VDSO. Offsets for wall clock based clocks (REALTIME/TAI) are not provided by this mechanism. While in theory possible, the overhead and code complexity would be immense and not justified by the esoteric potential use cases which were discussed at Plumbers '18. The overhead for tasks in the root namespace (i.e. where host time offsets = 0) is in the noise and great effort was made to ensure that especially in the VDSO. If time namespace is disabled in the kernel configuration the code is compiled out. Kudos to Andrei Vagin and Dmitry Safonov who implemented this feature and kept on for more than a year addressing review comments, finding better solutions. A pleasant experience.
- Overhaul of the alarmtimer device dependency handling to ensure that the init/suspend/resume ordering is correct.
- A new clocksource/event driver for Microchip PIT64
- Suspend/resume support for the Hyper-V clocksource
- The usual pile of fixes, updates and improvements mostly in the driver code"
* tag 'timers-core-2020-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
  alarmtimer: Make alarmtimer_get_rtcdev() a stub when CONFIG_RTC_CLASS=n
  alarmtimer: Use wakeup source from alarmtimer platform device
  alarmtimer: Make alarmtimer platform device child of RTC device
  alarmtimer: Update alarmtimer_get_rtcdev() docs to reflect reality
  hrtimer: Add missing sparse annotation for __run_timer()
  lib/vdso: Only read hrtimer_res when needed in __cvdso_clock_getres()
  MIPS: vdso: Define BUILD_VDSO32 when building a 32bit kernel
  clocksource/drivers/hyper-v: Set TSC clocksource as default w/ InvariantTSC
  clocksource/drivers/hyper-v: Untangle stimers and timesync from clocksources
  clocksource/drivers/timer-microchip-pit64b: Fix sparse warning
  clocksource/drivers/exynos_mct: Rename Exynos to lowercase
  clocksource/drivers/timer-ti-dm: Fix uninitialized pointer access
  clocksource/drivers/timer-ti-dm: Switch to platform_get_irq
  clocksource/drivers/timer-ti-dm: Convert to devm_platform_ioremap_resource
  clocksource/drivers/em_sti: Fix variable declaration in em_sti_probe
  clocksource/drivers/em_sti: Convert to devm_platform_ioremap_resource
  clocksource/drivers/bcm2835_timer: Fix memory leak of timer
  clocksource/drivers/cadence-ttc: Use ttc driver as platform driver
  clocksource/drivers/timer-microchip-pit64b: Add Microchip PIT64B support
  clocksource/drivers/hyper-v: Reserve PAGE_SIZE space for tsc page
  ... |
||
Linus Torvalds
|
b11c89a158 |
A set of watchdog/softlockup related improvements:
- Enforce that the watchdog timestamp is always valid on boot. The original implementation caused a watchdog disabled gap of one second in the boot process due to truncation of the underlying sched clock. The sched clock is divided by 1e9 to convert nanoseconds to seconds. So for the first second of the boot process the result is 0 which is at the same time the indicator to disable the watchdog. The trivial fix is to change the disabled indicator to ULONG_MAX. - Two cleanup patches removing unused and redundant code which got forgotten to be cleaned up in previous changes. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl4vbrQTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoTQHD/9ONyg9VQLjk6aH94H1Sjik/K7zvxoC aMGY2onZ6PddVrcTgJoMmWteQlQ2YScCSVnfVedmxTRU8laEHU/LQnMntTAbuHWj VUkK8X/AI5l+VY6p0Sr1iCyxcFezoC2VMqOKntuQl3080mK7R7/fQ+ZVmimiPihr 46qMikIfBN7w2od7Ger3dZRttbnRj5YsmLBenX/HtBY/HPdhoDx6lfW/5AbAgUH5 qnAmM0yPZ/VUSfo45z+exESUezxByIkGsrROBtPSRwql3Oqbyrza2UC48dRjsuIQ vO0coorlhqJGF72WW45DiLvg4Hew/vVyzcYrIiOSQPZpeTtPzL23zk/cqcqpKy6N pCuiSgimzbPgzqTHs6WQR/D0Dn76rruUqXqteuD5zirC9Kjf2TWeIMPTgPfy8irt 2RwT1+5Ao/SNkdm/Pxk0S/+Y99uRJSqeNTV3lroYGC7IFMAnG4P0S9uyFJ6ZFIMz nOvEOhUlFXWw/w7WPZv+ytx40sRkqFVIePSRtzq+cjlDEYCgLhuveE2A4/6IGPMP Ej6vsGh3lMyHieRhmymESG8uLU2P/L7hhPexUPJJu4QSxKbKQNfWx+0z7bm86Ic7 0uDSNZZl7UDYq6tioS1DBTq9ybly9vn1WDe5tHMJDllPe9TIEnqynvVLIg6MMGdm GjbTNysDPx85yw== =WMiM -----END PGP SIGNATURE----- Merge tag 'core-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull watchdog updates from Thomas Gleixner: "A set of watchdog/softlockup related improvements: - Enforce that the watchdog timestamp is always valid on boot. The original implementation caused a watchdog disabled gap of one second in the boot process due to truncation of the underlying sched clock. The sched clock is divided by 1e9 to convert nanoseconds to seconds. 
So for the first second of the boot process the result is 0 which is at the same time the indicator to disable the watchdog. The trivial fix is to change the disabled indicator to ULONG_MAX. - Two cleanup patches removing unused and redundant code which got forgotten to be cleaned up in previous changes" * tag 'core-core-2020-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: watchdog/softlockup: Enforce that timestamp is valid on boot watchdog/softlockup: Remove obsolete check of last reported task watchdog: Remove soft_lockup_hrtimer_cnt and related code |
||
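The truncation problem described in the watchdog commit above can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions: the helper names and the `OLD_DISABLED` / `NEW_DISABLED` macros are hypothetical and are not the kernel's identifiers; only the division by 1e9 and the switch to `ULONG_MAX` come from the commit text.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical sentinels for illustration; only ULONG_MAX is named in the commit text. */
#define OLD_DISABLED 0UL
#define NEW_DISABLED ULONG_MAX

/* Convert a nanosecond clock reading to whole seconds (truncating division). */
static unsigned long ns_to_seconds(uint64_t ns)
{
    return (unsigned long)(ns / NSEC_PER_SEC);
}

/* True when a timestamp would be mistaken for the "watchdog disabled" marker. */
static bool looks_disabled(unsigned long timestamp, unsigned long sentinel)
{
    return timestamp == sentinel;
}
```

With a sentinel of 0, any reading taken during the first second truncates to 0 and is indistinguishable from "disabled"; with `ULONG_MAX` as the sentinel the same reading stays a valid timestamp.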
Linus Torvalds
|
a56c41e5d7 |
Two fixes for the generic VDSO code which missed 5.5:
- Make the update to the coarse timekeeper unconditional. This is required because the coarse timekeeper interfaces in the VDSO do not depend on a VDSO capable clocksource. If the system does not have a VDSO capable clocksource and the update is depending on the VDSO capable clocksource, the coarse VDSO interfaces would operate on stale data forever. - Invert the logic of __arch_update_vdso_data() to avoid further head scratching. Tripped over this several times while analyzing the update problem above. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl4vXzUTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYodbPD/4km+XOhsbefcn1Xo6SAQV9akPhKSHY h1gfjpe4UD+Uj4WfmpERHcCJA3sYtZSjNyEWkwagH1XjB+rcLc3JE8XvhPCZTXCx g/OQlww1ef6mBZ5nslpPUZs8i0HppoV7Sa955QxR/jWuOIEssg5c+XGqP8xX8AhX TqBOUcJd0LhqCGt76Gb6LHnOEshE8e6ptZ0xayzMZsab3LJTEaJCrsoDpADQ1q8A hMjiL3CG9/e12qKYhODFTbyc/wgyGQYK8g6sb9E1Twd2Tw2+ikRbtZuQd3HQv4jV SiVtmMqLu6IH+G608zeNIn/67/WX9zYqUZ3fZgSjBwXWoB84Gyj11KLnjmCgS6SH 0ddOQKPn8VyQc2anG4obRtMNB+TjJvGnB4QSL2ROJB7Zx6EYMsduhXwIbaNZDDro nIh6Xvl6iyb0lkhd9zCR7ak7UHJg4ECJsVKK3kAMIHJM4f53d/DwT+ZaHbJZa/2a OLoBGpBkJoE1X40dXou+0FUyUFRla42+ho99nCU580EyK/ZAuZEqKjjez9QIh4vN L/I6uEHGBw9myB40nb0DFhRIFR97BUkRTRA3VhyX0CYIE3gUL43zNFsdvcugsxRy 4/Cf7tqhQcSjYjJxpLTRRWt2t6QvDoWfTnrwiPqSepcO17uV8WHLrxK4mT2i8Vjc PIq7OgZlp09gQA== =ONO4 -----END PGP SIGNATURE----- Merge tag 'timers-urgent-2020-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "Two fixes for the generic VDSO code which missed 5.5: - Make the update to the coarse timekeeper unconditional. This is required because the coarse timekeeper interfaces in the VDSO do not depend on a VDSO capable clocksource. If the system does not have a VDSO capable clocksource and the update is depending on the VDSO capable clocksource, the coarse VDSO interfaces would operate on stale data forever. - Invert the logic of __arch_update_vdso_data() to avoid further head scratching. 
Tripped over this several times while analyzing the update problem above" * tag 'timers-urgent-2020-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lib/vdso: Update coarse timekeeper unconditionally lib/vdso: Make __arch_update_vdso_data() logic understandable |
||
Linus Torvalds
|
07e309a972 |
audit/stable-5.6 PR 20200127
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl4vRtMUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXM6rw//RXPHJ+U1gjtC5kWQX66/HxEwSY3c M236UiJD+xbEHKWpViFd6S7YzHQCkqEO2UvMSwMFP0aL2D56nhkEIKblQJ5sLSK9 3kNq/7wmxZgCj+/YrGeCiFFWpgSj/PiNB+VDouUkEkT5ZtKamA63qzhqEAUY995L vlZVgE8Cpu92JKJKZXKOnlJ+gYh3icFXKbWp0Lk9mmte4RiJ/zsFo+rRou5TzrMm 30D3A9p9A7sC3jMeRQCowE5UwTkdOeknRi1b4obAGAajuaA+/HtL7bUj8rVwjJXl bpX/wShrZDb+dc0NGLQikhzDV/i3qn1DzMbSMuJL/1tf9Jv5lzoJ0/14RkBzd5sm pPFA/tUs/3NlPKEyZluA7W21LOUdWk4UxeOJkysJLjfYvsVDg02yFS3qYaZRPaSa B3Ex36drCfQfMpMH4Nglh1iDl5oOIoAwn4mSCtirAw6YYG/sW6YnBEnloNYFfahs b4/xPhzKfzLtKdc+4yUSbTlIUU+GAdCLxPlp2IvRgqfa9oTATIRP9DY70//V3myN PGnCLCu10ag47fJWV4mNetYUv6BR22dvLLX8igcfYmIS3zYM0lEWEz7SOaRuPBdf QqAHMNaDCY6z8aEFr+aXW6kr2SP3ycqdvv+b+CbfX1Z7R7wZ8iG3uRyaQHEGPvN2 zje4VYJQcJs+EXE= =tPy4 -----END PGP SIGNATURE----- Merge tag 'audit-pr-20200127' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit update from Paul Moore: "One small audit patch for the Linux v5.6 merge window, and unsurprisingly it passes our test suite with flying colors" * tag 'audit-pr-20200127' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: Add __rcu annotation to RCU pointer |
||
Linus Torvalds
|
03aa8c8cfa |
Merge branch 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo: - cgroup2 interface for hugetlb controller. I think this was the last remaining bit which was missing from cgroup2 - fixes for race and a spurious warning in threaded cgroup handling - other minor changes * 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: iocost: Fix iocost_monitor.py due to helper type mismatch cgroup: Prevent double killing of css when enabling threaded cgroup cgroup: fix function name in comment mm: hugetlb controller for cgroups v2 |
||
Linus Torvalds
|
16d06120d7 |
Merge branch 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo: "Just a couple tracepoint patches" * 'for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: remove workqueue_work event class workqueue: add worker function to workqueue_execute_end tracepoint |
||
Linus Torvalds
|
6d277aca48 |
Power management updates for 5.6-rc1
- Update the ACPI processor driver in order to export acpi_processor_evaluate_cst() to the code outside of it, add ACPI support to the intel_idle driver based on that and clean up that driver somewhat (Rafael Wysocki). - Add an admin guide document for the intel_idle driver (Rafael Wysocki). - Clean up cpuidle core and drivers, enable compilation testing for some of them (Benjamin Gaignard, Krzysztof Kozlowski, Rafael Wysocki, Yangtao Li). - Fix reference counting of OPP (operating performance points) table structures (Viresh Kumar). - Add support for CPR (Core Power Reduction) to the AVS (Adaptive Voltage Scaling) subsystem (Niklas Cassel, Colin Ian King, YueHaibing). - Add support for TigerLake Mobile and JasperLake to the Intel RAPL power capping driver (Zhang Rui). - Update cpufreq drivers: * Add i.MX8MP support to imx-cpufreq-dt (Anson Huang). * Fix usage of a macro in loongson2_cpufreq (Alexandre Oliva). * Fix cpufreq policy reference counting issues in s3c and brcmstb-avs (chenqiwu). * Fix ACPI table reference counting issue and HiSilicon quirk handling in the CPPC driver (Hanjun Guo). * Clean up spelling mistake in intel_pstate (Harry Pan). * Convert the kirkwood and tegra186 drivers to using devm_platform_ioremap_resource() (Yangtao Li). - Update devfreq core: * Add 'name' sysfs attribute for devfreq devices (Chanwoo Choi). * Clean up the handling of transition statistics and allow them to be reset by writing 0 to the 'trans_stat' devfreq device attribute in sysfs (Kamil Konieczny). * Add 'devfreq_summary' to debugfs (Chanwoo Choi). * Clean up kerneldoc comments and Kconfig indentation (Krzysztof Kozlowski, Randy Dunlap). - Update devfreq drivers: * Add dynamic scaling for the imx8m DDR controller and clean up imx8m-ddrc (Leonard Crestez, YueHaibing). * Fix DT node reference counting and initialization error code path in rk3399_dmc and add COMPILE_TEST and HAVE_ARM_SMCCC dependency for it (Chanwoo Choi, Yangtao Li). 
* Fix DT node reference counting in rockchip-dfi and make it use devm_platform_ioremap_resource() (Yangtao Li). * Fix excessive stack usage in exynos-ppmu (Arnd Bergmann). * Fix initialization error code paths in exynos-bus (Yangtao Li). * Clean up exynos-bus and exynos somewhat (Artur Świgoń, Krzysztof Kozlowski). - Add tracepoints for tracking usage_count updates unrelated to status changes in PM-runtime (Michał Mirosław). - Add sysfs attribute to control the "sync on suspend" behavior during system-wide suspend (Jonas Meurer). - Switch system-wide suspend tests over to 64-bit time (Alexandre Belloni). - Make wakeup sources statistics in debugfs cover deleted ones which used to be the case some time ago (zhuguangqing). - Clean up computations carried out during hibernation, update messages related to hibernation and fix a spelling mistake in one of them (Wen Yang, Luigi Semenzato, Colin Ian King). - Add mailmap entry for maintainer e-mail address that has not been functional for several years (Rafael Wysocki). 
-----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl4u2fESHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxvlkP/j5vDzyNUNJjnD6+897c8W+z5dwdiQfU QNtoopFXgw/fpOhGXRdj2mA4e6RtpU9aCCiHR6/qdh3/1qSnR5Y9R/51/gmdkwhY YakSxmgpgGrOJru94ApI1o/35eWwN/GxjajbfNY5ScrPQl/L0DF3iJWRsAOR5534 p9e2gQqKecoE+MEn5JcGAXApA5xBLXuUmtWPUn5UGyhaz+jdmsf1zkDEOEvxREay hLGH1y6BY8HS/jytyNzISs9iDeBvg2fHmG8SskDiXVMke5sHBTU9MilgpnCFfQ0l OF/eNnTXTU7mAJhlnjBUt2rIe5peGSuhgg+Ur7s86xYqbj2SfsVM4UHjU0A6t9Jm sauWQh/Nbzw6XaCNzYKxP+dREAg0g/aq7xFqQi3bWx7YvzLk/hvNWi2+bv3adzx7 Z3fvOki4xMXzLLrh0f1ipC8BKTsdioDZPAy06B80a0luv6ROdr6bPL7did14mWt2 eCuPuZyXKhdV+PkjZHF+c4XT7N9NfGtE0WUQf54Q4VT00hDagGDliwXpm4ht1pjJ iO7uUJevXKSxMaV2xPZ+nWZaOeCVrMMTA1Ec1ELgC1n8WROZJ+SfhehgMQGp7BHS Hz4QO1HjTsCDnT+OU7JFeCRrkyXIlh75MOndWOOH6eTEXCAI9PihstB+UGXeNsK0 BesNQz1sYY1O =g48u -----END PGP SIGNATURE----- Merge tag 'pm-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add ACPI support to the intel_idle driver along with an admin guide document for it, add support for CPR (Core Power Reduction) to the AVS (Adaptive Voltage Scaling) subsystem, add new hardware support in a few places, add some new sysfs attributes, debugfs files and tracepoints, fix bugs and clean up a bunch of things all over. Specifics: - Update the ACPI processor driver in order to export acpi_processor_evaluate_cst() to the code outside of it, add ACPI support to the intel_idle driver based on that and clean up that driver somewhat (Rafael Wysocki). - Add an admin guide document for the intel_idle driver (Rafael Wysocki). - Clean up cpuidle core and drivers, enable compilation testing for some of them (Benjamin Gaignard, Krzysztof Kozlowski, Rafael Wysocki, Yangtao Li). - Fix reference counting of OPP (operating performance points) table structures (Viresh Kumar). 
- Add support for CPR (Core Power Reduction) to the AVS (Adaptive Voltage Scaling) subsystem (Niklas Cassel, Colin Ian King, YueHaibing). - Add support for TigerLake Mobile and JasperLake to the Intel RAPL power capping driver (Zhang Rui). - Update cpufreq drivers: - Add i.MX8MP support to imx-cpufreq-dt (Anson Huang). - Fix usage of a macro in loongson2_cpufreq (Alexandre Oliva). - Fix cpufreq policy reference counting issues in s3c and brcmstb-avs (chenqiwu). - Fix ACPI table reference counting issue and HiSilicon quirk handling in the CPPC driver (Hanjun Guo). - Clean up spelling mistake in intel_pstate (Harry Pan). - Convert the kirkwood and tegra186 drivers to using devm_platform_ioremap_resource() (Yangtao Li). - Update devfreq core: - Add 'name' sysfs attribute for devfreq devices (Chanwoo Choi). - Clean up the handling of transition statistics and allow them to be reset by writing 0 to the 'trans_stat' devfreq device attribute in sysfs (Kamil Konieczny). - Add 'devfreq_summary' to debugfs (Chanwoo Choi). - Clean up kerneldoc comments and Kconfig indentation (Krzysztof Kozlowski, Randy Dunlap). - Update devfreq drivers: - Add dynamic scaling for the imx8m DDR controller and clean up imx8m-ddrc (Leonard Crestez, YueHaibing). - Fix DT node reference counting and initialization error code path in rk3399_dmc and add COMPILE_TEST and HAVE_ARM_SMCCC dependency for it (Chanwoo Choi, Yangtao Li). - Fix DT node reference counting in rockchip-dfi and make it use devm_platform_ioremap_resource() (Yangtao Li). - Fix excessive stack usage in exynos-ppmu (Arnd Bergmann). - Fix initialization error code paths in exynos-bus (Yangtao Li). - Clean up exynos-bus and exynos somewhat (Artur Świgoń, Krzysztof Kozlowski). - Add tracepoints for tracking usage_count updates unrelated to status changes in PM-runtime (Michał Mirosław). - Add sysfs attribute to control the "sync on suspend" behavior during system-wide suspend (Jonas Meurer). 
- Switch system-wide suspend tests over to 64-bit time (Alexandre Belloni). - Make wakeup sources statistics in debugfs cover deleted ones which used to be the case some time ago (zhuguangqing). - Clean up computations carried out during hibernation, update messages related to hibernation and fix a spelling mistake in one of them (Wen Yang, Luigi Semenzato, Colin Ian King). - Add mailmap entry for maintainer e-mail address that has not been functional for several years (Rafael Wysocki)" * tag 'pm-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (83 commits) cpufreq: loongson2_cpufreq: adjust cpufreq uses of LOONGSON_CHIPCFG intel_idle: Clean up irtl_2_usec() intel_idle: Move 3 functions closer to their callers intel_idle: Annotate initialization code and data structures intel_idle: Move and clean up intel_idle_cpuidle_devices_uninit() intel_idle: Rearrange intel_idle_cpuidle_driver_init() intel_idle: Clean up NULL pointer check in intel_idle_init() intel_idle: Fold intel_idle_probe() into intel_idle_init() intel_idle: Eliminate __setup_broadcast_timer() cpuidle: fix cpuidle_find_deepest_state() kerneldoc warnings cpuidle: sysfs: fix warnings when compiling with W=1 cpuidle: coupled: fix warnings when compiling with W=1 cpufreq: brcmstb-avs: fix imbalance of cpufreq policy refcount PM: suspend: Add sysfs attribute to control the "sync on suspend" behavior PM / devfreq: Add debugfs support with devfreq_summary file Documentation: admin-guide: PM: Add intel_idle document cpuidle: arm: Enable compile testing for some of drivers PM-runtime: add tracepoints for usage_count changes cpufreq: intel_pstate: fix spelling mistake: "Whethet" -> "Whether" PM: hibernate: fix spelling mistake "shapshot" -> "snapshot" ... |
||
Linus Torvalds
|
0238d3c753 |
arm64 updates for 5.6
- New architecture features * Support for Armv8.5 E0PD, which benefits KASLR in the same way as KPTI but without the overhead. This allows KPTI to be disabled on CPUs that are not affected by Meltdown, even if KASLR is enabled. * Initial support for the Armv8.5 RNG instructions, which claim to provide access to a high bandwidth, cryptographically secure hardware random number generator. As well as exposing these to userspace, we also use them as part of the KASLR seed and to seed the crng once all CPUs have come online. * Advertise a bunch of new instructions to userspace, including support for Data Gathering Hint, Matrix Multiply and 16-bit floating point. - Kexec * Cleanups in preparation for relocating with the MMU enabled * Support for loading crash dump kernels with kexec_file_load() - Perf and PMU drivers * Cleanups and non-critical fixes for a couple of system PMU drivers - FPU-less (aka broken) CPU support * Considerable fixes to support CPUs without the FP/SIMD extensions, including their presence in heterogeneous systems. Good luck finding a 64-bit userspace that handles this. - Modern assembly function annotations * Start migrating our use of ENTRY() and ENDPROC() over to the new-fangled SYM_{CODE,FUNC}_{START,END} macros, which are intended to aid debuggers - Kbuild * Clean up detection of LSE support in the assembler by introducing 'as-instr' * Remove compressed Image files when building clean targets - IP checksumming * Implement optimised IPv4 checksumming routine when hardware offload is not in use. An IPv6 version is in the works, pending testing. - Hardware errata * Work around Cortex-A55 erratum #1530923 - Shadow call stack * Work around some issues with Clang's integrated assembler not liking our perfectly reasonable assembly code * Avoid allocating the X18 register, so that it can be used to hold the shadow call stack pointer in future - ACPI * Fix ID count checking in IORT code. 
This may regress broken firmware that happened to work with the old implementation, in which case we'll have to revert it and try something else * Fix DAIF corruption on return from GHES handler with pseudo-NMIs - Miscellaneous * Whitelist some CPUs that are unaffected by Spectre-v2 * Reduce frequency of ASID rollover when KPTI is compiled in but inactive * Reserve a couple of arch-specific PROT flags that are already used by Sparc and PowerPC and are planned for later use with BTI on arm64 * Preparatory cleanup of our entry assembly code in preparation for moving more of it into C later on * Refactoring and cleanup -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl4oY+IQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNNfRB/4p3vax0hqaOnLRvmJPRXF31B8oPlivnr2u 6HCA9LkdU5IlrgaTNOJ/sQEqJAPOPCU7v49Ol0iYw0iKL1suUE7Ikui5VB6Uybqt YbfF5UNzfXAMs2A86TF/hzqhxw+W+lpnZX8NVTuQeAODfHEGUB1HhTLfRi9INsER wKEAuoZyuSUibxTFvji+DAq7nVRniXX7CM7tE385pxDisCMuu/7E5wOl+3EZYXWz DTGzTbHXuVFL+UFCANFEUlAtmr3dQvPFIqAwVl/CxjRJjJ7a+/G3cYLsHFPrQCjj qYX4kfhAeeBtqmHL7YFNWFwFs5WaT5UcQquFO665/+uCTWSJpORY =AIh/ -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "The changes are a real mixed bag this time around. The only scary looking one from the diffstat is the uapi change to asm-generic/mman-common.h, but this has been acked by Arnd and is actually just adding a pair of comments in an attempt to prevent allocation of some PROT values which tend to get used for arch-specific purposes. We'll be using them for Branch Target Identification (a CFI-like hardening feature), which is currently under review on the mailing list. New architecture features: - Support for Armv8.5 E0PD, which benefits KASLR in the same way as KPTI but without the overhead. This allows KPTI to be disabled on CPUs that are not affected by Meltdown, even is KASLR is enabled. 
- Initial support for the Armv8.5 RNG instructions, which claim to provide access to a high bandwidth, cryptographically secure hardware random number generator. As well as exposing these to userspace, we also use them as part of the KASLR seed and to seed the crng once all CPUs have come online. - Advertise a bunch of new instructions to userspace, including support for Data Gathering Hint, Matrix Multiply and 16-bit floating point. Kexec: - Cleanups in preparation for relocating with the MMU enabled - Support for loading crash dump kernels with kexec_file_load() Perf and PMU drivers: - Cleanups and non-critical fixes for a couple of system PMU drivers FPU-less (aka broken) CPU support: - Considerable fixes to support CPUs without the FP/SIMD extensions, including their presence in heterogeneous systems. Good luck finding a 64-bit userspace that handles this. Modern assembly function annotations: - Start migrating our use of ENTRY() and ENDPROC() over to the new-fangled SYM_{CODE,FUNC}_{START,END} macros, which are intended to aid debuggers Kbuild: - Cleanup detection of LSE support in the assembler by introducing 'as-instr' - Remove compressed Image files when building clean targets IP checksumming: - Implement optimised IPv4 checksumming routine when hardware offload is not in use. An IPv6 version is in the works, pending testing. Hardware errata: - Work around Cortex-A55 erratum #1530923 Shadow call stack: - Work around some issues with Clang's integrated assembler not liking our perfectly reasonable assembly code - Avoid allocating the X18 register, so that it can be used to hold the shadow call stack pointer in future ACPI: - Fix ID count checking in IORT code. 
This may regress broken firmware that happened to work with the old implementation, in which case we'll have to revert it and try something else - Fix DAIF corruption on return from GHES handler with pseudo-NMIs Miscellaneous: - Whitelist some CPUs that are unaffected by Spectre-v2 - Reduce frequency of ASID rollover when KPTI is compiled in but inactive - Reserve a couple of arch-specific PROT flags that are already used by Sparc and PowerPC and are planned for later use with BTI on arm64 - Preparatory cleanup of our entry assembly code in preparation for moving more of it into C later on - Refactoring and cleanup" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (73 commits) arm64: acpi: fix DAIF manipulation with pNMI arm64: kconfig: Fix alignment of E0PD help text arm64: Use v8.5-RNG entropy for KASLR seed arm64: Implement archrandom.h for ARMv8.5-RNG arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean' arm64: entry: Avoid empty alternatives entries arm64: Kconfig: select HAVE_FUTEX_CMPXCHG arm64: csum: Fix pathological zero-length calls arm64: entry: cleanup sp_el0 manipulation arm64: entry: cleanup el0 svc handler naming arm64: entry: mark all entry code as notrace arm64: assembler: remove smp_dmb macro arm64: assembler: remove inherit_daif macro ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map() mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use arm64: Use macros instead of hard-coded constants for MAIR_EL1 arm64: Add KRYO{3,4}XX CPU cores to spectre-v2 safe list arm64: kernel: avoid x18 in __cpu_soft_restart arm64: kvm: stop treating register x18 as caller save arm64/lib: copy_page: avoid x18 register in assembler code ... |
||
Steven Rostedt (VMware)
|
20279420ae |
tracing/kprobes: Have uname use __get_str() in print_fmt
Thomas Richter reported:
> Test case 66 'Use vfs_getname probe to get syscall args filenames'
> is broken on s390, but works on x86. The test case fails with:
>
> [root@m35lp76 perf]# perf test -F 66
> 66: Use vfs_getname probe to get syscall args filenames
> :Recording open file:
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.004 MB /tmp/__perf_test.perf.data.TCdYj\
> (20 samples) ]
> Looking at perf.data file for vfs_getname records for the file we touched:
> FAILED!
> [root@m35lp76 perf]#
The root cause was the print_fmt of the kprobe event that referenced the
"ustring"
> Setting up the kprobe event using perf command:
>
> # ./perf probe "vfs_getname=getname_flags:72 pathname=filename:ustring"
>
> generates this format file:
> [root@m35lp76 perf]# cat /sys/kernel/debug/tracing/events/probe/\
> vfs_getname/format
> name: vfs_getname
> ID: 1172
> format:
> field:unsigned short common_type; offset:0; size:2; signed:0;
> field:unsigned char common_flags; offset:2; size:1; signed:0;
> field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> field:int common_pid; offset:4; size:4; signed:1;
>
> field:unsigned long __probe_ip; offset:8; size:8; signed:0;
> field:__data_loc char[] pathname; offset:16; size:4; signed:1;
>
> print fmt: "(%lx) pathname=\"%s\"", REC->__probe_ip, REC->pathname
Instead of using "__get_str(pathname)" it referenced it directly.
Link: http://lkml.kernel.org/r/20200124100742.4050c15e@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes:
|
||
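Why printing `REC->pathname` directly is wrong can be illustrated with a small sketch. It assumes the usual trace-event convention that a `__data_loc` field is a 32-bit descriptor with the offset from the start of the record in the low 16 bits and the length in the high 16 bits; the helper names below are hypothetical and are not the kernel's macros.

```c
#include <stdint.h>

/* Hypothetical helpers: split a 32-bit __data_loc descriptor.
 * Assumed layout: low 16 bits = offset from record start, high 16 bits = length. */
static uint16_t data_loc_offset(uint32_t loc)
{
    return (uint16_t)(loc & 0xffff);
}

static uint16_t data_loc_length(uint32_t loc)
{
    return (uint16_t)(loc >> 16);
}

/* Resolve the string the way a __get_str()-style accessor would:
 * start of the record plus the offset stored in the descriptor. */
static const char *record_get_str(const void *record, uint32_t loc)
{
    return (const char *)record + data_loc_offset(loc);
}
```

Under that assumption, the 4-byte `pathname` field in the format file above is a location descriptor rather than the string itself, which is why the print format has to go through an accessor such as `__get_str(pathname)`.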
David S. Miller
|
9e0703a265 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-01-27 The following pull-request contains BPF updates for your *net-next* tree. We've added 20 non-merge commits during the last 5 day(s) which contain a total of 24 files changed, 433 insertions(+), 104 deletions(-). The main changes are: 1) Make BPF trampolines and dispatcher aware for the stack unwinder, from Jiri Olsa. 2) Improve handling of failed CO-RE relocations in libbpf, from Andrii Nakryiko. 3) Several fixes to BPF sockmap and reuseport selftests, from Lorenz Bauer. 4) Various cleanups in BPF devmap's XDP flush code, from John Fastabend. 5) Fix BPF flow dissector when used with port ranges, from Yoshiki Komachi. 6) Fix bpffs' map_seq_next callback to always inc position index, from Vasily Averin. 7) Allow overriding LLVM tooling for runqslower utility, from Andrey Ignatov. 8) Silence false-positive lockdep splats in devmap hash lookup, from Amol Grover. 9) Fix fentry/fexit selftests to initialize a variable before use, from John Sperbeck. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Rafael J. Wysocki
|
245224d1cb |
Merge branches 'pm-cpufreq' and 'pm-sleep'
* pm-cpufreq: cpufreq: loongson2_cpufreq: adjust cpufreq uses of LOONGSON_CHIPCFG cpufreq: brcmstb-avs: fix imbalance of cpufreq policy refcount cpufreq: intel_pstate: fix spelling mistake: "Whethet" -> "Whether" cpufreq: s3c: fix unbalances of cpufreq policy refcount cpufreq: imx-cpufreq-dt: Add i.MX8MP support cpufreq: Use imx-cpufreq-dt for i.MX8MP's speed grading cpufreq: tegra186: convert to devm_platform_ioremap_resource cpufreq: kirkwood: convert to devm_platform_ioremap_resource cpufreq: CPPC: put ACPI table after using it cpufreq : CPPC: Break out if HiSilicon CPPC workaround is matched * pm-sleep: PM: suspend: Add sysfs attribute to control the "sync on suspend" behavior PM: hibernate: fix spelling mistake "shapshot" -> "snapshot" PM: hibernate: Add more logging on hibernation failure PM: hibernate: improve arithmetic division in preallocate_highmem_fraction() PM: wakeup: Show statistics for deleted wakeup sources again PM: sleep: Switch to rtc_time64_to_tm()/rtc_tm_to_time64() |
||
John Fastabend
|
b23bfa5633 |
bpf, xdp: Remove no longer required rcu_read_{un}lock()
Now that we depend on call_rcu() and synchronize_rcu() to also wait for preempt-disabled regions to complete, the RCU read-side critical section in __dev_map_flush() is no longer required, except in a few special cases in drivers that need it for other reasons. These originally ensured the map reference was safe while a map was also being freed, and additionally that bpf program updates via ndo_bpf did not happen while flush updates were in flight. But under the new rules flush can only be called from preempt-disabled NAPI context. The synchronize_rcu from the map free path and the call_rcu from the delete path will ensure the reference there is safe. So let's remove the rcu_read_lock and rcu_read_unlock pair to avoid any confusion around how this is being protected. If the rcu_read_lock was required it would mean errors in the above logic and the original patch would also be wrong. Now that we have done the above, we put the rcu_read_lock in the driver code where it is needed in a driver-dependent way. I think this helps readability of the code so we know where and why we are taking read locks. Most drivers will not need rcu_read_locks here, and further, XDP drivers already have rcu_read_locks in their code paths for reading xdp programs on the RX side, so this makes it symmetric: we don't have half of the rcu critical sections defined in the driver and the other half in devmap. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/1580084042-11598-4-git-send-email-john.fastabend@gmail.com |
||
John Fastabend
|
42a84a8cd0 |
bpf, xdp: Update devmap comments to reflect napi/rcu usage
Now that we rely on synchronize_rcu and call_rcu waiting to
exit preempt-disable regions (NAPI), let's update the comments
to reflect this.
Fixes:
|
||
Vasily Averin
|
90435a7891 |
bpf: map_seq_next should always increase position index
If the seq_file .next function does not change the position index, a read after an lseek can generate unexpected output. See also: https://bugzilla.kernel.org/show_bug.cgi?id=206283 v1 -> v2: removed missed increment in end of function Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/eca84fdd-c374-a154-d874-6c7b55fc3bc4@virtuozzo.com |
||
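The rule stated in the commit above, that a .next callback should always advance the position index, can be sketched with a toy userspace iterator. Everything here is hypothetical (the `toy_keys` table and `toy_next()` are stand-ins, not the bpffs code); the only point is that the position is incremented even on the path that returns NULL at the end.

```c
#include <stddef.h>

/* Hypothetical toy table standing in for a map's keys. */
static const int toy_keys[] = { 10, 20, 30 };
#define TOY_COUNT (sizeof(toy_keys) / sizeof(toy_keys[0]))

/* Iterator step: always advance *pos, including when the end is reached. */
static const int *toy_next(long long *pos)
{
    ++*pos;
    if (*pos < 0 || (size_t)*pos >= TOY_COUNT)
        return NULL; /* end of iteration, but *pos has still moved forward */
    return &toy_keys[*pos];
}
```

If the increment were skipped on the NULL path, a later read resuming from the saved position would revisit the last element instead of seeing end-of-file.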
Madhuparna Bhowmik
|
913292c97d |
sched.h: Annotate sighand_struct with __rcu
This patch fixes the following sparse errors by annotating the sighand_struct with __rcu kernel/fork.c:1511:9: error: incompatible types in comparison expression kernel/exit.c:100:19: error: incompatible types in comparison expression kernel/signal.c:1370:27: error: incompatible types in comparison expression This fix introduces the following sparse error in signal.c due to checking the sighand pointer without rcu primitives: kernel/signal.c:1386:21: error: incompatible types in comparison expression This new sparse error is also fixed in this patch. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20200124045908.26389-1-madhuparnabhowmik10@gmail.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
David S. Miller
|
4d8773b68e |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor conflict in mlx5 because changes happened to code that has moved meanwhile. Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Paul E. McKenney
|
59d8cc6b2e |
rcu: Forgive slow expedited grace periods at boot time
Boot-time processing often loops in the kernel longer than one might prefer, which can prevent expedited grace periods from completing in a timely manner. This in turn triggers a splat on nohz_full CPUs. One could argue that long-looping code should be fixed, but on the other hand, boot time is a bit special. This commit therefore removes the splat. Later commits will add the splat back in, but in a way that removes false positives. Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> |
||
Jiri Olsa
|
e9b4e606c2 |
bpf: Allow to resolve bpf trampoline and dispatcher in unwind
When unwinding the stack we need to identify each address to successfully continue. Add a latch tree to keep trampolines for quick lookup during the unwind. The patch uses the first 48 bytes for the latch tree node, leaving 4048 bytes from the rest of the page for trampoline or dispatcher generated code. It's still enough not to affect trampoline and dispatcher progs maximum counts. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200123161508.915203-3-jolsa@kernel.org |
||
Jiri Olsa
|
84ad7a7ab6 |
bpf: Allow BTF ctx access for string pointers
When accessing the context we allow access to arguments with scalar type and pointer to struct. But we deny access for pointer to scalar type, which is the case for many functions. Alexei suggested taking a conservative approach and currently allowing only string pointer access, which is the case for most functions now. Add a check for whether the pointer is to a string type and allow access to it. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200123161508.915203-2-jolsa@kernel.org |
||
Ingo Molnar
|
f8a4bb6bfa |
Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney: - Expedited grace-period updates - kfree_rcu() updates - RCU list updates - Preemptible RCU updates - Torture-test updates - Miscellaneous fixes - Documentation updates Signed-off-by: Ingo Molnar <mingo@kernel.org> |