1346d00e1b
25103 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Michael Ellerman
|
1346d00e1b |
powerpc: Don't select HAVE_IRQ_EXIT_ON_IRQ_STACK
The HAVE_IRQ_EXIT_ON_IRQ_STACK option tells generic code that irq_exit() is called while still running on the hard irq stack (hardirq_ctx[] in the powerpc code). Selecting the option means the generic code will *not* switch to the softirq stack before running softirqs, because the code is already running on the (mostly empty) hard irq stack. But since commit |
||
He Ying
|
a1b29ba2f2 |
powerpc/kasan: Silence KASAN warnings in __get_wchan()
The following KASAN warning was reported in our kernel.
BUG: KASAN: stack-out-of-bounds in get_wchan+0x188/0x250
Read of size 4 at addr d216f958 by task ps/14437
CPU: 3 PID: 14437 Comm: ps Tainted: G O 5.10.0 #1
Call Trace:
[daa63858] [c0654348] dump_stack+0x9c/0xe4 (unreliable)
[daa63888] [c035cf0c] print_address_description.constprop.3+0x8c/0x570
[daa63908] [c035d6bc] kasan_report+0x1ac/0x218
[daa63948] [c00496e8] get_wchan+0x188/0x250
[daa63978] [c0461ec8] do_task_stat+0xce8/0xe60
[daa63b98] [c0455ac8] proc_single_show+0x98/0x170
[daa63bc8] [c03cab8c] seq_read_iter+0x1ec/0x900
[daa63c38] [c03cb47c] seq_read+0x1dc/0x290
[daa63d68] [c037fc94] vfs_read+0x164/0x510
[daa63ea8] [c03808e4] ksys_read+0x144/0x1d0
[daa63f38] [c005b1dc] ret_from_syscall+0x0/0x38
--- interrupt: c00 at 0x8fa8f4
LR = 0x8fa8cc
The buggy address belongs to the page:
page:98ebcdd2 refcount:0 mapcount:0 mapping:00000000 index:0x2 pfn:0x1216f
flags: 0x0()
raw: 00000000 00000000 01010122 00000000 00000002 00000000 ffffffff 00000000
raw: 00000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
d216f800: 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00
d216f880: f2 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>d216f900: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
^
d216f980: f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 00
d216fa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
After looking into this issue, I find the buggy address belongs
to the task stack region. It seems KASAN has something wrong.
I look into the code of __get_wchan in x86 architecture and
find the same issue has been resolved by the commit
|
||
Paul Mackerras
|
743cdb7bd0 |
powerpc/kasan: Mark more real-mode code as not to be instrumented
This marks more files and functions that can possibly be called in real mode as not to be instrumented by KASAN. Most were found by inspection, except for get_pseries_errorlog() which was reported as causing a crash in testing. Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/YoX1kZPnmUX4RZEK@cleo |
||
Linus Torvalds
|
6112bd00e8 |
powerpc updates for 5.19
- Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT). - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later). - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ. - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later. - Drop support for system call instruction emulation. - Many other small features and fixes. Thanks to: Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, Zucheng Zheng. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmKSEgETHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgJpLEACee7mu2I00Z7VWtW5ckT4RFbAXYZcM Hv5DbTnVB2ItoQMRHvG52DNbR73j9HnYrz8kpwfTBVk90udxVP14L/swXDs3xbT4 riXEYtJ1DRVc/bLiOK637RLPWNrmmZStWZme7k0Y9Ki5Aif8i1Erjjq7EIy47m9j j1MTcwp3ND7IsBON2nZ3PkttEHhevKvOwCPb/BWtPMDV0OhyQUFKB2SNegrlCrkT wshDgdQcYqbIix98PoGa2ZfUVgFQD3JVLzXa4sLpqouzGD+HvEFStOFa2Gq/ZEvV zunaeXDdZUCjlib6KvA8+aumBbIQ1s/urrDbxd+3BuYxZ094vNP1B428NT1AWVtl 3bEZQIN8GSx0v9aHxZ8HePsAMXgG9d2o0xC9EMQ430+cqroN+6UHP7lkekwkprb7 U9EpZCG9U8jV6SDcaMigW3tooEjn657we0R8nZG2NgUNssdSHVh/JYxGDALPXIAk awL3NQrR0tYF3Y3LJm5AxdQrK1hJH8E+hZFCZvIpUXGsr/uf9Gemy/62pD1rhrr/ niULpxIneRGkJiXB5qdGy8pRu27ED53k7Ky6+8MWSEFQl1mUsHSryYACWz939D8c DydhBwQqDTl6Ozs41a5TkVjIRLOCrZADUd/VZM6A4kEOqPJ5t2Gz22Bn8ya1z6Ks 5Sx6vrGH7GnDjA== =15oQ -----END PGP SIGNATURE----- Merge tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT) - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later) - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later - Drop support for system call instruction emulation - Many other small features and fixes Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng. * tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits) powerpc/64: Include cache.h directly in paca.h powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set powerpc/xics: Include missing header powerpc/powernv/pci: Drop VF MPS fixup powerpc/fsl_book3e: Don't set rodata RO too early powerpc/microwatt: Add mmu bits to device tree powerpc/powernv/flash: Check OPAL flash calls exist before using powerpc/powermac: constify device_node in of_irq_parse_oldworld() powerpc/powermac: add missing g5_phy_disable_cpu1() declaration selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch" powerpc: Enable the DAWR on POWER9 DD2.3 and above powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask powerpc: Fix all occurences of "the the" selftests/powerpc/pmu/ebb: remove fixed_instruction.S powerpc/platforms/83xx: Use of_device_get_match_data() powerpc/eeh: Drop redundant spinlock initialization powerpc/iommu: Add missing of_node_put in iommu_init_early_dart powerpc/pseries/vas: Call misc_deregister if sysfs init fails powerpc/papr_scm: Fix leaking nvdimm_events_map elements ... |
||
Linus Torvalds
|
6f664045c8 |
Not a lot of material this cycle. Many singleton patches against various
subsystems. Most notably some maintenance work in ocfs2 and initramfs. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYo/6xQAKCRDdBJ7gKXxA jkD9AQCPczLBbRWpe1edL+5VHvel9ePoHQmvbHQnufdTh9rB5QEAu0Uilxz4q9cx xSZypNhj2n9f8FCYca/ZrZneBsTnAA8= =AJEO -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc updates from Andrew Morton: "The non-MM patch queue for this merge window. Not a lot of material this cycle. Many singleton patches against various subsystems. Most notably some maintenance work in ocfs2 and initramfs" * tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (65 commits) kcov: update pos before writing pc in trace function ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock ocfs2: dlmfs: don't clear USER_LOCK_ATTACHED when destroying lock fs/ntfs: remove redundant variable idx fat: remove time truncations in vfat_create/vfat_mkdir fat: report creation time in statx fat: ignore ctime updates, and keep ctime identical to mtime in memory fat: split fat_truncate_time() into separate functions MAINTAINERS: add Muchun as a memcg reviewer proc/sysctl: make protected_* world readable ia64: mca: drop redundant spinlock initialization tty: fix deadlock caused by calling printk() under tty_port->lock relay: remove redundant assignment to pointer buf fs/ntfs3: validate BOOT sectors_per_clusters lib/string_helpers: fix not adding strarray to device's resource list kernel/crash_core.c: remove redundant check of ck_cmdline ELF, uapi: fixup ELF_ST_TYPE definition ipc/mqueue: use get_tree_nodev() in mqueue_get_tree() ipc: update semtimedop() to use hrtimer ipc/sem: remove redundant assignments ... |
||
Michael Ellerman
|
dcf280e6f8 |
powerpc/64: Include cache.h directly in paca.h
paca.h uses ____cacheline_aligned without directly including cache.h,
where it's defined.
For Book3S builds that's OK because paca.h includes lppaca.h, and it
does include cache.h.
But Book3E builds have been getting cache.h indirectly via printk.h,
which is dicey, and in fact that include was recently removed, leading
to build errors such as:
ld: fs/isofs/dir.o:(.bss+0x0): multiple definition of `____cacheline_aligned'; fs/isofs/namei.o:(.bss+0x0): first defined here
So include cache.h directly to fix the build error.
Fixes:
|
||
Linus Torvalds
|
ef98f9cfe2 |
Modules updates for v5.19-rc1
As promised, for v5.19 I queued up quite a bit of work for modules, but still with a pretty conservative eye. These changes have been soaking on modules-next (and so linux-next) for quite some time, the code shift was merged onto modules-next on March 22, and the last patch was queued on May 5th. The following are the highlights of what bells and whistles we will get for v5.19: 1) It was time to tidy up kernel/module.c and one way of starting with that effort was to split it up into files. At my request Aaron Tomlin spearheaded that effort with the goal to not introduce any functional at all during that endeavour. The penalty for the split is +1322 bytes total, +112 bytes in data, +1210 bytes in text while bss is unchanged. One of the benefits of this other than helping make the code easier to read and review is summoning more help on review for changes with livepatching so kernel/module/livepatch.c is now pegged as maintained by the live patching folks. The before and after with just the move on a defconfig on x86-64: $ size kernel/module.o text data bss dec hex filename 38434 4540 104 43078 a846 kernel/module.o $ size -t kernel/module/*.o text data bss dec hex filename 4785 120 0 4905 1329 kernel/module/kallsyms.o 28577 4416 104 33097 8149 kernel/module/main.o 1158 8 0 1166 48e kernel/module/procfs.o 902 108 0 1010 3f2 kernel/module/strict_rwx.o 3390 0 0 3390 d3e kernel/module/sysfs.o 832 0 0 832 340 kernel/module/tree_lookup.o 39644 4652 104 44400 ad70 (TOTALS) 2) Aaron added module unload taint tracking (MODULE_UNLOAD_TAINT_TRACKING), so to enable tracking unloaded modules which did taint the kernel. 3) Christophe Leroy added CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC which lets architectures to request having modules data in vmalloc area instead of module area. There are three reasons why an architecture might want this: a) On some architectures (like book3s/32) it is not possible to protect against execution on a page basis. The exec stuff can be mapped by different arch segment sizes (on book3s/32 that is 256M segments). By default the module area is in an Exec segment while vmalloc area is in a NoExec segment. Using vmalloc lets you muck with module data as NoExec on those architectures whereas before you could not. b) By pushing more module data to vmalloc you also increase the probability of module text to remain within a closer distance from kernel core text and this reduces trampolines, this has been reported on arm first and powerpc folks are following that lead. c) Free'ing module_alloc() (Exec by default) area leaves this exposed as Exec by default, some architectures have some security enhancements to set this as NoExec on free, and splitting module data with text let's future generic special allocators be added to the kernel without having developers try to grok the tribal knowledge per arch. Work like Rick Edgecombe's permission vmalloc interface [0] becomes easier to address over time. [0] https://lore.kernel.org/lkml/20201120202426.18009-1-rick.p.edgecombe@intel.com/#r 4) Masahiro Yamada's symbol search enhancements -----BEGIN PGP SIGNATURE----- iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmKOnHkSHG1jZ3JvZkBr ZXJuZWwub3JnAAoJEM4jHQowkoinFw4P/1ADdvfj+b6wbAxou6tPa2ZKnx/ImEnE 0T1P/n2guWg+2Q8oYjqifTpadGzr8td4c/PaGb5UpfdEOdBIyIGklrVZpQ+xkqfT X4KIvqsf4ajL24OKxOSNtvL8RXEIDUhJ4Veq6BImBk8CPrPjsUBlNyAIlvV0aom2 BsFROQ2pMTSCiFY47gkMKLBlBny1l7zktoF0lhWTzHimw8VSDbTJFlu+fZvspd0o lCqiHTkpiBSJDSEEjqk0lT6wIb27fvdzjmjy+Ur71bBKiPIEPiL5XNUufkGe6oB3 mnTOPow+wPTQc0dtkTpCHQYXE/a70Sbkwp1JfkbSYeHzJLlFru/tkmKiwN0RUo9l 0mY7VPEKuQWmxsOkLqvwcPBGx5JOSWOJKrbgpFmH+RLgeEgEa8t7uQDURK2KeIj8 P7ZzN5M2klKIHHA4vjfekYOJAb1Tii9Ibp7iGeiYxf93mPJBqwvRwbtBXBZpB4ce FoDrxwEq812KPW7P2O1kgOvq7Fn1KWh0wVeKc8iBGxFxJhzOQY86H1ZRWDLAxRss Rr1PMLt2TbTLUBt7MzR4vrg0NoQvpLYyf2jGFjWyZDRHU8nLeHkOlQot3xRDAtq9 Bpx5mSlM9BGfPibd1Kw4BaxBha5vVCQ+AcleT+NWnCjw4I0wLoFi9RLUSyItn9No tlHLgdrM2a54 =cxtr -----END PGP SIGNATURE----- Merge tag 'modules-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull modules updates from Luis Chamberlain: - It was time to tidy up kernel/module.c and one way of starting with that effort was to split it up into files. At my request Aaron Tomlin spearheaded that effort with the goal to not introduce any functional at all during that endeavour. The penalty for the split is +1322 bytes total, +112 bytes in data, +1210 bytes in text while bss is unchanged. One of the benefits of this other than helping make the code easier to read and review is summoning more help on review for changes with livepatching so kernel/module/livepatch.c is now pegged as maintained by the live patching folks. The before and after with just the move on a defconfig on x86-64: $ size kernel/module.o text data bss dec hex filename 38434 4540 104 43078 a846 kernel/module.o $ size -t kernel/module/*.o text data bss dec hex filename 4785 120 0 4905 1329 kernel/module/kallsyms.o 28577 4416 104 33097 8149 kernel/module/main.o 1158 8 0 1166 48e kernel/module/procfs.o 902 108 0 1010 3f2 kernel/module/strict_rwx.o 3390 0 0 3390 d3e kernel/module/sysfs.o 832 0 0 832 340 kernel/module/tree_lookup.o 39644 4652 104 44400 ad70 (TOTALS) - Aaron added module unload taint tracking (MODULE_UNLOAD_TAINT_TRACKING), to enable tracking unloaded modules which did taint the kernel. - Christophe Leroy added CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC which lets architectures to request having modules data in vmalloc area instead of module area. There are three reasons why an architecture might want this: a) On some architectures (like book3s/32) it is not possible to protect against execution on a page basis. The exec stuff can be mapped by different arch segment sizes (on book3s/32 that is 256M segments). By default the module area is in an Exec segment while vmalloc area is in a NoExec segment. Using vmalloc lets you muck with module data as NoExec on those architectures whereas before you could not. b) By pushing more module data to vmalloc you also increase the probability of module text to remain within a closer distance from kernel core text and this reduces trampolines, this has been reported on arm first and powerpc folks are following that lead. c) Free'ing module_alloc() (Exec by default) area leaves this exposed as Exec by default, some architectures have some security enhancements to set this as NoExec on free, and splitting module data with text let's future generic special allocators be added to the kernel without having developers try to grok the tribal knowledge per arch. Work like Rick Edgecombe's permission vmalloc interface [0] becomes easier to address over time. [0] https://lore.kernel.org/lkml/20201120202426.18009-1-rick.p.edgecombe@intel.com/#r - Masahiro Yamada's symbol search enhancements * tag 'modules-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (33 commits) module: merge check_exported_symbol() into find_exported_symbol_in_section() module: do not binary-search in __ksymtab_gpl if fsa->gplok is false module: do not pass opaque pointer for symbol search module: show disallowed symbol name for inherit_taint() module: fix [e_shstrndx].sh_size=0 OOB access module: Introduce module unload taint tracking module: Move module_assert_mutex_or_preempt() to internal.h module: Make module_flags_taint() accept a module's taints bitmap and usable outside core code module.h: simplify MODULE_IMPORT_NS powerpc: Select ARCH_WANTS_MODULES_DATA_IN_VMALLOC on book3s/32 and 8xx module: Remove module_addr_min and module_addr_max module: Add CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC module: Introduce data_layout module: Prepare for handling several RB trees module: Always have struct mod_tree_root module: Rename debug_align() as strict_align() module: Rework layout alignment to avoid BUG_ON()s module: Move module_enable_x() and frob_text() in strict_rwx.c module: Make module_enable_x() independent of CONFIG_ARCH_HAS_STRICT_MODULE_RWX module: Move version support into a separate file ... |
||
Linus Torvalds
|
98931dd95f |
Yang Shi has improved the behaviour of khugepaged collapsing of readonly
file-backed transparent hugepages. Johannes Weiner has arranged for zswap memory use to be tracked and managed on a per-cgroup basis. Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime enablement of the recent huge page vmemmap optimization feature. Baolin Wang contributes a series to fix some issues around hugetlb pagetable invalidation. Zhenwei Pi has fixed some interactions between hwpoisoned pages and virtualization. Tong Tiangen has enabled the use of the presently x86-only page_table_check debugging feature on arm64 and riscv. David Vernet has done some fixup work on the memcg selftests. Peter Xu has taught userfaultfd to handle write protection faults against shmem- and hugetlbfs-backed files. More DAMON development from SeongJae Park - adding online tuning of the feature and support for monitoring of fixed virtual address ranges. Also easier discovery of which monitoring operations are available. Nadav Amit has done some optimization of TLB flushing during mprotect(). Neil Brown continues to labor away at improving our swap-over-NFS support. David Hildenbrand has some fixes to anon page COWing versus get_user_pages(). Peng Liu fixed some errors in the core hugetlb code. Joao Martins has reduced the amount of memory consumed by device-dax's compound devmaps. Some cleanups of the arch-specific pagemap code from Anshuman Khandual. Muchun Song has found and fixed some errors in the TLB flushing of transparent hugepages. Roman Gushchin has done more work on the memcg selftests. And, of course, many smaller fixes and cleanups. Notably, the customary million cleanup serieses from Miaohe Lin. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYo52xQAKCRDdBJ7gKXxA jtJFAQD238KoeI9z5SkPMaeBRYSRQmNll85mxs25KapcEgWgGQD9FAb7DJkqsIVk PzE+d9hEfirUGdL6cujatwJ6ejYR8Q8= =nFe6 -----END PGP SIGNATURE----- Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Almost all of MM here. A few things are still getting finished off, reviewed, etc. - Yang Shi has improved the behaviour of khugepaged collapsing of readonly file-backed transparent hugepages. - Johannes Weiner has arranged for zswap memory use to be tracked and managed on a per-cgroup basis. - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime enablement of the recent huge page vmemmap optimization feature. - Baolin Wang contributes a series to fix some issues around hugetlb pagetable invalidation. - Zhenwei Pi has fixed some interactions between hwpoisoned pages and virtualization. - Tong Tiangen has enabled the use of the presently x86-only page_table_check debugging feature on arm64 and riscv. - David Vernet has done some fixup work on the memcg selftests. - Peter Xu has taught userfaultfd to handle write protection faults against shmem- and hugetlbfs-backed files. - More DAMON development from SeongJae Park - adding online tuning of the feature and support for monitoring of fixed virtual address ranges. Also easier discovery of which monitoring operations are available. - Nadav Amit has done some optimization of TLB flushing during mprotect(). - Neil Brown continues to labor away at improving our swap-over-NFS support. - David Hildenbrand has some fixes to anon page COWing versus get_user_pages(). - Peng Liu fixed some errors in the core hugetlb code. - Joao Martins has reduced the amount of memory consumed by device-dax's compound devmaps. - Some cleanups of the arch-specific pagemap code from Anshuman Khandual. - Muchun Song has found and fixed some errors in the TLB flushing of transparent hugepages. - Roman Gushchin has done more work on the memcg selftests. ... and, of course, many smaller fixes and cleanups. Notably, the customary million cleanup serieses from Miaohe Lin" * tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits) mm: kfence: use PAGE_ALIGNED helper selftests: vm: add the "settings" file with timeout variable selftests: vm: add "test_hmm.sh" to TEST_FILES selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests selftests: vm: add migration to the .gitignore selftests/vm/pkeys: fix typo in comment ksm: fix typo in comment selftests: vm: add process_mrelease tests Revert "mm/vmscan: never demote for memcg reclaim" mm/kfence: print disabling or re-enabling message include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace" include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion" mm: fix a potential infinite loop in start_isolate_page_range() MAINTAINERS: add Muchun as co-maintainer for HugeTLB zram: fix Kconfig dependency warning mm/shmem: fix shmem folio swapoff hang cgroup: fix an error handling path in alloc_pagecache_max_30M() mm: damon: use HPAGE_PMD_SIZE tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate nodemask.h: fix compilation error with GCC12 ... |
||
Linus Torvalds
|
df202b452f |
Kbuild updates for v5.19
- Add HOSTPKG_CONFIG env variable to allow users to override pkg-config - Support W=e as a shorthand for KCFLAGS=-Werror - Fix CONFIG_IKHEADERS build to support toybox cpio - Add scripts/dummy-tools/pahole to ease distro packagers' life - Suppress false-positive warnings from checksyscalls.sh for W=2 build - Factor out the common code of arch/*/boot/install.sh into scripts/install.sh - Support 'kernel-install' tool in scripts/prune-kernel - Refactor module-versioning to link the symbol versions at the final link of vmlinux and modules - Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in an arch-agnostic way - Refactor modpost, Makefiles -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmKOO2oVHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGG54P/3/U5FIP5EoPAVu9HqSUKeeUiBYc z1B8d7Wt1xU0xHImPWNjoacfye4MrDMUv8mEWKgHCVusJxbUoS+3Z/kd64NU75Fg Cpj+9fP1N8m02IJzraxn6fw0bmfx4zp9Zsa9l2fjwL0emq4qhB7BA9/Nl6Png1IW p0TPR6gV0Wgp6ikf/eJ3b1decFSqM7QzDlbo860nPMG164gNpDZmFVf2G4HCRQoY GtgoQLEy2pBeOdU7+nJTKl2f5JOhDjRKX8equ7BHW9l7nbUvHd6ys3DGqYO3nvwV hZZdHwDtxxO6bJtzClKPREyfL2H9R2AGxq94HzSwdvwdLLoFxrTN+mg88xBg17Rm tKHy8jpZT36qh218h5lX5n9ZWcovTA38giZ+S/tkwOvvYGivKHDS23QwzB0HrG8/ VRd+0rhfIvuIpu0OQaTpTkZr2QVci2Zn6PPnxpyPEsGvWVFRjyx0WyZh4fFXnkQT n+GS7j5g1LVMra0qu0y+yp4zy/DVFKIcfry0xU8S5SaSEBBcWUxLS2nnoBVB4vb2 RpiVD2vaOlvu/Zs2SOgtuMOnTw+Qqrvh7OYm/WyxWrB3JQGa/r+vipMKiFEDi2NN pwR8wJT/CW1ycte93m3oO83jiitFqzXtAqo24wKlp4SOqnR/TQ/dx743ku2xvONe uynJVW/gZVm4KEUl =Y2TB -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Add HOSTPKG_CONFIG env variable to allow users to override pkg-config - Support W=e as a shorthand for KCFLAGS=-Werror - Fix CONFIG_IKHEADERS build to support toybox cpio - Add scripts/dummy-tools/pahole to ease distro packagers' life - Suppress false-positive warnings from checksyscalls.sh for W=2 build - Factor out the common code of arch/*/boot/install.sh into scripts/install.sh - Support 'kernel-install' tool in scripts/prune-kernel - Refactor module-versioning to link the symbol versions at the final link of vmlinux and modules - Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in an arch-agnostic way - Refactor modpost, Makefiles * tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (56 commits) genksyms: adjust the output format to modpost kbuild: stop merging *.symversions kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS modpost: extract symbol versions from *.cmd files modpost: add sym_find_with_module() helper modpost: change the license of EXPORT_SYMBOL to bool type modpost: remove left-over cross_compile declaration kbuild: record symbol versions in *.cmd files kbuild: generate a list of objects in vmlinux modpost: move *.mod.c generation to write_mod_c_files() modpost: merge add_{intree_flag,retpoline,staging_flag} to add_header scripts/prune-kernel: Use kernel-install if available kbuild: factor out the common installation code into scripts/install.sh modpost: split new_symbol() to symbol allocation and hash table addition modpost: make sym_add_exported() always allocate a new symbol modpost: make multiple export error modpost: dump Module.symvers in the same order of modules.order modpost: traverse the namespace_list in order modpost: use doubly linked list for dump_lists modpost: traverse unresolved symbols in order ... |
||
Linus Torvalds
|
16477cdfef |
asm-generic changes for 5.19
The asm-generic tree contains three separate changes for linux-5.19: - The h8300 architecture is retired after it has been effectively unmaintained for a number of years. This is the last architecture we supported that has no MMU implementation, but there are still a few architectures (arm, m68k, riscv, sh and xtensa) that support CPUs with and without an MMU. - A series to add a generic ticket spinlock that can be shared by most architectures with a working cmpxchg or ll/sc type atomic, including the conversion of riscv, csky and openrisc. This series is also a prerequisite for the loongarch64 architecture port that will come as a separate pull request. - A cleanup of some exported uapi header files to ensure they can be included from user space without relying on other kernel headers. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmKPlXoACgkQmmx57+YA GNkxrRAAnuSgOUo9JC5C4Gm2Q9yhEUHU1QIYeVO0jlan5CkF18bo1Loptq4MdQtO /0pXJPH8rFhDSJQLetO4AAjEMDfJGR5ibmf7SasO03HjqC9++fIeN047MbnkHAwY hFqIkgqn4l+g1RMWK5WUSDJ3XQ7p5/yWzpg/CuxJ+D0w9by/LWI5A+2NKGXOS3GF yi7cWvIKC1l+PmrH3BFA+JYVTvFzlr9P6x5pSEBi6HmjGQR+Xn3s0bnIf6DGRZ+B Q6v03kMxtcqI9e9C0r0r7ZGbdMuRTYbGrksa4EfK0yJM9P0HchhTtT9zawAK7Ddv VMM4B+9r60UEM++hOLS6XrLJdn+Fv+OJDnhONb5c+Mndd8cwV1JbOlVbUlGkn92e WSdUCW6m0TBzDs9Ae1++1kUl1LodlcmSzxlb0ueAhU01QacCPlneyIEKUhcrCl5w ITVw4YVa/BVCh+HvTEdhhak/Qb/nWiojMY+UIH5smiwj6FSFdwEmmgCgHAKprQaA STMxRnccFknGW9CZheoMATYsPIHQKPlm9lbiulSoMLDHxGwshU/6vKD4HDoZU51d HPmUZeKVPahXCUXB4iFI3qD4Ltxaru9VbgfUiY18VB2oc6Mk+0oeh6luqwsrgBdz P2sQ2riZKhN5Frm3DCh7IbJqoqKHlLMWh0itpNllgP5SDmDJjng= =ri2Q -----END PGP SIGNATURE----- Merge tag 'asm-generic-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "The asm-generic tree contains three separate changes for linux-5.19: - The h8300 architecture is retired after it has been effectively unmaintained for a number of years. This is the last architecture we supported that has no MMU implementation, but there are still a few architectures (arm, m68k, riscv, sh and xtensa) that support CPUs with and without an MMU. - A series to add a generic ticket spinlock that can be shared by most architectures with a working cmpxchg or ll/sc type atomic, including the conversion of riscv, csky and openrisc. This series is also a prerequisite for the loongarch64 architecture port that will come as a separate pull request. - A cleanup of some exported uapi header files to ensure they can be included from user space without relying on other kernel headers" * tag 'asm-generic-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: h8300: remove stale bindings and symlink sparc: add asm/stat.h to UAPI compile-test coverage powerpc: add asm/stat.h to UAPI compile-test coverage mips: add asm/stat.h to UAPI compile-test coverage riscv: add linux/bpf_perf_event.h to UAPI compile-test coverage kbuild: prevent exported headers from including <stdlib.h>, <stdbool.h> agpgart.h: do not include <stdlib.h> from exported header csky: Move to generic ticket-spinlock RISC-V: Move to queued RW locks RISC-V: Move to generic spinlocks openrisc: Move to ticket-spinlock asm-generic: qrwlock: Document the spinlock fairness requirements asm-generic: qspinlock: Indicate the use of mixed-size atomics asm-generic: ticket-lock: New generic ticket-based spinlock remove the h8300 architecture |
||
Linus Torvalds
|
3f306ea2e1 |
dma-mapping updates for Linux 5.19
- don't over-decrypt memory (Robin Murphy) - takes min align mask into account for the swiotlb max mapping size (Tianyu Lan) - use GFP_ATOMIC in dma-debug (Mikulas Patocka) - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me) - don't fail on highmem CMA pages in dma_direct_alloc_pages (me) - cleanup swiotlb initialization and share more code with swiotlb-xen (me, Stefano Stabellini) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmKObTQLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYObmA//dIcDB/q4iFGD+WJh4MhM+asx0ZsdF2OJz42WEhgT Z9duOrgcneEQundCamqJP9rNTs980LHDA8uWQC5rZEc9vxuRVOdS7bSgYRUwWh6B r0ZjOsvQCn+ChoZML8uyk4rfmEINq+EvJuec3G5fgecZOhPuJS2i2uzzv5cHwqgP ChC0fwyZlkfdECXgvZXbEoCJLfTgGNlziN6Ai8dirSoqgEQUoCsY89/M7OiEBvV2 R4XUWD7OvQERfB4t6xLuUHyzf9PAuWB+OiblRVNeAmK3lMjxVrc3k4kIowgklnzD 8hfmphAa9Zou3zdfi6Gd4fiQRHRVOwKVp1rtqUmJ+lPSiwyMzu64z9ld2+2qac0h V4sSr/yJkhxnBT4/0MkTChvhnRobisackpUzNRpiM4ck7cNVb7eAvkISsbH+pWI9 aEexPhbyskjlV+GOyM4QL4ygG0dpXY0HSyoh6uaSVsaXMycnWIsJCPidXxV1HGV0 q2/RLHuHwYxia8cYCF01/DQvwOKSjwbU0zModxtRezGD5GYh2C0a+SrA1aX+qiTu yGJCs2UHtSQstAt78tTVp499YeDeL/oGSQkPAu8zyRkSczzF+CncGTuXyoJbAWyK otcgERWljgZ4scxjfu1uacfoVhKQ7nOu7hiJokL0U80FESAennLC3ZlocvB9h/ff HNA= =n2rk -----END PGP SIGNATURE----- Merge tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - don't over-decrypt memory (Robin Murphy) - takes min align mask into account for the swiotlb max mapping size (Tianyu Lan) - use GFP_ATOMIC in dma-debug (Mikulas Patocka) - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me) - don't fail on highmem CMA pages in dma_direct_alloc_pages (me) - cleanup swiotlb initialization and share more code with swiotlb-xen (me, Stefano Stabellini) * tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping: (23 commits) dma-direct: don't over-decrypt memory swiotlb: max mapping size takes min align mask into account swiotlb: use the right nslabs-derived sizes in swiotlb_init_late swiotlb: use the right nslabs value in swiotlb_init_remap swiotlb: don't panic when the swiotlb buffer can't be allocated dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATIOMIC dma-direct: don't fail on highmem CMA pages in dma_direct_alloc_pages swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm x86: remove cruft from <asm/dma-mapping.h> swiotlb: remove swiotlb_init_with_tbl and swiotlb_init_late_with_tbl swiotlb: merge swiotlb-xen initialization into swiotlb swiotlb: provide swiotlb_init variants that remap the buffer swiotlb: pass a gfp_mask argument to swiotlb_init_late swiotlb: add a SWIOTLB_ANY flag to lift the low memory restriction swiotlb: make the swiotlb_init interface more useful x86: centralize setting SWIOTLB_FORCE when guest memory encryption is enabled x86: remove the IOMMU table infrastructure MIPS/octeon: use swiotlb_init instead of open coding it arm/xen: don't check for xen_initial_domain() in xen_create_contiguous_region swiotlb: rename swiotlb_late_init_with_default_size ... |
||
Linus Torvalds
|
7e062cda7d |
Networking changes for 5.19.
Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmKNMPQACgkQMUZtbf5S IrsRARAAuDyYs6jFYB3p+xazZdOnbF4iAgVv71+DQGvmsCl6CB9OrsNZMlvE85OL Q3gjcRbgjrkN4lhgI8DmiGYbsUJnAvVjFdNjccz1Z/vTLYvuIM0ol54MUp5S+9WY StncOJkOGJxxR/Gi5gzVmejPDsysU3Jik+hm/fpIcz8pybXxAsFKU5waY5qfl+/T TZepfV0VCfqRDjqcF1qA5+jJZNU8pdodQlZ1+mh8bwu6Jk1ZkWkj6Ov8MWdwQldr LnPeK/9hIGzkdJYHZfajxA3t8D0K5CHzSuih2bJ9ry8ZXgVBkXEThew778/R5izW uB0YZs9COFlrIP7XHjtRTy/2xHOdYIPlj2nWhVdfuQDX8Crvt4VRN6EZ1rjko1ZJ WanfG6WHF8NH5pXBRQbh3kIMKBnYn6OIzuCfCQSqd+niHcxFIM4vRiggeXI5C5TW vJgEWfK6X+NfDiFVa3xyCrEmp5ieA/pNecpwd8rVkql+MtFAAw4vfsotLKOJEAru J/XL6UE+YuLqIJV9ACZ9x1AFXXAo661jOxBunOo4VXhXVzWS9lYYz5r5ryIkgT/8 /Fr0zjANJWgfIuNdIBtYfQ4qG+LozGq038VA06RhFUAZ5tF9DzhqJs2Q2AFuWWBC ewCePJVqo1j2Ceq2mGonXRt47OEnlePoOxTk9W+cKZb7ZWE+zEo= =Wjii -----END PGP SIGNATURE----- Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection" * tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits) ptp: ocp: Add firmware header checks ptp: ocp: fix PPS source selector debugfs reporting ptp: ocp: add .init function for sma_op vector ptp: ocp: vectorize the sma accessor functions ptp: ocp: constify selectors ptp: ocp: parameterize input/output sma selectors ptp: ocp: revise firmware display ptp: ocp: add Celestica timecard PCI ids ptp: ocp: Remove #ifdefs around PCI IDs ptp: ocp: 32-bit fixups for pci start address Revert "net/smc: fix listen processing for SMC-Rv2" ath6kl: Use cc-disable-warning to disable -Wdangling-pointer selftests/bpf: Dynptr tests bpf: Add dynptr data slices bpf: Add bpf_dynptr_read and bpf_dynptr_write bpf: Dynptr support for ring buffers bpf: Add bpf_dynptr_from_mem for local dynptrs bpf: Add verifier support for dynptrs bpf: Suppress 'passing zero to PTR_ERR' warning bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack ... |
||
Linus Torvalds
|
ac2ab99072 |
Random number generator updates for Linux 5.19-rc1.
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmKKpM8ACgkQSfxwEqXe A6726w/+OJimGd4arvpSmdn+vxepSyDLgKfwM0x5zprRVd16xg8CjJr4eMonTesq YvtJRqpetb53MB+sMhutlvQqQzrjtf2MBkgPwF4I2gUrk7vLD45Q+AGdGhi/rUwz wHGA7xg1FHLHia2M/9idSqi8QlZmUP4u4l5ZnMyTUHiwvRD6XOrWKfqvUSawNzyh hCWlTUxDrjizsW5YpsJX/MkRadSC8loJEk5ByZebow6nRPfurJvqfrcOMgHyNrbY pOZ/CGPxcetMqotL2TuuJt5wKmenqYhIWGAp3YM2SWWgU2ueBZekW8AYeMfgUcvh LWV93RpSuAnE5wsdjIULvjFnEDJBf8ihfMnMrd9G5QjQu44tuKWfY2MghLSpYzaR V6UFbRmhrqhqiStHQXOvk1oqxtpbHlc9zzJLmvPmDJcbvzXQ9Opk5GVXAmdtnHnj M/ty3wGWxucY6mHqT8MkCShSSslbgEtc1pEIWHdrUgnaiSVoCVBEO+9LqLbjvOTm XA/6YtoiCE5FasK51pir1zVb2GORQn0v8HnuAOsusD/iPAlRQ/G5jZkaXbwRQI6j atYL1svqvSKn5POnzqAlMUXfMUr19K5xqJdp7i6qmlO1Vq6Z+tWbCQgD1JV+Wjkb CMyvXomFCFu4aYKGRE2SBRnWLRghG3kYHqEQ15yTPMQerxbUDNg= =SUr3 -----END PGP SIGNATURE----- Merge tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "These updates continue to refine the work began in 5.17 and 5.18 of modernizing the RNG's crypto and streamlining and documenting its code. New for 5.19, the updates aim to improve entropy collection methods and make some initial decisions regarding the "premature next" problem and our threat model. The cloc utility now reports that random.c is 931 lines of code and 466 lines of comments, not that basic metrics like that mean all that much, but at the very least it tells you that this is very much a manageable driver now. Here's a summary of the various updates: - The random_get_entropy() function now always returns something at least minimally useful. This is the primary entropy source in most collectors, which in the best case expands to something like RDTSC, but prior to this change, in the worst case it would just return 0, contributing nothing. For 5.19, additional architectures are wired up, and architectures that are entirely missing a cycle counter now have a generic fallback path, which uses the highest resolution clock available from the timekeeping subsystem. Some of those clocks can actually be quite good, despite the CPU not having a cycle counter of its own, and going off-core for a stamp is generally thought to increase jitter, something positive from the perspective of entropy gathering. Done very early on in the development cycle, this has been sitting in next getting some testing for a while now and has relevant acks from the archs, so it should be pretty well tested and fine, but is nonetheless the thing I'll be keeping my eye on most closely. - Of particular note with the random_get_entropy() improvements is MIPS, which, on CPUs that lack the c0 count register, will now combine the high-speed but short-cycle c0 random register with the lower-speed but long-cycle generic fallback path. - With random_get_entropy() now always returning something useful, the interrupt handler now collects entropy in a consistent construction. - Rather than comparing two samples of random_get_entropy() for the jitter dance, the algorithm now tests many samples, and uses the amount of differing ones to determine whether or not jitter entropy is usable and how laborious it must be. The problem with comparing only two samples was that if the cycle counter was extremely slow, but just so happened to be on the cusp of a change, the slowness wouldn't be detected. Taking many samples fixes that to some degree. This, combined with the other improvements to random_get_entropy(), should make future unification of /dev/random and /dev/urandom maybe more possible. At the very least, were we to attempt it again today (we're not), it wouldn't break any of Guenter's test rigs that broke when we tried it with 5.18. So, not today, but perhaps down the road, that's something we can revisit. - We attempt to reseed the RNG immediately upon waking up from system suspend or hibernation, making use of the various timestamps about suspend time and such available, as well as the usual inputs such as RDRAND when available. - Batched randomness now falls back to ordinary randomness before the RNG is initialized. This provides more consistent guarantees to the types of random numbers being returned by the various accessors. - The "pre-init injection" code is now gone for good. I suspect you in particular will be happy to read that, as I recall you expressing your distaste for it a few months ago. Instead, to avoid a "premature first" issue, while still allowing for maximal amount of entropy availability during system boot, the first 128 bits of estimated entropy are used immediately as it arrives, with the next 128 bits being buffered. And, as before, after the RNG has been fully initialized, it winds up reseeding anyway a few seconds later in most cases. This resulted in a pretty big simplification of the initialization code and let us remove various ad-hoc mechanisms like the ugly crng_pre_init_inject(). - The RNG no longer pretends to handle the "premature next" security model, something that various academics and other RNG designs have tried to care about in the past. After an interesting mailing list thread, these issues are thought to be a) mainly academic and not practical at all, and b) actively harming the real security of the RNG by delaying new entropy additions after a potential compromise, making a potentially bad situation even worse. As well, in the first place, our RNG never even properly handled the premature next issue, so removing an incomplete solution to a fake problem was particularly nice. This allowed for numerous other simplifications in the code, which is a lot cleaner as a consequence. If you didn't see it before, https://lore.kernel.org/lkml/YmlMGx6+uigkGiZ0@zx2c4.com/ may be a thread worth skimming through. - While the interrupt handler received a separate code path years ago that avoids locks by using per-cpu data structures and a faster mixing algorithm, in order to reduce interrupt latency, input and disk events that are triggered in hardirq handlers were still hitting locks and more expensive algorithms. Those are now redirected to use the faster per-cpu data structures. - Rather than having the fake-crypto almost-siphash-based random32 implementation be used right and left, and in many places where cryptographically secure randomness is desirable, the batched entropy code is now fast enough to replace that. - As usual, numerous code quality and documentation cleanups. For example, the initialization state machine now uses enum symbolic constants instead of just hard coding numbers everywhere. - Since the RNG initializes once, and then is always initialized thereafter, a pretty heavy amount of code used during that initialization is never used again. It is now completely cordoned off using static branches and it winds up in the .text.unlikely section so that it doesn't reduce cache compactness after the RNG is ready. - A variety of functions meant for waiting on the RNG to be initialized were only used by vsprintf, and in not a particularly optimal way. Replacing that usage with a more ordinary setup made it possible to remove those functions. - A cleanup of how we warn userspace about the use of uninitialized /dev/urandom and uninitialized get_random_bytes() usage. Interestingly, with the change you merged for 5.18 that attempts to use jitter (but does not block if it can't), the majority of users should never see those warnings for /dev/urandom at all now, and the one for in-kernel usage is mainly a debug thing. - The file_operations struct for /dev/[u]random now implements .read_iter and .write_iter instead of .read and .write, allowing it to also implement .splice_read and .splice_write, which makes splice(2) work again after it was broken here (and in many other places in the tree) during the set_fs() removal. This was a bit of a last minute arrival from Jens that hasn't had as much time to bake, so I'll be keeping my eye on this as well, but it seems fairly ordinary. Unfortunately, read_iter() is around 3% slower than read() in my tests, which I'm not thrilled about. But Jens and Al, spurred by this observation, seem to be making progress in removing the bottlenecks on the iter paths in the VFS layer in general, which should remove the performance gap for all drivers. - Assorted other bug fixes, cleanups, and optimizations. - A small SipHash cleanup" * tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (49 commits) random: check for signals after page of pool writes random: wire up fops->splice_{read,write}_iter() random: convert to using fops->write_iter() random: convert to using fops->read_iter() random: unify batched entropy implementations random: move randomize_page() into mm where it belongs random: remove mostly unused async readiness notifier random: remove get_random_bytes_arch() and add rng_has_arch_random() random: move initialization functions out of hot pages random: make consistent use of buf and len random: use proper return types on get_random_{int,long}_wait() random: remove extern from functions in header random: use static branch for crng_ready() random: credit architectural init the exact amount random: handle latent entropy and command line from random_init() random: use proper jiffies comparison macro random: remove ratelimiting for in-kernel unseeded randomness random: move initialization out of reseeding hot path random: avoid initializing twice in credit race random: use symbolic constants for crng_init states ... |
||
Christophe Leroy
|
c85ab4fe33 |
powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set
When CONFIG_PPC_64S_HASH_MMU is not set, slice.c is not built and
arch_get_unmapped_area() and arch_get_unmapped_area_topdown() are
not provided because RADIX uses the generic ones.
Therefore, neither set HAVE_ARCH_UNMAPPED_AREA nor
HAVE_ARCH_UNMAPPED_AREA_TOPDOWN.
Fixes:
|
||
Christophe Leroy
|
14554d92c1 |
powerpc/xics: Include missing header
Include of_adresses.h to get of_iomap() prototype.
Fixes:
|
||
Masahiro Yamada
|
7b4537199a |
kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS
include/{linux,asm-generic}/export.h defines a weak symbol, __crc_* as a placeholder. Genksyms writes the version CRCs into the linker script, which will be used for filling the __crc_* symbols. The linker script format depends on CONFIG_MODULE_REL_CRCS. If it is enabled, __crc_* holds the offset to the reference of CRC. It is time to get rid of this complexity. Now that modpost parses text files (.*.cmd) to collect all the CRCs, it can generate C code that will be linked to the vmlinux or modules. Generate a new C file, .vmlinux.export.c, which contains the CRCs of symbols exported by vmlinux. It is compiled and linked to vmlinux in scripts/link-vmlinux.sh. Put the CRCs of symbols exported by modules into the existing *.mod.c files. No additional build step is needed for modules. As before, *.mod.c are compiled and linked to *.ko in scripts/Makefile.modfinal. No linker magic is used here. The new C implementation works in the same way, whether CONFIG_RELOCATABLE is enabled or not. CONFIG_MODULE_REL_CRCS is no longer needed. Previously, Kbuild invoked additional $(LD) to update the CRCs in objects, but this step is unneeded too. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nicolas Schier <nicolas@fjasle.eu> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64) |
||
Linus Torvalds
|
de8ac81747 |
- Remove all the code around GS switching on 32-bit now that it is not
needed anymore - Other misc improvements -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmKLp74ACgkQEsHwGGHe VUpqrhAAgNdNw/vNTTzeOH5ZSNxyIoTQapmrSNev0cXRW4tV2hxuYSa2wPZPJZXx aYhnFxwL7rVy0er7jG/5KaOyzHmrh6PcmqgFdPVo8+yVrfcsPIUqg/4L5peFZh7T ETV2pvFIiB4njkL/pR3mU5uAtTjyO89tD/LclKmc4ndv19vI8maj+k/dCDOnNnEz m4wJMXYWh4bG47/izU5TcTYU7ttTLEiVQ/mC5kEuj7PQeUR0kXKvvLo4rX+lOI2v dQRHgHg/qoNM7uVLd7vV/YdMWwcHchmKG5Y7+a/ogdlwR7a/X9e+lklFSeuxNvyH 8dOHIyzcb6lKTijpqhisZ3o9150ax3Q5FlSWuE3F/9Rcuc1T5eY82kTW2RTOTdV9 xsjob4y+hlpsUfuImupxJLHn685xsYAdqyiG/SPkcnJL++tNBlWiGHX9NqXF5cgw bq4/94Aouxevl0OBxnFBeoQOJvOnf60OY3LHcYR78yEEJyi4iWsC0/TEmD+9IE+r EpC1wz9bHCYbSwZ+yv8u2tNPd/rKxdspPL/6SxT9a+WAVrOZbQAN3VmlOIon6W9O bW5ye6suqBbl/Q1FACVU1xxSNjLTJUTFsB1X3QKGm8E+Kr7/zD1ZtT0WQNvyLMfT p/I4VRcdIxV3eDiYqeTfJ3sTS7IjKHSaZVBnpkZvRh869mMdqCg= =CfX1 -----END PGP SIGNATURE----- Merge tag 'x86_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core x86 updates from Borislav Petkov: - Remove all the code around GS switching on 32-bit now that it is not needed anymore - Other misc improvements * tag 'x86_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: bug: Use normal relative pointers in 'struct bug_entry' x86/nmi: Make register_nmi_handler() more robust x86/asm: Merge load_gs_index() x86/32: Remove lazy GS macros ELF: Remove elf_core_copy_kernel_regs() x86/32: Simplify ELF_CORE_COPY_REGS |
||
Jakub Kicinski
|
32c53420d2 |
eth: de4x5: remove support for Generic DECchip & DIGITAL EtherWORKS PCI/EISA
Looks like almost all changes to this driver had been tree-wide refactoring since git era begun. There is one commit from Al 15 years ago which could potentially be fixing a real bug. The driver is using virt_to_bus() and is a real magnet for pointless cleanups. It seems unlikely to have real users. Let's try to shed this maintenance burden. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Oliver O'Halloran
|
a5d28039ec |
powerpc/powernv/pci: Drop VF MPS fixup
The MPS field in the VF config space is marked as reserved in current versions of the SR-IOV spec. In other words, this fixup doesn't do anything. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200902035159.1762596-1-oohall@gmail.com |
||
Christophe Leroy
|
ad91f66f5f |
powerpc/fsl_book3e: Don't set rodata RO too early
On fsl_book3e, rodata is set read-only at the same time as
init text is set NX at the end of init. That's too early.
As both action are performed at the same time, delay both
actions to the time rodata is expected to be made read-only.
It means we will have a small window with init mem freed but
still executable. It shouldn't be an issue though, especially
because the said memory gets poisoned and should therefore
result to a bad instruction fault in case it gets executed.
mmu_mark_initmem_nx() is bailing out before doing anything when
CONFIG_STRICT_KERNEL_RWX is not selected or rodata_enabled is false.
mmu_mark_rodata_ro() is called only when CONFIG_STRICT_KERNEL_RWX
is selected and rodata_enabled is true so this is equivalent.
Move code from mmu_mark_initmem_nx() into mmu_mark_rodata_ro() and
remove the call to strict_kernel_rwx_enabled() which is not needed
anymore.
Fixes:
|
||
Joel Stanley
|
0ef1ffc718 |
powerpc/microwatt: Add mmu bits to device tree
In commit |
||
Vasant Hegde
|
25e69962ef |
powerpc/powernv/flash: Check OPAL flash calls exist before using
Currently only FSP based powernv systems supports firmware update interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists before initalising the flash driver. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210914101630.30613-1-hegdevasant@linux.vnet.ibm.com |
||
Krzysztof Kozlowski
|
bb12dd42d2 |
powerpc/powermac: constify device_node in of_irq_parse_oldworld()
The of_irq_parse_oldworld() does not modify passed device_node so make it a pointer to const for safety. Drop the extern while modifying the line. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210924105653.46963-2-krzysztof.kozlowski@canonical.com |
||
Krzysztof Kozlowski
|
cc025916b1 |
powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
g5_phy_disable_cpu1() is used outside of platforms/powermac/feature.c, so it should have a declaration to fix W=1 warning: arch/powerpc/platforms/powermac/feature.c:1533:6: error: no previous prototype for ‘g5_phy_disable_cpu1’ [-Werror=missing-prototypes] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210924105653.46963-1-krzysztof.kozlowski@canonical.com |
||
Reza Arbab
|
26b78c81e8 |
powerpc: Enable the DAWR on POWER9 DD2.3 and above
The hardware bug in POWER9 preventing use of the DAWR was fixed in DD2.3. Set the CPU_FTR_DAWR feature bit on these newer systems to start using it again, and update the documentation accordingly. The CPU features for DD2.3 are currently determined by "DD2.2 or later" logic. In adding DD2.3 as a discrete case for the first time here, I'm carrying the quirks of DD2.2 forward to keep all behavior outside of this DAWR change the same. This leaves the assessment and potential removal of those quirks on DD2.3 for later. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220503170152.23412-1-arbab@linux.ibm.com |
||
Michael Ellerman
|
b4d9cc7572 |
powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask
CPU_FTRS_POWER10 is missing from the CPU_FTRS_ALWAYS mask.
Currently that doesn't cause any bug, because it is a superset of the
POWER9 mask, which the exception of CPU_FTR_TM, but POWER7 doesn't have
CPU_FTR_TM, so CPU_FTR_TM is not in the ALWAYS mask to begin with.
However for consistency, and to be robust against future changes, it
should be included in the ALWAYS mask.
Fixes:
|
||
Michael Ellerman
|
3e36960a27 |
powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask
CPU_FTRS_POWER9_DD2_2 is missing from CPU_FTRS_ALWAYS.
That doesn't cause any bug, because CPU_FTRS_POWER9_DD2_2 adds new bits
that don't appear in other values, so when anded with the other masks
the result is the same.
But for consistency we should have all values in the CPU_FTRS_ALWAYS
mask, so that the logic is robust against the values being changed in
future.
Fixes:
|
||
Michael Ellerman
|
87c78b612f |
powerpc: Fix all occurences of "the the"
Rather than waiting for the bots to fix these one-by-one, fix all occurences of "the the" throughout arch/powerpc. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220518142629.513007-1-mpe@ellerman.id.au |
||
Minghao Chi (CGEL ZTE)
|
8a57c3cc2b |
powerpc/platforms/83xx: Use of_device_get_match_data()
Use of_device_get_match_data() to simplify the code. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi (CGEL ZTE) <chi.minghao@zte.com.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220225010737.2038781-1-chi.minghao@zte.com.cn |
||
Haowen Bai
|
3def164a5c |
powerpc/eeh: Drop redundant spinlock initialization
slot_errbuf_lock has declared and initialized by DEFINE_SPINLOCK, so we don't need to spin_lock_init again, drop it. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1652232476-9696-1-git-send-email-baihaowen@meizu.com |
||
Peng Wu
|
57b742a5b8 |
powerpc/iommu: Add missing of_node_put in iommu_init_early_dart
The device_node pointer is returned by of_find_compatible_node with refcount incremented. We should use of_node_put() to avoid the refcount leak. Signed-off-by: Peng Wu <wupeng58@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220425081245.21705-1-wupeng58@huawei.com |
||
Zheng Bin
|
426e580522 |
powerpc/pseries/vas: Call misc_deregister if sysfs init fails
Undo effects of misc_register if sysfs init fails after misc_register. Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220511033507.2745992-1-zhengbin13@huawei.com |
||
Vaibhav Jain
|
0e0946e22f |
powerpc/papr_scm: Fix leaking nvdimm_events_map elements
Right now 'char *' elements allocated for individual 'stat_id' in
'papr_scm_priv.nvdimm_events_map[]' during papr_scm_pmu_check_events(), get
leaked in papr_scm_remove() and papr_scm_pmu_register(),
papr_scm_pmu_check_events() error paths.
Also individual 'stat_id' arent NULL terminated 'char *' instead they are fixed
8-byte sized identifiers. However papr_scm_pmu_register() assumes it to be a
NULL terminated 'char *' and at other places it assumes it to be a
'papr_scm_perf_stat.stat_id' sized string which is 8-byes in size.
Fix this by allocating the memory for papr_scm_priv.nvdimm_events_map to also
include space for 'stat_id' entries. This is possible since number of available
events/stat_ids are known upfront. This saves some memory and one extra level of
indirection from 'nvdimm_events_map' to 'stat_id'. Also rest of the code
can continue to call 'kfree(papr_scm_priv.nvdimm_events_map)' without needing to
iterate over the array and free up individual elements.
Fixes:
|
||
Miaoqian Lin
|
fcee96924b |
powerpc/fsl_rio: Fix refcount leak in fsl_rio_setup
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
Fixes:
|
||
Miaoqian Lin
|
1d1fb9618b |
powerpc/xive: Fix refcount leak in xive_spapr_init
of_find_compatible_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
Fixes:
|
||
Oscar Salvador
|
48b63961c8 |
powerpc/numa: Associate numa node to its cpu earlier
powerpc is the only platform that do not rely on cpu_up()->try_online_node() to bring up a numa node, and special cases it, instead, deep in its own machinery: dlpar_online_cpu find_and_online_cpu_nid try_online_node This should not be needed, but the thing is that the try_online_node() from cpu_up() will not apply on the right node, because cpu_to_node() will return the old mapping numa<->cpu that gets set on boot stage for all possible cpus. That can be seen easily if we try to print out the numa node passed to try_online_node() in cpu_up(). The thing is that the numa<->cpu mapping does not get updated till a much later stage in start_secondary: start_secondary: set_numa_node(numa_cpu_lookup_table[cpu]) But we do not really care, as we already now the CPU <-> NUMA associativity back in find_and_online_cpu_nid(), so let us make use of that and set the proper numa<->cpu mapping, so cpu_to_node() in cpu_up() returns the right node and try_online_node() can do its work. Signed-off-by: Oscar Salvador <osalvador@suse.de> Tested-by: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220411074934.4632-1-osalvador@suse.de |
||
Lv Ruyi
|
3ffa9fd471 |
powerpc/powernv: fix missing of_node_put in uv_init()
of_find_compatible_node() returns node pointer with refcount incremented, use of_node_put() on it when done. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220407090043.2491854-1-lv.ruyi@zte.com.cn |
||
Christophe Leroy
|
dc21ed2aef |
powerpc/85xx: Remove FSL_85XX_CACHE_SRAM
CONFIG_FSL_85XX_CACHE_SRAM is an option that is not user selectable and which is not selected by any driver nor any defconfig. Remove it and all associated code. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9949813a6b758903b7bee910f798ba2ca82ff8ee.1648720908.git.christophe.leroy@csgroup.eu |
||
Lv Ruyi
|
5dd9e27ea4 |
powerpc/xics: fix refcount leak in icp_opal_init()
The of_find_compatible_node() function returns a node pointer with refcount incremented, use of_node_put() on it when done. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220402013419.2410298-1-lv.ruyi@zte.com.cn |
||
Kajol Jain
|
ab0cc6bbf0 |
powerpc/perf: Fix the threshold compare group constraint for power9
Thresh compare bits for a event is used to program thresh compare
field in Monitor Mode Control Register A (MMCRA: 9-18 bits for power9).
When scheduling events as a group, all events in that group should
match value in threshold bits (like thresh compare, thresh control,
thresh select). Otherwise event open for the sibling events should fail.
But in the current code, incase thresh compare bits are not valid,
we are not failing in group_constraint function which can result
in invalid group schduling.
Fix the issue by returning -1 incase event is threshold and threshold
compare value is not valid.
Thresh control bits in the event code is used to program thresh_ctl
field in Monitor Mode Control Register A (MMCRA: 48-55). In below example,
the scheduling of group events PM_MRK_INST_CMPL (873534401e0) and
PM_THRESH_MET (8734340101ec) is expected to fail as both event
request different thresh control bits and invalid thresh compare value.
Result before the patch changes:
[command]# perf stat -e "{r8735340401e0,r8734340101ec}" sleep 1
Performance counter stats for 'sleep 1':
11,048 r8735340401e0
1,967 r8734340101ec
1.001354036 seconds time elapsed
0.001421000 seconds user
0.000000000 seconds sys
Result after the patch changes:
[command]# perf stat -e "{r8735340401e0,r8734340101ec}" sleep 1
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (r8735340401e0).
/bin/dmesg | grep -i perf may provide additional information.
Fixes:
|
||
Kajol Jain
|
505d31650b |
powerpc/perf: Fix the threshold compare group constraint for power10
Thresh compare bits for a event is used to program thresh compare
field in Monitor Mode Control Register A (MMCRA: 8-18 bits for power10).
When scheduling events as a group, all events in that group should
match value in threshold bits. Otherwise event open for the sibling
events should fail. But in the current code, incase thresh compare bits are
not valid, we are not failing in group_constraint function which can result
in invalid group schduling.
Fix the issue by returning -1 incase event is threshold and threshold
compare value is not valid in group_constraint function.
Patch also fixes the p10_thresh_cmp_val function to return -1,
incase threshold bits are not valid and changes corresponding check in
is_thresh_cmp_valid function to return false only when the thresh_cmp
value is less then 0.
Thresh control bits in the event code is used to program thresh_ctl
field in Monitor Mode Control Register A (MMCRA: 48-55). In below example,
the scheduling of group events PM_MRK_INST_CMPL (3534401e0) and
PM_THRESH_MET (34340101ec) is expected to fail as both event
request different thresh control bits.
Result before the patch changes:
[command]# perf stat -e "{r35340401e0,r34340101ec}" sleep 1
Performance counter stats for 'sleep 1':
8,482 r35340401e0
0 r34340101ec
1.001474838 seconds time elapsed
0.001145000 seconds user
0.000000000 seconds sys
Result after the patch changes:
[command]# perf stat -e "{r35340401e0,r34340101ec}" sleep 1
Performance counter stats for 'sleep 1':
<not counted> r35340401e0
<not supported> r34340101ec
1.001499607 seconds time elapsed
0.000204000 seconds user
0.000760000 seconds sys
Fixes:
|
||
YueHaibing
|
cdf87d2bd1 |
powerpc/kaslr_booke: Fix build error
arch/powerpc/mm/nohash/kaslr_booke.c: In function ‘kaslr_get_cmdline’:
arch/powerpc/mm/nohash/kaslr_booke.c:46:2: error: implicit declaration of function ‘early_init_dt_scan_chosen’
early_init_dt_scan_chosen(boot_command_line);
^~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/mm/nohash/kaslr_booke.c: In function ‘get_initrd_range’:
arch/powerpc/mm/nohash/kaslr_booke.c:210:10: error: implicit declaration of function ‘of_read_number’
start = of_read_number(prop, len / 4);
^~~~~~~~~~~~~~
Add missing include files to fix this.
Fixes:
|
||
YueHaibing
|
7574dd080e |
powerpc/book3e: Fix build error
arch/powerpc/mm/nohash/fsl_book3e.c: In function ‘relocate_init’:
arch/powerpc/mm/nohash/fsl_book3e.c:348:2: error: implicit declaration of function ‘early_get_first_memblock_info’
early_get_first_memblock_info(__va(dt_ptr), &size);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add missing include file linux/of_fdt.h to fix this.
Fixes:
|
||
Daniel Axtens
|
41b7a347bf |
powerpc: Book3S 64-bit outline-only KASAN support
Implement a limited form of KASAN for Book3S 64-bit machines running under the Radix MMU, supporting only outline mode. - Enable the compiler instrumentation to check addresses and maintain the shadow region. (This is the guts of KASAN which we can easily reuse.) - Require kasan-vmalloc support to handle modules and anything else in vmalloc space. - KASAN needs to be able to validate all pointer accesses, but we can't instrument all kernel addresses - only linear map and vmalloc. On boot, set up a single page of read-only shadow that marks all iomap and vmemmap accesses as valid. - Document KASAN in powerpc docs. Background ---------- KASAN support on Book3S is a bit tricky to get right: - It would be good to support inline instrumentation so as to be able to catch stack issues that cannot be caught with outline mode. - Inline instrumentation requires a fixed offset. - Book3S runs code with translations off ("real mode") during boot, including a lot of generic device-tree parsing code which is used to determine MMU features. [ppc64 mm note: The kernel installs a linear mapping at effective address c000...-c008.... This is a one-to-one mapping with physical memory from 0000... onward. Because of how memory accesses work on powerpc 64-bit Book3S, a kernel pointer in the linear map accesses the same memory both with translations on (accessing as an 'effective address'), and with translations off (accessing as a 'real address'). This works in both guests and the hypervisor. For more details, see s5.7 of Book III of version 3 of the ISA, in particular the Storage Control Overview, s5.7.3, and s5.7.5 - noting that this KASAN implementation currently only supports Radix.] - Some code - most notably a lot of KVM code - also runs with translations off after boot. - Therefore any offset has to point to memory that is valid with translations on or off. One approach is just to give up on inline instrumentation. This way boot-time checks can be delayed until after the MMU is set is up, and we can just not instrument any code that runs with translations off after booting. Take this approach for now and require outline instrumentation. Previous attempts allowed inline instrumentation. However, they came with some unfortunate restrictions: only physically contiguous memory could be used and it had to be specified at compile time. Maybe we can do better in the future. [paulus@ozlabs.org - Rebased onto 5.17. Note that a kernel with CONFIG_KASAN=y will crash during boot on a machine using HPT translation because not all the entry points to the generic KASAN code are protected with a call to kasan_arch_is_ready().] Originally-by: Balbir Singh <bsingharora@gmail.com> # ppc64 out-of-line radix version Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Update copyright year and comment formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/YoTE69OQwiG7z+Gu@cleo |
||
Daniel Axtens
|
2ab2d5794f |
powerpc/kasan: Disable address sanitization in kexec paths
The kexec code paths involve code that necessarily run in real mode, as CPUs are disabled and control is transferred to the new kernel. Disable address sanitization for the kexec code and the functions called in real mode on CPUs being disabled. [paulus@ozlabs.org: combined a few work-in-progress commits of Daniel's and wrote the commit message.] Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Move pseries_machine_kexec() into kexec.c so setup.c can be instrumented] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/YoTFSQ2TUSEaDdVC@cleo |
||
Daniel Axtens
|
5352090a99 |
powerpc/kasan: Don't instrument non-maskable or raw interrupts
Disable address sanitization for raw and non-maskable interrupt handlers, because they can run in real mode, where we cannot access the shadow memory. (Note that kasan_arch_is_ready() doesn't test for real mode, since it is a static branch for speed, and in any case not all the entry points to the generic KASAN code are protected by kasan_arch_is_ready guards.) The changes to interrupt_nmi_enter/exit_prepare() look larger than they actually are. The changes are equivalent to adding !IS_ENABLED(CONFIG_KASAN) to the conditions for calling nmi_enter() or nmi_exit() in real mode. That is, the code is equivalent to using the following condition for calling nmi_enter/exit: if (((!IS_ENABLED(CONFIG_PPC_BOOK3S_64) || !firmware_has_feature(FW_FEATURE_LPAR) || radix_enabled()) && !IS_ENABLED(CONFIG_KASAN) || (mfmsr() & MSR_DR)) That unwieldy condition has been split into several statements with comments, for easier reading. The nmi_ipi_lock functions that call atomic functions (i.e., nmi_ipi_lock_start(), nmi_ipi_lock() and nmi_ipi_unlock()), besides being marked noinstr, now call arch_atomic_* functions instead of atomic_* functions because with KASAN enabled, the atomic_* functions are wrappers which explicitly do address sanitization on their arguments. Since we are trying to avoid address sanitization, we have to use the lower-level arch_atomic_* versions. In hv_nmi_check_nonrecoverable(), the regs_set_unrecoverable() call has been open-coded so as to avoid having to either trust the inlining or mark regs_set_unrecoverable() as noinstr. [paulus@ozlabs.org: combined a few work-in-progress commits of Daniel's and wrote the commit message.] Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/YoTFGaKM8Pd46PIK@cleo |
||
Daniel Axtens
|
f08aed5241 |
powerpc/mm/kasan: rename kasan_init_32.c to init_32.c
kasan is already implied by the directory name, we don't need to repeat it. Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/YoTEyoi+xu9brJYe@cleo |
||
Naveen N. Rao
|
84ade0a665 |
powerpc/ftrace: Remove ftrace init tramp once kernel init is complete
Stop using the ftrace trampoline for init section once kernel init is
complete.
Fixes:
|
||
Christophe Leroy
|
5fe855169f |
powerpc/irq: Remove arch_local_irq_restore() for !CONFIG_CC_HAS_ASM_GOTO
All supported versions of GCC & clang support asm goto. Remove the !CONFIG_CC_HAS_ASM_GOTO version of arch_local_irq_restore() Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/58df50c9e77e2ed945bacdead30412770578886b.1652715336.git.christophe.leroy@csgroup.eu |
||
Russell Currey
|
d2a3c13198 |
powerpc/powernv: Get STF barrier requirements from device-tree
The device-tree property no-need-store-drain-on-priv-state-switch is equivalent to H_CPU_BEHAV_NO_STF_BARRIER from the H_CPU_GET_CHARACTERISTICS hcall on pseries. Since commit |