1060735 Commits

Author SHA1 Message Date
David Woodhouse
2efd61a608 KVM: Warn if mark_page_dirty() is called without an active vCPU
The various kvm_write_guest() and mark_page_dirty() functions must only
ever be called in the context of an active vCPU, because if dirty ring
tracking is enabled it may simply oops when kvm_get_running_vcpu()
returns NULL for the vcpu and then kvm_dirty_ring_get() dereferences it.

This oops was reported by "butt3rflyh4ck" <butterflyhuangxx@gmail.com> in
https://lore.kernel.org/kvm/CAFcO6XOmoS7EacN_n6v4Txk7xL7iqRa2gABg3F7E3Naf5uG94g@mail.gmail.com/

That actual bug will be fixed under separate cover but this warning
should help to prevent new ones from being added.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20211210163625.2886-2-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:44 -05:00
David Woodhouse
f3f26dae05 x86/kvm: Silence per-cpu pr_info noise about KVM clocks and steal time
I made the actual CPU bringup go nice and fast... and then Linux spends
half a minute printing stupid nonsense about clocks and steal time for
each of 256 vCPUs. Don't do that. Nobody cares.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20211209150938.3518-12-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:43 -05:00
Eric Hankland
018d70ffcf KVM: x86: Update vPMCs when retiring branch instructions
When KVM retires a guest branch instruction through emulation,
increment any vPMCs that are configured to monitor "branch
instructions retired," and update the sample period of those counters
so that they will overflow at the right time.

Signed-off-by: Eric Hankland <ehankland@google.com>
[jmattson:
  - Split the code to increment "branch instructions retired" into a
    separate commit.
  - Moved/consolidated the calls to kvm_pmu_trigger_event() in the
    emulation of VMLAUNCH/VMRESUME to accommodate the evolution of
    that code.
]
Fixes: f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")
Signed-off-by: Jim Mattson <jmattson@google.com>
Message-Id: <20211130074221.93635-7-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:43 -05:00
Eric Hankland
9cd803d496 KVM: x86: Update vPMCs when retiring instructions
When KVM retires a guest instruction through emulation, increment any
vPMCs that are configured to monitor "instructions retired," and
update the sample period of those counters so that they will overflow
at the right time.

Signed-off-by: Eric Hankland <ehankland@google.com>
[jmattson:
  - Split the code to increment "branch instructions retired" into a
    separate commit.
  - Added 'static' to kvm_pmu_incr_counter() definition.
  - Modified kvm_pmu_incr_counter() to check pmc->perf_event->state ==
    PERF_EVENT_STATE_ACTIVE.
]
Fixes: f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")
Signed-off-by: Jim Mattson <jmattson@google.com>
[likexu:
  - Drop checks for pmc->perf_event or event state or event type
  - Increase a counter once its umask bits and the first 8 select bits are matched
  - Rewrite kvm_pmu_incr_counter() with a less invasive approach to the host perf;
  - Rename kvm_pmu_record_event to kvm_pmu_trigger_event;
  - Add counter enable and CPL check for kvm_pmu_trigger_event();
]
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211130074221.93635-6-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:42 -05:00
Like Xu
40ccb96d54 KVM: x86/pmu: Add pmc->intr to refactor kvm_perf_overflow{_intr}()
Depending on whether intr should be triggered or not, KVM registers
two different event overflow callbacks in the perf_event context.

The code skeleton of these two functions is very similar, so
the pmc->intr can be stored into pmc from pmc_reprogram_counter()
which provides smaller instructions footprint against the
u-architecture branch predictor.

The __kvm_perf_overflow() can be called in non-nmi contexts
and a flag is needed to distinguish the caller context and thus
avoid a check on kvm_is_in_guest(), otherwise we might get
warnings from suspicious RCU or check_preemption_disabled().

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211130074221.93635-5-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:42 -05:00
Like Xu
6ed1298eb0 KVM: x86/pmu: Reuse pmc_perf_hw_id() and drop find_fixed_event()
Since we set the same semantic event value for the fixed counter in
pmc->eventsel, returning the perf_hw_id for the fixed counter via
find_fixed_event() can be painlessly replaced by pmc_perf_hw_id()
with the help of pmc_is_fixed() check.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211130074221.93635-4-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:42 -05:00
Like Xu
7c174f305c KVM: x86/pmu: Refactoring find_arch_event() to pmc_perf_hw_id()
The find_arch_event() returns a "unsigned int" value,
which is used by the pmc_reprogram_counter() to
program a PERF_TYPE_HARDWARE type perf_event.

The returned value is actually the kernel defined generic
perf_hw_id, let's rename it to pmc_perf_hw_id() with simpler
incoming parameters for better self-explanation.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211130074221.93635-3-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:41 -05:00
Like Xu
761875634a KVM: x86/pmu: Setup pmc->eventsel for fixed PMCs
The current pmc->eventsel for fixed counter is underutilised. The
pmc->eventsel can be setup for all known available fixed counters
since we have mapping between fixed pmc index and
the intel_arch_events array.

Either gp or fixed counter, it will simplify the later checks for
consistency between eventsel and perf_hw_id.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211130074221.93635-2-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:41 -05:00
Paolo Bonzini
006a0f0607 KVM: x86: avoid out of bounds indices for fixed performance counters
Because IceLake has 4 fixed performance counters but KVM only
supports 3, it is possible for reprogram_fixed_counters to pass
to reprogram_fixed_counter an index that is out of bounds for the
fixed_pmc_events array.

Ultimately intel_find_fixed_event, which is the only place that uses
fixed_pmc_events, handles this correctly because it checks against the
size of fixed_pmc_events anyway.  Every other place operates on the
fixed_counters[] array which is sized according to INTEL_PMC_MAX_FIXED.
However, it is cleaner if the unsupported performance counters are culled
early on in reprogram_fixed_counters.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:40 -05:00
Lai Jiangshan
5b61178cd2 KVM: VMX: Mark VCPU_EXREG_CR3 dirty when !CR0_PG -> CR0_PG if EPT + !URG
When !CR0_PG -> CR0_PG, vcpu->arch.cr3 becomes active, but GUEST_CR3 is
still vmx->ept_identity_map_addr if EPT + !URG.  So VCPU_EXREG_CR3 is
considered to be dirty and GUEST_CR3 needs to be updated in this case.

Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211216021938.11752-4-jiangshanlai@gmail.com>
Fixes: c62c7bd4f95b ("KVM: VMX: Update vmcs.GUEST_CR3 only when the guest CR3 is dirty")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:40 -05:00
Lai Jiangshan
6b123c3a89 KVM: x86/mmu: Reconstruct shadow page root if the guest PDPTEs is changed
For shadow paging, the page table needs to be reconstructed before the
coming VMENTER if the guest PDPTEs is changed.

But not all paths that call load_pdptrs() will cause the page tables to be
reconstructed. Normally, kvm_mmu_reset_context() and kvm_mmu_free_roots()
are used to launch later reconstruction.

The commit d81135a57aa6("KVM: x86: do not reset mmu if CR0.CD and
CR0.NW are changed") skips kvm_mmu_reset_context() after load_pdptrs()
when changing CR0.CD and CR0.NW.

The commit 21823fbda552("KVM: x86: Invalidate all PGDs for the current
PCID on MOV CR3 w/ flush") skips kvm_mmu_free_roots() after
load_pdptrs() when rewriting the CR3 with the same value.

The commit a91a7c709600("KVM: X86: Don't reset mmu context when
toggling X86_CR4_PGE") skips kvm_mmu_reset_context() after
load_pdptrs() when changing CR4.PGE.

Guests like linux would keep the PDPTEs unchanged for every instance of
pagetable, so this missing reconstruction has no problem for linux
guests.

Fixes: d81135a57aa6("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed")
Fixes: 21823fbda552("KVM: x86: Invalidate all PGDs for the current PCID on MOV CR3 w/ flush")
Fixes: a91a7c709600("KVM: X86: Don't reset mmu context when toggling X86_CR4_PGE")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211216021938.11752-3-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:40 -05:00
Lai Jiangshan
a9f2705ec8 KVM: VMX: Save HOST_CR3 in vmx_set_host_fs_gs()
The host CR3 in the vcpu thread can only be changed when scheduling,
so commit 15ad9762d69f ("KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()")
changed vmx.c to only save it in vmx_prepare_switch_to_guest().

However, it also has to be synced in vmx_sync_vmcs_host_state() when switching VMCS.
vmx_set_host_fs_gs() is called in both places, so rename it to
vmx_set_vmcs_host_state() and make it update HOST_CR3.

Fixes: 15ad9762d69f ("KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211216021938.11752-2-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:39 -05:00
Paolo Bonzini
46cbc0400f Revert "KVM: X86: Update mmu->pdptrs only when it is changed"
This reverts commit 24cd19a28cb7174df502162641d6e1e12e7ffbd9.
Sean Christopherson reports:

"Commit 24cd19a28cb7 ('KVM: X86: Update mmu->pdptrs only when it is
changed') breaks nested VMs with EPT in L0 and PAE shadow paging in L2.
Reproducing is trivial, just disable EPT in L1 and run a VM.  I haven't
investigating how it breaks things."

Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:39 -05:00
Peter Gonda
a6fec53947 selftests: KVM: sev_migrate_tests: Add mirror command tests
Add tests to confirm mirror vms can only run correct subset of commands.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Signed-off-by: Peter Gonda <pgonda@google.com>
Message-Id: <20211208191642.3792819-4-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:38 -05:00
Peter Gonda
427d046a41 selftests: KVM: sev_migrate_tests: Fix sev_ioctl()
TEST_ASSERT in SEV ioctl was allowing errors because it checked return
value was good OR the FW error code was OK. This TEST_ASSERT should
require both (aka. AND) values are OK. Removes the LAUNCH_START from the
mirror VM because this call correctly fails because mirror VMs cannot
call this command. Currently issues with the PSP driver functions mean
the firmware error is not always reset to SEV_RET_SUCCESS when a call is
successful. Mainly sev_platform_init() doesn't correctly set the fw
error if the platform has already been initialized.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Signed-off-by: Peter Gonda <pgonda@google.com>
Message-Id: <20211208191642.3792819-3-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:38 -05:00
Peter Gonda
4c66b56781 selftests: KVM: sev_migrate_tests: Fix test_sev_mirror()
Mirrors should not be able to call LAUNCH_START. Remove the call on the
mirror to correct the test before fixing sev_ioctl() to correctly assert
on this failed ioctl.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Signed-off-by: Peter Gonda <pgonda@google.com>
Message-Id: <20211208191642.3792819-2-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07 10:44:38 -05:00
Paolo Bonzini
1b0c9d00aa KVM/riscv changes for 5.17, take #1
- Use common KVM implementation of MMU memory caches
 - SBI v0.2 support for Guest
 - Initial KVM selftests support
 - Fix to avoid spurious virtual interrupts after clearing hideleg CSR
 - Update email address for Anup and Atish
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmHW6GcACgkQrUjsVaLH
 LAcA+Q//bRKyuC2JGn0qN0e4WOcb8zpDUw5zep8WqlWviFiNjxxVjHeroT//cDtr
 7apwTCJogDlgnkcH0e88CzD3M0Gh/NJ7JAZ/Z1gBMhBMz7afcahnADqXcTuottMf
 x0stMIQxKlQhess2IQa502KGb23uitbLfiY2MzaPVnXbxfBbM08YUPAcIhSSl+iP
 ZXtvweqhrafUoUvaEFXSHkA27QMWEH+vZq4JlRwLSy7y3U3Hd/51nH04Fxp/n4Qh
 5XyWO0mPqmiTb6Dz5I/hx7sZLZ5ErMpFI5II22sZYcOqtrrL59f5I9gvYQOYc7im
 GjyBshD8bB4SVEciMGEJq9QucOw41M6cTFmdiaQR+NCHfMa/A5RPwf0zT+15Xrtg
 zEkNQCRdWgDhxb/cYqKaAQXERfeposr0xS398qoSUT29GXFvv0N+P2N/WAgQqg+D
 2cnhGRMlsdUJEVWUXCJjZ1u/Wwx6gkxJbjvRY48vvvB76eZzr82sOXYEIF8MBrPG
 co5wa/mzUl3CfgzHO4fESvR+hNTbXiPLbW/FPzSdNNMWxB5GOREP42vcDn9be1Xf
 IXBcKlpL2MhExPh+J6DjM3BqMdV+8qywcn0iR0f3W7CuAiJS0sF7Pyn8Ur7GpU6U
 YFwwLWYBQmdPmkXZuH+8fAju2GMjAHxJBCOfpKahhDcWk7TECdE=
 =Ao1E
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-5.17-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 5.17, take #1

- Use common KVM implementation of MMU memory caches
- SBI v0.2 support for Guest
- Initial KVM selftests support
- Fix to avoid spurious virtual interrupts after clearing hideleg CSR
- Update email address for Anup and Atish
2022-01-07 10:43:02 -05:00
Paolo Bonzini
7fd55a02a4 KVM/arm64 updates for Linux 5.16
- Simplification of the 'vcpu first run' by integrating it into
   KVM's 'pid change' flow
 
 - Refactoring of the FP and SVE state tracking, also leading to
   a simpler state and less shared data between EL1 and EL2 in
   the nVHE case
 
 - Tidy up the header file usage for the nvhe hyp object
 
 - New HYP unsharing mechanism, finally allowing pages to be
   unmapped from the Stage-1 EL2 page-tables
 
 - Various pKVM cleanups around refcounting and sharing
 
 - A couple of vgic fixes for bugs that would trigger once
   the vcpu xarray rework is merged, but not sooner
 
 - Add minimal support for ARMv8.7's PMU extension
 
 - Rework kvm_pgtable initialisation ahead of the NV work
 
 - New selftest for IRQ injection
 
 - Teach selftests about the lack of default IPA space and
   page sizes
 
 - Expand sysreg selftest to deal with Pointer Authentication
 
 - The usual bunch of cleanups and doc update
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmHYIpgPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDndsP/RsBmX6bmQnDEhaaqfGAxOETyq/my1eT9r/V
 3Ax4fEqSFfD5yHbYvqNRC8ueycH4r8WAr4ACWDAI6XpS/pYx00nx2N+HCSgjGyQR
 FeXqITuGPEsn4NkGuPci0PFmI8rVUzanl1ugRGQAETVrZo2ZVH2uqKVGT8XOlu0J
 FB/0x6Z4vMuIgEXyfa+DZ8WdW1aCRgPU2oyOdSdWE57/grjyLJqk6EdMmLyaQ19E
 vz6vXuRnA/GQwOtByqYEnQ8a4VXsQedCMqg/f9mj0BxpDzxC1ps8Nrpv36aJXKUN
 LEXapP9bCWPW9LqaKAOZnQYrUIIEFHsCUom0n3reDHrgObA+jivpz75L8GEr3CdC
 Bv78N04Yymjpp2WA6CuO3r9HjL1nJ6tYqobXU2pvqln4nNC3Ukucjq9ZVuWgS6Hx
 qOZXgPcZ/HpS3l/U+dAu8yIcV2SchQXDudaq8BsfLd8M1bD+oirSBolZFSvz7MYZ
 6+jtEDLUOEO5s4rXiJF46+MauxiELcjaewAEK4WwrS8NBwEyhYe9EPsYcQ5pcrQF
 QwAd1+y7oLfhpGHv5KJKWswfvbtlLCm6NOAhawq0UXM8bS+79tu0dGjiDzVPBuSf
 SyA3VtBSKxcpvCrljw9ubtjxvKrviE0MDvlmTP2B1NU+lwm8xRBiwUwOH29qP9zU
 HDeUj2fy
 =HkZk
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for Linux 5.16

- Simplification of the 'vcpu first run' by integrating it into
  KVM's 'pid change' flow

- Refactoring of the FP and SVE state tracking, also leading to
  a simpler state and less shared data between EL1 and EL2 in
  the nVHE case

- Tidy up the header file usage for the nvhe hyp object

- New HYP unsharing mechanism, finally allowing pages to be
  unmapped from the Stage-1 EL2 page-tables

- Various pKVM cleanups around refcounting and sharing

- A couple of vgic fixes for bugs that would trigger once
  the vcpu xarray rework is merged, but not sooner

- Add minimal support for ARMv8.7's PMU extension

- Rework kvm_pgtable initialisation ahead of the NV work

- New selftest for IRQ injection

- Teach selftests about the lack of default IPA space and
  page sizes

- Expand sysreg selftest to deal with Pointer Authentication

- The usual bunch of cleanups and doc update
2022-01-07 10:42:19 -05:00
Linus Torvalds
ddec8ed2d4 RDMA v5.16 fourth rc pull request
- Revert the patch fixing the DM related crash causing a widespread
   regression for kernel ULPs. A proper fix just didn't appear this cycle
   due to the holidays
 
 - Missing NULL check on alloc in uverbs
 
 - Double free in rxe error paths
 
 - Fix a new kernel-infoleak report when forming ah_attr's without GRH's in
   ucma
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmHXiq8ACgkQOG33FX4g
 mxqGYQ//XvZDj6pvhyepeEX3FcLG4mlpeFMAR+GQg1K4HHQgtaFjWv96j9aku4/+
 G0uit4J4U4fVVDCKxIwuuYrOh9KK2r8JIpcbbsPMYb0KQyvBh/ugXta4lVQYzo7o
 h5qiNEmdRx2ugKzMImwRS3HEt7XAIoaysmXlm5FskOP7AYlYew8hS7P29NnnD3BO
 ixysSsvkZXX4N+geBw1YEHZ03W2/7DXRlXXAU4m8lh1ktKIBRwZwmyo2W7AJlTtd
 aJOzZ85zzrkwhwaUhXxCJuKXsCOP774l1TPjbOv0aenEjeLGNBHuxbLcphBpwJ3A
 JASx/VbDZzVZiRwL5TTpxuWVvBbxJdN8TU8QhOJqlnYMPf2IV8q7S/2qTkFm5Dnb
 miaFYVkXWr8MV3Bq4yAvRWBx3Cues5FBZ7Te9lIp8lJsddrweMw00OVj8HKrJU2Q
 gHVgBLfrPFkpohFe+7nSR4p9m47ssRy+/Ey5yPvkK21tePLlQi0lpCLpbioDA47O
 cOI4y0OSHm4QZIKYWcy3ux3F6RoCzbl1Smg0Yma4+UO60IisCyS/OtEU6R6zi/D7
 whplbKIhsDc0//tuuKOqdiVjyTqU4WQ3CXr3uSDClzXjfCnCCJpIHytIBKn0Z8Ow
 4IqY0iY7mFzxf6DbzRGSNUF4BERALUVGpSQkRiGwikEkWU89Ou8=
 =A1/J
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Last pull for 5.16, the reversion has been known for a while now but
  didn't get a proper fix in time. Looks like we will have several
  info-leak bugs to take care of going foward.

   - Revert the patch fixing the DM related crash causing a widespread
     regression for kernel ULPs. A proper fix just didn't appear this
     cycle due to the holidays

   - Missing NULL check on alloc in uverbs

   - Double free in rxe error paths

   - Fix a new kernel-infoleak report when forming ah_attr's without
     GRH's in ucma"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/core: Don't infoleak GRH fields
  RDMA/uverbs: Check for null return of kmalloc_array
  Revert "RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow"
  RDMA/rxe: Prevent double freeing rxe_map_set()
2022-01-06 18:35:17 -08:00
Linus Torvalds
b2b436ec02 Three minor tracing fixes:
- Fix missing prototypes in sample module for direct functions
 
 - Fix check of valid buffer in get_trace_buf()
 
 - Fix annotations of percpu pointers.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYddVnBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qg2PAQDVhSODIERza+YwP4AkMYBLWukngdi4
 2fvFOJa1qdGQ1AD/YMSsJzbqfUk5YL9LNElL37TFH0fyWzU85tXRHVwf4As=
 =KKJx
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Three minor tracing fixes:

   - Fix missing prototypes in sample module for direct functions

   - Fix check of valid buffer in get_trace_buf()

   - Fix annotations of percpu pointers"

* tag 'trace-v5.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Tag trace_percpu_buffer as a percpu pointer
  tracing: Fix check for trace_percpu_buffer validity in get_trace_buf()
  ftrace/samples: Add missing prototypes direct functions
2022-01-06 15:00:43 -08:00
Tejun Heo
bf35a7879f selftests: cgroup: Test open-time cgroup namespace usage for migration checks
When a task is writing to an fd opened by a different task, the perm check
should use the cgroup namespace of the latter task. Add a test for it.

Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
Tejun Heo
613e040e4d selftests: cgroup: Test open-time credential usage for migration checks
When a task is writing to an fd opened by a different task, the perm check
should use the credentials of the latter task. Add a test for it.

Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
Tejun Heo
b09c2baa56 selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644
0644 is an odd perm to create a cgroup which is a directory. Use the regular
0755 instead. This is necessary for euid switching test case.

Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
Tejun Heo
e574576416 cgroup: Use open-time cgroup namespace for process migration perm checks
cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's cgroup namespace which is
a potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.

This patch makes cgroup remember the cgroup namespace at the time of open
and uses it for migration permission checks instad of current's. Note that
this only applies to cgroup2 as cgroup1 doesn't have namespace support.

This also fixes a use-after-free bug on cgroupns reported in

 https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com

Note that backporting this fix also requires the preceding patch.

Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Reported-by: syzbot+50f5cf33a284ce738b62@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com
Fixes: 5136f6365ce3 ("cgroup: implement "nsdelegate" mount option")
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
Tejun Heo
0d2b5955b3 cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
of->priv is currently used by each interface file implementation to store
private information. This patch collects the current two private data usages
into struct cgroup_file_ctx which is allocated and freed by the common path.
This allows generic private data which applies to multiple files, which will
be used to in the following patch.

Note that cgroup_procs iterator is now embedded as procs.iter in the new
cgroup_file_ctx so that it doesn't need to be allocated and freed
separately.

v2: union dropped from cgroup_file_ctx and the procs iterator is embedded in
    cgroup_file_ctx as suggested by Linus.

v3: Michal pointed out that cgroup1's procs pidlist uses of->priv too.
    Converted. Didn't change to embedded allocation as cgroup1 pidlists get
    stored for caching.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
2022-01-06 11:02:29 -10:00
Tejun Heo
1756d7994a cgroup: Use open-time credentials for process migraton perm checks
cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's credentials which is a
potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.

This patch makes both cgroup2 and cgroup1 process migration interfaces to
use the credentials saved at the time of open (file->f_cred) instead of
current's.

Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Fixes: 187fe84067bd ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy")
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:28 -10:00
Dave Airlie
936a93775b Merge tag 'amd-drm-fixes-5.16-2021-12-31' of ssh://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.16-2021-12-31:

amdgpu:
- Suspend/resume fix
- Restore runtime pm behavior with efifb

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211231143825.11479-1-alexander.deucher@amd.com
2022-01-07 06:46:08 +10:00
Chris Packham
72a4a87da8 i2c: mpc: Avoid out of bounds memory access
When performing an I2C transfer where the last message was a write KASAN
would complain:

  BUG: KASAN: slab-out-of-bounds in mpc_i2c_do_action+0x154/0x630
  Read of size 2 at addr c814e310 by task swapper/2/0

  CPU: 2 PID: 0 Comm: swapper/2 Tainted: G    B             5.16.0-rc8 #1
  Call Trace:
  [e5ee9d50] [c08418e8] dump_stack_lvl+0x4c/0x6c (unreliable)
  [e5ee9d70] [c02f8a14] print_address_description.constprop.13+0x64/0x3b0
  [e5ee9da0] [c02f9030] kasan_report+0x1f0/0x204
  [e5ee9de0] [c0c76ee4] mpc_i2c_do_action+0x154/0x630
  [e5ee9e30] [c0c782c4] mpc_i2c_isr+0x164/0x240
  [e5ee9e60] [c00f3a04] __handle_irq_event_percpu+0xf4/0x3b0
  [e5ee9ec0] [c00f3d40] handle_irq_event_percpu+0x80/0x110
  [e5ee9f40] [c00f3e48] handle_irq_event+0x78/0xd0
  [e5ee9f60] [c00fcfec] handle_fasteoi_irq+0x19c/0x370
  [e5ee9fa0] [c00f1d84] generic_handle_irq+0x54/0x80
  [e5ee9fc0] [c0006b54] __do_irq+0x64/0x200
  [e5ee9ff0] [c0007958] __do_IRQ+0xe8/0x1c0
  [c812dd50] [e3eaab20] 0xe3eaab20
  [c812dd90] [c0007a4c] do_IRQ+0x1c/0x30
  [c812dda0] [c0000c04] ExternalInput+0x144/0x160
  --- interrupt: 500 at arch_cpu_idle+0x34/0x60
  NIP:  c000b684 LR: c000b684 CTR: c0019688
  REGS: c812ddb0 TRAP: 0500   Tainted: G    B              (5.16.0-rc8)
  MSR:  00029002 <CE,EE,ME>  CR: 22000488  XER: 20000000

  GPR00: c10ef7fc c812de90 c80ff200 c2394718 00000001 00000001 c10e3f90 00000003
  GPR08: 00000000 c0019688 c2394718 fc7d625b 22000484 00000000 21e17000 c208228c
  GPR16: e3e99284 00000000 ffffffff c2390000 c001bac0 c2082288 c812df60 c001ba60
  GPR24: c23949c0 00000018 00080000 00000004 c80ff200 00000002 c2348ee4 c2394718
  NIP [c000b684] arch_cpu_idle+0x34/0x60
  LR [c000b684] arch_cpu_idle+0x34/0x60
  --- interrupt: 500
  [c812de90] [c10e3f90] rcu_eqs_enter.isra.60+0xc0/0x110 (unreliable)
  [c812deb0] [c10ef7fc] default_idle_call+0xbc/0x230
  [c812dee0] [c00af0e8] do_idle+0x1c8/0x200
  [c812df10] [c00af3c0] cpu_startup_entry+0x20/0x30
  [c812df20] [c001e010] start_secondary+0x5d0/0xba0
  [c812dff0] [c00028a0] __secondary_start+0x90/0xdc

This happened because we would overrun the i2c->msgs array on the final
interrupt for the I2C STOP. This didn't happen if the last message was a
read because there is no interrupt in that case. Ensure that we only
access the current message if we are not processing a I2C STOP
condition.

Fixes: 1538d82f4647 ("i2c: mpc: Interrupt driven transfer")
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-01-06 14:39:59 +01:00
Anup Patel
497685f2c7 MAINTAINERS: Update Anup's email address
I am no longer work at Western Digital so update my email address to
personal one and add entries to .mailmap as well.

Signed-off-by: Anup Patel <anup@brainfault.org>
Acked-by: Atish Patra <atishp@rivosinc.com>
2022-01-06 15:18:22 +05:30
Vincent Chen
33e5b5746c KVM: RISC-V: Avoid spurious virtual interrupts after clearing hideleg CSR
When the last VM is terminated, the host kernel will invoke function
hardware_disable_nolock() on each CPU to disable the related virtualization
functions. Here, RISC-V currently only clears hideleg CSR and hedeleg CSR.
This behavior will cause the host kernel to receive spurious interrupts if
hvip CSR has pending interrupts and the corresponding enable bits in vsie
CSR are asserted. To avoid it, hvip CSR and vsie CSR must be cleared
before clearing hideleg CSR.

Fixes: 99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2022-01-06 15:18:18 +05:30
Anup Patel
3e06cdf105 KVM: selftests: Add initial support for RISC-V 64-bit
We add initial support for RISC-V 64-bit in KVM selftests using
which we can cross-compile and run arch independent tests such as:
demand_paging_test
dirty_log_test
kvm_create_max_vcpus,
kvm_page_table_test
set_memory_region_test
kvm_binary_stats_test

All VM guest modes defined in kvm_util.h require at least 48-bit
guest virtual address so to use KVM RISC-V selftests hardware
need to support at least Sv48 MMU for guest (i.e. VS-mode).

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-and-tested-by: Atish Patra <atishp@rivosinc.com>
2022-01-06 15:17:50 +05:30
Anup Patel
788490e798 KVM: selftests: Add EXTRA_CFLAGS in top-level Makefile
We add EXTRA_CFLAGS to the common CFLAGS of top-level Makefile which will
allow users to pass additional compile-time flags such as "-static".

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-and-tested-by: Atish Patra <atishp@rivosinc.com>
Reviewed-and-tested-by: Sean Christopherson <seanjc@google.com>
2022-01-06 15:17:46 +05:30
Anup Patel
a457fd5660 RISC-V: KVM: Add VM capability to allow userspace get GPA bits
The number of GPA bits supported for a RISC-V Guest/VM is based on the
MMU mode used by the G-stage translation. The KVM RISC-V will detect and
use the best possible MMU mode for the G-stage in kvm_arch_init().

We add a generic VM capability KVM_CAP_VM_GPA_BITS which can be used by
the KVM userspace to get the number of GPA (guest physical address) bits
supported for a Guest/VM.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-and-tested-by: Atish Patra <atishp@rivosinc.com>
2022-01-06 15:16:58 +05:30
Anup Patel
ef8949a986 RISC-V: KVM: Forward SBI experimental and vendor extensions
The SBI experimental extension space is for temporary (or experimental)
stuff whereas SBI vendor extension space is for hardware vendor specific
stuff. Both these SBI extension spaces won't be standardized by the SBI
specification so let's blindly forward such SBI calls to the userspace.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-and-tested-by: Atish Patra <atishp@rivosinc.com>
2022-01-06 15:14:33 +05:30
Jisheng Zhang
637ad6551b RISC-V: KVM: make kvm_riscv_vcpu_fp_clean() static
There are no users outside vcpu_fp.c so make kvm_riscv_vcpu_fp_clean()
static.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2022-01-06 15:13:58 +05:30
Atish Patra
4abed558b2 MAINTAINERS: Update Atish's email address
I am no longer employed by western digital. Update my email address to
personal one and add entries to .mailmap as well.

Signed-off-by: Atish Patra <atishp@atishpatra.org>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2022-01-06 15:13:54 +05:30
Atish Patra
3e1d86569c RISC-V: KVM: Add SBI HSM extension in KVM
SBI HSM extension allows OS to start/stop harts any time. It also allows
ordered booting of harts instead of random booting.

Implement SBI HSM exntesion and designate the vcpu 0 as the boot vcpu id.
All other non-zero non-booting vcpus should be brought up by the OS
implementing HSM extension. If the guest OS doesn't implement HSM
extension, only single vcpu will be available to OS.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2022-01-06 15:12:47 +05:30
Atish Patra
5f862df558 RISC-V: KVM: Add v0.1 replacement SBI extensions defined in v0.2
The SBI v0.2 contains some of the improved versions of required v0.1
extensions such as remote fence, timer and IPI.

This patch implements those extensions.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2022-01-06 15:12:15 +05:30
Atish Patra
c62a768597 RISC-V: KVM: Add SBI v0.2 base extension
SBI v0.2 base extension defined to allow backward compatibility and
probing of future extensions. This is also the only mandatory SBI
extension that must be implemented by SBI implementors.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2022-01-06 15:08:29 +05:30
Atish Patra
a046c2d857 RISC-V: KVM: Reorganize SBI code by moving SBI v0.1 to its own file
With SBI v0.2, there may be more SBI extensions in future. It makes more
sense to group related extensions in separate files. Guest kernel will
choose appropriate SBI version dynamically.

Move the existing implementation to a separate file so that it can be
removed in future without much conflict.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2022-01-06 14:57:16 +05:30
Atish Patra
cf70be9d21 RISC-V: KVM: Mark the existing SBI implementation as v0.1
The existing SBI specification impelementation follows v0.1
specification. The latest specification allows more scalability
and performance improvements.

Rename the existing implementation as v0.1 and provide a way
to allow future extensions.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2022-01-06 14:38:52 +05:30
Sean Christopherson
cc4f602bc4 KVM: RISC-V: Use common KVM implementation of MMU memory caches
Use common KVM's implementation of the MMU memory caches, which for all
intents and purposes is semantically identical to RISC-V's version, the
only difference being that the common implementation will fall back to an
atomic allocation if there's a KVM bug that triggers a cache underflow.

RISC-V appears to have based its MMU code on arm64 before the conversion
to the common caches in commit c1a33aebe91d ("KVM: arm64: Use common KVM
implementation of MMU memory caches"), despite having also copy-pasted
the definition of KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE in kvm_types.h.

Opportunistically drop the superfluous wrapper
kvm_riscv_stage2_flush_cache(), whose name is very, very confusing as
"cache flush" in the context of MMU code almost always refers to flushing
hardware caches, not freeing unused software objects.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2022-01-06 14:38:50 +05:30
Olof Johansson
8922bb6526 SoCFPGA dts updates for v5.16, part 3
- Change the SoCFPGA compatible to "intel,socfpga-qspi"
 - Update dt-bindings document to include "intel,socfpga-qspi"
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEoHhMeiyk5VmwVMwNGZQEC4GjKPQFAmHJlCQUHGRpbmd1eWVu
 QGtlcm5lbC5vcmcACgkQGZQEC4GjKPQIvQ//dpNgaJ6yCM25J8280ip8ry2vZULV
 0+q+KCud3k3Qd8YRzy1iHShkacDaVWJCn9vaQSwV8ertnnOMVnPZJiZjcQExMte9
 lHAuwSDxEd5/hI19H4DMCqZv5xKUC6o4m2N++MrZLLtruv5K8sw2CFfC+TaqgPdi
 JK/JBj1M44tm8CzxqwPTr5abw4OqdFGgaGuK6ZNcuLe70gYJaWBo9UqUbm3efkX6
 HzrtfqykxTNwrUCtWGew/vNrTznhHMo+xz6D6fHJj5UvEBUthIkfoz2GJ/tjX8od
 qBVKE0GiyGsWjuOOTrtVAkhIfv0D3eTYWFl+6uE0J3IdpT7hq09VoRFysMznrv04
 N5T2fOpeuISNVKMPvmBIf0t0HDG5VCcEqu4rRYLEQqiuCYnd0H3Ho1DcZJKlRi5f
 naAkBIabmyMkmqAnLVmP+Dg/AMayswIXGCpLgxqXF2ucGF0k3sq7K+ZwsZxjVKkz
 QBV1elr0NBuFjThDjrtvm6pYJJTR9K8PCuwqlOMIy2OQRyFG+NoSGZrNO2nzmjIe
 JUWUsq1wUhE+EQ7ShlTU82uAbwfDsFR6L0laMr2HmhfKVJOOXZSU0/CTca5p1TgN
 0WBrxkHyYGjc+gFTszXxfOpDLmNXqkdSie4WTYOkAa+DBwrZu15xcdLnEepg8c6U
 FR5QWhvaSDA6qSs=
 =ErOv
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAmHWNWsPHG9sb2ZAbGl4
 b20ubmV0AAoJEIwa5zzehBx3QXsP/RMpoaAC6Posg0d8N3IzwzWKf2z3uvatL7sZ
 fuj9BiduSOsSXs5wQcYyFvCXzWif8C3TgUxGnhp589ZlZVC2Mdy8rJurpg2qvRO2
 JSko7Jq0u+ct+YQef1JsS0QcCznF6pmUUoFMrfcVeThLZEJK2J+JXNyx5EbeOHts
 8Q2Tz4+UkI/4985xqrTE+WPonnPjgk8pyIQDUpQuFmMaiFE1MUKRZtBbBYZ2Wigx
 yN9GMwRVx6op+dKZa8V1tyiW/Ls7Jj7BVi3M98X0VHcf1kD2vchERVUvOQtMR4gx
 SU+AhrjE8o17D8PfPpUCB1MKzWtRkzlcRypLgCG/BrVIIXtnSIDH5X/XRTZx8vqD
 CFtdo13EbDai/7tuYPNAYXG/JsIR5uFMuV4gMhQmdyo5hqgSeQFA1AsrV6T9xBVo
 6P2Gy71S08BCLnP51tzUeegEQgu29W5aEbiumEYQvPTxNCq4AWLGG/z9phl3aMxq
 nZVT5qfipH0IE01euExAaA4Fx5K6cbzAJCpZp+z8+sJiB47Yer2p6dxgGhyA0DMh
 3DRN9NaR/3R7AoxJThmKixEVSolmihR3rc4ftaGs2lttgJmj5W80QT9biDB5KyL8
 BbJTymPQDlml3WnV7oxPGcFKUTaAzzr+K+PicbbZ/8m/91KLCsTt7NUPQdq0LAu6
 L8qsgVLO
 =p4VG
 -----END PGP SIGNATURE-----

Merge tag 'socfpga_fix_for_v5.16_part_3' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into arm/fixes

SoCFPGA dts updates for v5.16, part 3
- Change the SoCFPGA compatible to "intel,socfpga-qspi"
- Update dt-bindings document to include "intel,socfpga-qspi"

* tag 'socfpga_fix_for_v5.16_part_3' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: (361 commits)
  ARM: dts: socfpga: change qspi to "intel,socfpga-qspi"
  dt-bindings: spi: cadence-quadspi: document "intel,socfpga-qspi"
  Linux 5.16-rc7
  mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page()
  mm/damon/dbgfs: protect targets destructions with kdamond_lock
  mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid
  mm: delete unsafe BUG from page_cache_add_speculative()
  mm, hwpoison: fix condition in free hugetlb page path
  MAINTAINERS: mark more list instances as moderated
  kernel/crash_core: suppress unknown crashkernel parameter warning
  mm: mempolicy: fix THP allocations escaping mempolicy restrictions
  kfence: fix memory leak when cat kfence objects
  platform/x86: intel_pmc_core: fix memleak on registration failure
  net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M
  r8152: sync ocp base
  r8152: fix the force speed doesn't work for RTL8156
  net: bridge: fix ioctl old_deviceless bridge argument
  net: stmmac: ptp: fix potentially overflowing expression
  net: dsa: tag_ocelot: use traffic class to map priority on injected header
  veth: ensure skb entering GRO are not cloned.
  ...

Link: https://lore.kernel.org/r/20211227103644.566694-1-dinguyen@kernel.org
Signed-off-by: Olof Johansson <olof@lixom.net>
2022-01-05 16:18:50 -08:00
Olof Johansson
fde9ec3c1b Reset controller fixes for v5.16, part 2
Fix pm_runtime_resume_and_get() error handling in the
 reset-rzg2l-usbphy-ctrl driver.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQRRO6F6WdpH1R0vGibVhaclGDdiwAUCYdXQMRcccC56YWJlbEBw
 ZW5ndXRyb25peC5kZQAKCRDVhaclGDdiwCrJAQC/nW5YH9o0PuredqlUtha/Akpc
 jQmrDZOfHmrm8GOJiAD+OtRE1NHjgm6CXan0QYwa2Dbb+yYifvOZL/SLo5raAw8=
 =ScZk
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAmHWNUkPHG9sb2ZAbGl4
 b20ubmV0AAoJEIwa5zzehBx3b64P+wSxOSOinbkaY/QSj57rZAHdt05OPWjQc0+p
 A6cicMiOpcF+dxRp9ZNOmAv/ViZXsHfSnZg15/tzvWT9bsQcekXHTaeCpm8d4Ku9
 LvUS2X1LtnTH/E+I1/No+W9ljJ3sEkmLFigAAh2I7gSAHrba/RQNhv4oMisw4pNP
 SLZRYuju5MLfPg4wQ8nfkq6rwkNuqfHKueyWKu0R9507oNzXzuzTQcpuZvRPVbWL
 Gpi6QRfQeC06tPM/9nbaPCaGeG12opio8rhNnuLrWqRMQHVUUbQ+VmBTM1ZlDhpv
 8UNkW2AqXUrf0ijopzKzzKLWB5iNhPtWav3T+oTppGvB4Z7jYHUwjtotlYqI6uAQ
 VqtGmObxfnY0xx+RpXWEGuCFCd43AhqdNpR0X4miyh06E0fGpgQWCEbQSDRep/iJ
 tbpDR2cfD7JB2m1asYD+72NHqe53UgCXvgt/xf+gpdR2I2BVcQxj0YfieSZvS5Hq
 FY9OTMl1xZE3SUIPi1eZmZAeqmKRaj7ZCKslgqD+pfnWU/+WEGI4f/M30h/7BAK0
 vNvnJ44T6ozPshCfiHNl2L0z6XC0bA4mvi4z7gnuXRAFA71I5Eg9s1mYqwG3UDZU
 /4JKkzr7Al6BdXZZtXjkDsCK+agyhSa95ywNbUi7XqtD/C0qybVQFfwGqJQ+/HH8
 6M2wIqZj
 =/Jsy
 -----END PGP SIGNATURE-----

Merge tag 'reset-fixes-for-v5.16-2' of git://git.pengutronix.de/pza/linux into arm/fixes

Reset controller fixes for v5.16, part 2

Fix pm_runtime_resume_and_get() error handling in the
reset-rzg2l-usbphy-ctrl driver.

* tag 'reset-fixes-for-v5.16-2' of git://git.pengutronix.de/pza/linux:
  reset: renesas: Fix Runtime PM usage
  reset: tegra-bpmp: Revert Handle errors in BPMP response

Link: https://lore.kernel.org/r/20220105172515.273947-1-p.zabel@pengutronix.de
Signed-off-by: Olof Johansson <olof@lixom.net>
2022-01-05 16:18:17 -08:00
Naveen N. Rao
f28439db47 tracing: Tag trace_percpu_buffer as a percpu pointer
Tag trace_percpu_buffer as a percpu pointer to resolve warnings
reported by sparse:
  /linux/kernel/trace/trace.c:3218:46: warning: incorrect type in initializer (different address spaces)
  /linux/kernel/trace/trace.c:3218:46:    expected void const [noderef] __percpu *__vpp_verify
  /linux/kernel/trace/trace.c:3218:46:    got struct trace_buffer_struct *
  /linux/kernel/trace/trace.c:3234:9: warning: incorrect type in initializer (different address spaces)
  /linux/kernel/trace/trace.c:3234:9:    expected void const [noderef] __percpu *__vpp_verify
  /linux/kernel/trace/trace.c:3234:9:    got int *

Link: https://lkml.kernel.org/r/ebabd3f23101d89cb75671b68b6f819f5edc830b.1640255304.git.naveen.n.rao@linux.vnet.ibm.com

Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 07d777fe8c398 ("tracing: Add percpu buffers for trace_printk()")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2022-01-05 18:53:49 -05:00
Naveen N. Rao
823e670f7e tracing: Fix check for trace_percpu_buffer validity in get_trace_buf()
With the new osnoise tracer, we are seeing the below splat:
    Kernel attempted to read user page (c7d880000) - exploit attempt? (uid: 0)
    BUG: Unable to handle kernel data access on read at 0xc7d880000
    Faulting instruction address: 0xc0000000002ffa10
    Oops: Kernel access of bad area, sig: 11 [#1]
    LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
    ...
    NIP [c0000000002ffa10] __trace_array_vprintk.part.0+0x70/0x2f0
    LR [c0000000002ff9fc] __trace_array_vprintk.part.0+0x5c/0x2f0
    Call Trace:
    [c0000008bdd73b80] [c0000000001c49cc] put_prev_task_fair+0x3c/0x60 (unreliable)
    [c0000008bdd73be0] [c000000000301430] trace_array_printk_buf+0x70/0x90
    [c0000008bdd73c00] [c0000000003178b0] trace_sched_switch_callback+0x250/0x290
    [c0000008bdd73c90] [c000000000e70d60] __schedule+0x410/0x710
    [c0000008bdd73d40] [c000000000e710c0] schedule+0x60/0x130
    [c0000008bdd73d70] [c000000000030614] interrupt_exit_user_prepare_main+0x264/0x270
    [c0000008bdd73de0] [c000000000030a70] syscall_exit_prepare+0x150/0x180
    [c0000008bdd73e10] [c00000000000c174] system_call_vectored_common+0xf4/0x278

osnoise tracer on ppc64le is triggering osnoise_taint() for negative
duration in get_int_safe_duration() called from
trace_sched_switch_callback()->thread_exit().

The problem though is that the check for a valid trace_percpu_buffer is
incorrect in get_trace_buf(). The check is being done after calculating
the pointer for the current cpu, rather than on the main percpu pointer.
Fix the check to be against trace_percpu_buffer.

Link: https://lkml.kernel.org/r/a920e4272e0b0635cf20c444707cbce1b2c8973d.1640255304.git.naveen.n.rao@linux.vnet.ibm.com

Cc: stable@vger.kernel.org
Fixes: e2ace001176dc9 ("tracing: Choose static tp_printk buffer by explicit nesting count")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2022-01-05 18:51:25 -05:00
Jiri Olsa
0daf5cb217 ftrace/samples: Add missing prototypes direct functions
There's another compilation fail (first here [1]) reported by kernel
test robot for W=1 clang build:

  >> samples/ftrace/ftrace-direct-multi-modify.c:7:6: warning: no previous
  prototype for function 'my_direct_func1' [-Wmissing-prototypes]
     void my_direct_func1(unsigned long ip)

Direct functions in ftrace direct sample modules need to have prototypes
defined. They are already global in order to be visible for the inline
assembly, so there's no problem.

The kernel test robot reported just error for ftrace-direct-multi-modify,
but I got same errors also for the rest of the modules touched by this patch.

[1] 67d4f6e3bf5d ftrace/samples: Add missing prototype for my_direct_func

Link: https://lkml.kernel.org/r/20211219135317.212430-1-jolsa@kernel.org

Reported-by: kernel test robot <lkp@intel.com>
Fixes: e1067a07cfbc ("ftrace/samples: Add module to test multi direct modify interface")
Fixes: ae0cc3b7e7f5 ("ftrace/samples: Add a sample module that implements modify_ftrace_direct()")
Fixes: 156473a0ff4f ("ftrace: Add another example of register_ftrace_direct() use case")
Fixes: b06457c83af6 ("ftrace: Add sample module that uses register_ftrace_direct()")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2022-01-05 18:34:50 -05:00
Linus Torvalds
75acfdb6fd Networking fixes for 5.16-final, including fixes from bpf, and WiFi.
Current release - regressions:
 
   - Revert "xsk: Do not sleep in poll() when need_wakeup set",
     made the problem worse
 
   - Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in
     __fixed_phy_register", broke EPROBE_DEFER handling
 
   - Revert "net: usb: r8152: Add MAC pass-through support for more
     Lenovo Docks", broke setups without a Lenovo dock
 
 Current release - new code bugs:
 
   - selftests: set amt.sh executable
 
 Previous releases - regressions:
 
   - batman-adv: mcast: don't send link-local multicast to mcast routers
 
 Previous releases - always broken:
 
   - ipv4/ipv6: check attribute length for RTA_FLOW / RTA_GATEWAY
 
   - sctp: hold endpoint before calling cb in
 	sctp_transport_lookup_process
 
   - mac80211: mesh: embed mesh_paths and mpp_paths into
     ieee80211_if_mesh to avoid complicated handling of sub-object
     allocation failures
 
   - seg6: fix traceroute in the presence of SRv6
 
   - tipc: fix a kernel-infoleak in __tipc_sendmsg()
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmHV/ksACgkQMUZtbf5S
 IrtZHBAAotpSY1buJLCHC+4EdqyMvYdcuTQJsqYBx2oNdMJ2D5bPSX7d2u2xkhgR
 kBL7cAfnH6C7IdgLirh+JbHG2j1e3WMJikhqtWEMcBMt0eYRzEPGOnABYBjd8wdb
 Ie6IiLw/0zXAdE5pfh2yzHTgyzaGPImA04E45nimoxiHOVWJLCFvI5H4BZvK9JLj
 tmRxFG37m5wWRMdfsizXCvFJyMlg52FLIO1Duu82Gc7ZWMiYnxkD1dF8kzFj2jXM
 wmIWRg1wJa+7mHJHPdUR2I1BNWaapamVVa+9NDONWOi3stImUEqNNDHuzlu4hT/p
 khRXZNPHIbB/c7yR7bCJ9YK/raKKYh5GPRanF0YRL2RDqf80V7uLtVoQ8/Sar4pM
 L2jRAC76SGdHVGJMckVV9LE9NPKTNYw0cA97MhwL5Nc/Ks0oB4oBxfG56350S8sb
 5hel3pJ6lFoWIr88qWgJXzgkVLxLvG7EQBFg6URwGJjBgLLJLzMMO88ALrqR+SN+
 tEwTfcjuG+9tEVIb4DQuXQm0LKcfD8Z7FzHEf5ikoyAbOSbGwZzr4vZu8fOw5Z1y
 Z1YihoEoaHv1sZGGQf4MKD71cZmVrTDgYRZ5p/00jXs/NY6EyWCR2+j1tADgjFvY
 UNKa4LlQPx1hfe9QxCpSBRf/eULYZjWT1qzfj4GVX9W9bk+Cz8c=
 =xIOF
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski"
 "Networking fixes, including fixes from bpf, and WiFi. One last pull
  request, turns out some of the recent fixes did more harm than good.

  Current release - regressions:

   - Revert "xsk: Do not sleep in poll() when need_wakeup set", made the
     problem worse

   - Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in
     __fixed_phy_register", broke EPROBE_DEFER handling

   - Revert "net: usb: r8152: Add MAC pass-through support for more
     Lenovo Docks", broke setups without a Lenovo dock

  Current release - new code bugs:

   - selftests: set amt.sh executable

  Previous releases - regressions:

   - batman-adv: mcast: don't send link-local multicast to mcast routers

  Previous releases - always broken:

   - ipv4/ipv6: check attribute length for RTA_FLOW / RTA_GATEWAY

   - sctp: hold endpoint before calling cb in
     sctp_transport_lookup_process

   - mac80211: mesh: embed mesh_paths and mpp_paths into
     ieee80211_if_mesh to avoid complicated handling of sub-object
     allocation failures

   - seg6: fix traceroute in the presence of SRv6

   - tipc: fix a kernel-infoleak in __tipc_sendmsg()"

* tag 'net-5.16-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
  selftests: set amt.sh executable
  Revert "net: usb: r8152: Add MAC passthrough support for more Lenovo Docks"
  sfc: The RX page_ring is optional
  iavf: Fix limit of total number of queues to active queues of VF
  i40e: Fix incorrect netdev's real number of RX/TX queues
  i40e: Fix for displaying message regarding NVM version
  i40e: fix use-after-free in i40e_sync_filters_subtask()
  i40e: Fix to not show opcode msg on unsuccessful VF MAC change
  ieee802154: atusb: fix uninit value in atusb_set_extended_addr
  mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh
  mac80211: initialize variable have_higher_than_11mbit
  sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc
  netrom: fix copying in user data in nr_setsockopt
  udp6: Use Segment Routing Header for dest address if present
  icmp: ICMPV6: Examine invoking packet for Segment Route Headers.
  seg6: export get_srh() for ICMP handling
  Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register"
  ipv6: Do cleanup if attribute validation fails in multipath route
  ipv6: Continue processing multipath route even if gateway attribute is invalid
  net/fsl: Remove leftover definition in xgmac_mdio
  ...
2022-01-05 14:08:56 -08:00
Leon Romanovsky
b35a0f4dd5 RDMA/core: Don't infoleak GRH fields
If dst->is_global field is not set, the GRH fields are not cleared
and the following infoleak is reported.

=====================================================
BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline]
BUG: KMSAN: kernel-infoleak in _copy_to_user+0x1c9/0x270 lib/usercopy.c:33
 instrument_copy_to_user include/linux/instrumented.h:121 [inline]
 _copy_to_user+0x1c9/0x270 lib/usercopy.c:33
 copy_to_user include/linux/uaccess.h:209 [inline]
 ucma_init_qp_attr+0x8c7/0xb10 drivers/infiniband/core/ucma.c:1242
 ucma_write+0x637/0x6c0 drivers/infiniband/core/ucma.c:1732
 vfs_write+0x8ce/0x2030 fs/read_write.c:588
 ksys_write+0x28b/0x510 fs/read_write.c:643
 __do_sys_write fs/read_write.c:655 [inline]
 __se_sys_write fs/read_write.c:652 [inline]
 __ia32_sys_write+0xdb/0x120 fs/read_write.c:652
 do_syscall_32_irqs_on arch/x86/entry/common.c:114 [inline]
 __do_fast_syscall_32+0x96/0xf0 arch/x86/entry/common.c:180
 do_fast_syscall_32+0x34/0x70 arch/x86/entry/common.c:205
 do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:248
 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c

Local variable resp created at:
 ucma_init_qp_attr+0xa4/0xb10 drivers/infiniband/core/ucma.c:1214
 ucma_write+0x637/0x6c0 drivers/infiniband/core/ucma.c:1732

Bytes 40-59 of 144 are uninitialized
Memory access of size 144 starts at ffff888167523b00
Data copied to user address 0000000020000100

CPU: 1 PID: 25910 Comm: syz-executor.1 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
=====================================================

Fixes: 4ba66093bdc6 ("IB/core: Check for global flag when using ah_attr")
Link: https://lore.kernel.org/r/0e9dd51f93410b7b2f4f5562f52befc878b71afa.1641298868.git.leonro@nvidia.com
Reported-by: syzbot+6d532fa8f9463da290bc@syzkaller.appspotmail.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-01-05 16:30:19 -04:00
Taehee Yoo
db54c12a3d selftests: set amt.sh executable
amt.sh test script will not work because it doesn't have execution
permission. So, it adds execution permission.

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: c08e8baea78e ("selftests: add amt interface selftest script")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220105144436.13415-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05 10:27:19 -08:00