IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Commit e5afdf9dd515 ("powerpc/vfio_spapr_tce: Add reference counting to
iommu_table") missed an iommu_table allocation in the pseries vio code.
The iommu_table is allocated with kzalloc and as a result the associated
kref gets a value of zero. This has the side effect that during a DLPAR
remove of the associated virtual IOA the iommu_tce_table_put() triggers
a use-after-free underflow warning.
Call Trace:
[c0000002879e39f0] [c00000000071ecb4] refcount_warn_saturate+0x184/0x190
(unreliable)
[c0000002879e3a50] [c0000000000500ac] iommu_tce_table_put+0x9c/0xb0
[c0000002879e3a70] [c0000000000f54e4] vio_dev_release+0x34/0x70
[c0000002879e3aa0] [c00000000087cfa4] device_release+0x54/0xf0
[c0000002879e3b10] [c000000000d64c84] kobject_cleanup+0xa4/0x240
[c0000002879e3b90] [c00000000087d358] put_device+0x28/0x40
[c0000002879e3bb0] [c0000000007a328c] dlpar_remove_slot+0x15c/0x250
[c0000002879e3c50] [c0000000007a348c] remove_slot_store+0xac/0xf0
[c0000002879e3cd0] [c000000000d64220] kobj_attr_store+0x30/0x60
[c0000002879e3cf0] [c0000000004ff13c] sysfs_kf_write+0x6c/0xa0
[c0000002879e3d10] [c0000000004fde4c] kernfs_fop_write+0x18c/0x260
[c0000002879e3d60] [c000000000410f3c] __vfs_write+0x3c/0x70
[c0000002879e3d80] [c000000000415408] vfs_write+0xc8/0x250
[c0000002879e3dd0] [c0000000004157dc] ksys_write+0x7c/0x120
[c0000002879e3e20] [c00000000000b278] system_call+0x5c/0x68
Further, since the refcount was always zero the iommu_tce_table_put()
fails to call the iommu_table release function resulting in a leak.
Fix this issue be initilizing the iommu_table kref immediately after
allocation.
Fixes: e5afdf9dd515 ("powerpc/vfio_spapr_tce: Add reference counting to iommu_table")
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1579558202-26052-1-git-send-email-tyreld@linux.ibm.com
Setting ND_REGION_PAGEMAP flag implies namespace mode defaults to fsdax mode.
This also means kernel ends up creating struct page backing for these namspace
ranges. With large namespaces that is not the right thing to do. We
should let the user select the mode he/she wants the namespace to be created
with.
Hence disable ND_REGION_PAGEMAP for papr_scm regions. We still keep the flag for
of_pmem because it supports only small persistent memory regions.
This is similar to what is done for x86 with commit
commit: 004f1afbe199 ("libnvdimm, pmem: direct map legacy pmem by default")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200108064647.169637-1-aneesh.kumar@linux.ibm.com
'read_barrier_depends()' doesn't exist anymore so stop talking about it.
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Fix our hash MMU code to avoid having overlapping ids between user and kernel,
which isn't as bad as it sounds but led to crashes on some machines.
A fix for the Power9 XIVE interrupt code, which could return the wrong interrupt
state in obscure error conditions.
A minor Kconfig fix for the recently added CONFIG_PPC_UV code.
Thanks to:
Aneesh Kumar K.V, Bharata B Rao, Cédric Le Goater, Frederic Barrat.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl4q3iMTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgM1fEAC7YefQZsuIstV2bNH9xFJe/oEOSLWt
DqZYfWZRBRZRdRMAP7+hmXpsLfHAB3QSGv4yEc2Pgm5Y5vYuv8MWqEKkpJoivIPW
/Fu7mNGI06IWinMyl59dLA5fXPNeOzSNcdPnBdIUC/62XU414IGsy273X3LN4YiM
hfXSpPA3tpSjVBTeqfKTJowXWMOvbkwiHMs2ziAu4Y9GDauJsU0UiOxMkl2MYT7Z
PULO/SrjYSbP8+MazD/YdJ5k7F4rNV3bVDPUCU305tKIkrqRmBs+8DXjeUAMWVjj
y2huUZ/0XXsr8QZyyGfMhEC1sJKAD3colG4LHUysbWz1oInY93CvXELdPAMXhqrH
xXTzx9ae74Xk7t7WaGgq5xJGHJCDSg1aA2HPkUg6Gbk6iJIztOCkKMdSPvFMo18S
p+Kjis8db83RPXQm3ooW7s/xhp6AaDiXd8khQ7A2efgt4SDTSPFvgv2q4CMDC1lP
njchgzgPJ635MSd3CC/u6i7eViuUylb6SfhFVYY0DqOOvNJVrj1W6M4N7tpWSrRS
PXJI2/NwlL3WIajKO3KfqTRI7QaSLtccTrXGLiGBxx/ooAowP94/WL3CYJAeuO2F
NmA1GXH6/WXvzMMNm5NB0kCOUOOhtJk/MFuO7Agly6YL9p36fqVz4WIJVAqtwJLO
uPu0xASJCgp9Mg==
=r/RM
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.5-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.5:
- Fix our hash MMU code to avoid having overlapping ids between user
and kernel, which isn't as bad as it sounds but led to crashes on
some machines.
- A fix for the Power9 XIVE interrupt code, which could return the
wrong interrupt state in obscure error conditions.
- A minor Kconfig fix for the recently added CONFIG_PPC_UV code.
Thanks to Aneesh Kumar K.V, Bharata B Rao, Cédric Le Goater, Frederic
Barrat"
* tag 'powerpc-5.5-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm/hash: Fix sharing context ids between kernel & userspace
powerpc/xive: Discard ESB load value when interrupt is invalid
powerpc: Ultravisor: Fix the dependencies for CONFIG_PPC_UV
Add kvm_vcpu_destroy() and wire up all architectures to call the common
function instead of their arch specific implementation. The common
destruction function will be used by future patches to move allocation
and initialization of vCPUs to common KVM code, i.e. to free resources
that are allocated by arch agnostic code.
No functional change intended.
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a pre-allocation arch hook to handle checks that are currently done
by arch specific code prior to allocating the vCPU object. This paves
the way for moving the allocation to common KVM code.
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove the superfluous kvm_arch_vcpu_free() as it is no longer called
from commmon KVM code. Note, kvm_arch_vcpu_destroy() *is* called from
common code, i.e. choosing which function to whack is not completely
arbitrary.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the kvm_cpu_{un}init() calls to common PPC code as an intermediate
step towards removing kvm_cpu_{un}init() altogether.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the initialization of oldpir so that the call to kvm_vcpu_init() is
at the top of kvmppc_core_vcpu_create_e500mc(). oldpir is only use
when loading/putting a vCPU, which currently cannot be done until after
kvm_arch_vcpu_create() completes. Reording the call to kvm_vcpu_init()
paves the way for moving the invocation to common PPC code.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Call kvm_vcpu_init() in kvmppc_core_vcpu_create_pr() prior to allocating
the book3s and shadow_vcpu objects in preparation of moving said call to
common PPC code. Although kvm_vcpu_init() has an arch callback, the
callback is empty for Book3S PR, i.e. barring unseen black magic, moving
the allocation has no real functional impact.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move allocation of all flavors of PPC vCPUs to common PPC code. All
variants either allocate 'struct kvm_vcpu' directly, or require that
the embedded 'struct kvm_vcpu' member be located at offset 0, i.e.
guarantee that the allocation can be directly interpreted as a 'struct
kvm_vcpu' object.
Remove the message from the build-time assertion regarding placement of
the struct, as compatibility with the arch usercopy region is no longer
the sole dependent on 'struct kvm_vcpu' being at offset zero.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In preparation for moving vcpu allocation to common PPC code, add an
explicit, albeit redundant, build-time assert to ensure the vcpu member
is located at offset 0. The assert is redundant in the sense that
kvmppc_core_vcpu_create_e500() contains a functionally identical assert.
The motiviation for adding the extra assert is to provide visual
confirmation of the correctness of moving vcpu allocation to common
code.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Explicitly free the shared page if kvmppc_mmu_init() fails during
kvmppc_core_vcpu_create(), as the page is freed only in
kvmppc_core_vcpu_free(), which is not reached via kvm_vcpu_uninit().
Fixes: 96bc451a15329 ("KVM: PPC: Introduce shared page")
Cc: stable@vger.kernel.org
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Call kvm_vcpu_uninit() if vcore creation fails to avoid leaking any
resources allocated by kvm_vcpu_init(), i.e. the vcpu->run page.
Fixes: 371fefd6f2dc4 ("KVM: PPC: Allow book3s_hv guests to use SMT processor modes")
Cc: stable@vger.kernel.org
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add fsl,erratum-a011043 to internal MDIO buses.
Software may get false read error when reading internal
PCS registers through MDIO. As a workaround, all internal
MDIO accesses should ignore the MDIO_CFG[MDIO_RD_ER] bit.
Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Dave Hansen <dave.hansen@linux.intel.com>
MPX is being removed from the kernel due to a lack of support
in the toolchain going forward (gcc).
arch_bprm_mm_init() is used at execve() time. The only non-stub
implementation is on x86 for MPX. Remove the hook entirely from
all architectures and generic code.
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: x86@kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
This is only used in pci-ioda.c so move it there and rename it to match.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200110070207.439-6-oohall@gmail.com
pnv_pci_dma_dev_setup() does nothing but call the phb->dma_dev_setup()
callback, if one exists. That callback is only set for normal PCIe PHBs so
we can remove the layer of indirection and use the ioda version in
the pci_controller_ops.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200110070207.439-5-oohall@gmail.com
An ioda_pe for each VF is allocated in pnv_pci_sriov_enable() before
the pci_dev for the VF is created. We need to set the pe->pdev pointer
at some point after the pci_dev is created. Currently we do that in:
pcibios_bus_add_device()
pnv_pci_dma_dev_setup() (via phb->ops.dma_dev_setup)
/* fixup is done here */
pnv_pci_ioda_dma_dev_setup() (via pnv_phb->dma_dev_setup)
The fixup needs to be done before setting up DMA for for the VF's PE,
but there's no real reason to delay it until this point. Move the
fixup into pnv_pci_ioda_fixup_iov() so the ordering is:
pcibios_add_device()
pnv_pci_ioda_fixup_iov() (via ppc_md.pcibios_fixup_sriov)
pcibios_bus_add_device()
...
This isn't strictly required, but it's a slightly more logical place
to do the fixup and it simplifies pnv_pci_dma_dev_setup().
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200110070207.439-4-oohall@gmail.com
The pnv_pci_dma_dev_setup() only does something when:
1) There PHB contains VFs, or
2) The PHB defines a dma_dev_setup() callback in the pnv_phb structure.
Neither is true for NPU PHBs so there's no reason to set the callback.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200110070207.439-3-oohall@gmail.com
pcibios_bus_add_device() is the only caller of pcibios_setup_device().
Fold them together since there's no real reason to keep them separate.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200110070207.439-2-oohall@gmail.com
OPAL provides several different kinds of reboot for the kernel to use,
namely forcing a full reboot, platform error reboot and MPIPL. Right now
triggering the alternative resets requires some ad-hoc method such as
triggering a kernel crash and hoping the stars align. It's sometimes handy
to be able to trigger one of these resets directly, so add a way to do
that.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191101085522.3055-2-oohall@gmail.com
On PowerNV a few different kinds of reboot are supported. We'd like to be
able to exercise these from xmon so allow 'zr' to take an argument, and
pass that to the ppc_md.restart() function.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191101085522.3055-1-oohall@gmail.com
Long before we had a generic way for firmware to export memory ranges of
interest we added a special case for the skiboot symbol map. The code is
pretty much identical to the generic export so re-use the code.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191101062611.32610-2-oohall@gmail.com
Originally we only had a handful of exported memory ranges, but we'd to
export the per-core trace buffers. This results in a lot of files in the
exports directory which is a but unfortunate. We can clean things up a bit
by turning subnodes into subdirectories of the exports directory.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191101062611.32610-1-oohall@gmail.com
Many drivers don't check for errors when they get a 0xFFs response from an
MMIO load. As a result after an EEH event occurs a driver can get stuck in
a polling loop unless it some kind of internal timeout logic.
Currently EEH tries to detect and report stuck drivers by dumping a stack
trace after eeh_dev_check_failure() is called EEH_MAX_FAILS times on an
already frozen PE. The value of EEH_MAX_FAILS was chosen so that a dump
would occur every few seconds if the driver was spinning in a loop. This
results in a lot of spurious stack traces in the kernel log.
Fix this by limiting it to printing one stack trace for each PE freeze. If
the driver is truely stuck the kernel's hung task detector is better suited
to reporting the probelm anyway.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Tested-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191016012536.22588-1-oohall@gmail.com
Add a debugfs entry to dump the state of the active IODA PEs. The IODA
PE state reflects how the PHB's internal concept of a PE is
configured. This is separate to the EEH PE state and is managed power
the PowerNV PCI backend rather than the EEH core.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[mpe: Use DEFINE_DEBUGFS_ATTRIBUTE]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190912052945.12589-3-oohall@gmail.com
Make the dump trigger off any input rather than just '1'. This allows you
to write "echo 1> dump_diag_data" and it'll do what you want rather than
erroring out pointlessly.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190912052945.12589-2-oohall@gmail.com
Use the pnv_phb structure as the private data pointer for the debugfs
files. This lets us delete some code and an open-coded use of
hose->private_data.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190912052945.12589-1-oohall@gmail.com
These functions can only be used on a SR-IOV capable physical function and
they're only called in pcibios_sriov_enable / disable. Make them emit a
warning in the future if they're used incorrectly and remove the dead
code that checks if the device is a VF.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190821062655.19735-3-oohall@gmail.com
The powerpc PCI code requires that a pci_dn structure exists for all
devices in the system. This is fine for real devices since at boot a pci_dn
is created for each PCI device in the DT and it's fine for hotplugged devices
since the hotplug slot driver will manage the pci_dn's devices in hotplug
slots. For SR-IOV, we need the platform / pcibios to manage the pci_dn for
virtual functions since firmware is unaware of VFs, and they aren't
"hot plugged" in the traditional sense.
Management of the pci_dn is handled by the, poorly named, functions:
add_pci_dev_data() and remove_pci_dev_data(). The entire body of these
functions is #ifdef`ed around CONFIG_PCI_IOV and they cannot be used
in any other context, so make them only available when CONFIG_PCI_IOV
is selected, and rename them to reflect their actual usage rather than
having them masquerade as generic code.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190821062655.19735-2-oohall@gmail.com
When disabling virtual functions on an SR-IOV adapter we currently do not
correctly remove the EEH state for the now-dead virtual functions. When
removing the pci_dn that was created for the VF when SR-IOV was enabled
we free the corresponding eeh_dev without removing it from the child device
list of the eeh_pe that contained it. This can result in crashes due to the
use-after-free.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Tested-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190821062655.19735-1-oohall@gmail.com
The eeh_sysfs_remove_device() function is supposed to clear the
EEH_DEV_SYSFS flag since it indicates the EEH sysfs entries have been added
for a pci_dev.
When the sysfs files are removed eeh_remove_device() the eeh_dev and the
pci_dev have already been de-associated. This then causes the
pci_dev_to_eeh_dev() call in eeh_sysfs_remove_device() to return NULL so
the flag can't be cleared from the still-live eeh_dev. This problem is
worked around in the caller by clearing the flag manually. However, this
behaviour doesn't make a whole lot of sense, so this patch fixes it by:
a) Re-ordering eeh_remove_device() so that eeh_sysfs_remove_device() is
called before de-associating the pci_dev and eeh_dev.
b) Making eeh_sysfs_remove_device() emit a warning if there's no
corresponding eeh_dev for a pci_dev. The paths where the sysfs
files are only reachable if EEH was setup for the device
for the device in the first place so hitting this warning
indicates a programming error.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Tested-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190715085612.8802-6-oohall@gmail.com
In eeh_notify_resume_show() the pci_dn for the device is looked up once in
the declaration block and then once after checking for a NULL eeh_dev.
Remove the second lookup since it's pointless.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Tested-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190715085612.8802-5-oohall@gmail.com
There are several EEH sysfs properties that only exists when the
"ibm,is-open-sriov-pf" property appears in the device tree node of the PCI
device. This used on pseries to indicate to the guest that the hypervisor
allows the guest to configure the SR-IOV capability. Doing this requires
some handshaking between the guest, hypervisor and userspace when a VF is
EEH frozen which is why these properties exist.
This is all dead code on non-pseries platforms so wrap it in an #ifdef
CONFIG_PPC_PSERIES to make the dependency clearer.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Tested-by: Sam Bobroff <sbobroff@linux.ibm.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190715085612.8802-4-oohall@gmail.com
The EEH_ATTR_SHOW() helper is used to display fields from struct eeh_dev
not struct pci_dn.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190715085612.8802-3-oohall@gmail.com
At the point where we start inserting ranges into the EEH address cache the
binding between pci_dev and eeh_dev has already been set up. Instead of
consulting the pci_dn tree we can retrieve the eeh_dev directly using
pci_dev_to_eeh_dev().
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Tested-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190715085612.8802-2-oohall@gmail.com
Unlike real PCI slots, opencapi slots are directly associated to
the (virtual) opencapi PHB, there's no intermediate bridge. So when
looking for a slot ID, we must start the search from the device node
itself and not its parent.
Also, the slot ID is not attached to a specific bdfn, so let's build
it from the PHB ID, like skiboot.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191121134918.7155-6-fbarrat@linux.ibm.com
With hotplug, an opencapi device can now go away. It needs to be
released, mostly to clean up its PE state. We were previously not
defining any device callback. We can reuse the standard PCI release
callback, it does a bit too much for an opencapi device, but it's
harmless, and only needs minor tuning.
Also separate the undo of the PELT-V code in a separate function, it
is not needed for NPU devices and it improves a bit the readability of
the code.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191121134918.7155-5-fbarrat@linux.ibm.com
The PE for an opencapi device was set as part of a late PHB fixup
operation, when creating the PHB. To use the PCI hotplug framework,
this is not going to work, as the PHB stays the same, it's only the
devices underneath which are updated. For regular PCI devices, it is
done as part of the reconfiguration of the bridge, but for opencapi
PHBs, we don't have an intermediate bridge. So let's define the PE
when the device is enabled. PEs are meaningless for opencapi, the NPU
doesn't define them and opal is not doing anything with them.
Reviewed-by: Alastair D'Silva <alastair@d-silva.org>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191121134918.7155-4-fbarrat@linux.ibm.com
Protect the PHB's list of PE. Probably not needed as long as it was
populated during PHB creation, but it feels right and will become
required once we can add/remove opencapi devices on hotplug.
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191121134918.7155-3-fbarrat@linux.ibm.com
The pci_dn structure used to store a pointer to the struct pci_dev, so
taking a reference on the device was required. However, the pci_dev
pointer was later removed from the pci_dn structure, but the reference
was kept for the npu device.
See commit 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary
pcidev from pci_dn").
We don't need to take a reference on the device when assigning the PE
as the struct pnv_ioda_pe is cleaned up at the same time as
the (physical) device is released. Doing so prevents the device from
being released, which is a problem for opencapi devices, since we want
to be able to remove them through PCI hotplug.
Now the ugly part: nvlink npu devices are not meant to be
released. Because of the above, we've always leaked a reference and
simply removing it now is dangerous and would likely require more
work. There's currently no release device callback for nvlink devices
for example. So to be safe, this patch leaks a reference on the npu
device, but only for nvlink and not opencapi.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191121134918.7155-2-fbarrat@linux.ibm.com