IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This reverts commit a474b3f042.
Because of recent interactions with developers from @umn.edu, all
commits from them have been recently re-reviewed to ensure if they were
correct or not.
Upon review, this commit was found to be incorrect for the reasons
below, so it must be reverted. It will be fixed up "correctly" in a
later kernel change.
The original change is NOT correct, as it does not correctly unwind from
the resources that was allocated before the call to
platform_driver_register().
Cc: Aditya Pakki <pakki001@umn.edu>
Acked-By: Vinod Koul <vkoul@kernel.org>
Acked-By: Sinan Kaya <okaya@kernel.org>
Link: https://lore.kernel.org/r/20210503115736.2104747-51-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull dmaengine updates from Vinod Koul:
"New drivers/devices:
- Support for QCOM SM8150 GPI DMA
Updates:
- Big pile of idxd updates including support for performance
monitoring
- Support in dw-edma for interleaved dma
- Support for synchronize() in Xilinx driver"
* tag 'dmaengine-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (42 commits)
dmaengine: idxd: Enable IDXD performance monitor support
dmaengine: idxd: Add IDXD performance monitor support
dmaengine: idxd: remove MSIX masking for interrupt handlers
dmaengine: idxd: device cmd should use dedicated lock
dmaengine: idxd: support reporting of halt interrupt
dmaengine: idxd: enable SVA feature for IOMMU
dmaengine: idxd: convert sprintf() to sysfs_emit() for all usages
dmaengine: idxd: add interrupt handle request and release support
dmaengine: idxd: add support for readonly config mode
dmaengine: idxd: add percpu_ref to descriptor submission path
dmaengine: idxd: remove detection of device type
dmaengine: idxd: iax bus removal
dmaengine: idxd: fix cdev setup and free device lifetime issues
dmaengine: idxd: fix group conf_dev lifetime
dmaengine: idxd: fix engine conf_dev lifetime
dmaengine: idxd: fix wq conf_dev 'struct device' lifetime
dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime
dmaengine: idxd: use ida for device instance enumeration
dmaengine: idxd: removal of pcim managed mmio mapping
dmaengine: idxd: cleanup pci interrupt vector allocation management
...
Pull ARM SoC driver updates from Arnd Bergmann:
"Updates for SoC specific drivers include a few subsystems that have
their own maintainers but send them through the soc tree:
TEE/OP-TEE:
- Add tracepoints around calls to secure world
Memory controller drivers:
- Minor fixes for Renesas, Exynos, Mediatek and Tegra platforms
- Add debug statistics to Tegra20 memory controller
- Update Tegra bindings and convert to dtschema
ARM SCMI Firmware:
- Support for modular SCMI protocols and vendor specific extensions
- New SCMI IIO driver
- Per-cpu DVFS
The other driver changes are all from the platform maintainers
directly and reflect the drivers that don't fit into any other
subsystem as well as treewide changes for a particular platform.
SoCFPGA:
- Various cleanups contributed by Krzysztof Kozlowski
Mediatek:
- add MT8183 support to mutex driver
- MMSYS: use per SoC array to describe the possible routing
- add MMSYS support for MT8183 and MT8167
- add support for PMIC wrapper with integrated arbiter
- add support for MT8192/MT6873
Tegra:
- Bug fixes to PMC and clock drivers
NXP/i.MX:
- Update SCU power domain driver to keep console domain power on.
- Add missing ADC1 power domain to SCU power domain driver.
- Update comments for single global power domain in SCU power domain
driver.
- Add i.MX51/i.MX53 unique id support to i.MX SoC driver.
NXP/FSL SoC driver updates for v5.13
- Add ACPI support for RCPM driver
- Use generic io{read,write} for QE drivers after performance
optimized for PowerPC
- Fix QBMAN probe to cleanup HW states correctly for kexec
- Various cleanup and style fix for QBMAN/QE/GUTS drivers
OMAP:
- Preparation to use devicetree for genpd
- ti-sysc needs iorange check improved when the interconnect target
module has no control registers listed
- ti-sysc needs to probe l4_wkup and l4_cfg interconnects first to
avoid issues with missing resources and unnecessary deferred probe
- ti-sysc debug option can now detect more devices
- ti-sysc now warns if an old incomplete devicetree data is found as
we now rely on it being complete for am3 and 4
- soc init code needs to check for prcm and prm nodes for omap4/5 and
dra7
- omap-prm driver needs to enable autoidle retention support for
omap4
- omap5 clocks are missing gpmc and ocmc clock registers
- pci-dra7xx now needs to use builtin_platform_driver instead of
using builtin_platform_driver_probe for deferred probe to work
Raspberry Pi:
- Fix-up all RPi firmware drivers so as for unbind to happen in an
orderly fashion
- Support for RPi's PoE hat PWM bus
Qualcomm
- Improved detection for SCM calling conventions
- Support for OEM specific wifi firmware path
- Added drivers for SC7280/SM8350: RPMH, LLCC< AOSS QMP"
* tag 'arm-drivers-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (165 commits)
soc: aspeed: fix a ternary sign expansion bug
memory: mtk-smi: Add device-link between smi-larb and smi-common
memory: samsung: exynos5422-dmc: handle clk_set_parent() failure
memory: renesas-rpc-if: fix possible NULL pointer dereference of resource
clk: socfpga: fix iomem pointer cast on 64-bit
soc: aspeed: Adapt to new LPC device tree layout
pinctrl: aspeed-g5: Adapt to new LPC device tree layout
ipmi: kcs: aspeed: Adapt to new LPC DTS layout
ARM: dts: Remove LPC BMC and Host partitions
dt-bindings: aspeed-lpc: Remove LPC partitioning
soc: fsl: enable acpi support in RCPM driver
soc: qcom: mdt_loader: Detect truncated read of segments
soc: qcom: mdt_loader: Validate that p_filesz < p_memsz
soc: qcom: pdr: Fix error return code in pdr_register_listener
firmware: qcom_scm: Fix kernel-doc function names to match
firmware: qcom_scm: Suppress sysfs bind attributes
firmware: qcom_scm: Workaround lack of "is available" call on SC7180
firmware: qcom_scm: Reduce locking section for __get_convention()
firmware: qcom_scm: Make __qcom_scm_is_call_available() return bool
Revert "soc: fsl: qe: introduce qe_io{read,write}* wrappers"
...
Implement the IDXD performance monitor capability (named 'perfmon' in
the DSA (Data Streaming Accelerator) spec [1]), which supports the
collection of information about key events occurring during DSA and
IAX (Intel Analytics Accelerator) device execution, to assist in
performance tuning and debugging.
The idxd perfmon support is implemented as part of the IDXD driver and
interfaces with the Linux perf framework. It has several features in
common with the existing uncore pmu support:
- it does not support sampling
- does not support per-thread counting
However it also has some unique features not present in the core and
uncore support:
- all general-purpose counters are identical, thus no event constraints
- operation is always system-wide
While the core perf subsystem assumes that all counters are by default
per-cpu, the uncore pmus are socket-scoped and use a cpu mask to
restrict counting to one cpu from each socket. IDXD counters use a
similar strategy but expand the scope even further; since IDXD
counters are system-wide and can be read from any cpu, the IDXD perf
driver picks a single cpu to do the work (with cpu hotplug notifiers
to choose a different cpu if the chosen one is taken off-line).
More specifically, the perf userspace tool by default opens a counter
for each cpu for an event. However, if it finds a cpumask file
associated with the pmu under sysfs, as is the case with the uncore
pmus, it will open counters only on the cpus specified by the cpumask.
Since perfmon only needs to open a single counter per event for a
given IDXD device, the perfmon driver will create a sysfs cpumask file
for the device and insert the first cpu of the system into it. When a
user uses perf to open an event, perf will open a single counter on
the cpu specified by the cpu mask. This amounts to the default
system-wide rather than per-cpu counting mentioned previously for
perfmon pmu events. In order to keep the cpu mask up-to-date, the
driver implements cpu hotplug support for multiple devices, as IDXD
usually enumerates and registers more than one idxd device.
The perfmon driver implements basic perfmon hardware capability
discovery and configuration, and is initialized by the IDXD driver's
probe function. During initialization, the driver retrieves the total
number of supported performance counters, the pmu ID, and the device
type from idxd device, and registers itself under the Linux perf
framework.
The perf userspace tool can be used to monitor single or multiple
events depending on the given configuration, as well as event groups,
which are also supported by the perfmon driver. The user configures
events using the perf tool command-line interface by specifying the
event and corresponding event category, along with an optional set of
filters that can be used to restrict counting to specific work queues,
traffic classes, page and transfer sizes, and engines (See [1] for
specifics).
With the configuration specified by the user, the perf tool issues a
system call passing that information to the kernel, which uses it to
initialize the specified event(s). The event(s) are opened and
started, and following termination of the perf command, they're
stopped. At that point, the perfmon driver will read the latest count
for the event(s), calculate the difference between the latest counter
values and previously tracked counter values, and display the final
incremental count as the event count for the cycle. An overflow
handler registered on the IDXD irq path is used to account for counter
overflows, which are signaled by an overflow interrupt.
Below are a couple of examples of perf usage for monitoring DSA events.
The following monitors all events in the 'engine' category. Becuuse
no filters are specified, this captures all engine events for the
workload, which in this case is 19 iterations of the work generated by
the kernel dmatest module.
Details describing the events can be found in Appendix D of [1],
Performance Monitoring Events, but briefly they are:
event 0x1: total input data processed, in 32-byte units
event 0x2: total data written, in 32-byte units
event 0x4: number of work descriptors that read the source
event 0x8: number of work descriptors that write the destination
event 0x10: number of work descriptors dispatched from batch descriptors
event 0x20: number of work descriptors dispatched from work queues
# perf stat -e dsa0/event=0x1,event_category=0x1/,
dsa0/event=0x2,event_category=0x1/,
dsa0/event=0x4,event_category=0x1/,
dsa0/event=0x8,event_category=0x1/,
dsa0/event=0x10,event_category=0x1/,
dsa0/event=0x20,event_category=0x1/
modprobe dmatest channel=dma0chan0 timeout=2000
iterations=19 run=1 wait=1
Performance counter stats for 'system wide':
5,332 dsa0/event=0x1,event_category=0x1/
5,327 dsa0/event=0x2,event_category=0x1/
19 dsa0/event=0x4,event_category=0x1/
19 dsa0/event=0x8,event_category=0x1/
0 dsa0/event=0x10,event_category=0x1/
19 dsa0/event=0x20,event_category=0x1/
21.977436186 seconds time elapsed
The command below illustrates filter usage with a simple example. It
specifies that MEM_MOVE operations should be counted for the DSA
device dsa0 (event 0x8 corresponds to the EV_MEM_MOVE event - Number
of Memory Move Descriptors, which is part of event category 0x3 -
Operations. The detailed category and event IDs are available in
Appendix D, Performance Monitoring Events, of [1]). In addition to
the event and event category, a number of filters are also specified
(the detailed filter values are available in Chapter 6.4 (Filter
Support) of [1]), which will restrict counting to only those events
that meet all of the filter criteria. In this case, the filters
specify that only MEM_MOVE operations that are serviced by work queue
wq0 and specifically engine number engine0 and traffic class tc0
having sizes between 0 and 4k and page size of between 0 and 1G result
in a counter hit; anything else will be filtered out and not appear in
the final count. Note that filters are optional - any filter not
specified is assumed to be all ones and will pass anything.
# perf stat -e dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
filter_eng=0x1,event=0x8,event_category=0x3/
modprobe dmatest channel=dma0chan0 timeout=2000
iterations=19 run=1 wait=1
Performance counter stats for 'system wide':
19 dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
filter_eng=0x1,event=0x8,event_category=0x3/
21.865914091 seconds time elapsed
The output above reflects that the unspecified workload resulted in
the counting of 19 MEM_MOVE operation events that met the filter
criteria.
[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html
[ Based on work originally by Jing Lin. ]
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: https://lore.kernel.org/r/0c5080a7d541904c4ad42b848c76a1ce056ddac7.1619276133.git.zanussi@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
DSA spec states that when Request Interrupt Handle and Release Interrupt
Handle command bits are set in the CMDCAP register, these device commands
must be supported by the driver.
The interrupt handle is programmed in a descriptor. When Request Interrupt
Handle is not supported, the interrupt handle is the index of the desired
entry in the MSI-X table. When the command is supported, driver must use
the command to obtain a handle to be programmed in the submitted
descriptor.
A requested handle may be revoked. After the handle is revoked, any use of
the handle will result in Invalid Interrupt Handle error.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894439422.3202472.17579543737810265471.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The read-only configuration mode is defined by the DSA spec as a mode of
the device WQ configuration. When GENCAP register bit 31 is set to 0,
the device is in RO mode and group configuration and some fields of the
workqueue configuration registers are read-only and reflect the fixed
configuration of the device. Add support for RO mode. The driver will
load the values from the registers directly setup all the internally
cached data structures based on the device configuration.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894438847.3202472.6317563824045432727.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Current submission path has no way to restrict the submitter from
stop submiting on shutdown path or wq disable path. This provides a way to
quiesce the submission path.
Modeling after 'struct reqeust_queue' usage of percpu_ref. One of the
abilities of per_cpu reference counting is the ability to stop new
references from being taken while awaiting outstanding references to be
dropped. On wq shutdown, we want to block any new submissions to the kernel
workqueue and quiesce before disabling. The percpu_ref allows us to block
any new submissions and wait for any current submission calls to finish
submitting to the workqueue.
A percpu_ref is embedded in each idxd_wq context to allow control for
individual wq. The wq->wq_active counter is elevated before calling
movdir64b() or enqcmds() to submit a descriptor to the wq and dropped once
the submission call completes. The function is gated by
percpu_ref_tryget_live(). On shutdown with percpu_ref_kill() called, any
new submission would be blocked from acquiring a ref and failed. Once all
references are dropped for the wq, shutdown can continue.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894438293.3202472.14894701611500822232.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
There is no need to have an additional bus for the IAX device. The removal
of IAX will change user ABI as /sys/bus/iax will no longer exist.
The iax device will be moved to the dsa bus. The device id for dsa and
iax will now be combined rather than unique for each device type in order
to accommodate the iax devices. This is in preparation for fixing the
sub-driver code for idxd. There's no hardware deployment for Sapphire
Rapids platform yet, which means that users have no reason to have
developed scripts against this ABI. There is some exposure to
released versions of accel-config, but those are being fixed up and
an accel-config upgrade is reasonable to get IAX support. As far as
accel-config is concerned IAX support starts when these devices appear
under /sys/bus/dsa, and old accel-config just assumes that an empty /
missing /sys/bus/iax just means a lack of platform support.
Fixes: f25b463883 ("dmaengine: idxd: add IAX configuration support in the IDXD driver")
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852988298.2203940.4529909758034944428.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The char device setup and cleanup has device lifetime issues regarding when
parts are initialized and cleaned up. The initialization of struct device is
done incorrectly. device_initialize() needs to be called on the 'struct
device' and then additional changes can be added. The ->release() function
needs to be setup via device_type before dev_set_name() to allow proper
cleanup. The change re-parents the cdev under the wq->conf_dev to get
natural reference inheritance. No known dependency on the old device path exists.
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: 42d279f913 ("dmaengine: idxd: add char driver to expose submission portal to userland")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161852987721.2203940.1478218825576630810.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The devm managed lifetime is incompatible with 'struct device' objects that
resides in idxd context. This is one of the series that clean up the idxd
driver 'struct device' lifetime. Fix idxd->conf_dev 'struct device'
lifetime. Address issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE.
Add release functions in order to free the allocated memory at the
appropriate time.
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d56091 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852985319.2203940.4650791514462735368.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The devm managed lifetime is incompatible with 'struct device' objects that
resides in idxd context. This is one of the series that clean up the idxd
driver 'struct device' lifetime. Remove pcim_* management of the PCI device
and the ioremap of MMIO BAR and replace with unmanaged versions. This is
for consistency of removing all the pcim/devm based calls.
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d56091 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161852984150.2203940.8043988289748519056.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The devm managed lifetime is incompatible with 'struct device' objects that
resides in idxd context. This is one of the series that clean up the idxd
driver 'struct device' lifetime. Remove embedding of dma_device and dma_chan
in idxd since it's not the only interface that idxd will use. The freeing of
the dma_device will be managed by the ->release() function.
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d56091 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161852983001.2203940.14817017492384561719.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
There are calls to idxd_cmd_exec that pass a null status pointer however
a recent commit has added an assignment to *status that can end up
with a null pointer dereference. The function expects a null status
pointer sometimes as there is a later assignment to *status where
status is first null checked. Fix the issue by null checking status
before making the assignment.
Addresses-Coverity: ("Explicit null dereferenced")
Fixes: 89e3becd8f ("dmaengine: idxd: check device state before issue command")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20210415110654.1941580-1-colin.king@canonical.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In the first list_for_each_entry() macro of dma_async_device_register,
it gets the chan from list and calls __dma_async_device_channel_register
(..,chan). We can see that chan->local is allocated by alloc_percpu() and
it is freed chan->local by free_percpu(chan->local) when
__dma_async_device_channel_register() failed.
But after __dma_async_device_channel_register() failed, the caller will
goto err_out and freed the chan->local in the second time by free_percpu().
The cause of this problem is forget to set chan->local to NULL when
chan->local was freed in __dma_async_device_channel_register(). My
patch sets chan->local to NULL when the callee failed to avoid double free.
Fixes: d2fb0a0438 ("dmaengine: break out channel registration")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20210331014458.3944-1-lyl2019@mail.ustc.edu.cn
Signed-off-by: Vinod Koul <vkoul@kernel.org>