IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Optimezed mem*io operations are defined for LE platforms, use them.
The ARM and !ARCH_EBSA110 dependencies for COMPILE_TEST were added
only for the _memcpy_fromio()/_memcpy_toio() functions. Drop these
dependencies.
Tested unaligned accesses on both sama5d2 and sam9x60 QSPI controllers
using SPI NOR flashes, everything works ok. The following performance
improvement can be seen when running mtd_speedtest:
sama5d2_xplained (mx25l25635e)
- before:
mtd_speedtest: eraseblock write speed is 983 KiB/s
mtd_speedtest: eraseblock read speed is 6150 KiB/s
- after:
mtd_speedtest: eraseblock write speed is 1055 KiB/s
mtd_speedtest: eraseblock read speed is 20144 KiB/s
sam9x60ek (sst26vf064b)
- before:
mtd_speedtest: eraseblock write speed is 4770 KiB/s
mtd_speedtest: eraseblock read speed is 8062 KiB/s
- after:
mtd_speedtest: eraseblock write speed is 4524 KiB/s
mtd_speedtest: eraseblock read speed is 21186 KiB/s
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20200716043139.565734-1-tudor.ambarus@microchip.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently we always defer idling of controllers to the SPI thread, the goal
being to ensure that we're doing teardown that's not suitable for atomic
context in an appropriate context and to try to batch up more expensive
teardown operations when the system is under higher load, allowing more
work to be started before the SPI thread is scheduled. However when the
controller does not require any substantial work to idle there is no need
to do this, we can instead save the context switch and immediately mark
the controller as idle. This is particularly useful for systems where there
is frequent but not constant activity.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20200715163610.9475-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Setting spi_transfer->effective_speed_hz in transfer_one so that
it can get used in cs_change_delay configured with delay as a muliple
of SPI clock cycles.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20200709074120.110069-3-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Setting spi_transfer->effective_speed_hz in transfer_one so that
it can get used in cs_change_delay configured with delay as a muliple
of SPI clock cycles.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20200709074120.110069-2-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Hi,
This series adds support for octal DTR flashes in the spi-nor framework,
and then adds hooks for the Cypress Semper and Mircom Xcella flashes to
allow running them in octal DTR mode. This series assumes that the flash
is handed to the kernel in Legacy SPI mode.
Tested on TI J721e EVM with 1-bit ECC on the Cypress flash.
Changes in v10:
- Rebase on latest linux-next/master. Drop a couple patches that made it
in the previous release.
- Move the code that sets 20 dummy cycles for MT35XU512ABA to its octal
enable function. This way, if the controller doesn't support 8D mode
20 dummy cycles won't be used.
Changes in v9:
- Do not use '& 0xff' to get the opcode LSB in spi-mxic and
spi-zynq-qspi. The cast to u8 will do that anyway.
- Do not use if (opcode) as a check for whether the command phase exists
in spi-zynq-qspi because the opcode 0 can be valid. Use the new
cmd.nbytes instead.
Changes in v8:
- Move controller changes in spi-mxic to the commit which introduces
2-byte opcodes to avoid problems when bisecting.
- Replace usage of sizeof(op->cmd.opcode) with op->cmd.nbytes.
- Extract opcode in spi-zynq-qspi instead of using &op->cmd.opcode.
Changes in v7:
- Reject ops with more than 1 command byte in
spi_mem_default_supports_op().
- Reject ops with more than 1 command byte in atmel and mtk controllers.
- Reject ops with 0 command bytes in spi_mem_check_op().
- Set cmd.nbytes to 1 when using SPI_MEM_OP_CMD().
- Avoid endianness problems in spi-mxic.
Changes in v6:
- Instead of hard-coding 8D-8D-8D Fast Read dummy cycles to 20, find
them out from the Profile 1.0 table.
Changes in v5:
- Do not enable stateful X-X-X modes if the reset line is broken.
- Instead of setting SNOR_READ_HWCAPS_8_8_8_DTR from Profile 1.0 table
parsing, do it in spi_nor_info_init_params() instead based on the
SPI_NOR_OCTAL_DTR_READ flag instead.
- Set SNOR_HWCAPS_PP_8_8_8_DTR in s28hs post_sfdp hook since this
capability is no longer set in Profile 1.0 parsing.
- Instead of just checking for spi_nor_get_protocol_width() in
spi_nor_octal_dtr_enable(), make sure the protocol is
SNOR_PROTO_8_8_8_DTR since get_protocol_width() only cares about data
width.
- Drop flag SPI_NOR_SOFT_RESET. Instead, discover soft reset capability
via BFPT.
- Do not make an invalid Quad Enable BFPT field a fatal error. Silently
ignore it by assuming no quad enable bit is present.
- Set dummy cycles for Cypress Semper flash to 24 instead of 20. This
allows for 200MHz operation in 8D mode compared to the 166MHz with 20.
- Rename spi_nor_cypress_octal_enable() to
spi_nor_cypress_octal_dtr_enable().
- Update spi-mtk-nor.c to reject DTR ops since it doesn't call
spi_mem_default_supports_op().
Changes in v4:
- Refactor the series to use the new spi-nor framework with the
manufacturer-specific bits separated from the core.
- Add support for Micron MT35XU512ABA.
- Use cmd.nbytes as the criteria of whether the data phase exists or not
instead of cmd.buf.in || cmd.buf.out in spi_nor_spimem_setup_op().
- Update Read FSR to use the same dummy cycles and address width as Read
SR.
- Fix BFPT parsing stopping too early for JESD216 rev B flashes.
- Use 2 byte reads for Read SR and FSR commands in DTR mode.
Changes in v3:
- Drop the DT properties "spi-rx-dtr" and "spi-tx-dtr". Instead, if
later a need is felt to disable DTR in case someone has a board with
Octal DTR capable flash but does not support DTR transactions for some
reason, a property like "spi-no-dtr" can be added.
- Remove mode bits SPI_RX_DTR and SPI_TX_DTR.
- Remove the Cadence Quadspi controller patch to un-block this series. I
will submit it as a separate patch.
- Rebase on latest 'master' and fix merge conflicts.
- Update read and write dirmap templates to use DTR.
- Rename 'is_dtr' to 'dtr'.
- Make 'dtr' a bitfield.
- Reject DTR ops in spi_mem_default_supports_op().
- Update atmel-quadspi to reject DTR ops. All other controller drivers
call spi_mem_default_supports_op() so they will automatically reject
DTR ops.
- Add support for both enabling and disabling DTR modes.
- Perform a Software Reset on flashes that support it when shutting
down.
- Disable Octal DTR mode on suspend, and re-enable it on resume.
- Drop enum 'spi_mem_cmd_ext' and make command opcode u16 instead.
Update spi-nor to use the 2-byte command instead of the command
extension. Since we still need a "extension type", mode that enum to
spi-nor and name it 'spi_nor_cmd_ext'.
- Default variable address width to 3 to fix SMPT parsing.
- Drop non-volatile change to uniform sector mode and rely on parsing
SMPT.
Changes in v2:
- Add DT properties "spi-rx-dtr" and "spi-tx-dtr" to allow expressing
DTR capabilities.
- Set the mode bits SPI_RX_DTR and SPI_TX_DTR when we discover the DT
properties "spi-rx-dtr" and spi-tx-dtr".
- spi_nor_cypress_octal_enable() was updating nor->params.read[] with
the intention of setting the correct number of dummy cycles. But this
function is called _after_ selecting the read so setting
nor->params.read[] will have no effect. So, update nor->read_dummy
directly.
- Fix spi_nor_spimem_check_readop() and spi_nor_spimem_check_pp()
passing nor->read_proto and nor->write_proto to
spi_nor_spimem_setup_op() instead of read->proto and pp->proto
respectively.
- Move the call to cqspi_setup_opcode_ext() inside cqspi_enable_dtr().
This avoids repeating the 'if (f_pdata->is_dtr)
cqspi_setup_opcode_ext()...` snippet multiple times.
- Call the default 'supports_op()' from cqspi_supports_mem_op(). This
makes sure the buswidth requirements are also enforced along with the
DTR requirements.
- Drop the 'is_dtr' argument from spi_check_dtr_req(). We only call it
when a phase is DTR so it is redundant.
Pratyush Yadav (17):
spi: spi-mem: allow specifying whether an op is DTR or not
spi: spi-mem: allow specifying a command's extension
spi: atmel-quadspi: reject DTR ops
spi: spi-mtk-nor: reject DTR ops
mtd: spi-nor: add support for DTR protocol
mtd: spi-nor: sfdp: get command opcode extension type from BFPT
mtd: spi-nor: sfdp: parse xSPI Profile 1.0 table
mtd: spi-nor: core: use dummy cycle and address width info from SFDP
mtd: spi-nor: core: do 2 byte reads for SR and FSR in DTR mode
mtd: spi-nor: core: enable octal DTR mode when possible
mtd: spi-nor: sfdp: do not make invalid quad enable fatal
mtd: spi-nor: sfdp: detect Soft Reset sequence support from BFPT
mtd: spi-nor: core: perform a Soft Reset on shutdown
mtd: spi-nor: core: disable Octal DTR mode on suspend.
mtd: spi-nor: core: expose spi_nor_default_setup() in core.h
mtd: spi-nor: spansion: add support for Cypress Semper flash
mtd: spi-nor: micron-st: allow using MT35XU512ABA in Octal DTR mode
drivers/mtd/spi-nor/core.c | 446 +++++++++++++++++++++++++++-----
drivers/mtd/spi-nor/core.h | 22 ++
drivers/mtd/spi-nor/micron-st.c | 103 +++++++-
drivers/mtd/spi-nor/sfdp.c | 131 +++++++++-
drivers/mtd/spi-nor/sfdp.h | 8 +
drivers/mtd/spi-nor/spansion.c | 166 ++++++++++++
drivers/spi/atmel-quadspi.c | 6 +
drivers/spi/spi-mem.c | 16 +-
drivers/spi/spi-mtk-nor.c | 10 +-
drivers/spi/spi-mxic.c | 3 +-
drivers/spi/spi-zynq-qspi.c | 11 +-
include/linux/mtd/spi-nor.h | 53 +++-
include/linux/spi/spi-mem.h | 14 +-
13 files changed, 889 insertions(+), 100 deletions(-)
--
2.27.0
base-commit: b3a9e3b9622ae10064826dccb4f7a52bd88c7407
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.orghttp://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Double Transfer Rate (DTR) ops are added in spi-mem. But this controller
doesn't support DTR transactions. Since we don't use the default
supports_op(), which rejects all DTR ops, do that explicitly in our
supports_op().
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20200623183030.26591-5-p.yadav@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Double Transfer Rate (DTR) ops are added in spi-mem. But this controller
doesn't support DTR transactions. Since we don't use the default
supports_op(), which rejects all DTR ops, do that explicitly in our
supports_op().
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20200623183030.26591-4-p.yadav@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
In xSPI mode, flashes expect 2-byte opcodes. The second byte is called
the "command extension". There can be 3 types of extensions in xSPI:
repeat, invert, and hex. When the extension type is "repeat", the same
opcode is sent twice. When it is "invert", the second byte is the
inverse of the opcode. When it is "hex" an additional opcode byte based
is sent with the command whose value can be anything.
So, make opcode a 16-bit value and add a 'nbytes', similar to how
multiple address widths are handled.
Some places use sizeof(op->cmd.opcode). Replace them with op->cmd.nbytes
The spi-mxic and spi-zynq-qspi drivers directly use op->cmd.opcode as a
buffer. Now that opcode is a 2-byte field, this can result in different
behaviour depending on if the machine is little endian or big endian.
Extract the opcode in a local 1-byte variable and use that as the buffer
instead. Both these drivers would reject multi-byte opcodes in their
supports_op() hook anyway, so we only need to worry about single-byte
opcodes for now.
The above two changes are put in this commit to keep the series
bisectable.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20200623183030.26591-3-p.yadav@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Each phase is given a separate 'dtr' field so mixed protocols like
4S-4D-4D can be supported.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20200623183030.26591-2-p.yadav@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There's a bunch of overhead in spi-geni-qcom's prepare_message. Get
rid of it. Before this change spi_geni_prepare_message() took around
14.5 us. After this change, spi_geni_prepare_message() takes about
1.75 us (as measured by ftrace).
What's here:
* We're always in FIFO mode, so no need to call it for every transfer.
This avoids a whole ton of readl/writel calls.
* We don't need to write a whole pile of config registers if the mode
isn't changing. Cache the last mode and only do the work if needed.
* For several registers we were trying to do read/modify/write, but
there was no reason. The registers only have one thing in them, so
just write them.
Acked-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Akash Asthana <akashast@codeaurora.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200701174506.3.I2b3d7aeb1ea622335482cce60c58d2f8381e61dd@changeid
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
In the patch ("spi: spi-geni-qcom: Avoid clock setting if not needed")
we avoid a whole pile of clock code. As part of that, we should have
restored the clock at runtime resume. Do that.
It turns out that, at least with today's configurations, this doesn't
actually matter. That's because none of the current device trees have
an OPP table for geni SPI yet. That makes dev_pm_opp_set_rate(dev, 0)
a no-op. This is why it wasn't noticed in the testing of the original
patch. It's still a good idea to fix, though.
Reviewed-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Akash Asthana <akashast@codeaurora.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20200709074037.v2.1.I0b701fc23eca911a5bde4ae4fa7f97543d7f960e@changeid
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Every SPI transfer could have a different clock rate. The
spi-geni-qcom controller code to deal with this was never very well
optimized and has always had a lot of code plus some calls into the
clk framework which, at the very least, would grab a mutex. However,
until recently, the overhead wasn't _too_ much. That changed with
commit 0e3b8a81f5df ("spi: spi-geni-qcom: Add interconnect support")
we're now calling geni_icc_set_bw(), which leads to a bunch of math
plus:
geni_icc_set_bw()
icc_set_bw()
apply_constraints()
qcom_icc_set()
qcom_icc_bcm_voter_commit()
rpmh_invalidate()
rpmh_write_batch()
...and those rpmh commands can be a bit beefy if you call them too
often.
We already know what speed we were running at before, so if we see
that nothing has changed let's avoid the whole pile of code.
On my hardware, this made spi_geni_prepare_message() drop down from
~145 us down to ~14 us.
NOTE: Potentially it might also make sense to add some code into the
interconnect framework to avoid executing so much code when bandwidth
isn't changing, but even if we did that we still want to short circuit
here to save the extra math / clock calls.
Acked-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Akash Asthana<akashast@codeaurora.org>
Fixes: 0e3b8a81f5df ("spi: spi-geni-qcom: Add interconnect support")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200701174506.1.Icfdcee14649fc0a6c38e87477b28523d4e60bab3@changeid
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
In commit cff80645d6d3 ("spi: spi-qcom-qspi: Add interconnect support")
the spi_geni_runtime_suspend() and spi_geni_runtime_resume()
became a bit slower. Measuring on my hardware I see numbers in the
hundreds of microseconds now.
Let's use autosuspend to help avoid some of the overhead. Now if
we're doing a bunch of transfers we won't need to be constantly
chruning.
The number 250 ms for the autosuspend delay was picked a bit
arbitrarily, so if someone has measurements showing a better value we
could easily change this.
Fixes: cff80645d6d3 ("spi: spi-qcom-qspi: Add interconnect support")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Rajendra Nayak <rnayak@codeaurora.org>
Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Mukesh Kumar Savaliya <msavaliy@codeaurora.org>
Reviewed-by: Akash Asthana <akashast@codeaurora.org>
Link: https://lore.kernel.org/r/20200709075113.v2.2.I3c56d655737c89bd9b766567a04b0854db1a4152@changeid
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
As per recent changes to the spi-qcom-qspi, now when we set the clock
we'll call into the interconnect framework and also call the OPP API.
Those are expensive operations. Let's avoid calling them if possible.
This has a big impact on getting transfer rates back up to where they
were (or maybe slightly better) before those patches landed.
Fixes: cff80645d6d3 ("spi: spi-qcom-qspi: Add interconnect support")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Rajendra Nayak <rnayak@codeaurora.org>
Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Mukesh Kumar Savaliya <msavaliy@codeaurora.org>
Reviewed-by: Akash Asthana <akashast@codeaurora.org>
Link: https://lore.kernel.org/r/20200709075113.v2.1.Ia7cb4f41ce93d37d0a764b47c8a453ce9e9c70ef@changeid
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
QSPI needs to vote on a performance state of a power domain depending on
the clock rate. Add support for it by specifying the perf state/clock rate
as an OPP table in device tree.
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Alok Chauhan <alokc@codeaurora.org>
Cc: Akash Asthana <akashast@codeaurora.org>
Cc: linux-spi@vger.kernel.org
Link: https://lore.kernel.org/r/1593769293-6354-2-git-send-email-rnayak@codeaurora.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
This converts the two Freescale i.MX SPI drivers
Freescale i.MX (CONFIG_SPI_IMX) and Freescale i.MX LPSPI
(CONFIG_SPI_FSL_LPSPI) to use GPIO descriptors handled in
the SPI core for GPIO chip selects whether defined in
the device tree or a board file.
The reason why both are converted at the same time is
that they were both using the same platform data and
platform device population helpers when using
board files intertwining the code so this gives a cleaner
cut.
The platform device creation was passing a platform data
container from each boardfile down to the driver using
struct spi_imx_master from <linux/platform_data/spi-imx.h>,
but this was only conveying the number of chipselects and
an int * array of the chipselect GPIO numbers.
The imx27 and imx31 platforms had code passing the
now-unused platform data when creating the platform devices,
this has been repurposed to pass around GPIO descriptor
tables. The platform data struct that was just passing an
array of integers and number of chip selects for the GPIO
lines has been removed.
The number of chipselects used to be passed from the board
file, because this number also limits the number of native
chipselects that the platform can use. To deal with this we
just augment the i.MX (CONFIG_SPI_IMX) driver to support 3
chipselects if the platform does not define "num-cs" as a
device property (such as from the device tree). This covers
all the legacy boards as these use <= 3 native chip selects
(or GPIO lines, and in that case the number of chip selects
is determined by the core from the number of available
GPIO lines). Any new boards should use device tree, so
this is a reasonable simplification to cover all old
boards.
The LPSPI driver never assigned the number of chipselects
and thus always fall back to the core default of 1 chip
select if no GPIOs are defined in the device tree.
The Freescale i.MX driver was already partly utilizing
the SPI core to obtain the GPIO numbers from the device tree,
so this completes the transtion to let the core handle all
of it.
All board files and the core i.MX boardfile registration
code is augmented to account for these changes.
This has been compile-tested with the imx_v4_v5_defconfig
and the imx_v6_v7_defconfig.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Robin Gong <yibin.gong@nxp.com>
Cc: Trent Piepho <tpiepho@impinj.com>
Cc: Clark Wang <xiaoning.wang@nxp.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Link: https://lore.kernel.org/r/20200625200252.207614-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Link: https://lore.kernel.org/r/20200708194400.22213-1-grandmaster@al2klimov.de
Signed-off-by: Mark Brown <broonie@kernel.org>
The error exit label out_free is no longer being used, it is redundant
and can be removed.
Cleans up warning:
drivers/spi/spi-atmel.c:1680:1: warning: label ‘out_free’ defined but not used [-Wunused-label]
Fixes: 2d9a744685bc ("spi: atmel: No need to call spi_master_put() if spi_alloc_master() failed")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20200709101203.1374117-1-colin.king@canonical.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Use kthread_create_worker() helper to simplify the code. It uses
the kthread worker API the right way. It will eventually allow
to remove the FIXME in kthread_worker_fn() and add more consistency
checks in the future.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200709065007.26896-1-m.szyprowski@samsung.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This series tries to reduce a whole bunch of overhead in each SPI
transfer. Much of this overhead is new with the recent interconnect
changes, but even without those changes we still had some overhead
that we could avoid. Let's avoid all of it.
These changes are atop the Qualcomm tree to avoid merge conflicts. If
they look good, the most expedient way to land them is probably to get
Ack's from Mark and land then via the Qualcomm tree.
Most testing was done on the Chrome OS 5.4 tree, but sanity check was
done on mainline.
Douglas Anderson (3):
spi: spi-geni-qcom: Avoid clock setting if not needed
spi: spi-geni-qcom: Set an autosuspend delay of 250 ms
spi: spi-geni-qcom: Get rid of most overhead in prepare_message()
drivers/spi/spi-geni-qcom.c | 67 ++++++++++++++++++-------------------
1 file changed, 32 insertions(+), 35 deletions(-)
--
2.27.0.383.g050319c2ae-goog
Hello,
this series first fixes the calculation of the clock rate. The driver will
round up to the nearest clock rate instead of rounding down. Resulting in SPI
devices accessed with a too high SPI clock.
The remaining patches improve the performance of the driver. The changes range
from micro-optimizations like reducing MMIO writes to the controller to
reducing the number of needed interrupts in some use cases.
regards,
Marc
changes since v1:
- added Maxime Ripard's to the existing patches
- 06/10: (was 05/10 in v1)
"spi: spi-sun6i: sun6i_spi_drain_fifo(): introduce sun6i_spi_get_rx_fifo_count() and make use of it"
use FIELD_GET instead of open coding it
(tnx: Maxime Ripard)
- 05/10: "spi: spi-sun6i: sun6i_spi_get_tx_fifo_count: Convert manual shift+mask to FIELD_GET()"
new patch
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.orghttp://lists.infradead.org/mailman/listinfo/linux-arm-kernel
In commit 0e3b8a81f5df ("spi: spi-geni-qcom: Add interconnect
support") the spi_geni_runtime_suspend() and spi_geni_runtime_resume()
became a bit slower. Measuring on my hardware I see numbers in the
hundreds of microseconds now.
Let's use autosuspend to help avoid some of the overhead. Now if
we're doing a bunch of transfers we won't need to be constantly
chruning.
The number 250 ms for the autosuspend delay was picked a bit
arbitrarily, so if someone has measurements showing a better value we
could easily change this.
Fixes: 0e3b8a81f5df ("spi: spi-geni-qcom: Add interconnect support")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Akash Asthana<akashast@codeaurora.org>
Link: https://lore.kernel.org/r/20200701174506.2.I9b8f6bb1e7e6d8847e2ed2cf854ec55678db427f@changeid
Signed-off-by: Mark Brown <broonie@kernel.org>
In sun6i_spi_transfer_one() the RX FIFO Ready (SUN6I_INT_CTL_RF_RDY) is
unconditionally enabled.
A RX interrupt is only needed, if more data than fits into the FIFO is going to
be received during this transfer. As the RX-FIFO is drained during transfer
complete interrupt, enable the RX FIFO Ready interrupt only if the data doesn't
fit into the FIFO.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200706143443.9855-11-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
In sun6i_spi_transfer_one() the Interrupt Control Register is written three
times. This patch collates the three writes into one.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200706143443.9855-10-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
The function sun6i_spi_fill_fifo() is called with a length argument of
"sspi->fifo_depth" and "SUN6I_FIFO_DEPTH".
The driver reads the number of free bytes in the FIFO from the hardware and
uses the length argument to limit this value. This is not needed as the number
of free bytes in the FIFO is always less or equal the depth of the FIFO.
This patch removes the length argument and check.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200706143443.9855-9-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
The function sun6i_spi_drain_fifo() is called with a length argument of
"sspi->fifo_depth" and "SUN6I_FIFO_DEPTH".
The driver reads the number of available bytes to read from the FIFO from the
hardware and uses the length argument to limit this value. This is not needed
as the FIFO can contain only the fifo depth number of bytes.
This patch removes the length argument and check.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200706143443.9855-8-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch introduces the function sun6i_spi_get_rx_fifo_count(), similar to
the existing sun6i_spi_get_tx_fifo_count(), to make the sun6i_spi_drain_fifo()
function a bit easier to read.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200706143443.9855-7-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch converts the manual shift+mask in sun6i_spi_get_tx_fifo_count() to
make use of FIELD_GET()
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20200706143443.9855-6-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
In sun6i_spi_transfer_one() the driver ensures that the length of the transfer
is smaller or equal to SUN6I_MAX_XFER_SIZE. This means the masking of the
length to SUN6I_MAX_XFER_SIZE can be skipped when writing the transfer length
into the registers.
This patch removes the useless masking of the transfer length to
SUN6I_MAX_XFER_SIZE.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200706143443.9855-5-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch removes an useless goto at the end of
sun6i_spi_transfer_one().
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200706143443.9855-4-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch implementes the reporting of the effectivly used speed_hz for the
transfer by setting tfr->effective_speed_hz.
See the following patch, which adds this feature to the SPI core for more
information:
5d7e2b5ed585 spi: core: allow reporting the effectivly used speed_hz for a transfer
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200706143443.9855-3-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
A SPI transfer defines the _maximum_ speed of the SPI transfer. However the
driver doesn't take into account that the clock divider is always rounded down
(due to integer arithmetics). This results in a too high clock rate for the SPI
transfer.
E.g.: with a mclk_rate of 24 MHz and a SPI transfer speed of 10 MHz, the
original code calculates a reg of "0", which results in a effective divider of
"2" and a 12 MHz clock for the SPI transfer.
This patch fixes the issue by using DIV_ROUND_UP() instead of a plain
integer division.
While there simplify the divider calculation for the CDR1 case, use
order_base_2() instead of two ilog2() calculations.
Fixes: 3558fe900e8a ("spi: sunxi: Add Allwinner A31 SPI controller driver")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200706143443.9855-2-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Hi all,
Although Florian was concerned about a trivial inline check to deal with
shared IRQs adding overhead, the reality is that it would be so small as
to not be worth even thinking about unless the driver was already tuned
to squeeze out every last cycle. And a brief look over the code shows
that that clearly isn't the case.
This is an example of some of the easy low-hanging fruit that jumps out
just from code inspection. Based on disassembly and ARM1176 cycle
timings, patch #2 should save the equivalent of 2-3 shared interrupt
checks off the critical path in all cases, and patch #3 possibly up to
about 100x more. I don't have any means to test these patches, let alone
measure performance, so they're only backed by the principle that less
code - and in particular fewer memory accesses - is almost always
better.
There is almost certainly a *lot* more to be had from careful use of
relaxed I/O accessors, not doing a read-modify-write of CS at every
reset, tweaking the loops further to avoid unnecessary writebacks to
variables, and so on. However since I'm not invested in this personally
I'm not going to pursue it any further; I'm throwing these patches out
as more of a demonstration to back up my original drive-by review
comments, so if anyone want to pick them up and run with them then
please do so.
Robin.
Robin Murphy (3):
spi: bcm3835: Tidy up bcm2835_spi_reset_hw()
spi: bcm2835: Micro-optimise IRQ handler
spi: bcm2835: Micro-optimise FIFO loops
drivers/spi/spi-bcm2835.c | 45 +++++++++++++++++++--------------------
1 file changed, 22 insertions(+), 23 deletions(-)
--
2.23.0.dirty
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.orghttp://lists.infradead.org/mailman/listinfo/linux-arm-kernel
This switches the Lantiq SSC driver over to use GPIO descriptor
handling in the core.
The driver was already utilizing the core to look up and request
GPIOs from the device tree so this is a pretty small change
just switching it over to use descriptors directly instead.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Link: https://lore.kernel.org/r/20200625202149.209276-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
This converts the IMG SPFI SPI driver to use GPIO descriptors
as obtained from the core instead of GPIO numbers.
The driver was already relying on the core code to look up
the GPIO numbers from the device tree and allocate memory for
storing state etc. By moving to use descriptors handled by
the core we can delete the setup/cleanup functions and
the device state handler that were only dealing with this.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Ionela Voinescu <ionela.voinescu@imgtec.com>
Cc: Sifan Naeem <sifan.naeem@imgtec.com>
Link: https://lore.kernel.org/r/20200625201422.208640-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
The Nuvoton PSPI driver already uses the core to handle GPIO
chip selects but is using the old GPIO number method and
retrieveing the GPIOs in the probe() call.
Switch it over to using GPIO descriptors saving a bunch of
code and modernizing it.
Compile tested med ARMv7 multiplatform config augmented
with the Nuvoton arch and this driver.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Tomer Maimon <tmaimon77@gmail.com>
Link: https://lore.kernel.org/r/20200625225759.273911-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
On some SPI controllers (like spi-geni-qcom) setting the chip select
is a heavy operation. For instance on spi-geni-qcom, with the current
code, is was measured as taking upwards of 20 us. Even on SPI
controllers that aren't as heavy, setting the chip select is at least
something like a MMIO operation over some peripheral bus which isn't
as fast as a RAM access.
While it would be good to find ways to mitigate problems like this in
the drivers for those SPI controllers, it can also be noted that the
SPI framework could also help out. Specifically, in some situations,
we can see the SPI framework calling the driver's set_cs() with the
same parameter several times in a row. This is specifically observed
when looking at the way the Chrome OS EC SPI driver (cros_ec_spi)
works but other drivers likely trip it to some extent.
Let's solve this by caching the chip select state in the core and only
calling into the controller if there was a change. We check not only
the "enable" state but also the chip select mode (active high or
active low) since controllers may care about both the mode and the
enable flag in their callback.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200629164103.1.Ied8e8ad8bbb2df7f947e3bc5ea1c315e041785a2@changeid
Signed-off-by: Mark Brown <broonie@kernel.org>
The field mspi->reg_base is annotated as an __iomem pointer. Good.
However, this field is often assigned to a temporary variable:
before being used. For example:
struct fsl_spi_reg *reg_base = mspi->reg_base;
But this variable is missing the __iomem annotation.
So, add the missing __iomem and make sparse & the bot happier.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Link: https://lore.kernel.org/r/20200622162611.83694-1-luc.vanoostenryck@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The blind and counted loops are always called with nonzero count, so
convert them to do-while loops that lead to slightly more efficient
code generation. With GCC 8.3 this shaves off 1-2 instructions per
iteration in each case.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/9242863077acf9a64e4b3720e479855b88d19e82.1592261248.git.robin.murphy@arm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The IRQ handler only needs the struct spi_controller for the sake of
the completion at the end of a transfer. Passing the struct bcm2835_spi
directly as the IRQ data allows that level of indirection to be pushed
into the completion path for the reverse lookup, and avoided entirely
in all other cases.
This saves one explicit load in the critical path, plus (for a GCC 8.3
build) two registers worth of stack frame overhead.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/6b401cb521539caffab21f05b4c8cba6c9d27c6e.1592261248.git.robin.murphy@arm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The OMAP2 MCSPI has some kind of half-baked GPIO CS support:
it includes code like this:
if (gpio_is_valid(spi->cs_gpio)) {
ret = gpio_request(spi->cs_gpio, dev_name(&spi->dev));
(...)
But it doesn't parse the "cs-gpios" attribute in the device
tree to count the number of GPIOs or pick out the GPIO numbers
and put these in the SPI master's .cs_gpios property.
We complete the implementation of supporting CS GPIOs
from the device tree and switch it over to use the SPI core
for this.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20200625231257.280615-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Setting the chip select on the Qualcomm geni SPI controller isn't
exactly cheap. Let's cache the current setting and avoid setting the
chip select if it's already right.
Using "flashrom" to read or write the EC firmware on a Chromebook
shows roughly a 25% reduction in interrupts and a 15% speedup.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200626151946.1.I06134fd669bf91fd387dc6ecfe21d44c202bd412@changeid
Signed-off-by: Mark Brown <broonie@kernel.org>
A batch of fixes for the Freescale DSPI driver fixing some serious
issues with removal of active devices and one resume case, plus a few
new PCI IDs for Intel platforms.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl76FzkTHGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0C5iB/0ThW3V0f9NKiIPkJjgp4DYfhYL8cKO
z4fOeOdJHOm73Atd5efhfU58REpEjmAD76O++/z4qD1l5wRsfILCObEsMwPYRTzr
gYlFVaNuZ3wAz8rDYU7VfmdzuZKRfhAUxJKovcLJleBmWAsGk3fHd5oBPVNJQPl3
jx1Wnj37T2dykI+jhRjQuRbN4e+1FxOKNHcVmwkoUx9OKX1cbRczE5mYhPf8zD2K
ZMMlfuyPg1zqfsVRzBSjaqsbMLZY4tgYemwQmLthhMvMZJ1cw4bt1RMgVBKlzPZN
X+vr5vh9HA8y6spJoAfffBCtIr7BVsNpXXi11QMozZ6YgnBUSJDZeN8Q
=OauP
-----END PGP SIGNATURE-----
Merge tag 'spi-fix-v5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A batch of fixes for the Freescale DSPI driver fixing some serious
issues with removal of active devices and one resume case, plus a few
new PCI IDs for Intel platforms"
* tag 'spi-fix-v5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: pxa2xx: Add support for Intel Tiger Lake PCH-H
spi: spi-fsl-dspi: Initialize completion before possible interrupt
spi: spi-fsl-dspi: Fix external abort on interrupt in resume or exit paths
spi: spi-fsl-dspi: Fix lockup if device is shutdown during SPI transfer
spi: spi-fsl-dspi: Fix lockup if device is removed during SPI transfer