IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
ofdm_index[] is used to indicate how many power compensation is needed to
current thermal value. For internal PA module or 2.4G band, the min_index
is different from other cases.
This issue originally is reported by Dan. He found the size of ofdm_index[]
is 2, but access index 'i' may be equal to 2 if 'rf' is 2 in case of
'is2t'.
In fact, the chunk of code is added to wrong place, so move it back to
proper place, and then power compensation and buffer overflow are fixed.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201207031903.7599-1-pkshih@realtek.com
In preparation to enable -Wimplicit-fallthrough for Clang, fix a
warning by replacing a /* Fall through */ comment with the new
pseudo-keyword macro fallthrough; instead of letting the code fall
through to the next case.
Notice that Clang doesn't recognize /* Fall through */ comments as
implicit fall-through markings.
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/52931e343af25f6dab4bd1d604889d2594a861dd.1605896060.git.gustavoars@kernel.org
In preparation to enable -Wimplicit-fallthrough for Clang, fix a
warning by replacing a /* fall through */ comment with the new
pseudo-keyword macro fallthrough; instead of letting the code fall
through to the next case.
Notice that Clang doesn't recognize /* fall through */ comments as
implicit fall-through markings.
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/967a171da3db43e4cdf38104876b4ec1cde46359.1605896060.git.gustavoars@kernel.org
When 8822CE is associating with AP, driver will poll status bit of
IQ calibration to confirm the IQ calibration is done, and then move on
the association process. Current polling time for IQ calibration is 6
seconds.
But occasionally driver fails in polling the status bit because the status
bit is not set after IQ calibration is done. When it happends, association
process will be serieously delayed up to 6 seconds. To avoid it, we reduce
polling time to 300ms, in which the IQ calibration can be done.
Signed-off-by: Chin-Yen Lee <timlee@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201208014503.12118-1-pkshih@realtek.com
'const struct dev_pm_ops rtw_pm_ops' is declared by pci.c, and it should be
declare as 'extern' in pci.h. Without 'extern' causes every file including
pci.h has an individual instance of rtw_pm_ops but not reference to the one
declared in pci.c
If kernel config, like test robot, doesn't build driver as module, it leads
multiple definition.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 2e86ef413ab3 ("rtw88: pci: Add prototypes for .probe, .remove and .shutdown")
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201208013746.11065-1-pkshih@realtek.com
mwifiex_cmd_802_11_ad_hoc_start() calls memcpy() without checking
the destination size may trigger a buffer overflower,
which a local user could use to cause denial of service
or the execution of arbitrary code.
Fix it by putting the length check before calling memcpy().
Signed-off-by: Zhang Xiaohui <ruc_zhangxiaohui@163.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201206084801.26479-1-ruc_zhangxiaohui@163.com
The pointer 'entry' is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201204180459.1148257-1-colin.king@canonical.com
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: cc0b88cf5ecf ("[PATCH] Add adm8211 802.11b wireless driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1607071638-33619-1-git-send-email-zhangchangzhong@huawei.com
When driver was developed, FCC regulation didn't enable channel 144
and there was no demand for channel 144 at that time. Although HW
actually supports channel 144, driver didn't announce channel 144.
Therefore, channel 144 (20 MHz), channel 142 (40 MHz) and channel
138 (80 MHz) couldn't be used.
Today, channel 144 has been enabled by regulations and
is gradually being supported. With test requirements,
we declare hw supports channel 144.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201204013823.3729-1-pkshih@realtek.com
Currently the variable 'interval' is not initialized and is only set
to 1 when oex_stat->bt_418_hid_existi is true. Fix this by inintializing
variable interval to 0 (which I'm assuming is the intended default).
Addresses-Coverity: ("Uninitalized scalar variable")
Fixes: 5b2e9a35e456 ("rtw88: coex: add feature to enhance HID coexistence performance")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201203175142.1071738-1-colin.king@canonical.com
The assignment to pointer vif is redundant as the assigned value
is never read, hence it can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Ajay Singh <ajay.kathat@microchip.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201203174316.1071446-1-colin.king@canonical.com
As of 6-DEC-2019, NXP has acquired Marvell’s Wireless business
unit. This change is to update the license text accordingly.
commit 932183aa35c6 ("mwifiex: change license text from MARVELL
to NXP") does this, but it left out two files.
Signed-off-by: James Cao <zheng.cao@nxp.com>
Signed-off-by: Cathy Luo <xiaohua.luo@nxp.com>
Signed-off-by: Ganapathi Bhat <ganapathi.bhat@nxp.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1606814307-32715-1-git-send-email-ganapathi.bhat@nxp.com
Also strip out other duplicates from driver specific headers.
Ensure 'main.h' is explicitly included in 'pci.h' since the latter
uses some defines from the former. It avoids issues like:
from drivers/net/wireless/realtek/rtw88/rtw8822be.c:5:
drivers/net/wireless/realtek/rtw88/pci.h:209:28: error: ‘RTK_MAX_TX_QUEUE_NUM’ undeclared here (not in a function); did you mean ‘RTK_MAX_RX_DESC_NUM’?
209 | DECLARE_BITMAP(tx_queued, RTK_MAX_TX_QUEUE_NUM);
| ^~~~~~~~~~~~~~~~~~~~
Fixes the following W=1 kernel build warning(s):
drivers/net/wireless/realtek/rtw88/pci.c:1488:5: warning: no previous prototype for ‘rtw_pci_probe’ [-Wmissing-prototypes]
1488 | int rtw_pci_probe(struct pci_dev *pdev,
| ^~~~~~~~~~~~~
drivers/net/wireless/realtek/rtw88/pci.c:1568:6: warning: no previous prototype for ‘rtw_pci_remove’ [-Wmissing-prototypes]
1568 | void rtw_pci_remove(struct pci_dev *pdev)
| ^~~~~~~~~~~~~~
drivers/net/wireless/realtek/rtw88/pci.c:1590:6: warning: no previous prototype for ‘rtw_pci_shutdown’ [-Wmissing-prototypes]
1590 | void rtw_pci_shutdown(struct pci_dev *pdev)
| ^~~~~~~~~~~~~~~~
Cc: Yan-Hsuan Chuang <yhchuang@realtek.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201126133152.3211309-18-lee.jones@linaro.org
Ido Schimmel says:
====================
mlxsw: Misc updates
This patchset contains miscellaneous patches we gathered in our queue.
Some of them are dependencies of larger patchsets that I will submit
later this cycle.
Patches #1-#3 perform small non-functional changes in mlxsw.
Patch #4 adds more extended ack messages in mlxsw.
Patch #5 adds devlink parameters documentation for mlxsw. To be extended
with more parameters this cycle.
Patches #6-#7 perform small changes in forwarding selftests
infrastructure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Turned out that mlxsw_sp_ipip_fib_entry_op_gre4() does not need to
figure out the IP address and virtual router id. Those are exactly
the same as in the fib_entry it is called for. So just use that and
reduce mlxsw_sp_ipip_fib_entry_op_gre4() function to only call
mlxsw_sp_ipip_fib_entry_op_gre4_rtdp() make the ipip decap op
code similar to nve.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The indicated version fixes an issue whereby the MOMTE register would by
default enable mirroring of ECN-marked traffic from all traffic classes,
once the ECN mirroring was configured. This fix is necessary for offload
of RED "ecn_mark" qevent.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suppresses the following coccinelle warning:
drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c:139:3-7:
WARNING use flexible-array member instead
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suppresses the following coccinelle warning:
drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c:18:15-19: WARNING use flexible-array member instead
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, mlxsw triggers the 'devlink:devlink_hwmsg' tracepoint
whenever a request is sent to the device and whenever a response is
received from it. However, the tracepoint is not triggered when an event
(e.g., port up / down) is received from the device.
Also trace EMAD events in order to log a more complete picture of all
the exchanged hardware messages.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test that the reference count of a router interface (RIF) configured for
a LAG is incremented / decremented when ports join / leave the LAG. Use
the offload indication on routes configured on the RIF to understand if
it was created / destroyed.
The test fails without the previous patch.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case a router interface (RIF) is configured for a LAG, make sure its
configuration is applied on the new LAG member.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit says:
====================
r8169: improve rtl_rx and NUM_RX_DESC handling
This series improves rtl_rx() and the handling of NUM_RX_DESC.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After recent changes there's no need any longer to define NUM_RX_DESC
as an unsigned value.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no need to check min(budget, NUM_RX_DESC). At first budget
(NAPI_POLL_WEIGHT = 64) is less then NUM_RX_DESC (256).
And more important: Even in case of budget > NUM_RX_DESC we could
safely continue processing descriptors as long as they are owned by
the CPU. In addition replace rx_left with a normal counter variable,
this allows to simplify the code. Last but not least there's no need
any longer to pass the budget as an u32.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- bump version strings, by Simon Wunderlich
- update include for min/max helpers, by Sven Eckelmann
- add infrastructure and netlink functions for routing algo selection,
by Sven Eckelmann (2 patches)
- drop deprecated debugfs and sysfs support and obsoleted
functionality, by Sven Eckelmann (3 patches)
- drop unused include in fragmentation.c, by Simon Wunderlich
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAl/KWIkWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoTrTEACUImdOCWv4+NnEQfChQv6Y3i18
gJABXoOkWLfFpGBUlw/uYzFKpMEWZ0orHig9gucC+rmjNc8veWwAOugJoTPTKQJZ
/4yndhM0x39vWex03rdDmyqzCEh1V1Q9VcdEuD6XbJDaK5F4jDu3NQVneOijIkN+
5PzhlvtUlfe8csykOCOoC9Y5wy82fEhcEvuSq+Z6dU3Cb3EGHtEUtZ4orDkpnnml
7XEcn5C5+OFGlz/ikiszKumTtNK+dmGluOxoyfAzEjQHK7PoTorcXFS2YUoSWeqQ
gmYZ56RBqEHjo4eqcaEgcqq5v8cTPCEMCB8UQjAffxrhloRKHRhQOysG1+OnzGA8
IQ2ARHLQCVPVraXF2ixE0D3BvjKmtMmcvZOCXwhCHDajn9jFKAh0+hnInDyv6Fp1
7eUfpHACL9EQDxKWXeQg37X2mk3hHJ+4zgZOYidahVeKbiiexe2heaHTYAbr9rIf
8hvtlgMg4AnwL3IxadrKwsbJ5t7TEPLTInf47hPvpRg7SmthDTcgso4VDwmWgB8W
Tlug8/NoXXDCmDhXUpvyi9+idHe0J8xvHl/2xGC7aSsPAbuhqOuefMKr36YhXJTY
vBA5Ih5ppylJ8Dzwa0TbonvQbOAinA8YTa6izKxY4e+xTPB2jz/WT2vciEZv2+ig
vNIPFrLZ7OFMohCDfQ==
=tNKF
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-pullrequest-20201204' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- update include for min/max helpers, by Sven Eckelmann
- add infrastructure and netlink functions for routing algo selection,
by Sven Eckelmann (2 patches)
- drop deprecated debugfs and sysfs support and obsoleted
functionality, by Sven Eckelmann (3 patches)
- drop unused include in fragmentation.c, by Simon Wunderlich
* tag 'batadv-next-pullrequest-20201204' of git://git.open-mesh.org/linux-merge:
batman-adv: Drop unused soft-interface.h include in fragmentation.c
batman-adv: Drop legacy code for auto deleting mesh interfaces
batman-adv: Drop deprecated debugfs support
batman-adv: Drop deprecated sysfs support
batman-adv: Allow selection of routing algorithm over rtnetlink
batman-adv: Prepare infrastructure for newlink settings
batman-adv: Add new include for min/max helpers
batman-adv: Start new development cycle
====================
Link: https://lore.kernel.org/r/20201204154631.21063-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When CONFIG_OF is disabled, there is a harmless warning about
an unused variable:
enetc_pf.c: In function 'enetc_phylink_create':
enetc_pf.c:981:17: error: unused variable 'dev' [-Werror=unused-variable]
Slightly rearrange the code to pass around the of_node as a
function argument, which avoids the problem without hurting
readability.
Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20201204120800.17193-1-claudiu.manoil@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The OpenCompute time card is an atomic clock along with
a GPS receiver that provides a Grandmaster clock source
for a PTP enabled network.
More information is available at http://www.timingcard.com/
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20201204035128.2219252-2-jonathan.lemon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
implement the NCI 2.x initial sequence to support NCI 2.x NFCC.
Since NCI 2.0, CORE_RESET and CORE_INIT sequence have been changed.
If NFCEE supports NCI 2.x, then NCI 2.x initial sequence will work.
In NCI 1.0, Initial sequence and payloads are as below:
(DH) (NFCC)
| -- CORE_RESET_CMD --> |
| <-- CORE_RESET_RSP -- |
| -- CORE_INIT_CMD --> |
| <-- CORE_INIT_RSP -- |
CORE_RESET_RSP payloads are Status, NCI version, Configuration Status.
CORE_INIT_CMD payloads are empty.
CORE_INIT_RSP payloads are Status, NFCC Features,
Number of Supported RF Interfaces, Supported RF Interface,
Max Logical Connections, Max Routing table Size,
Max Control Packet Payload Size, Max Size for Large Parameters,
Manufacturer ID, Manufacturer Specific Information.
In NCI 2.0, Initial Sequence and Parameters are as below:
(DH) (NFCC)
| -- CORE_RESET_CMD --> |
| <-- CORE_RESET_RSP -- |
| <-- CORE_RESET_NTF -- |
| -- CORE_INIT_CMD --> |
| <-- CORE_INIT_RSP -- |
CORE_RESET_RSP payloads are Status.
CORE_RESET_NTF payloads are Reset Trigger,
Configuration Status, NCI Version, Manufacturer ID,
Manufacturer Specific Information Length,
Manufacturer Specific Information.
CORE_INIT_CMD payloads are Feature1, Feature2.
CORE_INIT_RSP payloads are Status, NFCC Features,
Max Logical Connections, Max Routing Table Size,
Max Control Packet Payload Size,
Max Data Packet Payload Size of the Static HCI Connection,
Number of Credits of the Static HCI Connection,
Max NFC-V RF Frame Size, Number of Supported RF Interfaces,
Supported RF Interfaces.
Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Link: https://lore.kernel.org/r/20201202223147.3472-1-bongsu.jeon@samsung.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Connect hosts H1 and H2 using two intermediate encapsulation routers
(LER1 and LER2). These routers encapsulate traffic from the hosts,
including the original Ethernet header, into MPLS.
Use ping to test reachability between H1 and H2.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/625f5c1aafa3a8085f8d3e082d680a82e16ffbaa.1606918980.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We add the support to remove a specific node down with 128bit
node identifier, as an alternative to legacy 32-bit node address.
example:
$tipc peer remove identiy <1001002|16777777>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Link: https://lore.kernel.org/r/20201203035045.4564-1-hoang.h.le@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If there isn't a proper NFC firmware image, Bootloader mode will be
skipped.
Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201203225257.2446-1-bongsu.jeon@samsung.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arjun Roy says:
====================
Perf. optimizations for TCP Recv. Zerocopy
This patchset contains several optimizations for TCP Recv. Zerocopy.
Summarized:
1. It is possible that a read payload is not exactly page aligned -
that there may exist "straggler" bytes that we cannot map into the
caller's address space cleanly. For this, we allow the caller to
provide as argument a "hybrid copy buffer", turning
getsockopt(TCP_ZEROCOPY_RECEIVE) into a "hybrid" operation that allows
the caller to avoid a subsequent recvmsg() call to read the
stragglers.
2. Similarly, for "small" read payloads that are either below the size
of a page, or small enough that remapping pages is not a performance
win - we allow the user to short-circuit the remapping operations
entirely and simply copy into the buffer provided.
Some of the patches in the middle of this set are refactors to support
this "short-circuiting" optimization.
3. We allow the user to provide a hint that performing a page zap
operation (and the accompanying TLB shootdown) may not be necessary,
for the provided region that the kernel will attempt to map pages
into. This allows us to avoid this expensive operation while holding
the socket lock, which provides a significant performance advantage.
With all of these changes combined, "medium" sized receive traffic
(multiple tens to few hundreds of KB) see significant efficiency gains
when using TCP receive zerocopy instead of regular recvmsg(). For
example, with RPC-style traffic with 32KB messages, there is a roughly
15% efficiency improvement when using zerocopy. Without these changes,
there is a roughly 60-70% efficiency reduction with such messages when
employing zerocopy.
====================
Link: https://lore.kernel.org/r/20201202225349.935284-1-arjunroy.kdev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zapping pages is required only if we are calling vm_insert_page into a
region where pages had previously been mapped. Receive zerocopy allows
reusing such regions, and hitherto called zap_page_range() before
calling vm_insert_page() in that range.
zap_page_range() can also be triggered from userspace with
madvise(MADV_DONTNEED). If userspace is configured to call this before
reusing a segment, or if there was nothing mapped at this virtual
address to begin with, we can avoid calling zap_page_range() under the
socket lock. That said, if userspace does not do that, then we are
still responsible for calling zap_page_range().
This patch adds a flag that the user can use to hint to the kernel
that a zap is not required. If the flag is not set, or if an older
user application does not have a flags field at all, then the kernel
calls zap_page_range as before. Also, if the flag is set but a zap is
still required, the kernel performs that zap as necessary. Thus
incorrectly indicating that a zap can be avoided does not change the
correctness of operation. It also increases the batchsize for
vm_insert_pages and prefetches the page struct for the batch since
we're about to bump the refcount.
An alternative mechanism could be to not have a flag, assume by
default a zap is not needed, and fall back to zapping if needed.
However, this would harm performance for older applications for which
a zap is necessary, and thus we implement it with an explicit flag
so newer applications can opt in.
When using RPC-style traffic with medium sized (tens of KB) RPCs, this
change yields an efficency improvement of about 30% for QPS/CPU usage.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Set zerocopy hint, event when falling back to copy, so that the
pending data can be efficiently received using zerocopy when
possible.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sometimes, we may call tcp receive zerocopy when inq is 0,
or inq < PAGE_SIZE, or inq is generally small enough that
it is cheaper to copy rather than remap pages.
In these cases, we may want to either return early (inq=0) or
attempt to use the provided copy buffer to simply copy
the received data.
This allows us to save both system call overhead and
the latency of acquiring mmap_sem in read mode for cases where
it would be useless to do so.
This patchset enables this behaviour by:
1. Returning quickly if inq is 0.
2. Attempting to perform a regular copy if a hybrid copybuffer is
provided and it is large enough to absorb all available bytes.
3. Return quickly if no such buffer was provided and there are less
than PAGE_SIZE bytes available.
For small RPC ping-pong workloads, normally we would have
1 getsockopt(), 1 recvmsg() and 1 sendmsg() call per RPC. With this
change, we remove the recvmsg() call entirely, reducing the syscall
overhead by about 33%. In testing with small (hundreds of bytes)
RPC traffic, this yields a syscall reduction of about 33% and
an efficiency gain of about 3-5% when defined as QPS/CPU Util.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sometimes, we may call tcp receive zerocopy when inq is 0,
or inq < PAGE_SIZE, in which case we cannot remap pages. In this case,
simply return the appropriate hint for regular copying without taking
mmap_sem.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Refactor frag-is-remappable test for tcp receive zerocopy. This is
part of a patch set that introduces short-circuited hybrid copies
for small receive operations, which results in roughly 33% fewer
syscalls for small RPC scenarios.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Refactor skb frag fast-forwarding for tcp receive zerocopy. This is
part of a patch set that introduces short-circuited hybrid copies
for small receive operations, which results in roughly 33% fewer
syscalls for small RPC scenarios.
skb_advance_to_frag(), given a skb and an offset into the skb,
iterates from the first frag for the skb until we're at the frag
specified by the offset. Assuming the offset provided refers to how
many bytes in the skb are already read, the returned frag points to
the next frag we may read from, while offset_frag is set to the number
of bytes from this frag that we have already read.
If frag is not null and offset_frag is equal to 0, then we may be able
to map this frag's page into the process address space with
vm_insert_page(). However, if offset_frag is not equal to 0, then we
cannot do so.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>