IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
ieee80211_report_used_skb of mac80211 use the frame_control of
ieee80211_hdr in sk_buff and indicate it to another function
ieee80211_mgd_conn_tx_status, then it queue work ieee80211_sta_work,
but ieee80211_is_auth(fc) in ieee80211_sta_work check fail when the
authentication has transmitted by ath10k.
When the ath10k report it with HTT_TX_COMPL_STATE_DISCARD, it will be
set without flag IEEE80211_TX_STAT_ACK, then mac80211 should try the
next authentication immeditely, but in fact mac80211 wait 1 second for
it, the reason is ieee80211_is_auth(fc) in ieee80211_sta_work check
fail for the sk_buff which is not restored, the data of sk_buff is not
the begin of ieee80211_hdr, in fact it is the begin of htt_cmd_hdr.
dmesg without this patch, it wait 1 second for the next retry when
ath10k report without IEEE80211_TX_STAT_ACK for authentication:
[ 6973.883116] wlan0: send auth to 5e:6f:2b:0d:fb:d7 (try 1/3)
[ 6974.705471] wlan0: send auth to 5e:6f:2b:0d:fb:d7 (try 2/3)
[ 6975.712962] wlan0: send auth to 5e:6f:2b:0d:fb:d7 (try 3/3)
Restore the sk_buff make mac8011 retry the next authentication
immeditely which meet logic of mac80211.
dmesg with this patch, it retry the next immeditely when ath10k
report without IEEE80211_TX_STAT_ACK for authentication:
[ 216.734813] wlan0: send auth to 5e:6f:2b:0d:fb:d7 (try 1/3)
[ 216.739914] wlan0: send auth to 5e:6f:2b:0d:fb:d7 (try 2/3)
[ 216.745874] wlan0: send auth to 5e:6f:2b:0d:fb:d7 (try 3/3)
Tested-on: QCA6174 hw3.2 SDIO WLAN.RMH.4.4.1-00049
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1612839530-2263-1-git-send-email-wgong@codeaurora.org
This change fixes the checkpatch warning described in this commit
commit cbacb5ab0aa0 ("docs: printk-formats: Stop encouraging use of
unnecessary %h[xudi] and %hh[xudi]")
Standard integer promotion is already done and %hx and %hhx is useless
so do not encourage the use of %hh[xudi] or %h[xudi].
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210127222344.2445641-1-trix@redhat.com
ath10k_htt_tx_free_msdu_id() has a lockdep assertion that htt->tx_lock
is held. Acquire the lock in a couple of error paths when calling that
function to ensure this condition is met.
Fixes: 6421969f248fd ("ath10k: refactor tx pending management")
Fixes: e62ee5c381c59 ("ath10k: Add support for htt_data_tx_desc_64 descriptor")
Signed-off-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200604105901.1.I5b8b0c7ee0d3e51a73248975a9da61401b8f3900@changeid
The transmission utilization ratio for sdio bus for small packet is
slow, because the space and time cost for sdio bus is same for large
length packet and small length packet. So the speed of data for large
length packet is higher than small length.
Test result of different length of data:
data packet(byte) cost time(us) calculated rate(Mbps)
256 28 73
512 33 124
1024 35 234
1792 45 318
14336 168 682
28672 333 688
57344 660 695
This patch change the TX packet from single packet to a large length
bundle packet, max size is 32, it results in significant performance
improvement on TX path.
Also there's a fourth thread "ath10k_tx_complete_wq" added to ath10k as it
improves TCP RX throughput (values in Mbps):
TCP-RX TCP-TX UDP-RX UDP-TX
use workqueue_tx_complete 423 357 448 412
change it to ar->workqueue 410 360 449 414
change it to ar->workqueue_aux 405 339 446 401
This patch only effect sdio chip, it will not effect PCI, SNOC etc.
It only enable bundle for sdio chip.
Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWP-1.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200410061400.14231-2-wgong@codeaurora.org
For sdio chip, it is high latency bus, all the TX packet's content will
be tranferred from HOST memory to firmware memory via sdio bus, then it
need much more memory in firmware than low latency bus chip, for low
latency chip, such as PCI-E, it only need to transfer the TX descriptor
via PCI-E bus to firmware memory. For sdio chip, reduce the complexity of
TX logic will help TX efficiency since its memory is limited, and it will
reduce the TX circle's time of each packet and then firmware will have more
memory for TX since TX complete also need memeory.
This patch disable TX complete indication from firmware for htt data
packet, it will not have TX complete indication from firmware to ath10k.
It will cut the cost of bus bandwidth of TX complete and make the TX
logic of firmware simpler, it results in significant performance
improvement on TX path.
Udp TX throughout is 130Mbps without this patch, and it arrives
400Mbps with this patch.
The downside of this patch is the command "iw wlan0 station dump" will
show 0 for "tx retries" and "tx failed" since all tx packet's status
is success.
This patch only effect sdio chip, it will not effect PCI, SNOC etc.
Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200212080415.31265-2-wgong@codeaurora.org
GCMP MIC length is not filled for GCMP/GCMP-256 cipher suites in
PMF enabled case. Due to mismatch in MIC length, deauth/disassoc frames
are unencrypted.
This patch fills proper MIC length for GCMP/GCMP-256 cipher suites.
Tested HW: QCA9984, QCA9888
Tested FW: 10.4-3.6-00104
Signed-off-by: Yingying Tang <yintang@codeaurora.org>
Co-developed-by: Sowmiya Sree Elavalagan <ssreeela@codeaurora.org>
Signed-off-by: Sowmiya Sree Elavalagan <ssreeela@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
For PMF case, the action,deauth,disassoc management need to encrypt
by hardware, it need to reserve 8 bytes for encryption, otherwise
the packet will be sent out with error format, then PMF case will
fail.
After add the 8 bytes, it will pass the PMF case.
Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00005-QCARMSWP-1.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Without this op, it will not be possible to configure aggregation for
high latency devices.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This is done in order to make the *htt_h2t_aggr_cfg_msg* op align better
with the rest of the htt ops (whom all have inline wrappers).
It also adds support for the case when the op is missing (function
pointer is NULL).
As a result of this, the name of the 32 bit implementation in htt_tx.c
was changed and the function was made static.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Reset HTT stats helps to get the aggregated HTT stats via
tracing and also we can clear the accumulated HTT stats with
this debugfs file.
Signed-off-by: Maharaja Kennadyrajan <mkenna@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Extended the bit mask value of the HTT stats to get the Mu-MIMO
related stats via tracing.
Signed-off-by: Maharaja Kennadyrajan <mkenna@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
With SDIO there's a use after free after a data frame is transfered, call stack
below. This happens because ath10k_htt_tx_hl() directly transmits the skb
provided by mac80211 using ath10k_htc_send(), all other HTT functions use
separate skb created with ath10k_htc_alloc_skb() to transmit the HTC packet.
After the packet is transmitted mac80211 frees the skb in ieee80211_tx_status()
but HTT layer expects that it still owns the skb, and frees it in
ath10k_htt_htc_tx_complete().
To fix this take a reference of skb before sending it to HTC layer to make sure
we still own the skb.
Tested on QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00007-QCARMSWP-1.
ath10k_htt_tx_hl() is only used by SDIO and USB so other busses (PCI, AHB and
SNOC) should be unaffected.
call stack of use-after-free:
dump_backtrace+0x0/0x2d8
show_stack+0x20/0x2c
__dump_stack+0x20/0x28
dump_stack+0xc8/0xec
print_address_description+0x74/0x240
kasan_report+0x258/0x274
__asan_report_load4_noabort+0x20/0x28
skb_pull+0xbc/0x114
ath10k_htc_notify_tx_completion+0x190/0x2a4 [ath10k_core]
ath10k_sdio_write_async_work+0x1e4/0x2c4 [ath10k_sdio]
process_one_work+0x3d8/0x8b0
worker_thread+0x518/0x7e0
kthread+0x260/0x278
ret_from_fork+0x10/0x18
Allocated by one task:
kasan_kmalloc+0xa0/0x13c
kasan_slab_alloc+0x14/0x1c
kmem_cache_alloc+0x144/0x208
__alloc_skb+0xec/0x394
alloc_skb_with_frags+0x8c/0x374
sock_alloc_send_pskb+0x520/0x5d4
sock_alloc_send_skb+0x40/0x50
__ip_append_data+0xf5c/0x1858
ip_make_skb+0x194/0x1d4
udp_sendmsg+0xf24/0x1ab8
inet_sendmsg+0x1b0/0x2e0
sock_sendmsg+0x88/0xa0
__sys_sendto+0x220/0x3a8
__arm64_sys_sendto+0x78/0x80
el0_svc_common+0x120/0x1e0
el0_svc_compat_handler+0x64/0x80
el0_svc_compat+0x8/0x18
Freed by another task:
__kasan_slab_free+0x120/0x1d4
kasan_slab_free+0x10/0x1c
kmem_cache_free+0x74/0x504
kfree_skbmem+0x88/0xc8
__kfree_skb+0x24/0x2c
consume_skb+0x114/0x18c
__ieee80211_tx_status+0xb7c/0xf60 [mac80211]
ieee80211_tx_status+0x224/0x270 [mac80211]
ath10k_txrx_tx_unref+0x564/0x950 [ath10k_core]
ath10k_htt_t2h_msg_handler+0x178c/0x2a38 [ath10k_core]
ath10k_htt_htc_t2h_msg_handler+0x20/0x30 [ath10k_core]
ath10k_sdio_irq_handler+0xcc0/0x1654 [ath10k_sdio]
process_sdio_pending_irqs+0xec/0x358
sdio_run_irqs+0x68/0xe4
sdio_irq_work+0x1c/0x28
process_one_work+0x3d8/0x8b0
worker_thread+0x518/0x7e0
kthread+0x260/0x278
ret_from_fork+0x10/0x18
Reported-by: Wen Gong <wgong@codeaurora.org>
Tested-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Transmit completion for SDIO is similar to PCIe, modify the high
latency path to allow SDIO modules to use the msdu id.
kvalo: the original patch from Alagu enabled this only for SDIO but I'm not
sure should we also enable this with USB. I'll use bus params to enable this
for so that it's easy to enable also for USB later.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00007-QCARMSWP-1.
Co-developed-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Alagu Sankar <alagusankar@silex-india.com>
Signed-off-by: Wen Gong <wgong@codeaurora.org>.
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Use SPDX identifiers everywhere in ath10k.
Makefile was incorrectly marked in commit b24413180f56 ("License cleanup: add
SPDX GPL-2.0 license identifier to files with no license"), fix that as well.
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
wow pause iface config controls the PCI D0/D3-WOW cases for pcie
bus state. Firmware does not expects WOW_IFACE_PAUSE_ENABLED config
for bus/link that cannot be suspended ex:snoc and does not trigger
common subsystem shutdown.
Disable interface pause wow config for integrated chipset(WCN3990)
for correct WOW configuration in the firmware.
Testing:
Tested on WCN3990 HW.
Tested FW: WLAN.HL.2.0-01192-QCAHLSWMTPLZ-1.
Signed-off-by: Govind Singh <govinds@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
HTT aggr message parameter in HL2.0 fw are different in comparison
to legacy fw version. Fill correct HTT aggr msg parameter for
targets using HL2.0 firmware.
Signed-off-by: Govind Singh <govinds@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
WCN3990 is a 37-bit target but can address memory range
only upto 35 bits. The 36th bit is used to control the
smmu/iommu translation and the 37th bit is used by the
internal bus masters to access the wifi subsystem internal
SRAM. With the DMA mask set to 37i-bit, the host driver
can get 37-bit dma address, which leads to incorrect
address access in the target.
Hence the host driver can used addresses upto 35-bit
for WCN3990. Fix the dma mask for wcn3990 to 35-bit,
instead of 37-bit.
Tested HW: WCN3990
Tested FW: WLAN.HL.2.0-01188-QCAHLSWMTPLZ-1
Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Add HTT TX function for HL interfaces.
Intended for SDIO and USB.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Special HTT RX ring config message used by high latency
devices.
The main difference between HL and LL is that HL devices
do not use shared memory between device and host and thus,
no host paddr's are added to the RX config message.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Several DMA related functions (such as the dma_map_xxx functions)
are not used with high latency devices and don't need to be invoked
in this case.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This is only code refactoring as all call sites of
ath10k_htt_tx_alloc_msdu_id() take the same lock it can be moved into the
id_get function and the assertion dropped.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Although the TID mask is 0xf, the modulus operation does still not
produce identical results as the bitwise and operator. If the TID is 15, the
modulus operation will "convert" it to 0, whereas the bitwise and will keep it
as 15.
This was found during code review.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Notice that in this particular case, I replaced "pass through" with
a proper "fall through" comment, which is what GCC is expecting
to find.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
By default ath10k driver enables the support for HW_CHECKSUM
(NETIF_F_HW_CSUM). Since the TCP/UDP checksum calculation is not enabled
in the wcn3990 firmware the checksum is incorrect in the TCP/UDP packets
and all patckets are dropped. But due note that wcn3990 support in
ath10k is still incomplete so this isn't a critical fix (yet).
Enable hw checksum calculations in wcn3990 hardware by
setting the proper flags in msdu descriptor tso flags.
Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
These wrappers makes the HTT ops align better with the HIF ops
(where similar wrappers are used).
It also makes it easier for a target to have unsupported ops
(by letting the corresponding function pointer be NULL).
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
WCN3990 target uses 64 bit frags_paddr in htt tx descriptor,
which holds the physical address of SKB fragments in tx data path.
In order to support 64 bit bit frags_paddr in htt tx descriptor, define
htt_data_tx_desc_64 descriptor and ath10k_htt_tx_64 method for handling
tx data path with new descriptor fields.
Signed-off-by: Govind Singh <govinds@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
WCN3990 target uses 64 bit frag descriptor and more
fields in TSO flag.
Add support for 64 bit HTT frag descriptor.
Signed-off-by: Govind Singh <govinds@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
WCN3900 target uses 64bit rx_ring_base_paddr and
fw_idx_shadow_reg_paddr fields in HTT rx ring cfg message.
These address points to the memory region where remote
ring empty buffers are allocated.
In order to add 64 bit htt rx ring cfg, define separate
64 bit htt rx ring cfg message and attach it in runtime
based on target_64bit hw param flag.
Signed-off-by: Govind Singh <govinds@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Fix output from checkpatch.pl like:
Block comments use a trailing */ on a separate lin
Signed-off-by: Marcin Rokicki <marcin.rokicki@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
'ath10k_htt_tx_free_cont_txbuf' and 'ath10k_htt_tx_free_cont_frag_desc'
have NULL pointer checks to avoid crash if they are called twice
but this is as of now not sufficient as these pointers are not assigned
to NULL once the contiguous DMA memory allocation is freed, fix this.
Though this may not be hit with the explicity check of state variable
'tx_mem_allocated' check, good to have this addressed as well.
Below BUG_ON is hit when the above scenario is simulated
with kernel debugging enabled
page:f6d09a00 count:0 mapcount:-127 mapping: (null)
index:0x0
flags: 0x40000000()
page dumped because: VM_BUG_ON_PAGE(page_ref_count(page)
== 0)
------------[ cut here ]------------
kernel BUG at ./include/linux/mm.h:445!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
EIP is at put_page_testzero.part.88+0xd/0xf
Call Trace:
[<c118a2cc>] __free_pages+0x3c/0x40
[<c118a30e>] free_pages+0x3e/0x50
[<c10222b4>] dma_generic_free_coherent+0x24/0x30
[<f8c1d9a8>] ath10k_htt_tx_free_cont_txbuf+0xf8/0x140
[<f8c1e2a9>] ath10k_htt_tx_destroy+0x29/0xa0
[<f8c143e0>] ath10k_core_destroy+0x60/0x80 [ath10k_core]
[<f8acd7e9>] ath10k_pci_remove+0x79/0xa0 [ath10k_pci]
[<c13ed7a8>] pci_device_remove+0x38/0xb0
[<c14d3492>] __device_release_driver+0x72/0x100
[<c14d36b7>] driver_detach+0x97/0xa0
[<c14d29c0>] bus_remove_driver+0x40/0x80
[<c14d427a>] driver_unregister+0x2a/0x60
[<c13ec768>] pci_unregister_driver+0x18/0x70
[<f8aced4f>] ath10k_pci_exit+0xd/0x2be [ath10k_pci]
[<c1101e78>] SyS_delete_module+0x158/0x210
[<c11b34f1>] ? __might_fault+0x41/0xa0
[<c11b353b>] ? __might_fault+0x8b/0xa0
[<c1001a4b>] do_fast_syscall_32+0x9b/0x1c0
[<c178da34>] sysenter_past_esp+0x45/0x74
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
With maximum number of vap's configured in a two radio supported
systems of ~256 Mb RAM, doing a continuous wifi down/up and
intermittent traffic streaming from the connected stations results
in failure to allocate contiguous memory for tx buffers. This results
in the disappearance of all VAP's and a manual reboot is needed as
this is not a crash (or) OOM(for OOM killer to be invoked). To address
this allocate contiguous memory for tx buffers one time and re-use them
until the modules are unloaded but this results in a slight increase in
memory footprint of ath10k when the wifi is down, but the modules are
still loaded. Also as of now we use a separate bool 'tx_mem_allocated'
to keep track of the one time memory allocation, as we cannot come up
with something like 'ath10k_tx_{register,unregister}' before
'ath10k_probe_fw' is called as 'ath10k_htt_tx_alloc_cont_frag_desc'
memory allocation is dependent on the hw_param 'continuous_frag_desc'
a) memory footprint of ath10k without the change
lsmod | grep ath10k
ath10k_core 414498 1 ath10k_pci
ath10k_pci 38236 0
b) memory footprint of ath10k with the change
ath10k_core 414980 1 ath10k_pci
ath10k_pci 38236 0
Memory Failure Call trace:
hostapd: page allocation failure: order:6, mode:0xd0
[<c021f150>] (__dma_alloc_buffer.isra.23) from
[<c021f23c>] (__alloc_remap_buffer.isra.26+0x14/0xb8)
[<c021f23c>] (__alloc_remap_buffer.isra.26) from
[<c021f664>] (__dma_alloc+0x224/0x2b8)
[<c021f664>] (__dma_alloc) from [<c021f810>]
(arm_dma_alloc+0x84/0x90)
[<c021f810>] (arm_dma_alloc) from [<bf954764>]
(ath10k_htt_tx_alloc+0xe0/0x2e4 [ath10k_core])
[<bf954764>] (ath10k_htt_tx_alloc [ath10k_core]) from
[<bf94e6ac>] (ath10k_core_start+0x538/0xcf8 [ath10k_core])
[<bf94e6ac>] (ath10k_core_start [ath10k_core]) from
[<bf947eec>] (ath10k_start+0xbc/0x56c [ath10k_core])
[<bf947eec>] (ath10k_start [ath10k_core]) from
[<bf8a7a04>] (drv_start+0x40/0x5c [mac80211])
[<bf8a7a04>] (drv_start [mac80211]) from [<bf8b7cf8>]
(ieee80211_do_open+0x170/0x82c [mac80211])
[<bf8b7cf8>] (ieee80211_do_open [mac80211]) from
[<c056afc8>] (__dev_open+0xa0/0xf4)
[21053.491752] Normal: 641*4kB (UEMR) 505*8kB (UEMR) 330*16kB (UEMR)
126*32kB (UEMR) 762*64kB (UEMR) 237*128kB (UEMR) 1*256kB (M) 0*512kB
0*1024kB 0*2048kB 0*4096kB = 95276kB
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Remove extraneous error message in 'ath10k_htt_tx_alloc_cont_frag_desc'
as the caller 'ath10k_htt_tx_alloc' already dumps a proper error
message
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
cleanup 'ath10k_htt_tx_alloc' by introducing the API's
'ath10k_htt_tx_alloc/free_{cont_txbuf, txdone_fifo} and
re-use them whereever needed
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Add NAPI support for rx and tx completion. NAPI poll is scheduled
from interrupt handler. The design is as below
- on interrupt
- schedule napi and mask interrupts
- on poll
- process all pipes (no actual Tx/Rx)
- process Rx within budget
- if quota exceeds budget reschedule napi poll by returning budget
- process Tx completions and update budget if necessary
- process Tx fetch indications (pull-push)
- push any other pending Tx (if possible)
- before resched or napi completion replenish htt rx ring buffer
- if work done < budget, complete napi poll and unmask interrupts
This change also get rid of two tasklets (intr_tq and txrx_compl_task).
Measured peak throughput with NAPI on IPQ4019 platform in controlled
environment. No noticeable reduction in throughput is seen and also
observed improvements in CPU usage. Approx. 15% CPU usage got reduced
in UDP uplink case.
DL: AP DUT Tx
UL: AP DUT Rx
IPQ4019 (avg. cpu usage %)
========
TOT +NAPI
=========== =============
TCP DL 644 Mbps (42%) 645 Mbps (36%)
TCP UL 673 Mbps (30%) 675 Mbps (26%)
UDP DL 682 Mbps (49%) 680 Mbps (49%)
UDP UL 720 Mbps (28%) 717 Mbps (11%)
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Otherwise, the txrx-compl-task may access some bad memory?
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Smatch warns about a number of cases in ath10k where a pointer is
null-checked after it has already been dereferenced, in code involving
ath10k private virtual interface pointers.
Fix these by making the dereference happen later.
Addresses the following smatch warnings:
drivers/net/wireless/ath/ath10k/mac.c:3651 ath10k_mac_txq_init() warn: variable dereferenced before check 'txq' (see line 3649)
drivers/net/wireless/ath/ath10k/mac.c:3664 ath10k_mac_txq_unref() warn: variable dereferenced before check 'txq' (see line 3659)
drivers/net/wireless/ath/ath10k/htt_tx.c:70 __ath10k_htt_tx_txq_recalc() warn: variable dereferenced before check 'txq->sta' (see line 52)
drivers/net/wireless/ath/ath10k/htt_tx.c:740 ath10k_htt_tx_get_vdev_id() warn: variable dereferenced before check 'cb->vif' (see line 736)
drivers/net/wireless/ath/ath10k/txrx.c:86 ath10k_txrx_tx_unref() warn: variable dereferenced before check 'txq' (see line 84)
drivers/net/wireless/ath/ath10k/wmi.c:1837 ath10k_wmi_op_gen_mgmt_tx() warn: variable dereferenced before check 'cb->vif' (see line 1825)
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
To optimize CPU usage htt rx descriptors will be reused instead of
refilling it for htt rx copy engine (CE5). To support that all htt rx
indications should be processed at same context. FIFO queue is used
to maintain tx completion status for each msdu. This helps to retain
the order of tx completion.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Frames that are transmitted via MGMT_TX are using reserved descriptor
slots in firmware. This limitation is for the htt_mgmt_tx path itself,
not for mgmt frames per se. In 16 MBSSID scenario, these reserved slots
will be easy exhausted due to frequent probe responses. So for 10.4
based solutions, probe responses are limited by a threshold (24).
management tx path is separate for all except tlv based solutions. Since
tlv solutions (qca6174 & qca9377) do not support 16 AP interfaces, it is
safe to move management descriptor limitation check under mgmt_tx
function. Though CPU improvement is negligible, unlikely conditions or
never hit conditions in hot path can be avoided on data transmission.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The current/old tx path design was that host, at
its own leisure, pushed tx frames to the device.
For HTT there was ~1000-1400 msdu queue depth.
After reaching that limit the driver would request
mac80211 to stop queues. There was little control
over what packets got in there as far as
DA/RA was considered so it was rather easy to
starve per-station traffic flows.
With MU-MIMO this became a significant problem
because the queue depth was insufficient to buffer
frames from multiple clients (which could have
different signal quality and capabilities) in an
efficient fashion.
Hence the new tx path in 10.4 was introduced: a
pull-push mode.
Firmware and host can share tx queue state via
DMA. The state is logically a 2 dimensional array
addressed via peer_id+tid pair. Each entry is a
counter (either number of bytes or packets. Host
keeps it updated and firmware uses it for
scheduling Tx pull requests to host.
This allows MU-MIMO to become a lot more effective
with 10+ clients.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Firmware 10.4.3 onwards can support a pull-push Tx
model where it shares a Tx queue state with the
host.
The host updates the DMA region it pointed to
during HTT setup whenever number of software
queued from (on host) changes. Based on this
information firmware issues fetch requests to the
host telling the host how many frames from a list
of given stations/tids should be submitted to the
firmware.
The code won't be called because not all
appropriate HTT events are processed yet.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This merely adds some parsing, generation and
sanity checks with placeholders for real
code/functionality to be added later.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Tx pending counter logic assumed that the sk_buff
is already known and hence was performed in HTT
functions themselves.
However, for the sake of future wake_tx_queue()
usage the driver must be able to tell whether it
can submit more frames to firmware before it
dequeues frame from ieee80211_txq (and thus long
before HTT Tx functions are called) because once a
frame is dequeued it cannot be requeud back to
mac80211.
This prepares the driver for future changes.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This allows to use the new firmware which
implements the new tx data path. Without this
patch firmware supporting new tx path stops
responding shortly after booting.
This patch doesn't implement the entire pull-push
logic available in the new firmware. This will be
done in subsequent patches.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This makes the code easier to extend and re-use.
While at it fix _warn to _err. Other than that
there are no functional changes.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Allocations from the DMA zone were originally added for legacy ISA
stuff, or PCI devices that have specific limitations in their DMA
addressing capabilities. It has no place in ath10k, which can do
full 32-bit DMA.
Fixes memory allocation errors on some platforms.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>