96 Commits

Author SHA1 Message Date
Arnd Bergmann
9f12bebd51 ath10k: fix gcc-10 zero-length-bounds warnings
gcc-10 started warning about out-of-bounds access for zero-length
arrays:

In file included from drivers/net/wireless/ath/ath10k/core.h:18,
                 from drivers/net/wireless/ath/ath10k/htt_rx.c:8:
drivers/net/wireless/ath/ath10k/htt_rx.c: In function 'ath10k_htt_rx_tx_fetch_ind':
drivers/net/wireless/ath/ath10k/htt.h:1683:17: warning: array subscript 65535 is outside the bounds of an interior zero-length array 'struct htt_tx_fetch_record[0]' [-Wzero-length-bounds]
 1683 |  return (void *)&ind->records[le16_to_cpu(ind->num_records)];
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireless/ath/ath10k/htt.h:1676:29: note: while referencing 'records'
 1676 |  struct htt_tx_fetch_record records[0];
      |                             ^~~~~~~

Make records[] a flexible array member to allow this, moving it behind
the other zero-length member that is not accessed in a way that gcc
warns about.

Fixes: 22e6b3bc5d96 ("ath10k: add new htt definitions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200509120707.188595-1-arnd@arndb.de
2020-05-12 10:33:04 +03:00
Gustavo A. R. Silva
d3ed0cf047 ath10k: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200507041127.GA31587@embeddedor
2020-05-11 15:36:31 +03:00
Wen Gong
dd7fc5545b ath10k: add flush tx packets for SDIO chip
When station connected to AP, and run TX traffic such as TCP/UDP, and
system enter suspend state, then mac80211 call ath10k_flush with set
drop flag, recently it only send wmi peer flush to firmware and
firmware will flush all pending TX packets, for PCIe, firmware will
indicate the TX packets status to ath10k, and then ath10k indicate to
mac80211 TX complete with the status, then all the packets has been
flushed at this moment. For SDIO chip, it is different, its TX
complete indication is disabled by default, and it has a tx queue in
ath10k, and its tx credit control is enabled, total tx credit is 96,
when its credit is not sufficient, then the packets will buffered in
the tx queue of ath10k, max packets is TARGET_TLV_NUM_MSDU_DESC_HL
which is 1024, for SDIO, when mac80211 call ath10k_flush with set drop
flag, maybe it have pending packets in tx queue of ath10k, and if it
does not have sufficient tx credit, the packets will stay in queue
untill tx credit report from firmware, if it is a noisy environment,
tx speed is low and the tx credit report from firmware will delay more
time, then the num_pending_tx will remain > 0 untill all packets send
to firmware. After the 1st ath10k_flush, mac80211 will call the 2nd
ath10k_flush without set drop flag immediately, then it will call to
ath10k_mac_wait_tx_complete, and it wait untill num_pending_tx become
to 0, in noisy environment, it is esay to wait about near 5 seconds,
then it cause the suspend take long time.

1st and 2nd callstack of ath10k_flush
[  303.740427] ath10k_sdio mmc1:0001:1: ath10k_flush drop:1, pending:0-0
[  303.740495] ------------[ cut here ]------------
[  303.740739] WARNING: CPU: 1 PID: 3921 at /mnt/host/source/src/third_party/kernel/v4.19/drivers/net/wireless/ath/ath10k/mac.c:7025 ath10k_flush+0x54/0x104 [ath10k_core]
[  303.740757] Modules linked in: bridge stp llc ath10k_sdio ath10k_core rfcomm uinput cros_ec_rpmsg mtk_seninf mtk_cam_isp mtk_vcodec_enc mtk_fd mtk_vcodec_dec mtk_vcodec_common mtk_dip mtk_mdp3 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem videobuf2_v4l2 videobuf2_common hid_google_hammer hci_uart btqca bluetooth dw9768 ov8856 ecdh_generic ov02a10 v4l2_fwnode mtk_scp mtk_rpmsg rpmsg_core mtk_scp_ipi ipt_MASQUERADE fuse iio_trig_sysfs cros_ec_sensors_ring cros_ec_sensors_sync cros_ec_light_prox cros_ec_sensors industrialio_triggered_buffer
[  303.740914]  kfifo_buf cros_ec_activity cros_ec_sensors_core lzo_rle lzo_compress ath mac80211 zram cfg80211 joydev [last unloaded: ath10k_core]
[  303.741009] CPU: 1 PID: 3921 Comm: kworker/u16:10 Tainted: G        W         4.19.95 #2
[  303.741027] Hardware name: MediaTek krane sku176 board (DT)
[  303.741061] Workqueue: events_unbound async_run_entry_fn
[  303.741086] pstate: 60000005 (nZCv daif -PAN -UAO)
[  303.741166] pc : ath10k_flush+0x54/0x104 [ath10k_core]
[  303.741244] lr : ath10k_flush+0x54/0x104 [ath10k_core]
[  303.741260] sp : ffffffdf080e77a0
[  303.741276] x29: ffffffdf080e77a0 x28: ffffffdef3730040
[  303.741300] x27: ffffff907c2240a0 x26: ffffffde6ff39afc
[  303.741321] x25: ffffffdef3730040 x24: ffffff907bf61018
[  303.741343] x23: ffffff907c2240a0 x22: ffffffde6ff39a50
[  303.741364] x21: 0000000000000001 x20: ffffffde6ff39a50
[  303.741385] x19: ffffffde6bac2420 x18: 0000000000017200
[  303.741407] x17: ffffff907c24a000 x16: 0000000000000037
[  303.741428] x15: ffffff907b49a568 x14: ffffff907cf332c1
[  303.741476] x13: 00000000000922e4 x12: 0000000000000000
[  303.741497] x11: 0000000000000001 x10: 0000000000000007
[  303.741518] x9 : f2256b8c1de4bc00 x8 : f2256b8c1de4bc00
[  303.741539] x7 : ffffff907ab5e764 x6 : 0000000000000000
[  303.741560] x5 : 0000000000000080 x4 : 0000000000000001
[  303.741582] x3 : ffffffdf080e74a8 x2 : ffffff907aa91244
[  303.741603] x1 : ffffffdf080e74a8 x0 : 0000000000000024
[  303.741624] Call trace:
[  303.741701]  ath10k_flush+0x54/0x104 [ath10k_core]
[  303.741941]  __ieee80211_flush_queues+0x1dc/0x358 [mac80211]
[  303.742098]  ieee80211_flush_queues+0x34/0x44 [mac80211]
[  303.742253]  ieee80211_set_disassoc+0xc0/0x5ec [mac80211]
[  303.742399]  ieee80211_mgd_deauth+0x720/0x7d4 [mac80211]
[  303.742535]  ieee80211_deauth+0x24/0x30 [mac80211]
[  303.742720]  cfg80211_mlme_deauth+0x250/0x3bc [cfg80211]
[  303.742849]  cfg80211_mlme_down+0x90/0xd0 [cfg80211]
[  303.742971]  cfg80211_disconnect+0x340/0x3a0 [cfg80211]
[  303.743087]  __cfg80211_leave+0xe4/0x17c [cfg80211]
[  303.743203]  cfg80211_leave+0x38/0x50 [cfg80211]
[  303.743319]  wiphy_suspend+0x84/0x5bc [cfg80211]
[  303.743335]  dpm_run_callback+0x170/0x304
[  303.743346]  __device_suspend+0x2dc/0x3e8
[  303.743356]  async_suspend+0x2c/0xb0
[  303.743370]  async_run_entry_fn+0x48/0xf8
[  303.743383]  process_one_work+0x304/0x604
[  303.743394]  worker_thread+0x248/0x3f4
[  303.743403]  kthread+0x120/0x130
[  303.743416]  ret_from_fork+0x10/0x18

[  303.743812] ath10k_sdio mmc1:0001:1: ath10k_flush drop:0, pending:0-0
[  303.743858] ------------[ cut here ]------------
[  303.744057] WARNING: CPU: 1 PID: 3921 at /mnt/host/source/src/third_party/kernel/v4.19/drivers/net/wireless/ath/ath10k/mac.c:7025 ath10k_flush+0x54/0x104 [ath10k_core]
[  303.744075] Modules linked in: bridge stp llc ath10k_sdio ath10k_core rfcomm uinput cros_ec_rpmsg mtk_seninf mtk_cam_isp mtk_vcodec_enc mtk_fd mtk_vcodec_dec mtk_vcodec_common mtk_dip mtk_mdp3 videobuf2_dma_contig videobuf2_memops v4l2_mem2mem videobuf2_v4l2 videobuf2_common hid_google_hammer hci_uart btqca bluetooth dw9768 ov8856 ecdh_generic ov02a10 v4l2_fwnode mtk_scp mtk_rpmsg rpmsg_core mtk_scp_ipi ipt_MASQUERADE fuse iio_trig_sysfs cros_ec_sensors_ring cros_ec_sensors_sync cros_ec_light_prox cros_ec_sensors industrialio_triggered_buffer kfifo_buf cros_ec_activity cros_ec_sensors_core lzo_rle lzo_compress ath mac80211 zram cfg80211 joydev [last unloaded: ath10k_core]
[  303.744256] CPU: 1 PID: 3921 Comm: kworker/u16:10 Tainted: G        W         4.19.95 #2
[  303.744273] Hardware name: MediaTek krane sku176 board (DT)
[  303.744301] Workqueue: events_unbound async_run_entry_fn
[  303.744325] pstate: 60000005 (nZCv daif -PAN -UAO)
[  303.744403] pc : ath10k_flush+0x54/0x104 [ath10k_core]
[  303.744480] lr : ath10k_flush+0x54/0x104 [ath10k_core]
[  303.744496] sp : ffffffdf080e77a0
[  303.744512] x29: ffffffdf080e77a0 x28: ffffffdef3730040
[  303.744534] x27: ffffff907c2240a0 x26: ffffffde6ff39afc
[  303.744556] x25: ffffffdef3730040 x24: ffffff907bf61018
[  303.744577] x23: ffffff907c2240a0 x22: ffffffde6ff39a50
[  303.744598] x21: 0000000000000000 x20: ffffffde6ff39a50
[  303.744620] x19: ffffffde6bac2420 x18: 000000000001831c
[  303.744641] x17: ffffff907c24a000 x16: 0000000000000037
[  303.744662] x15: ffffff907b49a568 x14: ffffff907cf332c1
[  303.744683] x13: 00000000000922ea x12: 0000000000000000
[  303.744704] x11: 0000000000000001 x10: 0000000000000007
[  303.744747] x9 : f2256b8c1de4bc00 x8 : f2256b8c1de4bc00
[  303.744768] x7 : ffffff907ab5e764 x6 : 0000000000000000
[  303.744789] x5 : 0000000000000080 x4 : 0000000000000001
[  303.744810] x3 : ffffffdf080e74a8 x2 : ffffff907aa91244
[  303.744831] x1 : ffffffdf080e74a8 x0 : 0000000000000024
[  303.744853] Call trace:
[  303.744929]  ath10k_flush+0x54/0x104 [ath10k_core]
[  303.745098]  __ieee80211_flush_queues+0x1dc/0x358 [mac80211]
[  303.745277]  ieee80211_flush_queues+0x34/0x44 [mac80211]
[  303.745424]  ieee80211_set_disassoc+0x108/0x5ec [mac80211]
[  303.745569]  ieee80211_mgd_deauth+0x720/0x7d4 [mac80211]
[  303.745706]  ieee80211_deauth+0x24/0x30 [mac80211]
[  303.745853]  cfg80211_mlme_deauth+0x250/0x3bc [cfg80211]
[  303.745979]  cfg80211_mlme_down+0x90/0xd0 [cfg80211]
[  303.746103]  cfg80211_disconnect+0x340/0x3a0 [cfg80211]
[  303.746219]  __cfg80211_leave+0xe4/0x17c [cfg80211]
[  303.746335]  cfg80211_leave+0x38/0x50 [cfg80211]
[  303.746452]  wiphy_suspend+0x84/0x5bc [cfg80211]
[  303.746467]  dpm_run_callback+0x170/0x304
[  303.746477]  __device_suspend+0x2dc/0x3e8
[  303.746487]  async_suspend+0x2c/0xb0
[  303.746498]  async_run_entry_fn+0x48/0xf8
[  303.746510]  process_one_work+0x304/0x604
[  303.746521]  worker_thread+0x248/0x3f4
[  303.746530]  kthread+0x120/0x130
[  303.746542]  ret_from_fork+0x10/0x18

one sample's debugging log: it wait 3190 ms(5000 - 1810).

1st ath10k_flush, it has 120 packets in tx queue of ath10k:
<...>-1513  [000] .... 25374.786005: ath10k_log_err: ath10k_sdio mmc1:0001:1 ath10k_flush drop:1, pending:120-0
<...>-1513  [000] ...1 25374.788375: ath10k_log_warn: ath10k_sdio mmc1:0001:1 ath10k_htt_tx_mgmt_inc_pending htt->num_pending_mgmt_tx:0
<...>-1500  [001] .... 25374.790143: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:121

2st ath10k_flush, it has 121 packets in tx queue of ath10k:
<...>-1513  [000] .... 25374.790571: ath10k_log_err: ath10k_sdio mmc1:0001:1 ath10k_flush drop:0, pending:121-0
<...>-1513  [000] .... 25374.791990: ath10k_log_err: ath10k_sdio mmc1:0001:1 ath10k_mac_wait_tx_complete state:1 pending:121-0
<...>-1508  [001] .... 25374.792696: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:46
<...>-1508  [001] .... 25374.792700: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:46
<...>-1508  [001] .... 25374.792729: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:121
<...>-1508  [001] .... 25374.792937: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:88, count:32, len:49792
<...>-1508  [001] .... 25374.793031: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:75, count:14, len:21784
kworker/u16:0-25773 [003] .... 25374.793701: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx complete, eid:1, pending complete count:46
<...>-1881  [000] .... 25375.073178: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:24
<...>-1881  [000] .... 25375.073182: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:24
<...>-1881  [000] .... 25375.073429: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:75
<...>-1879  [001] .... 25375.074090: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx complete, eid:1, pending complete count:24
<...>-1881  [000] .... 25375.074123: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:51, count:24, len:37344
<...>-1879  [001] .... 25375.270126: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:26
<...>-1879  [001] .... 25375.270130: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:26
<...>-1488  [000] .... 25375.270174: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:51
<...>-1488  [000] .... 25375.270529: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:25, count:26, len:40456
<...>-1879  [001] .... 25375.270693: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx complete, eid:1, pending complete count:26
<...>-1488  [001] .... 25377.775885: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:12
<...>-1488  [001] .... 25377.775890: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:12
<...>-1488  [001] .... 25377.775933: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:25
<...>-1488  [001] .... 25377.776059: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:13, count:12, len:18672
<...>-1879  [001] .... 25377.776100: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx complete, eid:1, pending complete count:12
<...>-1488  [001] .... 25377.878079: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:15
<...>-1488  [001] .... 25377.878087: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:15
<...>-1879  [000] .... 25377.878323: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:13
<...>-1879  [000] .... 25377.878487: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx status:0, eid:1, req count:0, count:13, len:20228
<...>-1879  [000] .... 25377.878497: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx complete, eid:1, pending complete count:13
<...>-1488  [001] .... 25377.919927: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit update: delta:11
<...>-1488  [001] .... 25377.919932: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 credit total:13
<...>-1488  [001] .... 25377.919976: ath10k_log_dbg: ath10k_sdio mmc1:0001:1 bundle tx work, eid:1, count:0
<...>-1881  [000] .... 25377.982645: ath10k_log_warn: ath10k_sdio mmc1:0001:1 HTT_T2H_MSG_TYPE_MGMT_TX_COMPLETION status:0
<...>-1513  [001] .... 25377.982973: ath10k_log_err: ath10k_sdio mmc1:0001:1 ath10k_mac_wait_tx_complete time_left:1810, pending:0-0

Flush all pending TX packets for the 1st ath10k_flush reduced the wait
time of the 2nd ath10k_flush and then suspend take short time.

This Patch only effect SDIO chips.

Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00042.

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200415233730.10581-1-wgong@codeaurora.org
2020-04-22 09:45:03 +03:00
Wen Gong
c8334512f3 ath10k: add htt TX bundle for sdio
The transmission utilization ratio for sdio bus for small packet is
slow, because the space and time cost for sdio bus is same for large
length packet and small length packet. So the speed of data for large
length packet is higher than small length.

Test result of different length of data:

data packet(byte)   cost time(us)   calculated rate(Mbps)
      256               28                73
      512               33               124
     1024               35               234
     1792               45               318
    14336              168               682
    28672              333               688
    57344              660               695

This patch change the TX packet from single packet to a large length
bundle packet, max size is 32, it results in significant performance
improvement on TX path.

Also there's a fourth thread "ath10k_tx_complete_wq" added to ath10k as it
improves TCP RX throughput (values in Mbps):

                                       TCP-RX    TCP-TX    UDP-RX      UDP-TX
use workqueue_tx_complete              423       357       448         412
change it to ar->workqueue             410       360       449         414
change it to ar->workqueue_aux         405       339       446         401

This patch only effect sdio chip, it will not effect PCI, SNOC etc.
It only enable bundle for sdio chip.

Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWP-1.

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200410061400.14231-2-wgong@codeaurora.org
2020-04-22 09:43:29 +03:00
Wen Gong
d81686d333 ath10k: disable TX complete indication of htt for sdio
For sdio chip, it is high latency bus, all the TX packet's content will
be tranferred from HOST memory to firmware memory via sdio bus, then it
need much more memory in firmware than low latency bus chip, for low
latency chip, such as PCI-E, it only need to transfer the TX descriptor
via PCI-E bus to firmware memory. For sdio chip, reduce the complexity of
TX logic will help TX efficiency since its memory is limited, and it will
reduce the TX circle's time of each packet and then firmware will have more
memory for TX since TX complete also need memeory.

This patch disable TX complete indication from firmware for htt data
packet, it will not have TX complete indication from firmware to ath10k.
It will cut the cost of bus bandwidth of TX complete and make the TX
logic of firmware simpler, it results in significant performance
improvement on TX path.

Udp TX throughout is 130Mbps without this patch, and it arrives
400Mbps with this patch.

The downside of this patch is the command "iw wlan0 station dump" will
show 0 for "tx retries" and "tx failed" since all tx packet's status
is success.

This patch only effect sdio chip, it will not effect PCI, SNOC etc.

Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWPZ-1

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200212080415.31265-2-wgong@codeaurora.org
2020-04-09 17:48:50 +03:00
Wen Gong
cfee8793a7 ath10k: enable napi on RX path for sdio
For tcp RX, the quantity of tcp acks to remote is 1/2 of the quantity
of tcp data from remote, then it will have many small length packets
on TX path of sdio bus, then it reduce the RX packets's bandwidth of
tcp.

This patch enable napi on RX path, then the RX packet of tcp will not
feed to tcp stack immeditely from mac80211 since GRO is enabled by
default, it will feed to tcp stack after napi complete, if rx bundle
is enabled, then it will feed to tcp stack one time for each bundle
of RX. For example, RX bundle size is 32, then tcp stack will receive
one large length packet, its length is neary 1500*32, then tcp stack
will send a tcp ack for this large packet, this will reduce the tcp
acks ratio from 1/2 to 1/32. This results in significant performance
improvement for tcp RX.

Tcp rx throughout is 240Mbps without this patch, and it arrive 390Mbps
with this patch. The cpu usage has no obvious difference with and
without NAPI.

call stack for each RX packet on GRO path:
(skb length is about 1500 bytes)
  skb_gro_receive ([kernel.kallsyms])
  tcp4_gro_receive ([kernel.kallsyms])
  inet_gro_receive ([kernel.kallsyms])
  dev_gro_receive ([kernel.kallsyms])
  napi_gro_receive ([kernel.kallsyms])
  ieee80211_deliver_skb ([mac80211])
  ieee80211_rx_handlers ([mac80211])
  ieee80211_prepare_and_rx_handle ([mac80211])
  ieee80211_rx_napi ([mac80211])
  ath10k_htt_rx_proc_rx_ind_hl ([ath10k_core])
  ath10k_htt_rx_pktlog_completion_handler ([ath10k_core])
  ath10k_sdio_napi_poll ([ath10k_sdio])
  net_rx_action ([kernel.kallsyms])
  softirqentry_text_start ([kernel.kallsyms])
  do_softirq ([kernel.kallsyms])

call stack for napi complete and send tcp ack from tcp stack:
(skb length is about 1500*32 bytes)
 _tcp_ack_snd_check ([kernel.kallsyms])
 tcp_v4_do_rcv ([kernel.kallsyms])
 tcp_v4_rcv ([kernel.kallsyms])
 local_deliver_finish ([kernel.kallsyms])
 ip_local_deliver ([kernel.kallsyms])
 ip_rcv_finish ([kernel.kallsyms])
 ip_rcv ([kernel.kallsyms])
 netif_receive_skb_core ([kernel.kallsyms])
 netif_receive_skb_one_core([kernel.kallsyms])
 netif_receive_skb ([kernel.kallsyms])
 netif_receive_skb_internal ([kernel.kallsyms])
 napi_gro_complete ([kernel.kallsyms])
 napi_gro_flush ([kernel.kallsyms])
 napi_complete_done ([kernel.kallsyms])
 ath10k_sdio_napi_poll ([ath10k_sdio])
 net_rx_action ([kernel.kallsyms])
 __softirqentry_text_start ([kernel.kallsyms])
 do_softirq ([kernel.kallsyms])

Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWP-1.

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-02 11:59:41 +02:00
Erik Stromdahl
74ee571599 ath10k: add inline wrapper for htt_h2t_aggr_cfg_msg
This is done in order to make the *htt_h2t_aggr_cfg_msg* op align better
with the rest of the htt ops (whom all have inline wrappers).

It also adds support for the case when the op is missing (function
pointer is NULL).

As a result of this, the name of the 32 bit implementation in htt_tx.c
was changed and the function was made static.

Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-06-25 15:47:15 +03:00
Maharaja Kennadyrajan
473a4084e1 ath10k: Added support to reset HTT stats in debugfs
Reset HTT stats helps to get the aggregated HTT stats via
tracing and also we can clear the accumulated HTT stats with
this debugfs file.

Signed-off-by: Maharaja Kennadyrajan <mkenna@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-05-07 16:37:26 +03:00
Maharaja Kennadyrajan
14bf9217d6 ath10k: Extended the HTT stats support to retrieve Mu-MIMO related stats
Extended the bit mask value of the HTT stats to get the Mu-MIMO
related stats via tracing.

Signed-off-by: Maharaja Kennadyrajan <mkenna@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-05-07 16:37:15 +03:00
Wen Gong
33f9747291 ath10k: add fragmentation handler for high latency devices
On high latency devices (SDIO, USB) ath10k did not handle fragmented frames and
all fragmented frames on receive path were lost in ath10k. Even a simple ping
test failed with fragmentation.

The fragmented packets are decapsulated based on the security mode, then the PN
is checked and the fragmented frame is passed to mac80211.  mac80211 in
ieee80211_rx_h_defragment() will then combine the fragment frames and forward
to upper layers.

Tested on QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00007-QCARMSWP-1.

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-04-29 17:37:44 +03:00
Wen Gong
e1bddde973 ath10k: add struct for high latency PN replay protection
Add the struct for PN replay protection and fragment packet
handler.

Also fix the bitmask of HTT_RX_DESC_HL_INFO_MCAST_BCAST to match what's currently
used by SDIO firmware. The defines are not used yet so it's safe to modify
them. Remove the conflicting HTT_RX_DESC_HL_INFO_FRAGMENT as
it's not either used in ath10k.

Tested on QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00007-QCARMSWP-1.

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-04-29 17:37:13 +03:00
Abhishek Ambure
6ddc3860a5 ath10k: add support for ack rssi value of data tx packets
In WCN3990, WMI_TLV_SERVICE_TX_DATA_MGMT_ACK_RSSI service Indicates that
the firmware has the capability to send the RSSI value of the ACK for all
data and management packets transmitted.

If WMI_RSRC_CFG_FLAG_TX_ACK_RSSI is set in host capability then firmware
sends RSSI value in "data" tx completion event. Host extracts ack rssi
values of data packets from their tx completion event.

Tested HW: WCN3990
Tested FW: WLAN.HL.2.0-01617-QCAHLSWMTPLZ-1

Signed-off-by: Abhishek Ambure <aambure@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-02-26 14:58:06 +02:00
Kalle Valo
f0553ca9ce ath10k: switch to use SPDX license identifiers
Use SPDX identifiers everywhere in ath10k.

Makefile was incorrectly marked in commit b24413180f56 ("License cleanup: add
SPDX GPL-2.0 license identifier to files with no license"), fix that as well.

Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-02-20 10:33:00 +02:00
Kalle Valo
385bd8816c ath10k: align ath10k_htt_txbuf structures
With W=1 GCC warns:

drivers/net/wireless/ath/ath10k/htt.h:1746:1: warning: alignment 1 of 'struct ath10k_htt_txbuf_32' is less than 4 [-Wpacked-not-aligned]
drivers/net/wireless/ath/ath10k/htt.h:1753:1: warning: alignment 1 of 'struct ath10k_htt_txbuf_64' is less than 4 [-Wpacked-not-aligned]

Fix that by using __align(4). Compile tested only.

Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-02-12 20:49:35 +02:00
Manikanta Pubbisetty
bb31b7cb10 ath10k: report tx airtime provided by fw
If supported, update transmit airtime in mac80211 with the airtime
values reported by the firmware. TX airtime of the PPDU is reported
via HTT data TX completion indication message.

A new service flag 'WMI_SERVICE_REPORT_AIRTIME' is added to advertise
the firmware support. For firmwares which do not support this feature,
TX airtime is calculated in the driver using TX bitrate.

Hardwares tested : QCA9984
Firmwares tested : 10.4-3.6.1-00841

Signed-off-by: Manikanta Pubbisetty <mpubbise@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-02-12 20:48:12 +02:00
Govind Singh
5cbb117477 ath10k: Add support for extended HTT aggr msg support
HTT aggr message parameter in HL2.0 fw are different in comparison
to legacy fw version. Fill correct HTT aggr msg parameter for
targets using HL2.0 firmware.

Signed-off-by: Govind Singh <govinds@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-02-04 17:51:39 +02:00
YueHaibing
4be3b05e7a ath10k: remove duplicated includes
remove duplicated include from ath10k driver.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-10-01 16:57:18 +03:00
Erik Stromdahl
f88d493450 ath10k: htt: High latency RX support
Special HTT RX handling for high latency interfaces.

Since no DMA physical addresses are used in the RX ring
config message (this is not supported by the high latency
devices), no RX ring is allocated.
All RX skb's are allocated by the driver and passed directly
to mac80211 in the HTT RX indication handler.

A nice side effect of this is that no huge buffer will be
allocated with dma_alloc_coherent. On embedded systems with
limited memory resources, the allocation of the RX ring is
prone to fail.

Some tweaks made to "make it work":

Removal of protected bit in 802.11 header frame control field.
The chipset seems to do hw decryption but the frame_control
protected bit is still set.

This is necessary for mac80211 not to drop the frame.

Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-09-06 19:15:26 +03:00
Balaji Pothunoori
c7fd8d237e ath10k: average ack rssi support for data frames
Average ack rssi value is weighted average of ack rssi for
no of msdu's has been sent.
This feature is enabled by the host driver if firmware is capable.
After receiving event from host, firmware allocates the necessary
memory to store the ack_rssi for data packets during the init time.

After each successful transmission, If tx completion status is OK
and 24th bit is set in HTT message header then host will fetch the
ack_rssi else host can ignore the ack_rssi field.

Signed-off-by: Balaji Pothunoori <bpothuno@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-09-06 19:05:13 +03:00
Rakesh Pillai
20529b33aa ath10k: enable hw checksum for wcn3990
By default ath10k driver enables the support for HW_CHECKSUM
(NETIF_F_HW_CSUM). Since the TCP/UDP checksum calculation is not enabled
in the wcn3990 firmware the checksum is incorrect in the TCP/UDP packets
and all patckets are dropped. But due note that wcn3990 support in
ath10k is still incomplete so this isn't a critical fix (yet).

Enable hw checksum calculations in wcn3990 hardware by
setting the proper flags in msdu descriptor tso flags.

Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-04-24 08:57:48 +03:00
Erik Stromdahl
9a5511d5f2 ath10k: add inlined wrappers for htt rx ops
Added for the same reason as the TX wrappers.

Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-04-19 19:19:34 +03:00
Erik Stromdahl
5df6e13130 ath10k: add inlined wrappers for htt tx ops
These wrappers makes the HTT ops align better with the HIF ops
(where similar wrappers are used).

It also makes it easier for a target to have unsupported ops
(by letting the corresponding function pointer be NULL).

Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-04-19 19:19:28 +03:00
Venkateswara Naralasetty
235b9c4276 ath10k: Add tx ack signal support for management frames
This patch add support to get RSSI from acknowledgment
frames for transmitted management frames.

hardware_used: QCA4019, QCA9984.
firmware version: 10.4-3.5.3-00052.

Signed-off-by: Venkateswara Naralasetty <vnaralas@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-04-19 18:47:00 +03:00
Kalle Valo
a3e712b74b ath10k: fix recently introduced checkpatch warnings
Checkpatch found these issues:

drivers/net/wireless/ath/ath10k/ce.h:324: Please use a blank line after function/struct/union/enum declarations
drivers/net/wireless/ath/ath10k/core.c:1321: Please don't use multiple blank lines
drivers/net/wireless/ath/ath10k/htt.h:1859: Please use a blank line after function/struct/union/enum declarations

Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-16 16:34:42 +02:00
Erik Stromdahl
fbea11c8d5 ath10k: remove unused prototype
The function does not exist and thus, the prototype can be removed.

Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-09 10:18:27 +02:00
Kalle Valo
8b1083d618 ath10k: update copyright year
Update year for Qualcomm Atheros, Inc. copyrights.

Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-12-27 12:22:01 +02:00
Govind Singh
a91a626baa ath10k: Add paddrs_ring_64 support for 64bit target
paddrs_ring_64 holds the physical device address of the
rx buffers that host SW provides for the MAC HW to fill.
Since this field is used in rx ring setup and rx ring
replenish in rx data path. Define separate methods
for handling 64 bit ring paddr and attach them dynamically
based on target_64bit hw param flag. Use u64 type
while popping paddr from the rx hash table for 64bit target.

Signed-off-by: Govind Singh <govinds@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-12-27 12:06:31 +02:00
Govind Singh
bb8d0d15fc ath10k: Add hw param for rx ring size support
WCN3990 uses larger ring size in comparison to existing
ring size value.
Add rx ring size hw param for supporting different rx ring
size across multiple target.

Signed-off-by: Govind Singh <govinds@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-12-27 12:06:23 +02:00
Govind Singh
e62ee5c381 ath10k: Add support for htt_data_tx_desc_64 descriptor
WCN3990 target uses 64 bit frags_paddr in htt tx descriptor,
which holds the physical address of SKB fragments in tx data path.

In order to support 64 bit bit frags_paddr in htt tx descriptor, define
htt_data_tx_desc_64 descriptor and ath10k_htt_tx_64 method for handling
tx data path with new descriptor fields.

Signed-off-by: Govind Singh <govinds@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-12-27 12:06:11 +02:00
Govind Singh
71ad709610 ath10k: Add support for 64 bit HTT frag descriptor
WCN3990 target uses 64 bit frag descriptor and more
fields in TSO flag.
Add support for 64 bit HTT frag descriptor.

Signed-off-by: Govind Singh <govinds@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-12-27 12:06:02 +02:00
Govind Singh
9abe68535a ath10k: Add support for 64 bit htt rx ring cfg
WCN3900 target uses 64bit rx_ring_base_paddr and
fw_idx_shadow_reg_paddr fields in HTT rx ring cfg message.
These address points to the memory region where remote
ring empty buffers are allocated.
In order to add 64 bit htt rx ring cfg, define separate
64 bit htt rx ring cfg message and attach it in runtime
based on target_64bit hw param flag.

Signed-off-by: Govind Singh <govinds@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-12-27 12:05:53 +02:00
Govind Singh
3b0b55b19d ath10k: Add support for 64 bit HTT in-order indication msg
WCN3990 target use 64bit msdu address in htt in-order
indication message. Add support for 64 bit msdu address in
HTT_T2H_MSG_TYPE_RX_IN_ORD_PADDR_IND message.

Signed-off-by: Govind Singh <govinds@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-12-27 12:05:44 +02:00
Anilkumar Kolli
e8123bb74c ath10k: add per peer tx stats support for 10.2.4
10.2.4 firmware branch (used in QCA988X) does not support
HTT_10_4_T2H_MSG_TYPE_PEER_STATS and that's why ath10k does not provide
tranmission rate statistics to user space, instead it just shows
hardcoded 6 Mbit/s. But pktlog firmware facility provides per peer tx
statistics. The firmware sends one pktlog event for every four
PPDUs per peer, which include:

* successful number of packets and bytes transmitted
* number of packets and bytes dropped
* retried number of packets and bytes
* rate info per ppdu

Firmware supports WMI_SERVICE_PEER_STATS, pktlog is enabled through
ATH10K_FLAG_PEER_STATS, which is nowadays enabled by default in ath10k.

This patch does not impact throughput.

Tested on QCA9880 with firmware version 10.2.4.70.48. This should also
work with firmware branch 10.2.4-1.0-00029

Parse peer stats from pktlog packets and update the tx rate information
per STA. This way user space can query about transmit rate with iw:

$iw wlan0 station dump
Station 3c:a9:f4:72:bb:a4 (on wlan1)
        inactive time:  8210 ms
        rx bytes:       9166
        rx packets:     44
        tx bytes:       1105
        tx packets:     9
        tx retries:     0
        tx failed:      1
        rx drop misc:   3
        signal:         -75 [-75, -87, -88] dBm
        signal avg:     -75 [-75, -85, -88] dBm
        tx bitrate:     39.0 MBit/s MCS 10
        rx bitrate:     26.0 MBit/s MCS 3
        rx duration:    23250 us
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       short
        WMM/WME:        yes
        MFP:            no
        TDLS peer:      no
        DTIM period:    2
        beacon interval:100
        short preamble: yes
        short slot time:yes
        connected time: 22 seconds

Signed-off-by: Anilkumar Kolli <akolli@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-12-14 17:26:41 +02:00
Rajkumar Manoharan
deba1b9ea6 ath10k: unify rx processing in napi_poll
With current NAPI implementation, NAPI poll can deliver more frames
to net core than allotted budget. This may cause warning in napi_poll.
Remaining quota is not accounted, while processing amsdus in
rx_in_ord_ind and rx_ind queue. Adding num_msdus at last can not
prevent delivering more frames to net core. With this change,
all amdus from both in_ord_ind and rx_ind queues are processed and
enqueued into common skb list instead of delivering into mac80211.
Later msdus from common queue are dequeued and delivered depends on
quota availability. This change also simplifies the rx processing in
napi poll routine.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-12-14 17:20:28 +02:00
Marcin Rokicki
37ff1b0df3 ath10k: clean header files from bad block comments
Fix output from checkpatch.pl like:

 Block comments use a trailing */ on a separate line

Signed-off-by: Marcin Rokicki <marcin.rokicki@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-04-05 10:45:21 +03:00
Kalle Valo
019e4280fa ath10k: use names in function definition arguments
Fixes new checkpatch warnings:

drivers/net/wireless/ath/ath10k/htt.h:1823: function definition argument 'struct sk_buff *' should also have an identifier name
drivers/net/wireless/ath/ath10k/wmi.h:6607: function definition argument 'struct wmi_start_scan_arg *' should also have an identifier name

Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07 10:43:25 +02:00
Kalle Valo
b2d6041568 ath10k: prefer unsigned int over just unsigned
Fixes new checkpatch warnings:

drivers/net/wireless/ath/ath10k/htt.h:1639: Prefer 'unsigned int' to bare use of 'unsigned'
drivers/net/wireless/ath/ath10k/htt.h:1660: Prefer 'unsigned int' to bare use of 'unsigned'

Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2017-02-07 10:43:21 +02:00
Mohammed Shafi Shajakhan
9ec34a8619 ath10k: fix Tx DMA alloc failure during continuous wifi down/up
With maximum number of vap's configured in a two radio supported
systems of ~256 Mb RAM, doing a continuous wifi down/up and
intermittent traffic streaming from the connected stations results
in failure to allocate contiguous memory for tx buffers. This results
in the disappearance of all VAP's and a manual reboot is needed as
this is not a crash (or) OOM(for OOM killer to be invoked). To address
this allocate contiguous memory for tx buffers one time and re-use them
until the modules are unloaded but this results in a slight increase in
memory footprint of ath10k when the wifi is down, but the modules are
still loaded. Also as of now we use a separate bool 'tx_mem_allocated'
to keep track of the one time memory allocation, as we cannot come up
with something like 'ath10k_tx_{register,unregister}' before
'ath10k_probe_fw' is called as 'ath10k_htt_tx_alloc_cont_frag_desc'
memory allocation is dependent on the hw_param 'continuous_frag_desc'

a) memory footprint of ath10k without the change

lsmod | grep ath10k
ath10k_core           414498  1 ath10k_pci
ath10k_pci             38236  0

b) memory footprint of ath10k with the change

ath10k_core           414980  1 ath10k_pci
ath10k_pci             38236  0

Memory Failure Call trace:

hostapd: page allocation failure: order:6, mode:0xd0
 [<c021f150>] (__dma_alloc_buffer.isra.23) from
[<c021f23c>] (__alloc_remap_buffer.isra.26+0x14/0xb8)
[<c021f23c>] (__alloc_remap_buffer.isra.26) from
[<c021f664>] (__dma_alloc+0x224/0x2b8)
[<c021f664>] (__dma_alloc) from [<c021f810>]
(arm_dma_alloc+0x84/0x90)
[<c021f810>] (arm_dma_alloc) from [<bf954764>]
(ath10k_htt_tx_alloc+0xe0/0x2e4 [ath10k_core])
[<bf954764>] (ath10k_htt_tx_alloc [ath10k_core]) from
[<bf94e6ac>] (ath10k_core_start+0x538/0xcf8 [ath10k_core])
[<bf94e6ac>] (ath10k_core_start [ath10k_core]) from
[<bf947eec>] (ath10k_start+0xbc/0x56c [ath10k_core])
[<bf947eec>] (ath10k_start [ath10k_core]) from
[<bf8a7a04>] (drv_start+0x40/0x5c [mac80211])
[<bf8a7a04>] (drv_start [mac80211]) from [<bf8b7cf8>]
(ieee80211_do_open+0x170/0x82c [mac80211])
[<bf8b7cf8>] (ieee80211_do_open [mac80211]) from
[<c056afc8>] (__dev_open+0xa0/0xf4)
[21053.491752] Normal: 641*4kB (UEMR) 505*8kB (UEMR) 330*16kB (UEMR)
126*32kB (UEMR) 762*64kB (UEMR) 237*128kB (UEMR) 1*256kB (M) 0*512kB
0*1024kB 0*2048kB 0*4096kB = 95276kB

Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-12-01 13:13:55 +02:00
Anilkumar Kolli
cec17c3821 ath10k: add per peer htt tx stats support for 10.4
Per peer tx stats are part of 'HTT_10_4_T2H_MSG_TYPE_PEER_STATS'
event, Firmware sends one HTT event for every four PPDUs.
HTT payload has success pkts/bytes, failed pkts/bytes, retry
pkts/bytes and rate info per ppdu.
Peer stats are enabled through 'WMI_SERVICE_PEER_STATS',
which are nowadays enabled by default.

Parse peer stats and update the tx rate information per STA.

tx rate, Peer stats are tested on QCA4019 with Firmware version
10.4-3.2.1-00028.

Signed-off-by: Anilkumar Kolli <akolli@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-11-23 21:40:02 +02:00
Joe Perches
e13dbead97 ath10k: spelling and miscellaneous neatening
Correct some trivial comment typos.
Remove unnecessary parentheses in a long line.

Signed-off-by: Joe Perches <joe@perches.com>
[kvalo@qca.qualcomm.com: drop the change for return]
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-09-27 15:00:48 +03:00
Rajkumar Manoharan
3c97f5de1f ath10k: implement NAPI support
Add NAPI support for rx and tx completion. NAPI poll is scheduled
from interrupt handler. The design is as below

 - on interrupt
     - schedule napi and mask interrupts
 - on poll
   - process all pipes (no actual Tx/Rx)
   - process Rx within budget
   - if quota exceeds budget reschedule napi poll by returning budget
   - process Tx completions and update budget if necessary
   - process Tx fetch indications (pull-push)
   - push any other pending Tx (if possible)
   - before resched or napi completion replenish htt rx ring buffer
   - if work done < budget, complete napi poll and unmask interrupts

This change also get rid of two tasklets (intr_tq and txrx_compl_task).

Measured peak throughput with NAPI on IPQ4019 platform in controlled
environment. No noticeable reduction in throughput is seen and also
observed improvements in CPU usage. Approx. 15% CPU usage got reduced
in UDP uplink case.

DL: AP DUT Tx
UL: AP DUT Rx

IPQ4019 (avg. cpu usage %)

========
                TOT              +NAPI
              ===========      =============
TCP DL       644 Mbps (42%)    645 Mbps (36%)
TCP UL       673 Mbps (30%)    675 Mbps (26%)
UDP DL       682 Mbps (49%)    680 Mbps (49%)
UDP UL       720 Mbps (28%)    717 Mbps (11%)

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-09-09 14:49:47 +03:00
Vasanthakumar Thiagarajan
dc73787b8b ath10k: fix some of the macro definitions of HTT_RX_IND message
Only five bits are defined to pass tid information in HTT_RX_IND
message, so the mask which can be used to extract tid should be 0x1f
instead of the current 0x3f. Also, macros which can be used to extract
flush_valid and release_valid bits have to be left shifted one bit less
because these information follow the tid right after. This patch does
not really fix anything functionally because these macros are not used
currently.

Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-06-14 14:51:43 +03:00
Kalle Valo
77561f9394 ath10k: move htt_op_version to struct ath10k_fw_file
Preparation for testmode.c to use ath10k_core_fetch_board_data_api_n().

Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-04-20 20:00:27 +03:00
Kalle Valo
beeb1a302f ath10k: prefer kernel type 'u64' over 'u_int64_t'
Fixes checkpatch warnings:

drivers/net/wireless/ath/ath10k/htt.h:1477: Prefer kernel type 'u64' over 'u_int64_t'
drivers/net/wireless/ath/ath10k/htt.h:1480: Prefer kernel type 'u64' over 'u_int64_t'

Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-04-14 17:30:52 +03:00
Rajkumar Manoharan
5c86d97bcc ath10k: combine txrx and replenish task
Since tx completion and rx indication processing are moved out
of txrx tasklet and rx ring lock contention also removed from txrx
for rx_ind messages, it would be efficient to combine both replenish
and txrx tasks. Refill threshold is adjusted for both AP135 and AP148
(low and high end systems). With this adjustment in AP135, TCP DL is
improved from 603 Mbps to 620 Mbps and UDP DL is improved from 758 Mbps
to 803 Mbps. Also no watchdog are observed on UDP BiDi.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-04-04 17:03:21 +03:00
Rajkumar Manoharan
e3a91f877c ath10k: register ath10k_htt_htc_t2h_msg_handler
Except qca61x4 family chips (qca6164, qca6174), copy engine 5 is used
for receiving target to host htt messages. In follow up patch, CE5
descriptors will be reused. In such case, same API can not be used as
htc layer callback where the response messages will be freed at the end.
Hence register new API for HTC layer that free up received message and
keep the message handler common for both HTC and HIF layers.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-04-04 17:03:20 +03:00
Rajkumar Manoharan
3128b3d8a2 ath10k: speedup htt rx descriptor processing for rx_ind
In follow up patch, htt rx descriptors will be reused instead of
dealloc and refill. To achieve that htt rx indication messages
should not be deferred and should be processed in pci tasklet itself.
Also from rx indication message, mpdu_count alone is used. So it is
maintained as atomic variable and all rx amsdu handlers are done
processed from txrx tasklet. This change get rid of rx_compl_q usage.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-04-04 17:03:20 +03:00
Rajkumar Manoharan
59465fe46e ath10k: speedup htt rx descriptor processing for tx completion
To optimize CPU usage htt rx descriptors will be reused instead of
refilling it for htt rx copy engine (CE5). To support that all htt rx
indications should be processed at same context. FIFO queue is used
to maintain tx completion status for each msdu. This helps to retain
the order of tx completion.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-04-04 17:03:07 +03:00
Rajkumar Manoharan
cac085524c ath10k: move mgmt descriptor limit handle under mgmt_tx
Frames that are transmitted via MGMT_TX are using reserved descriptor
slots in firmware. This limitation is for the htt_mgmt_tx path itself,
not for mgmt frames per se. In 16 MBSSID scenario, these reserved slots
will be easy exhausted due to frequent probe responses. So for 10.4
based solutions, probe responses are limited by a threshold (24).

management tx path is separate for all except tlv based solutions. Since
tlv solutions (qca6174 & qca9377) do not support 16 AP interfaces, it is
safe to move management descriptor limitation check under mgmt_tx
function. Though CPU improvement is negligible, unlikely conditions or
never hit conditions in hot path can be avoided on data transmission.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-18 09:52:27 +02:00
Rajkumar Manoharan
2ce9b25cef ath10k: handle channel change htt event
Whenever firmware is configuring operating channel during scan or
home channel, channel change event will be indicated to host. In some
cases (device probe/ last vdev down), target will be configured to
default channel whereas host is unaware of target's operating channel.
This leads to packet drop due to unknown channel and kernel log will be
filled up with "no channel configured; ignoring frame(s)!". Fix that
by handling HTT_T2H_MSG_TYPE_CHAN_CHANGE event.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-18 09:49:39 +02:00