iv/linux

Author	SHA1	Message	Date
Suman Ghosh	4aa1d8f89b	octeontx2-pf: Fix ntuple rule creation to direct packet to VF with higher Rx queue than its PF It is possible to add an ntuple rule that directs packets to a VF whose number of queues is greater or less than its PF's. For example, a PF can have 2 Rx queues while a VF created on that PF has 8 Rx queues. As of today, such a rule is rejected because the requested queue number is checked against the PF's number of Rx queues. With this fix, the check is skipped if the action of an ntuple rule is to move a packet to a VF's queue. Also, a debug message is printed to make the user aware that it is the user's responsibility to cross-check that the requested queue number is valid on that VF. Fixes: `f0a1913f8a` ("octeontx2-pf: Add support for ethtool ntuple filters") Signed-off-by: Suman Ghosh <sumang@marvell.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231121165624.3664182-1-sumang@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-11-23 12:55:32 +01:00
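The decision described in the octeontx2-pf message above can be sketched as follows. This is only an illustration of the described behaviour, not the driver code: `queue_check_passes` and its parameters are hypothetical names, and the "queue index must be below the PF's Rx queue count" comparison is an assumption about how the original check works.

```python
def queue_check_passes(queue: int, pf_rx_queues: int, targets_vf: bool) -> bool:
    """Decide whether an ntuple rule's requested queue is accepted."""
    if targets_vf:
        # The PF does not know how many Rx queues the VF has, so the
        # check is skipped; validating the queue is left to the user.
        return True
    # Assumed form of the PF-side check: queue indices start at 0.
    return queue < pf_rx_queues


# PF with 2 Rx queues, VF with 8: queue 5 is only reachable via the VF.
print(queue_check_passes(5, pf_rx_queues=2, targets_vf=False))  # False
print(queue_check_passes(5, pf_rx_queues=2, targets_vf=True))   # True
```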
Paolo Abeni	7490a42020	Merge branch 'net-ethernet-renesas-rcar_gen4_ptp-add-v4h-support' Niklas Söderlund says: ==================== net: ethernet: renesas: rcar_gen4_ptp: Add V4H support This small series prepares rcar_gen4_ptp to be usable on both R-Car S4 and V4H. The only in-tree driver that makes use of this is rswitch on S4. A new Ethernet (R-Car Ethernet TSN) driver for V4H is on its way that will also make use of the rcar_gen4_ptp functionality. Patches 1-2 are small improvements to the existing driver, while patches 3-4 add V4H support. Finally, patch 5 turns rcar_gen4_ptp into a separate module to allow the gPTP functionality to be shared between the two users without having to duplicate the code in each. See each patch for changelog. ==================== Link: https://lore.kernel.org/r/20231121155306.515446-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-11-23 12:05:44 +01:00
Niklas Söderlund	8c1c66235e	net: ethernet: renesas: rcar_gen4_ptp: Break out to module The Gen4 gPTP support will be shared between the existing Renesas Ethernet Switch driver and the upcoming Renesas Ethernet-TSN driver. In preparation for this break out the gPTP support to its own module. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-11-23 12:02:49 +01:00
Niklas Söderlund	be5f81d37f	net: ethernet: renesas: rcar_gen4_ptp: Get clock increment from clock rate Instead of using hard-coded clock increment values for each SoC, derive the clock increment from the module clock. This is done in preparation to support a second platform, R-Car V4H, which uses a 200 MHz clock compared with the 320 MHz clock used on R-Car S4. Tested on both SoCs. S4 reports a clock of 320000000Hz which gives a value of 0x19000000. Documentation says a 320 MHz clock is used and the correct increment for that clock is 0x19000000. V4H reports a clock of 199999992Hz which gives a value of 0x2800001a. Documentation says a 200 MHz clock is used and the correct increment for that clock is 0x28000000. Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-11-23 12:02:49 +01:00
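The increment values quoted in the message above are consistent with a clock period in nanoseconds stored as a fixed-point number with 27 fractional bits. That format is inferred from the quoted numbers, not taken from the driver source, and `rate_to_increment` is a hypothetical name. A minimal sketch under that assumption:

```python
NSEC_PER_SEC = 1_000_000_000
FRACTION_BITS = 27  # assumption: inferred from 320 MHz -> 0x19000000


def rate_to_increment(rate_hz: int) -> int:
    """Return the clock period in ns as a truncated fixed-point value."""
    return (NSEC_PER_SEC << FRACTION_BITS) // rate_hz


print(hex(rate_to_increment(320_000_000)))  # 0x19000000 (S4 rate)
print(hex(rate_to_increment(199_999_992)))  # 0x2800001a (V4H rate)
```

The V4H result differs from the documented 0x28000000 only because the reported rate is 8 Hz below the nominal 200 MHz.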
Niklas Söderlund	46c361a046	net: ethernet: renesas: rcar_gen4_ptp: Prepare for shared register layout All known R-Car Gen4 SoCs share the same register layout. Rename the R-Car S4 specific identifiers so they can be shared with the upcoming R-Car V4H support. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-11-23 12:02:49 +01:00
Niklas Söderlund	9f3995707e	net: ethernet: renesas: rcar_gen4_ptp: Fail on unknown register layout Instead of printing a warning and proceeding with an unknown register layout return an error. The only call site is already prepared to propagate the error. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-11-23 12:02:49 +01:00
Niklas Söderlund	d73dcff9eb	net: ethernet: renesas: rcar_gen4_ptp: Remove incorrect comment The comment's intent was to indicate which function uses the enum. While upstreaming rcar_gen4_ptp the function was renamed, but this comment was left with the old function name. Instead of correcting the comment, remove it, as it adds little value. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-11-23 12:02:49 +01:00
Lech Perczak	99360d9620	net: usb: qmi_wwan: claim interface 4 for ZTE MF290 Interface 4 is used for the QMI interface in the stock firmware of MF28D, the router which uses the MF290 modem. Rebind it to qmi_wwan after freeing it up from the option driver. The proper interface mapping is: 0: QCDM, 1: (unknown), 2: AT (PCUI), 3: AT (Modem), 4: QMI
T: Bus=01 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 4 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=19d2 ProdID=0189 Rev= 0.00
S: Manufacturer=ZTE, Incorporated
S: Product=ZTE LTE Technologies MSM
C:* #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms
I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=2ms
E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=2ms
E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms
Cc: Bjørn Mork <bjorn@mork.no> Signed-off-by: Lech Perczak <lech.perczak@gmail.com> Link: https://lore.kernel.org/r/20231117231918.100278-3-lech.perczak@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-11-23 09:30:27 +01:00
Linus Torvalds	9b6de136b5	Merge tag 'loongarch-fixes-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix several build errors, a potential kernel panic, a cpu hotplug issue and update links in documentations" * tag 'loongarch-fixes-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: Docs/zh_CN/LoongArch: Update links in LoongArch introduction.rst Docs/LoongArch: Update links in LoongArch introduction.rst LoongArch: Implement constant timer shutdown interface LoongArch: Mark {dmw,tlb}_virt_to_page() exports as non-GPL LoongArch: Silence the boot warning about 'nokaslr' LoongArch: Add __percpu annotation for __percpu_read()/__percpu_write() LoongArch: Record pc instead of offset in la_abs relocation LoongArch: Explicitly set -fdirect-access-external-data for vmlinux LoongArch: Add dependency between vmlinuz.efi and vmlinux.efi	2023-11-22 10:20:17 -08:00
Linus Torvalds	05c8c94ed4	Merge tag 'hyperv-fixes-signed-20231121' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - One fix for the KVP daemon (Ani Sinha) - Fix for the detection of E820_TYPE_PRAM in a Gen2 VM (Saurabh Sengar) - Micro-optimization for hv_nmi_unknown() (Uros Bizjak) * tag 'hyperv-fixes-signed-20231121' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown() x86/hyperv: Fix the detection of E820_TYPE_PRAM in a Gen2 VM hv/hv_kvp_daemon: Some small fixes for handling NM keyfiles	2023-11-22 09:56:26 -08:00
Linus Torvalds	125b0bb95d	asm-generic: qspinlock: fix queued_spin_value_unlocked() implementation We really don't want to do atomic_read() or anything like that, since we already have the value, not the lock. The whole point of this is that we've loaded the lock from memory, and we want to check whether the value we loaded was a locked one or not. The main use of this is the lockref code, which loads both the lock and the reference count in one atomic operation, and then works on that combined value. With the atomic_read(), the compiler would pointlessly spill the value to the stack, in order to then be able to read it back "atomically". This is the qspinlock version of commit `c6f4a90022` ("asm-generic: ticket-lock: Optimize arch_spin_value_unlocked()") which fixed this same bug for ticket locks. Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/all/CAHk-=whNRv0v6kQiV5QO6DJhjH4KEL36vWQ6Re8Csrnh4zbRkQ@mail.gmail.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-11-22 09:32:49 -08:00
Ping-Ke Shih	52471877a2	wifi: rtw89: 8922a: read efuse content from physical map The thermal and bias calibration values are programmed in the invariable physical map. Read them into the driver; they will be set to registers later. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231117024029.113845-7-pkshih@realtek.com	2023-11-22 17:51:17 +02:00
Ping-Ke Shih	c7ccb2402e	wifi: rtw89: 8922a: read efuse content via efuse map struct from logic map Define efuse map struct of RTW89_EFUSE_BLOCK_RF block, and read needed data from efuse logic map into driver. Also, with efuse power-on state, read MAC address via register interface according to HCI interface. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231117024029.113845-6-pkshih@realtek.com	2023-11-22 17:51:16 +02:00
Ping-Ke Shih	e102ff4b35	wifi: rtw89: 8852c: read RX gain offset from efuse for 6GHz channels Read calibration values of RX gain offset from efuse, and set them to registers to normalize RX gain for all hardware modules. Then, PHY dynamic mechanism can get expected values to adjust hardware parameters to yield expected performance. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231117024029.113845-5-pkshih@realtek.com	2023-11-22 17:51:16 +02:00
Ping-Ke Shih	f28eab6ae4	wifi: rtw89: mac: add to access efuse for WiFi 7 chips MAC address, hardware type, calibration values, etc. are stored in efuse, so we read them at probe stage and use them as capabilities to register hardware. There are two physical efuses: one is the main efuse for the digital hardware part, and the other is for the analog part. Because they are very similar, we only describe the main efuse below. The main efuse is split into two regions: one for the logic map, and the other for the physical map. For both regions, we use the same method to read data, but need an additional parser to get the logic map. To allow read operations, we need to switch the power state to active, and return to idle state after reading. For WiFi 7 chips, we introduce efuse blocks to define feature groups more easily, and these blocks are discontinuous. For example, the RF block spans 0x1_0000 ~ 0x1_0240, and the next block, PCIE_SDIO, starts at 0x2_0000. By comparison, the old layout used by WiFi 6 chips has only a single logic map, so it is a little hard to add a new field to a group unless room was reserved in advance. The relationship between efuse, region and block is shown below:

                                                     (logical map)
+------------+            +----------------+      +-----------------+
| main efuse |            |    region 1    |      | block 0x1_0000~ |
| (digital)  |            |(to logical map)|      +-----------------+
|            |            |                |  =>  +-----------------+
|            |     =>     |                |      | block 0x2_0000~ |
|            |            |                |      +-----------------+
|            |            |----------------|               :
|            |            |    region 2    |
+------------+            +----------------+

+------------+                                    +-----------------+
| 2nd efuse  |      ======================>       | block 0x7_0000~ |
| (analog)   |                                    +-----------------+
+------------+

The parser converting from raw data to logic map decodes block page, block page offset, and word_en bits. Each word_en bit indicates two following bytes as data of the logic map, so the four word_en bits together can represent eight bytes. Thus, the block page offset is 8-byte aligned.
The layout of a tuple is shown below:

+--------+--------+--------+--------+--------+--------+
| fixed 3 byte header      |        |        |        |
|                          |        |        |        |
| [19:17] block_page       |        |   ...  |        |
| [16:4]  block_page_offset|        |        |        |
| [3:0]   word_en          |   ^    |    ^   |        |
+----|---+--------+--------+---|----+----|---+--------+
     |                         |         |
     +-------------------------+---------+
            a word_en bit indicates two bytes as data

For example,
  block_page = 0x3
  block_page_offset = 0x80 (must be 8-byte aligned)
  word_en = 0x6 (b'0110; 0 means data is present)
  following 4 bytes = 34 56 78 90
Then,
  0x3_0080 = 34 56
  0x3_0086 = 78 90

A special block page is RTW89_EFUSE_BLOCK_ADIE (7), which uses a different but similar format because its real efuse size is smaller than the main efuse. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231117024029.113845-4-pkshih@realtek.com	2023-11-22 17:51:16 +02:00
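The tuple format described in the efuse message above can be sketched as a small decoder. This is an illustration built only from the bit ranges and the worked example in the message; `split_header` and `decode_tuple` are hypothetical names, the way the three header bytes are combined into one integer is left out on purpose, and composing the address as `(block_page << 16) | offset` is an assumption that mirrors the `0x3_0080` notation.

```python
def split_header(header: int) -> tuple:
    """Split a header value into (block_page, block_page_offset, word_en)."""
    block_page = (header >> 17) & 0x7          # bits [19:17]
    block_page_offset = (header >> 4) & 0x1FFF  # bits [16:4]
    word_en = header & 0xF                      # bits [3:0]
    return block_page, block_page_offset, word_en


def decode_tuple(block_page: int, block_page_offset: int,
                 word_en: int, payload: bytes) -> dict:
    """Map one tuple's payload bytes to logic-map addresses."""
    if block_page_offset % 8:
        raise ValueError("block_page_offset must be 8-byte aligned")
    base = (block_page << 16) | block_page_offset  # assumed address notation
    words = {}
    pos = 0
    for bit in range(4):
        if word_en & (1 << bit):
            continue  # bit set: this 2-byte word carries no data
        words[base + 2 * bit] = payload[pos:pos + 2]
        pos += 2
    return words


# Worked example from the commit message.
result = decode_tuple(0x3, 0x80, 0x6, bytes.fromhex("34567890"))
for addr, data in sorted(result.items()):
    print(hex(addr), data.hex())  # 0x30080 3456, then 0x30086 7890
```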
Ping-Ke Shih	88e6a923bb	wifi: rtw89: mac: use mac_gen pointer to access about efuse Use function pointers to abstract efuse access, and introduce a new function to convert the efuse power state, which is needed by WiFi 7 chips. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231117024029.113845-3-pkshih@realtek.com	2023-11-22 17:51:16 +02:00
Ping-Ke Shih	c0a04552e3	wifi: rtw89: 8922a: add 8922A basic chip info 8922A is an 802.11be chip that supports the 2/5/6 GHz bands with 160 MHz bandwidth. Introduce the basic info such as firmware file name, some hardware addresses and sizes, supported spatial streams, TX descriptor and so on, so that more attributes can be added by later patches. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231117024029.113845-2-pkshih@realtek.com	2023-11-22 17:51:16 +02:00
Bjorn Helgaas	f60df12aaa	wifi: rtlwifi: drop unused const_amdpci_aspm Remove the unused "const_amdpci_aspm" member of struct rtl_pci and struct rtl_ps_ctl. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231116180529.52752-1-helgaas@kernel.org	2023-11-22 17:50:36 +02:00
Su Hui	a85198c9f0	wifi: mwifiex: mwifiex_process_sleep_confirm_resp(): remove unused priv variable Clang static analyzer complains that value stored to 'priv' is never read. 'priv' is useless, so remove it to save space. Signed-off-by: Su Hui <suhui@nfschina.com> Acked-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231115092328.1048103-1-suhui@nfschina.com	2023-11-22 17:50:09 +02:00
Zong-Zhe Yang	c212abfbd1	wifi: rtw89: regd: update regulatory map to R65-R44 Sync Realtek Regulatory R44 and Realtek Channel Plan R65. Configure 6 GHz field of Realtek regd on SG, TW, GD. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231114091359.50664-4-pkshih@realtek.com	2023-11-22 17:48:00 +02:00
Zong-Zhe Yang	b2774a916a	wifi: rtw89: regd: handle policy of 6 GHz according to BIOS According to BIOS configuration of Realtek ACPI DSM function 4, RTW89_ACPI_DSM_FUNC_6G_BP, we handle the regd policy of 6 GHz. Policy defines two modes as below. 1. `BLOCK` mode: The countries in configured list are blocked. 2. `ALLOW` mode: _Only_ the countries in configured list are allowed. (i.e. others are all blocked.) Then, when receiving regulatory notification at runtime, if 6 GHz is blocked on the country, 6 GHz channels will be disabled. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231114091359.50664-3-pkshih@realtek.com	2023-11-22 17:48:00 +02:00
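The two policy modes in the message above reduce to a membership test. A minimal sketch, with hypothetical names (`Policy`, `is_6ghz_blocked`); the numeric encoding of the modes is not given in the message, so an enum with placeholder values stands in for it:

```python
from enum import Enum


class Policy(Enum):
    # Placeholder values; the real encoding is not stated in the message.
    BLOCK = "block"
    ALLOW = "allow"


def is_6ghz_blocked(mode: Policy, configured: set, country: str) -> bool:
    """Return True if 6 GHz must be disabled for `country`."""
    listed = country in configured
    # BLOCK: listed countries are blocked.
    # ALLOW: only listed countries are allowed, all others are blocked.
    return listed if mode is Policy.BLOCK else not listed


print(is_6ghz_blocked(Policy.BLOCK, {"SG", "TW"}, "SG"))  # True
print(is_6ghz_blocked(Policy.ALLOW, {"SG", "TW"}, "SG"))  # False
print(is_6ghz_blocked(Policy.ALLOW, {"SG", "TW"}, "US"))  # True
```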
Zong-Zhe Yang	665ecff7dd	wifi: rtw89: acpi: process 6 GHz band policy from DSM Realtek ACPI DSM func 4, RTW89_ACPI_DSM_FUNC_6G_BP, accepts a configuration via ACPI buffer as below.

| index | description   |
-------------------------
| [0-2] | signature     |
| [3]   | reserved      |
| [4]   | policy mode   |
| [5]   | country count |
| [6-]  | country list  |

Through this function, BIOS can indicate to allow/block 6 GHz on some specific countries. Still, driver should follow regd first before taking this configuration into account. Besides, add a bit in debug mask for ACPI. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231114091359.50664-2-pkshih@realtek.com	2023-11-22 17:48:00 +02:00
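The buffer layout in the message above can be read with a few slices. This sketch is not the driver's parser: `parse_6ghz_policy` is a hypothetical name, and treating each country-list entry as a 2-byte ASCII country code is an assumption, since the message does not state the entry size.

```python
def parse_6ghz_policy(buf: bytes) -> dict:
    """Split the DSM buffer into its documented fields."""
    if len(buf) < 6:
        raise ValueError("buffer shorter than the fixed 6-byte header")
    count = buf[5]                   # [5] country count
    entries = buf[6:6 + 2 * count]   # [6-] country list, assumed 2 bytes each
    if len(entries) != 2 * count:
        raise ValueError("country list is truncated")
    return {
        "signature": buf[0:3],       # [0-2]; byte [3] is reserved
        "policy_mode": buf[4],       # [4]
        "countries": [entries[i:i + 2].decode("ascii")
                      for i in range(0, len(entries), 2)],
    }


# Made-up buffer for illustration only.
sample = bytes([0xAA, 0xBB, 0xCC, 0x00, 0x01, 0x02]) + b"SGTW"
print(parse_6ghz_policy(sample)["countries"])  # ['SG', 'TW']
```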
Dmitry Antipov	2c4e9acbe3	wifi: rtlwifi: simplify rtl_action_proc() and rtl_tx_agg_start() Since 'drv_priv' is an in-place member allocated at the end of 'struct ieee80211_sta', it can't be NULL and so relevant checks in 'rtl_action_proc()' and 'rtl_tx_agg_start()' may be dropped. Compile tested only. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231113144734.197359-2-dmantipov@yandex.ru	2023-11-22 16:57:39 +02:00
Heiner Kallweit	6a26310273	Revert "net: r8169: Disable multicast filter for RTL8168H and RTL8107E" This reverts commit `efa5f1311c`. I couldn't reproduce the reported issue. What I did, based on a pcap packet log provided by the reporter: - Used same chip version (RTL8168h) - Set MAC address to the one used on the reporter's system - Replayed the EAPOL unicast packet that, according to the reporter, was filtered out by the mc filter. The packet was properly received. Therefore the root cause of the reported issue seems to be somewhere else. Disabling mc filtering completely for the most common chip version is a quite big hammer. Therefore revert the change and wait for further analysis results from the reporter. Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-22 12:12:02 +00:00
D. Wythe	e6d71b437a	net/smc: avoid data corruption caused by decline We found a data corruption issue during testing of SMC-R on Redis applications. The benchmark has a low probability of reporting a strange error as shown below. "Error: Protocol error, got "\xe2" as reply type byte" Finally, we found that the retrieved error data was as follows: 0xE2 0xD4 0xC3 0xD9 0x04 0x00 0x2C 0x20 0xA6 0x56 0x00 0x16 0x3E 0x0C 0xCB 0x04 0x02 0x01 0x00 0x00 0x20 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xE2 It is quite obvious that this is an SMC DECLINE message, which means that the application received an SMC protocol message. We found that this was caused by the following situation:

client                                  server
        ¦  clc proposal
        ------------->
        ¦  clc accept
        <-------------
        ¦  clc confirm
        ------------->
wait llc confirm
                        send llc confirm
        ¦failed llc confirm
        ¦   x------
(after 2s)timeout
                        wait llc confirm rsp

wait decline

(after 1s) timeout
                        (after 2s) timeout
        ¦   decline
        -------------->
        ¦   decline
        <--------------

As a result, a decline message was sent in the implementation, and this message was read from TCP by the already-fallback connection. This patch doubles the client timeout to 2x the server value. With this simple change, the Decline messages should never cross or collide (during Confirm link timeout). This issue requires an immediate solution, since the protocol updates involve a more long-term solution. Fixes: `0fb0b02bd6` ("net/smc: adapt SMC client code to use the LLC flow") Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-22 12:10:19 +00:00
Nguyen Dinh Phi	84d2db91f1	nfc: virtual_ncidev: Add variable to check if ndev is running syzbot reported a memory leak that happens when an skb is added to send_buff after the virtual nci device is closed. This patch adds a variable to track whether the ndev is running before handling a new skb in the send function. Signed-off-by: Nguyen Dinh Phi <phind.uet@gmail.com> Reported-by: syzbot+6eb09d75211863f15e3e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/00000000000075472b06007df4fb@google.com Reviewed-by: Bongsu Jeon Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-22 10:55:48 +00:00
Gan, Yi Fang	750011e239	net: stmmac: Add support for HW-accelerated VLAN stripping The current implementation supports driver level VLAN tag stripping only. The feature is always on if CONFIG_VLAN_8021Q is enabled in the kernel config and is not user configurable. This patch adds support for MAC level VLAN tag stripping, which can be configured through ethtool. If rx-vlan-offload is off, the VLAN tag will be stripped by the driver. If rx-vlan-offload is on, the VLAN tag will be stripped by the MAC. Command: ethtool -K <interface> rx-vlan-offload off \| on Signed-off-by: Lai Peter Jun Ann <jun.ann.lai@intel.com> Signed-off-by: Gan, Yi Fang <yi.fang.gan@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-22 10:54:14 +00:00
Murali Karicheri	36b20fcdd9	net: hsr: Add support for MC filtering at the slave device When the MC (multicast) list is updated by the networking layer due to a user command, as well as when the allmulti flag is set, it needs to be passed to the enslaved Ethernet devices. This patch allows this to happen by implementing the ndo_change_rx_flags() and ndo_set_rx_mode() API calls, which in turn pass it to the slave devices using existing API calls. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-22 10:51:32 +00:00
Uros Bizjak	18286883e7	x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown() Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old in hv_nmi_unknown(). On x86 the CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after CMPXCHG. The generated asm code improves from:

  3e:  65 8b 15 00 00 00 00    mov    %gs:0x0(%rip),%edx
  45:  b8 ff ff ff ff          mov    $0xffffffff,%eax
  4a:  f0 0f b1 15 00 00 00    lock cmpxchg %edx,0x0(%rip)
  51:  00
  52:  83 f8 ff                cmp    $0xffffffff,%eax
  55:  0f 95 c0                setne  %al

to:

  3e:  65 8b 15 00 00 00 00    mov    %gs:0x0(%rip),%edx
  45:  b8 ff ff ff ff          mov    $0xffffffff,%eax
  4a:  f0 0f b1 15 00 00 00    lock cmpxchg %edx,0x0(%rip)
  51:  00
  52:  0f 95 c0                setne  %al

No functional change intended. Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20231114170038.381634-1-ubizjak@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20231114170038.381634-1-ubizjak@gmail.com>	2023-11-22 03:47:44 +00:00
Jakub Kicinski	53475287da	Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-11-21 We've added 85 non-merge commits during the last 12 day(s) which contain a total of 63 files changed, 4464 insertions(+), 1484 deletions(-). The main changes are: 1) Huge batch of verifier changes to improve BPF register bounds logic and range support along with a large test suite, and verifier log improvements, all from Andrii Nakryiko. 2) Add a new kfunc which acquires the associated cgroup of a task within a specific cgroup v1 hierarchy where the latter is identified by its id, from Yafang Shao. 3) Extend verifier to allow bpf_refcount_acquire() of a map value field obtained via direct load which is a use-case needed in sched_ext, from Dave Marchevsky. 4) Fix bpf_get_task_stack() helper to add the correct crosstask check for the get_perf_callchain(), from Jordan Rome. 5) Fix BPF task_iter internals where lockless usage of next_thread() was wrong. The rework also simplifies the code, from Oleg Nesterov. 6) Fix uninitialized tail padding via LIBBPF_OPTS_RESET, and another fix for certain BPF UAPI structs to fix verifier failures seen in bpf_dynptr usage, from Yonghong Song. 7) Add BPF selftest fixes for map_percpu_stats flakes due to per-CPU BPF memory allocator not being able to allocate per-CPU pointer successfully, from Hou Tao. 8) Add prep work around dynptr and string handling for kfuncs which is later going to be used by file verification via BPF LSM and fsverity, from Song Liu. 9) Improve BPF selftests to update multiple prog_tests to use ASSERT_* macros, from Yuran Pereira. 10) Optimize LPM trie lookup to check prefixlen before walking the trie, from Florian Lehner. 11) Consolidate virtio/9p configs from BPF selftests in config.vm file given they are needed consistently across archs, from Manu Bretelle. 
12) Small BPF verifier refactor to remove register_is_const(), from Shung-Hsi Yu. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits) selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in vmlinux selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_obj_id selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bind_perm selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_tcp_ca selftests/bpf: reduce verboseness of reg_bounds selftest logs bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos) bpf: bpf_iter_task_next: use __next_thread() rather than next_thread() bpf: task_group_seq_get_next: use __next_thread() rather than next_thread() bpf: emit frameno for PTR_TO_STACK regs if it differs from current one bpf: smarter verifier log number printing logic bpf: omit default off=0 and imm=0 in register state log bpf: emit map name in register state if applicable and available bpf: print spilled register state in stack slot bpf: extract register state printing bpf: move verifier state printing code to kernel/bpf/log.c bpf: move verbose_linfo() into kernel/bpf/log.c bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS bpf: Remove test for MOVSX32 with offset=32 selftests/bpf: add iter test requiring range x range logic veristat: add ability to set BPF_F_TEST_SANITY_STRICT flag with -r flag ... ==================== Link: https://lore.kernel.org/r/20231122000500.28126-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:53:20 -08:00
Hao Ge	b6fe6f0371	dpll: Fix potential msg memleak when genlmsg_put_reply failed We should clean the skb resource if genlmsg_put_reply failed. Fixes: `9d71b54b65` ("dpll: netlink: Add DPLL framework base functions") Signed-off-by: Hao Ge <gehao@kylinos.cn> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://lore.kernel.org/r/20231121013709.73323-1-gehao@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:41:20 -08:00
Jakub Kicinski	340bf2dbb1	Merge branch 'bnxt_en-prepare-to-support-new-p7-chips' Michael Chan says: ==================== bnxt_en: Prepare to support new P7 chips This patchset is to prepare the driver to support the new P7 chips by refactoring and modifying the code. The P7 chip is built on the P5 chip and many code paths can be modified to support both chips. The whole patchset to have basic support for P7 chips is about 20 patches so a follow-on patchset will complete the support and add the new PCI IDs. The first 8 patches are changes to the backing store logic to support both chips with mostly common code paths and data structures. Both chips require host backing store memory but the relevant firmware APIs have been modified to make it easier to support new backing store memory types. The next 4 patches are changes to TX and RX ring indexing logic and NAPI logic. The main changes are to increment the TX and RX producers unbounded and to do any masking only when needed. These changes are needed to support the P7 chips which require additional higher bits in these producer indices. The NAPI logic is also slightly modified. The last patch is a rename of BNXT_FLAG_CHIP_P5 to BNXT_FLAG_CHIP_P5_PLUS and other related macro changes to make it clear that the P5_PLUS macro applies to P5 and newer chips. ==================== Link: https://lore.kernel.org/r/20231120234405.194542-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:51 -08:00
Randy Schacher	1c7fd6ee2f	bnxt_en: Rename some macros for the P5 chips In preparation to support a new P7 chip which has a lot of similarities with the P5 chip, rename the BNXT_FLAG_CHIP_P5 flag to BNXT_FLAG_CHIP_P5_PLUS. This will make it clear that the flag is for P5 and newer chips. Also, since there are no additional P5 variants in production, rename BNXT_FLAG_CHIP_P5_THOR() to BNXT_FLAG_CHIP_P5() to keep the naming simpler. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Randy Schacher <stuart.schacher@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-14-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:49 -08:00
Michael Chan	f94471f3ce	bnxt_en: Modify the NAPI logic for the new P7 chips Modify the NAPI logic for the new doorbell mechanism on P7 chips. These changes are compatible with the current P5 chips. In the current logic, bnxt_poll_p5() services 1 or more CQs for each MSIX. Each MSIX has an associated NQ and each NQ has 1 or more associated CQs. If any CQ reaches NAPI budget, we'll stay in polling mode and will unconditionally check and service all CQs until we exit polling. We always re-arm all CQs when we exit polling. To be compatible with the new Toggle bit mechanism in P7 chips, we need to modify the logic so that we service and re-arm the CQ only if we receive an NQE notification for work for that CQ. We add a new had_nqe_notify bit to the cp_ring_info structure and it gets set when we see the NQE notification for that CQ anytime during polling. We'll service and re-arm only the CQs with the had_nqe_notify bits set. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-13-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:49 -08:00
Michael Chan	c09d22674b	bnxt_en: Modify RX ring indexing logic. Modify the RX indexing logic for both RX ring and RX aggregation ring just like the TX logic. Change it so that the index increments unbounded and mask it only when needed. Modify the existing RX macros so that the index is not masked. Add new macros RING_RX()/RING_RX_AGG() to mask it only when needed to get the index of rxr->rx_buf_ring[] and rxr->rx_agg_ring[]. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-12-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:49 -08:00
Michael Chan	6d1add9553	bnxt_en: Modify TX ring indexing logic. Change the TX ring logic so that the index increments unbounded and mask it only when needed. Modify the existing macros so that the index is not masked. Add a new macro RING_TX() to mask it only when needed to get the index of txr->tx_buf_ring[]. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-11-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:49 -08:00
Michael Chan	b9e0c47ee2	bnxt_en: Add db_ring_mask and related macro to bnxt_db_info struct. This allows the doorbell related logic to mask the doorbell index to the proper range before writing the doorbell. The current code masks the doorbell index immediately to keep it in the legal ranges for the most part. Subsequent patches will change the logic so that the index increments unbounded and it only gets masked before use. This is preparation work for the new chip that requires an additional Epoch bit in the doorbell that needs to toggle when the index has wrapped around. This patch just adds the basic infrastructure and the logic is largely unchanged. We now replace RING_CMP() with the new DB_RING_IDX() at appropriate places where we mask the completion ring index before writing the doorbell. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-10-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	236e237f8f	bnxt_en: Add support for HWRM_FUNC_BACKING_STORE_CFG_V2 firmware calls Newer chips starting with 57600 will use this new firmware HWRM call to configure backing store memory. Add this new call if it is supported by the firmware. Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-9-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	6a4d0774f0	bnxt_en: Add support for new backing store query firmware API Use the new v2 firmware API if supported by the firmware. We now have the infrastructure to support the v2 API. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-8-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	b098dc5a33	bnxt_en: Add bnxt_setup_ctxm_pg_tbls() helper function In bnxt_alloc_ctx_mem(), the logic to set up the context memory entries and to allocate the context memory tables is done repetitively. Add a helper function to simplify the code. The setup of the Fast Path TQM entries relies on some information from the Slow Path TQM entries. Copy the SP_TQM entries to the FP_TQM entries to simplify the logic. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	2ad67aea11	bnxt_en: Use the pg_info field in bnxt_ctx_mem_type struct Use the newly added pg_info field in bnxt_ctx_mem_type struct and remove the standalone page info structures in bnxt_ctx_mem_info. This now completes the reorganization of the context memory structures to work better with the new and more flexible firmware interface for newer chips. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	035c576159	bnxt_en: Add page info to struct bnxt_ctx_mem_type This will further improve the organization of the bnxt_ctx_mem_info structure by moving the standalone page info structures into the bnxt_ctx_mem_type array. Add the allocation and free logic first and the next patch will migrate to use the new infrastructure. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	76087d997a	bnxt_en: Restructure context memory data structures The current code uses a flat bnxt_ctx_mem_info structure to store 8 types of context memory for the NIC. All the context memory types are very similar and have similar parameters. They can all share a common structure to improve the organization. Also, new firmware interface will provide a new API to retrieve each type of context memory by calling the API repeatedly. This patch reorganizes the bnxt_ctx_mem_info structure to fit better with the new firmware interface. It will also work with the legacy firmware interface. The flat fields in bnxt_ctx_mem_info are replaced by the bnxt_ctx_mem_type array. The bnxt_mem_init array info will no longer be needed. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	e50dc4c220	bnxt_en: Free bp->ctx inside bnxt_free_ctx_mem() We always free bp->ctx right after calling bnxt_free_ctx_mem(), so just free it at the end of that function to make things simpler. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:47 -08:00
Michael Chan	aa8460bacf	bnxt_en: The caller of bnxt_alloc_ctx_mem() should always free bp->ctx bnxt_alloc_ctx_mem() calls bnxt_hwrm_func_backing_store_qcaps() to allocate the memory for bp->ctx. Initialize bp->ctx with the allocated memory and let the caller free it during unwind. The unwind logic is already there, we just need to always set bp->ctx to the allocated memory so the caller will always free it. This simplifies the logic and makes it easier to expand on the backing store logic. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:47 -08:00
Jakub Kicinski	46e208e70a	Merge branch 'net-page_pool-add-netlink-based-introspection-part1' Jakub Kicinski says: ==================== net: page_pool: split the page_pool_params into fast and slow Small refactoring in prep for adding more page pool params which won't be needed on the fast path. v1: https://lore.kernel.org/all/20231024160220.3973311-1-kuba@kernel.org/ RFC: https://lore.kernel.org/all/20230816234303.3786178-1-kuba@kernel.org/ ==================== Link: https://lore.kernel.org/r/20231121000048.789613-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:22:38 -08:00
Jakub Kicinski	2da0cac1e9	net: page_pool: avoid touching slow on the fastpath To fully benefit from previous commit add one byte of state in the first cache line recording if we need to look at the slow part. The packing isn't all that impressive right now, we create a 7B hole. I'm expecting Olek's rework will reshuffle this, anyway. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://lore.kernel.org/r/20231121000048.789613-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:22:30 -08:00
Jakub Kicinski	5027ec19f1	net: page_pool: split the page_pool_params into fast and slow struct page_pool is rather performance critical and we use 16B of the first cache line to store 2 pointers used only by test code. Future patches will add more informational (non-fast path) attributes. It's convenient for the user of the API to not have to worry which fields are fast and which are slow path. Use struct groups to split the params into the two categories internally. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/r/20231121000048.789613-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:22:29 -08:00
Jakub Kicinski	b2d66643dc	Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2023-11-21 We've added 19 non-merge commits during the last 4 day(s) which contain a total of 18 files changed, 1043 insertions(+), 416 deletions(-). The main changes are: 1) Fix BPF verifier to validate callbacks as if they are called an unknown number of times in order to fix not detecting some unsafe programs, from Eduard Zingerman. 2) Fix bpf_redirect_peer() handling which missed proper stats accounting for veth and netkit and also generally fix missing stats for the latter, from Peilin Ye, Daniel Borkmann et al. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: check if max number of bpf_loop iterations is tracked bpf: keep track of max number of bpf_loop callback iterations selftests/bpf: test widening for iterating callbacks bpf: widening for callback iterators selftests/bpf: tests for iterating callbacks bpf: verify callbacks as if they are called unknown number of times bpf: extract setup_func_entry() utility function bpf: extract __check_reg_arg() utility function selftests/bpf: fix bpf_loop_bench for new callback verification scheme selftests/bpf: track string payload offset as scalar in strobemeta selftests/bpf: track tcp payload offset as scalar in xdp_synproxy selftests/bpf: Add netkit to tc_redirect selftest selftests/bpf: De-veth-ize the tc_redirect test case bpf, netkit: Add indirect call wrapper for fetching peer dev bpf: Fix dev's rx stats for bpf_redirect_peer traffic veth: Use tstats per-CPU traffic counters netkit: Add tstats per-CPU traffic counters net: Move {l,t,d}stats allocation to core and convert veth & vrf net, vrf: Move dstats structure to core ==================== Link: https://lore.kernel.org/r/20231121193113.11796-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 15:49:31 -08:00
Jakub Kicinski	3a17ea77da	Merge branch 'mlxsw-preparations-for-support-of-cff-flood-mode' Petr Machata says: ==================== mlxsw: Preparations for support of CFF flood mode PGT is an in-HW table that maps addresses to sets of ports. Then when some HW process needs a set of ports as an argument, instead of embedding the actual set in the dynamic configuration, what gets configured is the address referencing the set. The HW then works with the appropriate PGT entry. Among other allocations, the PGT currently contains two large blocks for bridge flooding: one for 802.1q and one for 802.1d. Within each of these blocks are three tables, for unknown-unicast, multicast and broadcast flooding:

    . . . |        802.1q         |        802.1d         | . . .
          |  UC   |  MC   |  BC   |  UC   |  MC   |  BC   |
               \______ _____/          \_____ ______/
                      v                       v
                          FID flood vectors

Thus each FID (which corresponds to an 802.1d bridge or one VLAN in an 802.1q bridge) uses three flood vectors spread across a fairly large region of PGT. This way of organizing the flood table (called "controlled") is not very flexible. E.g. to decrease a bridge scale and store more IP MC vectors, one would need to completely rewrite the bridge PGT blocks, or resort to hacks such as storing individual MC flood vectors into unused part of the bridge table. In order to address these shortcomings, Spectrum-2 and above support what is called CFF flood mode, for Compressed FID Flooding. In CFF flood mode, each FID has a little table of its own, with three entries adjacent to each other, one for unknown-UC, one for MC, one for BC. This allows for a much more fine-grained approach to PGT management, where bits of it are allocated on demand.

    . . . | FID | FID | FID | FID | FID | . . .
          |U|M|B|U|M|B|U|M|B|U|M|B|U|M|B|
           \_____________ _____________/
                         v
                 FID flood vectors

Besides the FID table organization, the CFF flood mode also impacts Router Subport (RSP) table.
This table contains flood vectors for rFIDs, which are FIDs that reference front panel ports or LAGs. The RSP table contains two entries per front panel port and LAG, one for unknown-UC traffic, and one for everything else. Currently, the FW allocates and manages the table in its own part of PGT. rFIDs are marked with flood_rsp bit and managed specially. In CFF mode, rFIDs are managed as all other FIDs. The driver therefore has to allocate and maintain the flood vectors. Like with bridge FIDs, this is more work, but increases flexibility of the system. The FW currently supports both the controlled and CFF flood modes. To shed complexity, in the future it should only support CFF flood mode. Hence this patchset, which is the first in series of two to add CFF flood mode support to mlxsw. There are FW versions out there that do not support CFF flood mode, and on Spectrum-1 in particular, there is no plan to support it at all. mlxsw will therefore have to support both controlled flood mode as well as CFF. Another aspect is that at least on Spectrum-1, there are FW versions out there that claim to support CFF flood mode, but then reject or ignore configurations enabling the same. The driver thus has to have a say in whether an attempt to configure CFF flood mode should even be made. Much like with the LAG mode, the feature is therefore expressed in terms of "does the driver prefer CFF flood mode?", and "what flood mode the PCI module managed to configure the FW with". This gives to the driver a chance to determine whether CFF flood mode configuration should be attempted. In this patchset, we lay the ground with new definitions, registers and their fields, and some minor code shaping. The next patchset will be more focused on introducing necessary abstractions and implementation. - Patches #1 and #2 add CFF-related items to the command interface. - Patch #3 adds a new resource, for maximum number of flood profiles supported. 
(A flood profile is a mapping between traffic type and offset in the per-FID flood vector table.) - Patches #4 to #8 adjust reg.h. The SFFP register is added, which is used for configuring the abovementioned traffic-type-to-offset mapping. The SFMR register, which serves for FID configuration, is extended with fields specific to CFF mode. And other minor adjustments. - Patches #9 and #10 add the plumbing for CFF mode: a way to request that CFF flood mode be configured, and a way to query the flood mode that was actually configured. - Patch #11 removes dead code. - Patches #12 and #13 add helpers that the next patchset will make use of. Patch #14 moves RIF setup ahead so that FID code can make use of it. ==================== Link: https://lore.kernel.org/r/cover.1700503643.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 14:53:12 -08:00
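
Note on the mlxsw cover letter (3a17ea77da): the two flood-table layouts it describes can be contrasted with a small addressing sketch. This is only an illustration of the layout as the text explains it, not the mlxsw driver or the hardware register interface. Every name here (`Traffic`, `controlled_index`, `cff_index`, `DEFAULT_PROFILE`) and all base addresses and table sizes are hypothetical.

```python
from enum import IntEnum


class Traffic(IntEnum):
    """Traffic types that each need a flood vector (per the cover letter)."""
    UNKNOWN_UC = 0
    MC = 1
    BC = 2


def controlled_index(block_base, table_size, traffic, fid_offset):
    """Controlled mode (assumed layout): one large table per traffic type.

    A FID's three vectors sit at the same offset in three different tables,
    so they end up table_size entries apart inside the bridge block.
    """
    return block_base + int(traffic) * table_size + fid_offset


def cff_index(fid_base, profile, traffic):
    """CFF mode (assumed layout): each FID owns a small table of its own.

    The flood profile maps a traffic type to an offset inside that table.
    """
    return fid_base + profile[traffic]


# Hypothetical profile: the three entries are adjacent, in U/M/B order.
DEFAULT_PROFILE = {Traffic.UNKNOWN_UC: 0, Traffic.MC: 1, Traffic.BC: 2}

if __name__ == "__main__":
    # Controlled: the vectors of one FID are spread over a wide region.
    print([controlled_index(1000, 4096, t, 7) for t in Traffic])
    # CFF: the vectors of one FID are contiguous and can be allocated on demand.
    print([cff_index(300, DEFAULT_PROFILE, t) for t in Traffic])
```

In the controlled sketch, shrinking one table shifts every later table in the block. In the CFF sketch only `fid_base` has to be allocated per FID, which is the flexibility the cover letter attributes to CFF flood mode.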


1234464 Commits