linux

iv/linux

Author	SHA1	Message	Date
Vlad Buslov	396d7f23ad	net: sched: fix police ext initialization When police action is created by cls API tcf_exts_validate() first conditional that calls tcf_action_init_1() directly, the action idr is not updated according to latest changes in action API that require caller to commit newly created action to idr with tcf_idr_insert_many(). This results such action not being accessible through act API and causes crash reported by syzbot: ================================================================== BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline] BUG: KASAN: null-ptr-deref in atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] BUG: KASAN: null-ptr-deref in __tcf_idr_release net/sched/act_api.c:178 [inline] BUG: KASAN: null-ptr-deref in tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598 Read of size 4 at addr 0000000000000010 by task kworker/u4:5/204 CPU: 0 PID: 204 Comm: kworker/u4:5 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 __kasan_report mm/kasan/report.c:400 [inline] kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413 check_memory_region_inline mm/kasan/generic.c:179 [inline] check_memory_region+0x13d/0x180 mm/kasan/generic.c:185 instrument_atomic_read include/linux/instrumented.h:71 [inline] atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] __tcf_idr_release net/sched/act_api.c:178 [inline] tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598 tc_action_net_exit include/net/act_api.h:151 [inline] police_exit_net+0x168/0x360 net/sched/act_police.c:390 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190 cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 ================================================================== Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 204 Comm: kworker/u4:5 Tainted: G B 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 panic+0x306/0x73d kernel/panic.c:231 end_report+0x58/0x5e mm/kasan/report.c:100 __kasan_report mm/kasan/report.c:403 [inline] kasan_report.cold+0x67/0xd5 mm/kasan/report.c:413 check_memory_region_inline mm/kasan/generic.c:179 [inline] check_memory_region+0x13d/0x180 mm/kasan/generic.c:185 instrument_atomic_read include/linux/instrumented.h:71 [inline] atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] __tcf_idr_release net/sched/act_api.c:178 [inline] tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598 tc_action_net_exit include/net/act_api.h:151 [inline] police_exit_net+0x168/0x360 net/sched/act_police.c:390 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190 cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 Kernel Offset: disabled Fix the issue by calling tcf_idr_insert_many() after successful action initialization. Fixes: 0fedc63fadf0 ("net_sched: commit action insertions together") Reported-by: syzbot+151e3e714d34ae4ce7e8@syzkaller.appspotmail.com Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 14:59:19 -08:00
David S. Miller	17aff5389d	Merge branch 'amd-xgbe-fixes' Shyam Sundar S K says: ==================== Bug fixes to amd-xgbe driver General fixes on amd-xgbe driver are addressed in this series, mostly on the mailbox communication failures and improving the link stability of the amd-xgbe device. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 14:09:46 -08:00
Shyam Sundar S K	9eab3fdb41	net: amd-xgbe: Fix network fluctuations when using 1G BELFUSE SFP Frequent link up/down events can happen when a Bel Fuse SFP part is connected to the amd-xgbe device. Try to avoid the frequent link issues by resetting the PHY as documented in Bel Fuse SFP datasheets. Fixes: e722ec82374b ("amd-xgbe: Update the BelFuse quirk to support SGMII") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 14:09:46 -08:00
Shyam Sundar S K	84fe68eb67	net: amd-xgbe: Reset link when the link never comes back Normally, auto negotiation and reconnect should be automatically done by the hardware. But there seems to be an issue where auto negotiation has to be restarted manually. This happens because of link training and so even though still connected to the partner the link never "comes back". This needs an auto-negotiation restart. Also, a change in xgbe-mdio is needed to get ethtool to recognize the link down and get the link change message. This change is only required in a backplane connection mode. Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 14:09:45 -08:00
Shyam Sundar S K	186edbb510	net: amd-xgbe: Fix NETDEV WATCHDOG transmit queue timeout warning The current driver calls netif_carrier_off() late in the link tear down which can result in a netdev watchdog timeout. Calling netif_carrier_off() immediately after netif_tx_stop_all_queues() avoids the warning. ------------[ cut here ]------------ NETDEV WATCHDOG: enp3s0f2 (amd-xgbe): transmit queue 0 timed out WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x20d/0x220 Modules linked in: amd_xgbe(E) amd-xgbe 0000:03:00.2 enp3s0f2: Link is Down CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E Hardware name: AMD Bilby-RV2/Bilby-RV2, BIOS RBB1202A 10/18/2019 RIP: 0010:dev_watchdog+0x20d/0x220 Code: 00 49 63 4e e0 eb 92 4c 89 e7 c6 05 c6 e2 c1 00 01 e8 e7 ce fc ff 89 d9 48 RSP: 0018:ffff90cfc28c3e88 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006 RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff90cfc28d63c0 RBP: ffff90cfb977845c R08: 0000000000000050 R09: 0000000000196018 R10: ffff90cfc28c3ef8 R11: 0000000000000000 R12: ffff90cfb9778000 R13: 0000000000000003 R14: ffff90cfb9778480 R15: 0000000000000010 FS: 0000000000000000(0000) GS:ffff90cfc28c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f240ff2d9d0 CR3: 00000001e3e0a000 CR4: 00000000003406e0 Call Trace: <IRQ> ? pfifo_fast_reset+0x100/0x100 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? enqueue_hrtimer+0x39/0x90 Fixes: e722ec82374b ("amd-xgbe: Update the BelFuse quirk to support SGMII") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 14:09:45 -08:00
Shyam Sundar S K	30b7edc82e	net: amd-xgbe: Reset the PHY rx data path when mailbox command timeout Sometimes mailbox commands timeout when the RX data path becomes unresponsive. This prevents the submission of new mailbox commands to DXIO. This patch identifies the timeout and resets the RX data path so that the next message can be submitted properly. Fixes: 549b32af9f7c ("amd-xgbe: Simplify mailbox interface rate change code") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 14:09:45 -08:00
Alex Elder	25c5a7e89b	net: ipa: initialize all resources We configure the minimum and maximum number of various types of IPA resources in ipa_resource_config(). It iterates over resource types in the configuration data and assigns resource limits to each resource group for each type. Unfortunately, we are repeatedly initializing the resource data for the first type, rather than initializing each of the types whose limits are specified. Fix this bug. Fixes: 4a0d7579d466e ("net: ipa: avoid going past end of resource group array") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 15:26:29 -08:00
Sukadev Bhattiprolu	4a41c421f3	ibmvnic: serialize access to work queue on remove The work queue is used to queue reset requests like CHANGE-PARAM or FAILOVER resets for the worker thread. When the adapter is being removed the adapter state is set to VNIC_REMOVING and the work queue is flushed so no new work is added. However the check for adapter being removed is racy in that the adapter can go into REMOVING state just after we check and we might end up adding work just as it is being flushed (or after). The ->rwi_lock is already being used to serialize queue/dequeue work. Extend its usage ensure there is no race when scheduling/flushing work. Fixes: 6954a9e4192b ("ibmvnic: Flush existing work items before device removal") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Cc:Uwe Kleine-König <uwe@kleine-koenig.org> Cc:Saeed Mahameed <saeed@kernel.org> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 15:17:30 -08:00
Lijun Pan	7d3a7b9ea5	ibmvnic: skip send_request_unmap for timeout reset Timeout reset will trigger the VIOS to unmap it automatically, similarly as FAILVOER and MOBILITY events. If we unmap it in the linux side, we will see errors like "30000003: Error 4 in REQUEST_UNMAP_RSP". So, don't call send_request_unmap for timeout reset. Fixes: ed651a10875f ("ibmvnic: Updated reset handling") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 15:12:26 -08:00
Lijun Pan	42557dab78	ibmvnic: add memory barrier to protect long term buffer dma_rmb() barrier is added to load the long term buffer before copying it to socket buffer; and dma_wmb() barrier is added to update the long term buffer before it being accessed by VIOS (virtual i/o server). Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Acked-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 15:12:01 -08:00
Heiner Kallweit	7ce189faa7	r8169: fix resuming from suspend on RTL8105e if machine runs on battery Armin reported that after referenced commit his RTL8105e is dead when resuming from suspend and machine runs on battery. This patch has been confirmed to fix the issue. Fixes: e80bd76fbf56 ("r8169: work around power-saving bug on some chip versions") Reported-by: Armin Wolf <W_Armin@gmx.de> Tested-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 14:56:07 -08:00
Sebastian Andrzej Siewior	d6d8a24023	net: caif: Use netif_rx_any_context(). The usage of in_interrupt() in non-core code is phased out. Ideally the information of the calling context should be passed by the callers or the functions be split as appropriate. The attempt to consolidate the code by passing an arguemnt or by distangling it failed due lack of knowledge about this driver and because the call chains are hard to follow. As a stop gap use netif_rx_any_context() which invokes the correct code path depending on context and confines the in_interrupt() usage to core code. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 13:21:48 -08:00
Tong Zhang	a67f061615	net: wan/lmc: dont print format string when not available dev->name is determined only after calling register_hdlc_device(), however ,it is used by printk before the name is fully determined. [ 4.565137] hdlc%d: detected at e8000000, irq 11 Instead of printing out a %d, print hdlc directly Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 13:03:22 -08:00
Tong Zhang	62e69bc419	net: wan/lmc: unregister device when no matching device is found lmc set sc->lmc_media pointer when there is a matching device. However, when no matching device is found, this pointer is NULL and the following dereference will result in a null-ptr-deref. To fix this issue, unregister the hdlc device and return an error. [ 4.569359] BUG: KASAN: null-ptr-deref in lmc_init_one.cold+0x2b6/0x55d [lmc] [ 4.569748] Read of size 8 at addr 0000000000000008 by task modprobe/95 [ 4.570102] [ 4.570187] CPU: 0 PID: 95 Comm: modprobe Not tainted 5.11.0-rc7 #94 [ 4.570527] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-preb4 [ 4.571125] Call Trace: [ 4.571261] dump_stack+0x7d/0xa3 [ 4.571445] kasan_report.cold+0x10c/0x10e [ 4.571667] ? lmc_init_one.cold+0x2b6/0x55d [lmc] [ 4.571932] lmc_init_one.cold+0x2b6/0x55d [lmc] [ 4.572186] ? lmc_mii_readreg+0xa0/0xa0 [lmc] [ 4.572432] local_pci_probe+0x6f/0xb0 [ 4.572639] pci_device_probe+0x171/0x240 [ 4.572857] ? pci_device_remove+0xe0/0xe0 [ 4.573080] ? kernfs_create_link+0xb6/0x110 [ 4.573315] ? sysfs_do_create_link_sd.isra.0+0x76/0xe0 [ 4.573598] really_probe+0x161/0x420 [ 4.573799] driver_probe_device+0x6d/0xd0 [ 4.574022] device_driver_attach+0x82/0x90 [ 4.574249] ? device_driver_attach+0x90/0x90 [ 4.574485] __driver_attach+0x60/0x100 [ 4.574694] ? device_driver_attach+0x90/0x90 [ 4.574931] bus_for_each_dev+0xe1/0x140 [ 4.575146] ? subsys_dev_iter_exit+0x10/0x10 [ 4.575387] ? klist_node_init+0x61/0x80 [ 4.575602] bus_add_driver+0x254/0x2a0 [ 4.575812] driver_register+0xd3/0x150 [ 4.576021] ? 0xffffffffc0018000 [ 4.576202] do_one_initcall+0x84/0x250 [ 4.576411] ? trace_event_raw_event_initcall_finish+0x150/0x150 [ 4.576733] ? unpoison_range+0xf/0x30 [ 4.576938] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 4.577219] ? unpoison_range+0xf/0x30 [ 4.577423] ? unpoison_range+0xf/0x30 [ 4.577628] do_init_module+0xf8/0x350 [ 4.577833] load_module+0x3fe6/0x4340 [ 4.578038] ? vm_unmap_ram+0x1d0/0x1d0 [ 4.578247] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 4.578526] ? module_frob_arch_sections+0x20/0x20 [ 4.578787] ? __do_sys_finit_module+0x108/0x170 [ 4.579037] __do_sys_finit_module+0x108/0x170 [ 4.579278] ? __ia32_sys_init_module+0x40/0x40 [ 4.579523] ? file_open_root+0x200/0x200 [ 4.579742] ? do_sys_open+0x85/0xe0 [ 4.579938] ? filp_open+0x50/0x50 [ 4.580125] ? exit_to_user_mode_prepare+0xfc/0x130 [ 4.580390] do_syscall_64+0x33/0x40 [ 4.580586] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 4.580859] RIP: 0033:0x7f1a724c3cf7 [ 4.581054] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 f8 48 89 f7 48 89 d6 48 891 [ 4.582043] RSP: 002b:00007fff44941c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 4.582447] RAX: ffffffffffffffda RBX: 00000000012ada70 RCX: 00007f1a724c3cf7 [ 4.582827] RDX: 0000000000000000 RSI: 00000000012ac9e0 RDI: 0000000000000003 [ 4.583207] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001 [ 4.583587] R10: 00007f1a72527300 R11: 0000000000000246 R12: 00000000012ac9e0 [ 4.583968] R13: 0000000000000000 R14: 00000000012acc90 R15: 0000000000000001 [ 4.584349] ================================================================== Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 13:02:02 -08:00
Colin Ian King	4773acf3d4	b43: N-PHY: Fix the update of coef for the PHY revision >= 3case The documentation for the PHY update [1] states: Loop 4 times with index i If PHY Revision >= 3 Copy table[i] to coef[i] Otherwise Set coef[i] to 0 the copy of the table to coef is currently implemented the wrong way around, table is being updated from uninitialized values in coeff. Fix this by swapping the assignment around. [1] https://bcm-v4.sipsolutions.net/802.11/PHY/N/RestoreCal/ Fixes: 2f258b74d13c ("b43: N-PHY: implement restoring general configuration") Addresses-Coverity: ("Uninitialized scalar variable") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 12:40:41 -08:00
Ayush Sawal	2355a6773a	cxgb4/chtls/cxgbit: Keeping the max ofld immediate data size same in cxgb4 and ulds The Max imm data size in cxgb4 is not similar to the max imm data size in the chtls. This caused an mismatch in output of is_ofld_imm() of cxgb4 and chtls. So fixed this by keeping the max wreq size of imm data same in both chtls and cxgb4 as MAX_IMM_OFLD_TX_DATA_WR_LEN. As cxgb4's max imm. data value for ofld packets is changed to MAX_IMM_OFLD_TX_DATA_WR_LEN. Using the same in cxgbit also. Fixes: 36bedb3f2e5b8 ("crypto: chtls - Inline TLS record Tx") Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 12:39:33 -08:00
Tong Zhang	d0a0bbe7b0	atm: idt77252: fix build broken on amd64 idt77252 is broken and wont load on amd64 systems modprobe idt77252 shows the following idt77252_init: skb->cb is too small (48 < 56) Add packed attribute to struct idt77252_skb_prv and struct atm_skb_data so that the total size can be <= sizeof(skb->cb) Also convert runtime size check to buildtime size check in idt77252_init() Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 12:36:27 -08:00
Robert Hancock	57baf8cc70	net: axienet: Handle deferred probe on clock properly This driver is set up to use a clock mapping in the device tree if it is present, but still work without one for backward compatibility. However, if getting the clock returns -EPROBE_DEFER, then we need to abort and return that error from our driver initialization so that the probe can be retried later after the clock is set up. Move clock initialization to earlier in the process so we do not waste as much effort if the clock is not yet available. Switch to use devm_clk_get_optional and abort initialization on any error reported. Also enable the clock regardless of whether the controller is using an MDIO bus, as the clock is required in any case. Fixes: 09a0354cadec267be7f ("net: axienet: Use clock framework to get device clock rate") Signed-off-by: Robert Hancock <robert.hancock@calian.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:36:31 -08:00
Dany Madden	a6f2fe5f10	ibmvnic: change IBMVNIC_MAX_IND_DESCS to 16 The supported indirect subcrq entries on Power8 is 16. Power9 supports 128. Redefined this value to 16 to minimize the driver from having to reset when migrating between Power9 and Power8. In our rx/tx performance testing, we found no performance difference between 16 and 128 at this time. Fixes: f019fb6392e5 ("ibmvnic: Introduce indirect subordinate Command Response Queue buffer") Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:17:05 -08:00
Davide Caratti	d212683805	flow_dissector: fix TTL and TOS dissection on IPv4 fragments the following command: # tc filter add dev $h2 ingress protocol ip pref 1 handle 101 flower \ $tcflags dst_ip 192.0.2.2 ip_ttl 63 action drop doesn't drop all IPv4 packets that match the configured TTL / destination address. In particular, if "fragment offset" or "more fragments" have non zero value in the IPv4 header, setting of FLOW_DISSECTOR_KEY_IP is simply ignored. Fix this dissecting IPv4 TTL and TOS before fragment info; while at it, add a selftest for tc flower's match on 'ip_ttl' that verifies the correct behavior. Fixes: 518d8a2e9bad ("net/flow_dissector: add support for dissection of misc ip header fields") Reported-by: Shuang Li <shuali@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:03:51 -08:00
Doug Brown	39935dccb2	appletalk: Fix skb allocation size in loopback case If a DDP broadcast packet is sent out to a non-gateway target, it is also looped back. There is a potential for the loopback device to have a longer hardware header length than the original target route's device, which can result in the skb not being created with enough room for the loopback device's hardware header. This patch fixes the issue by determining that a loopback will be necessary prior to allocating the skb, and if so, ensuring the skb has enough room. This was discovered while testing a new driver that creates a LocalTalk network interface (LTALK_HLEN = 1). It caused an skb_under_panic. Signed-off-by: Doug Brown <doug@schmorgal.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:40:28 -08:00
David S. Miller	0c9fc2ede9	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2021-02-13 The following pull-request contains BPF updates for your net tree. We've added 2 non-merge commits during the last 3 day(s) which contain a total of 2 files changed, 9 insertions(+), 11 deletions(-). The main changes are: 1) Fix mod32 truncation handling in verifier, from Daniel Borkmann. 2) Fix XDP redirect tests to explicitly use bash, from Björn Töpel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:15:23 -08:00
Daniel Borkmann	9b00f1b788	bpf: Fix truncation handling for mod32 dst reg wrt zero Recently noticed that when mod32 with a known src reg of 0 is performed, then the dst register is 32-bit truncated in verifier: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r0 = 0 1: R0_w=inv0 R1=ctx(id=0,off=0,imm=0) R10=fp0 1: (b7) r1 = -1 2: R0_w=inv0 R1_w=inv-1 R10=fp0 2: (b4) w2 = -1 3: R0_w=inv0 R1_w=inv-1 R2_w=inv4294967295 R10=fp0 3: (9c) w1 %= w0 4: R0_w=inv0 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 4: (b7) r0 = 1 5: R0_w=inv1 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 5: (1d) if r1 == r2 goto pc+1 R0_w=inv1 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 6: R0_w=inv1 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 6: (b7) r0 = 2 7: R0_w=inv2 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 7: (95) exit 7: R0=inv1 R1=inv(id=0,umin_value=4294967295,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2=inv4294967295 R10=fp0 7: (95) exit However, as a runtime result, we get 2 instead of 1, meaning the dst register does not contain (u32)-1 in this case. The reason is fairly straight forward given the 0 test leaves the dst register as-is: # ./bpftool p d x i 23 0: (b7) r0 = 0 1: (b7) r1 = -1 2: (b4) w2 = -1 3: (16) if w0 == 0x0 goto pc+1 4: (9c) w1 %= w0 5: (b7) r0 = 1 6: (1d) if r1 == r2 goto pc+1 7: (b7) r0 = 2 8: (95) exit This was originally not an issue given the dst register was marked as completely unknown (aka 64 bit unknown). However, after 468f6eafa6c4 ("bpf: fix 32-bit ALU op verification") the verifier casts the register output to 32 bit, and hence it becomes 32 bit unknown. Note that for the case where the src register is unknown, the dst register is marked 64 bit unknown. After the fix, the register is truncated by the runtime and the test passes: # ./bpftool p d x i 23 0: (b7) r0 = 0 1: (b7) r1 = -1 2: (b4) w2 = -1 3: (16) if w0 == 0x0 goto pc+2 4: (9c) w1 %= w0 5: (05) goto pc+1 6: (bc) w1 = w1 7: (b7) r0 = 1 8: (1d) if r1 == r2 goto pc+1 9: (b7) r0 = 2 10: (95) exit Semantics also match with {R,W}x mod{64,32} 0 -> {R,W}x. Invalid div has always been {R,W}x div{64,32} 0 -> 0. Rewrites are as follows: mod32: mod64: (16) if w0 == 0x0 goto pc+2 (15) if r0 == 0x0 goto pc+1 (9c) w1 %= w0 (9f) r1 %= r0 (05) goto pc+1 (bc) w1 = w1 Fixes: 468f6eafa6c4 ("bpf: fix 32-bit ALU op verification") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>	2021-02-13 00:53:12 +01:00
David S. Miller	308daa19e2	mlx5-fixes-2021-02-11 -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmAl7OgACgkQSD+KveBX +j72XQgAkwXPdkVm36mUP4vnjZhxCIOG/8KHp50iVJAFKRBukEGgSV3yGt8Srjaj fCyqevg/ncOq61mmo6KE2v16/oI5Mh7fzJ+Q0+MXdYp5VVgtyUIil2dnfgPWnYfm Ahsc4TaRfU3YUBnMz8MhgAhmRih24+dyW+YtOj3pzwxNRjTMxseLI9S1tXu9/a7P BLWBka7EZroI+mArNtGXqlN8bPt+IigrysDixvEEQvNoUxSbEwjoPjwHLrukWjwB FE530QXjvzsJH8gFeH0QnG432w/ZYcDlBqmlr4qRwR7k89/ClH0GjA2fqDXejX5u xI/bx8dH5YgPKjpNPdsfhupe902dag== =Z+Z+ -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2021-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-fixes-2021-02-11 Saeed Mahameed says: ==================== mlx5 fixes 2021-02-11 This series introduces some fixes to mlx5 driver. Please pull and let me know if there is any problem. For -stable v5.4 ('net/mlx5e: E-switch, Fix rate calculation for overflow')i For -stable v5.10 ('net/mlx5: Disallow RoCE on multi port slave device') ('net/mlx5: Disable devlink reload for multi port slave device') ('net/mlx5e: Don't change interrupt moderation params when DIM is enabled') ('net/mlx5e: Replace synchronize_rcu with synchronize_net') ('net/mlx5e: Enable XDP for Connect-X IPsec capable devices') ('net/mlx5e: kTLS, Use refcounts to free kTLS RX priv context') ('net/mlx5e: Check tunnel offload is required before setting SWP') ('net/mlx5: Fix health error state handling') ('net/mlx5: Disable devlink reload for lag devices') ('net/mlx5e: CT: manage the lifetime of the ct entry object') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 19:25:12 -08:00
Moshe Shemesh	e1c3940c60	net/mlx5e: Check tunnel offload is required before setting SWP Check that tunnel offload is required before setting Software Parser offsets to get Geneve HW offload. In case of Geneve packet we check HW offload support of SWP in mlx5e_tunnel_features_check() and set features accordingly, this should be reflected in skb offload requested by the kernel and we should add the Software Parser offsets only if requested. Otherwise, in case HW doesn't support SWP for Geneve, data path will mistakenly try to offload Geneve SKBs with skb->encapsulation set, regardless of whether offload was requested or not on this specific SKB. Fixes: e3cfc7e6b7bd ("net/mlx5e: TX, Add geneve tunnel stateless offload support") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:16 -08:00
Oz Shlomo	a217313152	net/mlx5e: CT: manage the lifetime of the ct entry object The ct entry object is accessed by the ct add, del, stats and restore methods. In addition, it is referenced from several hash tables. The lifetime of the ct entry object was not managed which triggered race conditions as in the following kasan dump: [ 3374.973945] ================================================================== [ 3374.988552] BUG: KASAN: use-after-free in memcmp+0x4c/0x98 [ 3374.999590] Read of size 1 at addr ffff00036129ea55 by task ksoftirqd/1/15 [ 3375.016415] CPU: 1 PID: 15 Comm: ksoftirqd/1 Tainted: G O 5.4.31+ #1 [ 3375.055301] Call trace: [ 3375.060214] dump_backtrace+0x0/0x238 [ 3375.067580] show_stack+0x24/0x30 [ 3375.074244] dump_stack+0xe0/0x118 [ 3375.081085] print_address_description.isra.9+0x74/0x3d0 [ 3375.091771] __kasan_report+0x198/0x1e8 [ 3375.099486] kasan_report+0xc/0x18 [ 3375.106324] __asan_load1+0x60/0x68 [ 3375.113338] memcmp+0x4c/0x98 [ 3375.119409] mlx5e_tc_ct_restore_flow+0x3a4/0x6f8 [mlx5_core] [ 3375.131073] mlx5e_rep_tc_update_skb+0x1d4/0x2f0 [mlx5_core] [ 3375.142553] mlx5e_handle_rx_cqe_rep+0x198/0x308 [mlx5_core] [ 3375.154034] mlx5e_poll_rx_cq+0x2a0/0x1060 [mlx5_core] [ 3375.164459] mlx5e_napi_poll+0x1d4/0xa78 [mlx5_core] [ 3375.174453] net_rx_action+0x28c/0x7a8 [ 3375.182004] __do_softirq+0x1b4/0x5d0 Manage the lifetime of the ct entry object by using synchornization mechanisms for concurrent access. Fixes: ac991b48d43c ("net/mlx5e: CT: Offload established flows") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:15 -08:00
Shay Drory	edac23c2b3	net/mlx5: Disable devlink reload for lag devices Devlink reload can't be allowed on lag devices since reloading one lag device will cause traffic on the bond to get stucked. Users who wish to reload a lag device, need to remove the device from the bond, and only then reload it. Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:15 -08:00
Shay Drory	7ab91f2b03	net/mlx5: Disallow RoCE on lag device In lag mode, setting roce enabled/disable of lag device have no effect. e.g.: bond device (roce/vf_lag) roce status remain unchanged. Therefore disable it and add an error message. Fixes: cc9defcbb8fa ("net/mlx5: Handle "enable_roce" devlink param") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:15 -08:00
Shay Drory	c70f8597fc	net/mlx5: Disallow RoCE on multi port slave device In dual port mode, setting roce enabled/disable for the slave device have no effect. e.g.: the slave device roce status remain unchanged. Therefore disable it and add an error message. Enable or disable roce of the master device affect both master and slave devices. Fixes: cc9defcbb8fa ("net/mlx5: Handle "enable_roce" devlink param") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:14 -08:00
Shay Drory	d89ddaae17	net/mlx5: Disable devlink reload for multi port slave device Devlink reload can't be allowed on a multi port slave device, because reload of slave device doesn't take effect. The right flow is to disable devlink reload for multi port slave device. Hence, disabling it in mlx5_core probing. Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:14 -08:00
Maxim Mikityanskiy	b850bbff96	net/mlx5e: kTLS, Use refcounts to free kTLS RX priv context wait_for_resync is unreliable - if it timeouts, priv_rx will be freed anyway. However, mlx5e_ktls_handle_get_psv_completion will be called sooner or later, leading to use-after-free. For example, it can happen if a CQ error happened, and ICOSQ stopped, but later on the queues are destroyed, and ICOSQ is flushed with mlx5e_free_icosq_descs. This patch converts the lifecycle of priv_rx to fully refcount-based, so that the struct won't be freed before the refcount goes to zero. Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:13 -08:00
Maxim Mikityanskiy	ebf79b6be6	net/mlx5e: Fix CQ params of ICOSQ and async ICOSQ The commit mentioned below has split the parameters of ICOSQ and async ICOSQ, but it contained a typo: the CQ parameters were swapped for ICOSQ and async ICOSQ. Async ICOSQ is longer than the normal ICOSQ, and the CQ size must be the same as the size of the corresponding SQ, but due to this bug, the CQ of async ICOSQ was much shorter than async ICOSQ itself. It led to overflows of the CQ with such messages in dmesg, in particular, when running multiple kTLS-offloaded streams: mlx5_core 0000:08:00.0: cq_err_event_notifier:529:(pid 9422): CQ error on CQN 0x406, syndrome 0x1 mlx5_core 0000:08:00.0 eth2: mlx5e_cq_error_event: cqn=0x000406 event=0x04 This commit fixes the issue by using the corresponding parameters for ICOSQ and async ICOSQ. Fixes: c293ac927fbb ("net/mlx5e: Refactor build channel params") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:12 -08:00
Maxim Mikityanskiy	4d6e6b0c6d	net/mlx5e: Replace synchronize_rcu with synchronize_net The commit cited below switched from using napi_synchronize to synchronize_rcu to have a guarantee that it will finish in finite time. However, on average, synchronize_rcu takes more time than napi_synchronize. Given that it's called multiple times per channel on deactivation, it accumulates to a significant amount, which causes timeouts in some applications (for example, when using bonding with NetworkManager). This commit replaces synchronize_rcu with synchronize_net, which is faster when called under rtnl_lock, allowing to speed up the described flow. Fixes: 9c25a22dfb00 ("net/mlx5e: Use synchronize_rcu to sync with NAPI") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:12 -08:00
Shay Drory	51d138c261	net/mlx5: Fix health error state handling Currently, when we discover a fatal error, we are queueing a work that will wait for a lock in order to enter the device to error state. Meanwhile, FW commands are still being processed, and gets timeouts. This can block the driver for few minutes before the work will manage to get the lock and enter to error state. Setting the device to error state before queueing health work, in order to avoid FW commands being processed while the work is waiting for the lock. Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:11 -08:00
Maxim Mikityanskiy	65ba8594a2	net/mlx5e: Change interrupt moderation channel params also when channels are closed struct mlx5e_params contains fields ({rx,tx}_cq_moderation) that depend on two things: whether DIM is enabled and the state of a private flag (MLX5E_PFLAG_{RX,TX}_CQE_BASED_MODER). Whenever the DIM state changes, mlx5e_reset_{rx,tx}_moderation is called to update the fields, however, only if the channels are open. The flow where the channels are closed misses the required update of the fields. This commit moves the calls of mlx5e_reset_{rx,tx}_moderation, so that they run in both flows. Fixes: ebeaf084ad5c ("net/mlx5e: Properly set default values when disabling adaptive moderation") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:11 -08:00
Maxim Mikityanskiy	019f93bc4b	net/mlx5e: Don't change interrupt moderation params when DIM is enabled When mlx5e_ethtool_set_coalesce doesn't change DIM state (enabled/disabled), it calls mlx5e_set_priv_channels_coalesce unconditionally, which in turn invokes a firmware command to set interrupt moderation parameters. It shouldn't happen while DIM manages those parameters dynamically (it might even be happening at the same time). This patch fixes it by splitting mlx5e_set_priv_channels_coalesce into two functions (for RX and TX) and calling them only when DIM is disabled (for RX and TX respectively). Fixes: cb3c7fd4f839 ("net/mlx5e: Support adaptive RX coalescing") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:10 -08:00
Raed Salem	e33f9f5f2d	net/mlx5e: Enable XDP for Connect-X IPsec capable devices This limitation was inherited by previous Innova (FPGA) IPsec implementation, it uses its private set of RQ handlers which does not support XDP, for Connect-X this is no longer true. Fix by keeping this limitation only for Innova IPsec supporting devices, as otherwise this limitation effectively wrongly blocks XDP for all future Connect-X devices for all flows even if IPsec offload is not used. Fixes: 2d64663cd559 ("net/mlx5: IPsec: Add HW crypto offload support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Alaa Hleihel <alaa@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:10 -08:00
Raed Salem	e4484d9df5	net/mlx5e: Enable striding RQ for Connect-X IPsec capable devices This limitation was inherited by previous Innova (FPGA) IPsec implementation, it uses its private set of RQ handlers which does not support striding rq, for Connect-X this is no longer true. Fix by keeping this limitation only for Innova IPsec supporting devices, as otherwise this limitation effectively wrongly blocks striding RQs for all future Connect-X devices for all flows even if IPsec offload is not used. Fixes: 2d64663cd559 ("net/mlx5: IPsec: Add HW crypto offload support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:09 -08:00
Parav Pandit	0e22bfb7c0	net/mlx5e: E-switch, Fix rate calculation for overflow rate_bytes_ps is a 64-bit field. It passed as 32-bit field to apply_police_params(). Due to this when police rate is higher than 4Gbps, 32-bit calculation ignores the carry. This results in incorrect rate configurationn the device. Fix it by performing 64-bit calculation. Fixes: fcb64c0f5640 ("net/mlx5: E-Switch, add ingress rate support") Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:09 -08:00
David S. Miller	9c899aa6ac	Merge branch 'mptcp-Miscellaneous-fixes' Mat Martineau says: ==================== mptcp: Miscellaneous fixes Here are some MPTCP fixes for the -net tree, addressing various issues we have seen thanks to syzkaller and other testing: Patch 1 correctly propagates errors at connection time and for TCP fallback connections. Patch 2 sets the expected poll() events on SEND_SHUTDOWN. Patch 3 fixes a retranmit crash and unneeded retransmissions. Patch 4 fixes possible uninitialized data on the error path during socket creation. Patch 5 addresses a problem with MPTCP window updates. Patch 6 fixes a case where MPTCP retransmission can get stuck. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:30:55 -08:00
Paolo Abeni	d09d818ec2	mptcp: add a missing retransmission timer scheduling Currently we do not schedule the MPTCP retransmission timer after pushing the data when such action happens in the subflow context. This may cause hang-up on active-backup scenarios, or even when only single subflow msks are involved, if we lost some peer's ack. Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:30:55 -08:00
Paolo Abeni	e3859603ba	mptcp: better msk receive window updates Move mptcp_cleanup_rbuf() related checks inside the mentioned helper and extend them to mirror TCP checks more closely. Additionally drop the 'rmem_pending' hack, since commit 879526030c8b ("mptcp: protect the rx path with the msk socket spinlock") we can use instead 'rmem_released'. Fixes: ea4ca586b16f ("mptcp: refine MPTCP-level ack scheduling") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:30:54 -08:00
Paolo Abeni	d8b59efa64	mptcp: init mptcp request socket earlier The mptcp subflow route_req() callback performs the subflow req initialization after the route_req() check. If the latter fails, mptcp-specific bits of the current request sockets are left uninitialized. The above causes bad things at req socket disposal time, when the mptcp resources are cleared. This change addresses the issue by splitting subflow_init_req() into the actual initialization and the mptcp-specific checks. The initialization is moved before any possibly failing check. Reported-by: Christoph Paasch <cpaasch@apple.com> Fixes: 7ea851d19b23 ("tcp: merge 'init_req' and 'route_req' functions") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:30:54 -08:00
Paolo Abeni	64b9cea7a0	mptcp: fix spurious retransmissions Syzkaller was able to trigger the following splat again: WARNING: CPU: 1 PID: 12512 at net/mptcp/protocol.c:761 mptcp_reset_timer+0x12a/0x160 net/mptcp/protocol.c:761 Modules linked in: CPU: 1 PID: 12512 Comm: kworker/1:6 Not tainted 5.10.0-rc6 #52 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:mptcp_reset_timer+0x12a/0x160 net/mptcp/protocol.c:761 Code: e8 4b 0c ad ff e8 56 21 88 fe 48 b8 00 00 00 00 00 fc ff df 48 c7 04 03 00 00 00 00 48 83 c4 40 5b 5d 41 5c c3 e8 36 21 88 fe <0f> 0b 41 bc c8 00 00 00 eb 98 e8 e7 b1 af fe e9 30 ff ff ff 48 c7 RSP: 0018:ffffc900018c7c68 EFLAGS: 00010293 RAX: ffff888108cb1c80 RBX: 1ffff92000318f8d RCX: ffffffff82ad0307 RDX: 0000000000000000 RSI: ffffffff82ad036a RDI: 0000000000000007 RBP: ffff888113e2d000 R08: ffff888108cb1c80 R09: ffffed10227c5ab7 R10: ffff888113e2d5b7 R11: ffffed10227c5ab6 R12: 0000000000000000 R13: ffff88801f100000 R14: ffff888113e2d5b0 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff88811b500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd76a874ef8 CR3: 000000001689c005 CR4: 0000000000170ee0 Call Trace: mptcp_worker+0xaa4/0x1560 net/mptcp/protocol.c:2334 process_one_work+0x8d3/0x1200 kernel/workqueue.c:2272 worker_thread+0x9c/0x1090 kernel/workqueue.c:2418 kthread+0x303/0x410 kernel/kthread.c:292 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:296 The mptcp_worker tries to update the MPTCP retransmission timer even if such timer is not currently scheduled. The mptcp_rtx_head() return value is bogus: we can have enqueued data not yet transmitted. The above may additionally cause spurious, unneeded MPTCP-level retransmissions. Fix the issue adding an explicit clearing of the rtx queue before trying to retransmit and checking for unacked data. Additionally drop an unneeded timer stop call and the unused mptcp_rtx_tail() helper. Reported-by: Christoph Paasch <cpaasch@apple.com> Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:30:54 -08:00
Paolo Abeni	dd913410b0	mptcp: fix poll after shutdown The current mptcp_poll() implementation gives unexpected results after shutdown(SEND_SHUTDOWN) and when the msk status is TCP_CLOSE. Set the correct mask. Fixes: 8edf08649eed ("mptcp: rework poll+nospace handling") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:30:54 -08:00
Paolo Abeni	15cc104533	mptcp: deliver ssk errors to msk Currently all errors received on msk subflows are ignored. We need to catch at least the errors on connect() and on fallback sockets. Use a custom sk_error_report callback at subflow level, and do the real action under the msk socket lock - via the usual sock_owned_by_user()/release_callback() schema. Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:30:54 -08:00
Heiner Kallweit	4c0d2e96ba	net: phy: consider that suspend2ram may cut off PHY power Claudiu reported that on his system S2R cuts off power to the PHY and after resuming certain PHY settings are lost. The PM folks confirmed that cutting off power to selected components in S2R is a valid case. Therefore resuming from S2R, same as from hibernation, has to assume that the PHY has power-on defaults. As a consequence use the restore callback also as resume callback. In addition make sure that the interrupt configuration is restored. Let's do this in phy_init_hw() and ensure that after this call actual interrupt configuration is in sync with phydev->interrupts. Currently, if interrupt was enabled before hibernation, we would resume with interrupt disabled because that's the power-on default. This fix applies cleanly only after the commit marked as fixed. I don't have an affected system, therefore change is compile-tested only. [0] https://lore.kernel.org/netdev/1610120754-14331-1-git-send-email-claudiu.beznea@microchip.com/ Fixes: 611d779af7ca ("net: phy: fix MDIO bus PM PHY resuming") Reported-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:27:18 -08:00
Ioana Ciornei	e12be9139c	dpaa2-eth: fix memory leak in XDP_REDIRECT If xdp_do_redirect() fails, the calling driver should handle recycling or freeing of the page associated with the frame. The dpaa2-eth driver didn't do either of them and just incremented a counter. Fix this by trying to DMA map back the page and recycle it or, if the mapping fails, just free it. Fixes: d678be1dc1ec ("dpaa2-eth: add XDP_REDIRECT support") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:15:15 -08:00
Tong Zhang	e185ea30df	enetc: auto select PHYLIB and MDIO_DEVRES FSL_ENETC_MDIO use symbols from PHYLIB (MDIO_BUS) and MDIO_DEVRES, however there are no dependency specified in Kconfig ERROR: modpost: "__mdiobus_register" [drivers/net/ethernet/freescale/enetc/fsl-enetc-mdio.ko] undefined! ERROR: modpost: "mdiobus_unregister" [drivers/net/ethernet/freescale/enetc/fsl-enetc-mdio.ko] undefined! ERROR: modpost: "devm_mdiobus_alloc_size" [drivers/net/ethernet/freescale/enetc/fsl-enetc-mdio.ko] undefined! add depends on MDIO_DEVRES && MDIO_BUS Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:12:47 -08:00
Nathan Rossi	8a28af7a3e	net: ethernet: aquantia: Handle error cleanup of start on open The aq_nic_start function can fail in a variety of cases which leaves the device in broken state. An example case where the start function fails is the request_threaded_irq which can be interrupted, resulting in a EINTR result. This can be manually triggered by bringing the link up (e.g. ip link set up) and triggering a SIGINT on the initiating process (e.g. Ctrl+C). This would put the device into a half configured state. Subsequently bringing the link up again would cause the napi_enable to BUG. In order to correctly clean up the failed attempt to start a device call aq_nic_stop. Signed-off-by: Nathan Rossi <nathan.rossi@digi.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 14:38:06 -08:00

1 2 3 4 5 ...

984206 Commits