Merge tag 'net-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter.

  Current release - new code bugs:

   - clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()

   - mptcp:
      - invoke MP_FAIL response only when needed
      - fix shutdown vs fallback race
      - consistent map handling on failure

   - octeon_ep: use bitwise AND

  Previous releases - regressions:

   - tipc: move bc link creation back to tipc_node_create, fix NPD

  Previous releases - always broken:

   - tcp: add a missing nf_reset_ct() in 3WHS handling to prevent socket
     buffered skbs from keeping refcount on the conntrack module

   - ipv6: take care of disable_policy when restoring routes

   - tun: make sure to always disable and unlink NAPI instances

   - phy: don't trigger state machine while in suspend

   - netfilter: nf_tables: avoid skb access on nf_stolen

   - asix: fix "can't send until first packet is send" issue

   - usb: asix: do not force pause frames support

   - nxp-nci: don't issue a zero length i2c_master_read()

  Misc:

   - ncsi: allow use of proper "mellanox" DT vendor prefix

   - act_api: add a message for user space if any actions were already
     flushed before the error was hit"

* tag 'net-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits)
  net: dsa: felix: fix race between reading PSFP stats and port stats
  selftest: tun: add test for NAPI dismantle
  net: tun: avoid disabling NAPI twice
  net: sparx5: mdb add/del handle non-sparx5 devices
  net: sfp: fix memory leak in sfp_probe()
  mlxsw: spectrum_router: Fix rollback in tunnel next hop init
  net: rose: fix UAF bugs caused by timer handler
  net: usb: ax88179_178a: Fix packet receiving
  net: bonding: fix use-after-free after 802.3ad slave unbind
  ipv6: fix lockdep splat in in6_dump_addrs()
  net: phy: ax88772a: fix lost pause advertisement configuration
  net: phy: Don't trigger state machine while in suspend
  usbnet: fix memory allocation in helpers
  selftests net: fix kselftest net fatal error
  NFC: nxp-nci: don't print header length mismatch on i2c error
  NFC: nxp-nci: Don't issue a zero length i2c_master_read()
  net: tipc: fix possible refcount leak in tipc_sk_create()
  nfc: nfcmrvl: Fix irq_of_parse_and_map() return value
  net: ipv6: unexport __init-annotated seg6_hmac_net_init()
  ipv6/sit: fix ipip6_tunnel_get_prl return value
  ...
This commit is contained in: commit 5e8379351d by Linus Torvalds, 2022-06-30 15:26:55 -07:00
58 changed files with 907 additions and 256 deletions

@@ -14397,9 +14397,8 @@ F: Documentation/devicetree/bindings/sound/nxp,tfa989x.yaml
 F:  sound/soc/codecs/tfa989x.c
 
 NXP-NCI NFC DRIVER
-R:  Charles Gorand <charles.gorand@effinnov.com>
 L:  linux-nfc@lists.01.org (subscribers-only)
-S:  Supported
+S:  Orphan
 F:  Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml
 F:  drivers/nfc/nxp-nci

@@ -2228,7 +2228,8 @@ void bond_3ad_unbind_slave(struct slave *slave)
                temp_aggregator->num_of_ports--;
                if (__agg_active_ports(temp_aggregator) == 0) {
                    select_new_active_agg = temp_aggregator->is_active;
-                   ad_clear_agg(temp_aggregator);
+                   if (temp_aggregator->num_of_ports == 0)
+                       ad_clear_agg(temp_aggregator);
                    if (select_new_active_agg) {
                        slave_info(bond->dev, slave->dev, "Removing an active aggregator\n");
                        /* select new active aggregator */

@@ -1302,12 +1302,12 @@ int bond_alb_initialize(struct bonding *bond, int rlb_enabled)
        return res;
 
    if (rlb_enabled) {
-       bond->alb_info.rlb_enabled = 1;
        res = rlb_initialize(bond);
        if (res) {
            tlb_deinitialize(bond);
            return res;
        }
+       bond->alb_info.rlb_enabled = 1;
    } else {
        bond->alb_info.rlb_enabled = 0;
    }

@@ -878,6 +878,11 @@ static void bcm_sf2_sw_mac_link_up(struct dsa_switch *ds, int port,
        if (duplex == DUPLEX_FULL)
            reg |= DUPLX_MODE;
 
+       if (tx_pause)
+           reg |= TXFLOW_CNTL;
+       if (rx_pause)
+           reg |= RXFLOW_CNTL;
+
        core_writel(priv, reg, offset);
    }

@@ -300,6 +300,7 @@ static int hellcreek_led_setup(struct hellcreek *hellcreek)
    const char *label, *state;
    int ret = -EINVAL;
 
+   of_node_get(hellcreek->dev->of_node);
    leds = of_find_node_by_name(hellcreek->dev->of_node, "leds");
    if (!leds) {
        dev_err(hellcreek->dev, "No LEDs specified in device tree!\n");

@@ -1886,6 +1886,8 @@ static void vsc9959_psfp_sgi_table_del(struct ocelot *ocelot,
 static void vsc9959_psfp_counters_get(struct ocelot *ocelot, u32 index,
                                       struct felix_stream_filter_counters *counters)
 {
+   mutex_lock(&ocelot->stats_lock);
+
    ocelot_rmw(ocelot, SYS_STAT_CFG_STAT_VIEW(index),
               SYS_STAT_CFG_STAT_VIEW_M,
               SYS_STAT_CFG);
@@ -1900,6 +1902,8 @@ static void vsc9959_psfp_counters_get(struct ocelot *ocelot, u32 index,
               SYS_STAT_CFG_STAT_VIEW(index) |
               SYS_STAT_CFG_STAT_CLEAR_SHOT(0x10),
               SYS_STAT_CFG);
+
+   mutex_unlock(&ocelot->stats_lock);
 }
 
 static int vsc9959_psfp_filter_add(struct ocelot *ocelot, int port,

@@ -52,7 +52,7 @@
 #define CN93_SDP_EPF_RINFO_SRN(val) ((val) & 0xFF)
 #define CN93_SDP_EPF_RINFO_RPVF(val) (((val) >> 32) & 0xF)
-#define CN93_SDP_EPF_RINFO_NVFS(val) (((val) >> 48) && 0xFF)
+#define CN93_SDP_EPF_RINFO_NVFS(val) (((val) >> 48) & 0xFF)
 
 /* SDP Function select */
 #define CN93_SDP_FUNC_SEL_EPF_BIT_POS 8

@@ -4415,6 +4415,8 @@ static int mlxsw_sp_nexthop4_init(struct mlxsw_sp *mlxsw_sp,
    return 0;
 
 err_nexthop_neigh_init:
+   list_del(&nh->router_list_node);
+   mlxsw_sp_nexthop_counter_free(mlxsw_sp, nh);
    mlxsw_sp_nexthop_remove(mlxsw_sp, nh);
    return err;
 }
@@ -6740,6 +6742,7 @@
                              const struct fib6_info *rt)
 {
    struct net_device *dev = rt->fib6_nh->fib_nh_dev;
+   int err;
 
    nh->nhgi = nh_grp->nhgi;
    nh->nh_weight = rt->fib6_nh->fib_nh_weight;
@@ -6755,7 +6758,16 @@
        return 0;
 
    nh->ifindex = dev->ifindex;
-   return mlxsw_sp_nexthop_type_init(mlxsw_sp, nh, dev);
+   err = mlxsw_sp_nexthop_type_init(mlxsw_sp, nh, dev);
+   if (err)
+       goto err_nexthop_type_init;
+
+   return 0;
+
+err_nexthop_type_init:
+   list_del(&nh->router_list_node);
+   mlxsw_sp_nexthop_counter_free(mlxsw_sp, nh);
+   return err;
 }
 
 static void mlxsw_sp_nexthop6_fini(struct mlxsw_sp *mlxsw_sp,

@@ -396,6 +396,9 @@ static int sparx5_handle_port_mdb_add(struct net_device *dev,
    u32 mact_entry;
    int res, err;
 
+   if (!sparx5_netdevice_check(dev))
+       return -EOPNOTSUPP;
+
    if (netif_is_bridge_master(v->obj.orig_dev)) {
        sparx5_mact_learn(spx5, PGID_CPU, v->addr, v->vid);
        return 0;
@@ -466,6 +469,9 @@ static int sparx5_handle_port_mdb_del(struct net_device *dev,
    u32 mact_entry, res, pgid_entry[3];
    int err;
 
+   if (!sparx5_netdevice_check(dev))
+       return -EOPNOTSUPP;
+
    if (netif_is_bridge_master(v->obj.orig_dev)) {
        sparx5_mact_forget(spx5, v->addr, v->vid);
        return 0;

@@ -1515,14 +1515,14 @@ static void epic_remove_one(struct pci_dev *pdev)
    struct net_device *dev = pci_get_drvdata(pdev);
    struct epic_private *ep = netdev_priv(dev);
 
+   unregister_netdev(dev);
    dma_free_coherent(&pdev->dev, TX_TOTAL_SIZE, ep->tx_ring,
                      ep->tx_ring_dma);
    dma_free_coherent(&pdev->dev, RX_TOTAL_SIZE, ep->rx_ring,
                      ep->rx_ring_dma);
-   unregister_netdev(dev);
    pci_iounmap(pdev, ep->ioaddr);
-   pci_release_regions(pdev);
    free_netdev(dev);
+   pci_release_regions(pdev);
    pci_disable_device(pdev);
 /* pci_power_off(pdev, -1); */
 }

@@ -88,8 +88,10 @@ static void asix_ax88772a_link_change_notify(struct phy_device *phydev)
    /* Reset PHY, otherwise MII_LPA will provide outdated information.
     * This issue is reproducible only with some link partner PHYs
     */
-   if (phydev->state == PHY_NOLINK && phydev->drv->soft_reset)
-       phydev->drv->soft_reset(phydev);
+   if (phydev->state == PHY_NOLINK) {
+       phy_init_hw(phydev);
+       phy_start_aneg(phydev);
+   }
 }
 
 static struct phy_driver asix_driver[] = {

@@ -229,9 +229,7 @@ static int dp83822_config_intr(struct phy_device *phydev)
        if (misr_status < 0)
            return misr_status;
 
-       misr_status |= (DP83822_RX_ERR_HF_INT_EN |
-               DP83822_FALSE_CARRIER_HF_INT_EN |
-               DP83822_LINK_STAT_INT_EN |
+       misr_status |= (DP83822_LINK_STAT_INT_EN |
                DP83822_ENERGY_DET_INT_EN |
                DP83822_LINK_QUAL_INT_EN);

@@ -31,6 +31,7 @@
 #include <linux/io.h>
 #include <linux/uaccess.h>
 #include <linux/atomic.h>
+#include <linux/suspend.h>
 #include <net/netlink.h>
 #include <net/genetlink.h>
 #include <net/sock.h>
@@ -976,6 +977,28 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
    struct phy_driver *drv = phydev->drv;
    irqreturn_t ret;
 
+   /* Wakeup interrupts may occur during a system sleep transition.
+    * Postpone handling until the PHY has resumed.
+    */
+   if (IS_ENABLED(CONFIG_PM_SLEEP) && phydev->irq_suspended) {
+       struct net_device *netdev = phydev->attached_dev;
+
+       if (netdev) {
+           struct device *parent = netdev->dev.parent;
+
+           if (netdev->wol_enabled)
+               pm_system_wakeup();
+           else if (device_may_wakeup(&netdev->dev))
+               pm_wakeup_dev_event(&netdev->dev, 0, true);
+           else if (parent && device_may_wakeup(parent))
+               pm_wakeup_dev_event(parent, 0, true);
+       }
+
+       phydev->irq_rerun = 1;
+       disable_irq_nosync(irq);
+
+       return IRQ_HANDLED;
+   }
+
    mutex_lock(&phydev->lock);
    ret = drv->handle_interrupt(phydev);
    mutex_unlock(&phydev->lock);

@@ -278,6 +278,15 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
    if (phydev->mac_managed_pm)
        return 0;
 
+   /* Wakeup interrupts may occur during the system sleep transition when
+    * the PHY is inaccessible. Set flag to postpone handling until the PHY
+    * has resumed. Wait for concurrent interrupt handler to complete.
+    */
+   if (phy_interrupt_is_valid(phydev)) {
+       phydev->irq_suspended = 1;
+       synchronize_irq(phydev->irq);
+   }
+
    /* We must stop the state machine manually, otherwise it stops out of
     * control, possibly with the phydev->lock held. Upon resume, netdev
     * may call phy routines that try to grab the same lock, and that may
@@ -315,6 +324,20 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
    if (ret < 0)
        return ret;
 no_resume:
+   if (phy_interrupt_is_valid(phydev)) {
+       phydev->irq_suspended = 0;
+       synchronize_irq(phydev->irq);
+
+       /* Rerun interrupts which were postponed by phy_interrupt()
+        * because they occurred during the system sleep transition.
+        */
+       if (phydev->irq_rerun) {
+           phydev->irq_rerun = 0;
+           enable_irq(phydev->irq);
+           irq_wake_thread(phydev->irq, phydev);
+       }
+   }
+
    if (phydev->attached_dev && phydev->adjust_link)
        phy_start_machine(phydev);

@@ -2516,7 +2516,7 @@ static int sfp_probe(struct platform_device *pdev)
    platform_set_drvdata(pdev, sfp);
 
-   err = devm_add_action(sfp->dev, sfp_cleanup, sfp);
+   err = devm_add_action_or_reset(sfp->dev, sfp_cleanup, sfp);
    if (err < 0)
        return err;

@@ -273,6 +273,12 @@ static void tun_napi_init(struct tun_struct *tun, struct tun_file *tfile,
    }
 }
 
+static void tun_napi_enable(struct tun_file *tfile)
+{
+   if (tfile->napi_enabled)
+       napi_enable(&tfile->napi);
+}
+
 static void tun_napi_disable(struct tun_file *tfile)
 {
    if (tfile->napi_enabled)
@@ -634,7 +640,8 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
    tun = rtnl_dereference(tfile->tun);
 
    if (tun && clean) {
-       tun_napi_disable(tfile);
+       if (!tfile->detached)
+           tun_napi_disable(tfile);
        tun_napi_del(tfile);
    }
 
@@ -653,8 +660,10 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
        if (clean) {
            RCU_INIT_POINTER(tfile->tun, NULL);
            sock_put(&tfile->sk);
-       } else
+       } else {
            tun_disable_queue(tun, tfile);
+           tun_napi_disable(tfile);
+       }
 
        synchronize_net();
        tun_flow_delete_by_queue(tun, tun->numqueues + 1);
@@ -727,6 +736,7 @@ static void tun_detach_all(struct net_device *dev)
        sock_put(&tfile->sk);
    }
    list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
+       tun_napi_del(tfile);
        tun_enable_queue(tfile);
        tun_queue_purge(tfile);
        xdp_rxq_info_unreg(&tfile->xdp_rxq);
@@ -807,6 +817,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file,
 
    if (tfile->detached) {
        tun_enable_queue(tfile);
+       tun_napi_enable(tfile);
    } else {
        sock_hold(&tfile->sk);
        tun_napi_init(tun, tfile, napi, napi_frags);

@@ -126,8 +126,7 @@
             AX_MEDIUM_RE)
 
 #define AX88772_MEDIUM_DEFAULT  \
-   (AX_MEDIUM_FD | AX_MEDIUM_RFC | \
-    AX_MEDIUM_TFC | AX_MEDIUM_PS | \
+   (AX_MEDIUM_FD | AX_MEDIUM_PS | \
     AX_MEDIUM_AC | AX_MEDIUM_RE)
 
 /* AX88772 & AX88178 RX_CTL values */

@@ -431,6 +431,7 @@ void asix_adjust_link(struct net_device *netdev)
    asix_write_medium_mode(dev, mode, 0);
    phy_print_status(phydev);
+   usbnet_link_change(dev, phydev->link, 0);
 }
 
 int asix_write_gpio(struct usbnet *dev, u16 value, int sleep, int in_pm)

@@ -1472,6 +1472,42 @@ static int ax88179_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
     * are bundled into this buffer and where we can find an array of
     * per-packet metadata (which contains elements encoded into u16).
     */
+
+   /* SKB contents for current firmware:
+    *         <packet 1> <padding>
+    *         ...
+    *         <packet N> <padding>
+    *         <per-packet metadata entry 1> <dummy header>
+    *         ...
+    *         <per-packet metadata entry N> <dummy header>
+    *         <padding2> <rx_hdr>
+    *
+    * where:
+    *         <packet N> contains pkt_len bytes:
+    *                 2 bytes of IP alignment pseudo header
+    *                 packet received
+    *         <per-packet metadata entry N> contains 4 bytes:
+    *                 pkt_len and fields AX_RXHDR_*
+    *         <padding> 0-7 bytes to terminate at
+    *                 8 bytes boundary (64-bit).
+    *         <padding2> 4 bytes to make rx_hdr terminate at
+    *                 8 bytes boundary (64-bit)
+    *         <dummy-header> contains 4 bytes:
+    *                 pkt_len=0 and AX_RXHDR_DROP_ERR
+    *         <rx-hdr> contains 4 bytes:
+    *                 pkt_cnt and hdr_off (offset of
+    *                 <per-packet metadata entry 1>)
+    *
+    * pkt_cnt is number of entrys in the per-packet metadata.
+    * In current firmware there is 2 entrys per packet.
+    * The first points to the packet and the
+    * second is a dummy header.
+    * This was done probably to align fields in 64-bit and
+    * maintain compatibility with old firmware.
+    * This code assumes that <dummy header> and <padding2> are
+    * optional.
+    */
+
    if (skb->len < 4)
        return 0;
    skb_trim(skb, skb->len - 4);
@@ -1485,51 +1521,66 @@ static int ax88179_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
    /* Make sure that the bounds of the metadata array are inside the SKB
     * (and in front of the counter at the end).
     */
-   if (pkt_cnt * 2 + hdr_off > skb->len)
+   if (pkt_cnt * 4 + hdr_off > skb->len)
        return 0;
    pkt_hdr = (u32 *)(skb->data + hdr_off);
 
    /* Packets must not overlap the metadata array */
    skb_trim(skb, hdr_off);
 
-   for (; ; pkt_cnt--, pkt_hdr++) {
+   for (; pkt_cnt > 0; pkt_cnt--, pkt_hdr++) {
+       u16 pkt_len_plus_padd;
        u16 pkt_len;
 
        le32_to_cpus(pkt_hdr);
        pkt_len = (*pkt_hdr >> 16) & 0x1fff;
+       pkt_len_plus_padd = (pkt_len + 7) & 0xfff8;
 
-       if (pkt_len > skb->len)
+       /* Skip dummy header used for alignment
+        */
+       if (pkt_len == 0)
+           continue;
+
+       if (pkt_len_plus_padd > skb->len)
            return 0;
 
        /* Check CRC or runt packet */
-       if (((*pkt_hdr & (AX_RXHDR_CRC_ERR | AX_RXHDR_DROP_ERR)) == 0) &&
-           pkt_len >= 2 + ETH_HLEN) {
-           bool last = (pkt_cnt == 0);
-
-           if (last) {
-               ax_skb = skb;
-           } else {
-               ax_skb = skb_clone(skb, GFP_ATOMIC);
-               if (!ax_skb)
-                   return 0;
-           }
-           ax_skb->len = pkt_len;
-           /* Skip IP alignment pseudo header */
-           skb_pull(ax_skb, 2);
-           skb_set_tail_pointer(ax_skb, ax_skb->len);
-           ax_skb->truesize = pkt_len + sizeof(struct sk_buff);
-           ax88179_rx_checksum(ax_skb, pkt_hdr);
-
-           if (last)
-               return 1;
-
-           usbnet_skb_return(dev, ax_skb);
+       if ((*pkt_hdr & (AX_RXHDR_CRC_ERR | AX_RXHDR_DROP_ERR)) ||
+           pkt_len < 2 + ETH_HLEN) {
+           dev->net->stats.rx_errors++;
+           skb_pull(skb, pkt_len_plus_padd);
+           continue;
        }
 
-       /* Trim this packet away from the SKB */
-       if (!skb_pull(skb, (pkt_len + 7) & 0xFFF8))
+       /* last packet */
+       if (pkt_len_plus_padd == skb->len) {
+           skb_trim(skb, pkt_len);
+
+           /* Skip IP alignment pseudo header */
+           skb_pull(skb, 2);
+
+           skb->truesize = SKB_TRUESIZE(pkt_len_plus_padd);
+           ax88179_rx_checksum(skb, pkt_hdr);
+           return 1;
+       }
+
+       ax_skb = skb_clone(skb, GFP_ATOMIC);
+       if (!ax_skb)
            return 0;
+
+       skb_trim(ax_skb, pkt_len);
+
+       /* Skip IP alignment pseudo header */
+       skb_pull(ax_skb, 2);
+
+       skb->truesize = pkt_len_plus_padd +
+               SKB_DATA_ALIGN(sizeof(struct sk_buff));
+       ax88179_rx_checksum(ax_skb, pkt_hdr);
+       usbnet_skb_return(dev, ax_skb);
+
+       skb_pull(skb, pkt_len_plus_padd);
    }
+
+   return 0;
 }
 
 static struct sk_buff *

@@ -2004,7 +2004,7 @@ static int __usbnet_read_cmd(struct usbnet *dev, u8 cmd, u8 reqtype,
               cmd, reqtype, value, index, size);
 
    if (size) {
-       buf = kmalloc(size, GFP_KERNEL);
+       buf = kmalloc(size, GFP_NOIO);
        if (!buf)
            goto out;
    }
@@ -2036,7 +2036,7 @@ static int __usbnet_write_cmd(struct usbnet *dev, u8 cmd, u8 reqtype,
               cmd, reqtype, value, index, size);
 
    if (data) {
-       buf = kmemdup(data, size, GFP_KERNEL);
+       buf = kmemdup(data, size, GFP_NOIO);
        if (!buf)
            goto out;
    } else {

@@ -167,9 +167,9 @@ static int nfcmrvl_i2c_parse_dt(struct device_node *node,
        pdata->irq_polarity = IRQF_TRIGGER_RISING;
 
    ret = irq_of_parse_and_map(node, 0);
-   if (ret < 0) {
-       pr_err("Unable to get irq, error: %d\n", ret);
-       return ret;
+   if (!ret) {
+       pr_err("Unable to get irq\n");
+       return -EINVAL;
    }
    pdata->irq = ret;

@@ -115,9 +115,9 @@ static int nfcmrvl_spi_parse_dt(struct device_node *node,
    }
 
    ret = irq_of_parse_and_map(node, 0);
-   if (ret < 0) {
-       pr_err("Unable to get irq, error: %d\n", ret);
-       return ret;
+   if (!ret) {
+       pr_err("Unable to get irq\n");
+       return -EINVAL;
    }
    pdata->irq = ret;

@@ -122,7 +122,9 @@ static int nxp_nci_i2c_fw_read(struct nxp_nci_i2c_phy *phy,
    skb_put_data(*skb, &header, NXP_NCI_FW_HDR_LEN);
 
    r = i2c_master_recv(client, skb_put(*skb, frame_len), frame_len);
-   if (r != frame_len) {
+   if (r < 0) {
+       goto fw_read_exit_free_skb;
+   } else if (r != frame_len) {
        nfc_err(&client->dev,
            "Invalid frame length: %u (expected %zu)\n",
            r, frame_len);
@@ -162,8 +164,13 @@ static int nxp_nci_i2c_nci_read(struct nxp_nci_i2c_phy *phy,
    skb_put_data(*skb, (void *)&header, NCI_CTRL_HDR_SIZE);
 
+   if (!header.plen)
+       return 0;
+
    r = i2c_master_recv(client, skb_put(*skb, header.plen), header.plen);
-   if (r != header.plen) {
+   if (r < 0) {
+       goto nci_read_exit_free_skb;
+   } else if (r != header.plen) {
        nfc_err(&client->dev,
            "Invalid frame payload length: %u (expected %u)\n",
            r, header.plen);

@@ -1671,7 +1671,7 @@ enum netdev_priv_flags {
    IFF_FAILOVER_SLAVE      = 1<<28,
    IFF_L3MDEV_RX_HANDLER       = 1<<29,
    IFF_LIVE_RENAME_OK      = 1<<30,
-   IFF_TX_SKB_NO_LINEAR        = 1<<31,
+   IFF_TX_SKB_NO_LINEAR        = BIT_ULL(31),
    IFF_CHANGE_PROTO_DOWN       = BIT_ULL(32),
 };

@@ -572,6 +572,10 @@ struct macsec_ops;
  * @mdix_ctrl: User setting of crossover
  * @pma_extable: Cached value of PMA/PMD Extended Abilities Register
  * @interrupts: Flag interrupts have been enabled
+ * @irq_suspended: Flag indicating PHY is suspended and therefore interrupt
+ *                 handling shall be postponed until PHY has resumed
+ * @irq_rerun: Flag indicating interrupts occurred while PHY was suspended,
+ *             requiring a rerun of the interrupt handler after resume
  * @interface: enum phy_interface_t value
  * @skb: Netlink message for cable diagnostics
  * @nest: Netlink nest used for cable diagnostics
@@ -626,6 +630,8 @@ struct phy_device {
 
    /* Interrupts are enabled */
    unsigned interrupts:1;
+   unsigned irq_suspended:1;
+   unsigned irq_rerun:1;
 
    enum phy_state state;

@@ -1338,24 +1338,28 @@ void nft_unregister_flowtable_type(struct nf_flowtable_type *type);
 /**
  * struct nft_traceinfo - nft tracing information and state
  *
+ * @trace: other struct members are initialised
+ * @nf_trace: copy of skb->nf_trace before rule evaluation
+ * @type: event type (enum nft_trace_types)
+ * @skbid: hash of skb to be used as trace id
+ * @packet_dumped: packet headers sent in a previous traceinfo message
  * @pkt: pktinfo currently processed
  * @basechain: base chain currently processed
  * @chain: chain currently processed
  * @rule: rule that was evaluated
  * @verdict: verdict given by rule
- * @type: event type (enum nft_trace_types)
- * @packet_dumped: packet headers sent in a previous traceinfo message
- * @trace: other struct members are initialised
  */
 struct nft_traceinfo {
+   bool trace;
+   bool nf_trace;
+   bool packet_dumped;
+   enum nft_trace_types type:8;
+   u32 skbid;
    const struct nft_pktinfo *pkt;
    const struct nft_base_chain *basechain;
    const struct nft_chain *chain;
    const struct nft_rule_dp *rule;
    const struct nft_verdict *verdict;
-   enum nft_trace_types type;
-   bool packet_dumped;
-   bool trace;
 };
 
 void nft_trace_init(struct nft_traceinfo *info, const struct nft_pktinfo *pkt,

@@ -2,16 +2,17 @@
 #ifndef _UAPI_MPTCP_H
 #define _UAPI_MPTCP_H
 
+#ifndef __KERNEL__
+#include <netinet/in.h>    /* for sockaddr_in and sockaddr_in6 */
+#include <sys/socket.h>    /* for struct sockaddr */
+#endif
+
 #include <linux/const.h>
 #include <linux/types.h>
 #include <linux/in.h>      /* for sockaddr_in */
 #include <linux/in6.h>     /* for sockaddr_in6 */
 #include <linux/socket.h>  /* for sockaddr_storage and sa_family */
 
-#ifndef __KERNEL__
-#include <sys/socket.h>    /* for struct sockaddr */
-#endif
-
 #define MPTCP_SUBFLOW_FLAG_MCAP_REM        _BITUL(0)
 #define MPTCP_SUBFLOW_FLAG_MCAP_LOC        _BITUL(1)
 #define MPTCP_SUBFLOW_FLAG_JOIN_REM        _BITUL(2)

@@ -1012,9 +1012,24 @@ int br_nf_hook_thresh(unsigned int hook, struct net *net,
        return okfn(net, sk, skb);
 
    ops = nf_hook_entries_get_hook_ops(e);
-   for (i = 0; i < e->num_hook_entries &&
-        ops[i]->priority <= NF_BR_PRI_BRNF; i++)
-       ;
+   for (i = 0; i < e->num_hook_entries; i++) {
+       /* These hooks have already been called */
+       if (ops[i]->priority < NF_BR_PRI_BRNF)
+           continue;
+
+       /* These hooks have not been called yet, run them. */
+       if (ops[i]->priority > NF_BR_PRI_BRNF)
+           break;
+
+       /* take a closer look at NF_BR_PRI_BRNF. */
+       if (ops[i]->hook == br_nf_pre_routing) {
+           /* This hook diverted the skb to this function,
+            * hooks after this have not been run yet.
+            */
+           i++;
+           break;
+       }
+   }
 
    nf_hook_state_init(&state, hook, NFPROTO_BRIDGE, indev, outdev,
               sk, net, okfn);

@@ -410,7 +410,7 @@ int skb_tunnel_check_pmtu(struct sk_buff *skb, struct dst_entry *encap_dst,
    u32 mtu = dst_mtu(encap_dst) - headroom;
 
    if ((skb_is_gso(skb) && skb_gso_validate_network_len(skb, mtu)) ||
-       (!skb_is_gso(skb) && (skb->len - skb_mac_header_len(skb)) <= mtu))
+       (!skb_is_gso(skb) && (skb->len - skb_network_offset(skb)) <= mtu))
        return 0;
 
    skb_dst_update_pmtu_no_confirm(skb, mtu);

@@ -1964,7 +1964,10 @@ process:
        struct sock *nsk;
 
        sk = req->rsk_listener;
-       drop_reason = tcp_inbound_md5_hash(sk, skb,
+       if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
+           drop_reason = SKB_DROP_REASON_XFRM_POLICY;
+       else
+           drop_reason = tcp_inbound_md5_hash(sk, skb,
                           &iph->saddr, &iph->daddr,
                           AF_INET, dif, sdif);
        if (unlikely(drop_reason)) {
@@ -2016,6 +2019,7 @@ process:
        }
        goto discard_and_relse;
    }
+   nf_reset_ct(skb);
    if (nsk == sk) {
        reqsk_put(req);
        tcp_v4_restore_cb(skb);

@@ -1109,10 +1109,6 @@ ipv6_add_addr(struct inet6_dev *idev, struct ifa6_config *cfg,
        goto out;
    }
 
-   if (net->ipv6.devconf_all->disable_policy ||
-       idev->cnf.disable_policy)
-       f6i->dst_nopolicy = true;
-
    neigh_parms_data_state_setall(idev->nd_parms);
 
    ifa->addr = *cfg->pfx;
@@ -5172,9 +5168,9 @@ next:
        fillargs->event = RTM_GETMULTICAST;
 
        /* multicast address */
-       for (ifmca = rcu_dereference(idev->mc_list);
+       for (ifmca = rtnl_dereference(idev->mc_list);
             ifmca;
-            ifmca = rcu_dereference(ifmca->next), ip_idx++) {
+            ifmca = rtnl_dereference(ifmca->next), ip_idx++) {
            if (ip_idx < s_ip_idx)
                continue;
            err = inet6_fill_ifmcaddr(skb, ifmca, fillargs);


@@ -4569,8 +4569,15 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
     }
 
     f6i = ip6_route_info_create(&cfg, gfp_flags, NULL);
-    if (!IS_ERR(f6i))
+    if (!IS_ERR(f6i)) {
         f6i->dst_nocount = true;
+
+        if (!anycast &&
+            (net->ipv6.devconf_all->disable_policy ||
+             idev->cnf.disable_policy))
+            f6i->dst_nopolicy = true;
+    }
+
     return f6i;
 }


@@ -406,7 +406,6 @@ int __net_init seg6_hmac_net_init(struct net *net)
 
     return rhashtable_init(&sdata->hmac_infos, &rht_params);
 }
-EXPORT_SYMBOL(seg6_hmac_net_init);
 
 void seg6_hmac_exit(void)
 {


@@ -323,8 +323,6 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
         kcalloc(cmax, sizeof(*kp), GFP_KERNEL_ACCOUNT | __GFP_NOWARN) :
         NULL;
 
-    rcu_read_lock();
-
     ca = min(t->prl_count, cmax);
 
     if (!kp) {
@@ -341,7 +339,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
         }
     }
 
-    c = 0;
+    rcu_read_lock();
     for_each_prl_rcu(t->prl) {
         if (c >= cmax)
             break;
@@ -353,7 +351,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
         if (kprl.addr != htonl(INADDR_ANY))
             break;
     }
-out:
+
     rcu_read_unlock();
 
     len = sizeof(*kp) * c;
@@ -362,7 +360,7 @@ out:
         ret = -EFAULT;
 
     kfree(kp);
-
+out:
     return ret;
 }


@@ -765,6 +765,7 @@ static noinline bool mptcp_established_options_rst(struct sock *sk, struct sk_bu
     opts->suboptions |= OPTION_MPTCP_RST;
     opts->reset_transient = subflow->reset_transient;
     opts->reset_reason = subflow->reset_reason;
+    MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPRSTTX);
 
     return true;
 }
@@ -788,6 +789,7 @@ static bool mptcp_established_options_fastclose(struct sock *sk,
     opts->rcvr_key = msk->remote_key;
 
     pr_debug("FASTCLOSE key=%llu", opts->rcvr_key);
+    MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPFASTCLOSETX);
     return true;
 }
@@ -809,6 +811,7 @@ static bool mptcp_established_options_mp_fail(struct sock *sk,
     opts->fail_seq = subflow->map_seq;
     pr_debug("MP_FAIL fail_seq=%llu", opts->fail_seq);
+    MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPFAILTX);
     return true;
 }
@@ -833,13 +836,11 @@ bool mptcp_established_options(struct sock *sk, struct sk_buff *skb,
             mptcp_established_options_mp_fail(sk, &opt_size, remaining, opts)) {
             *size += opt_size;
             remaining -= opt_size;
-            MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPFASTCLOSETX);
         }
         /* MP_RST can be used with MP_FASTCLOSE and MP_FAIL if there is room */
         if (mptcp_established_options_rst(sk, skb, &opt_size, remaining, opts)) {
             *size += opt_size;
             remaining -= opt_size;
-            MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPRSTTX);
         }
         return true;
     }
@@ -966,7 +967,7 @@ static bool check_fully_established(struct mptcp_sock *msk, struct sock *ssk,
             goto reset;
         subflow->mp_capable = 0;
         pr_fallback(msk);
-        __mptcp_do_fallback(msk);
+        mptcp_do_fallback(ssk);
         return false;
     }


@@ -299,23 +299,21 @@ void mptcp_pm_mp_fail_received(struct sock *sk, u64 fail_seq)
 {
     struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
     struct mptcp_sock *msk = mptcp_sk(subflow->conn);
+    struct sock *s = (struct sock *)msk;
 
     pr_debug("fail_seq=%llu", fail_seq);
 
     if (!READ_ONCE(msk->allow_infinite_fallback))
         return;
 
-    if (!READ_ONCE(subflow->mp_fail_response_expect)) {
+    if (!subflow->fail_tout) {
         pr_debug("send MP_FAIL response and infinite map");
 
         subflow->send_mp_fail = 1;
-        MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPFAILTX);
         subflow->send_infinite_map = 1;
-    } else if (!sock_flag(sk, SOCK_DEAD)) {
+        tcp_send_ack(sk);
+    } else {
         pr_debug("MP_FAIL response received");
+
+        WRITE_ONCE(subflow->fail_tout, 0);
+        sk_stop_timer(s, &s->sk_timer);
     }
 }


@@ -500,7 +500,7 @@ static void mptcp_set_timeout(struct sock *sk)
     __mptcp_set_timeout(sk, tout);
 }
 
-static bool tcp_can_send_ack(const struct sock *ssk)
+static inline bool tcp_can_send_ack(const struct sock *ssk)
 {
     return !((1 << inet_sk_state_load(ssk)) &
            (TCPF_SYN_SENT | TCPF_SYN_RECV | TCPF_TIME_WAIT | TCPF_CLOSE | TCPF_LISTEN));
@@ -1245,7 +1245,7 @@ static void mptcp_update_infinite_map(struct mptcp_sock *msk,
     MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPTX);
     mptcp_subflow_ctx(ssk)->send_infinite_map = 0;
     pr_fallback(msk);
-    __mptcp_do_fallback(msk);
+    mptcp_do_fallback(ssk);
 }
 
 static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
@@ -2175,21 +2175,6 @@ static void mptcp_retransmit_timer(struct timer_list *t)
     sock_put(sk);
 }
 
-static struct mptcp_subflow_context *
-mp_fail_response_expect_subflow(struct mptcp_sock *msk)
-{
-    struct mptcp_subflow_context *subflow, *ret = NULL;
-
-    mptcp_for_each_subflow(msk, subflow) {
-        if (READ_ONCE(subflow->mp_fail_response_expect)) {
-            ret = subflow;
-            break;
-        }
-    }
-
-    return ret;
-}
-
 static void mptcp_timeout_timer(struct timer_list *t)
 {
     struct sock *sk = from_timer(sk, t, sk_timer);
@@ -2346,6 +2331,11 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
         kfree_rcu(subflow, rcu);
     } else {
         /* otherwise tcp will dispose of the ssk and subflow ctx */
+        if (ssk->sk_state == TCP_LISTEN) {
+            tcp_set_state(ssk, TCP_CLOSE);
+            mptcp_subflow_queue_clean(ssk);
+            inet_csk_listen_stop(ssk);
+        }
+
         __tcp_close(ssk, 0);
 
         /* close acquired an extra ref */
@@ -2518,27 +2508,50 @@ reset_timer:
     mptcp_reset_timer(sk);
 }
 
+/* schedule the timeout timer for the relevant event: either close timeout
+ * or mp_fail timeout. The close timeout takes precedence on the mp_fail one
+ */
+void mptcp_reset_timeout(struct mptcp_sock *msk, unsigned long fail_tout)
+{
+    struct sock *sk = (struct sock *)msk;
+    unsigned long timeout, close_timeout;
+
+    if (!fail_tout && !sock_flag(sk, SOCK_DEAD))
+        return;
+
+    close_timeout = inet_csk(sk)->icsk_mtup.probe_timestamp - tcp_jiffies32 + jiffies + TCP_TIMEWAIT_LEN;
+
+    /* the close timeout takes precedence on the fail one, and here at least one of
+     * them is active
+     */
+    timeout = sock_flag(sk, SOCK_DEAD) ? close_timeout : fail_tout;
+
+    sk_reset_timer(sk, &sk->sk_timer, timeout);
+}
+
 static void mptcp_mp_fail_no_response(struct mptcp_sock *msk)
 {
-    struct mptcp_subflow_context *subflow;
-    struct sock *ssk;
+    struct sock *ssk = msk->first;
     bool slow;
 
-    subflow = mp_fail_response_expect_subflow(msk);
-    if (subflow) {
-        pr_debug("MP_FAIL doesn't respond, reset the subflow");
+    if (!ssk)
+        return;
 
-        ssk = mptcp_subflow_tcp_sock(subflow);
-        slow = lock_sock_fast(ssk);
-        mptcp_subflow_reset(ssk);
-        unlock_sock_fast(ssk, slow);
-    }
+    pr_debug("MP_FAIL doesn't respond, reset the subflow");
+
+    slow = lock_sock_fast(ssk);
+    mptcp_subflow_reset(ssk);
+    WRITE_ONCE(mptcp_subflow_ctx(ssk)->fail_tout, 0);
+    unlock_sock_fast(ssk, slow);
+
+    mptcp_reset_timeout(msk, 0);
 }
 
 static void mptcp_worker(struct work_struct *work)
 {
     struct mptcp_sock *msk = container_of(work, struct mptcp_sock, work);
     struct sock *sk = &msk->sk.icsk_inet.sk;
+    unsigned long fail_tout;
     int state;
 
     lock_sock(sk);
@@ -2575,7 +2588,9 @@ static void mptcp_worker(struct work_struct *work)
     if (test_and_clear_bit(MPTCP_WORK_RTX, &msk->flags))
         __mptcp_retrans(sk);
 
-    mptcp_mp_fail_no_response(msk);
+    fail_tout = msk->first ? READ_ONCE(mptcp_subflow_ctx(msk->first)->fail_tout) : 0;
+    if (fail_tout && time_after(jiffies, fail_tout))
+        mptcp_mp_fail_no_response(msk);
 
 unlock:
     release_sock(sk);
@@ -2822,6 +2837,7 @@ static void __mptcp_destroy_sock(struct sock *sk)
 static void mptcp_close(struct sock *sk, long timeout)
 {
     struct mptcp_subflow_context *subflow;
+    struct mptcp_sock *msk = mptcp_sk(sk);
     bool do_cancel_work = false;
 
     lock_sock(sk);
@@ -2840,10 +2856,16 @@ static void mptcp_close(struct sock *sk, long timeout)
 cleanup:
     /* orphan all the subflows */
     inet_csk(sk)->icsk_mtup.probe_timestamp = tcp_jiffies32;
-    mptcp_for_each_subflow(mptcp_sk(sk), subflow) {
+    mptcp_for_each_subflow(msk, subflow) {
         struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
         bool slow = lock_sock_fast_nested(ssk);
 
+        /* since the close timeout takes precedence on the fail one,
+         * cancel the latter
+         */
+        if (ssk == msk->first)
+            subflow->fail_tout = 0;
+
         sock_orphan(ssk);
         unlock_sock_fast(ssk, slow);
     }
@@ -2852,13 +2874,13 @@ cleanup:
     sock_hold(sk);
     pr_debug("msk=%p state=%d", sk, sk->sk_state);
     if (mptcp_sk(sk)->token)
-        mptcp_event(MPTCP_EVENT_CLOSED, mptcp_sk(sk), NULL, GFP_KERNEL);
+        mptcp_event(MPTCP_EVENT_CLOSED, msk, NULL, GFP_KERNEL);
 
     if (sk->sk_state == TCP_CLOSE) {
         __mptcp_destroy_sock(sk);
         do_cancel_work = true;
     } else {
-        sk_reset_timer(sk, &sk->sk_timer, jiffies + TCP_TIMEWAIT_LEN);
+        mptcp_reset_timeout(msk, 0);
     }
     release_sock(sk);
     if (do_cancel_work)


@@ -306,6 +306,7 @@ struct mptcp_sock {
     u32     setsockopt_seq;
     char    ca_name[TCP_CA_NAME_MAX];
+    struct mptcp_sock *dl_next;
 };
 
 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
@@ -468,7 +469,6 @@ struct mptcp_subflow_context {
         local_id_valid : 1, /* local_id is correctly initialized */
         valid_csum_seen : 1;        /* at least one csum validated */
     enum mptcp_data_avail data_avail;
-    bool    mp_fail_response_expect;
     u32     remote_nonce;
     u64     thmac;
     u32     local_nonce;
@@ -482,6 +482,7 @@ struct mptcp_subflow_context {
     u8      stale_count;
 
     long    delegated_status;
+    unsigned long   fail_tout;
 
     );
@@ -608,6 +609,7 @@ void mptcp_close_ssk(struct sock *sk, struct sock *ssk,
                struct mptcp_subflow_context *subflow);
 void mptcp_subflow_send_ack(struct sock *ssk);
 void mptcp_subflow_reset(struct sock *ssk);
+void mptcp_subflow_queue_clean(struct sock *ssk);
 void mptcp_sock_graft(struct sock *sk, struct socket *parent);
 struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk);
@@ -662,6 +664,7 @@ void mptcp_get_options(const struct sk_buff *skb,
 void mptcp_finish_connect(struct sock *sk);
 void __mptcp_set_connected(struct sock *sk);
+void mptcp_reset_timeout(struct mptcp_sock *msk, unsigned long fail_tout);
 static inline bool mptcp_is_fully_established(struct sock *sk)
 {
     return inet_sk_state_load(sk) == TCP_ESTABLISHED &&
@@ -926,12 +929,25 @@ static inline void __mptcp_do_fallback(struct mptcp_sock *msk)
     set_bit(MPTCP_FALLBACK_DONE, &msk->flags);
 }
 
-static inline void mptcp_do_fallback(struct sock *sk)
+static inline void mptcp_do_fallback(struct sock *ssk)
 {
-    struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
-    struct mptcp_sock *msk = mptcp_sk(subflow->conn);
+    struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+    struct sock *sk = subflow->conn;
+    struct mptcp_sock *msk;
 
+    msk = mptcp_sk(sk);
     __mptcp_do_fallback(msk);
+    if (READ_ONCE(msk->snd_data_fin_enable) && !(ssk->sk_shutdown & SEND_SHUTDOWN)) {
+        gfp_t saved_allocation = ssk->sk_allocation;
+
+        /* we are in an atomic (BH) scope, override ssk default for data
+         * fin allocation
+         */
+        ssk->sk_allocation = GFP_ATOMIC;
+        ssk->sk_shutdown |= SEND_SHUTDOWN;
+        tcp_shutdown(ssk, SEND_SHUTDOWN);
+        ssk->sk_allocation = saved_allocation;
+    }
 }
 
 #define pr_fallback(a) pr_debug("%s:fallback to TCP (msk=%p)", __func__, a)


@@ -843,7 +843,8 @@ enum mapping_status {
     MAPPING_INVALID,
     MAPPING_EMPTY,
     MAPPING_DATA_FIN,
-    MAPPING_DUMMY
+    MAPPING_DUMMY,
+    MAPPING_BAD_CSUM
 };
 
 static void dbg_bad_map(struct mptcp_subflow_context *subflow, u32 ssn)
@@ -958,11 +959,7 @@ static enum mapping_status validate_data_csum(struct sock *ssk, struct sk_buff *
                  subflow->map_data_csum);
     if (unlikely(csum)) {
         MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DATACSUMERR);
-        if (subflow->mp_join || subflow->valid_csum_seen) {
-            subflow->send_mp_fail = 1;
-            MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_MPFAILTX);
-        }
-        return subflow->mp_join ? MAPPING_INVALID : MAPPING_DUMMY;
+        return MAPPING_BAD_CSUM;
     }
 
     subflow->valid_csum_seen = 1;
@@ -974,7 +971,6 @@ static enum mapping_status get_mapping_status(struct sock *ssk,
 {
     struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
     bool csum_reqd = READ_ONCE(msk->csum_enabled);
-    struct sock *sk = (struct sock *)msk;
     struct mptcp_ext *mpext;
     struct sk_buff *skb;
     u16 data_len;
@@ -1016,9 +1012,6 @@ static enum mapping_status get_mapping_status(struct sock *ssk,
             pr_debug("infinite mapping received");
             MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPRX);
             subflow->map_data_len = 0;
-            if (!sock_flag(ssk, SOCK_DEAD))
-                sk_stop_timer(sk, &sk->sk_timer);
-
             return MAPPING_INVALID;
         }
@@ -1165,6 +1158,33 @@ static bool subflow_can_fallback(struct mptcp_subflow_context *subflow)
         return !subflow->fully_established;
 }
 
+static void mptcp_subflow_fail(struct mptcp_sock *msk, struct sock *ssk)
+{
+    struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+    unsigned long fail_tout;
+
+    /* graceful failure can happen only on the MPC subflow */
+    if (WARN_ON_ONCE(ssk != READ_ONCE(msk->first)))
+        return;
+
+    /* since the close timeout takes precedence on the fail one,
+     * no need to start the latter when the first is already set
+     */
+    if (sock_flag((struct sock *)msk, SOCK_DEAD))
+        return;
+
+    /* we don't need extreme accuracy here, use a zero fail_tout as special
+     * value meaning no fail timeout at all;
+     */
+    fail_tout = jiffies + TCP_RTO_MAX;
+    if (!fail_tout)
+        fail_tout = 1;
+    WRITE_ONCE(subflow->fail_tout, fail_tout);
+    tcp_send_ack(ssk);
+
+    mptcp_reset_timeout(msk, subflow->fail_tout);
+}
+
 static bool subflow_check_data_avail(struct sock *ssk)
 {
     struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
@@ -1184,10 +1204,8 @@ static bool subflow_check_data_avail(struct sock *ssk)
         status = get_mapping_status(ssk, msk);
         trace_subflow_check_data_avail(status, skb_peek(&ssk->sk_receive_queue));
-        if (unlikely(status == MAPPING_INVALID))
-            goto fallback;
-
-        if (unlikely(status == MAPPING_DUMMY))
+        if (unlikely(status == MAPPING_INVALID || status == MAPPING_DUMMY ||
+                 status == MAPPING_BAD_CSUM))
             goto fallback;
 
         if (status != MAPPING_OK)
@@ -1229,22 +1247,17 @@ no_data:
 fallback:
     if (!__mptcp_check_fallback(msk)) {
         /* RFC 8684 section 3.7. */
-        if (subflow->send_mp_fail) {
+        if (status == MAPPING_BAD_CSUM &&
+            (subflow->mp_join || subflow->valid_csum_seen)) {
+            subflow->send_mp_fail = 1;
+
             if (!READ_ONCE(msk->allow_infinite_fallback)) {
-                ssk->sk_err = EBADMSG;
-                tcp_set_state(ssk, TCP_CLOSE);
                 subflow->reset_transient = 0;
                 subflow->reset_reason = MPTCP_RST_EMIDDLEBOX;
-                tcp_send_active_reset(ssk, GFP_ATOMIC);
-                while ((skb = skb_peek(&ssk->sk_receive_queue)))
-                    sk_eat_skb(ssk, skb);
-            } else if (!sock_flag(ssk, SOCK_DEAD)) {
-                WRITE_ONCE(subflow->mp_fail_response_expect, true);
-                sk_reset_timer((struct sock *)msk,
-                           &((struct sock *)msk)->sk_timer,
-                           jiffies + TCP_RTO_MAX);
+                goto reset;
             }
-            WRITE_ONCE(subflow->data_avail, MPTCP_SUBFLOW_NODATA);
+            mptcp_subflow_fail(msk, ssk);
+            WRITE_ONCE(subflow->data_avail, MPTCP_SUBFLOW_DATA_AVAIL);
             return true;
         }
@@ -1252,16 +1265,20 @@ fallback:
         /* fatal protocol error, close the socket.
         * subflow_error_report() will introduce the appropriate barriers
         */
-        ssk->sk_err = EBADMSG;
-        tcp_set_state(ssk, TCP_CLOSE);
         subflow->reset_transient = 0;
         subflow->reset_reason = MPTCP_RST_EMPTCP;
+
+reset:
+        ssk->sk_err = EBADMSG;
+        tcp_set_state(ssk, TCP_CLOSE);
+        while ((skb = skb_peek(&ssk->sk_receive_queue)))
+            sk_eat_skb(ssk, skb);
         tcp_send_active_reset(ssk, GFP_ATOMIC);
         WRITE_ONCE(subflow->data_avail, MPTCP_SUBFLOW_NODATA);
         return false;
     }
 
-    __mptcp_do_fallback(msk);
+    mptcp_do_fallback(ssk);
     }
 
     skb = skb_peek(&ssk->sk_receive_queue);
@@ -1706,6 +1723,58 @@ static void subflow_state_change(struct sock *sk)
     }
 }
 
+void mptcp_subflow_queue_clean(struct sock *listener_ssk)
+{
+    struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue;
+    struct mptcp_sock *msk, *next, *head = NULL;
+    struct request_sock *req;
+
+    /* build a list of all unaccepted mptcp sockets */
+    spin_lock_bh(&queue->rskq_lock);
+    for (req = queue->rskq_accept_head; req; req = req->dl_next) {
+        struct mptcp_subflow_context *subflow;
+        struct sock *ssk = req->sk;
+        struct mptcp_sock *msk;
+
+        if (!sk_is_mptcp(ssk))
+            continue;
+
+        subflow = mptcp_subflow_ctx(ssk);
+        if (!subflow || !subflow->conn)
+            continue;
+
+        /* skip if already in list */
+        msk = mptcp_sk(subflow->conn);
+        if (msk->dl_next || msk == head)
+            continue;
+
+        msk->dl_next = head;
+        head = msk;
+    }
+    spin_unlock_bh(&queue->rskq_lock);
+    if (!head)
+        return;
+
+    /* can't acquire the msk socket lock under the subflow one,
+     * or will cause ABBA deadlock
+     */
+    release_sock(listener_ssk);
+
+    for (msk = head; msk; msk = next) {
+        struct sock *sk = (struct sock *)msk;
+        bool slow;
+
+        slow = lock_sock_fast_nested(sk);
+        next = msk->dl_next;
+        msk->first = NULL;
+        msk->dl_next = NULL;
+        unlock_sock_fast(sk, slow);
+    }
+
+    /* we are still under the listener msk socket lock */
+    lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING);
+}
+
 static int subflow_ulp_init(struct sock *sk)
 {
     struct inet_connection_sock *icsk = inet_csk(sk);


@@ -1803,7 +1803,8 @@ struct ncsi_dev *ncsi_register_dev(struct net_device *dev,
     pdev = to_platform_device(dev->dev.parent);
     if (pdev) {
         np = pdev->dev.of_node;
-        if (np && of_get_property(np, "mlx,multi-host", NULL))
+        if (np && (of_get_property(np, "mellanox,multi-host", NULL) ||
+               of_get_property(np, "mlx,multi-host", NULL)))
             ndp->mlx_multi_host = true;
     }


@@ -25,9 +25,7 @@ static noinline void __nft_trace_packet(struct nft_traceinfo *info,
                     const struct nft_chain *chain,
                     enum nft_trace_types type)
 {
-    const struct nft_pktinfo *pkt = info->pkt;
-
-    if (!info->trace || !pkt->skb->nf_trace)
+    if (!info->trace || !info->nf_trace)
         return;
 
     info->chain = chain;
@@ -42,11 +40,24 @@ static inline void nft_trace_packet(struct nft_traceinfo *info,
                     enum nft_trace_types type)
 {
     if (static_branch_unlikely(&nft_trace_enabled)) {
+        const struct nft_pktinfo *pkt = info->pkt;
+
+        info->nf_trace = pkt->skb->nf_trace;
         info->rule = rule;
         __nft_trace_packet(info, chain, type);
     }
 }
 
+static inline void nft_trace_copy_nftrace(struct nft_traceinfo *info)
+{
+    if (static_branch_unlikely(&nft_trace_enabled)) {
+        const struct nft_pktinfo *pkt = info->pkt;
+
+        if (info->trace)
+            info->nf_trace = pkt->skb->nf_trace;
+    }
+}
+
 static void nft_bitwise_fast_eval(const struct nft_expr *expr,
                   struct nft_regs *regs)
 {
@@ -85,6 +96,7 @@ static noinline void __nft_trace_verdict(struct nft_traceinfo *info,
                      const struct nft_chain *chain,
                      const struct nft_regs *regs)
 {
+    const struct nft_pktinfo *pkt = info->pkt;
     enum nft_trace_types type;
 
     switch (regs->verdict.code) {
@@ -92,8 +104,13 @@
     case NFT_RETURN:
         type = NFT_TRACETYPE_RETURN;
         break;
+    case NF_STOLEN:
+        type = NFT_TRACETYPE_RULE;
+        /* can't access skb->nf_trace; use copy */
+        break;
     default:
         type = NFT_TRACETYPE_RULE;
+        info->nf_trace = pkt->skb->nf_trace;
         break;
     }
@@ -254,6 +271,7 @@ next_rule:
     switch (regs.verdict.code) {
     case NFT_BREAK:
         regs.verdict.code = NFT_CONTINUE;
+        nft_trace_copy_nftrace(&info);
         continue;
     case NFT_CONTINUE:
         nft_trace_packet(&info, chain, rule,


@@ -7,7 +7,7 @@
 #include <linux/module.h>
 #include <linux/static_key.h>
 #include <linux/hash.h>
-#include <linux/jhash.h>
+#include <linux/siphash.h>
 #include <linux/if_vlan.h>
 #include <linux/init.h>
 #include <linux/skbuff.h>
@@ -25,22 +25,6 @@
 DEFINE_STATIC_KEY_FALSE(nft_trace_enabled);
 EXPORT_SYMBOL_GPL(nft_trace_enabled);
 
-static int trace_fill_id(struct sk_buff *nlskb, struct sk_buff *skb)
-{
-    __be32 id;
-
-    /* using skb address as ID results in a limited number of
-     * values (and quick reuse).
-     *
-     * So we attempt to use as many skb members that will not
-     * change while skb is with netfilter.
-     */
-    id = (__be32)jhash_2words(hash32_ptr(skb), skb_get_hash(skb),
-                  skb->skb_iif);
-
-    return nla_put_be32(nlskb, NFTA_TRACE_ID, id);
-}
-
 static int trace_fill_header(struct sk_buff *nlskb, u16 type,
                  const struct sk_buff *skb,
                  int off, unsigned int len)
@@ -186,6 +170,7 @@ void nft_trace_notify(struct nft_traceinfo *info)
     struct nlmsghdr *nlh;
     struct sk_buff *skb;
     unsigned int size;
+    u32 mark = 0;
     u16 event;
 
     if (!nfnetlink_has_listeners(nft_net(pkt), NFNLGRP_NFTRACE))
@@ -229,7 +214,7 @@ void nft_trace_notify(struct nft_traceinfo *info)
     if (nla_put_be32(skb, NFTA_TRACE_TYPE, htonl(info->type)))
         goto nla_put_failure;
 
-    if (trace_fill_id(skb, pkt->skb))
+    if (nla_put_u32(skb, NFTA_TRACE_ID, info->skbid))
         goto nla_put_failure;
 
     if (nla_put_string(skb, NFTA_TRACE_CHAIN, info->chain->name))
@@ -249,16 +234,24 @@ void nft_trace_notify(struct nft_traceinfo *info)
     case NFT_TRACETYPE_RULE:
         if (nft_verdict_dump(skb, NFTA_TRACE_VERDICT, info->verdict))
             goto nla_put_failure;
+
+        /* pkt->skb undefined iff NF_STOLEN, disable dump */
+        if (info->verdict->code == NF_STOLEN)
+            info->packet_dumped = true;
+        else
+            mark = pkt->skb->mark;
+
         break;
     case NFT_TRACETYPE_POLICY:
+        mark = pkt->skb->mark;
+
         if (nla_put_be32(skb, NFTA_TRACE_POLICY,
                  htonl(info->basechain->policy)))
             goto nla_put_failure;
         break;
     }
 
-    if (pkt->skb->mark &&
-        nla_put_be32(skb, NFTA_TRACE_MARK, htonl(pkt->skb->mark)))
+    if (mark && nla_put_be32(skb, NFTA_TRACE_MARK, htonl(mark)))
         goto nla_put_failure;
 
     if (!info->packet_dumped) {
@@ -283,9 +276,20 @@ void nft_trace_init(struct nft_traceinfo *info, const struct nft_pktinfo *pkt,
             const struct nft_verdict *verdict,
             const struct nft_chain *chain)
 {
+    static siphash_key_t trace_key __read_mostly;
+    struct sk_buff *skb = pkt->skb;
+
     info->basechain = nft_base_chain(chain);
     info->trace = true;
+    info->nf_trace = pkt->skb->nf_trace;
     info->packet_dumped = false;
     info->pkt = pkt;
     info->verdict = verdict;
+
+    net_get_random_once(&trace_key, sizeof(trace_key));
+
+    info->skbid = (u32)siphash_3u32(hash32_ptr(skb),
+                    skb_get_hash(skb),
+                    skb->skb_iif,
+                    &trace_key);
 }


@@ -143,6 +143,7 @@ static bool nft_rhash_update(struct nft_set *set, const u32 *key,
     /* Another cpu may race to insert the element with the same key */
     if (prev) {
         nft_set_elem_destroy(set, he, true);
+        atomic_dec(&set->nelems);
         he = prev;
     }
 
@@ -152,6 +153,7 @@ out:
 
 err2:
     nft_set_elem_destroy(set, he, true);
+    atomic_dec(&set->nelems);
 err1:
     return false;
 }


@@ -31,89 +31,89 @@ static void rose_idletimer_expiry(struct timer_list *);
 
 void rose_start_heartbeat(struct sock *sk)
 {
-    del_timer(&sk->sk_timer);
+    sk_stop_timer(sk, &sk->sk_timer);
 
     sk->sk_timer.function = rose_heartbeat_expiry;
     sk->sk_timer.expires  = jiffies + 5 * HZ;
 
-    add_timer(&sk->sk_timer);
+    sk_reset_timer(sk, &sk->sk_timer, sk->sk_timer.expires);
 }
 
 void rose_start_t1timer(struct sock *sk)
 {
     struct rose_sock *rose = rose_sk(sk);
 
-    del_timer(&rose->timer);
+    sk_stop_timer(sk, &rose->timer);
 
     rose->timer.function = rose_timer_expiry;
     rose->timer.expires  = jiffies + rose->t1;
 
-    add_timer(&rose->timer);
+    sk_reset_timer(sk, &rose->timer, rose->timer.expires);
 }
 
 void rose_start_t2timer(struct sock *sk)
 {
     struct rose_sock *rose = rose_sk(sk);
 
-    del_timer(&rose->timer);
+    sk_stop_timer(sk, &rose->timer);
 
     rose->timer.function = rose_timer_expiry;
     rose->timer.expires  = jiffies + rose->t2;
 
-    add_timer(&rose->timer);
+    sk_reset_timer(sk, &rose->timer, rose->timer.expires);
 }
 
 void rose_start_t3timer(struct sock *sk)
 {
     struct rose_sock *rose = rose_sk(sk);
 
-    del_timer(&rose->timer);
+    sk_stop_timer(sk, &rose->timer);
 
     rose->timer.function = rose_timer_expiry;
     rose->timer.expires  = jiffies + rose->t3;
 
-    add_timer(&rose->timer);
+    sk_reset_timer(sk, &rose->timer, rose->timer.expires);
 }
 
 void rose_start_hbtimer(struct sock *sk)
 {
     struct rose_sock *rose = rose_sk(sk);
 
-    del_timer(&rose->timer);
+    sk_stop_timer(sk, &rose->timer);
 
     rose->timer.function = rose_timer_expiry;
     rose->timer.expires  = jiffies + rose->hb;
 
-    add_timer(&rose->timer);
+    sk_reset_timer(sk, &rose->timer, rose->timer.expires);
 }
 
 void rose_start_idletimer(struct sock *sk)
 {
     struct rose_sock *rose = rose_sk(sk);
 
-    del_timer(&rose->idletimer);
+    sk_stop_timer(sk, &rose->idletimer);
 
     if (rose->idle > 0) {
         rose->idletimer.function = rose_idletimer_expiry;
         rose->idletimer.expires  = jiffies + rose->idle;
 
-        add_timer(&rose->idletimer);
+        sk_reset_timer(sk, &rose->idletimer, rose->idletimer.expires);
     }
 }
 
 void rose_stop_heartbeat(struct sock *sk)
 {
-    del_timer(&sk->sk_timer);
+    sk_stop_timer(sk, &sk->sk_timer);
 }
 
 void rose_stop_timer(struct sock *sk)
 {
-    del_timer(&rose_sk(sk)->timer);
+    sk_stop_timer(sk, &rose_sk(sk)->timer);
 }
 
 void rose_stop_idletimer(struct sock *sk)
 {
-    del_timer(&rose_sk(sk)->idletimer);
+    sk_stop_timer(sk, &rose_sk(sk)->idletimer);
 }
 
 static void rose_heartbeat_expiry(struct timer_list *t)
@@ -130,6 +130,7 @@ static void rose_heartbeat_expiry(struct timer_list *t)
             (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_DEAD))) {
             bh_unlock_sock(sk);
             rose_destroy_socket(sk);
+            sock_put(sk);
             return;
         }
         break;
@@ -152,6 +153,7 @@ static void rose_heartbeat_expiry(struct timer_list *t)
     rose_start_heartbeat(sk);
     bh_unlock_sock(sk);
+    sock_put(sk);
 }
 
 static void rose_timer_expiry(struct timer_list *t)
@@ -181,6 +183,7 @@ static void rose_timer_expiry(struct timer_list *t)
         break;
     }
     bh_unlock_sock(sk);
+    sock_put(sk);
 }
 
 static void rose_idletimer_expiry(struct timer_list *t)
@@ -205,4 +208,5 @@ static void rose_idletimer_expiry(struct timer_list *t)
     sock_set_flag(sk, SOCK_DEAD);
} }
bh_unlock_sock(sk); bh_unlock_sock(sk);
sock_put(sk);
} }


@@ -588,7 +588,8 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
 }
 
 static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
-			  const struct tc_action_ops *ops)
+			  const struct tc_action_ops *ops,
+			  struct netlink_ext_ack *extack)
 {
 	struct nlattr *nest;
 	int n_i = 0;
@@ -604,20 +605,25 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
 	if (nla_put_string(skb, TCA_KIND, ops->kind))
 		goto nla_put_failure;
 
+	ret = 0;
 	mutex_lock(&idrinfo->lock);
 	idr_for_each_entry_ul(idr, p, tmp, id) {
 		if (IS_ERR(p))
 			continue;
 		ret = tcf_idr_release_unsafe(p);
-		if (ret == ACT_P_DELETED) {
+		if (ret == ACT_P_DELETED)
 			module_put(ops->owner);
-			n_i++;
-		} else if (ret < 0) {
-			mutex_unlock(&idrinfo->lock);
-			goto nla_put_failure;
-		}
+		else if (ret < 0)
+			break;
+		n_i++;
 	}
 	mutex_unlock(&idrinfo->lock);
+	if (ret < 0) {
+		if (n_i)
+			NL_SET_ERR_MSG(extack, "Unable to flush all TC actions");
+		else
+			goto nla_put_failure;
+	}
 
 	ret = nla_put_u32(skb, TCA_FCNT, n_i);
 	if (ret)
@@ -638,7 +644,7 @@ int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb,
 	struct tcf_idrinfo *idrinfo = tn->idrinfo;
 
 	if (type == RTM_DELACTION) {
-		return tcf_del_walker(idrinfo, skb, ops);
+		return tcf_del_walker(idrinfo, skb, ops, extack);
 	} else if (type == RTM_GETACTION) {
 		return tcf_dump_walker(idrinfo, skb, cb);
 	} else {


@@ -2149,10 +2149,13 @@ SYSCALL_DEFINE4(send, int, fd, void __user *, buff, size_t, len,
 int __sys_recvfrom(int fd, void __user *ubuf, size_t size, unsigned int flags,
 		   struct sockaddr __user *addr, int __user *addr_len)
 {
+	struct sockaddr_storage address;
+	struct msghdr msg = {
+		/* Save some cycles and don't copy the address if not needed */
+		.msg_name = addr ? (struct sockaddr *)&address : NULL,
+	};
 	struct socket *sock;
 	struct iovec iov;
-	struct msghdr msg;
-	struct sockaddr_storage address;
 	int err, err2;
 	int fput_needed;
@@ -2163,14 +2166,6 @@ int __sys_recvfrom(int fd, void __user *ubuf, size_t size, unsigned int flags,
 	if (!sock)
 		goto out;
 
-	msg.msg_control = NULL;
-	msg.msg_controllen = 0;
-	/* Save some cycles and don't copy the address if not needed */
-	msg.msg_name = addr ? (struct sockaddr *)&address : NULL;
-	/* We assume all kernel code knows the size of sockaddr_storage */
-	msg.msg_namelen = 0;
-	msg.msg_iocb = NULL;
-	msg.msg_flags = 0;
 	if (sock->file->f_flags & O_NONBLOCK)
 		flags |= MSG_DONTWAIT;
 	err = sock_recvmsg(sock, &msg, flags);
@@ -2375,6 +2370,7 @@ int __copy_msghdr_from_user(struct msghdr *kmsg,
 		return -EFAULT;
 
 	kmsg->msg_control_is_user = true;
+	kmsg->msg_get_inq = 0;
 	kmsg->msg_control_user = msg.msg_control;
 	kmsg->msg_controllen = msg.msg_controllen;
 	kmsg->msg_flags = msg.msg_flags;


@@ -472,8 +472,8 @@ struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id,
 				   bool preliminary)
 {
 	struct tipc_net *tn = net_generic(net, tipc_net_id);
+	struct tipc_link *l, *snd_l = tipc_bc_sndlink(net);
 	struct tipc_node *n, *temp_node;
-	struct tipc_link *l;
 	unsigned long intv;
 	int bearer_id;
 	int i;
@@ -488,6 +488,16 @@ struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id,
 			goto exit;
 		/* A preliminary node becomes "real" now, refresh its data */
 		tipc_node_write_lock(n);
+		if (!tipc_link_bc_create(net, tipc_own_addr(net), addr, peer_id, U16_MAX,
+					 tipc_link_min_win(snd_l), tipc_link_max_win(snd_l),
+					 n->capabilities, &n->bc_entry.inputq1,
+					 &n->bc_entry.namedq, snd_l, &n->bc_entry.link)) {
+			pr_warn("Broadcast rcv link refresh failed, no memory\n");
+			tipc_node_write_unlock_fast(n);
+			tipc_node_put(n);
+			n = NULL;
+			goto exit;
+		}
 		n->preliminary = false;
 		n->addr = addr;
 		hlist_del_rcu(&n->hash);
@@ -567,7 +577,16 @@ update:
 	n->signature = INVALID_NODE_SIG;
 	n->active_links[0] = INVALID_BEARER_ID;
 	n->active_links[1] = INVALID_BEARER_ID;
-	n->bc_entry.link = NULL;
+	if (!preliminary &&
+	    !tipc_link_bc_create(net, tipc_own_addr(net), addr, peer_id, U16_MAX,
+				 tipc_link_min_win(snd_l), tipc_link_max_win(snd_l),
+				 n->capabilities, &n->bc_entry.inputq1,
+				 &n->bc_entry.namedq, snd_l, &n->bc_entry.link)) {
+		pr_warn("Broadcast rcv link creation failed, no memory\n");
+		kfree(n);
+		n = NULL;
+		goto exit;
+	}
 	tipc_node_get(n);
 	timer_setup(&n->timer, tipc_node_timeout, 0);
 	/* Start a slow timer anyway, crypto needs it */
@@ -1155,7 +1174,7 @@ void tipc_node_check_dest(struct net *net, u32 addr,
 			  bool *respond, bool *dupl_addr)
 {
 	struct tipc_node *n;
-	struct tipc_link *l, *snd_l;
+	struct tipc_link *l;
 	struct tipc_link_entry *le;
 	bool addr_match = false;
 	bool sign_match = false;
@@ -1175,22 +1194,6 @@ void tipc_node_check_dest(struct net *net, u32 addr,
 		return;
 
 	tipc_node_write_lock(n);
-	if (unlikely(!n->bc_entry.link)) {
-		snd_l = tipc_bc_sndlink(net);
-		if (!tipc_link_bc_create(net, tipc_own_addr(net),
-					 addr, peer_id, U16_MAX,
-					 tipc_link_min_win(snd_l),
-					 tipc_link_max_win(snd_l),
-					 n->capabilities,
-					 &n->bc_entry.inputq1,
-					 &n->bc_entry.namedq, snd_l,
-					 &n->bc_entry.link)) {
-			pr_warn("Broadcast rcv link creation failed, no mem\n");
-			tipc_node_write_unlock_fast(n);
-			tipc_node_put(n);
-			return;
-		}
-	}
 
 	le = &n->links[b->identity];


@@ -502,6 +502,7 @@ static int tipc_sk_create(struct net *net, struct socket *sock,
 	sock_init_data(sock, sk);
 	tipc_set_sk_state(sk, TIPC_OPEN);
 	if (tipc_sk_insert(tsk)) {
+		sk_free(sk);
 		pr_warn("Socket create failed; port number exhausted\n");
 		return -EINVAL;
 	}


@@ -4,6 +4,7 @@
  * Tests for sockmap/sockhash holding kTLS sockets.
  */
 
+#include <netinet/tcp.h>
 #include "test_progs.h"
 
 #define MAX_TEST_NAME 80
@@ -92,9 +93,78 @@ close_srv:
 	close(srv);
 }
 
+static void test_sockmap_ktls_update_fails_when_sock_has_ulp(int family, int map)
+{
+	struct sockaddr_storage addr = {};
+	socklen_t len = sizeof(addr);
+	struct sockaddr_in6 *v6;
+	struct sockaddr_in *v4;
+	int err, s, zero = 0;
+
+	switch (family) {
+	case AF_INET:
+		v4 = (struct sockaddr_in *)&addr;
+		v4->sin_family = AF_INET;
+		break;
+	case AF_INET6:
+		v6 = (struct sockaddr_in6 *)&addr;
+		v6->sin6_family = AF_INET6;
+		break;
+	default:
+		PRINT_FAIL("unsupported socket family %d", family);
+		return;
+	}
+
+	s = socket(family, SOCK_STREAM, 0);
+	if (!ASSERT_GE(s, 0, "socket"))
+		return;
+
+	err = bind(s, (struct sockaddr *)&addr, len);
+	if (!ASSERT_OK(err, "bind"))
+		goto close;
+
+	err = getsockname(s, (struct sockaddr *)&addr, &len);
+	if (!ASSERT_OK(err, "getsockname"))
+		goto close;
+
+	err = connect(s, (struct sockaddr *)&addr, len);
+	if (!ASSERT_OK(err, "connect"))
+		goto close;
+
+	/* save sk->sk_prot and set it to tls_prots */
+	err = setsockopt(s, IPPROTO_TCP, TCP_ULP, "tls", strlen("tls"));
+	if (!ASSERT_OK(err, "setsockopt(TCP_ULP)"))
+		goto close;
+
+	/* sockmap update should not affect saved sk_prot */
+	err = bpf_map_update_elem(map, &zero, &s, BPF_ANY);
+	if (!ASSERT_ERR(err, "sockmap update elem"))
+		goto close;
+
+	/* call sk->sk_prot->setsockopt to dispatch to saved sk_prot */
+	err = setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &zero, sizeof(zero));
+	ASSERT_OK(err, "setsockopt(TCP_NODELAY)");
+
+close:
+	close(s);
+}
+
+static const char *fmt_test_name(const char *subtest_name, int family,
+				 enum bpf_map_type map_type)
+{
+	const char *map_type_str = map_type == BPF_MAP_TYPE_SOCKMAP ? "SOCKMAP" : "SOCKHASH";
+	const char *family_str = family == AF_INET ? "IPv4" : "IPv6";
+	static char test_name[MAX_TEST_NAME];
+
+	snprintf(test_name, MAX_TEST_NAME,
+		 "sockmap_ktls %s %s %s",
+		 subtest_name, family_str, map_type_str);
+
+	return test_name;
+}
+
 static void run_tests(int family, enum bpf_map_type map_type)
 {
-	char test_name[MAX_TEST_NAME];
 	int map;
 
 	map = bpf_map_create(map_type, NULL, sizeof(int), sizeof(int), 1, NULL);
@@ -103,14 +173,10 @@ static void run_tests(int family, enum bpf_map_type map_type)
 		return;
 	}
 
-	snprintf(test_name, MAX_TEST_NAME,
-		 "sockmap_ktls disconnect_after_delete %s %s",
-		 family == AF_INET ? "IPv4" : "IPv6",
-		 map_type == BPF_MAP_TYPE_SOCKMAP ? "SOCKMAP" : "SOCKHASH");
-	if (!test__start_subtest(test_name))
-		return;
-	test_sockmap_ktls_disconnect_after_delete(family, map);
+	if (test__start_subtest(fmt_test_name("disconnect_after_delete", family, map_type)))
+		test_sockmap_ktls_disconnect_after_delete(family, map);
+	if (test__start_subtest(fmt_test_name("update_fails_when_sock_has_ulp", family, map_type)))
+		test_sockmap_ktls_update_fails_when_sock_has_ulp(family, map);
 
 	close(map);
 }


@@ -54,7 +54,7 @@ TEST_GEN_FILES += ipsec
 TEST_GEN_FILES += ioam6_parser
 TEST_GEN_FILES += gro
 TEST_GEN_PROGS = reuseport_bpf reuseport_bpf_cpu reuseport_bpf_numa
-TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls
+TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls tun
 TEST_GEN_FILES += toeplitz
 TEST_GEN_FILES += cmsg_sender
 TEST_GEN_FILES += stress_reuseport_listen


@@ -2,7 +2,7 @@
 CLANG ?= clang
 CCINCLUDE += -I../../bpf
-CCINCLUDE += -I../../../lib
+CCINCLUDE += -I../../../../lib
 CCINCLUDE += -I../../../../../usr/include/
 
 TEST_CUSTOM_PROGS = $(OUTPUT)/bpf/nat6to4.o


@@ -61,6 +61,39 @@ chk_msk_nr()
 	__chk_nr "grep -c token:" $*
 }
 
+wait_msk_nr()
+{
+	local condition="grep -c token:"
+	local expected=$1
+	local timeout=20
+	local msg nr
+	local max=0
+	local i=0
+
+	shift 1
+	msg=$*
+
+	while [ $i -lt $timeout ]; do
+		nr=$(ss -inmHMN $ns | $condition)
+		[ $nr == $expected ] && break;
+		[ $nr -gt $max ] && max=$nr
+		i=$((i + 1))
+		sleep 1
+	done
+
+	printf "%-50s" "$msg"
+	if [ $i -ge $timeout ]; then
+		echo "[ fail ] timeout while expecting $expected max $max last $nr"
+		ret=$test_cnt
+	elif [ $nr != $expected ]; then
+		echo "[ fail ] expected $expected found $nr"
+		ret=$test_cnt
+	else
+		echo "[ ok ]"
+	fi
+	test_cnt=$((test_cnt+1))
+}
+
 chk_msk_fallback_nr()
 {
 	__chk_nr "grep -c fallback" $*
@@ -146,7 +179,7 @@ ip -n $ns link set dev lo up
 echo "a" | \
 	timeout ${timeout_test} \
 		ip netns exec $ns \
-			./mptcp_connect -p 10000 -l -t ${timeout_poll} \
+			./mptcp_connect -p 10000 -l -t ${timeout_poll} -w 20 \
 				0.0.0.0 >/dev/null &
 wait_local_port_listen $ns 10000
 chk_msk_nr 0 "no msk on netns creation"
@@ -155,7 +188,7 @@ chk_msk_listen 10000
 echo "b" | \
 	timeout ${timeout_test} \
 		ip netns exec $ns \
-			./mptcp_connect -p 10000 -r 0 -t ${timeout_poll} \
+			./mptcp_connect -p 10000 -r 0 -t ${timeout_poll} -w 20 \
 				127.0.0.1 >/dev/null &
 wait_connected $ns 10000
 chk_msk_nr 2 "after MPC handshake "
@@ -167,13 +200,13 @@ flush_pids
 echo "a" | \
 	timeout ${timeout_test} \
 		ip netns exec $ns \
-			./mptcp_connect -p 10001 -l -s TCP -t ${timeout_poll} \
+			./mptcp_connect -p 10001 -l -s TCP -t ${timeout_poll} -w 20 \
 				0.0.0.0 >/dev/null &
 wait_local_port_listen $ns 10001
 echo "b" | \
 	timeout ${timeout_test} \
 		ip netns exec $ns \
-			./mptcp_connect -p 10001 -r 0 -t ${timeout_poll} \
+			./mptcp_connect -p 10001 -r 0 -t ${timeout_poll} -w 20 \
 				127.0.0.1 >/dev/null &
 wait_connected $ns 10001
 chk_msk_fallback_nr 1 "check fallback"
@@ -184,7 +217,7 @@ for I in `seq 1 $NR_CLIENTS`; do
 	echo "a" | \
 		timeout ${timeout_test} \
 			ip netns exec $ns \
-				./mptcp_connect -p $((I+10001)) -l -w 10 \
+				./mptcp_connect -p $((I+10001)) -l -w 20 \
 					-t ${timeout_poll} 0.0.0.0 >/dev/null &
 done
 wait_local_port_listen $ns $((NR_CLIENTS + 10001))
@@ -193,12 +226,11 @@ for I in `seq 1 $NR_CLIENTS`; do
 	echo "b" | \
 		timeout ${timeout_test} \
 			ip netns exec $ns \
-				./mptcp_connect -p $((I+10001)) -w 10 \
+				./mptcp_connect -p $((I+10001)) -w 20 \
 					-t ${timeout_poll} 127.0.0.1 >/dev/null &
 done
 
-sleep 1.5
-chk_msk_nr $((NR_CLIENTS*2)) "many msk socket present"
+wait_msk_nr $((NR_CLIENTS*2)) "many msk socket present"
 flush_pids
 
 exit $ret


@@ -265,7 +265,7 @@ static void sock_test_tcpulp(int sock, int proto, unsigned int line)
 static int sock_listen_mptcp(const char * const listenaddr,
			     const char * const port)
 {
-	int sock;
+	int sock = -1;
 	struct addrinfo hints = {
 		.ai_protocol = IPPROTO_TCP,
 		.ai_socktype = SOCK_STREAM,


@@ -88,7 +88,7 @@ static void xgetaddrinfo(const char *node, const char *service,
 static int sock_listen_mptcp(const char * const listenaddr,
			     const char * const port)
 {
-	int sock;
+	int sock = -1;
 	struct addrinfo hints = {
 		.ai_protocol = IPPROTO_TCP,
 		.ai_socktype = SOCK_STREAM,


@@ -136,7 +136,7 @@ static void xgetaddrinfo(const char *node, const char *service,
 static int sock_listen_mptcp(const char * const listenaddr,
			     const char * const port)
 {
-	int sock;
+	int sock = -1;
 	struct addrinfo hints = {
 		.ai_protocol = IPPROTO_TCP,
 		.ai_socktype = SOCK_STREAM,


@@ -0,0 +1,162 @@
// SPDX-License-Identifier: GPL-2.0
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include "../kselftest_harness.h"
static int tun_attach(int fd, char *dev)
{
struct ifreq ifr;
memset(&ifr, 0, sizeof(ifr));
strcpy(ifr.ifr_name, dev);
ifr.ifr_flags = IFF_ATTACH_QUEUE;
return ioctl(fd, TUNSETQUEUE, (void *) &ifr);
}
static int tun_detach(int fd, char *dev)
{
struct ifreq ifr;
memset(&ifr, 0, sizeof(ifr));
strcpy(ifr.ifr_name, dev);
ifr.ifr_flags = IFF_DETACH_QUEUE;
return ioctl(fd, TUNSETQUEUE, (void *) &ifr);
}
static int tun_alloc(char *dev)
{
struct ifreq ifr;
int fd, err;
fd = open("/dev/net/tun", O_RDWR);
if (fd < 0) {
fprintf(stderr, "can't open tun: %s\n", strerror(errno));
return fd;
}
memset(&ifr, 0, sizeof(ifr));
strcpy(ifr.ifr_name, dev);
ifr.ifr_flags = IFF_TAP | IFF_NAPI | IFF_MULTI_QUEUE;
err = ioctl(fd, TUNSETIFF, (void *) &ifr);
if (err < 0) {
fprintf(stderr, "can't TUNSETIFF: %s\n", strerror(errno));
close(fd);
return err;
}
strcpy(dev, ifr.ifr_name);
return fd;
}
static int tun_delete(char *dev)
{
struct {
struct nlmsghdr nh;
struct ifinfomsg ifm;
unsigned char data[64];
} req;
struct rtattr *rta;
int ret, rtnl;
rtnl = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_ROUTE);
if (rtnl < 0) {
fprintf(stderr, "can't open rtnl: %s\n", strerror(errno));
return 1;
}
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_ALIGN(NLMSG_LENGTH(sizeof(req.ifm)));
req.nh.nlmsg_flags = NLM_F_REQUEST;
req.nh.nlmsg_type = RTM_DELLINK;
req.ifm.ifi_family = AF_UNSPEC;
rta = (struct rtattr *)(((char *)&req) + NLMSG_ALIGN(req.nh.nlmsg_len));
rta->rta_type = IFLA_IFNAME;
rta->rta_len = RTA_LENGTH(IFNAMSIZ);
req.nh.nlmsg_len += rta->rta_len;
memcpy(RTA_DATA(rta), dev, IFNAMSIZ);
ret = send(rtnl, &req, req.nh.nlmsg_len, 0);
if (ret < 0)
fprintf(stderr, "can't send: %s\n", strerror(errno));
ret = (unsigned int)ret != req.nh.nlmsg_len;
close(rtnl);
return ret;
}
FIXTURE(tun)
{
char ifname[IFNAMSIZ];
int fd, fd2;
};
FIXTURE_SETUP(tun)
{
memset(self->ifname, 0, sizeof(self->ifname));
self->fd = tun_alloc(self->ifname);
ASSERT_GE(self->fd, 0);
self->fd2 = tun_alloc(self->ifname);
ASSERT_GE(self->fd2, 0);
}
FIXTURE_TEARDOWN(tun)
{
if (self->fd >= 0)
close(self->fd);
if (self->fd2 >= 0)
close(self->fd2);
}
TEST_F(tun, delete_detach_close) {
EXPECT_EQ(tun_delete(self->ifname), 0);
EXPECT_EQ(tun_detach(self->fd, self->ifname), -1);
EXPECT_EQ(errno, 22);
}
TEST_F(tun, detach_delete_close) {
EXPECT_EQ(tun_detach(self->fd, self->ifname), 0);
EXPECT_EQ(tun_delete(self->ifname), 0);
}
TEST_F(tun, detach_close_delete) {
EXPECT_EQ(tun_detach(self->fd, self->ifname), 0);
close(self->fd);
self->fd = -1;
EXPECT_EQ(tun_delete(self->ifname), 0);
}
TEST_F(tun, reattach_delete_close) {
EXPECT_EQ(tun_detach(self->fd, self->ifname), 0);
EXPECT_EQ(tun_attach(self->fd, self->ifname), 0);
EXPECT_EQ(tun_delete(self->ifname), 0);
}
TEST_F(tun, reattach_close_delete) {
EXPECT_EQ(tun_detach(self->fd, self->ifname), 0);
EXPECT_EQ(tun_attach(self->fd, self->ifname), 0);
close(self->fd);
self->fd = -1;
EXPECT_EQ(tun_delete(self->ifname), 0);
}
TEST_HARNESS_MAIN


@@ -120,7 +120,7 @@ run_all() {
 	run_udp "${ipv4_args}"
 
 	echo "ipv6"
-	run_tcp "${ipv4_args}"
+	run_tcp "${ipv6_args}"
 	run_udp "${ipv6_args}"
 }


@@ -609,5 +609,82 @@
        "teardown": [
            "$TC actions flush action gact"
        ]
},
{
"id": "7f52",
"name": "Try to flush action which is referenced by filter",
"category": [
"actions",
"gact"
],
"plugins": {
"requires": "nsPlugin"
},
"setup": [
[
"$TC actions flush action gact",
0,
1,
255
],
"$TC qdisc add dev $DEV1 ingress",
"$TC actions add action pass index 1",
"$TC filter add dev $DEV1 protocol all ingress prio 1 handle 0x1234 matchall action gact index 1"
],
"cmdUnderTest": "$TC actions flush action gact",
"expExitCode": "1",
"verifyCmd": "$TC actions ls action gact",
"matchPattern": "total acts 1.*action order [0-9]*: gact action pass.*index 1 ref 2 bind 1",
"matchCount": "1",
"teardown": [
"$TC qdisc del dev $DEV1 ingress",
[
"sleep 1; $TC actions flush action gact",
0,
1
]
]
},
{
"id": "ae1e",
"name": "Try to flush actions when last one is referenced by filter",
"category": [
"actions",
"gact"
],
"plugins": {
"requires": "nsPlugin"
},
"setup": [
[
"$TC actions flush action gact",
0,
1,
255
],
"$TC qdisc add dev $DEV1 ingress",
[
"$TC actions add action pass index 1",
0,
1,
255
],
"$TC actions add action reclassify index 2",
"$TC actions add action drop index 3",
"$TC filter add dev $DEV1 protocol all ingress prio 1 handle 0x1234 matchall action gact index 3"
],
"cmdUnderTest": "$TC actions flush action gact",
"expExitCode": "0",
"verifyCmd": "$TC actions ls action gact",
"matchPattern": "total acts 1.*action order [0-9]*: gact action drop.*index 3 ref 2 bind 1",
"matchCount": "1",
"teardown": [
"$TC qdisc del dev $DEV1 ingress",
[
"sleep 1; $TC actions flush action gact",
0,
1
]
]
    }
]