edf9bc396e
On RZ/G3S SMARC Carrier II board having RGMII connections b/w Ethernet
MACs and PHYs it has been discovered that doing unbind/bind for ravb
driver in a loop leads to wrong speed and duplex for Ethernet links and
broken connectivity (the connectivity cannot be restored even with
bringing interface down/up). Before doing unbind/bind the Ethernet
interfaces were configured though systemd. The sh instructions used to
do unbind/bind were:
$ cd /sys/bus/platform/drivers/ravb/
$ while :; do echo 11c30000.ethernet > unbind ; \
echo 11c30000.ethernet > bind; done
It has been discovered that there is a race b/w IOCTLs initialized by
systemd at the response of success binding and the
"ravb_write(ndev, CCC_OPC_RESET, CCC)" call in ravb_remove() as
follows:
1/ as a result of bind success the user space open/configures the
interfaces tough an IOCTL; the following stack trace has been
identified on RZ/G3S:
Call trace:
dump_backtrace+0x9c/0x100
show_stack+0x20/0x38
dump_stack_lvl+0x48/0x60
dump_stack+0x18/0x28
ravb_open+0x70/0xa58
__dev_open+0xf4/0x1e8
__dev_change_flags+0x198/0x218
dev_change_flags+0x2c/0x80
devinet_ioctl+0x640/0x708
inet_ioctl+0x1e4/0x200
sock_do_ioctl+0x50/0x108
sock_ioctl+0x240/0x358
__arm64_sys_ioctl+0xb0/0x100
invoke_syscall+0x50/0x128
el0_svc_common.constprop.0+0xc8/0xf0
do_el0_svc+0x24/0x38
el0_svc+0x34/0xb8
el0t_64_sync_handler+0xc0/0xc8
el0t_64_sync+0x190/0x198
2/ this call may execute concurrently with ravb_remove() as the
unbind/bind operation was executed in a loop
3/ if the operation mode is changed to RESET (through
ravb_write(ndev, CCC_OPC_RESET, CCC) call in ravb_remove())
while the above ravb_open() is in progress it may lead to MAC
(or PHY, or MAC-PHY connection, the right point hasn't been identified
at the moment) to be broken, thus the Ethernet connectivity fails to
restore.
The simple fix for this is to move ravb_write(ndev, CCC_OPC_RESET, CCC))
after unregister_netdev() to avoid resetting the controller while the
netdev interface is still registered.
To avoid future issues in ravb_remove(), the patch follows the proper order
of operations in ravb_remove(): reverse order compared with ravb_probe().
This avoids described races as the IOCTLs as well as unregister_netdev()
(called now at the beginning of ravb_remove()) calls rtnl_lock() before
continuing and IOCTLs check (though devinet_ioctl()) if device is still
registered just after taking the lock:
int devinet_ioctl(struct net *net, unsigned int cmd, struct ifreq *ifr)
{
// ...
rtnl_lock();
ret = -ENODEV;
dev = __dev_get_by_name(net, ifr->ifr_name);
if (!dev)
goto done;
// ...
done:
rtnl_unlock();
out:
return ret;
}
Fixes:
|
||
---|---|---|
.. | ||
arcnet | ||
bonding | ||
caif | ||
can | ||
dsa | ||
ethernet | ||
fddi | ||
fjes | ||
hamradio | ||
hippi | ||
hyperv | ||
ieee802154 | ||
ipa | ||
ipvlan | ||
mctp | ||
mdio | ||
netdevsim | ||
pcs | ||
phy | ||
plip | ||
ppp | ||
pse-pd | ||
slip | ||
team | ||
thunderbolt | ||
usb | ||
vmxnet3 | ||
vxlan | ||
wan | ||
wireguard | ||
wireless | ||
wwan | ||
xen-netback | ||
amt.c | ||
bareudp.c | ||
dummy.c | ||
eql.c | ||
geneve.c | ||
gtp.c | ||
ifb.c | ||
Kconfig | ||
LICENSE.SRC | ||
loopback.c | ||
macsec.c | ||
macvlan.c | ||
macvtap.c | ||
Makefile | ||
mdio.c | ||
mhi_net.c | ||
mii.c | ||
net_failover.c | ||
netconsole.c | ||
netkit.c | ||
nlmon.c | ||
ntb_netdev.c | ||
rionet.c | ||
sb1000.c | ||
Space.c | ||
sungem_phy.c | ||
tap.c | ||
tun.c | ||
veth.c | ||
virtio_net.c | ||
vrf.c | ||
vsockmon.c | ||
xen-netfront.c |