// SPDX-License-Identifier: GPL-2.0-or-later
/*
 *	Device handling code
 *	Linux ethernet bridge
 *
 *	Authors:
 *	Lennert Buytenhek		<buytenh@gnu.org>
 */

#include <linux/kernel.h>
#include <linux/netdevice.h>
#include <linux/netpoll.h>
#include <linux/etherdevice.h>
#include <linux/ethtool.h>
#include <linux/list.h>
#include <linux/netfilter_bridge.h>

#include <linux/uaccess.h>
#include "br_private.h"

#define COMMON_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HIGHDMA | \
			 NETIF_F_GSO_MASK | NETIF_F_HW_CSUM)

const struct nf_br_ops __rcu *nf_br_ops __read_mostly;
EXPORT_SYMBOL_GPL(nf_br_ops);
/* net device transmit always called with BH disabled */
netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct net_bridge_mcast_port *pmctx_null = NULL;
	struct net_bridge *br = netdev_priv(dev);
	struct net_bridge_mcast *brmctx = &br->multicast_ctx;
	struct net_bridge_fdb_entry *dst;
	struct net_bridge_mdb_entry *mdst;
	const struct nf_br_ops *nf_ops;
	u8 state = BR_STATE_FORWARDING;
	struct net_bridge_vlan *vlan;
	const unsigned char *dest;
	u16 vid = 0;

	memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
	br_tc_skb_miss_set(skb, false);

	rcu_read_lock();
	nf_ops = rcu_dereference(nf_br_ops);
	if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
		rcu_read_unlock();
		return NETDEV_TX_OK;
	}

	dev_sw_netstats_tx_add(dev, 1, skb->len);

	br_switchdev_frame_unmark(skb);
	BR_INPUT_SKB_CB(skb)->brdev = dev;
	BR_INPUT_SKB_CB(skb)->frag_max_size = 0;

	skb_reset_mac_header(skb);
	skb_pull(skb, ETH_HLEN);

	if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid,
				&state, &vlan))
		goto out;

	if (IS_ENABLED(CONFIG_INET) &&
	    (eth_hdr(skb)->h_proto == htons(ETH_P_ARP) ||
	     eth_hdr(skb)->h_proto == htons(ETH_P_RARP)) &&
	    br_opt_get(br, BROPT_NEIGH_SUPPRESS_ENABLED)) {
		br_do_proxy_suppress_arp(skb, br, vid, NULL);
	} else if (IS_ENABLED(CONFIG_IPV6) &&
		   skb->protocol == htons(ETH_P_IPV6) &&
		   br_opt_get(br, BROPT_NEIGH_SUPPRESS_ENABLED) &&
		   pskb_may_pull(skb, sizeof(struct ipv6hdr) +
				 sizeof(struct nd_msg)) &&
		   ipv6_hdr(skb)->nexthdr == IPPROTO_ICMPV6) {
		struct nd_msg *msg, _msg;

		msg = br_is_nd_neigh_msg(skb, &_msg);
		if (msg)
			br_do_suppress_nd(skb, br, vid, NULL, msg);
	}

	dest = eth_hdr(skb)->h_dest;
	if (is_broadcast_ether_addr(dest)) {
		br_flood(br, skb, BR_PKT_BROADCAST, false, true, vid);
	} else if (is_multicast_ether_addr(dest)) {
		if (unlikely(netpoll_tx_running(dev))) {
			br_flood(br, skb, BR_PKT_MULTICAST, false, true, vid);
			goto out;
		}
		if (br_multicast_rcv(&brmctx, &pmctx_null, vlan, skb, vid)) {
			kfree_skb(skb);
			goto out;
		}

		mdst = br_mdb_get(brmctx, skb, vid);
		if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
		    br_multicast_querier_exists(brmctx, eth_hdr(skb), mdst))
			br_multicast_flood(mdst, skb, brmctx, false, true);
		else
			br_flood(br, skb, BR_PKT_MULTICAST, false, true, vid);
	} else if ((dst = br_fdb_find_rcu(br, dest, vid)) != NULL) {
		br_forward(dst->dst, skb, false, true);
	} else {
		br_flood(br, skb, BR_PKT_UNICAST, false, true, vid);
	}

out:
	rcu_read_unlock();
	return NETDEV_TX_OK;
}
static struct lock_class_key bridge_netdev_addr_lock_key;

static void br_set_lockdep_class(struct net_device *dev)
{
	lockdep_set_class(&dev->addr_list_lock, &bridge_netdev_addr_lock_key);
}

static int br_dev_init(struct net_device *dev)
{
	struct net_bridge *br = netdev_priv(dev);
	int err;

	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
	if (!dev->tstats)
		return -ENOMEM;

	err = br_fdb_hash_init(br);
	if (err) {
		free_percpu(dev->tstats);
		return err;
	}

	err = br_mdb_hash_init(br);
	if (err) {
		free_percpu(dev->tstats);
		br_fdb_hash_fini(br);
		return err;
	}

	err = br_vlan_init(br);
	if (err) {
		free_percpu(dev->tstats);
		br_mdb_hash_fini(br);
		br_fdb_hash_fini(br);
		return err;
	}

	err = br_multicast_init_stats(br);
	if (err) {
		free_percpu(dev->tstats);
		br_vlan_flush(br);
		br_mdb_hash_fini(br);
		br_fdb_hash_fini(br);
	}

	br_set_lockdep_class(dev);

	return err;
}

static void br_dev_uninit(struct net_device *dev)
{
	struct net_bridge *br = netdev_priv(dev);

	br_multicast_dev_del(br);
	br_multicast_uninit_stats(br);
	br_vlan_flush(br);
	br_mdb_hash_fini(br);
	br_fdb_hash_fini(br);
	free_percpu(dev->tstats);
}
static int br_dev_open(struct net_device *dev)
{
	struct net_bridge *br = netdev_priv(dev);

	netdev_update_features(dev);
	netif_start_queue(dev);
	br_stp_enable_bridge(br);
	br_multicast_open(br);

	if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
		br_multicast_join_snoopers(br);

	return 0;
}

static void br_dev_set_multicast_list(struct net_device *dev)
{
}

static void br_dev_change_rx_flags(struct net_device *dev, int change)
{
	if (change & IFF_PROMISC)
		br_manage_promisc(netdev_priv(dev));
}

static int br_dev_stop(struct net_device *dev)
{
	struct net_bridge *br = netdev_priv(dev);

	br_stp_disable_bridge(br);
	br_multicast_stop(br);

	if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
		br_multicast_leave_snoopers(br);

	netif_stop_queue(dev);

	return 0;
}

static int br_change_mtu(struct net_device *dev, int new_mtu)
{
	struct net_bridge *br = netdev_priv(dev);

	dev->mtu = new_mtu;

	/* this flag will be cleared if the MTU was automatically adjusted */
	br_opt_toggle(br, BROPT_MTU_SET_BY_USER, true);
#if IS_ENABLED(CONFIG_BRIDGE_NETFILTER)
	/* remember the MTU in the rtable for PMTU */
	dst_metric_set(&br->fake_rtable.dst, RTAX_MTU, new_mtu);
#endif

	return 0;
}
/* Allow setting mac address to any valid ethernet address. */
static int br_set_mac_address(struct net_device *dev, void *p)
{
	struct net_bridge *br = netdev_priv(dev);
	struct sockaddr *addr = p;

	if (!is_valid_ether_addr(addr->sa_data))
		return -EADDRNOTAVAIL;

	/* dev_set_mac_addr() can be called by a master device on bridge's
	 * NETDEV_UNREGISTER, but since it's being destroyed do nothing
	 */
	if (dev->reg_state != NETREG_REGISTERED)
		return -EBUSY;

	spin_lock_bh(&br->lock);
	if (!ether_addr_equal(dev->dev_addr, addr->sa_data)) {
		/* Mac address will be changed in br_stp_change_bridge_id(). */
		br_stp_change_bridge_id(br, addr->sa_data);
	}
	spin_unlock_bh(&br->lock);

	return 0;
}

static void br_getinfo(struct net_device *dev, struct ethtool_drvinfo *info)
{
	strscpy(info->driver, "bridge", sizeof(info->driver));
	strscpy(info->version, BR_VERSION, sizeof(info->version));
	strscpy(info->fw_version, "N/A", sizeof(info->fw_version));
	strscpy(info->bus_info, "N/A", sizeof(info->bus_info));
}
static int br_get_link_ksettings(struct net_device *dev,
				 struct ethtool_link_ksettings *cmd)
{
	struct net_bridge *br = netdev_priv(dev);
	struct net_bridge_port *p;

	cmd->base.duplex = DUPLEX_UNKNOWN;
	cmd->base.port = PORT_OTHER;
	cmd->base.speed = SPEED_UNKNOWN;

	list_for_each_entry(p, &br->port_list, list) {
		struct ethtool_link_ksettings ecmd;
		struct net_device *pdev = p->dev;

		if (!netif_running(pdev) || !netif_oper_up(pdev))
			continue;

		if (__ethtool_get_link_ksettings(pdev, &ecmd))
			continue;

		if (ecmd.base.speed == (__u32)SPEED_UNKNOWN)
			continue;

		if (cmd->base.speed == (__u32)SPEED_UNKNOWN ||
		    cmd->base.speed < ecmd.base.speed)
			cmd->base.speed = ecmd.base.speed;
	}

	return 0;
}

static netdev_features_t br_fix_features(struct net_device *dev,
					 netdev_features_t features)
{
	struct net_bridge *br = netdev_priv(dev);

	return br_features_recompute(br, features);
}
#ifdef CONFIG_NET_POLL_CONTROLLER
static void br_poll_controller(struct net_device *br_dev)
{
}

static void br_netpoll_cleanup(struct net_device *dev)
{
	struct net_bridge *br = netdev_priv(dev);
	struct net_bridge_port *p;

	list_for_each_entry(p, &br->port_list, list)
		br_netpoll_disable(p);
}

static int __br_netpoll_enable(struct net_bridge_port *p)
{
	struct netpoll *np;
	int err;

	np = kzalloc(sizeof(*p->np), GFP_KERNEL);
	if (!np)
		return -ENOMEM;

	err = __netpoll_setup(np, p->dev);
	if (err) {
		kfree(np);
		return err;
	}

	p->np = np;
	return err;
}

int br_netpoll_enable(struct net_bridge_port *p)
{
	if (!p->br->dev->npinfo)
		return 0;

	return __br_netpoll_enable(p);
}

static int br_netpoll_setup(struct net_device *dev, struct netpoll_info *ni)
{
	struct net_bridge *br = netdev_priv(dev);
	struct net_bridge_port *p;
	int err = 0;

	list_for_each_entry(p, &br->port_list, list) {
		if (!p->dev)
			continue;
		err = __br_netpoll_enable(p);
		if (err)
			goto fail;
	}

out:
	return err;

fail:
	br_netpoll_cleanup(dev);
	goto out;
}

void br_netpoll_disable(struct net_bridge_port *p)
{
	struct netpoll *np = p->np;

	if (!np)
		return;

	p->np = NULL;

	__netpoll_free(np);
}

#endif
static int br_add_slave(struct net_device *dev, struct net_device *slave_dev,
			struct netlink_ext_ack *extack)
{
	struct net_bridge *br = netdev_priv(dev);

	return br_add_if(br, slave_dev, extack);
}

static int br_del_slave(struct net_device *dev, struct net_device *slave_dev)
{
	struct net_bridge *br = netdev_priv(dev);

	return br_del_if(br, slave_dev);
}

static int br_fill_forward_path(struct net_device_path_ctx *ctx,
				struct net_device_path *path)
{
	struct net_bridge_fdb_entry *f;
	struct net_bridge_port *dst;
	struct net_bridge *br;

	if (netif_is_bridge_port(ctx->dev))
		return -1;

	br = netdev_priv(ctx->dev);

	br_vlan_fill_forward_path_pvid(br, ctx, path);

	f = br_fdb_find_rcu(br, ctx->daddr, path->bridge.vlan_id);
	if (!f || !f->dst)
		return -1;

	dst = READ_ONCE(f->dst);
	if (!dst)
		return -1;

	if (br_vlan_fill_forward_path_mode(br, dst, path))
		return -1;

	path->type = DEV_PATH_BRIDGE;
	path->dev = dst->br->dev;
	ctx->dev = dst->dev;

	switch (path->bridge.vlan_mode) {
	case DEV_PATH_BR_VLAN_TAG:
		if (ctx->num_vlans >= ARRAY_SIZE(ctx->vlan))
			return -ENOSPC;
		ctx->vlan[ctx->num_vlans].id = path->bridge.vlan_id;
		ctx->vlan[ctx->num_vlans].proto = path->bridge.vlan_proto;
		ctx->num_vlans++;
		break;
	case DEV_PATH_BR_VLAN_UNTAG_HW:
	case DEV_PATH_BR_VLAN_UNTAG:
		ctx->num_vlans--;
		break;
	case DEV_PATH_BR_VLAN_KEEP:
		break;
	}

	return 0;
}
static const struct ethtool_ops br_ethtool_ops = {
	.get_drvinfo		 = br_getinfo,
	.get_link		 = ethtool_op_get_link,
	.get_link_ksettings	 = br_get_link_ksettings,
};

static const struct net_device_ops br_netdev_ops = {
	.ndo_open		 = br_dev_open,
	.ndo_stop		 = br_dev_stop,
	.ndo_init		 = br_dev_init,
	.ndo_uninit		 = br_dev_uninit,
	.ndo_start_xmit		 = br_dev_xmit,
	.ndo_get_stats64	 = dev_get_tstats64,
	.ndo_set_mac_address	 = br_set_mac_address,
	.ndo_set_rx_mode	 = br_dev_set_multicast_list,
	.ndo_change_rx_flags	 = br_dev_change_rx_flags,
	.ndo_change_mtu		 = br_change_mtu,
	.ndo_siocdevprivate	 = br_dev_siocdevprivate,
#ifdef CONFIG_NET_POLL_CONTROLLER
	.ndo_netpoll_setup	 = br_netpoll_setup,
	.ndo_netpoll_cleanup	 = br_netpoll_cleanup,
	.ndo_poll_controller	 = br_poll_controller,
#endif
	.ndo_add_slave		 = br_add_slave,
	.ndo_del_slave		 = br_del_slave,
	.ndo_fix_features	 = br_fix_features,
	.ndo_fdb_add		 = br_fdb_add,
	.ndo_fdb_del		 = br_fdb_delete,
	.ndo_fdb_del_bulk	 = br_fdb_delete_bulk,
	.ndo_fdb_dump		 = br_fdb_dump,
	.ndo_fdb_get		 = br_fdb_get,
	.ndo_mdb_add		 = br_mdb_add,
	.ndo_mdb_del		 = br_mdb_del,
	.ndo_mdb_dump		 = br_mdb_dump,
	.ndo_bridge_getlink	 = br_getlink,
	.ndo_bridge_setlink	 = br_setlink,
	.ndo_bridge_dellink	 = br_dellink,
	.ndo_features_check	 = passthru_features_check,
	.ndo_fill_forward_path	 = br_fill_forward_path,
};

static struct device_type br_type = {
	.name	= "bridge",
};
void br_dev_setup(struct net_device *dev)
{
	struct net_bridge *br = netdev_priv(dev);

	eth_hw_addr_random(dev);
	ether_setup(dev);

	dev->netdev_ops = &br_netdev_ops;
	dev->needs_free_netdev = true;
	dev->ethtool_ops = &br_ethtool_ops;
	SET_NETDEV_DEVTYPE(dev, &br_type);
	dev->priv_flags = IFF_EBRIDGE | IFF_NO_QUEUE;

	dev->features = COMMON_FEATURES | NETIF_F_LLTX | NETIF_F_NETNS_LOCAL |
			NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
	dev->hw_features = COMMON_FEATURES | NETIF_F_HW_VLAN_CTAG_TX |
			   NETIF_F_HW_VLAN_STAG_TX;
	dev->vlan_features = COMMON_FEATURES;

	br->dev = dev;
	spin_lock_init(&br->lock);
	INIT_LIST_HEAD(&br->port_list);
	INIT_HLIST_HEAD(&br->fdb_list);
	INIT_HLIST_HEAD(&br->frame_type_list);
#if IS_ENABLED(CONFIG_BRIDGE_MRP)
	INIT_HLIST_HEAD(&br->mrp_list);
#endif
#if IS_ENABLED(CONFIG_BRIDGE_CFM)
	INIT_HLIST_HEAD(&br->mep_list);
#endif
	spin_lock_init(&br->hash_lock);

	br->bridge_id.prio[0] = 0x80;
	br->bridge_id.prio[1] = 0x00;

	ether_addr_copy(br->group_addr, eth_stp_addr);

	br->stp_enabled = BR_NO_STP;
	br->group_fwd_mask = BR_GROUPFWD_DEFAULT;
	br->group_fwd_mask_required = BR_GROUPFWD_DEFAULT;

	br->designated_root = br->bridge_id;
	br->bridge_max_age = br->max_age = 20 * HZ;
	br->bridge_hello_time = br->hello_time = 2 * HZ;
	br->bridge_forward_delay = br->forward_delay = 15 * HZ;
	br->bridge_ageing_time = br->ageing_time = BR_DEFAULT_AGEING_TIME;
	dev->max_mtu = ETH_MAX_MTU;

	br_netfilter_rtable_init(br);
	br_stp_timer_init(br);
	br_multicast_init(br);
	INIT_DELAYED_WORK(&br->gc_work, br_fdb_cleanup);
}