// SPDX-License-Identifier: GPL-2.0-or-later
/*
 *	IPv6 Address [auto]configuration
 *	Linux INET6 implementation
 *
 *	Authors:
 *	Pedro Roque		<roque@di.fc.ul.pt>
 *	Alexey Kuznetsov	<kuznet@ms2.inr.ac.ru>
 */

/*
 *	Changes:
 *
 *	Janos Farkas			:	delete timer on ifdown
 *					<chexum@bankinf.banki.hu>
 *	Andi Kleen			:	kill double kfree on module
 *						unload.
 *	Maciej W. Rozycki		:	FDDI support
 *	sekiya@USAGI			:	Don't send too many RS
 *						packets.
 *	yoshfuji@USAGI			:	Fixed interval between DAD
 *						packets.
 *	YOSHIFUJI Hideaki @USAGI	:	improved accuracy of
 *						address validation timer.
 *	YOSHIFUJI Hideaki @USAGI	:	Privacy Extensions (RFC3041)
 *						support.
 *	Yuji SEKIYA @USAGI		:	Don't assign a same IPv6
 *						address on a same interface.
 *	YOSHIFUJI Hideaki @USAGI	:	ARCnet support
 *	YOSHIFUJI Hideaki @USAGI	:	convert /proc/net/if_inet6 to
 *						seq_file.
 *	YOSHIFUJI Hideaki @USAGI	:	improved source address
 *						selection; consider scope,
 *						status etc.
 */

#define pr_fmt(fmt) "IPv6: " fmt

#include <linux/errno.h>
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/sched/signal.h>
#include <linux/socket.h>
#include <linux/sockios.h>
#include <linux/net.h>
#include <linux/inet.h>
#include <linux/in6.h>
#include <linux/netdevice.h>
#include <linux/if_addr.h>
#include <linux/if_arp.h>
#include <linux/if_arcnet.h>
#include <linux/if_infiniband.h>
#include <linux/route.h>
#include <linux/inetdevice.h>
#include <linux/init.h>
#include <linux/slab.h>
#ifdef CONFIG_SYSCTL
#include <linux/sysctl.h>
#endif
#include <linux/capability.h>
#include <linux/delay.h>
#include <linux/notifier.h>
#include <linux/string.h>
#include <linux/hash.h>

#include <net/net_namespace.h>
#include <net/sock.h>
#include <net/snmp.h>
#include <net/6lowpan.h>
#include <net/firewire.h>
#include <net/ipv6.h>
#include <net/protocol.h>
#include <net/ndisc.h>
#include <net/ip6_route.h>
#include <net/addrconf.h>
#include <net/tcp.h>
#include <net/ip.h>
#include <net/netlink.h>
#include <net/pkt_sched.h>
#include <net/l3mdev.h>
#include <linux/if_tunnel.h>
#include <linux/rtnetlink.h>
#include <linux/netconf.h>
#include <linux/random.h>
#include <linux/uaccess.h>
#include <asm/unaligned.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/export.h>
#include <linux/ioam6.h>

#define	INFINITY_LIFE_TIME	0xFFFFFFFF

#define IPV6_MAX_STRLEN \
	sizeof("ffff:ffff:ffff:ffff:ffff:ffff:255.255.255.255")

/* Convert a jiffies creation stamp to hundredths of a second since boot. */
static inline u32 cstamp_delta(unsigned long cstamp)
{
	return (cstamp - INITIAL_JIFFIES) * 100UL / HZ;
}

static inline s32 rfc3315_s14_backoff_init(s32 irt)
{
	/* multiply 'initial retransmission time' by 0.9 .. 1.1 */
	u64 tmp = get_random_u32_inclusive(900000, 1100000) * (u64)irt;
	do_div(tmp, 1000000);
	return (s32)tmp;
}

static inline s32 rfc3315_s14_backoff_update(s32 rt, s32 mrt)
{
	/* multiply 'retransmission timeout' by 1.9 .. 2.1 */
	u64 tmp = get_random_u32_inclusive(1900000, 2100000) * (u64)rt;
	do_div(tmp, 1000000);
	if ((s32)tmp > mrt) {
		/* multiply 'maximum retransmission time' by 0.9 .. 1.1 */
		tmp = get_random_u32_inclusive(900000, 1100000) * (u64)mrt;
		do_div(tmp, 1000000);
	}
	return (s32)tmp;
}

#ifdef CONFIG_SYSCTL
static int addrconf_sysctl_register(struct inet6_dev *idev);
static void addrconf_sysctl_unregister(struct inet6_dev *idev);
#else
static inline int addrconf_sysctl_register(struct inet6_dev *idev)
{
	return 0;
}

static inline void addrconf_sysctl_unregister(struct inet6_dev *idev)
{
}
#endif

static void ipv6_gen_rnd_iid(struct in6_addr *addr);

static int ipv6_generate_eui64(u8 *eui, struct net_device *dev);
static int ipv6_count_addresses(const struct inet6_dev *idev);
static int ipv6_generate_stable_address(struct in6_addr *addr,
					u8 dad_count,
					const struct inet6_dev *idev);

#define IN6_ADDR_HSIZE_SHIFT	8
#define IN6_ADDR_HSIZE		(1 << IN6_ADDR_HSIZE_SHIFT)

static void addrconf_verify(struct net *net);
static void addrconf_verify_rtnl(struct net *net);

static struct workqueue_struct *addrconf_wq;

static void addrconf_join_anycast(struct inet6_ifaddr *ifp);
static void addrconf_leave_anycast(struct inet6_ifaddr *ifp);

static void addrconf_type_change(struct net_device *dev,
				 unsigned long event);
static int addrconf_ifdown(struct net_device *dev, bool unregister);

static struct fib6_info *addrconf_get_prefix_route(const struct in6_addr *pfx,
						  int plen,
						  const struct net_device *dev,
						  u32 flags, u32 noflags,
						  bool no_gw);

static void addrconf_dad_start(struct inet6_ifaddr *ifp);
static void addrconf_dad_work(struct work_struct *w);
static void addrconf_dad_completed(struct inet6_ifaddr *ifp, bool bump_id,
				   bool send_na);
static void addrconf_dad_run(struct inet6_dev *idev, bool restart);
static void addrconf_rs_timer(struct timer_list *t);
static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifa);
static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifa);

static void inet6_prefix_notify(int event, struct inet6_dev *idev,
				struct prefix_info *pinfo);

static struct ipv6_devconf ipv6_devconf __read_mostly = {
	.forwarding		= 0,
	.hop_limit		= IPV6_DEFAULT_HOPLIMIT,
	.mtu6			= IPV6_MIN_MTU,
	.accept_ra		= 1,
	.accept_redirects	= 1,
	.autoconf		= 1,
	.force_mld_version	= 0,
	.mldv1_unsolicited_report_interval = 10 * HZ,
	.mldv2_unsolicited_report_interval = HZ,
	.dad_transmits		= 1,
	.rtr_solicits		= MAX_RTR_SOLICITATIONS,
	.rtr_solicit_interval	= RTR_SOLICITATION_INTERVAL,
	.rtr_solicit_max_interval = RTR_SOLICITATION_MAX_INTERVAL,
	.rtr_solicit_delay	= MAX_RTR_SOLICITATION_DELAY,
	.use_tempaddr		= 0,
	.temp_valid_lft		= TEMP_VALID_LIFETIME,
	.temp_prefered_lft	= TEMP_PREFERRED_LIFETIME,
	.regen_max_retry	= REGEN_MAX_RETRY,
	.max_desync_factor	= MAX_DESYNC_FACTOR,
	.max_addresses		= IPV6_MAX_ADDRESSES,
	.accept_ra_defrtr	= 1,
	.ra_defrtr_metric	= IP6_RT_PRIO_USER,
	.accept_ra_from_local	= 0,
	.accept_ra_min_hop_limit = 1,
	.accept_ra_min_lft	= 0,
	.accept_ra_pinfo	= 1,
#ifdef CONFIG_IPV6_ROUTER_PREF
	.accept_ra_rtr_pref	= 1,
	.rtr_probe_interval	= 60 * HZ,
#ifdef CONFIG_IPV6_ROUTE_INFO
	.accept_ra_rt_info_min_plen = 0,
	.accept_ra_rt_info_max_plen = 0,
#endif
#endif
	.proxy_ndp		= 0,
	.accept_source_route	= 0,	/* we do not accept RH0 by default. */
	.disable_ipv6		= 0,
	.accept_dad		= 0,
	.suppress_frag_ndisc	= 1,
	.accept_ra_mtu		= 1,
	.stable_secret		= {
		.initialized = false,
	},
	.use_oif_addrs_only	= 0,
	.ignore_routes_with_linkdown = 0,
net: ipv6: Make address flushing on ifdown optional
Currently, all ipv6 addresses are flushed when the interface is configured
down, including global, static addresses:
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
<< nothing; all addresses have been flushed>>
Add a new sysctl to make this behavior optional. The new setting defaults to
flushing all addresses to maintain backwards compatibility. When it is set,
global addresses with no expire times are not flushed on an admin down. The
sysctl is per-interface or system-wide for all interfaces:
$ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1
or
$ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1
Will keep addresses on eth1 on an admin down.
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000
inet6 2100:1::2/120 scope global tentative
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative
valid_lft forever preferred_lft forever
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 09:25:37 -08:00
. keep_addr_on_down = 0 ,
2016-11-08 14:57:39 +01:00
. seg6_enabled = 0 ,
2016-11-08 14:57:42 +01:00
# ifdef CONFIG_IPV6_SEG6_HMAC
. seg6_require_hmac = 0 ,
# endif
2016-12-02 14:00:08 -08:00
. enhanced_dad = 1 ,
2017-01-26 16:59:17 +13:00
. addr_gen_mode = IN6_ADDR_GEN_MODE_EUI64 ,
2017-02-23 16:27:18 +00:00
. disable_policy = 0 ,
2020-03-27 18:00:20 -04:00
. rpl_seg_enabled = 0 ,
ipv6: ioam: Data plane support for Pre-allocated Trace
Implement support for processing the IOAM Pre-allocated Trace with IPv6,
see [1] and [2]. Introduce a new IPv6 Hop-by-Hop TLV option, see IANA [3].
A new per-interface sysctl is introduced. The value is a boolean to accept (=1)
or ignore (=0, by default) IPv6 IOAM options on ingress for an interface:
- net.ipv6.conf.XXX.ioam6_enabled
Two other sysctls are introduced to define IOAM IDs, represented by an integer.
They are respectively per-namespace and per-interface:
- net.ipv6.ioam6_id
- net.ipv6.conf.XXX.ioam6_id
The value of the first one represents the IOAM ID of the node itself (u32; max
and default value = U32_MAX>>8, due to hop limit concatenation) while the other
represents the IOAM ID of an interface (u16; max and default value = U16_MAX).
Each "ioam6_id" sysctl has a "_wide" equivalent:
- net.ipv6.ioam6_id_wide
- net.ipv6.conf.XXX.ioam6_id_wide
The value of the first one represents the wide IOAM ID of the node itself (u64;
max and default value = U64_MAX>>8, due to hop limit concatenation) while the
other represents the wide IOAM ID of an interface (u32; max and default value
= U32_MAX).
The use of short and wide equivalents is not exclusive; a deployment could
choose to leverage both. For example, net.ipv6.conf.XXX.ioam6_id (short format)
could be an identifier for a physical interface, whereas
net.ipv6.conf.XXX.ioam6_id_wide (wide format) could be an identifier for a
logical sub-interface. Documentation about new sysctls is provided at the end
of this patchset.
Two relativistic hash tables are used: one for IOAM namespaces, the other for
IOAM schemas. A namespace can only have a single active schema and a schema
can only be attached to a single namespace (1:1 relationship).
[1] https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options
[2] https://tools.ietf.org/html/draft-ietf-ippm-ioam-data
[3] https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml#ipv6-parameters-2
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
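The ID limits quoted in this message can be illustrated with a small userspace sketch. It assumes, based only on the "hop limit concatenation" remark, that the node ID shares a 32-bit (or 64-bit) field with an 8-bit hop limit placed in the top byte. All SKETCH_* and sketch_* names are hypothetical and are not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values follow the limits quoted in the message. */
#define SKETCH_NODE_ID_MAX      (UINT32_MAX >> 8)  /* 24 usable bits */
#define SKETCH_NODE_ID_WIDE_MAX (UINT64_MAX >> 8)  /* 56 usable bits */
#define SKETCH_IF_ID_MAX        UINT16_MAX
#define SKETCH_IF_ID_WIDE_MAX   UINT32_MAX

/* Assumed layout: hop limit in the top 8 bits, node ID in the low 24 bits. */
uint32_t sketch_pack_hop_node(uint8_t hop_limit, uint32_t node_id)
{
	assert(node_id <= SKETCH_NODE_ID_MAX);
	return ((uint32_t)hop_limit << 24) | node_id;
}

/* Wide variant: hop limit in the top 8 bits, node ID in the low 56 bits. */
uint64_t sketch_pack_hop_node_wide(uint8_t hop_limit, uint64_t node_id)
{
	assert(node_id <= SKETCH_NODE_ID_WIDE_MAX);
	return ((uint64_t)hop_limit << 56) | node_id;
}
```

Losing eight bits to the hop limit is why the node ID maxima are shifted right by 8, while the interface IDs can use their full width.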
2021-07-20 21:42:57 +02:00
. ioam6_enabled = 0 ,
. ioam6_id = IOAM6_DEFAULT_IF_ID ,
. ioam6_id_wide = IOAM6_DEFAULT_IF_ID_WIDE ,
2021-11-01 10:36:29 -07:00
. ndisc_evict_nocarrier = 1 ,
2023-09-25 14:47:11 -07:00
. ra_honor_pio_life = 0 ,
2005-04-16 15:20:36 -07:00
} ;
2006-09-22 14:15:41 -07:00
static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
2005-04-16 15:20:36 -07:00
. forwarding = 0 ,
. hop_limit = IPV6_DEFAULT_HOPLIMIT ,
. mtu6 = IPV6_MIN_MTU ,
. accept_ra = 1 ,
. accept_redirects = 1 ,
. autoconf = 1 ,
2013-08-14 01:03:46 +02:00
. force_mld_version = 0 ,
. mldv1_unsolicited_report_interval = 10 * HZ ,
. mldv2_unsolicited_report_interval = HZ ,
2005-04-16 15:20:36 -07:00
. dad_transmits = 1 ,
. rtr_solicits = MAX_RTR_SOLICITATIONS ,
. rtr_solicit_interval = RTR_SOLICITATION_INTERVAL ,
2016-09-27 23:57:58 -07:00
. rtr_solicit_max_interval = RTR_SOLICITATION_MAX_INTERVAL ,
2005-04-16 15:20:36 -07:00
. rtr_solicit_delay = MAX_RTR_SOLICITATION_DELAY ,
. use_tempaddr = 0 ,
. temp_valid_lft = TEMP_VALID_LIFETIME ,
. temp_prefered_lft = TEMP_PREFERRED_LIFETIME ,
. regen_max_retry = REGEN_MAX_RETRY ,
. max_desync_factor = MAX_DESYNC_FACTOR ,
. max_addresses = IPV6_MAX_ADDRESSES ,
2006-03-20 16:55:08 -08:00
. accept_ra_defrtr = 1 ,
net: allow user to set metric on default route learned via Router Advertisement
For IPv4, the default route is learned via DHCPv4 and the user is allowed to
change its metric using the config file /etc/network/interfaces. For IPv6, the
default route can be learned via RA, for which a fixed metric value of 1024 is
currently used.
Ideally, the user should be able to configure the metric on the default route
for IPv6 as for IPv4. This patch adds a sysctl for that.
Logs:
For IPv4:
Config in /etc/network/interfaces:
auto eth0
iface eth0 inet dhcp
metric 4261413864
IPv4 Kernel Route Table:
$ ip route list
default via 172.21.47.1 dev eth0 metric 4261413864
FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over DHCPv4 default route.]
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
> - selected route, * - FIB route
S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03
K 0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m
i.e. User can prefer Default Router learned via Routing Protocol in IPv4.
Similar behavior is not possible for IPv6, without this fix.
After fix [for IPv6]:
$ sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489705
IP monitor: [When IPv6 RA is received]
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 pref high
Kernel IPv6 routing table
$ ip -6 route list
default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 21sec hoplimit 64 pref high
FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over IPv6 RA default route.]
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
> - selected route, * - FIB route
S>* ::/0 [20/0] is directly connected, eth0, 00:00:06
K ::/0 [119/1001] via fe80::xx16:xxxx:feb3:ce8e, eth0, 6d07h43m
If the metric is changed later, the effect is seen only when the next IPv6
RA is received, because the default route must be fully controlled by RA msg.
Below, the metric is changed from 1996489705 to 1996489704.
$ sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489704
net.ipv6.conf.eth0.ra_defrtr_metric = 1996489704
IP monitor:
[On next IPv6 RA msg, Kernel deletes prev route and installs new route with updated metric]
Deleted default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 3sec hoplimit 64 pref high
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489704 pref high
Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210125214430.24079-1-pchaudhary@linkedin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 13:44:30 -08:00
. ra_defrtr_metric = IP6_RT_PRIO_USER ,
2014-06-25 14:44:53 -07:00
. accept_ra_from_local = 0 ,
2015-07-30 14:28:42 +08:00
. accept_ra_min_hop_limit = 1 ,
2023-07-26 16:07:01 -07:00
. accept_ra_min_lft = 0 ,
2006-03-20 16:55:26 -08:00
. accept_ra_pinfo = 1 ,
2006-03-20 17:05:30 -08:00
# ifdef CONFIG_IPV6_ROUTER_PREF
. accept_ra_rtr_pref = 1 ,
2006-03-20 17:05:47 -08:00
. rtr_probe_interval = 60 * HZ ,
2006-03-20 17:07:03 -08:00
# ifdef CONFIG_IPV6_ROUTE_INFO
2017-03-22 18:19:04 +09:00
. accept_ra_rt_info_min_plen = 0 ,
2006-03-20 17:07:03 -08:00
. accept_ra_rt_info_max_plen = 0 ,
# endif
2006-03-20 17:05:30 -08:00
# endif
2006-09-22 14:43:49 -07:00
. proxy_ndp = 0 ,
2007-04-24 14:58:30 -07:00
. accept_source_route = 0 , /* we do not accept RH0 by default. */
2008-06-28 14:17:11 +09:00
. disable_ipv6 = 0 ,
2008-06-28 14:18:38 +09:00
. accept_dad = 1 ,
2013-08-27 01:36:51 +02:00
. suppress_frag_ndisc = 1 ,
2015-01-20 10:06:05 -07:00
. accept_ra_mtu = 1 ,
2015-03-23 23:36:00 +01:00
. stable_secret = {
. initialized = false ,
} ,
2015-07-22 16:38:25 +09:00
. use_oif_addrs_only = 0 ,
2015-08-13 10:39:01 -04:00
. ignore_routes_with_linkdown = 0 ,
net: ipv6: Make address flushing on ifdown optional
Currently, all ipv6 addresses are flushed when the interface is configured
down, including global, static addresses:
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
<< nothing; all addresses have been flushed>>
Add a new sysctl to make this behavior optional. The new setting defaults to
flushing all addresses to maintain backwards compatibility. When it is set,
global addresses with no expire times are not flushed on an admin down. The
sysctl is per-interface or system-wide for all interfaces:
$ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1
or
$ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1
Will keep addresses on eth1 on an admin down.
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000
inet6 2100:1::2/120 scope global tentative
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative
valid_lft forever preferred_lft forever
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 09:25:37 -08:00
. keep_addr_on_down = 0 ,
2016-11-08 14:57:39 +01:00
. seg6_enabled = 0 ,
2016-11-08 14:57:42 +01:00
# ifdef CONFIG_IPV6_SEG6_HMAC
. seg6_require_hmac = 0 ,
# endif
2016-12-02 14:00:08 -08:00
. enhanced_dad = 1 ,
2017-01-26 16:59:17 +13:00
. addr_gen_mode = IN6_ADDR_GEN_MODE_EUI64 ,
2017-02-23 16:27:18 +00:00
. disable_policy = 0 ,
2020-03-27 18:00:20 -04:00
. rpl_seg_enabled = 0 ,
ipv6: ioam: Data plane support for Pre-allocated Trace
Implement support for processing the IOAM Pre-allocated Trace with IPv6,
see [1] and [2]. Introduce a new IPv6 Hop-by-Hop TLV option, see IANA [3].
A new per-interface sysctl is introduced. The value is a boolean to accept (=1)
or ignore (=0, by default) IPv6 IOAM options on ingress for an interface:
- net.ipv6.conf.XXX.ioam6_enabled
Two other sysctls are introduced to define IOAM IDs, represented by an integer.
They are respectively per-namespace and per-interface:
- net.ipv6.ioam6_id
- net.ipv6.conf.XXX.ioam6_id
The value of the first one represents the IOAM ID of the node itself (u32; max
and default value = U32_MAX>>8, due to hop limit concatenation) while the other
represents the IOAM ID of an interface (u16; max and default value = U16_MAX).
Each "ioam6_id" sysctl has a "_wide" equivalent:
- net.ipv6.ioam6_id_wide
- net.ipv6.conf.XXX.ioam6_id_wide
The value of the first one represents the wide IOAM ID of the node itself (u64;
max and default value = U64_MAX>>8, due to hop limit concatenation) while the
other represents the wide IOAM ID of an interface (u32; max and default value
= U32_MAX).
The use of short and wide equivalents is not exclusive; a deployment could
choose to leverage both. For example, net.ipv6.conf.XXX.ioam6_id (short format)
could be an identifier for a physical interface, whereas
net.ipv6.conf.XXX.ioam6_id_wide (wide format) could be an identifier for a
logical sub-interface. Documentation about new sysctls is provided at the end
of this patchset.
Two relativistic hash tables are used: one for IOAM namespaces, the other for
IOAM schemas. A namespace can only have a single active schema and a schema
can only be attached to a single namespace (1:1 relationship).
[1] https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options
[2] https://tools.ietf.org/html/draft-ietf-ippm-ioam-data
[3] https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml#ipv6-parameters-2
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 21:42:57 +02:00
. ioam6_enabled = 0 ,
. ioam6_id = IOAM6_DEFAULT_IF_ID ,
. ioam6_id_wide = IOAM6_DEFAULT_IF_ID_WIDE ,
2021-11-01 10:36:29 -07:00
. ndisc_evict_nocarrier = 1 ,
2023-09-25 14:47:11 -07:00
. ra_honor_pio_life = 0 ,
2005-04-16 15:20:36 -07:00
} ;
2017-09-25 22:01:36 +01:00
/* Check if link is ready: is it up and is a valid qdisc available */
static inline bool addrconf_link_ready ( const struct net_device * dev )
2007-10-10 02:53:43 -07:00
{
2017-09-25 22:01:36 +01:00
return netif_oper_up ( dev ) & & ! qdisc_tx_is_noop ( dev ) ;
2007-10-10 02:53:43 -07:00
}
2013-06-23 18:39:01 +02:00
static void addrconf_del_rs_timer ( struct inet6_dev * idev )
2005-04-16 15:20:36 -07:00
{
2013-06-23 18:39:01 +02:00
if ( del_timer ( & idev - > rs_timer ) )
__in6_dev_put ( idev ) ;
}
2014-03-27 18:28:07 +01:00
static void addrconf_del_dad_work ( struct inet6_ifaddr * ifp )
2013-06-23 18:39:01 +02:00
{
2014-03-27 18:28:07 +01:00
if ( cancel_delayed_work ( & ifp - > dad_work ) )
2005-04-16 15:20:36 -07:00
__in6_ifa_put ( ifp ) ;
}
2013-06-23 18:39:01 +02:00
static void addrconf_mod_rs_timer ( struct inet6_dev * idev ,
unsigned long when )
{
2023-07-08 14:59:10 +08:00
if ( ! mod_timer ( & idev - > rs_timer , jiffies + when ) )
2013-06-23 18:39:01 +02:00
in6_dev_hold ( idev ) ;
}
2005-04-16 15:20:36 -07:00
2014-03-27 18:28:07 +01:00
static void addrconf_mod_dad_work ( struct inet6_ifaddr * ifp ,
unsigned long delay )
2005-04-16 15:20:36 -07:00
{
ipv6: fix calling in6_ifa_hold incorrectly for dad work
When addrconf_mod_dad_work starts the dad work and the work was idle and
gets queued, it needs to hold ifa.
The problem is the gap at [1]: if the pending dad work is removed elsewhere
during that gap, the hold on ifa is skipped even though the dad work was
idle and is now queued.
if (!delayed_work_pending(&ifp->dad_work))
in6_ifa_hold(ifp);
<--------------[1]
mod_delayed_work(addrconf_wq, &ifp->dad_work, delay);
A use-after-free issue can be caused by this.
Chen Wei found this issue when WARN_ON(!hlist_unhashed(&ifp->addr_lst)) in
inet6_ifa_finish_destroy was hit because of it.
Following Hannes' suggestion, this patch fixes it by holding ifa first in
addrconf_mod_dad_work, then calling mod_delayed_work and putting ifa if
the dad_work is already queued.
Note that this patch did not choose to fix it with:
if (!mod_delayed_work(delay))
in6_ifa_hold(ifp);
With that approach, when delay == 0 the dad_work would be scheduled immediately,
so all addrconf_mod_dad_work(0) calls would have to be moved under ifp->lock.
Reported-by: Wei Chen <weichen@redhat.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15 16:33:58 +08:00
in6_ifa_hold ( ifp ) ;
if ( mod_delayed_work ( addrconf_wq , & ifp - > dad_work , delay ) )
in6_ifa_put ( ifp ) ;
2005-04-16 15:20:36 -07:00
}
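The hold-first ordering used by addrconf_mod_dad_work can be modelled in userspace as below. This is a single-threaded sketch of the reference accounting only. It does not reproduce the race itself, and struct sketch_ifa and the sketch_* functions are hypothetical stand-ins for the kernel objects.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an address with one delayed work item. */
struct sketch_ifa {
	int refcnt;
	bool work_pending;
};

/*
 * Models the return convention of mod_delayed_work(): true if the work
 * was already pending, false if it was idle and has now been queued.
 */
bool sketch_mod_work(struct sketch_ifa *ifa)
{
	bool was_pending = ifa->work_pending;

	ifa->work_pending = true;
	return was_pending;
}

void sketch_mod_dad_work(struct sketch_ifa *ifa)
{
	assert(ifa->refcnt > 0);	/* caller already owns a reference */

	ifa->refcnt++;			/* hold first, so no window exists */
	if (sketch_mod_work(ifa))
		ifa->refcnt--;		/* queued work already owns its reference */
}
```

Taking the reference before touching the work item means the queued work always has a reference behind it, at the cost of one extra increment and decrement when the work was already pending.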
2007-04-24 21:54:09 -07:00
static int snmp6_alloc_dev ( struct inet6_dev * idev )
{
2013-10-07 15:51:58 -07:00
int i ;
2022-05-02 15:15:51 +03:00
idev - > stats . ipv6 = alloc_percpu_gfp ( struct ipstats_mib , GFP_KERNEL_ACCOUNT ) ;
2014-05-05 15:55:55 -07:00
if ( ! idev - > stats . ipv6 )
2007-04-24 21:54:09 -07:00
goto err_ip ;
2013-10-07 15:51:58 -07:00
for_each_possible_cpu ( i ) {
struct ipstats_mib * addrconf_stats ;
2014-05-05 15:55:55 -07:00
addrconf_stats = per_cpu_ptr ( idev - > stats . ipv6 , i ) ;
2013-10-07 15:51:58 -07:00
u64_stats_init ( & addrconf_stats - > syncp ) ;
}
2011-05-19 01:14:23 +00:00
idev - > stats . icmpv6dev = kzalloc ( sizeof ( struct icmpv6_mib_device ) ,
GFP_KERNEL ) ;
if ( ! idev - > stats . icmpv6dev )
2007-04-24 21:54:09 -07:00
goto err_icmp ;
2011-05-19 01:14:23 +00:00
idev - > stats . icmpv6msgdev = kzalloc ( sizeof ( struct icmpv6msg_mib_device ) ,
2022-05-02 15:15:51 +03:00
GFP_KERNEL_ACCOUNT ) ;
2011-05-19 01:14:23 +00:00
if ( ! idev - > stats . icmpv6msgdev )
2007-09-16 16:52:35 -07:00
goto err_icmpmsg ;
2007-04-24 21:54:09 -07:00
return 0 ;
2007-09-16 16:52:35 -07:00
err_icmpmsg :
2011-05-19 01:14:23 +00:00
kfree ( idev - > stats . icmpv6dev ) ;
2007-04-24 21:54:09 -07:00
err_icmp :
2014-05-05 15:55:55 -07:00
free_percpu ( idev - > stats . ipv6 ) ;
2007-04-24 21:54:09 -07:00
err_ip :
2007-10-17 21:25:32 -07:00
return - ENOMEM ;
2007-04-24 21:54:09 -07:00
}
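The label ladder in snmp6_alloc_dev, where each failure jumps to the label that frees only what was already allocated, can be sketched in plain userspace C. The struct, the sizes and the fail_at test hook below are hypothetical and exist only to make the pattern runnable outside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the three per-device statistics blocks. */
struct sketch_stats {
	void *ip;
	void *icmp;
	void *icmpmsg;
};

/* fail_at: 0 = no forced failure, 1..3 = make that allocation fail. */
static void *sketch_alloc(size_t size, int step, int fail_at)
{
	return step == fail_at ? NULL : calloc(1, size);
}

int sketch_alloc_stats(struct sketch_stats *s, int fail_at)
{
	assert(s != NULL);

	s->ip = sketch_alloc(64, 1, fail_at);
	if (!s->ip)
		goto err_ip;

	s->icmp = sketch_alloc(64, 2, fail_at);
	if (!s->icmp)
		goto err_icmp;

	s->icmpmsg = sketch_alloc(64, 3, fail_at);
	if (!s->icmpmsg)
		goto err_icmpmsg;

	return 0;

	/* Unwind in reverse order of allocation. */
err_icmpmsg:
	free(s->icmp);
	s->icmp = NULL;
err_icmp:
	free(s->ip);
	s->ip = NULL;
err_ip:
	return -ENOMEM;
}

void sketch_free_stats(struct sketch_stats *s)
{
	free(s->icmpmsg);
	free(s->icmp);
	free(s->ip);
	s->icmpmsg = s->icmp = s->ip = NULL;
}
```

Unlike the kernel function, the sketch also resets the freed pointers to NULL so a caller can inspect the structure after a failed call.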
2012-04-01 07:49:08 +00:00
static struct inet6_dev * ipv6_add_dev ( struct net_device * dev )
2005-04-16 15:20:36 -07:00
{
struct inet6_dev * ndev ;
2014-07-25 15:25:09 -07:00
int err = - ENOMEM ;
2005-04-16 15:20:36 -07:00
ASSERT_RTNL ( ) ;
2022-02-10 13:42:29 -08:00
if ( dev - > mtu < IPV6_MIN_MTU & & dev ! = blackhole_netdev )
2014-07-25 15:25:09 -07:00
return ERR_PTR ( - EINVAL ) ;
2005-04-16 15:20:36 -07:00
2022-05-02 15:15:51 +03:00
ndev = kzalloc ( sizeof ( * ndev ) , GFP_KERNEL_ACCOUNT ) ;
2015-03-29 14:00:04 +01:00
if ( ! ndev )
2014-07-25 15:25:09 -07:00
return ERR_PTR ( err ) ;
2006-03-20 23:01:47 -08:00
rwlock_init ( & ndev - > lock ) ;
ndev - > dev = dev ;
2010-03-17 20:31:13 +00:00
INIT_LIST_HEAD ( & ndev - > addr_list ) ;
treewide: setup_timer() -> timer_setup()
This converts all remaining cases of the old setup_timer() API into using
timer_setup(), where the callback argument is the structure already
holding the struct timer_list. These should have no behavioral changes,
since they just change which pointer is passed into the callback with
the same available pointers after conversion. It handles the following
examples, in addition to some other variations.
Casting from unsigned long:
void my_callback(unsigned long data)
{
struct something *ptr = (struct something *)data;
...
}
...
setup_timer(&ptr->my_timer, my_callback, ptr);
and forced object casts:
void my_callback(struct something *ptr)
{
...
}
...
setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);
become:
void my_callback(struct timer_list *t)
{
struct something *ptr = from_timer(ptr, t, my_timer);
...
}
...
timer_setup(&ptr->my_timer, my_callback, 0);
Direct function assignments:
void my_callback(unsigned long data)
{
struct something *ptr = (struct something *)data;
...
}
...
ptr->my_timer.function = my_callback;
have a temporary cast added, along with converting the args:
void my_callback(struct timer_list *t)
{
struct something *ptr = from_timer(ptr, t, my_timer);
...
}
...
ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;
And finally, callbacks without a data assignment:
void my_callback(unsigned long data)
{
...
}
...
setup_timer(&ptr->my_timer, my_callback, 0);
have their argument renamed to verify they're unused during conversion:
void my_callback(struct timer_list *unused)
{
...
}
...
timer_setup(&ptr->my_timer, my_callback, 0);
The conversion is done with the following Coccinelle script:
spatch --very-quiet --all-includes --include-headers \
-I ./arch/x86/include -I ./arch/x86/include/generated \
-I ./include -I ./arch/x86/include/uapi \
-I ./arch/x86/include/generated/uapi -I ./include/uapi \
-I ./include/generated/uapi --include ./include/linux/kconfig.h \
--dir . \
--cocci-file ~/src/data/timer_setup.cocci
@fix_address_of@
expression e;
@@
setup_timer(
-&(e)
+&e
, ...)
// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _timer;
type _cast_data;
@@
(
-setup_timer(&_E->_timer, NULL, _E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E->_timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, &_E);
+timer_setup(&_E._timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._timer, NULL, 0);
)
@change_timer_function_usage@
expression _E;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@
(
-setup_timer(&_E->_timer, _callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
_E->_timer@_stl.function = _callback;
|
_E->_timer@_stl.function = &_callback;
|
_E->_timer@_stl.function = (_cast_func)_callback;
|
_E->_timer@_stl.function = (_cast_func)&_callback;
|
_E._timer@_stl.function = _callback;
|
_E._timer@_stl.function = &_callback;
|
_E._timer@_stl.function = (_cast_func)_callback;
|
_E._timer@_stl.function = (_cast_func)&_callback;
)
// callback(unsigned long arg)
@change_callback_handle_cast
depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@
void _callback(
-_origtype _origarg
+struct timer_list *t
)
{
(
... when != _origarg
_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle;
... when != _handle
_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle;
... when != _handle
_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
)
}
// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
depends on change_timer_function_usage &&
!change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@
void _callback(
-_origtype _origarg
+struct timer_list *t
)
{
+ _handletype *_origarg = from_timer(_origarg, t, _timer);
+
... when != _origarg
- (_handletype *)_origarg
+ _origarg
... when != _origarg
}
// Avoid already converted callbacks.
@match_callback_converted
depends on change_timer_function_usage &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@
void _callback(struct timer_list *t)
{ ... }
// callback(struct something *handle)
@change_callback_handle_arg
depends on change_timer_function_usage &&
!match_callback_converted &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@
void _callback(
-_handletype *_handle
+struct timer_list *t
)
{
+ _handletype *_handle = from_timer(_handle, t, _timer);
...
}
// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
depends on change_timer_function_usage &&
change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@
void _callback(struct timer_list *t)
{
- _handletype *_handle = from_timer(_handle, t, _timer);
}
// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
depends on change_timer_function_usage &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg &&
!change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@
(
-timer_setup(&_E->_timer, _callback, 0);
+setup_timer(&_E->_timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._timer, _callback, 0);
+setup_timer(&_E._timer, _callback, (_cast_data)&_E);
)
// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
depends on change_timer_function_usage &&
(change_callback_handle_cast ||
change_callback_handle_cast_no_arg ||
change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@
(
_E->_timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
;
)
// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
depends on change_timer_function_usage &&
(change_callback_handle_cast ||
change_callback_handle_cast_no_arg ||
change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@
_callback(
(
-(_cast_data)_E
+&_E->_timer
|
-(_cast_data)&_E
+&_E._timer
|
-_E
+&_E->_timer
)
)
// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _timer;
identifier _callback;
@@
(
-setup_timer(&_E->_timer, _callback, 0);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0L);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0UL);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0L);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0UL);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0L);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0UL);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0L);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0UL);
+timer_setup(_timer, _callback, 0);
)
@change_callback_unused_data
depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@
void _callback(
-_origtype _origarg
+struct timer_list *unused
)
{
... when != _origarg
}
Signed-off-by: Kees Cook <keescook@chromium.org>
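The converted callback shape described above, where the callback receives the timer and recovers its enclosing structure, can be sketched in userspace with offsetof. The sketch_* types and macro below are hypothetical stand-ins. The kernel's from_timer() is built on container_of(), and this sketch only mimics that idea.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct timer_list. */
struct sketch_timer {
	void (*function)(struct sketch_timer *t);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical owner that embeds the timer, like inet6_dev embeds rs_timer. */
struct sketch_dev {
	int fired;
	struct sketch_timer rs_timer;
};

/* Stores only the callback; no separate data argument is needed. */
void sketch_timer_setup(struct sketch_timer *t,
			void (*fn)(struct sketch_timer *))
{
	assert(t != NULL);
	t->function = fn;
}

/* Callback in the converted style: it takes the timer, not a cast integer. */
void sketch_rs_timer(struct sketch_timer *t)
{
	struct sketch_dev *dev =
		sketch_container_of(t, struct sketch_dev, rs_timer);

	dev->fired++;
}
```

Because the owner is derived from the timer's address, the callback no longer depends on an unsigned long data field carrying a cast pointer.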
2017-10-16 14:43:17 -07:00
timer_setup ( & ndev - > rs_timer , addrconf_rs_timer , 0 ) ;
2008-03-25 21:47:49 +09:00
memcpy ( & ndev - > cnf , dev_net ( dev ) - > ipv6 . devconf_dflt , sizeof ( ndev - > cnf ) ) ;
2015-12-15 22:59:12 +01:00
if ( ndev - > cnf . stable_secret . initialized )
2017-01-26 16:59:17 +13:00
ndev - > cnf . addr_gen_mode = IN6_ADDR_GEN_MODE_STABLE_PRIVACY ;
2015-12-15 22:59:12 +01:00
2006-03-20 23:01:47 -08:00
ndev - > cnf . mtu6 = dev - > mtu ;
ipv6: add IFLA_INET6_RA_MTU to expose mtu value
The kernel provides a "/proc/sys/net/ipv6/conf/<iface>/mtu"
file, which can temporarily record the mtu value of the last
received RA message when the RA mtu value is lower than the
interface mtu, but this proc has following limitations:
(1) when the interface mtu (/sys/class/net/<iface>/mtu) is
updeated, mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) will
be updated to the value of interface mtu;
(2) mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) only affect
ipv6 connection, and not affect ipv4.
Therefore, when the mtu option is carried in the RA message,
there will be a problem that the user sometimes cannot obtain
RA mtu value correctly by reading mtu6.
After this patch set, if a RA message carries the mtu option,
you can send a netlink msg whose nlmsg_type is RTM_GETLINK,
and then parse the IFLA_INET6_RA_MTU attribute to
get the mtu value carried in the RA message received on the
inet6 device. In addition, you can also get a link notification
when ra_mtu is updated, so there is no need to poll.
In this way, if the MTU values that the device receives from
the network in the PCO IPv4 and the RA IPv6 procedures are
different, the user can obtain the correct ipv6 ra_mtu value
and compare the value of ra_mtu and ipv4 mtu, then the device
can use the lower MTU value for both IPv4 and IPv6.
Signed-off-by: Rocco Yue <rocco.yue@mediatek.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210827150412.9267-1-rocco.yue@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-27 23:04:12 +08:00
ndev - > ra_mtu = 0 ;
2006-03-20 23:01:47 -08:00
ndev - > nd_parms = neigh_parms_alloc ( dev , & nd_tbl ) ;
2015-03-29 14:00:04 +01:00
if ( ! ndev - > nd_parms ) {
2006-03-20 23:01:47 -08:00
kfree ( ndev ) ;
2014-07-25 15:25:09 -07:00
return ERR_PTR ( err ) ;
2006-03-20 23:01:47 -08:00
}
2008-06-19 16:15:47 -07:00
if ( ndev - > cnf . forwarding )
dev_disable_lro ( dev ) ;
2006-03-20 23:01:47 -08:00
/* We refer to the device */
2022-06-07 21:39:55 -07:00
netdev_hold ( dev , & ndev - > dev_tracker , GFP_KERNEL ) ;
2005-04-16 15:20:36 -07:00
2022-02-13 18:10:56 -08:00
if ( snmp6_alloc_dev ( ndev ) < 0 ) {
netdev_dbg ( dev , " %s: cannot allocate memory for statistics \n " ,
__func__ ) ;
neigh_parms_release ( & nd_tbl , ndev - > nd_parms ) ;
2022-06-07 21:39:55 -07:00
netdev_put ( dev , & ndev - > dev_tracker ) ;
2022-02-13 18:10:56 -08:00
kfree ( ndev ) ;
return ERR_PTR ( err ) ;
}
2005-04-16 15:20:36 -07:00
2022-02-13 18:10:56 -08:00
if ( dev ! = blackhole_netdev ) {
2022-02-10 13:42:29 -08:00
if ( snmp6_register_dev ( ndev ) < 0 ) {
netdev_dbg ( dev , " %s: cannot create /proc/net/dev_snmp6/%s \n " ,
__func__ , dev - > name ) ;
goto err_release ;
}
2006-03-20 23:01:47 -08:00
}
2016-10-13 18:50:02 +02:00
/* One reference from device. */
2017-07-04 09:34:55 +03:00
refcount_set ( & ndev - > refcnt , 1 ) ;
2005-04-16 15:20:36 -07:00
2008-06-28 14:18:38 +09:00
if ( dev - > flags & ( IFF_NOARP | IFF_LOOPBACK ) )
ndev - > cnf . accept_dad = - 1 ;
2012-10-29 16:23:10 +00:00
# if IS_ENABLED(CONFIG_IPV6_SIT)
2008-04-13 23:42:18 -07:00
if ( dev - > type = = ARPHRD_SIT & & ( dev - > priv_flags & IFF_ISATAP ) ) {
2012-05-15 14:11:53 +00:00
pr_info ( " %s: Disabled Multicast RS \n " , dev - > name ) ;
2008-04-13 23:42:18 -07:00
ndev - > cnf . rtr_solicits = 0 ;
}
# endif
2010-03-17 20:31:09 +00:00
INIT_LIST_HEAD ( & ndev - > tempaddr_list ) ;
2016-10-13 18:52:15 +02:00
ndev - > desync_factor = U32_MAX ;
2006-03-20 23:01:47 -08:00
if ( ( dev - > flags & IFF_LOOPBACK ) | |
dev - > type = = ARPHRD_TUNNEL | |
2008-04-13 23:47:11 -07:00
dev - > type = = ARPHRD_TUNNEL6 | |
2006-10-10 14:49:53 -07:00
dev - > type = = ARPHRD_SIT | |
dev - > type = = ARPHRD_NONE ) {
2006-03-20 23:01:47 -08:00
ndev - > cnf . use_tempaddr = - 1 ;
2020-05-01 00:51:47 -03:00
}
2013-10-28 20:07:50 -04:00
2013-04-09 03:47:14 +00:00
ndev - > token = in6addr_any ;
2005-04-16 15:20:36 -07:00
2017-09-25 22:01:36 +01:00
if ( netif_running ( dev ) & & addrconf_link_ready ( dev ) )
2007-03-27 14:31:52 -07:00
ndev - > if_flags | = IF_READY ;
2006-03-20 23:01:47 -08:00
ipv6_mc_init_dev ( ndev ) ;
ndev - > tstamp = jiffies ;
2022-02-10 13:42:29 -08:00
if ( dev ! = blackhole_netdev ) {
err = addrconf_sysctl_register ( ndev ) ;
if ( err ) {
ipv6_mc_destroy_dev ( ndev ) ;
snmp6_unregister_dev ( ndev ) ;
goto err_release ;
}
2014-07-25 15:25:09 -07:00
}
2007-01-04 12:31:14 -08:00
/* protected by rtnl_lock */
2012-01-12 04:41:32 +00:00
rcu_assign_pointer ( dev - > ip6_ptr , ndev ) ;
2007-01-14 21:48:40 -08:00
2022-02-10 13:42:29 -08:00
if ( dev ! = blackhole_netdev ) {
/* Join interface-local all-node multicast group */
ipv6_dev_mc_inc ( dev , & in6addr_interfacelocal_allnodes ) ;
2013-02-10 03:50:18 +00:00
2022-02-10 13:42:29 -08:00
/* Join all-node multicast group */
ipv6_dev_mc_inc ( dev , & in6addr_linklocal_allnodes ) ;
2012-03-05 14:45:17 +00:00
2022-02-10 13:42:29 -08:00
/* Join all-router multicast group if forwarding is set */
if ( ndev - > cnf . forwarding & & ( dev - > flags & IFF_MULTICAST ) )
ipv6_dev_mc_inc ( dev , & in6addr_linklocal_allrouters ) ;
}
2005-04-16 15:20:36 -07:00
return ndev ;
2014-07-25 15:25:09 -07:00
err_release :
neigh_parms_release ( & nd_tbl , ndev - > nd_parms ) ;
ndev - > dead = 1 ;
in6_dev_finish_destroy ( ndev ) ;
return ERR_PTR ( err ) ;
2005-04-16 15:20:36 -07:00
}
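The IFLA_INET6_RA_MTU commit message quoted inside ipv6_add_dev() says userspace can read ra_mtu from an RTM_GETLINK reply. What follows is a minimal userspace sketch of the attribute walk only, assuming the value is a 32-bit attribute nested as IFLA_AF_SPEC > AF_INET6 > IFLA_INET6_RA_MTU. The SKETCH_* constants and sketch_* helpers are hypothetical names; the numeric values were copied by hand from the uapi headers and should be checked against your own <linux/if_link.h>. Sending the request and receiving the reply are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values, mirroring the Linux uapi headers; verify locally. */
#define SKETCH_IFLA_AF_SPEC      26
#define SKETCH_AF_INET6          10
#define SKETCH_IFLA_INET6_RA_MTU 9
#define SKETCH_NLA_TYPE_MASK     0x3fffu /* strips NESTED / NET_BYTEORDER flags */
#define SKETCH_NLA_HDRLEN        4u      /* u16 length + u16 type */
#define SKETCH_NLA_ALIGN(len)    (((len) + 3u) & ~3u)

/* Walk a flat run of attributes and return the payload of the first one
 * whose type matches, storing the payload length in *plen.
 * Returns NULL if the attribute is absent or the buffer is malformed. */
const uint8_t *sketch_find_attr(const uint8_t *buf, size_t len,
                                uint16_t type, size_t *plen)
{
    while (len >= SKETCH_NLA_HDRLEN) {
        uint16_t nla_len, nla_type;

        memcpy(&nla_len, buf, sizeof(nla_len));
        memcpy(&nla_type, buf + 2, sizeof(nla_type));
        if (nla_len < SKETCH_NLA_HDRLEN || nla_len > len)
            return NULL; /* truncated or corrupt attribute */
        if ((nla_type & SKETCH_NLA_TYPE_MASK) == type) {
            *plen = nla_len - SKETCH_NLA_HDRLEN;
            return buf + SKETCH_NLA_HDRLEN;
        }
        size_t step = SKETCH_NLA_ALIGN(nla_len);
        if (step > len)
            break;
        buf += step;
        len -= step;
    }
    return NULL;
}

/* attrs points at the attributes that follow struct ifinfomsg.
 * Returns 0 and fills *mtu on success, -1 if ra_mtu is not present. */
int sketch_get_ra_mtu(const uint8_t *attrs, size_t len, uint32_t *mtu)
{
    size_t n;
    const uint8_t *p;

    p = sketch_find_attr(attrs, len, SKETCH_IFLA_AF_SPEC, &n);
    if (!p)
        return -1;
    p = sketch_find_attr(p, n, SKETCH_AF_INET6, &n);
    if (!p)
        return -1;
    p = sketch_find_attr(p, n, SKETCH_IFLA_INET6_RA_MTU, &n);
    if (!p || n != sizeof(*mtu))
        return -1;
    memcpy(mtu, p, sizeof(*mtu)); /* host byte order */
    return 0;
}
```

A caller would then typically compare the returned value with the IPv4 MTU and apply the lower of the two, as the commit message describes.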
2012-04-01 07:49:08 +00:00
static struct inet6_dev * ipv6_find_idev ( struct net_device * dev )
2005-04-16 15:20:36 -07:00
{
struct inet6_dev * idev ;
ASSERT_RTNL ( ) ;
2010-03-20 16:09:01 -07:00
idev = __in6_dev_get ( dev ) ;
if ( ! idev ) {
idev = ipv6_add_dev ( dev ) ;
2014-07-25 15:25:09 -07:00
if ( IS_ERR ( idev ) )
2019-08-23 15:44:36 +02:00
return idev ;
2005-04-16 15:20:36 -07:00
}
2005-12-21 22:57:44 +09:00
2005-04-16 15:20:36 -07:00
if ( dev - > flags & IFF_UP )
ipv6_mc_up ( idev ) ;
return idev ;
}
2012-10-25 22:28:50 +00:00
static int inet6_netconf_msgsize_devconf ( int type )
{
int size = NLMSG_ALIGN ( sizeof ( struct netconfmsg ) )
+ nla_total_size ( 4 ) ; /* NETCONFA_IFINDEX */
2016-03-10 08:55:50 +00:00
bool all = false ;
2012-10-25 22:28:50 +00:00
2016-03-10 08:55:50 +00:00
if ( type = = NETCONFA_ALL )
all = true ;
if ( all | | type = = NETCONFA_FORWARDING )
2012-10-25 22:28:50 +00:00
size + = nla_total_size ( 4 ) ;
2012-12-04 14:46:34 -05:00
# ifdef CONFIG_IPV6_MROUTE
2016-03-10 08:55:50 +00:00
if ( all | | type = = NETCONFA_MC_FORWARDING )
2012-12-04 01:13:35 +00:00
size + = nla_total_size ( 4 ) ;
2012-12-04 14:46:34 -05:00
# endif
2016-03-10 08:55:50 +00:00
if ( all | | type = = NETCONFA_PROXY_NEIGH )
2013-12-17 22:37:14 -08:00
size + = nla_total_size ( 4 ) ;
2012-10-25 22:28:50 +00:00
2016-03-10 08:55:50 +00:00
if ( all | | type = = NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN )
2015-08-13 10:39:01 -04:00
size + = nla_total_size ( 4 ) ;
2012-10-25 22:28:50 +00:00
return size ;
}
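The size arithmetic in inet6_netconf_msgsize_devconf() can be checked in userspace. The sketch below re-implements the 4-byte alignment rule under hypothetical sketch_* names, assuming a 4-byte attribute header and a one-byte struct netconfmsg; it is an illustration of the padding, not the kernel helpers themselves.

```c
#include <assert.h>

#define SKETCH_ALIGNTO    4u
#define SKETCH_ALIGN(len) (((len) + SKETCH_ALIGNTO - 1u) & ~(SKETCH_ALIGNTO - 1u))
#define SKETCH_NLA_HDRLEN 4u /* assumed: u16 length + u16 type */

/* Attribute header plus payload, padded up to a 4-byte boundary. */
unsigned int sketch_nla_total_size(unsigned int payload)
{
    return SKETCH_ALIGN(SKETCH_NLA_HDRLEN + payload);
}

/* Payload size of a netconf message that carries NETCONFA_IFINDEX plus
 * n_attrs further 32-bit attributes. header_len stands in for
 * sizeof(struct netconfmsg). */
unsigned int sketch_netconf_msgsize(unsigned int header_len,
                                    unsigned int n_attrs)
{
    unsigned int size = SKETCH_ALIGN(header_len) + sketch_nla_total_size(4);
    unsigned int i;

    for (i = 0; i < n_attrs; i++)
        size += sketch_nla_total_size(4);
    return size;
}
```

With a one-byte header and all four optional attributes this gives 4 + 8 + 4 * 8 = 44 bytes, before the netlink message header that nlmsg_new() accounts for separately.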
static int inet6_netconf_fill_devconf ( struct sk_buff * skb , int ifindex ,
struct ipv6_devconf * devconf , u32 portid ,
u32 seq , int event , unsigned int flags ,
int type )
{
struct nlmsghdr * nlh ;
struct netconfmsg * ncm ;
2016-03-10 08:55:50 +00:00
bool all = false ;
2012-10-25 22:28:50 +00:00
nlh = nlmsg_put ( skb , portid , seq , event , sizeof ( struct netconfmsg ) ,
flags ) ;
2015-03-29 14:00:04 +01:00
if ( ! nlh )
2012-10-25 22:28:50 +00:00
return - EMSGSIZE ;
2016-03-10 08:55:50 +00:00
if ( type = = NETCONFA_ALL )
all = true ;
2012-10-25 22:28:50 +00:00
ncm = nlmsg_data ( nlh ) ;
ncm - > ncm_family = AF_INET6 ;
if ( nla_put_s32 ( skb , NETCONFA_IFINDEX , ifindex ) < 0 )
goto nla_put_failure ;
2017-03-28 14:28:05 -07:00
if ( ! devconf )
goto out ;
2016-03-10 08:55:50 +00:00
if ( ( all | | type = = NETCONFA_FORWARDING ) & &
2012-10-25 22:28:50 +00:00
nla_put_s32 ( skb , NETCONFA_FORWARDING , devconf - > forwarding ) < 0 )
goto nla_put_failure ;
2012-12-04 14:46:34 -05:00
# ifdef CONFIG_IPV6_MROUTE
2016-03-10 08:55:50 +00:00
if ( ( all | | type = = NETCONFA_MC_FORWARDING ) & &
2012-12-04 01:13:35 +00:00
nla_put_s32 ( skb , NETCONFA_MC_FORWARDING ,
2022-02-04 12:15:45 -08:00
atomic_read ( & devconf - > mc_forwarding ) ) < 0 )
2012-12-04 01:13:35 +00:00
goto nla_put_failure ;
2012-12-04 14:46:34 -05:00
# endif
2016-03-10 08:55:50 +00:00
if ( ( all | | type = = NETCONFA_PROXY_NEIGH ) & &
2013-12-17 22:37:14 -08:00
nla_put_s32 ( skb , NETCONFA_PROXY_NEIGH , devconf - > proxy_ndp ) < 0 )
goto nla_put_failure ;
2016-03-10 08:55:50 +00:00
if ( ( all | | type = = NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN ) & &
2015-08-13 10:39:01 -04:00
nla_put_s32 ( skb , NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN ,
devconf - > ignore_routes_with_linkdown ) < 0 )
goto nla_put_failure ;
2017-03-28 14:28:05 -07:00
out :
2015-01-16 22:09:00 +01:00
nlmsg_end ( skb , nlh ) ;
return 0 ;
2012-10-25 22:28:50 +00:00
nla_put_failure :
nlmsg_cancel ( skb , nlh ) ;
return - EMSGSIZE ;
}
2017-03-28 14:28:04 -07:00
void inet6_netconf_notify_devconf ( struct net * net , int event , int type ,
int ifindex , struct ipv6_devconf * devconf )
2012-10-25 22:28:50 +00:00
{
struct sk_buff * skb ;
int err = - ENOBUFS ;
2016-07-08 05:46:04 +02:00
skb = nlmsg_new ( inet6_netconf_msgsize_devconf ( type ) , GFP_KERNEL ) ;
2015-03-29 14:00:04 +01:00
if ( ! skb )
2012-10-25 22:28:50 +00:00
goto errout ;
err = inet6_netconf_fill_devconf ( skb , ifindex , devconf , 0 , 0 ,
2017-03-28 14:28:04 -07:00
event , 0 , type ) ;
2012-10-25 22:28:50 +00:00
if ( err < 0 ) {
/* -EMSGSIZE implies BUG in inet6_netconf_msgsize_devconf() */
WARN_ON ( err = = - EMSGSIZE ) ;
kfree_skb ( skb ) ;
goto errout ;
}
2016-07-08 05:46:04 +02:00
rtnl_notify ( skb , net , 0 , RTNLGRP_IPV6_NETCONF , NULL , GFP_KERNEL ) ;
2012-10-25 22:28:50 +00:00
return ;
errout :
2012-12-18 12:08:56 +00:00
rtnl_set_sk_err ( net , RTNLGRP_IPV6_NETCONF , err ) ;
2012-10-25 22:28:50 +00:00
}
2012-10-25 22:28:51 +00:00
static const struct nla_policy devconf_ipv6_policy [ NETCONFA_MAX + 1 ] = {
[ NETCONFA_IFINDEX ] = { . len = sizeof ( int ) } ,
[ NETCONFA_FORWARDING ] = { . len = sizeof ( int ) } ,
2013-12-17 22:37:14 -08:00
[ NETCONFA_PROXY_NEIGH ] = { . len = sizeof ( int ) } ,
2015-08-13 10:39:01 -04:00
[ NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN ] = { . len = sizeof ( int ) } ,
2012-10-25 22:28:51 +00:00
} ;
2019-01-18 10:46:22 -08:00
static int inet6_netconf_valid_get_req ( struct sk_buff * skb ,
const struct nlmsghdr * nlh ,
struct nlattr * * tb ,
struct netlink_ext_ack * extack )
{
int i , err ;
if ( nlh - > nlmsg_len < nlmsg_msg_size ( sizeof ( struct netconfmsg ) ) ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid header for netconf get request " ) ;
return - EINVAL ;
}
if ( ! netlink_strict_get_check ( skb ) )
netlink: make validation more configurable for future strictness
We currently have two levels of strict validation:
1) liberal (default)
- undefined (type >= max) & NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
- garbage at end of message accepted
2) strict (opt-in)
- NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
Split out parsing strictness into four different options:
* TRAILING - check that there's no trailing data after parsing
attributes (in message or nested)
* MAXTYPE - reject attrs > max known type
* UNSPEC - reject attributes with NLA_UNSPEC policy entries
* STRICT_ATTRS - strictly validate attribute size
The default for future things should be *everything*.
The current *_strict() is a combination of TRAILING and MAXTYPE,
and is renamed to _deprecated_strict().
The current regular parsing has none of this, and is renamed to
*_parse_deprecated().
Additionally it allows us to selectively set one of the new flags
even on old policies. Notably, the UNSPEC flag could be useful in
this case, since it can be arranged (by filling in the policy) to
not be an incompatible userspace ABI change, but would then going
forward prevent forgetting attribute entries. Similar can apply
to the POLICY flag.
We end up with the following renames:
* nla_parse -> nla_parse_deprecated
* nla_parse_strict -> nla_parse_deprecated_strict
* nlmsg_parse -> nlmsg_parse_deprecated
* nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
* nla_parse_nested -> nla_parse_nested_deprecated
* nla_validate_nested -> nla_validate_nested_deprecated
Using spatch, of course:
@@
expression TB, MAX, HEAD, LEN, POL, EXT;
@@
-nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
+nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
@@
expression TB, MAX, NLA, POL, EXT;
@@
-nla_parse_nested(TB, MAX, NLA, POL, EXT)
+nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
@@
expression START, MAX, POL, EXT;
@@
-nla_validate_nested(START, MAX, POL, EXT)
+nla_validate_nested_deprecated(START, MAX, POL, EXT)
@@
expression NLH, HDRLEN, MAX, POL, EXT;
@@
-nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
+nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
For this patch, don't actually add the strict, non-renamed versions
yet so that it breaks compile if I get it wrong.
Also, while at it, make nla_validate and nla_parse go down to a
common __nla_validate_parse() function to avoid code duplication.
Ultimately, this allows us to have very strict validation for every
new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
next patch, while existing things will continue to work as is.
In effect then, this adds fully strict validation for any new command.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26 14:07:28 +02:00
return nlmsg_parse_deprecated ( nlh , sizeof ( struct netconfmsg ) ,
tb , NETCONFA_MAX ,
devconf_ipv6_policy , extack ) ;
2019-01-18 10:46:22 -08:00
2019-04-26 14:07:28 +02:00
err = nlmsg_parse_deprecated_strict ( nlh , sizeof ( struct netconfmsg ) ,
tb , NETCONFA_MAX ,
devconf_ipv6_policy , extack ) ;
2019-01-18 10:46:22 -08:00
if ( err )
return err ;
for ( i = 0 ; i < = NETCONFA_MAX ; i + + ) {
if ( ! tb [ i ] )
continue ;
switch ( i ) {
case NETCONFA_IFINDEX :
break ;
default :
NL_SET_ERR_MSG_MOD ( extack , " Unsupported attribute in netconf get request " ) ;
return - EINVAL ;
}
}
return 0 ;
}
2012-10-25 22:28:51 +00:00
static int inet6_netconf_get_devconf ( struct sk_buff * in_skb ,
2017-04-16 09:48:24 -07:00
struct nlmsghdr * nlh ,
struct netlink_ext_ack * extack )
2012-10-25 22:28:51 +00:00
{
struct net * net = sock_net ( in_skb - > sk ) ;
struct nlattr * tb [ NETCONFA_MAX + 1 ] ;
2017-10-11 10:28:00 +02:00
struct inet6_dev * in6_dev = NULL ;
struct net_device * dev = NULL ;
2012-10-25 22:28:51 +00:00
struct sk_buff * skb ;
struct ipv6_devconf * devconf ;
int ifindex ;
int err ;
2019-01-18 10:46:22 -08:00
err = inet6_netconf_valid_get_req ( in_skb , nlh , tb , extack ) ;
2012-10-25 22:28:51 +00:00
if ( err < 0 )
2017-10-11 10:28:00 +02:00
return err ;
2012-10-25 22:28:51 +00:00
if ( ! tb [ NETCONFA_IFINDEX ] )
2017-10-11 10:28:00 +02:00
return - EINVAL ;
2012-10-25 22:28:51 +00:00
2017-10-11 10:28:00 +02:00
err = - EINVAL ;
2012-10-25 22:28:51 +00:00
ifindex = nla_get_s32 ( tb [ NETCONFA_IFINDEX ] ) ;
switch ( ifindex ) {
case NETCONFA_IFINDEX_ALL :
devconf = net - > ipv6 . devconf_all ;
break ;
case NETCONFA_IFINDEX_DEFAULT :
devconf = net - > ipv6 . devconf_dflt ;
break ;
default :
2017-10-11 10:28:00 +02:00
dev = dev_get_by_index ( net , ifindex ) ;
2015-03-29 14:00:04 +01:00
if ( ! dev )
2017-10-11 10:28:00 +02:00
return - EINVAL ;
in6_dev = in6_dev_get ( dev ) ;
2015-03-29 14:00:04 +01:00
if ( ! in6_dev )
2012-10-25 22:28:51 +00:00
goto errout ;
devconf = & in6_dev - > cnf ;
break ;
}
err = - ENOBUFS ;
2017-10-11 10:28:00 +02:00
skb = nlmsg_new ( inet6_netconf_msgsize_devconf ( NETCONFA_ALL ) , GFP_KERNEL ) ;
2015-03-29 14:00:04 +01:00
if ( ! skb )
2012-10-25 22:28:51 +00:00
goto errout ;
err = inet6_netconf_fill_devconf ( skb , ifindex , devconf ,
NETLINK_CB ( in_skb ) . portid ,
nlh - > nlmsg_seq , RTM_NEWNETCONF , 0 ,
2016-03-10 08:55:50 +00:00
NETCONFA_ALL ) ;
2012-10-25 22:28:51 +00:00
if ( err < 0 ) {
/* -EMSGSIZE implies BUG in inet6_netconf_msgsize_devconf() */
WARN_ON ( err = = - EMSGSIZE ) ;
kfree_skb ( skb ) ;
goto errout ;
}
err = rtnl_unicast ( skb , net , NETLINK_CB ( in_skb ) . portid ) ;
errout :
2017-10-11 10:28:00 +02:00
if ( in6_dev )
in6_dev_put ( in6_dev ) ;
2021-08-05 19:55:27 +08:00
dev_put ( dev ) ;
2012-10-25 22:28:51 +00:00
return err ;
}
2024-02-15 17:21:07 +00:00
/* Combine dev_addr_genid and dev_base_seq to detect changes.
*/
static u32 inet6_base_seq ( const struct net * net )
{
u32 res = atomic_read ( & net - > ipv6 . dev_addr_genid ) +
net - > dev_base_seq ;
/* Must not return 0 (see nl_dump_check_consistent()).
* Chose a value far away from 0.
*/
if ( ! res )
res = 0x80000000 ;
return res ;
}
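The wraparound case in inet6_base_seq() is easy to miss: the sum of the two counters is taken modulo 2^32, so it can land on 0 even when neither input is 0. A small userspace sketch, with the hypothetical name sketch_base_seq and plain integers standing in for the atomic and per-net counters:

```c
#include <assert.h>
#include <stdint.h>

/* Combine two generation counters into one non-zero sequence value.
 * Unsigned addition wraps modulo 2^32, so 0 must be remapped. */
uint32_t sketch_base_seq(uint32_t addr_genid, uint32_t dev_base_seq)
{
    uint32_t res = addr_genid + dev_base_seq;

    if (!res)
        res = 0x80000000u;
    return res;
}
```

Any change to either counter changes the result (barring the remapped value), which is what lets a dump detect that the device or address set moved underneath it.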
2013-03-05 23:42:06 +00:00
static int inet6_netconf_dump_devconf ( struct sk_buff * skb ,
struct netlink_callback * cb )
{
2018-10-07 20:16:41 -07:00
const struct nlmsghdr * nlh = cb - > nlh ;
2013-03-05 23:42:06 +00:00
struct net * net = sock_net ( skb - > sk ) ;
int h , s_h ;
int idx , s_idx ;
struct net_device * dev ;
struct inet6_dev * idev ;
struct hlist_head * head ;
2018-10-07 20:16:41 -07:00
if ( cb - > strict_check ) {
struct netlink_ext_ack * extack = cb - > extack ;
struct netconfmsg * ncm ;
if ( nlh - > nlmsg_len < nlmsg_msg_size ( sizeof ( * ncm ) ) ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid header for netconf dump request " ) ;
return - EINVAL ;
}
if ( nlmsg_attrlen ( nlh , sizeof ( * ncm ) ) ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid data after header in netconf dump request " ) ;
return - EINVAL ;
}
}
2013-03-05 23:42:06 +00:00
s_h = cb - > args [ 0 ] ;
s_idx = idx = cb - > args [ 1 ] ;
for ( h = s_h ; h < NETDEV_HASHENTRIES ; h + + , s_idx = 0 ) {
idx = 0 ;
head = & net - > dev_index_head [ h ] ;
rcu_read_lock ( ) ;
2024-02-15 17:21:07 +00:00
cb - > seq = inet6_base_seq ( net ) ;
2013-03-05 23:42:06 +00:00
hlist_for_each_entry_rcu ( dev , head , index_hlist ) {
if ( idx < s_idx )
goto cont ;
idev = __in6_dev_get ( dev ) ;
if ( ! idev )
goto cont ;
if ( inet6_netconf_fill_devconf ( skb , dev - > ifindex ,
& idev - > cnf ,
NETLINK_CB ( cb - > skb ) . portid ,
2018-10-07 20:16:41 -07:00
nlh - > nlmsg_seq ,
2013-03-05 23:42:06 +00:00
RTM_NEWNETCONF ,
NLM_F_MULTI ,
2016-03-10 08:55:50 +00:00
NETCONFA_ALL ) < 0 ) {
2013-03-05 23:42:06 +00:00
rcu_read_unlock ( ) ;
goto done ;
}
2013-03-22 06:28:43 +00:00
nl_dump_check_consistent ( cb , nlmsg_hdr ( skb ) ) ;
2013-03-05 23:42:06 +00:00
cont :
idx + + ;
}
rcu_read_unlock ( ) ;
}
if ( h = = NETDEV_HASHENTRIES ) {
if ( inet6_netconf_fill_devconf ( skb , NETCONFA_IFINDEX_ALL ,
net - > ipv6 . devconf_all ,
NETLINK_CB ( cb - > skb ) . portid ,
2018-10-07 20:16:41 -07:00
nlh - > nlmsg_seq ,
2013-03-05 23:42:06 +00:00
RTM_NEWNETCONF , NLM_F_MULTI ,
2016-03-10 08:55:50 +00:00
NETCONFA_ALL ) < 0 )
2013-03-05 23:42:06 +00:00
goto done ;
else
h + + ;
}
if ( h = = NETDEV_HASHENTRIES + 1 ) {
if ( inet6_netconf_fill_devconf ( skb , NETCONFA_IFINDEX_DEFAULT ,
net - > ipv6 . devconf_dflt ,
NETLINK_CB ( cb - > skb ) . portid ,
2018-10-07 20:16:41 -07:00
nlh - > nlmsg_seq ,
2013-03-05 23:42:06 +00:00
RTM_NEWNETCONF , NLM_F_MULTI ,
2016-03-10 08:55:50 +00:00
NETCONFA_ALL ) < 0 )
2013-03-05 23:42:06 +00:00
goto done ;
else
h + + ;
}
done :
cb - > args [ 0 ] = h ;
cb - > args [ 1 ] = idx ;
return skb - > len ;
}
2005-04-16 15:20:36 -07:00
# ifdef CONFIG_SYSCTL
static void dev_forward_change ( struct inet6_dev * idev )
{
struct net_device * dev ;
struct inet6_ifaddr * ifa ;
2022-04-04 01:15:24 +02:00
LIST_HEAD ( tmp_addr_list ) ;
2005-04-16 15:20:36 -07:00
if ( ! idev )
return ;
dev = idev - > dev ;
2008-06-19 16:15:47 -07:00
if ( idev - > cnf . forwarding )
dev_disable_lro ( dev ) ;
2012-10-28 17:43:53 +00:00
if ( dev - > flags & IFF_MULTICAST ) {
2013-02-10 03:50:18 +00:00
if ( idev - > cnf . forwarding ) {
2008-04-10 15:42:11 +09:00
ipv6_dev_mc_inc ( dev , & in6addr_linklocal_allrouters ) ;
2013-02-10 03:50:18 +00:00
ipv6_dev_mc_inc ( dev , & in6addr_interfacelocal_allrouters ) ;
ipv6_dev_mc_inc ( dev , & in6addr_sitelocal_allrouters ) ;
} else {
2008-04-10 15:42:11 +09:00
ipv6_dev_mc_dec ( dev , & in6addr_linklocal_allrouters ) ;
2013-02-10 03:50:18 +00:00
ipv6_dev_mc_dec ( dev , & in6addr_interfacelocal_allrouters ) ;
ipv6_dev_mc_dec ( dev , & in6addr_sitelocal_allrouters ) ;
}
2005-04-16 15:20:36 -07:00
}
2010-03-17 20:31:13 +00:00
2022-04-04 01:15:24 +02:00
read_lock_bh ( & idev - > lock ) ;
2010-03-17 20:31:13 +00:00
list_for_each_entry ( ifa , & idev - > addr_list , if_list ) {
2007-02-26 15:36:10 -08:00
if ( ifa - > flags & IFA_F_TENTATIVE )
continue ;
2022-04-04 01:15:24 +02:00
list_add_tail ( & ifa - > if_list_aux , & tmp_addr_list ) ;
}
read_unlock_bh ( & idev - > lock ) ;
while ( ! list_empty ( & tmp_addr_list ) ) {
ifa = list_first_entry ( & tmp_addr_list ,
struct inet6_ifaddr , if_list_aux ) ;
list_del ( & ifa - > if_list_aux ) ;
2005-04-16 15:20:36 -07:00
if ( idev - > cnf . forwarding )
addrconf_join_anycast ( ifa ) ;
else
addrconf_leave_anycast ( ifa ) ;
}
2022-04-04 01:15:24 +02:00
2017-03-28 14:28:04 -07:00
inet6_netconf_notify_devconf ( dev_net ( dev ) , RTM_NEWNETCONF ,
NETCONFA_FORWARDING ,
2012-10-25 22:28:50 +00:00
dev - > ifindex , & idev - > cnf ) ;
2005-04-16 15:20:36 -07:00
}
2008-01-10 17:43:50 -08:00
static void addrconf_forward_change ( struct net * net , __s32 newf )
2005-04-16 15:20:36 -07:00
{
struct net_device * dev ;
struct inet6_dev * idev ;
2012-08-14 08:54:51 +00:00
for_each_netdev ( net , dev ) {
2005-04-16 15:20:36 -07:00
idev = __in6_dev_get ( dev ) ;
if ( idev ) {
2008-01-10 17:43:50 -08:00
int changed = ( ! idev - > cnf . forwarding ) ^ ( ! newf ) ;
idev - > cnf . forwarding = newf ;
2005-04-16 15:20:36 -07:00
if ( changed )
dev_forward_change ( idev ) ;
}
}
}
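The expression ( ! old ) ^ ( ! new ) used by addrconf_forward_change() and the fixup helpers detects a change in truth value rather than a change in the stored integer. A minimal sketch under the hypothetical name sketch_truth_changed:

```c
#include <assert.h>
#include <stdbool.h>

/* True only when one value is zero and the other is non-zero.
 * Moving between two different non-zero values is not a change. */
bool sketch_truth_changed(int oldv, int newv)
{
    return (!oldv) ^ (!newv);
}
```

This is why rewriting a setting from one non-zero value to another does not trigger dev_forward_change() or a netconf notification in the code above.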
2007-12-05 01:50:24 -08:00
2012-01-16 10:40:10 +00:00
static int addrconf_fixup_forwarding ( struct ctl_table * table , int * p , int newf )
2007-12-05 01:50:24 -08:00
{
2008-01-10 17:42:13 -08:00
struct net * net ;
2012-01-16 10:40:10 +00:00
int old ;
if ( ! rtnl_trylock ( ) )
return restart_syscall ( ) ;
2008-01-10 17:42:13 -08:00
net = ( struct net * ) table - > extra2 ;
2012-01-16 10:40:10 +00:00
old = * p ;
* p = newf ;
2009-02-26 06:55:31 +00:00
2012-01-16 10:40:10 +00:00
if ( p = = & net - > ipv6 . devconf_dflt - > forwarding ) {
2012-10-25 22:28:50 +00:00
if ( ( ! newf ) ^ ( ! old ) )
2017-03-28 14:28:04 -07:00
inet6_netconf_notify_devconf ( net , RTM_NEWNETCONF ,
NETCONFA_FORWARDING ,
2012-10-25 22:28:50 +00:00
NETCONFA_IFINDEX_DEFAULT ,
net - > ipv6 . devconf_dflt ) ;
2012-01-16 10:40:10 +00:00
rtnl_unlock ( ) ;
return 0 ;
2010-02-19 13:22:59 +00:00
}
2007-12-05 01:50:24 -08:00
2008-01-10 17:43:50 -08:00
if ( p = = & net - > ipv6 . devconf_all - > forwarding ) {
2016-08-30 10:09:21 +02:00
int old_dflt = net - > ipv6 . devconf_dflt - > forwarding ;
2008-01-10 17:43:50 -08:00
net - > ipv6 . devconf_dflt - > forwarding = newf ;
2016-08-30 10:09:21 +02:00
if ( ( ! newf ) ^ ( ! old_dflt ) )
2017-03-28 14:28:04 -07:00
inet6_netconf_notify_devconf ( net , RTM_NEWNETCONF ,
NETCONFA_FORWARDING ,
2016-08-30 10:09:21 +02:00
NETCONFA_IFINDEX_DEFAULT ,
net - > ipv6 . devconf_dflt ) ;
2008-01-10 17:43:50 -08:00
addrconf_forward_change ( net , newf ) ;
2012-10-25 22:28:50 +00:00
if ( ( ! newf ) ^ ( ! old ) )
2017-03-28 14:28:04 -07:00
inet6_netconf_notify_devconf ( net , RTM_NEWNETCONF ,
NETCONFA_FORWARDING ,
2012-10-25 22:28:50 +00:00
NETCONFA_IFINDEX_ALL ,
net - > ipv6 . devconf_all ) ;
2012-01-16 10:40:10 +00:00
} else if ( ( ! newf ) ^ ( ! old ) )
2007-12-05 01:50:24 -08:00
dev_forward_change ( ( struct inet6_dev * ) table - > extra1 ) ;
2008-06-19 16:15:47 -07:00
rtnl_unlock ( ) ;
2007-12-05 01:50:24 -08:00
2012-01-16 10:40:10 +00:00
if ( newf )
2008-03-04 13:47:14 -08:00
rt6_purge_dflt_routers ( net ) ;
2009-02-26 06:55:31 +00:00
return 1 ;
2007-12-05 01:50:24 -08:00
}
2015-08-13 10:39:01 -04:00
static void addrconf_linkdown_change ( struct net * net , __s32 newf )
{
struct net_device * dev ;
struct inet6_dev * idev ;
for_each_netdev ( net , dev ) {
idev = __in6_dev_get ( dev ) ;
if ( idev ) {
int changed = ( ! idev - > cnf . ignore_routes_with_linkdown ) ^ ( ! newf ) ;
idev - > cnf . ignore_routes_with_linkdown = newf ;
if ( changed )
inet6_netconf_notify_devconf ( dev_net ( dev ) ,
2017-03-28 14:28:04 -07:00
RTM_NEWNETCONF ,
2015-08-13 10:39:01 -04:00
NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN ,
dev - > ifindex ,
& idev - > cnf ) ;
}
}
}
static int addrconf_fixup_linkdown ( struct ctl_table * table , int * p , int newf )
{
struct net * net ;
int old ;
if ( ! rtnl_trylock ( ) )
return restart_syscall ( ) ;
net = ( struct net * ) table - > extra2 ;
old = * p ;
* p = newf ;
if ( p = = & net - > ipv6 . devconf_dflt - > ignore_routes_with_linkdown ) {
if ( ( ! newf ) ^ ( ! old ) )
inet6_netconf_notify_devconf ( net ,
2017-03-28 14:28:04 -07:00
RTM_NEWNETCONF ,
2015-08-13 10:39:01 -04:00
NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN ,
NETCONFA_IFINDEX_DEFAULT ,
net - > ipv6 . devconf_dflt ) ;
rtnl_unlock ( ) ;
return 0 ;
}
if ( p = = & net - > ipv6 . devconf_all - > ignore_routes_with_linkdown ) {
net - > ipv6 . devconf_dflt - > ignore_routes_with_linkdown = newf ;
addrconf_linkdown_change ( net , newf ) ;
if ( ( ! newf ) ^ ( ! old ) )
inet6_netconf_notify_devconf ( net ,
2017-03-28 14:28:04 -07:00
RTM_NEWNETCONF ,
2015-08-13 10:39:01 -04:00
NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN ,
NETCONFA_IFINDEX_ALL ,
net - > ipv6 . devconf_all ) ;
}
rtnl_unlock ( ) ;
return 1 ;
}
2005-04-16 15:20:36 -07:00
# endif
2010-03-17 20:31:11 +00:00
/* Nobody refers to this ifaddr, destroy it */
2005-04-16 15:20:36 -07:00
void inet6_ifa_finish_destroy ( struct inet6_ifaddr * ifp )
{
2010-03-17 20:31:10 +00:00
WARN_ON ( ! hlist_unhashed ( & ifp - > addr_lst ) ) ;
2008-07-25 21:43:18 -07:00
2005-04-16 15:20:36 -07:00
# ifdef NET_REFCNT_DEBUG
2012-05-15 14:11:54 +00:00
pr_debug ( " %s \n " , __func__ ) ;
2005-04-16 15:20:36 -07:00
# endif
in6_dev_put ( ifp - > idev ) ;
2014-03-27 18:28:07 +01:00
if ( cancel_delayed_work ( & ifp - > dad_work ) )
pr_notice ( " delayed DAD work was pending while freeing ifa=%p \n " ,
ifp ) ;
2005-04-16 15:20:36 -07:00
2010-05-18 15:36:06 -07:00
if ( ifp - > state ! = INET6_IFADDR_STATE_DEAD ) {
2012-05-15 14:11:53 +00:00
pr_warn ( " Freeing alive inet6 address %p \n " , ifp ) ;
2005-04-16 15:20:36 -07:00
return ;
}
2011-03-15 18:00:14 +08:00
kfree_rcu ( ifp , rcu ) ;
2005-04-16 15:20:36 -07:00
}
2006-07-10 15:25:51 -07:00
static void
ipv6_link_dev_addr ( struct inet6_dev * idev , struct inet6_ifaddr * ifp )
{
2010-03-17 20:31:13 +00:00
struct list_head * p ;
2006-07-11 13:05:30 -07:00
int ifp_scope = ipv6_addr_src_scope ( & ifp - > addr ) ;
2006-07-10 15:25:51 -07:00
/*
* Each device address list is sorted in order of scope -
* global before linklocal .
*/
2010-03-17 20:31:13 +00:00
list_for_each ( p , & idev - > addr_list ) {
struct inet6_ifaddr * ifa
= list_entry ( p , struct inet6_ifaddr , if_list ) ;
2006-07-11 13:05:30 -07:00
if ( ifp_scope > = ipv6_addr_src_scope ( & ifa - > addr ) )
2006-07-10 15:25:51 -07:00
break ;
}
2017-10-07 19:30:23 -07:00
list_add_tail_rcu ( & ifp - > if_list , p ) ;
2006-07-10 15:25:51 -07:00
}
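The ordering rule in ipv6_link_dev_addr() can be sketched on a plain array. This assumes that a numerically larger scope value means a wider scope, which matches the "global before linklocal" comment; sketch_insert_by_scope is a hypothetical name and the array stands in for the kernel's linked list.

```c
#include <assert.h>
#include <stddef.h>

/* scopes[0..n) is kept in descending order. Insert the new scope before
 * the first entry whose scope is less than or equal to it, so a new
 * address goes ahead of existing addresses of the same scope.
 * The caller must provide room for n + 1 entries.
 * Returns the index at which the value was stored. */
size_t sketch_insert_by_scope(int *scopes, size_t n, int scope)
{
    size_t pos = 0;
    size_t i;

    assert(scopes != NULL);
    while (pos < n && scope < scopes[pos])
        pos++;
    for (i = n; i > pos; i--)
        scopes[i] = scopes[i - 1];
    scopes[pos] = scope;
    return pos;
}
```

When no existing entry satisfies the condition, the loop runs off the end and the new entry is appended, which corresponds to list_add_tail_rcu() being called with the list head.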
2017-10-23 16:17:47 -07:00
static u32 inet6_addr_hash ( const struct net * net , const struct in6_addr * addr )
2008-04-10 15:42:08 +09:00
{
2017-10-23 16:17:47 -07:00
u32 val = ipv6_addr_hash ( addr ) ^ net_hash_mix ( net ) ;
return hash_32 ( val , IN6_ADDR_HSIZE_SHIFT ) ;
2008-04-10 15:42:08 +09:00
}
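As a rough illustration of how inet6_addr_hash() turns an address hash and a per-namespace mix into a bucket index, the sketch below uses a multiplicative hash that keeps the top bits of the product. The multiplier is assumed to match the kernel's 32-bit golden-ratio constant, and the two inputs are taken as already-computed 32-bit values; the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_GOLDEN_RATIO_32 0x61C88647u /* assumed kernel constant */

/* Multiplicative hash: keep the top `bits` bits of the 32-bit product. */
uint32_t sketch_hash_32(uint32_t val, unsigned int bits)
{
    assert(bits >= 1 && bits <= 32); /* a shift by 32 would be undefined */
    return (val * SKETCH_GOLDEN_RATIO_32) >> (32 - bits);
}

/* Mix the namespace value into the address hash, then fold the result
 * down to a table index in the range [0, 2^bits). */
uint32_t sketch_addr_bucket(uint32_t addr_hash, uint32_t net_mix,
                            unsigned int bits)
{
    return sketch_hash_32(addr_hash ^ net_mix, bits);
}
```

Mixing in the namespace value means the same address in two namespaces usually lands in different buckets of a shared table layout.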
2017-10-23 16:17:45 -07:00
static bool ipv6_chk_same_addr ( struct net * net , const struct in6_addr * addr ,
2017-10-23 16:17:46 -07:00
struct net_device * dev , unsigned int hash )
2017-10-23 16:17:45 -07:00
{
struct inet6_ifaddr * ifp ;
2022-02-07 20:50:30 -08:00
hlist_for_each_entry ( ifp , & net - > ipv6 . inet6_addr_lst [ hash ] , addr_lst ) {
2017-10-23 16:17:45 -07:00
if ( ipv6_addr_equal ( & ifp - > addr , addr ) ) {
if ( ! dev | | ifp - > idev - > dev = = dev )
return true ;
}
}
return false ;
}
2017-10-18 09:56:52 -07:00
static int ipv6_add_addr_hash ( struct net_device * dev , struct inet6_ifaddr * ifa )
{
2022-02-07 20:50:30 -08:00
struct net * net = dev_net ( dev ) ;
unsigned int hash = inet6_addr_hash ( net , & ifa - > addr ) ;
2017-10-18 09:56:52 -07:00
int err = 0 ;
2023-03-21 04:01:14 +00:00
spin_lock_bh ( & net - > ipv6 . addrconf_hash_lock ) ;
2017-10-18 09:56:52 -07:00
/* Ignore adding duplicate addresses on an interface */
2022-02-07 20:50:30 -08:00
if ( ipv6_chk_same_addr ( net , & ifa - > addr , dev , hash ) ) {
2018-03-26 08:35:01 -07:00
netdev_dbg ( dev , " ipv6_add_addr: already assigned \n " ) ;
2017-10-18 09:56:52 -07:00
err = - EEXIST ;
2017-10-23 16:17:46 -07:00
} else {
2022-02-07 20:50:30 -08:00
hlist_add_head_rcu ( & ifa - > addr_lst , & net - > ipv6 . inet6_addr_lst [ hash ] ) ;
2017-10-18 09:56:52 -07:00
}
2023-03-21 04:01:14 +00:00
spin_unlock_bh ( & net - > ipv6 . addrconf_hash_lock ) ;
2017-10-18 09:56:52 -07:00
return err ;
}
2005-04-16 15:20:36 -07:00
/* On success it returns ifp with increased reference count */
static struct inet6_ifaddr *
2018-05-27 08:09:53 -07:00
ipv6_add_addr ( struct inet6_dev * idev , struct ifa6_config * cfg ,
2017-10-18 09:56:54 -07:00
bool can_block , struct netlink_ext_ack * extack )
2005-04-16 15:20:36 -07:00
{
2017-10-18 09:56:52 -07:00
gfp_t gfp_flags = can_block ? GFP_KERNEL : GFP_ATOMIC ;
2018-05-27 08:09:53 -07:00
int addr_type = ipv6_addr_type ( cfg - > pfx ) ;
2017-02-23 16:27:18 +00:00
struct net * net = dev_net ( idev - > dev ) ;
2005-04-16 15:20:36 -07:00
struct inet6_ifaddr * ifa = NULL ;
2018-04-18 15:39:00 -07:00
struct fib6_info * f6i = NULL ;
2005-04-16 15:20:36 -07:00
int err = 0 ;
2008-06-25 16:26:47 +09:00
2023-07-26 10:39:05 +08:00
if ( addr_type = = IPV6_ADDR_ANY ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid address " ) ;
2008-06-25 16:26:47 +09:00
return ERR_PTR ( - EADDRNOTAVAIL ) ;
2023-07-26 10:39:05 +08:00
} else if ( addr_type & IPV6_ADDR_MULTICAST & &
! ( cfg - > ifa_flags & IFA_F_MCAUTOJOIN ) ) {
NL_SET_ERR_MSG_MOD ( extack , " Cannot assign multicast address without \" IFA_F_MCAUTOJOIN \" flag " ) ;
return ERR_PTR ( - EADDRNOTAVAIL ) ;
} else if ( ! ( idev - > dev - > flags & IFF_LOOPBACK ) & &
! netif_is_l3_master ( idev - > dev ) & &
addr_type & IPV6_ADDR_LOOPBACK ) {
NL_SET_ERR_MSG_MOD ( extack , " Cannot assign loopback address on this device " ) ;
return ERR_PTR ( - EADDRNOTAVAIL ) ;
}
2005-04-16 15:20:36 -07:00
if ( idev - > dead ) {
2023-07-26 10:39:05 +08:00
NL_SET_ERR_MSG_MOD ( extack , " device is going away " ) ;
err = - ENODEV ;
2017-10-18 09:56:52 -07:00
goto out ;
2005-04-16 15:20:36 -07:00
}
2009-06-01 03:07:33 -07:00
if ( idev - > cnf . disable_ipv6 ) {
2023-07-26 10:39:05 +08:00
NL_SET_ERR_MSG_MOD ( extack , " IPv6 is disabled on this device " ) ;
2009-03-18 18:22:48 -07:00
err = - EACCES ;
2017-10-18 09:56:52 -07:00
goto out ;
2009-03-18 18:22:48 -07:00
}
2017-10-18 09:56:53 -07:00
/* validator notifier needs to be blocking;
* do not call in atomic context
*/
if ( can_block ) {
struct in6_validator_info i6vi = {
2018-05-27 08:09:53 -07:00
. i6vi_addr = * cfg - > pfx ,
2017-10-18 09:56:53 -07:00
. i6vi_dev = idev ,
2017-10-18 09:56:54 -07:00
. extack = extack ,
2017-10-18 09:56:53 -07:00
} ;
err = inet6addr_validator_notifier_call_chain ( NETDEV_UP , & i6vi ) ;
err = notifier_to_errno ( err ) ;
if ( err < 0 )
goto out ;
}
2005-04-16 15:20:36 -07:00
memcg: enable accounting for IP address and routing-related objects
A netadmin inside a container can use 'ip a a' and 'ip r a'
to assign a large number of ipv4/ipv6 addresses and routing entries
and force kernel to allocate megabytes of unaccounted memory
for long-lived per-netdevice related kernel objects:
'struct in_ifaddr', 'struct inet6_ifaddr', 'struct fib6_node',
'struct rt6_info', 'struct fib_rules' and ip_fib caches.
These objects can be manually removed, though usually they live
in memory until their net namespace is destroyed.
It makes sense to account for them to restrict the host's memory
consumption from inside the memcg-limited container.
One of such objects is the 'struct fib6_node' mostly allocated in
net/ipv6/route.c::__ip6_ins_rt() inside the lock_bh()/unlock_bh() section:
write_lock_bh(&table->tb6_lock);
err = fib6_add(&table->tb6_root, rt, info, mxc);
write_unlock_bh(&table->tb6_lock);
In this case it is not enough to simply add SLAB_ACCOUNT to corresponding
kmem cache. The proper memory cgroup still cannot be found due to the
incorrect 'in_interrupt()' check used in memcg_kmem_bypass().
The obsolete in_interrupt() does not describe the real execution context properly.
From include/linux/preempt.h:
The following macros are deprecated and should not be used in new code:
in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
To verify the current execution context new macro should be used instead:
in_task() - We're in task context
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-19 13:44:31 +03:00
ifa = kzalloc ( sizeof ( * ifa ) , gfp_flags | __GFP_ACCOUNT ) ;
2015-03-29 14:00:04 +01:00
if ( ! ifa ) {
2005-04-16 15:20:36 -07:00
err = - ENOBUFS ;
goto out ;
}
2023-07-26 10:39:05 +08:00
f6i = addrconf_f6i_alloc ( net , idev , cfg - > pfx , false , gfp_flags , extack ) ;
2018-04-18 15:39:00 -07:00
if ( IS_ERR ( f6i ) ) {
err = PTR_ERR ( f6i ) ;
f6i = NULL ;
2005-04-16 15:20:36 -07:00
goto out ;
}
2013-12-07 19:26:57 +01:00
neigh_parms_data_state_setall ( idev - > nd_parms ) ;
2018-05-27 08:09:53 -07:00
ifa - > addr = * cfg - > pfx ;
if ( cfg - > peer_pfx )
ifa - > peer_addr = * cfg - > peer_pfx ;
2005-04-16 15:20:36 -07:00
spin_lock_init ( & ifa - > lock ) ;
2014-03-27 18:28:07 +01:00
INIT_DELAYED_WORK ( & ifa - > dad_work , addrconf_dad_work ) ;
2010-03-17 20:31:10 +00:00
INIT_HLIST_NODE ( & ifa - > addr_lst ) ;
2018-05-27 08:09:53 -07:00
ifa - > scope = cfg - > scope ;
ifa - > prefix_len = cfg - > plen ;
2018-05-27 08:09:58 -07:00
ifa - > rt_priority = cfg - > rt_priority ;
2018-05-27 08:09:53 -07:00
ifa - > flags = cfg - > ifa_flags ;
2022-02-17 16:02:02 +01:00
ifa - > ifa_proto = cfg - > ifa_proto ;
2017-05-12 17:03:39 -07:00
/* No need to add the TENTATIVE flag for addresses with NODAD */
2018-05-27 08:09:53 -07:00
if ( ! ( cfg - > ifa_flags & IFA_F_NODAD ) )
2017-05-12 17:03:39 -07:00
ifa - > flags | = IFA_F_TENTATIVE ;
2018-05-27 08:09:53 -07:00
ifa - > valid_lft = cfg - > valid_lft ;
ifa - > prefered_lft = cfg - > preferred_lft ;
2005-04-16 15:20:36 -07:00
ifa - > cstamp = ifa - > tstamp = jiffies ;
2013-04-09 03:47:16 +00:00
ifa - > tokenized = false ;
2005-04-16 15:20:36 -07:00
2018-04-18 15:39:00 -07:00
ifa - > rt = f6i ;
2006-08-29 02:43:49 -07:00
2005-04-16 15:20:36 -07:00
ifa - > idev = idev ;
2017-10-18 09:56:52 -07:00
in6_dev_hold ( idev ) ;
2005-04-16 15:20:36 -07:00
/* For caller */
2017-07-04 09:34:56 +03:00
refcount_set ( & ifa - > refcnt , 1 ) ;
2005-04-16 15:20:36 -07:00
2023-03-21 04:01:14 +00:00
rcu_read_lock ( ) ;
2005-04-16 15:20:36 -07:00
2017-10-18 09:56:52 -07:00
err = ipv6_add_addr_hash ( idev - > dev , ifa ) ;
if ( err < 0 ) {
2023-03-21 04:01:14 +00:00
rcu_read_unlock ( ) ;
2017-10-18 09:56:52 -07:00
goto out ;
}
2005-04-16 15:20:36 -07:00
2023-03-21 04:01:14 +00:00
write_lock_bh ( & idev - > lock ) ;
2017-10-18 09:56:52 -07:00
2005-04-16 15:20:36 -07:00
/* Add to inet6_dev unicast addr list. */
2006-07-10 15:25:51 -07:00
ipv6_link_dev_addr ( idev , ifa ) ;
2005-04-16 15:20:36 -07:00
if ( ifa - > flags & IFA_F_TEMPORARY ) {
2010-03-17 20:31:09 +00:00
list_add ( & ifa - > tmp_list , & idev - > tempaddr_list ) ;
2005-04-16 15:20:36 -07:00
in6_ifa_hold ( ifa ) ;
}
in6_ifa_hold ( ifa ) ;
2023-03-21 04:01:14 +00:00
write_unlock_bh ( & idev - > lock ) ;
2017-10-18 09:56:52 -07:00
2023-03-21 04:01:14 +00:00
rcu_read_unlock ( ) ;
2005-04-16 15:20:36 -07:00
2017-10-18 09:56:52 -07:00
inet6addr_notifier_call_chain ( NETDEV_UP , ifa ) ;
out :
if ( unlikely ( err < 0 ) ) {
2018-04-18 15:39:00 -07:00
fib6_info_release ( f6i ) ;
2018-04-17 17:33:25 -07:00
2017-10-18 09:56:52 -07:00
if ( ifa ) {
if ( ifa - > idev )
in6_dev_put ( ifa - > idev ) ;
kfree ( ifa ) ;
}
2005-04-16 15:20:36 -07:00
ifa = ERR_PTR ( err ) ;
}
return ifa ;
}
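ipv6_add_addr() reports failure by returning an errno value encoded in the pointer itself (ERR_PTR), and callers such as ipv6_create_tempaddr() test the result with IS_ERR(). The following is a minimal user-space sketch of that convention. The model_* names and MODEL_MAX_ERRNO are hypothetical stand-ins written for illustration, not the kernel's definitions, and the pointer/integer casts assume a conventional flat address space.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; the kernel's own helpers are
 * ERR_PTR(), PTR_ERR() and IS_ERR().
 */
#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *model_err_ptr(long err)
{
	assert(err < 0 && err >= -MODEL_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer. */
static inline long model_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if ptr falls in the top MODEL_MAX_ERRNO values of the address
 * range, which this model reserves for encoded errors.
 */
static inline bool model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}
```

With this model, model_err_ptr(-EEXIST) is recognised by model_is_err(), while NULL and ordinary object pointers are not, so a single return value can carry either a valid object or the reason for failure.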
2014-01-15 15:36:59 +01:00
enum cleanup_prefix_rt_t {
CLEANUP_PREFIX_RT_NOP , /* no cleanup action for prefix route */
CLEANUP_PREFIX_RT_DEL , /* delete the prefix route */
CLEANUP_PREFIX_RT_EXPIRE , /* update the lifetime of the prefix route */
} ;
/*
* Check whether the prefix for ifp would still need a prefix route
* after deleting ifp. The function returns one of the CLEANUP_PREFIX_RT_*
* constants.
*
* 1) we don't purge the prefix if the address was not permanent;
*    the prefix is managed by its own lifetime.
* 2) we also don't purge if the address was IFA_F_NOPREFIXROUTE.
* 3) if there are no other addresses, delete the prefix.
* 4) if there are still other permanent address(es),
*    the corresponding prefix is still permanent.
* 5) if there are still other addresses with IFA_F_NOPREFIXROUTE,
*    don't purge the prefix; assume user space is managing it.
* 6) otherwise, update the prefix lifetime to the
*    longest valid lifetime among the corresponding
*    addresses on the device.
*    Note: a subsequent RA will update the lifetime.
*/
static enum cleanup_prefix_rt_t
check_cleanup_prefix_route ( struct inet6_ifaddr * ifp , unsigned long * expires )
{
struct inet6_ifaddr * ifa ;
struct inet6_dev * idev = ifp - > idev ;
unsigned long lifetime ;
enum cleanup_prefix_rt_t action = CLEANUP_PREFIX_RT_DEL ;
* expires = jiffies ;
list_for_each_entry ( ifa , & idev - > addr_list , if_list ) {
if ( ifa = = ifp )
continue ;
2019-02-11 10:57:46 +08:00
if ( ifa - > prefix_len ! = ifp - > prefix_len | |
! ipv6_prefix_equal ( & ifa - > addr , & ifp - > addr ,
2014-01-15 15:36:59 +01:00
ifp - > prefix_len ) )
continue ;
if ( ifa - > flags & ( IFA_F_PERMANENT | IFA_F_NOPREFIXROUTE ) )
return CLEANUP_PREFIX_RT_NOP ;
action = CLEANUP_PREFIX_RT_EXPIRE ;
spin_lock ( & ifa - > lock ) ;
lifetime = addrconf_timeout_fixup ( ifa - > valid_lft , HZ ) ;
/*
* Note : Because this address is
* not permanent , lifetime <
* LONG_MAX / HZ here .
*/
if ( time_before ( * expires , ifa - > tstamp + lifetime * HZ ) )
* expires = ifa - > tstamp + lifetime * HZ ;
spin_unlock ( & ifa - > lock ) ;
}
return action ;
}
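The decision made by check_cleanup_prefix_route() can be sketched in user space as follows. This is an illustrative model only: struct model_addr, enum model_action and model_check_cleanup() are hypothetical names, the list holds only the other addresses on the device (the deleted one is left out), expiry times are plain tick counts, and the comparison ignores the jiffies wraparound that time_before() handles in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_action { MODEL_RT_NOP, MODEL_RT_DEL, MODEL_RT_EXPIRE };

struct model_addr {
	bool same_prefix;	/* same prefix and prefix length as the deleted address */
	bool permanent;		/* models IFA_F_PERMANENT */
	bool noprefixroute;	/* models IFA_F_NOPREFIXROUTE */
	unsigned long expiry;	/* models tstamp + valid_lft * HZ, in ticks */
};

static enum model_action
model_check_cleanup(const struct model_addr *others, size_t n,
		    unsigned long now, unsigned long *expires)
{
	enum model_action action = MODEL_RT_DEL;	/* rule 3: nobody else uses the prefix */
	size_t i;

	assert(expires != NULL);
	*expires = now;
	for (i = 0; i < n; i++) {
		if (!others[i].same_prefix)
			continue;
		/* rules 4 and 5: leave the prefix route alone */
		if (others[i].permanent || others[i].noprefixroute)
			return MODEL_RT_NOP;
		/* rule 6: keep the route until the longest-lived address expires */
		action = MODEL_RT_EXPIRE;
		if (others[i].expiry > *expires)
			*expires = others[i].expiry;
	}
	return action;
}
```

Rules 1 and 2 do not appear in the model because, as in ipv6_del_addr(), they are checked by the caller before the scan runs.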
static void
2020-03-03 14:37:35 +08:00
cleanup_prefix_route ( struct inet6_ifaddr * ifp , unsigned long expires ,
bool del_rt , bool del_peer )
2014-01-15 15:36:59 +01:00
{
2018-04-18 15:38:59 -07:00
struct fib6_info * f6i ;
2014-01-15 15:36:59 +01:00
2020-03-03 14:37:35 +08:00
f6i = addrconf_get_prefix_route ( del_peer ? & ifp - > peer_addr : & ifp - > addr ,
ifp - > prefix_len ,
2019-03-27 20:53:52 -07:00
ifp - > idev - > dev , 0 , RTF_DEFAULT , true ) ;
2018-04-18 15:38:59 -07:00
if ( f6i ) {
2014-01-15 15:36:59 +01:00
if ( del_rt )
2020-04-27 13:56:45 -07:00
ip6_del_rt ( dev_net ( ifp - > idev - > dev ) , f6i , false ) ;
2014-01-15 15:36:59 +01:00
else {
2018-04-18 15:38:59 -07:00
if ( ! ( f6i - > fib6_flags & RTF_EXPIRES ) )
fib6_set_expires ( f6i , expires ) ;
fib6_info_release ( f6i ) ;
2014-01-15 15:36:59 +01:00
}
}
}
2005-04-16 15:20:36 -07:00
/* This function expects a referenced ifp and releases that reference before returning */
static void ipv6_del_addr ( struct inet6_ifaddr * ifp )
{
2014-01-15 15:36:59 +01:00
enum cleanup_prefix_rt_t action = CLEANUP_PREFIX_RT_NOP ;
2022-02-07 20:50:30 -08:00
struct net * net = dev_net ( ifp - > idev - > dev ) ;
2014-01-15 15:36:59 +01:00
unsigned long expires ;
2022-02-07 20:50:30 -08:00
int state ;
2005-04-16 15:20:36 -07:00
2014-03-27 18:28:07 +01:00
ASSERT_RTNL ( ) ;
2015-03-23 23:36:03 +01:00
spin_lock_bh ( & ifp - > lock ) ;
2010-05-18 15:54:18 -07:00
state = ifp - > state ;
2010-05-18 15:36:06 -07:00
ifp - > state = INET6_IFADDR_STATE_DEAD ;
2015-03-23 23:36:03 +01:00
spin_unlock_bh ( & ifp - > lock ) ;
2010-05-18 15:54:18 -07:00
if ( state = = INET6_IFADDR_STATE_DEAD )
goto out ;
2005-04-16 15:20:36 -07:00
2022-02-07 20:50:30 -08:00
spin_lock_bh ( & net - > ipv6 . addrconf_hash_lock ) ;
2010-03-17 20:31:11 +00:00
hlist_del_init_rcu ( & ifp - > addr_lst ) ;
2022-02-07 20:50:30 -08:00
spin_unlock_bh ( & net - > ipv6 . addrconf_hash_lock ) ;
2005-04-16 15:20:36 -07:00
2014-01-15 15:36:59 +01:00
write_lock_bh ( & ifp - > idev - > lock ) ;
2013-10-28 20:07:50 -04:00
2005-04-16 15:20:36 -07:00
if ( ifp - > flags & IFA_F_TEMPORARY ) {
2010-03-17 20:31:09 +00:00
list_del ( & ifp - > tmp_list ) ;
if ( ifp - > ifpub ) {
in6_ifa_put ( ifp - > ifpub ) ;
ifp - > ifpub = NULL ;
2005-04-16 15:20:36 -07:00
}
2010-03-17 20:31:09 +00:00
__in6_ifa_put ( ifp ) ;
2005-04-16 15:20:36 -07:00
}
2014-01-15 15:36:59 +01:00
if ( ifp - > flags & IFA_F_PERMANENT & & ! ( ifp - > flags & IFA_F_NOPREFIXROUTE ) )
action = check_cleanup_prefix_route ( ifp , & expires ) ;
2010-03-17 20:31:13 +00:00
2017-10-07 19:30:23 -07:00
list_del_rcu ( & ifp - > if_list ) ;
2014-01-15 15:36:59 +01:00
__in6_ifa_put ( ifp ) ;
write_unlock_bh ( & ifp - > idev - > lock ) ;
2005-04-16 15:20:36 -07:00
2014-03-27 18:28:07 +01:00
addrconf_del_dad_work ( ifp ) ;
2008-07-08 15:13:31 -07:00
2005-04-16 15:20:36 -07:00
ipv6_ifa_notify ( RTM_DELADDR , ifp ) ;
2013-04-14 23:18:43 +08:00
inet6addr_notifier_call_chain ( NETDEV_DOWN , ifp ) ;
2005-04-16 15:20:36 -07:00
2014-01-15 15:36:59 +01:00
if ( action ! = CLEANUP_PREFIX_RT_NOP ) {
cleanup_prefix_route ( ifp , expires ,
2020-03-03 14:37:35 +08:00
action = = CLEANUP_PREFIX_RT_DEL , false ) ;
2005-04-16 15:20:36 -07:00
}
2011-04-13 21:10:57 +00:00
/* clean up prefsrc entries */
rt6_remove_prefsrc ( ifp ) ;
2010-05-18 15:54:18 -07:00
out :
2005-04-16 15:20:36 -07:00
in6_ifa_put ( ifp ) ;
}
2020-05-01 00:51:47 -03:00
static int ipv6_create_tempaddr ( struct inet6_ifaddr * ifp , bool block )
2005-04-16 15:20:36 -07:00
{
struct inet6_dev * idev = ifp - > idev ;
2018-05-27 08:09:53 -07:00
unsigned long tmp_tstamp , age ;
2008-04-02 00:01:35 -07:00
unsigned long regen_advance ;
2011-07-26 13:50:49 +00:00
unsigned long now = jiffies ;
2016-10-20 12:29:26 +02:00
s32 cnf_temp_preferred_lft ;
2020-05-01 00:51:47 -03:00
struct inet6_ifaddr * ift ;
struct ifa6_config cfg ;
long max_desync_factor ;
struct in6_addr addr ;
int ret = 0 ;
2005-04-16 15:20:36 -07:00
2013-12-06 09:45:22 +01:00
write_lock_bh ( & idev - > lock ) ;
2020-05-01 00:51:47 -03:00
2005-04-16 15:20:36 -07:00
retry :
in6_dev_hold ( idev ) ;
if ( idev - > cnf . use_tempaddr < = 0 ) {
2013-12-06 09:45:22 +01:00
write_unlock_bh ( & idev - > lock ) ;
2012-05-15 14:11:53 +00:00
pr_info ( " %s: use_tempaddr is disabled \n " , __func__ ) ;
2005-04-16 15:20:36 -07:00
in6_dev_put ( idev ) ;
ret = - 1 ;
goto out ;
}
spin_lock_bh ( & ifp - > lock ) ;
if ( ifp - > regen_count + + > = idev - > cnf . regen_max_retry ) {
idev - > cnf . use_tempaddr = - 1 ; /*XXX*/
spin_unlock_bh ( & ifp - > lock ) ;
2013-12-06 09:45:22 +01:00
write_unlock_bh ( & idev - > lock ) ;
2012-05-15 14:11:53 +00:00
pr_warn ( " %s: regeneration time exceeded - disabled temporary address support \n " ,
__func__ ) ;
2005-04-16 15:20:36 -07:00
in6_dev_put ( idev ) ;
ret = - 1 ;
goto out ;
}
in6_ifa_hold ( ifp ) ;
memcpy ( addr . s6_addr , ifp - > addr . s6_addr , 8 ) ;
2020-05-01 00:51:47 -03:00
ipv6_gen_rnd_iid ( & addr ) ;
2011-07-26 13:50:49 +00:00
age = ( now - ifp - > tstamp ) / HZ ;
2016-10-13 18:52:15 +02:00
regen_advance = idev - > cnf . regen_max_retry *
idev - > cnf . dad_transmits *
2020-04-01 14:46:20 +08:00
max ( NEIGH_VAR ( idev - > nd_parms , RETRANS_TIME ) , HZ / 100 ) / HZ ;
2016-10-13 18:52:15 +02:00
/* recalculate max_desync_factor each time and update
* idev - > desync_factor if it ' s larger
*/
2016-10-20 12:29:26 +02:00
cnf_temp_preferred_lft = READ_ONCE ( idev - > cnf . temp_prefered_lft ) ;
2023-08-31 22:41:27 -06:00
max_desync_factor = min_t ( long ,
2016-10-13 18:52:15 +02:00
idev - > cnf . max_desync_factor ,
2016-10-20 12:29:26 +02:00
cnf_temp_preferred_lft - regen_advance ) ;
2016-10-13 18:52:15 +02:00
if ( unlikely ( idev - > desync_factor > max_desync_factor ) ) {
if ( max_desync_factor > 0 ) {
get_random_bytes ( & idev - > desync_factor ,
sizeof ( idev - > desync_factor ) ) ;
idev - > desync_factor % = max_desync_factor ;
} else {
idev - > desync_factor = 0 ;
}
}
2018-06-11 07:12:12 -07:00
memset ( & cfg , 0 , sizeof ( cfg ) ) ;
2018-05-27 08:09:53 -07:00
cfg . valid_lft = min_t ( __u32 , ifp - > valid_lft ,
2010-09-27 07:10:10 +00:00
idev - > cnf . temp_valid_lft + age ) ;
2018-05-27 08:09:53 -07:00
cfg . preferred_lft = cnf_temp_preferred_lft + age - idev - > desync_factor ;
cfg . preferred_lft = min_t ( __u32 , ifp - > prefered_lft , cfg . preferred_lft ) ;
2023-10-24 15:23:07 -06:00
cfg . preferred_lft = min_t ( __u32 , cfg . valid_lft , cfg . preferred_lft ) ;
2018-05-27 08:09:53 -07:00
cfg . plen = ifp - > prefix_len ;
2005-04-16 15:20:36 -07:00
tmp_tstamp = ifp - > tstamp ;
spin_unlock_bh ( & ifp - > lock ) ;
2013-12-06 09:45:22 +01:00
write_unlock_bh ( & idev - > lock ) ;
2007-04-25 17:08:10 -07:00
2023-12-29 21:32:44 -07:00
/* A temporary address is created only if this calculated Preferred
* Lifetime is greater than REGEN_ADVANCE time units . In particular ,
* an implementation must not create a temporary address with a zero
* Preferred Lifetime .
2014-03-12 22:13:19 +01:00
* Use age calculation as in addrconf_verify to avoid unnecessary
* temporary addresses being generated .
2008-04-02 00:01:35 -07:00
*/
2014-03-12 22:13:19 +01:00
age = ( now - tmp_tstamp + ADDRCONF_TIMER_FUZZ_MINUS ) / HZ ;
2023-12-29 21:32:44 -07:00
if ( cfg . preferred_lft < = regen_advance + age ) {
2008-04-02 00:01:35 -07:00
in6_ifa_put ( ifp ) ;
in6_dev_put ( idev ) ;
ret = - 1 ;
goto out ;
}
2018-05-27 08:09:53 -07:00
cfg . ifa_flags = IFA_F_TEMPORARY ;
2007-04-25 17:08:10 -07:00
/* set in addrconf_prefix_rcv() */
if ( ifp - > flags & IFA_F_OPTIMISTIC )
2018-05-27 08:09:53 -07:00
cfg . ifa_flags | = IFA_F_OPTIMISTIC ;
2007-04-25 17:08:10 -07:00
2018-05-27 08:09:53 -07:00
cfg . pfx = & addr ;
cfg . scope = ipv6_addr_scope ( cfg . pfx ) ;
2007-04-25 17:08:10 -07:00
2018-05-27 08:09:53 -07:00
ift = ipv6_add_addr ( idev , & cfg , block , NULL ) ;
2013-08-16 13:02:27 +02:00
if ( IS_ERR ( ift ) ) {
2005-04-16 15:20:36 -07:00
in6_ifa_put ( ifp ) ;
in6_dev_put ( idev ) ;
2012-05-15 14:11:53 +00:00
pr_info ( " %s: retry temporary address regeneration \n " , __func__ ) ;
2013-12-06 09:45:22 +01:00
write_lock_bh ( & idev - > lock ) ;
2005-04-16 15:20:36 -07:00
goto retry ;
}
spin_lock_bh ( & ift - > lock ) ;
ift - > ifpub = ifp ;
2011-07-26 13:50:49 +00:00
ift - > cstamp = now ;
2005-04-16 15:20:36 -07:00
ift - > tstamp = tmp_tstamp ;
spin_unlock_bh ( & ift - > lock ) ;
2012-04-14 21:37:40 -04:00
addrconf_dad_start ( ift ) ;
2005-04-16 15:20:36 -07:00
in6_ifa_put ( ift ) ;
in6_dev_put ( idev ) ;
out :
return ret ;
}
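The lifetime arithmetic in ipv6_create_tempaddr() can be followed more easily in isolation. The sketch below is a simplified user-space model under stated assumptions: struct model_temp_cfg and model_temp_lifetimes() are hypothetical names, all values are in seconds, a single age value is used (the kernel recomputes it with ADDRCONF_TIMER_FUZZ_MINUS before the final check), and desync_factor is assumed not to exceed temp_prefered_lft + age so the unsigned subtraction does not wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the per-device settings read by the function. */
struct model_temp_cfg {
	uint32_t temp_valid_lft;
	uint32_t temp_prefered_lft;
	uint32_t desync_factor;
	uint32_t regen_advance;
};

static uint32_t model_min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Compute the lifetimes of a temporary address derived from a public
 * address that is 'age' seconds old. Returns true if the address should
 * be created, i.e. its preferred lifetime exceeds regen_advance + age.
 */
static bool model_temp_lifetimes(const struct model_temp_cfg *cnf,
				 uint32_t pub_valid, uint32_t pub_preferred,
				 uint32_t age,
				 uint32_t *valid, uint32_t *preferred)
{
	assert(cnf != NULL && valid != NULL && preferred != NULL);

	/* Never outlive the public address. */
	*valid = model_min_u32(pub_valid, cnf->temp_valid_lft + age);

	*preferred = cnf->temp_prefered_lft + age - cnf->desync_factor;
	*preferred = model_min_u32(pub_preferred, *preferred);
	/* The preferred lifetime may not exceed the valid lifetime. */
	*preferred = model_min_u32(*valid, *preferred);

	return *preferred > cnf->regen_advance + age;
}
```

For example, with temp_valid_lft = 604800, temp_prefered_lft = 86400, desync_factor = 600 and regen_advance = 5, a public address with 14400 seconds of preferred lifetime left yields a temporary address that is preferred for 14400 seconds, while one with only 50 seconds left at age 100 is rejected.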
/*
2005-11-08 09:38:30 -08:00
* Choose an appropriate source address ( RFC3484 )
2005-04-16 15:20:36 -07:00
*/
2008-03-02 10:48:21 +09:00
enum {
IPV6_SADDR_RULE_INIT = 0 ,
IPV6_SADDR_RULE_LOCAL ,
IPV6_SADDR_RULE_SCOPE ,
IPV6_SADDR_RULE_PREFERRED ,
# ifdef CONFIG_IPV6_MIP6
IPV6_SADDR_RULE_HOA ,
# endif
IPV6_SADDR_RULE_OIF ,
IPV6_SADDR_RULE_LABEL ,
IPV6_SADDR_RULE_PRIVACY ,
IPV6_SADDR_RULE_ORCHID ,
IPV6_SADDR_RULE_PREFIX ,
net: ipv6: Add a sysctl to make optimistic addresses useful candidates
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes. Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.
This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).
The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s)), but not in the multiple distinct
networks case.
For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try to use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.
Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMISTIC
flag appropriately set). A second RTM_NEWADDR is sent if DAD
completes (the address flags have changed), otherwise an
RTM_DELADDR is sent.
Also: add an entry in ip-sysctl.txt for optimistic_dad.
Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28 18:11:14 +09:00
# ifdef CONFIG_IPV6_OPTIMISTIC_DAD
IPV6_SADDR_RULE_NOT_OPTIMISTIC ,
# endif
2008-03-02 10:48:21 +09:00
IPV6_SADDR_RULE_MAX
} ;
2005-11-08 09:38:30 -08:00
struct ipv6_saddr_score {
2008-03-02 10:48:21 +09:00
int rule ;
int addr_type ;
struct inet6_ifaddr * ifa ;
DECLARE_BITMAP ( scorebits , IPV6_SADDR_RULE_MAX ) ;
int scopedist ;
int matchlen ;
2005-11-08 09:38:30 -08:00
} ;
2008-03-02 10:48:21 +09:00
struct ipv6_saddr_dst {
[IPV6]: Make address arguments const.
- net/ipv6/addrconf.c:
ipv6_get_ifaddr(), ipv6_dev_get_saddr()
- net/ipv6/mcast.c:
ipv6_sock_mc_join(), ipv6_sock_mc_drop(),
inet6_mc_check(),
ipv6_dev_mc_inc(), __ipv6_dev_mc_dec(), ipv6_dev_mc_dec(),
ipv6_chk_mcast_addr()
- net/ipv6/route.c:
rt6_lookup(), icmp6_dst_alloc()
- net/ipv6/ip6_output.c:
ip6_nd_hdr()
- net/ipv6/ndisc.c:
ndisc_send_ns(), ndisc_send_rs(), ndisc_send_redirect(),
ndisc_get_neigh(), __ndisc_send()
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-10 15:42:10 +09:00
const struct in6_addr * addr ;
2008-03-02 10:48:21 +09:00
int ifindex ;
int scope ;
int label ;
2008-03-25 09:37:42 +09:00
unsigned int prefs ;
2008-03-02 10:48:21 +09:00
} ;
2005-11-08 09:38:30 -08:00
2007-03-22 12:27:49 -07:00
static inline int ipv6_saddr_preferred ( int type )
2005-04-16 15:20:36 -07:00
{
2010-02-25 23:28:58 +00:00
if ( type & ( IPV6_ADDR_MAPPED | IPV6_ADDR_COMPATv4 | IPV6_ADDR_LOOPBACK ) )
2005-11-08 09:38:30 -08:00
return 1 ;
return 0 ;
2005-04-16 15:20:36 -07:00
}
ipv6: fix net.ipv6.conf.all interface DAD handlers
Currently, writing into
net.ipv6.conf.all.{accept_dad,use_optimistic,optimistic_dad} has no effect.
Fix handling of these flags by:
- using the maximum of global and per-interface values for the
accept_dad flag. That is, if at least one of the two values is
non-zero, enable DAD on the interface. If at least one value is
set to 2, enable DAD and disable IPv6 operation on the interface if
MAC-based link-local address was found
- using the logical OR of global and per-interface values for the
optimistic_dad flag. If at least one of them is set to one, optimistic
duplicate address detection (RFC 4429) is enabled on the interface
- using the logical OR of global and per-interface values for the
use_optimistic flag. If at least one of them is set to one,
optimistic addresses won't be marked as deprecated during source address
selection on the interface.
While at it, as we're modifying the prototype for ipv6_use_optimistic_addr(),
drop inline, and let the compiler decide.
Fixes: 7fd2561e4ebd ("net: ipv6: Add a sysctl to make optimistic addresses useful candidates")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-12 17:46:37 +02:00
static bool ipv6_use_optimistic_addr ( struct net * net ,
struct inet6_dev * idev )
2014-10-28 18:11:14 +09:00
{
# ifdef CONFIG_IPV6_OPTIMISTIC_DAD
2017-09-12 17:46:37 +02:00
if ( ! idev )
return false ;
if ( ! net - > ipv6 . devconf_all - > optimistic_dad & & ! idev - > cnf . optimistic_dad )
return false ;
if ( ! net - > ipv6 . devconf_all - > use_optimistic & & ! idev - > cnf . use_optimistic )
return false ;
return true ;
2014-10-28 18:11:14 +09:00
# else
return false ;
# endif
}
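The way ipv6_use_optimistic_addr() combines the global ("all") and per-interface settings is a logical OR per flag, followed by an AND across the two flags. A minimal user-space sketch, assuming CONFIG_IPV6_OPTIMISTIC_DAD is enabled; struct model_dad_conf and model_use_optimistic_addr() are hypothetical names introduced for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two sysctl values involved. */
struct model_dad_conf {
	int optimistic_dad;
	int use_optimistic;
};

/* 'all' models net.ipv6.conf.all, 'dev' models the per-interface
 * settings (NULL when the interface has no IPv6 state).
 */
static bool model_use_optimistic_addr(const struct model_dad_conf *all,
				      const struct model_dad_conf *dev)
{
	assert(all != NULL);

	if (!dev)
		return false;
	/* optimistic DAD must be enabled globally or on the interface */
	if (!all->optimistic_dad && !dev->optimistic_dad)
		return false;
	/* and so must use_optimistic */
	if (!all->use_optimistic && !dev->use_optimistic)
		return false;
	return true;
}
```

Enabling optimistic_dad only in "all" and use_optimistic only on the interface is therefore enough for the function to return true; the two flags do not need to be set at the same level.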
ipv6: allow userspace to add IFA_F_OPTIMISTIC addresses
According to RFC 4429 (section 3.1), adding new IPv6 addresses as
optimistic addresses is acceptable, as long as the implementation
follows some rules:
* Optimistic DAD SHOULD only be used when the implementation is aware
that the address is based on a most likely unique interface
identifier (such as in [RFC2464]), generated randomly [RFC3041],
or by a well-distributed hash function [RFC3972] or assigned by
Dynamic Host Configuration Protocol for IPv6 (DHCPv6) [RFC3315].
Optimistic DAD SHOULD NOT be used for manually entered
addresses.
Thus, it seems reasonable to allow userspace to set the optimistic flag
when adding new addresses.
We must not let userspace set NODAD + OPTIMISTIC, since if the kernel is
not performing DAD we would never clear the optimistic flag. We must
also ignore userspace's request to add OPTIMISTIC flag to addresses that
have already completed DAD (addresses that don't have the TENTATIVE
flag, or that have the DADFAILED flag).
Then we also need to clear the OPTIMISTIC flag on permanent addresses
when DAD fails. Otherwise, IFA_F_OPTIMISTIC addresses added by userspace
can still be used after DAD has failed, because in
ipv6_chk_addr_and_flags(), IFA_F_OPTIMISTIC overrides IFA_F_TENTATIVE.
Setting IFA_F_OPTIMISTIC from userspace is conditional on
CONFIG_IPV6_OPTIMISTIC_DAD and the optimistic_dad sysctl.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 16:40:08 +01:00
static bool ipv6_allow_optimistic_dad ( struct net * net ,
struct inet6_dev * idev )
{
# ifdef CONFIG_IPV6_OPTIMISTIC_DAD
if ( ! idev )
return false ;
if ( ! net - > ipv6 . devconf_all - > optimistic_dad & & ! idev - > cnf . optimistic_dad )
return false ;
return true ;
# else
return false ;
# endif
}
2008-05-28 14:51:24 +02:00
static int ipv6_get_saddr_eval ( struct net * net ,
struct ipv6_saddr_score * score ,
2008-03-02 10:48:21 +09:00
struct ipv6_saddr_dst * dst ,
int i )
{
int ret ;
if ( i < = score - > rule ) {
switch ( i ) {
case IPV6_SADDR_RULE_SCOPE :
ret = score - > scopedist ;
break ;
case IPV6_SADDR_RULE_PREFIX :
ret = score - > matchlen ;
break ;
default :
ret = ! ! test_bit ( i , score - > scorebits ) ;
}
goto out ;
}
switch ( i ) {
case IPV6_SADDR_RULE_INIT :
/* Rule 0: remember if hiscore is not ready yet */
ret = ! ! score - > ifa ;
break ;
case IPV6_SADDR_RULE_LOCAL :
/* Rule 1: Prefer same address */
ret = ipv6_addr_equal ( & score - > ifa - > addr , dst - > addr ) ;
break ;
case IPV6_SADDR_RULE_SCOPE :
/* Rule 2: Prefer appropriate scope
*
* ret
* ^
* - 1 | d 15
* - - - + - - + - + - - - > scope
* |
* | d is scope of the destination .
* B - d | \
* | \ < - smaller scope is better
2014-01-12 11:26:32 -08:00
* B - 15 | \ if scope is enough for destination .
2008-03-02 10:48:21 +09:00
* | ret = B - scope ( - 1 < = scope > = d < = 15 ) .
* d - C - 1 | /
* | / < - greater is better
* - C / if scope is not enough for destination .
* / | ret = scope - C ( - 1 < = d < scope < = 15 ) .
*
* d - C - 1 < B - 15 ( for all - 1 < = d < = 15 ) .
* C > d + 14 - B > = 15 + 14 - B = 29 - B .
* Assume B = 0 and we get C > 29.
*/
ret = __ipv6_addr_src_scope ( score - > addr_type ) ;
if ( ret > = dst - > scope )
ret = - ret ;
else
ret - = 128 ; /* 30 is enough */
score - > scopedist = ret ;
break ;
case IPV6_SADDR_RULE_PREFERRED :
2014-10-28 18:11:14 +09:00
{
2008-03-02 10:48:21 +09:00
/* Rule 3: Avoid deprecated and optimistic addresses */
net: ipv6: Add a sysctl to make optimistic addresses useful candidates
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes. Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.
This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).
The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s), but not in the multiple distinct
networks case.
For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.
Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMISTIC
flag appropriately set). A second RTM_NEWADDR is sent if DAD
completes (the address flags have changed), otherwise an
RTM_DELADDR is sent.
Also: add an entry in ip-sysctl.txt for optimistic_dad.
Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28 18:11:14 +09:00
u8 avoid = IFA_F_DEPRECATED ;
ipv6: fix net.ipv6.conf.all interface DAD handlers
Currently, writing into
net.ipv6.conf.all.{accept_dad,use_optimistic,optimistic_dad} has no effect.
Fix handling of these flags by:
- using the maximum of global and per-interface values for the
accept_dad flag. That is, if at least one of the two values is
non-zero, enable DAD on the interface. If at least one value is
set to 2, enable DAD and disable IPv6 operation on the interface if
a MAC-based duplicate link-local address is found
- using the logical OR of global and per-interface values for the
optimistic_dad flag. If at least one of them is set to one, optimistic
duplicate address detection (RFC 4429) is enabled on the interface
- using the logical OR of global and per-interface values for the
use_optimistic flag. If at least one of them is set to one,
optimistic addresses won't be marked as deprecated during source address
selection on the interface.
While at it, as we're modifying the prototype for ipv6_use_optimistic_addr(),
drop inline, and let the compiler decide.
Fixes: 7fd2561e4ebd ("net: ipv6: Add a sysctl to make optimistic addresses useful candidates")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-12 17:46:37 +02:00
if ( ! ipv6_use_optimistic_addr ( net , score - > ifa - > idev ) )
2014-10-28 18:11:14 +09:00
avoid | = IFA_F_OPTIMISTIC ;
2008-03-02 10:48:21 +09:00
ret = ipv6_saddr_preferred ( score - > addr_type ) | |
2014-10-28 18:11:14 +09:00
! ( score - > ifa - > flags & avoid ) ;
2008-03-02 10:48:21 +09:00
break ;
2014-10-28 18:11:14 +09:00
}
2008-03-02 10:48:21 +09:00
# ifdef CONFIG_IPV6_MIP6
case IPV6_SADDR_RULE_HOA :
2008-03-25 09:37:42 +09:00
{
2008-03-02 10:48:21 +09:00
/* Rule 4: Prefer home address */
2008-03-25 09:37:42 +09:00
int prefhome = ! ( dst - > prefs & IPV6_PREFER_SRC_COA ) ;
ret = ! ( score - > ifa - > flags & IFA_F_HOMEADDRESS ) ^ prefhome ;
2008-03-02 10:48:21 +09:00
break ;
2008-03-25 09:37:42 +09:00
}
2008-03-02 10:48:21 +09:00
# endif
case IPV6_SADDR_RULE_OIF :
/* Rule 5: Prefer outgoing interface */
ret = ( ! dst - > ifindex | |
dst - > ifindex = = score - > ifa - > idev - > dev - > ifindex ) ;
break ;
case IPV6_SADDR_RULE_LABEL :
/* Rule 6: Prefer matching label */
2008-05-28 14:51:24 +02:00
ret = ipv6_addr_label ( net ,
& score - > ifa - > addr , score - > addr_type ,
2008-03-02 10:48:21 +09:00
score - > ifa - > idev - > dev - > ifindex ) = = dst - > label ;
break ;
case IPV6_SADDR_RULE_PRIVACY :
2008-03-25 09:37:42 +09:00
{
2008-03-02 10:48:21 +09:00
/* Rule 7: Prefer public address
2011-03-30 22:57:33 -03:00
* Note : prefer temporary address if use_tempaddr > = 2
2008-03-02 10:48:21 +09:00
*/
2008-03-25 09:37:42 +09:00
int preftmp = dst - > prefs & ( IPV6_PREFER_SRC_PUBLIC | IPV6_PREFER_SRC_TMP ) ?
! ! ( dst - > prefs & IPV6_PREFER_SRC_TMP ) :
score - > ifa - > idev - > cnf . use_tempaddr > = 2 ;
ret = ( ! ( score - > ifa - > flags & IFA_F_TEMPORARY ) ) ^ preftmp ;
2008-03-02 10:48:21 +09:00
break ;
2008-03-25 09:37:42 +09:00
}
2008-03-02 10:48:21 +09:00
case IPV6_SADDR_RULE_ORCHID :
/* Rule 8-: Prefer ORCHID vs ORCHID or
* non - ORCHID vs non - ORCHID
*/
ret = ! ( ipv6_addr_orchid ( & score - > ifa - > addr ) ^
ipv6_addr_orchid ( dst - > addr ) ) ;
break ;
case IPV6_SADDR_RULE_PREFIX :
/* Rule 8: Use longest matching prefix */
2012-09-10 18:41:07 +00:00
ret = ipv6_addr_diff ( & score - > ifa - > addr , dst - > addr ) ;
if ( ret > score - > ifa - > prefix_len )
ret = score - > ifa - > prefix_len ;
score - > matchlen = ret ;
2008-03-02 10:48:21 +09:00
break ;
2014-10-28 18:11:14 +09:00
# ifdef CONFIG_IPV6_OPTIMISTIC_DAD
case IPV6_SADDR_RULE_NOT_OPTIMISTIC :
/* Optimistic addresses still have lower precedence than other
* preferred addresses .
*/
ret = ! ( score - > ifa - > flags & IFA_F_OPTIMISTIC ) ;
break ;
# endif
2008-03-02 10:48:21 +09:00
default :
ret = 0 ;
}
if ( ret )
__set_bit ( i , score - > scorebits ) ;
score - > rule = i ;
out :
return ret ;
}
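The rule evaluation above leans on two small idioms: the `!flag ^ pref` test used by the home-address and privacy rules, and the prefix rule's match length capped at the address's own prefix length. The following is a minimal user-space sketch of both, not the kernel code. Every `demo_`/`DEMO_` name is hypothetical, the flag value is illustrative only, and addresses are modelled as plain 16-byte arrays.

```c
#include <stdbool.h>

/* Hypothetical stand-in for IFA_F_TEMPORARY; the value is illustrative. */
#define DEMO_F_TEMPORARY 0x01u

/*
 * Rule 7 idiom: returns 1 when the address kind matches the preference,
 * i.e. a temporary address with preftmp set, or a public address with
 * preftmp clear. Returns 0 otherwise.
 */
static int demo_privacy_rule(unsigned int ifa_flags, bool preftmp)
{
	return (!(ifa_flags & DEMO_F_TEMPORARY)) ^ (int)preftmp;
}

/* Number of leading bits that two 16-byte addresses have in common. */
static int demo_common_prefix_bits(const unsigned char a[16],
				   const unsigned char b[16])
{
	for (int i = 0; i < 16; i++) {
		unsigned char x = a[i] ^ b[i];

		if (x) {
			int bits = 0;

			/* Count equal bits ahead of the first differing bit. */
			while (!(x & 0x80)) {
				x <<= 1;
				bits++;
			}
			return i * 8 + bits;
		}
	}
	return 128;
}

/*
 * Prefix rule idiom: the match length never exceeds the prefix length
 * configured on the candidate source address.
 */
static int demo_prefix_score(const unsigned char src[16],
			     const unsigned char dst[16], int prefix_len)
{
	int ret = demo_common_prefix_bits(src, dst);

	if (ret > prefix_len)
		ret = prefix_len;
	return ret;
}
```

The cap keeps an address from scoring higher merely because bits beyond its on-link prefix happen to coincide with the destination.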
2015-07-13 23:28:10 +09:00
static int __ipv6_dev_get_saddr ( struct net * net ,
struct ipv6_saddr_dst * dst ,
struct inet6_dev * idev ,
struct ipv6_saddr_score * scores ,
int hiscore_idx )
2015-07-10 16:58:31 +09:00
{
2015-07-13 23:28:10 +09:00
struct ipv6_saddr_score * score = & scores [ 1 - hiscore_idx ] , * hiscore = & scores [ hiscore_idx ] ;
2015-07-10 16:58:31 +09:00
2017-10-07 19:30:27 -07:00
list_for_each_entry_rcu ( score - > ifa , & idev - > addr_list , if_list ) {
2015-07-10 16:58:31 +09:00
int i ;
/*
* - Tentative Address ( RFC2462 section 5.4 )
* - A tentative address is not considered
* " assigned to an interface " in the traditional
* sense , unless it is also flagged as optimistic .
* - Candidate Source Address ( section 4 )
* - In any case , anycast addresses , multicast
* addresses , and the unspecified address MUST
* NOT be included in a candidate set .
*/
if ( ( score - > ifa - > flags & IFA_F_TENTATIVE ) & &
( ! ( score - > ifa - > flags & IFA_F_OPTIMISTIC ) ) )
continue ;
score - > addr_type = __ipv6_addr_type ( & score - > ifa - > addr ) ;
if ( unlikely ( score - > addr_type = = IPV6_ADDR_ANY | |
score - > addr_type & IPV6_ADDR_MULTICAST ) ) {
net_dbg_ratelimited ( " ADDRCONF: unspecified / multicast address assigned as unicast address on %s " ,
idev - > dev - > name ) ;
continue ;
}
score - > rule = - 1 ;
bitmap_zero ( score - > scorebits , IPV6_SADDR_RULE_MAX ) ;
for ( i = 0 ; i < IPV6_SADDR_RULE_MAX ; i + + ) {
int minihiscore , miniscore ;
minihiscore = ipv6_get_saddr_eval ( net , hiscore , dst , i ) ;
miniscore = ipv6_get_saddr_eval ( net , score , dst , i ) ;
if ( minihiscore > miniscore ) {
if ( i = = IPV6_SADDR_RULE_SCOPE & &
score - > scopedist > 0 ) {
/*
* special case :
* each remaining entry
* has too small ( not enough )
* scope , because ifa entries
* are sorted by their scope
* values .
*/
goto out ;
}
break ;
} else if ( minihiscore < miniscore ) {
swap ( hiscore , score ) ;
2015-07-13 23:28:10 +09:00
hiscore_idx = 1 - hiscore_idx ;
2015-07-10 16:58:31 +09:00
/* restore our iterator */
score - > ifa = hiscore - > ifa ;
break ;
}
}
}
out :
2015-07-13 23:28:10 +09:00
return hiscore_idx ;
2015-07-10 16:58:31 +09:00
}
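The loop above compares each candidate against the current best one rule at a time and stops at the first rule where they differ. A simplified sketch of that rule-by-rule comparison follows. It assumes per-rule scores are precomputed in an array, which the kernel does not do (it evaluates lazily and keeps only two score slots), and all `demo_` names are hypothetical.

```c
#include <stddef.h>

#define DEMO_RULES 3	/* hypothetical rule count, for illustration only */

struct demo_cand {
	int score[DEMO_RULES];	/* higher is better; rule 0 matters most */
};

/*
 * Returns the index of the best candidate, or -1 if there are none.
 * Candidates are compared rule by rule; the first rule on which they
 * differ decides. On a full tie the earlier candidate is kept.
 */
static int demo_pick_best(const struct demo_cand *c, size_t n)
{
	int best = -1;

	for (size_t k = 0; k < n; k++) {
		if (best < 0) {
			best = (int)k;
			continue;
		}
		for (int i = 0; i < DEMO_RULES; i++) {
			if (c[best].score[i] > c[k].score[i])
				break;		/* current best still wins */
			if (c[best].score[i] < c[k].score[i]) {
				best = (int)k;	/* challenger takes over */
				break;
			}
		}
	}
	return best;
}
```

Because the decision is made at the first differing rule, later rules only act as tie-breakers for earlier ones.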
2016-06-16 16:24:26 -07:00
static int ipv6_get_saddr_master ( struct net * net ,
const struct net_device * dst_dev ,
const struct net_device * master ,
struct ipv6_saddr_dst * dst ,
struct ipv6_saddr_score * scores ,
int hiscore_idx )
{
struct inet6_dev * idev ;
idev = __in6_dev_get ( dst_dev ) ;
if ( idev )
hiscore_idx = __ipv6_dev_get_saddr ( net , dst , idev ,
scores , hiscore_idx ) ;
idev = __in6_dev_get ( master ) ;
if ( idev )
hiscore_idx = __ipv6_dev_get_saddr ( net , dst , idev ,
scores , hiscore_idx ) ;
return hiscore_idx ;
}
2012-08-26 19:14:14 +02:00
int ipv6_dev_get_saddr ( struct net * net , const struct net_device * dst_dev ,
[IPV6]: Make address arguments const.
- net/ipv6/addrconf.c:
ipv6_get_ifaddr(), ipv6_dev_get_saddr()
- net/ipv6/mcast.c:
ipv6_sock_mc_join(), ipv6_sock_mc_drop(),
inet6_mc_check(),
ipv6_dev_mc_inc(), __ipv6_dev_mc_dec(), ipv6_dev_mc_dec(),
ipv6_chk_mcast_addr()
- net/ipv6/route.c:
rt6_lookup(), icmp6_dst_alloc()
- net/ipv6/ip6_output.c:
ip6_nd_hdr()
- net/ipv6/ndisc.c:
ndisc_send_ns(), ndisc_send_rs(), ndisc_send_redirect(),
ndisc_get_neigh(), __ndisc_send()
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-10 15:42:10 +09:00
const struct in6_addr * daddr , unsigned int prefs ,
2008-03-25 09:37:42 +09:00
struct in6_addr * saddr )
2005-04-16 15:20:36 -07:00
{
2015-07-13 23:28:10 +09:00
struct ipv6_saddr_score scores [ 2 ] , * hiscore ;
2008-03-02 10:48:21 +09:00
struct ipv6_saddr_dst dst ;
2015-07-10 16:58:31 +09:00
struct inet6_dev * idev ;
2005-11-08 09:38:30 -08:00
struct net_device * dev ;
2008-03-02 10:48:21 +09:00
int dst_type ;
2015-07-10 16:58:31 +09:00
bool use_oif_addr = false ;
2015-07-13 23:28:10 +09:00
int hiscore_idx = 0 ;
2017-10-07 19:30:28 -07:00
int ret = 0 ;
2005-04-16 15:20:36 -07:00
2008-03-02 10:48:21 +09:00
dst_type = __ipv6_addr_type ( daddr ) ;
dst . addr = daddr ;
dst . ifindex = dst_dev ? dst_dev - > ifindex : 0 ;
dst . scope = __ipv6_addr_src_scope ( dst_type ) ;
2008-05-28 14:51:24 +02:00
dst . label = ipv6_addr_label ( net , daddr , dst_type , dst . ifindex ) ;
2008-03-25 09:37:42 +09:00
dst . prefs = prefs ;
2008-03-02 10:48:21 +09:00
2015-07-13 23:28:10 +09:00
scores [ hiscore_idx ] . rule = - 1 ;
scores [ hiscore_idx ] . ifa = NULL ;
2005-04-16 15:20:36 -07:00
2006-09-22 14:44:24 -07:00
rcu_read_lock ( ) ;
2005-04-16 15:20:36 -07:00
2015-07-10 16:58:31 +09:00
/* Candidate Source Address (section 4)
 * - For multicast and link-local destination addresses,
 *   the set of candidate source addresses MUST only
 *   include addresses assigned to interfaces
 *   belonging to the same link as the outgoing
 *   interface.
 * (- For site-local destination addresses, the
 *    set of candidate source addresses MUST only
 *    include addresses assigned to interfaces
 *    belonging to the same site as the outgoing
 *    interface.)
2015-07-22 16:38:25 +09:00
* - " It is RECOMMENDED that the candidate source addresses
* be the set of unicast addresses assigned to the
* interface that will be used to send to the destination
* ( the ' outgoing ' interface ) . " (RFC 6724)
2015-07-10 16:58:31 +09:00
*/
if ( dst_dev ) {
2015-07-22 16:38:25 +09:00
idev = __in6_dev_get ( dst_dev ) ;
2015-07-10 16:58:31 +09:00
if ( ( dst_type & IPV6_ADDR_MULTICAST ) | |
2015-07-22 16:38:25 +09:00
dst . scope < = IPV6_ADDR_SCOPE_LINKLOCAL | |
( idev & & idev - > cnf . use_oif_addrs_only ) ) {
2015-07-10 16:58:31 +09:00
use_oif_addr = true ;
}
}
2008-03-02 10:48:21 +09:00
2015-07-10 16:58:31 +09:00
if ( use_oif_addr ) {
2015-07-13 23:28:10 +09:00
if ( idev )
2015-07-16 16:51:30 +09:00
hiscore_idx = __ipv6_dev_get_saddr ( net , & dst , idev , scores , hiscore_idx ) ;
2015-07-10 16:58:31 +09:00
} else {
2016-06-16 16:24:26 -07:00
const struct net_device * master ;
int master_idx = 0 ;
/* if dst_dev exists and is enslaved to an L3 device, then
* prefer addresses from dst_dev and then the master over
* any other enslaved devices in the L3 domain .
*/
master = l3mdev_master_dev_rcu ( dst_dev ) ;
if ( master ) {
master_idx = master - > ifindex ;
hiscore_idx = ipv6_get_saddr_master ( net , dst_dev ,
master , & dst ,
scores , hiscore_idx ) ;
if ( scores [ hiscore_idx ] . ifa )
goto out ;
}
2015-07-10 16:58:31 +09:00
for_each_netdev_rcu ( net , dev ) {
2016-06-16 16:24:26 -07:00
/* only consider addresses on devices in the
* same L3 domain
*/
if ( l3mdev_master_ifindex_rcu ( dev ) ! = master_idx )
continue ;
2015-07-10 16:58:31 +09:00
idev = __in6_dev_get ( dev ) ;
if ( ! idev )
2005-11-08 09:38:30 -08:00
continue ;
2015-07-16 16:51:30 +09:00
hiscore_idx = __ipv6_dev_get_saddr ( net , & dst , idev , scores , hiscore_idx ) ;
2005-04-16 15:20:36 -07:00
}
}
2016-06-16 16:24:26 -07:00
out :
2015-07-13 23:28:10 +09:00
hiscore = & scores [ hiscore_idx ] ;
2008-03-02 10:48:21 +09:00
if ( ! hiscore - > ifa )
2017-10-07 19:30:28 -07:00
ret = - EADDRNOTAVAIL ;
else
* saddr = hiscore - > ifa - > addr ;
2007-02-09 23:24:49 +09:00
2017-10-07 19:30:28 -07:00
rcu_read_unlock ( ) ;
return ret ;
2005-04-16 15:20:36 -07:00
}
2008-03-03 21:44:34 +09:00
EXPORT_SYMBOL ( ipv6_dev_get_saddr ) ;
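The choice between scanning only the outgoing interface and scanning every device in the L3 domain reduces to a single predicate. A hedged sketch of that decision is shown here; the function, its boolean parameters, and `DEMO_SCOPE_LINKLOCAL` are hypothetical stand-ins rather than kernel definitions, and the `idev` NULL check is folded into the `oif_addrs_only` argument.

```c
#include <stdbool.h>

/* Hypothetical stand-in for IPV6_ADDR_SCOPE_LINKLOCAL; value is illustrative. */
#define DEMO_SCOPE_LINKLOCAL 0x02

/*
 * Returns true when candidates should be restricted to the outgoing
 * interface: an outgoing device is known and the destination is
 * multicast, has link-local (or narrower) scope, or the interface is
 * configured to use only its own addresses.
 */
static bool demo_use_oif_addr(bool have_dst_dev, bool dst_is_multicast,
			      int dst_scope, bool oif_addrs_only)
{
	if (!have_dst_dev)
		return false;
	return dst_is_multicast ||
	       dst_scope <= DEMO_SCOPE_LINKLOCAL ||
	       oif_addrs_only;
}
```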
2005-04-16 15:20:36 -07:00
2022-02-11 17:30:42 +00:00
static int __ipv6_get_lladdr ( struct inet6_dev * idev , struct in6_addr * addr ,
u32 banned_flags )
2013-06-23 18:39:01 +02:00
{
struct inet6_ifaddr * ifp ;
int err = - EADDRNOTAVAIL ;
2014-01-19 21:58:19 +01:00
list_for_each_entry_reverse ( ifp , & idev - > addr_list , if_list ) {
if ( ifp - > scope > IFA_LINK )
break ;
2013-06-23 18:39:01 +02:00
if ( ifp - > scope = = IFA_LINK & &
! ( ifp - > flags & banned_flags ) ) {
* addr = ifp - > addr ;
err = 0 ;
break ;
}
}
return err ;
}
2007-04-25 17:08:10 -07:00
int ipv6_get_lladdr ( struct net_device * dev , struct in6_addr * addr ,
2013-12-06 09:45:21 +01:00
u32 banned_flags )
2005-04-16 15:20:36 -07:00
{
struct inet6_dev * idev ;
int err = - EADDRNOTAVAIL ;
2006-09-22 14:44:24 -07:00
rcu_read_lock ( ) ;
2010-03-20 16:09:01 -07:00
idev = __in6_dev_get ( dev ) ;
if ( idev ) {
2005-04-16 15:20:36 -07:00
read_lock_bh ( & idev - > lock ) ;
2013-06-23 18:39:01 +02:00
err = __ipv6_get_lladdr ( idev , addr , banned_flags ) ;
2005-04-16 15:20:36 -07:00
read_unlock_bh ( & idev - > lock ) ;
}
2006-09-22 14:44:24 -07:00
rcu_read_unlock ( ) ;
2005-04-16 15:20:36 -07:00
return err ;
}
2017-10-07 19:30:24 -07:00
static int ipv6_count_addresses ( const struct inet6_dev * idev )
2005-04-16 15:20:36 -07:00
{
2017-10-07 19:30:24 -07:00
const struct inet6_ifaddr * ifp ;
2005-04-16 15:20:36 -07:00
int cnt = 0 ;
2017-10-07 19:30:24 -07:00
rcu_read_lock ( ) ;
list_for_each_entry_rcu ( ifp , & idev - > addr_list , if_list )
2005-04-16 15:20:36 -07:00
cnt + + ;
2017-10-07 19:30:24 -07:00
rcu_read_unlock ( ) ;
2005-04-16 15:20:36 -07:00
return cnt ;
}
2011-04-22 04:53:02 +00:00
int ipv6_chk_addr ( struct net * net , const struct in6_addr * addr ,
2013-05-17 03:56:10 +00:00
const struct net_device * dev , int strict )
2015-02-04 20:01:23 +09:00
{
2018-03-13 08:29:37 -07:00
return ipv6_chk_addr_and_flags ( net , addr , dev , ! dev ,
strict , IFA_F_TENTATIVE ) ;
2015-02-04 20:01:23 +09:00
}
EXPORT_SYMBOL ( ipv6_chk_addr ) ;
2018-03-13 08:29:38 -07:00
/* The device argument is used to find the L3 domain of interest. If
 * skip_dev_check is set, then the ifp device is not checked against
 * the passed-in dev argument. So the two cases for address checks are:
 * 1. does the address exist in the L3 domain that dev is part of
 *    (skip_dev_check = true), or
 *
 * 2. does the address exist on the specific device
 *    (skip_dev_check = false)
 */
2020-08-17 14:30:49 +08:00
static struct net_device *
__ipv6_chk_addr_and_flags ( struct net * net , const struct in6_addr * addr ,
const struct net_device * dev , bool skip_dev_check ,
int strict , u32 banned_flags )
2005-04-16 15:20:36 -07:00
{
2017-10-23 16:17:47 -07:00
unsigned int hash = inet6_addr_hash ( net , addr ) ;
2020-08-17 14:30:49 +08:00
struct net_device * l3mdev , * ndev ;
2010-05-17 22:27:12 -07:00
struct inet6_ifaddr * ifp ;
2015-02-04 20:01:23 +09:00
u32 ifp_flags ;
2005-04-16 15:20:36 -07:00
2017-10-23 16:17:48 -07:00
rcu_read_lock ( ) ;
2018-03-13 08:29:37 -07:00
2018-03-13 08:29:38 -07:00
l3mdev = l3mdev_master_dev_rcu ( dev ) ;
2018-03-13 08:29:37 -07:00
if ( skip_dev_check )
dev = NULL ;
2022-02-07 20:50:30 -08:00
hlist_for_each_entry_rcu ( ifp , & net - > ipv6 . inet6_addr_lst [ hash ] , addr_lst ) {
2020-08-17 14:30:49 +08:00
ndev = ifp - > idev - > dev ;
2018-03-13 08:29:38 -07:00
2020-08-17 14:30:49 +08:00
if ( l3mdev_master_dev_rcu ( ndev ) ! = l3mdev )
2018-03-13 08:29:38 -07:00
continue ;
2015-02-04 20:01:23 +09:00
/* Decouple optimistic from tentative for evaluation here.
* Ban optimistic addresses explicitly , when required .
*/
ifp_flags = ( ifp - > flags & IFA_F_OPTIMISTIC )
? ( ifp - > flags & ~ IFA_F_TENTATIVE )
: ifp - > flags ;
2005-04-16 15:20:36 -07:00
if ( ipv6_addr_equal ( & ifp - > addr , addr ) & &
2015-02-04 20:01:23 +09:00
! ( ifp_flags & banned_flags ) & &
2020-08-17 14:30:49 +08:00
( ! dev | | ndev = = dev | |
2010-05-17 22:27:12 -07:00
! ( ifp - > scope & ( IFA_LINK | IFA_HOST ) | | strict ) ) ) {
2017-10-23 16:17:48 -07:00
rcu_read_unlock ( ) ;
2020-08-17 14:30:49 +08:00
return ndev ;
2005-04-16 15:20:36 -07:00
}
}
2010-03-17 20:31:11 +00:00
2017-10-23 16:17:48 -07:00
rcu_read_unlock ( ) ;
2020-08-17 14:30:49 +08:00
return NULL ;
}
int ipv6_chk_addr_and_flags ( struct net * net , const struct in6_addr * addr ,
const struct net_device * dev , bool skip_dev_check ,
int strict , u32 banned_flags )
{
return __ipv6_chk_addr_and_flags ( net , addr , dev , skip_dev_check ,
strict , banned_flags ) ? 1 : 0 ;
2005-04-16 15:20:36 -07:00
}
2015-02-04 20:01:23 +09:00
EXPORT_SYMBOL ( ipv6_chk_addr_and_flags ) ;
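The check above treats an optimistic address as if it were not tentative, so a caller that bans only tentative addresses still accepts it, while a caller that bans optimistic addresses explicitly does not. A small sketch of that flag handling, using hypothetical `DEMO_` constants whose values are illustrative only:

```c
/* Hypothetical stand-ins for IFA_F_OPTIMISTIC and IFA_F_TENTATIVE. */
#define DEMO_F_OPTIMISTIC 0x04u
#define DEMO_F_TENTATIVE  0x40u

/* Drop the tentative bit from the flags of an optimistic address. */
static unsigned int demo_effective_flags(unsigned int flags)
{
	return (flags & DEMO_F_OPTIMISTIC) ? (flags & ~DEMO_F_TENTATIVE)
					   : flags;
}

/* Returns 1 if none of the banned flags remain after the adjustment. */
static int demo_addr_usable(unsigned int flags, unsigned int banned)
{
	return !(demo_effective_flags(flags) & banned);
}
```

With `banned` set to the tentative flag alone, a tentative address is rejected while a tentative and optimistic one passes, which is the case the DAD-failure path has to guard against by clearing the optimistic flag.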
2007-02-22 22:05:40 +09:00
2005-04-16 15:20:36 -07:00
2013-09-23 23:04:19 +03:00
/* Compares an address/prefix_len with addresses on device @dev.
* If one is found it returns true .
*/
bool ipv6_chk_custom_prefix ( const struct in6_addr * addr ,
const unsigned int prefix_len , struct net_device * dev )
{
2017-10-07 19:30:25 -07:00
const struct inet6_ifaddr * ifa ;
const struct inet6_dev * idev ;
2013-09-23 23:04:19 +03:00
bool ret = false ;
rcu_read_lock ( ) ;
idev = __in6_dev_get ( dev ) ;
if ( idev ) {
2017-10-07 19:30:25 -07:00
list_for_each_entry_rcu ( ifa , & idev - > addr_list , if_list ) {
2013-09-23 23:04:19 +03:00
ret = ipv6_prefix_equal ( addr , & ifa - > addr , prefix_len ) ;
if ( ret )
break ;
}
}
rcu_read_unlock ( ) ;
return ret ;
}
EXPORT_SYMBOL ( ipv6_chk_custom_prefix ) ;
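Both prefix checks rely on comparing only the first `prefix_len` bits of two addresses. The sketch below shows one way such a comparison can be written in user space; it is an assumption about the general technique, not the kernel's `ipv6_prefix_equal()` implementation, and `demo_prefix_equal` is a hypothetical name.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if the first prefix_len bits of the two 16-byte
 * addresses are identical. A prefix_len above 128 is rejected.
 */
static bool demo_prefix_equal(const unsigned char a[16],
			      const unsigned char b[16],
			      unsigned int prefix_len)
{
	unsigned int whole = prefix_len / 8;	/* fully covered bytes */
	unsigned int rem = prefix_len % 8;	/* leftover leading bits */

	if (prefix_len > 128)
		return false;
	if (memcmp(a, b, whole) != 0)
		return false;
	if (rem) {
		unsigned char mask = (unsigned char)(0xffu << (8 - rem));

		return ((a[whole] ^ b[whole]) & mask) == 0;
	}
	return true;
}
```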
2011-04-22 04:53:02 +00:00
int ipv6_chk_prefix ( const struct in6_addr * addr , struct net_device * dev )
2008-03-15 22:54:23 -04:00
{
2017-10-07 19:30:26 -07:00
const struct inet6_ifaddr * ifa ;
const struct inet6_dev * idev ;
2008-03-15 22:54:23 -04:00
int onlink ;
onlink = 0 ;
rcu_read_lock ( ) ;
idev = __in6_dev_get ( dev ) ;
if ( idev ) {
2017-10-07 19:30:26 -07:00
list_for_each_entry_rcu ( ifa , & idev - > addr_list , if_list ) {
2008-03-15 22:54:23 -04:00
onlink = ipv6_prefix_equal ( addr , & ifa - > addr ,
ifa - > prefix_len ) ;
if ( onlink )
break ;
}
}
rcu_read_unlock ( ) ;
return onlink ;
}
EXPORT_SYMBOL ( ipv6_chk_prefix ) ;
2020-08-03 23:34:46 +08:00
/**
* ipv6_dev_find - find the first device with a given source address .
* @ net : the net namespace
* @ addr : the source address
2020-10-31 19:30:44 +01:00
* @ dev : used to find the L3 domain of interest
2020-08-03 23:34:46 +08:00
*
* The caller should be protected by RCU , or RTNL .
*/
2020-08-17 14:30:49 +08:00
struct net_device * ipv6_dev_find ( struct net * net , const struct in6_addr * addr ,
struct net_device * dev )
2020-08-03 23:34:46 +08:00
{
2020-08-17 14:30:49 +08:00
return __ipv6_chk_addr_and_flags ( net , addr , dev , ! dev , 1 ,
IFA_F_TENTATIVE ) ;
2020-08-03 23:34:46 +08:00
}
EXPORT_SYMBOL ( ipv6_dev_find ) ;
2008-04-10 15:42:10 +09:00
struct inet6_ifaddr * ipv6_get_ifaddr ( struct net * net , const struct in6_addr * addr ,
2008-01-10 22:44:09 -08:00
struct net_device * dev , int strict )
2005-04-16 15:20:36 -07:00
{
2017-10-23 16:17:47 -07:00
unsigned int hash = inet6_addr_hash ( net , addr ) ;
2010-03-25 21:39:21 -07:00
struct inet6_ifaddr * ifp , * result = NULL ;
2005-04-16 15:20:36 -07:00
2017-10-23 16:17:49 -07:00
rcu_read_lock ( ) ;
2022-02-07 20:50:30 -08:00
hlist_for_each_entry_rcu ( ifp , & net - > ipv6 . inet6_addr_lst [ hash ] , addr_lst ) {
2005-04-16 15:20:36 -07:00
if ( ipv6_addr_equal ( & ifp - > addr , addr ) ) {
2015-03-29 14:00:04 +01:00
if ( ! dev | | ifp - > idev - > dev = = dev | |
2005-04-16 15:20:36 -07:00
! ( ifp - > scope & ( IFA_LINK | IFA_HOST ) | | strict ) ) {
2010-03-25 21:39:21 -07:00
result = ifp ;
2005-04-16 15:20:36 -07:00
in6_ifa_hold ( ifp ) ;
break ;
}
}
}
2017-10-23 16:17:49 -07:00
rcu_read_unlock ( ) ;
2005-04-16 15:20:36 -07:00
2010-03-25 21:39:21 -07:00
return result ;
2005-04-16 15:20:36 -07:00
}
/* Gets referenced address, destroys ifaddr */
2009-09-09 14:41:32 +00:00
static void addrconf_dad_stop ( struct inet6_ifaddr * ifp , int dad_failed )
2005-04-16 15:20:36 -07:00
{
2016-01-08 13:47:23 +01:00
if ( dad_failed )
ifp - > flags | = IFA_F_DADFAILED ;
2017-06-29 16:56:54 +02:00
if ( ifp - > flags & IFA_F_TEMPORARY ) {
2005-04-16 15:20:36 -07:00
struct inet6_ifaddr * ifpub ;
spin_lock_bh ( & ifp - > lock ) ;
ifpub = ifp - > ifpub ;
if ( ifpub ) {
in6_ifa_hold ( ifpub ) ;
spin_unlock_bh ( & ifp - > lock ) ;
2020-05-01 00:51:47 -03:00
ipv6_create_tempaddr ( ifpub , true ) ;
2005-04-16 15:20:36 -07:00
in6_ifa_put ( ifpub ) ;
} else {
spin_unlock_bh ( & ifp - > lock ) ;
}
ipv6_del_addr ( ifp ) ;
2017-06-29 16:56:54 +02:00
} else if ( ifp - > flags & IFA_F_PERMANENT | | ! dad_failed ) {
spin_lock_bh ( & ifp - > lock ) ;
addrconf_del_dad_work ( ifp ) ;
ifp - > flags | = IFA_F_TENTATIVE ;
ipv6: allow userspace to add IFA_F_OPTIMISTIC addresses
According to RFC 4429 (section 3.1), adding new IPv6 addresses as
optimistic addresses is acceptable, as long as the implementation
follows some rules:
* Optimistic DAD SHOULD only be used when the implementation is aware
that the address is based on a most likely unique interface
identifier (such as in [RFC2464]), generated randomly [RFC3041],
or by a well-distributed hash function [RFC3972] or assigned by
Dynamic Host Configuration Protocol for IPv6 (DHCPv6) [RFC3315].
Optimistic DAD SHOULD NOT be used for manually entered
addresses.
Thus, it seems reasonable to allow userspace to set the optimistic flag
when adding new addresses.
We must not let userspace set NODAD + OPTIMISTIC, since if the kernel is
not performing DAD we would never clear the optimistic flag. We must
also ignore userspace's request to add OPTIMISTIC flag to addresses that
have already completed DAD (addresses that don't have the TENTATIVE
flag, or that have the DADFAILED flag).
Then we also need to clear the OPTIMISTIC flag on permanent addresses
when DAD fails. Otherwise, IFA_F_OPTIMISTIC addresses added by userspace
can still be used after DAD has failed, because in
ipv6_chk_addr_and_flags(), IFA_F_OPTIMISTIC overrides IFA_F_TENTATIVE.
Setting IFA_F_OPTIMISTIC from userspace is conditional on
CONFIG_IPV6_OPTIMISTIC_DAD and the optimistic_dad sysctl.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 16:40:08 +01:00
if ( dad_failed )
ifp - > flags & = ~ IFA_F_OPTIMISTIC ;
2017-06-29 16:56:54 +02:00
spin_unlock_bh ( & ifp - > lock ) ;
if ( dad_failed )
ipv6_ifa_notify ( 0 , ifp ) ;
in6_ifa_put ( ifp ) ;
2014-03-27 18:28:07 +01:00
} else {
2005-04-16 15:20:36 -07:00
ipv6_del_addr ( ifp ) ;
2014-03-27 18:28:07 +01:00
}
2005-04-16 15:20:36 -07:00
}
2010-05-18 15:55:27 -07:00
static int addrconf_dad_end ( struct inet6_ifaddr * ifp )
{
int err = - ENOENT ;
2015-03-23 23:36:03 +01:00
spin_lock_bh ( & ifp - > lock ) ;
2010-05-18 15:55:27 -07:00
if ( ifp - > state = = INET6_IFADDR_STATE_DAD ) {
ifp - > state = INET6_IFADDR_STATE_POSTDAD ;
err = 0 ;
}
2015-03-23 23:36:03 +01:00
spin_unlock_bh ( & ifp - > lock ) ;
2010-05-18 15:55:27 -07:00
return err ;
}
2017-10-30 19:38:52 -04:00
void addrconf_dad_failure ( struct sk_buff * skb , struct inet6_ifaddr * ifp )
2005-12-21 22:57:24 +09:00
{
2008-06-28 14:18:38 +09:00
struct inet6_dev * idev = ifp - > idev ;
2022-02-07 20:50:30 -08:00
struct net * net = dev_net ( idev - > dev ) ;
2009-03-18 18:22:48 -07:00
2010-10-24 23:06:43 +00:00
if ( addrconf_dad_end ( ifp ) ) {
in6_ifa_put ( ifp ) ;
2010-05-18 15:55:27 -07:00
return ;
2010-10-24 23:06:43 +00:00
}
2010-05-18 15:55:27 -07:00
2017-10-30 19:38:52 -04:00
net_info_ratelimited ( " %s: IPv6 duplicate address %pI6c used by %pM detected! \n " ,
ifp - > idev - > dev - > name , & ifp - > addr , eth_hdr ( skb ) - > h_source ) ;
2009-03-18 18:22:48 -07:00
2015-03-23 23:36:04 +01:00
spin_lock_bh ( & ifp - > lock ) ;
if ( ifp - > flags & IFA_F_STABLE_PRIVACY ) {
struct in6_addr new_addr ;
struct inet6_ifaddr * ifp2 ;
int retries = ifp - > stable_privacy_retry + 1 ;
2018-05-27 08:09:53 -07:00
struct ifa6_config cfg = {
. pfx = & new_addr ,
. plen = ifp - > prefix_len ,
. ifa_flags = ifp - > flags ,
. valid_lft = ifp - > valid_lft ,
. preferred_lft = ifp - > prefered_lft ,
. scope = ifp - > scope ,
} ;
2015-03-23 23:36:04 +01:00
2015-03-23 23:36:05 +01:00
if ( retries > net - > ipv6 . sysctl . idgen_retries ) {
2015-03-23 23:36:04 +01:00
net_info_ratelimited ( " %s: privacy stable address generation failed because of DAD conflicts! \n " ,
ifp - > idev - > dev - > name ) ;
goto errdad ;
}
new_addr = ifp - > addr ;
if ( ipv6_generate_stable_address ( & new_addr , retries ,
idev ) )
goto errdad ;
spin_unlock_bh ( & ifp - > lock ) ;
2008-06-28 14:18:38 +09:00
2015-03-23 23:36:04 +01:00
if ( idev - > cnf . max_addresses & &
ipv6_count_addresses ( idev ) > =
idev - > cnf . max_addresses )
goto lock_errdad ;
net_info_ratelimited ( " %s: generating new stable privacy address because of DAD conflict \n " ,
ifp - > idev - > dev - > name ) ;
2018-05-27 08:09:53 -07:00
ifp2 = ipv6_add_addr ( idev , & cfg , false , NULL ) ;
2015-03-23 23:36:04 +01:00
if ( IS_ERR ( ifp2 ) )
goto lock_errdad ;
spin_lock_bh ( & ifp2 - > lock ) ;
ifp2 - > stable_privacy_retry = retries ;
ifp2 - > state = INET6_IFADDR_STATE_PREDAD ;
spin_unlock_bh ( & ifp2 - > lock ) ;
2015-03-23 23:36:05 +01:00
addrconf_mod_dad_work ( ifp2 , net - > ipv6 . sysctl . idgen_delay ) ;
2015-03-23 23:36:04 +01:00
in6_ifa_put ( ifp2 ) ;
lock_errdad :
spin_lock_bh ( & ifp - > lock ) ;
2008-06-28 14:18:38 +09:00
}
2015-03-23 23:36:04 +01:00
errdad :
2014-03-27 18:28:07 +01:00
/* transition from _POSTDAD to _ERRDAD */
ifp - > state = INET6_IFADDR_STATE_ERRDAD ;
2015-03-23 23:36:03 +01:00
spin_unlock_bh ( & ifp - > lock ) ;
2014-03-27 18:28:07 +01:00
addrconf_mod_dad_work ( ifp , 0 ) ;
2016-09-05 16:06:31 +08:00
in6_ifa_put ( ifp ) ;
2005-12-21 22:57:24 +09:00
}
2005-04-16 15:20:36 -07:00
2014-09-02 10:29:29 +02:00
/* Join the solicited-node multicast group for this address.
* The caller must hold RTNL. */
2011-04-22 04:53:02 +00:00
void addrconf_join_solict ( struct net_device * dev , const struct in6_addr * addr )
2005-04-16 15:20:36 -07:00
{
struct in6_addr maddr ;
if ( dev - > flags & ( IFF_LOOPBACK | IFF_NOARP ) )
return ;
addrconf_addr_solict_mult ( addr , & maddr ) ;
ipv6_dev_mc_inc ( dev , & maddr ) ;
}
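/*
 * Illustrative userspace sketch, not part of the kernel source. It shows the
 * mapping that the join/leave helpers above rely on: the solicited-node
 * multicast address defined in RFC 4291 (ff02::1:ffXX:XXXX, where XX:XXXX are
 * the low 24 bits of the unicast address). The function name
 * solicited_node_mcast() and the raw 16-byte arrays are assumptions made for
 * this example; the kernel works on struct in6_addr.
 */
```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: derive ff02::1:ffXX:XXXX from a unicast address. */
static void solicited_node_mcast(uint8_t maddr[16], const uint8_t addr[16])
{
	memset(maddr, 0, 16);
	maddr[0] = 0xFF;	/* multicast prefix ff02::/16 (link-local scope) */
	maddr[1] = 0x02;
	maddr[11] = 0x01;	/* ...:0001:ffXX:XXXX */
	maddr[12] = 0xFF;
	memcpy(maddr + 13, addr + 13, 3);	/* low 24 bits of the unicast address */
}
```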
2014-09-02 10:29:29 +02:00
/* caller must hold RTNL */
2011-04-22 04:53:02 +00:00
void addrconf_leave_solict ( struct inet6_dev * idev , const struct in6_addr * addr )
2005-04-16 15:20:36 -07:00
{
struct in6_addr maddr ;
if ( idev - > dev - > flags & ( IFF_LOOPBACK | IFF_NOARP ) )
return ;
addrconf_addr_solict_mult ( addr , & maddr ) ;
__ipv6_dev_mc_dec ( idev , & maddr ) ;
}
2014-09-02 10:29:29 +02:00
/* caller must hold RTNL */
2005-08-16 02:18:02 -03:00
static void addrconf_join_anycast ( struct inet6_ifaddr * ifp )
2005-04-16 15:20:36 -07:00
{
struct in6_addr addr ;
2014-03-27 18:28:07 +01:00
2014-01-06 17:53:14 +01:00
if ( ifp - > prefix_len > = 127 ) /* RFC 6164 */
2011-07-05 23:04:13 +00:00
return ;
2005-04-16 15:20:36 -07:00
ipv6_addr_prefix ( & addr , & ifp - > addr , ifp - > prefix_len ) ;
if ( ipv6_addr_any ( & addr ) )
return ;
2014-09-11 15:35:11 -07:00
__ipv6_dev_ac_inc ( ifp - > idev , & addr ) ;
2005-04-16 15:20:36 -07:00
}
2014-09-02 10:29:29 +02:00
/* caller must hold RTNL */
2005-08-16 02:18:02 -03:00
static void addrconf_leave_anycast ( struct inet6_ifaddr * ifp )
2005-04-16 15:20:36 -07:00
{
struct in6_addr addr ;
2014-03-27 18:28:07 +01:00
2014-01-06 17:53:14 +01:00
if ( ifp - > prefix_len > = 127 ) /* RFC 6164 */
2011-07-24 11:44:34 +00:00
return ;
2005-04-16 15:20:36 -07:00
ipv6_addr_prefix ( & addr , & ifp - > addr , ifp - > prefix_len ) ;
if ( ipv6_addr_any ( & addr ) )
return ;
__ipv6_dev_ac_dec ( ifp - > idev , & addr ) ;
}
2017-03-12 10:19:36 +02:00
static int addrconf_ifid_6lowpan ( u8 * eui , struct net_device * dev )
2012-05-10 03:25:52 +00:00
{
2017-03-12 10:19:36 +02:00
switch ( dev - > addr_len ) {
case ETH_ALEN :
2017-03-12 10:19:38 +02:00
memcpy ( eui , dev - > dev_addr , 3 ) ;
eui [ 3 ] = 0xFF ;
eui [ 4 ] = 0xFE ;
memcpy ( eui + 5 , dev - > dev_addr + 3 , 3 ) ;
break ;
2017-03-12 10:19:36 +02:00
case EUI64_ADDR_LEN :
memcpy ( eui , dev - > dev_addr , EUI64_ADDR_LEN ) ;
eui [ 0 ] ^ = 2 ;
break ;
default :
2012-05-10 03:25:52 +00:00
return - 1 ;
2017-03-12 10:19:36 +02:00
}
2012-05-10 03:25:52 +00:00
return 0 ;
}
2013-03-25 08:26:24 +00:00
static int addrconf_ifid_ieee1394 ( u8 * eui , struct net_device * dev )
{
2021-10-12 08:58:38 -07:00
const union fwnet_hwaddr * ha ;
2013-03-25 08:26:24 +00:00
if ( dev - > addr_len ! = FWNET_ALEN )
return - 1 ;
2021-10-12 08:58:38 -07:00
ha = ( const union fwnet_hwaddr * ) dev - > dev_addr ;
2013-03-25 08:26:24 +00:00
memcpy ( eui , & ha - > uc . uniq_id , sizeof ( ha - > uc . uniq_id ) ) ;
eui [ 0 ] ^ = 2 ;
return 0 ;
}
2006-03-20 16:54:49 -08:00
static int addrconf_ifid_arcnet ( u8 * eui , struct net_device * dev )
{
/* XXX: inherit EUI-64 from other interface -- yoshfuji */
if ( dev - > addr_len ! = ARCNET_ALEN )
return - 1 ;
memset ( eui , 0 , 7 ) ;
2012-04-01 07:49:08 +00:00
eui [ 7 ] = * ( u8 * ) dev - > dev_addr ;
2006-03-20 16:54:49 -08:00
return 0 ;
}
static int addrconf_ifid_infiniband ( u8 * eui , struct net_device * dev )
{
if ( dev - > addr_len ! = INFINIBAND_ALEN )
return - 1 ;
memcpy ( eui , dev - > dev_addr + 12 , 8 ) ;
eui [ 0 ] | = 2 ;
return 0 ;
}
2010-10-04 20:17:53 +00:00
static int __ipv6_isatap_ifid ( u8 * eui , __be32 addr )
2008-04-10 15:42:09 +09:00
{
2009-05-19 12:56:51 +00:00
if ( addr = = 0 )
return - 1 ;
2008-04-10 15:42:09 +09:00
eui [ 0 ] = ( ipv4_is_zeronet ( addr ) | | ipv4_is_private_10 ( addr ) | |
ipv4_is_loopback ( addr ) | | ipv4_is_linklocal_169 ( addr ) | |
ipv4_is_private_172 ( addr ) | | ipv4_is_test_192 ( addr ) | |
ipv4_is_anycast_6to4 ( addr ) | | ipv4_is_private_192 ( addr ) | |
ipv4_is_test_198 ( addr ) | | ipv4_is_multicast ( addr ) | |
ipv4_is_lbcast ( addr ) ) ? 0x00 : 0x02 ;
eui [ 1 ] = 0 ;
eui [ 2 ] = 0x5E ;
eui [ 3 ] = 0xFE ;
memcpy ( eui + 4 , & addr , 4 ) ;
return 0 ;
}
static int addrconf_ifid_sit ( u8 * eui , struct net_device * dev )
{
if ( dev - > priv_flags & IFF_ISATAP )
return __ipv6_isatap_ifid ( eui , * ( __be32 * ) dev - > dev_addr ) ;
return - 1 ;
}
2011-06-08 10:44:30 +00:00
static int addrconf_ifid_gre ( u8 * eui , struct net_device * dev )
{
return __ipv6_isatap_ifid ( eui , * ( __be32 * ) dev - > dev_addr ) ;
}
2013-08-20 12:16:06 +02:00
static int addrconf_ifid_ip6tnl ( u8 * eui , struct net_device * dev )
{
memcpy ( eui , dev - > perm_addr , 3 ) ;
memcpy ( eui + 5 , dev - > perm_addr + 3 , 3 ) ;
eui [ 3 ] = 0xFF ;
eui [ 4 ] = 0xFE ;
eui [ 0 ] ^ = 2 ;
return 0 ;
}
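/*
 * Illustrative userspace sketch, not part of the kernel source. It restates
 * the byte layout used by addrconf_ifid_ip6tnl() above: a 48-bit hardware
 * address is split in half, 0xFF 0xFE is inserted in the middle, and the
 * universal/local bit of the first byte is flipped to form a modified EUI-64.
 * The name eui48_to_modified_eui64() and the plain byte arrays are
 * assumptions made for this example.
 */
```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a modified EUI-64 from a 6-byte MAC address. */
static void eui48_to_modified_eui64(uint8_t eui[8], const uint8_t mac[6])
{
	memcpy(eui, mac, 3);		/* OUI */
	eui[3] = 0xFF;			/* fixed filler bytes */
	eui[4] = 0xFE;
	memcpy(eui + 5, mac + 3, 3);	/* device-specific part */
	eui[0] ^= 2;			/* flip the universal/local bit */
}
```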
2005-04-16 15:20:36 -07:00
static int ipv6_generate_eui64 ( u8 * eui , struct net_device * dev )
{
switch ( dev - > type ) {
case ARPHRD_ETHER :
case ARPHRD_FDDI :
2006-03-20 16:54:49 -08:00
return addrconf_ifid_eui48 ( eui , dev ) ;
2005-04-16 15:20:36 -07:00
case ARPHRD_ARCNET :
2006-03-20 16:54:49 -08:00
return addrconf_ifid_arcnet ( eui , dev ) ;
2005-04-16 15:20:36 -07:00
case ARPHRD_INFINIBAND :
2006-03-20 16:54:49 -08:00
return addrconf_ifid_infiniband ( eui , dev ) ;
2007-11-29 22:11:40 +11:00
case ARPHRD_SIT :
2008-04-10 15:42:09 +09:00
return addrconf_ifid_sit ( eui , dev ) ;
2011-06-08 10:44:30 +00:00
case ARPHRD_IPGRE :
2017-01-26 16:59:18 +13:00
case ARPHRD_TUNNEL :
2011-06-08 10:44:30 +00:00
return addrconf_ifid_gre ( eui , dev ) ;
2013-12-11 17:05:36 +02:00
case ARPHRD_6LOWPAN :
2017-03-12 10:19:36 +02:00
return addrconf_ifid_6lowpan ( eui , dev ) ;
2013-03-25 08:26:24 +00:00
case ARPHRD_IEEE1394 :
return addrconf_ifid_ieee1394 ( eui , dev ) ;
2013-08-20 12:16:06 +02:00
case ARPHRD_TUNNEL6 :
2017-01-26 16:59:18 +13:00
case ARPHRD_IP6GRE :
2018-06-04 19:26:07 -06:00
case ARPHRD_RAWIP :
2013-08-20 12:16:06 +02:00
return addrconf_ifid_ip6tnl ( eui , dev ) ;
2005-04-16 15:20:36 -07:00
}
return - 1 ;
}
static int ipv6_inherit_eui64 ( u8 * eui , struct inet6_dev * idev )
{
int err = - 1 ;
struct inet6_ifaddr * ifp ;
read_lock_bh ( & idev - > lock ) ;
2014-01-19 21:58:19 +01:00
list_for_each_entry_reverse ( ifp , & idev - > addr_list , if_list ) {
if ( ifp - > scope > IFA_LINK )
break ;
2005-04-16 15:20:36 -07:00
if ( ifp - > scope = = IFA_LINK & & ! ( ifp - > flags & IFA_F_TENTATIVE ) ) {
memcpy ( eui , ifp - > addr . s6_addr + 8 , 8 ) ;
err = 0 ;
break ;
}
}
read_unlock_bh ( & idev - > lock ) ;
return err ;
}
2020-05-01 00:51:47 -03:00
/* Generation of a randomized Interface Identifier
* draft-ietf-6man-rfc4941bis, Section 3.3.1
*/
static void ipv6_gen_rnd_iid ( struct in6_addr * addr )
2005-04-16 15:20:36 -07:00
{
regen :
2020-05-01 00:51:47 -03:00
get_random_bytes ( & addr - > s6_addr [ 8 ] , 8 ) ;
2005-04-16 15:20:36 -07:00
2020-05-01 00:51:47 -03:00
/* <draft-ietf-6man-rfc4941bis-08.txt>, Section 3.3.1:
* check if generated address is not inappropriate :
2005-04-16 15:20:36 -07:00
*
2021-03-27 04:42:40 +05:30
* - Reserved IPv6 Interface Identifiers
2020-05-01 00:51:47 -03:00
* - XXX : already assigned to an address on the device
2005-04-16 15:20:36 -07:00
*/
2020-05-01 00:51:47 -03:00
/* Subnet-router anycast: 0000:0000:0000:0000 */
if ( ! ( addr - > s6_addr32 [ 2 ] | addr - > s6_addr32 [ 3 ] ) )
2005-04-16 15:20:36 -07:00
goto regen ;
2020-05-01 00:51:47 -03:00
/* IANA Ethernet block: 0200:5EFF:FE00:0000-0200:5EFF:FE00:5212
* Proxy Mobile IPv6: 0200:5EFF:FE00:5213
* IANA Ethernet block: 0200:5EFF:FE00:5214-0200:5EFF:FEFF:FFFF
*/
if ( ntohl ( addr - > s6_addr32 [ 2 ] ) = = 0x02005eff & &
( ntohl ( addr - > s6_addr32 [ 3 ] ) & 0xff000000 ) = = 0xfe000000 )
goto regen ;
/* Reserved subnet anycast addresses */
if ( ntohl ( addr - > s6_addr32 [ 2 ] ) = = 0xfdffffff & &
ntohl ( addr - > s6_addr32 [ 3 ] ) > = 0xffffff80 )
goto regen ;
2005-04-16 15:20:36 -07:00
}
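/*
 * Illustrative userspace sketch, not part of the kernel source. It pulls the
 * three "goto regen" conditions of ipv6_gen_rnd_iid() above into a single
 * predicate so the reserved ranges are easier to read. The name
 * iid_is_reserved() is an assumption; hi and lo stand for the upper and lower
 * 32 bits of the interface identifier, already converted to host byte order.
 */
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a random IID must be regenerated. */
static bool iid_is_reserved(uint32_t hi, uint32_t lo)
{
	/* Subnet-router anycast: 0000:0000:0000:0000 */
	if ((hi | lo) == 0)
		return true;

	/* IANA Ethernet block and Proxy Mobile IPv6: 0200:5EFF:FExx:xxxx */
	if (hi == 0x02005effu && (lo & 0xff000000u) == 0xfe000000u)
		return true;

	/* Reserved subnet anycast: FDFF:FFFF:FFFF:FF80 and above */
	if (hi == 0xfdffffffu && lo >= 0xffffff80u)
		return true;

	return false;
}
```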
/*
* Add prefix route .
*/
static void
2018-05-27 08:09:58 -07:00
addrconf_prefix_route ( struct in6_addr * pfx , int plen , u32 metric ,
struct net_device * dev , unsigned long expires ,
u32 flags , gfp_t gfp_flags )
2005-04-16 15:20:36 -07:00
{
2006-08-22 00:01:08 -07:00
struct fib6_config cfg = {
2015-10-12 11:47:10 -07:00
. fc_table = l3mdev_fib_table ( dev ) ? : RT6_TABLE_PREFIX ,
2018-05-27 08:09:58 -07:00
. fc_metric = metric ? : IP6_RT_PRIO_ADDRCONF ,
2006-08-22 00:01:08 -07:00
. fc_ifindex = dev - > ifindex ,
. fc_expires = expires ,
. fc_dst_len = plen ,
. fc_flags = RTF_UP | flags ,
2008-03-25 21:47:49 +09:00
. fc_nlinfo . nl_net = dev_net ( dev ) ,
2008-08-23 05:16:46 -07:00
. fc_protocol = RTPROT_KERNEL ,
2018-04-17 17:33:13 -07:00
. fc_type = RTN_UNICAST ,
2006-08-22 00:01:08 -07:00
} ;
2005-04-16 15:20:36 -07:00
2011-11-21 03:39:03 +00:00
cfg . fc_dst = * pfx ;
2005-04-16 15:20:36 -07:00
/* Prevent useless cloning on point-to-point SIT devices.
This is done here on the assumption that the whole
class of non-broadcast devices does not need cloning.
*/
2012-10-29 16:23:10 +00:00
# if IS_ENABLED(CONFIG_IPV6_SIT)
2006-08-22 00:01:08 -07:00
if ( dev - > type = = ARPHRD_SIT & & ( dev - > flags & IFF_POINTOPOINT ) )
cfg . fc_flags | = RTF_NONEXTHOP ;
2006-10-10 14:49:53 -07:00
# endif
2005-04-16 15:20:36 -07:00
2018-04-17 17:33:22 -07:00
ip6_route_add ( & cfg , gfp_flags , NULL ) ;
2005-04-16 15:20:36 -07:00
}
2011-10-26 03:24:29 +00:00
2018-04-17 17:33:26 -07:00
static struct fib6_info * addrconf_get_prefix_route ( const struct in6_addr * pfx ,
2011-10-26 03:24:29 +00:00
int plen ,
const struct net_device * dev ,
2019-03-27 20:53:52 -07:00
u32 flags , u32 noflags ,
bool no_gw )
2011-10-26 03:24:29 +00:00
{
struct fib6_node * fn ;
2018-04-17 17:33:26 -07:00
struct fib6_info * rt = NULL ;
2011-10-26 03:24:29 +00:00
struct fib6_table * table ;
2015-10-12 11:47:10 -07:00
u32 tb_id = l3mdev_fib_table ( dev ) ? : RT6_TABLE_PREFIX ;
2011-10-26 03:24:29 +00:00
2015-10-12 11:47:10 -07:00
table = fib6_get_table ( dev_net ( dev ) , tb_id ) ;
2015-03-29 14:00:04 +01:00
if ( ! table )
2011-10-26 03:24:29 +00:00
return NULL ;
ipv6: replace rwlock with rcu and spinlock in fib6_table
With all the preparation work before, we are now ready to replace rwlock
with rcu and spinlock in fib6_table.
That means now all fib6_node in fib6_table are protected by rcu. And
when freeing fib6_node, call_rcu() is used to wait for the rcu grace
period before releasing the memory.
When accessing fib6_node, corresponding rcu APIs need to be used.
And all previous sessions protected by the write lock will now be
protected by the spin lock per table.
All previous sessions protected by read lock will now be protected by
rcu_read_lock().
A couple of things to note here:
1. As part of the work of replacing rwlock with rcu, the linked list of
fn->leaf now has to be rcu protected as well. So both fn->leaf and
rt->dst.rt6_next are now __rcu tagged and corresponding rcu APIs are
used when manipulating them.
2. For fn->rr_ptr, first of all, it also needs to be rcu protected now
and is tagged with __rcu and rcu APIs are used in corresponding places.
Secondly, fn->rr_ptr is changed in rt6_select() which is a reader
thread. This makes the issue a bit complicated. We think a valid
solution for it is to let rt6_select() grab the tb6_lock if it decides
to change it. As it is not in the normal operation and only happens when
there is no valid neighbor cache for the route, we think the performance
impact should be low.
3. fib6_walk_continue() has to be called with tb6_lock held even in the
route dumping related functions, e.g. inet6_dump_fib(),
fib6_tables_dump() and ipv6_route_seq_ops. It is because
fib6_walk_continue() makes modifications to the walker structure, and so
are fib6_repair_tree() and fib6_del_route(). In order to do proper
syncing between them, we need to let fib6_walk_continue() hold the lock.
We may be able to do further improvement on the way we do the tree walk
to get rid of the need for holding the spin lock. But not for now.
4. When fib6_del_route() removes a route from the tree, we no longer
mark rt->dst.rt6_next to NULL to make simultaneous reader be able to
further traverse the list with rcu. However, rt->dst.rt6_next is only
valid within this same rcu period. No one should access it later.
5. All the operation of atomic_inc(rt->rt6i_ref) is changed to be
performed before we publish this route (either by linking it to fn->leaf
or insert it in the list pointed by fn->leaf) just to be safe because as
soon as we publish the route, some read thread will be able to access it.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 12:06:10 -07:00
rcu_read_lock ( ) ;
2017-10-06 12:06:02 -07:00
fn = fib6_locate ( & table - > tb6_root , pfx , plen , NULL , 0 , true ) ;
2011-10-26 03:24:29 +00:00
if ( ! fn )
goto out ;
2015-04-28 13:03:03 -07:00
2017-10-06 12:06:10 -07:00
for_each_fib6_node_rt_rcu ( fn ) {
2019-06-03 20:19:52 -07:00
/* prefix routes only use builtin fib6_nh */
if ( rt - > nh )
continue ;
2019-05-22 20:27:59 -07:00
if ( rt - > fib6_nh - > fib_nh_dev - > ifindex ! = dev - > ifindex )
2011-10-26 03:24:29 +00:00
continue ;
2019-05-22 20:27:59 -07:00
if ( no_gw & & rt - > fib6_nh - > fib_nh_gw_family )
2019-03-27 20:53:52 -07:00
continue ;
2018-04-18 15:38:59 -07:00
if ( ( rt - > fib6_flags & flags ) ! = flags )
2011-10-26 03:24:29 +00:00
continue ;
2018-04-18 15:38:59 -07:00
if ( ( rt - > fib6_flags & noflags ) ! = 0 )
2011-10-26 03:24:29 +00:00
continue ;
2018-07-21 20:56:32 -07:00
if ( ! fib6_info_hold_safe ( rt ) )
continue ;
2011-10-26 03:24:29 +00:00
break ;
}
out :
2017-10-06 12:06:10 -07:00
rcu_read_unlock ( ) ;
2011-10-26 03:24:29 +00:00
return rt ;
}
2005-04-16 15:20:36 -07:00
/* Create "default" multicast route to the interface */
static void addrconf_add_mroute ( struct net_device * dev )
{
2006-08-22 00:01:08 -07:00
struct fib6_config cfg = {
2015-10-12 11:47:10 -07:00
. fc_table = l3mdev_fib_table ( dev ) ? : RT6_TABLE_LOCAL ,
2006-08-22 00:01:08 -07:00
. fc_metric = IP6_RT_PRIO_ADDRCONF ,
. fc_ifindex = dev - > ifindex ,
. fc_dst_len = 8 ,
. fc_flags = RTF_UP ,
2021-01-15 19:42:09 +01:00
. fc_type = RTN_MULTICAST ,
2008-03-25 21:47:49 +09:00
. fc_nlinfo . nl_net = dev_net ( dev ) ,
2021-01-15 19:42:08 +01:00
. fc_protocol = RTPROT_KERNEL ,
2006-08-22 00:01:08 -07:00
} ;
ipv6_addr_set ( & cfg . fc_dst , htonl ( 0xFF000000 ) , 0 , 0 , 0 ) ;
2018-08-22 12:58:34 -07:00
ip6_route_add ( & cfg , GFP_KERNEL , NULL ) ;
2005-04-16 15:20:36 -07:00
}
static struct inet6_dev * addrconf_add_dev ( struct net_device * dev )
{
struct inet6_dev * idev ;
ASSERT_RTNL ( ) ;
2010-03-20 16:09:01 -07:00
idev = ipv6_find_idev ( dev ) ;
2019-08-23 15:44:36 +02:00
if ( IS_ERR ( idev ) )
return idev ;
2010-07-20 10:34:30 +00:00
if ( idev - > cnf . disable_ipv6 )
return ERR_PTR ( - EACCES ) ;
2005-04-16 15:20:36 -07:00
/* Add default multicast route */
2016-06-13 13:44:18 -07:00
if ( ! ( dev - > flags & IFF_LOOPBACK ) & & ! netif_is_l3_master ( dev ) )
2011-12-06 21:23:45 +00:00
addrconf_add_mroute ( dev ) ;
2005-04-16 15:20:36 -07:00
return idev ;
}
2013-12-06 09:45:22 +01:00
static void manage_tempaddrs ( struct inet6_dev * idev ,
struct inet6_ifaddr * ifp ,
__u32 valid_lft , __u32 prefered_lft ,
bool create , unsigned long now )
{
u32 flags ;
struct inet6_ifaddr * ift ;
read_lock_bh ( & idev - > lock ) ;
/* update all temporary addresses in the list */
list_for_each_entry ( ift , & idev - > tempaddr_list , tmp_list ) {
int age , max_valid , max_prefered ;
if ( ifp ! = ift - > ifpub )
continue ;
/* RFC 4941 section 3.3:
* If a received option will extend the lifetime of a public
* address , the lifetimes of temporary addresses should
* be extended , subject to the overall constraint that no
* temporary addresses should ever remain " valid " or " preferred "
* for a time longer than ( TEMP_VALID_LIFETIME ) or
* ( TEMP_PREFERRED_LIFETIME - DESYNC_FACTOR ) , respectively .
*/
age = ( now - ift - > cstamp ) / HZ ;
max_valid = idev - > cnf . temp_valid_lft - age ;
if ( max_valid < 0 )
max_valid = 0 ;
max_prefered = idev - > cnf . temp_prefered_lft -
2016-10-13 18:52:15 +02:00
idev - > desync_factor - age ;
2013-12-06 09:45:22 +01:00
if ( max_prefered < 0 )
max_prefered = 0 ;
if ( valid_lft > max_valid )
valid_lft = max_valid ;
if ( prefered_lft > max_prefered )
prefered_lft = max_prefered ;
spin_lock ( & ift - > lock ) ;
flags = ift - > flags ;
ift - > valid_lft = valid_lft ;
ift - > prefered_lft = prefered_lft ;
ift - > tstamp = now ;
if ( prefered_lft > 0 )
ift - > flags & = ~ IFA_F_DEPRECATED ;
spin_unlock ( & ift - > lock ) ;
if ( ! ( flags & IFA_F_TENTATIVE ) )
ipv6_ifa_notify ( 0 , ift ) ;
}
ipv6 addrconf: fix bug where deleting a mngtmpaddr can create a new temporary address
currently on 6.4 net/main:
# ip link add dummy1 type dummy
# echo 1 > /proc/sys/net/ipv6/conf/dummy1/use_tempaddr
# ip link set dummy1 up
# ip -6 addr add 2000::1/64 mngtmpaddr dev dummy1
# ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet6 2000::44f3:581c:8ca:3983/64 scope global temporary dynamic
valid_lft 604800sec preferred_lft 86172sec
inet6 2000::1/64 scope global mngtmpaddr
valid_lft forever preferred_lft forever
inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
valid_lft forever preferred_lft forever
# ip -6 addr del 2000::44f3:581c:8ca:3983/64 dev dummy1
(can wait a few seconds if you want to, the above delete isn't [directly] the problem)
# ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet6 2000::1/64 scope global mngtmpaddr
valid_lft forever preferred_lft forever
inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
valid_lft forever preferred_lft forever
# ip -6 addr del 2000::1/64 mngtmpaddr dev dummy1
# ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet6 2000::81c9:56b7:f51a:b98f/64 scope global temporary dynamic
valid_lft 604797sec preferred_lft 86169sec
inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
valid_lft forever preferred_lft forever
This patch prevents this new 'global temporary dynamic' address from being
created by the deletion of the related (same subnet prefix) 'mngtmpaddr'
(which is triggered by there already being no temporary addresses).
Cc: Jiri Pirko <jiri@resnulli.us>
Fixes: 53bd67491537 ("ipv6 addrconf: introduce IFA_F_MANAGETEMPADDR to tell kernel to manage temporary addresses")
Reported-by: Xiao Ma <xiaom@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230720160022.1887942-1-maze@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-20 09:00:22 -07:00
/* Also create a temporary address if it's enabled but no temporary
* address currently exists.
* However, we get called with valid_lft == 0, prefered_lft == 0, create == false
* as part of cleanup (i.e. deleting the mngtmpaddr).
* We don't want that to result in creating a new temporary IP address.
*/
if ( list_empty ( & idev - > tempaddr_list ) & & ( valid_lft | | prefered_lft ) )
create = true ;
if ( create & & idev - > cnf . use_tempaddr > 0 ) {
2013-12-06 09:45:22 +01:00
/* When a new public address is created as described
* in [ ADDRCONF ] , also create a new temporary address .
*/
read_unlock_bh ( & idev - > lock ) ;
2020-05-01 00:51:47 -03:00
ipv6_create_tempaddr ( ifp , false ) ;
2013-12-06 09:45:22 +01:00
} else {
read_unlock_bh ( & idev - > lock ) ;
}
}
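/*
 * Illustrative userspace sketch, not part of the kernel source. It isolates
 * the lifetime clamping that manage_tempaddrs() above applies to each
 * temporary address, following the RFC 4941 section 3.3 text quoted in the
 * function. The struct temp_lft, the function names, and the use of unsigned
 * seconds for every input are assumptions made for this example; the kernel
 * reads the limits from idev->cnf and computes the age from jiffies.
 */
```c
#include <stdint.h>

struct temp_lft {
	uint32_t valid;
	uint32_t preferred;
};

/* limit - used, saturating at zero instead of wrapping around. */
static uint32_t lft_remaining(uint32_t limit, uint32_t used)
{
	return limit > used ? limit - used : 0;
}

/*
 * Hypothetical helper: cap the advertised lifetimes so a temporary address
 * of the given age never outlives temp_valid_lft or
 * (temp_prefered_lft - desync_factor).
 */
static struct temp_lft clamp_temp_lifetimes(uint32_t valid_lft,
					    uint32_t prefered_lft,
					    uint32_t temp_valid_lft,
					    uint32_t temp_prefered_lft,
					    uint32_t desync_factor,
					    uint32_t age)
{
	uint32_t max_valid = lft_remaining(temp_valid_lft, age);
	uint32_t max_prefered =
		lft_remaining(lft_remaining(temp_prefered_lft, desync_factor), age);
	struct temp_lft out;

	out.valid = valid_lft > max_valid ? max_valid : valid_lft;
	out.preferred = prefered_lft > max_prefered ? max_prefered : prefered_lft;
	return out;
}
```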
2015-12-16 16:44:38 +01:00
static bool is_addr_mode_generate_stable ( struct inet6_dev * idev )
{
2017-01-26 16:59:17 +13:00
return idev - > cnf . addr_gen_mode = = IN6_ADDR_GEN_MODE_STABLE_PRIVACY | |
idev - > cnf . addr_gen_mode = = IN6_ADDR_GEN_MODE_RANDOM ;
2015-12-16 16:44:38 +01:00
}
2016-06-15 21:20:24 +02:00
int addrconf_prefix_rcv_add_addr ( struct net * net , struct net_device * dev ,
const struct prefix_info * pinfo ,
struct inet6_dev * in6_dev ,
const struct in6_addr * addr , int addr_type ,
u32 addr_flags , bool sllao , bool tokenized ,
__u32 valid_lft , u32 prefered_lft )
2016-06-15 21:20:22 +02:00
{
struct inet6_ifaddr * ifp = ipv6_get_ifaddr ( net , addr , dev , 1 ) ;
2022-01-26 16:38:52 +01:00
int create = 0 , update_lft = 0 ;
2016-06-15 21:20:22 +02:00
if ( ! ifp & & valid_lft ) {
int max_addresses = in6_dev - > cnf . max_addresses ;
2018-05-27 08:09:53 -07:00
struct ifa6_config cfg = {
. pfx = addr ,
. plen = pinfo - > prefix_len ,
. ifa_flags = addr_flags ,
. valid_lft = valid_lft ,
. preferred_lft = prefered_lft ,
. scope = addr_type & IPV6_ADDR_SCOPE_MASK ,
2022-02-17 16:02:02 +01:00
. ifa_proto = IFAPROT_KERNEL_RA
2018-05-27 08:09:53 -07:00
} ;
2016-06-15 21:20:22 +02:00
# ifdef CONFIG_IPV6_OPTIMISTIC_DAD
ipv6: fix net.ipv6.conf.all interface DAD handlers
Currently, writing into
net.ipv6.conf.all.{accept_dad,use_optimistic,optimistic_dad} has no effect.
Fix handling of these flags by:
- using the maximum of global and per-interface values for the
accept_dad flag. That is, if at least one of the two values is
non-zero, enable DAD on the interface. If at least one value is
set to 2, enable DAD and disable IPv6 operation on the interface if
MAC-based link-local address was found
- using the logical OR of global and per-interface values for the
optimistic_dad flag. If at least one of them is set to one, optimistic
duplicate address detection (RFC 4429) is enabled on the interface
- using the logical OR of global and per-interface values for the
use_optimistic flag. If at least one of them is set to one,
optimistic addresses won't be marked as deprecated during source address
selection on the interface.
While at it, as we're modifying the prototype for ipv6_use_optimistic_addr(),
drop inline, and let the compiler decide.
Fixes: 7fd2561e4ebd ("net: ipv6: Add a sysctl to make optimistic addresses useful candidates")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-12 17:46:37 +02:00
if ( ( net - > ipv6 . devconf_all - > optimistic_dad | |
in6_dev - > cnf . optimistic_dad ) & &
2016-06-15 21:20:22 +02:00
! net - > ipv6 . devconf_all - > forwarding & & sllao )
2018-05-27 08:09:53 -07:00
cfg . ifa_flags | = IFA_F_OPTIMISTIC ;
2016-06-15 21:20:22 +02:00
# endif
/* Do not allow too many autoconfigured addresses to be
* created; that would be too easy a way to crash the kernel.
*/
if ( ! max_addresses | |
ipv6_count_addresses ( in6_dev ) < max_addresses )
2018-05-27 08:09:53 -07:00
ifp = ipv6_add_addr ( in6_dev , & cfg , false , NULL ) ;
2016-06-15 21:20:22 +02:00
if ( IS_ERR_OR_NULL ( ifp ) )
return - 1 ;
create = 1 ;
spin_lock_bh ( & ifp - > lock ) ;
ifp - > flags | = IFA_F_MANAGETEMPADDR ;
ifp - > cstamp = jiffies ;
ifp - > tokenized = tokenized ;
spin_unlock_bh ( & ifp - > lock ) ;
addrconf_dad_start ( ifp ) ;
}
if ( ifp ) {
u32 flags ;
unsigned long now ;
u32 stored_lft ;
2022-01-26 16:38:52 +01:00
/* update lifetime (RFC 4862 5.5.3 e) */
2016-06-15 21:20:22 +02:00
spin_lock_bh ( & ifp - > lock ) ;
now = jiffies ;
if ( ifp - > valid_lft > ( now - ifp - > tstamp ) / HZ )
stored_lft = ifp - > valid_lft - ( now - ifp - > tstamp ) / HZ ;
else
stored_lft = 0 ;
2023-09-25 14:47:11 -07:00
/* RFC4862 Section 5.5.3e:
* " Note that the preferred lifetime of the
* corresponding address is always reset to
* the Preferred Lifetime in the received
* Prefix Information option , regardless of
* whether the valid lifetime is also reset or
* ignored . "
*
* So we should always update prefered_lft here .
*/
update_lft = ! create & & stored_lft ;
if ( update_lft & & ! in6_dev - > cnf . ra_honor_pio_life ) {
2022-01-26 16:38:52 +01:00
const u32 minimum_lft = min_t ( u32 ,
stored_lft , MIN_VALID_LIFETIME ) ;
valid_lft = max ( valid_lft , minimum_lft ) ;
}
if ( update_lft ) {
2016-06-15 21:20:22 +02:00
ifp - > valid_lft = valid_lft ;
ifp - > prefered_lft = prefered_lft ;
ifp - > tstamp = now ;
flags = ifp - > flags ;
ifp - > flags & = ~ IFA_F_DEPRECATED ;
spin_unlock_bh ( & ifp - > lock ) ;
if ( ! ( flags & IFA_F_TENTATIVE ) )
ipv6_ifa_notify ( 0 , ifp ) ;
} else
spin_unlock_bh ( & ifp - > lock ) ;
manage_tempaddrs ( in6_dev , ifp , valid_lft , prefered_lft ,
create , now ) ;
in6_ifa_put ( ifp ) ;
2022-02-07 20:50:29 -08:00
addrconf_verify ( net ) ;
2016-06-15 21:20:22 +02:00
}
return 0 ;
}
2016-06-15 21:20:24 +02:00
EXPORT_SYMBOL_GPL ( addrconf_prefix_rcv_add_addr ) ;
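/*
 * Illustrative userspace sketch, not part of the kernel source. It restates
 * how addrconf_prefix_rcv_add_addr() above picks the valid lifetime when it
 * refreshes an existing address (update_lft is set): unless
 * ra_honor_pio_life is enabled, the advertised value is not allowed to drop
 * below min(stored_lft, MIN_VALID_LIFETIME). The function name
 * effective_valid_lft() is an assumption, and the two-hour value used for
 * SKETCH_MIN_VALID_LIFETIME is assumed to match the kernel's
 * MIN_VALID_LIFETIME and the two-hour rule in RFC 4862 section 5.5.3 e.
 */
```c
#include <stdint.h>

#define SKETCH_MIN_VALID_LIFETIME (2 * 3600)	/* seconds; assumed value */

/* Hypothetical helper: valid lifetime to store for an existing address. */
static uint32_t effective_valid_lft(uint32_t advertised, uint32_t stored_lft,
				    int honor_pio_life)
{
	uint32_t minimum_lft;

	if (honor_pio_life)
		return advertised;	/* take the RA value as-is */

	minimum_lft = stored_lft < SKETCH_MIN_VALID_LIFETIME ?
		      stored_lft : SKETCH_MIN_VALID_LIFETIME;

	return advertised > minimum_lft ? advertised : minimum_lft;
}
```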
2016-06-15 21:20:22 +02:00
2012-01-04 10:49:15 +00:00
void addrconf_prefix_rcv ( struct net_device * dev , u8 * opt , int len , bool sllao )
2005-04-16 15:20:36 -07:00
{
struct prefix_info * pinfo ;
__u32 valid_lft ;
__u32 prefered_lft ;
2016-06-15 21:20:22 +02:00
int addr_type , err ;
2015-03-23 23:36:02 +01:00
u32 addr_flags = 0 ;
2005-04-16 15:20:36 -07:00
struct inet6_dev * in6_dev ;
2009-06-01 03:07:33 -07:00
struct net * net = dev_net ( dev ) ;
2005-04-16 15:20:36 -07:00
pinfo = ( struct prefix_info * ) opt ;
2007-02-09 23:24:49 +09:00
2005-04-16 15:20:36 -07:00
if ( len < sizeof ( struct prefix_info ) ) {
2018-03-26 08:35:01 -07:00
netdev_dbg ( dev , " addrconf: prefix option too short \n " ) ;
2005-04-16 15:20:36 -07:00
return ;
}
2007-02-09 23:24:49 +09:00
2005-04-16 15:20:36 -07:00
/*
* Validation checks ( [ ADDRCONF ] , page 19 )
*/
addr_type = ipv6_addr_type ( & pinfo - > prefix ) ;
if ( addr_type & ( IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL ) )
return ;
valid_lft = ntohl ( pinfo - > valid ) ;
prefered_lft = ntohl ( pinfo - > prefered ) ;
if ( prefered_lft > valid_lft ) {
2012-05-13 21:56:26 +00:00
net_warn_ratelimited ( " addrconf: prefix option has invalid lifetime \n " ) ;
2005-04-16 15:20:36 -07:00
return ;
}
in6_dev = in6_dev_get ( dev ) ;
2015-03-29 14:00:04 +01:00
if ( ! in6_dev ) {
2012-05-13 21:56:26 +00:00
net_dbg_ratelimited ( " addrconf: device %s not configured \n " ,
dev - > name ) ;
2005-04-16 15:20:36 -07:00
return ;
}
2023-07-26 16:07:01 -07:00
if ( valid_lft ! = 0 & & valid_lft < in6_dev - > cnf . accept_ra_min_lft )
2023-08-18 11:22:49 -07:00
goto put ;
2023-07-26 16:07:01 -07:00
2005-04-16 15:20:36 -07:00
/*
* Two things going on here :
* 1 ) Add routes for on - link prefixes
* 2 ) Configure prefixes with the auto flag set
*/
2008-05-27 17:37:49 +09:00
if ( pinfo - > onlink ) {
2018-04-17 17:33:26 -07:00
struct fib6_info * rt ;
2008-05-27 17:37:49 +09:00
unsigned long rt_expires ;
2008-05-19 16:56:11 -07:00
/* Avoid arithmetic overflow. Really, we could
 * save rt_expires in seconds, like valid_lft,
 * but that would require division in fib gc, which is
 * not good.
 */
2008-05-27 17:37:49 +09:00
if ( HZ > USER_HZ )
rt_expires = addrconf_timeout_fixup ( valid_lft , HZ ) ;
else
rt_expires = addrconf_timeout_fixup ( valid_lft , USER_HZ ) ;
2005-12-19 14:02:45 -08:00
2008-05-27 17:37:49 +09:00
if ( addrconf_finite_timeout ( rt_expires ) )
rt_expires * = HZ ;
2005-04-16 15:20:36 -07:00
2011-10-26 03:24:29 +00:00
rt = addrconf_get_prefix_route ( & pinfo - > prefix ,
pinfo - > prefix_len ,
dev ,
RTF_ADDRCONF | RTF_PREFIX_RT ,
2019-03-27 20:53:52 -07:00
RTF_DEFAULT , true ) ;
2005-04-16 15:20:36 -07:00
2011-10-26 03:24:29 +00:00
if ( rt ) {
2008-05-19 16:56:11 -07:00
/* Autoconf prefix route */
if ( valid_lft = = 0 ) {
2020-04-27 13:56:45 -07:00
ip6_del_rt ( net , rt , false ) ;
2008-05-19 16:56:11 -07:00
rt = NULL ;
2008-05-27 17:37:49 +09:00
} else if ( addrconf_finite_timeout ( rt_expires ) ) {
2008-05-19 16:56:11 -07:00
/* not infinity */
2018-04-17 17:33:17 -07:00
fib6_set_expires ( rt , jiffies + rt_expires ) ;
2008-05-19 16:56:11 -07:00
} else {
2018-04-17 17:33:17 -07:00
fib6_clean_expires ( rt ) ;
2005-04-16 15:20:36 -07:00
}
} else if ( valid_lft ) {
2008-05-19 16:56:11 -07:00
clock_t expires = 0 ;
2008-05-27 17:37:49 +09:00
int flags = RTF_ADDRCONF | RTF_PREFIX_RT ;
if ( addrconf_finite_timeout ( rt_expires ) ) {
2008-05-19 16:56:11 -07:00
/* not infinity */
flags | = RTF_EXPIRES ;
expires = jiffies_to_clock_t ( rt_expires ) ;
}
2005-04-16 15:20:36 -07:00
addrconf_prefix_route ( & pinfo - > prefix , pinfo - > prefix_len ,
2018-05-27 08:09:58 -07:00
0 , dev , expires , flags ,
GFP_ATOMIC ) ;
2005-04-16 15:20:36 -07:00
}
2018-04-17 17:33:25 -07:00
fib6_info_release ( rt ) ;
2005-04-16 15:20:36 -07:00
}
/* Try to figure out our local address for this prefix */
if ( pinfo - > autoconf & & in6_dev - > cnf . autoconf ) {
struct in6_addr addr ;
2016-06-15 21:20:23 +02:00
bool tokenized = false , dev_addr_generated = false ;
2005-04-16 15:20:36 -07:00
if ( pinfo - > prefix_len = = 64 ) {
memcpy ( & addr , & pinfo - > prefix , 8 ) ;
net: ipv6: add tokenized interface identifier support
This patch adds support for IPv6 tokenized IIDs, that allow
for administrators to assign well-known host-part addresses
to nodes whilst still obtaining global network prefix from
Router Advertisements. It is currently in draft status.
The primary target for such support is server platforms
where addresses are usually manually configured, rather
than using DHCPv6 or SLAAC. By using tokenised identifiers,
hosts can still determine their network prefix by use of
SLAAC, but more readily be automatically renumbered should
their network prefix change. [...]
The disadvantage with static addresses is that they are
likely to require manual editing should the network prefix
in use change. If instead there were a method to only
manually configure the static identifier part of the IPv6
address, then the address could be automatically updated
when a new prefix was introduced, as described in [RFC4192]
for example. In such cases a DNS server might be
configured with such a tokenised interface identifier of
::53, and SLAAC would use the token in constructing the
interface address, using the advertised prefix. [...]
http://tools.ietf.org/html/draft-chown-6man-tokenised-ipv6-identifiers-02
The implementation is partially based on top of Mark K.
Thompson's proof of concept. However, it uses the Netlink
interface for configuration and data retrieval, so that
it can be easily extended in future. Successfully tested
by myself.
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-08 04:01:30 +00:00
if ( ! ipv6_addr_any ( & in6_dev - > token ) ) {
read_lock_bh ( & in6_dev - > lock ) ;
memcpy ( addr . s6_addr + 8 ,
in6_dev - > token . s6_addr + 8 , 8 ) ;
read_unlock_bh ( & in6_dev - > lock ) ;
2013-04-09 03:47:16 +00:00
tokenized = true ;
2015-12-16 16:44:38 +01:00
} else if ( is_addr_mode_generate_stable ( in6_dev ) & &
ipv6: generation of stable privacy addresses for link-local and autoconf
This patch implements the stable privacy address generation for
link-local and autoconf addresses as specified in RFC7217.
RID = F(Prefix, Net_Iface, Network_ID, DAD_Counter, secret_key)
is the RID (random identifier). As the hash function F we chose one
round of sha1. Prefix will be either the link-local prefix or the
router advertised one. As Net_Iface we use the MAC address of the
device. DAD_Counter and secret_key are implemented as specified.
We don't use Network_ID, as it couples the code too closely to other
subsystems. It is specified as optional in the RFC.
As Net_Iface we only use the MAC address: we simply have no stable
identifier in the kernel we could possibly use: because this code might
run very early, we cannot depend on names, as they might be changed by
user space early on during the boot process.
A new address generation mode is introduced,
IN6_ADDR_GEN_MODE_STABLE_PRIVACY. With iproute2 one can switch back to
none or eui64 address configuration mode although the stable_secret is
already set.
We refuse writes to ipv6/conf/all/stable_secret but only allow
ipv6/conf/default/stable_secret and the interface specific file to be
written to. The default stable_secret is used as the parameter for the
namespace, the interface specific can overwrite the secret, e.g. when
switching a network configuration from one system to another while
inheriting the secret.
Cc: Erik Kline <ek@google.com>
Cc: Fernando Gont <fgont@si6networks.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23 23:36:01 +01:00
! ipv6_generate_stable_address ( & addr , 0 ,
in6_dev ) ) {
2015-03-23 23:36:02 +01:00
addr_flags | = IFA_F_STABLE_PRIVACY ;
2015-03-23 23:36:01 +01:00
goto ok ;
2013-04-08 04:01:30 +00:00
} else if ( ipv6_generate_eui64 ( addr . s6_addr + 8 , dev ) & &
ipv6_inherit_eui64 ( addr . s6_addr + 8 , in6_dev ) ) {
2016-06-15 21:20:22 +02:00
goto put ;
2016-06-15 21:20:23 +02:00
} else {
dev_addr_generated = true ;
2005-04-16 15:20:36 -07:00
}
goto ok ;
}
2012-05-13 21:56:26 +00:00
net_dbg_ratelimited ( " IPv6 addrconf: prefix with wrong length %d \n " ,
pinfo - > prefix_len ) ;
2016-06-15 21:20:22 +02:00
goto put ;
2005-04-16 15:20:36 -07:00
ok :
2016-06-15 21:20:22 +02:00
err = addrconf_prefix_rcv_add_addr ( net , dev , pinfo , in6_dev ,
& addr , addr_type ,
addr_flags , sllao ,
tokenized , valid_lft ,
prefered_lft ) ;
if ( err )
goto put ;
2016-06-15 21:20:23 +02:00
/* Ignore errors here: the preceding prefix add addr call was
 * successful, and that address will be notified.
 */
ndisc_ops_prefix_rcv_add_addr ( net , dev , pinfo , in6_dev , & addr ,
addr_type , addr_flags , sllao ,
tokenized , valid_lft ,
prefered_lft ,
dev_addr_generated ) ;
2005-04-16 15:20:36 -07:00
}
inet6_prefix_notify ( RTM_NEWPREFIX , in6_dev , pinfo ) ;
2016-06-15 21:20:22 +02:00
put :
2005-04-16 15:20:36 -07:00
in6_dev_put ( in6_dev ) ;
}
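The tokenized branch of addrconf_prefix_rcv() above builds the address by taking the upper 8 bytes from the advertised /64 prefix and the lower 8 bytes from the interface token. A minimal userspace sketch of that byte layout follows. The helper names slaac_combine() and slaac_combine_str() are hypothetical and are not kernel functions; the sketch assumes a POSIX libc and leaves out locking, DAD, and lifetimes.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper (not a kernel function): upper 64 bits come from
 * the advertised prefix, lower 64 bits from the configured token.
 */
static void slaac_combine(struct in6_addr *out,
			  const struct in6_addr *prefix,
			  const struct in6_addr *token)
{
	memcpy(out->s6_addr, prefix->s6_addr, 8);
	memcpy(out->s6_addr + 8, token->s6_addr + 8, 8);
}

/* Text-form wrapper, also hypothetical. Returns 0 on success and -1 if
 * either input does not parse or the result cannot be formatted.
 */
static int slaac_combine_str(const char *prefix, const char *token,
			     char *buf, socklen_t len)
{
	struct in6_addr p, t, a;

	if (inet_pton(AF_INET6, prefix, &p) != 1 ||
	    inet_pton(AF_INET6, token, &t) != 1)
		return -1;

	slaac_combine(&a, &p, &t);
	return inet_ntop(AF_INET6, &a, buf, len) ? 0 : -1;
}
```

With the ::53 token from the commit message and an example prefix of 2001:db8:1:2::, the combined address is 2001:db8:1:2::53.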
2020-05-19 15:03:18 +02:00
static int addrconf_set_sit_dstaddr ( struct net * net , struct net_device * dev ,
struct in6_ifreq * ireq )
{
struct ip_tunnel_parm p = { } ;
int err ;
if ( ! ( ipv6_addr_type ( & ireq - > ifr6_addr ) & IPV6_ADDR_COMPATv4 ) )
return - EADDRNOTAVAIL ;
p . iph . daddr = ireq - > ifr6_addr . s6_addr32 [ 3 ] ;
p . iph . version = 4 ;
p . iph . ihl = 5 ;
p . iph . protocol = IPPROTO_IPV6 ;
p . iph . ttl = 64 ;
2020-05-19 15:03:19 +02:00
if ( ! dev - > netdev_ops - > ndo_tunnel_ctl )
2020-05-19 15:03:18 +02:00
return - EOPNOTSUPP ;
2020-05-19 15:03:19 +02:00
err = dev - > netdev_ops - > ndo_tunnel_ctl ( dev , & p , SIOCADDTUNNEL ) ;
2020-05-19 15:03:18 +02:00
if ( err )
return err ;
dev = __dev_get_by_name ( net , p . name ) ;
if ( ! dev )
return - ENOBUFS ;
return dev_open ( dev , NULL ) ;
}
2005-04-16 15:20:36 -07:00
/*
* Set destination address .
* Special case for SIT interfaces where we create a new " virtual "
* device .
*/
2008-03-05 10:46:57 -08:00
int addrconf_set_dstaddr ( struct net * net , void __user * arg )
2005-04-16 15:20:36 -07:00
{
struct net_device * dev ;
2020-05-19 15:03:18 +02:00
struct in6_ifreq ireq ;
int err = - ENODEV ;
2005-04-16 15:20:36 -07:00
2020-05-19 15:03:17 +02:00
if ( ! IS_ENABLED ( CONFIG_IPV6_SIT ) )
return - ENODEV ;
2005-04-16 15:20:36 -07:00
if ( copy_from_user ( & ireq , arg , sizeof ( struct in6_ifreq ) ) )
2020-05-19 15:03:18 +02:00
return - EFAULT ;
2005-04-16 15:20:36 -07:00
2020-05-19 15:03:18 +02:00
rtnl_lock ( ) ;
2008-03-05 10:46:57 -08:00
dev = __dev_get_by_index ( net , ireq . ifr6_ifindex ) ;
2020-05-19 15:03:18 +02:00
if ( dev & & dev - > type = = ARPHRD_SIT )
err = addrconf_set_sit_dstaddr ( net , dev , & ireq ) ;
2005-04-16 15:20:36 -07:00
rtnl_unlock ( ) ;
return err ;
}
2015-02-25 09:58:35 -08:00
static int ipv6_mc_config ( struct sock * sk , bool join ,
const struct in6_addr * addr , int ifindex )
{
int ret ;
ASSERT_RTNL ( ) ;
lock_sock ( sk ) ;
if ( join )
ipv4, ipv6: kill ip_mc_{join, leave}_group and ipv6_sock_mc_{join, drop}
in favor of their inner __ ones, which don't grab rtnl.
As these functions need to operate on a locked socket, we can't be
grabbing rtnl by then. It's too late and doing so causes reversed
locking.
So this patch:
- move rtnl handling to callers instead while already fixing some
reversed locking situations, like on vxlan and ipvs code.
- renames __ ones to not have the __ mark:
__ip_mc_{join,leave}_group -> ip_mc_{join,leave}_group
__ipv6_sock_mc_{join,drop} -> ipv6_sock_mc_{join,drop}
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 14:50:43 -03:00
ret = ipv6_sock_mc_join ( sk , ifindex , addr ) ;
2015-02-25 09:58:35 -08:00
else
2015-03-18 14:50:43 -03:00
ret = ipv6_sock_mc_drop ( sk , ifindex , addr ) ;
2015-02-25 09:58:35 -08:00
release_sock ( sk ) ;
return ret ;
}
2005-04-16 15:20:36 -07:00
/*
* Manual configuration of address on an interface
*/
2013-12-06 09:45:21 +01:00
static int inet6_addr_add ( struct net * net , int ifindex ,
2018-05-27 08:09:54 -07:00
struct ifa6_config * cfg ,
2017-10-18 09:56:54 -07:00
struct netlink_ext_ack * extack )
2005-04-16 15:20:36 -07:00
{
struct inet6_ifaddr * ifp ;
struct inet6_dev * idev ;
struct net_device * dev ;
2015-02-25 09:58:35 -08:00
unsigned long timeout ;
clock_t expires ;
2008-05-19 16:56:11 -07:00
u32 flags ;
2005-04-16 15:20:36 -07:00
ASSERT_RTNL ( ) ;
2007-02-09 23:24:49 +09:00
2023-07-26 10:39:05 +08:00
if ( cfg - > plen > 128 ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid prefix length " ) ;
2008-05-28 16:54:22 +02:00
return - EINVAL ;
2023-07-26 10:39:05 +08:00
}
2008-05-28 16:54:22 +02:00
2006-07-28 18:12:10 +09:00
/* check the lifetime */
2023-07-26 10:39:05 +08:00
if ( ! cfg - > valid_lft | | cfg - > preferred_lft > cfg - > valid_lft ) {
NL_SET_ERR_MSG_MOD ( extack , " address lifetime invalid " ) ;
2006-07-28 18:12:10 +09:00
return - EINVAL ;
2023-07-26 10:39:05 +08:00
}
2006-07-28 18:12:10 +09:00
2023-07-26 10:39:05 +08:00
if ( cfg - > ifa_flags & IFA_F_MANAGETEMPADDR & & cfg - > plen ! = 64 ) {
NL_SET_ERR_MSG_MOD ( extack , " address with \" mngtmpaddr \" flag must have a prefix length of 64 " ) ;
2013-12-06 09:45:22 +01:00
return - EINVAL ;
2023-07-26 10:39:05 +08:00
}
2013-12-06 09:45:22 +01:00
2008-03-05 10:46:57 -08:00
dev = __dev_get_by_index ( net , ifindex ) ;
if ( ! dev )
2005-04-16 15:20:36 -07:00
return - ENODEV ;
2007-02-09 23:24:49 +09:00
2010-07-20 10:34:30 +00:00
idev = addrconf_add_dev ( dev ) ;
2023-07-26 10:39:05 +08:00
if ( IS_ERR ( idev ) ) {
NL_SET_ERR_MSG_MOD ( extack , " IPv6 is disabled on this device " ) ;
2010-07-20 10:34:30 +00:00
return PTR_ERR ( idev ) ;
2023-07-26 10:39:05 +08:00
}
2005-04-16 15:20:36 -07:00
2018-05-27 08:09:54 -07:00
if ( cfg - > ifa_flags & IFA_F_MCAUTOJOIN ) {
2015-02-25 09:58:35 -08:00
int ret = ipv6_mc_config ( net - > ipv6 . mc_autojoin_sk ,
2018-05-27 08:09:54 -07:00
true , cfg - > pfx , ifindex ) ;
2015-02-25 09:58:35 -08:00
2023-07-26 10:39:05 +08:00
if ( ret < 0 ) {
NL_SET_ERR_MSG_MOD ( extack , " Multicast auto join failed " ) ;
2015-02-25 09:58:35 -08:00
return ret ;
2023-07-26 10:39:05 +08:00
}
2015-02-25 09:58:35 -08:00
}
2018-05-27 08:09:54 -07:00
cfg - > scope = ipv6_addr_scope ( cfg - > pfx ) ;
2005-04-16 15:20:36 -07:00
2018-05-27 08:09:54 -07:00
timeout = addrconf_timeout_fixup ( cfg - > valid_lft , HZ ) ;
2008-05-27 17:37:49 +09:00
if ( addrconf_finite_timeout ( timeout ) ) {
expires = jiffies_to_clock_t ( timeout * HZ ) ;
2018-05-27 08:09:54 -07:00
cfg - > valid_lft = timeout ;
2008-05-19 16:56:11 -07:00
flags = RTF_EXPIRES ;
2008-05-27 17:37:49 +09:00
} else {
expires = 0 ;
flags = 0 ;
2018-05-27 08:09:54 -07:00
cfg - > ifa_flags | = IFA_F_PERMANENT ;
2008-05-19 16:56:11 -07:00
}
2006-07-28 18:12:10 +09:00
2018-05-27 08:09:54 -07:00
timeout = addrconf_timeout_fixup ( cfg - > preferred_lft , HZ ) ;
2008-05-27 17:37:49 +09:00
if ( addrconf_finite_timeout ( timeout ) ) {
if ( timeout = = 0 )
2018-05-27 08:09:54 -07:00
cfg - > ifa_flags | = IFA_F_DEPRECATED ;
cfg - > preferred_lft = timeout ;
2008-05-27 17:37:49 +09:00
}
2006-07-28 18:12:10 +09:00
2018-05-27 08:09:54 -07:00
ifp = ipv6_add_addr ( idev , cfg , true , extack ) ;
2005-04-16 15:20:36 -07:00
if ( ! IS_ERR ( ifp ) ) {
2018-05-27 08:09:54 -07:00
if ( ! ( cfg - > ifa_flags & IFA_F_NOPREFIXROUTE ) ) {
2018-05-27 08:09:58 -07:00
addrconf_prefix_route ( & ifp - > addr , ifp - > prefix_len ,
ifp - > rt_priority , dev , expires ,
flags , GFP_KERNEL ) ;
2014-01-15 15:36:58 +01:00
}
2018-04-17 11:54:39 +02:00
/* Send a netlink notification if DAD is enabled and
* optimistic flag is not set
*/
if ( ! ( ifp - > flags & ( IFA_F_OPTIMISTIC | IFA_F_NODAD ) ) )
ipv6_ifa_notify ( 0 , ifp ) ;
2007-04-25 17:08:10 -07:00
/*
* Note that section 3.1 of RFC 4429 indicates
* that the Optimistic flag should not be set for
* manually configured addresses
*/
2012-04-14 21:37:40 -04:00
addrconf_dad_start ( ifp ) ;
2018-05-27 08:09:54 -07:00
if ( cfg - > ifa_flags & IFA_F_MANAGETEMPADDR )
manage_tempaddrs ( idev , ifp , cfg - > valid_lft ,
cfg - > preferred_lft , true , jiffies ) ;
2005-04-16 15:20:36 -07:00
in6_ifa_put ( ifp ) ;
2022-02-07 20:50:29 -08:00
addrconf_verify_rtnl ( net ) ;
2005-04-16 15:20:36 -07:00
return 0 ;
2018-05-27 08:09:54 -07:00
} else if ( cfg - > ifa_flags & IFA_F_MCAUTOJOIN ) {
ipv6_mc_config ( net - > ipv6 . mc_autojoin_sk , false ,
cfg - > pfx , ifindex ) ;
2005-04-16 15:20:36 -07:00
}
return PTR_ERR ( ifp ) ;
}
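The argument checks at the top of inet6_addr_add() can be read in isolation as a small predicate. The sketch below mirrors only those three checks in userspace; check_addr_cfg() and its manage_temp parameter are hypothetical stand-ins for the cfg fields and the IFA_F_MANAGETEMPADDR flag, and the INFINITY_LIFE_TIME value is assumed to match the kernel's definition.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed to match the kernel's INFINITY_LIFE_TIME */
#define INFINITY_LIFE_TIME 0xFFFFFFFFu

/* Hypothetical userspace mirror of the early checks in inet6_addr_add():
 * prefix length, lifetimes, and the /64 requirement for managed
 * temporary addresses. Returns 0 or -EINVAL.
 */
static int check_addr_cfg(unsigned int plen, uint32_t valid_lft,
			  uint32_t preferred_lft, int manage_temp)
{
	if (plen > 128)
		return -EINVAL;

	/* A zero valid lifetime, or preferred > valid, is rejected */
	if (!valid_lft || preferred_lft > valid_lft)
		return -EINVAL;

	/* "mngtmpaddr" only makes sense on a /64 */
	if (manage_temp && plen != 64)
		return -EINVAL;

	return 0;
}
```

For instance, a /64 with both lifetimes set to INFINITY_LIFE_TIME passes, while a /96 with the managed-temporary flag set is rejected.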
2014-04-20 21:29:36 +02:00
static int inet6_addr_del ( struct net * net , int ifindex , u32 ifa_flags ,
2023-07-26 10:39:05 +08:00
const struct in6_addr * pfx , unsigned int plen ,
struct netlink_ext_ack * extack )
2005-04-16 15:20:36 -07:00
{
struct inet6_ifaddr * ifp ;
struct inet6_dev * idev ;
struct net_device * dev ;
2007-02-09 23:24:49 +09:00
2023-07-26 10:39:05 +08:00
if ( plen > 128 ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid prefix length " ) ;
2008-05-28 16:54:22 +02:00
return - EINVAL ;
2023-07-26 10:39:05 +08:00
}
2008-05-28 16:54:22 +02:00
2008-03-05 10:46:57 -08:00
dev = __dev_get_by_index ( net , ifindex ) ;
2023-07-26 10:39:05 +08:00
if ( ! dev ) {
NL_SET_ERR_MSG_MOD ( extack , " Unable to find the interface " ) ;
2005-04-16 15:20:36 -07:00
return - ENODEV ;
2023-07-26 10:39:05 +08:00
}
2005-04-16 15:20:36 -07:00
2014-11-23 21:28:43 +00:00
idev = __in6_dev_get ( dev ) ;
2023-07-26 10:39:05 +08:00
if ( ! idev ) {
NL_SET_ERR_MSG_MOD ( extack , " IPv6 is disabled on this device " ) ;
2005-04-16 15:20:36 -07:00
return - ENXIO ;
2023-07-26 10:39:05 +08:00
}
2005-04-16 15:20:36 -07:00
read_lock_bh ( & idev - > lock ) ;
2010-03-17 20:31:13 +00:00
list_for_each_entry ( ifp , & idev - > addr_list , if_list ) {
2005-04-16 15:20:36 -07:00
if ( ifp - > prefix_len = = plen & &
ipv6_addr_equal ( pfx , & ifp - > addr ) ) {
in6_ifa_hold ( ifp ) ;
read_unlock_bh ( & idev - > lock ) ;
2007-02-09 23:24:49 +09:00
2014-04-20 21:29:36 +02:00
if ( ! ( ifp - > flags & IFA_F_TEMPORARY ) & &
( ifa_flags & IFA_F_MANAGETEMPADDR ) )
manage_tempaddrs ( idev , ifp , 0 , 0 , false ,
jiffies ) ;
2005-04-16 15:20:36 -07:00
ipv6_del_addr ( ifp ) ;
2022-02-07 20:50:29 -08:00
addrconf_verify_rtnl ( net ) ;
2015-02-25 09:58:35 -08:00
if ( ipv6_addr_is_multicast ( pfx ) ) {
ipv6_mc_config ( net - > ipv6 . mc_autojoin_sk ,
false , pfx , dev - > ifindex ) ;
}
2005-04-16 15:20:36 -07:00
return 0 ;
}
}
read_unlock_bh ( & idev - > lock ) ;
2023-07-26 10:39:05 +08:00
NL_SET_ERR_MSG_MOD ( extack , " address not found " ) ;
2005-04-16 15:20:36 -07:00
return - EADDRNOTAVAIL ;
}
2008-03-05 10:46:57 -08:00
int addrconf_add_ifaddr ( struct net * net , void __user * arg )
2005-04-16 15:20:36 -07:00
{
2018-05-27 08:09:54 -07:00
struct ifa6_config cfg = {
. ifa_flags = IFA_F_PERMANENT ,
. preferred_lft = INFINITY_LIFE_TIME ,
. valid_lft = INFINITY_LIFE_TIME ,
} ;
2005-04-16 15:20:36 -07:00
struct in6_ifreq ireq ;
int err ;
2007-02-09 23:24:49 +09:00
net: Allow userns root to control ipv6
Allow an unprivileged user who has created a user namespace, and then
created a network namespace to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
CAP_NET_ADMIN), or ns_capable(net->user_ns, CAP_NET_RAW) calls.
Settings that merely control a single network device are allowed.
Either the network device is a logical network device where
restrictions make no difference or the network device is a hardware NIC
that has been explicitly moved from the initial network namespace.
In general policy and network stack state changes are allowed while
resource control is left unchanged.
Allow the SIOCSIFADDR ioctl to add ipv6 addresses.
Allow the SIOCDIFADDR ioctl to delete ipv6 addresses.
Allow the SIOCADDRT ioctl to add ipv6 routes.
Allow the SIOCDELRT ioctl to delete ipv6 routes.
Allow creation of ipv6 raw sockets.
Allow setting the IPV6_JOIN_ANYCAST socket option.
Allow setting the IPV6_FL_A_RENEW parameter of the IPV6_FLOWLABEL_MGR
socket option.
Allow setting the IPV6_TRANSPARENT socket option.
Allow setting the IPV6_HOPOPTS socket option.
Allow setting the IPV6_RTHDRDSTOPTS socket option.
Allow setting the IPV6_DSTOPTS socket option.
Allow setting the IPV6_IPSEC_POLICY socket option.
Allow setting the IPV6_XFRM_POLICY socket option.
Allow sending packets with the IPV6_2292HOPOPTS control message.
Allow sending packets with the IPV6_2292DSTOPTS control message.
Allow sending packets with the IPV6_RTHDRDSTOPTS control message.
Allow setting the multicast routing socket options on non multicast
routing sockets.
Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, and SIOCDELTUNNEL ioctls for
setting up, changing and deleting tunnels over ipv6.
Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, SIOCDELTUNNEL ioctls for
setting up, changing and deleting ipv6 over ipv4 tunnels.
Allow the SIOCADDPRL, SIOCDELPRL, SIOCCHGPRL ioctls for adding,
deleting, and changing the potential router list for ISATAP tunnels.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-16 03:03:06 +00:00
if ( ! ns_capable ( net - > user_ns , CAP_NET_ADMIN ) )
2005-04-16 15:20:36 -07:00
return - EPERM ;
2007-02-09 23:24:49 +09:00
2005-04-16 15:20:36 -07:00
if ( copy_from_user ( & ireq , arg , sizeof ( struct in6_ifreq ) ) )
return - EFAULT ;
2018-05-27 08:09:54 -07:00
cfg . pfx = & ireq . ifr6_addr ;
cfg . plen = ireq . ifr6_prefixlen ;
2005-04-16 15:20:36 -07:00
rtnl_lock ( ) ;
2018-05-27 08:09:54 -07:00
err = inet6_addr_add ( net , ireq . ifr6_ifindex , & cfg , NULL ) ;
2005-04-16 15:20:36 -07:00
rtnl_unlock ( ) ;
return err ;
}
2008-03-05 10:46:57 -08:00
int addrconf_del_ifaddr ( struct net * net , void __user * arg )
2005-04-16 15:20:36 -07:00
{
struct in6_ifreq ireq ;
int err ;
2007-02-09 23:24:49 +09:00
2012-11-16 03:03:06 +00:00
if ( ! ns_capable ( net - > user_ns , CAP_NET_ADMIN ) )
2005-04-16 15:20:36 -07:00
return - EPERM ;
if ( copy_from_user ( & ireq , arg , sizeof ( struct in6_ifreq ) ) )
return - EFAULT ;
rtnl_lock ( ) ;
2014-04-20 21:29:36 +02:00
err = inet6_addr_del ( net , ireq . ifr6_ifindex , 0 , & ireq . ifr6_addr ,
2023-07-26 10:39:05 +08:00
ireq . ifr6_prefixlen , NULL ) ;
2005-04-16 15:20:36 -07:00
rtnl_unlock ( ) ;
return err ;
}
2009-02-06 23:48:01 -08:00
static void add_addr ( struct inet6_dev * idev , const struct in6_addr * addr ,
2022-02-17 16:02:02 +01:00
int plen , int scope , u8 proto )
2009-02-06 23:48:01 -08:00
{
struct inet6_ifaddr * ifp ;
2018-05-27 08:09:53 -07:00
struct ifa6_config cfg = {
. pfx = addr ,
. plen = plen ,
. ifa_flags = IFA_F_PERMANENT ,
. valid_lft = INFINITY_LIFE_TIME ,
. preferred_lft = INFINITY_LIFE_TIME ,
2022-02-17 16:02:02 +01:00
. scope = scope ,
. ifa_proto = proto
2018-05-27 08:09:53 -07:00
} ;
2009-02-06 23:48:01 -08:00
2018-05-27 08:09:53 -07:00
ifp = ipv6_add_addr ( idev , & cfg , true , NULL ) ;
2009-02-06 23:48:01 -08:00
if ( ! IS_ERR ( ifp ) ) {
spin_lock_bh ( & ifp - > lock ) ;
ifp - > flags & = ~ IFA_F_TENTATIVE ;
spin_unlock_bh ( & ifp - > lock ) ;
2016-11-22 16:57:40 +01:00
rt_genid_bump_ipv6 ( dev_net ( idev - > dev ) ) ;
2009-02-06 23:48:01 -08:00
ipv6_ifa_notify ( RTM_NEWADDR , ifp ) ;
in6_ifa_put ( ifp ) ;
}
}
2021-09-03 18:58:42 +02:00
# if IS_ENABLED(CONFIG_IPV6_SIT) || IS_ENABLED(CONFIG_NET_IPGRE) || IS_ENABLED(CONFIG_IPV6_GRE)
static void add_v4_addrs ( struct inet6_dev * idev )
2005-04-16 15:20:36 -07:00
{
struct in6_addr addr ;
struct net_device * dev ;
2008-03-25 21:47:49 +09:00
struct net * net = dev_net ( idev - > dev ) ;
2021-09-03 18:58:42 +02:00
int scope , plen , offset = 0 ;
2013-11-14 13:51:06 +01:00
u32 pflags = 0 ;
2005-04-16 15:20:36 -07:00
ASSERT_RTNL ( ) ;
memset ( & addr , 0 , sizeof ( struct in6_addr ) ) ;
2021-09-03 18:58:42 +02:00
/* in case of IP6GRE the dev_addr is an IPv6 address and therefore we use only the last 4 bytes */
if ( idev - > dev - > addr_len = = sizeof ( struct in6_addr ) )
offset = sizeof ( struct in6_addr ) - 4 ;
memcpy ( & addr . s6_addr32 [ 3 ] , idev - > dev - > dev_addr + offset , 4 ) ;
2005-04-16 15:20:36 -07:00
2023-01-31 16:46:46 +13:00
if ( ! ( idev - > dev - > flags & IFF_POINTOPOINT ) & & idev - > dev - > type = = ARPHRD_SIT ) {
scope = IPV6_ADDR_COMPATv4 ;
plen = 96 ;
pflags | = RTF_NONEXTHOP ;
} else {
2021-10-20 16:06:18 -04:00
if ( idev - > cnf . addr_gen_mode = = IN6_ADDR_GEN_MODE_NONE )
return ;
2005-04-16 15:20:36 -07:00
addr . s6_addr32 [ 0 ] = htonl ( 0xfe800000 ) ;
scope = IFA_LINK ;
2013-11-14 13:51:05 +01:00
plen = 64 ;
2005-04-16 15:20:36 -07:00
}
if ( addr . s6_addr32 [ 3 ] ) {
2022-02-17 16:02:02 +01:00
add_addr ( idev , & addr , plen , scope , IFAPROT_UNSPEC ) ;
2018-05-27 08:09:58 -07:00
addrconf_prefix_route ( & addr , plen , 0 , idev - > dev , 0 , pflags ,
2018-08-22 12:58:34 -07:00
GFP_KERNEL ) ;
2005-04-16 15:20:36 -07:00
return ;
}
2008-03-05 10:47:47 -08:00
for_each_netdev ( net , dev ) {
2012-04-01 07:49:08 +00:00
struct in_device * in_dev = __in_dev_get_rtnl ( dev ) ;
2005-04-16 15:20:36 -07:00
if ( in_dev & & ( dev - > flags & IFF_UP ) ) {
2012-04-01 07:49:08 +00:00
struct in_ifaddr * ifa ;
2005-04-16 15:20:36 -07:00
int flag = scope ;
2019-05-31 18:27:07 +02:00
in_dev_for_each_ifa_rtnl ( ifa , in_dev ) {
2005-04-16 15:20:36 -07:00
addr . s6_addr32 [ 3 ] = ifa - > ifa_local ;
if ( ifa - > ifa_scope = = RT_SCOPE_LINK )
continue ;
if ( ifa - > ifa_scope > = RT_SCOPE_HOST ) {
if ( idev - > dev - > flags & IFF_POINTOPOINT )
continue ;
flag | = IFA_HOST ;
}
2022-02-17 16:02:02 +01:00
add_addr ( idev , & addr , plen , flag ,
IFAPROT_UNSPEC ) ;
2018-05-27 08:09:58 -07:00
addrconf_prefix_route ( & addr , plen , 0 , idev - > dev ,
2018-08-22 12:58:34 -07:00
0 , pflags , GFP_KERNEL ) ;
2005-04-16 15:20:36 -07:00
}
}
2007-02-09 23:24:49 +09:00
}
2005-04-16 15:20:36 -07:00
}
2006-10-10 14:49:53 -07:00
# endif
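add_v4_addrs() above places an IPv4 address in the last 32 bits of an IPv6 address and then picks one of two shapes: an IPv4-compatible ::a.b.c.d address with a /96 prefix for non-point-to-point SIT devices, or a link-local fe80::a.b.c.d address with a /64 prefix otherwise. A minimal userspace sketch of that layout follows. The embed_v4() helper and its link_local switch are hypothetical; the sketch ignores scope flags, routes, and the per-device checks.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper (not a kernel function): embed an IPv4 address in
 * the low 32 bits of an IPv6 address. With link_local set, the fe80::/64
 * prefix is applied; otherwise the result is IPv4-compatible with a /96.
 * Returns 0 on success and -1 on parse or formatting failure.
 */
static int embed_v4(const char *v4, int link_local,
		    char *buf, socklen_t len, int *plen)
{
	struct in_addr a4;
	struct in6_addr a6;

	if (inet_pton(AF_INET, v4, &a4) != 1)
		return -1;

	memset(&a6, 0, sizeof(a6));
	/* s_addr is already in network byte order, so copy it as-is */
	memcpy(a6.s6_addr + 12, &a4.s_addr, 4);

	if (link_local) {
		a6.s6_addr[0] = 0xfe;
		a6.s6_addr[1] = 0x80;
		*plen = 64;
	} else {
		*plen = 96;
	}

	return inet_ntop(AF_INET6, &a6, buf, len) ? 0 : -1;
}
```

As a usage example, embed_v4("192.0.2.1", 1, ...) yields fe80::c000:201 with a prefix length of 64.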
2005-04-16 15:20:36 -07:00
static void init_loopback ( struct net_device * dev )
{
struct inet6_dev * idev ;
/* ::1 */
ASSERT_RTNL ( ) ;
2014-11-23 21:28:43 +00:00
idev = ipv6_find_idev ( dev ) ;
2019-08-23 15:44:36 +02:00
if ( IS_ERR ( idev ) ) {
2012-05-15 14:11:54 +00:00
pr_debug ( " %s: add_dev failed \n " , __func__ ) ;
2005-04-16 15:20:36 -07:00
return ;
}
2022-02-17 16:02:02 +01:00
add_addr ( idev , & in6addr_loopback , 128 , IFA_HOST , IFAPROT_KERNEL_LO ) ;
2005-04-16 15:20:36 -07:00
}
2016-06-15 21:20:17 +02:00
void addrconf_add_linklocal ( struct inet6_dev * idev ,
const struct in6_addr * addr , u32 flags )
2005-04-16 15:20:36 -07:00
{
2018-05-27 08:09:53 -07:00
struct ifa6_config cfg = {
. pfx = addr ,
. plen = 64 ,
. ifa_flags = flags | IFA_F_PERMANENT ,
. valid_lft = INFINITY_LIFE_TIME ,
. preferred_lft = INFINITY_LIFE_TIME ,
2022-02-17 16:02:02 +01:00
. scope = IFA_LINK ,
. ifa_proto = IFAPROT_KERNEL_LL
2018-05-27 08:09:53 -07:00
} ;
2012-04-01 07:49:08 +00:00
struct inet6_ifaddr * ifp ;
2007-04-25 17:08:10 -07:00
# ifdef CONFIG_IPV6_OPTIMISTIC_DAD
ipv6: fix net.ipv6.conf.all interface DAD handlers
Currently, writing into
net.ipv6.conf.all.{accept_dad,use_optimistic,optimistic_dad} has no effect.
Fix handling of these flags by:
- using the maximum of global and per-interface values for the
accept_dad flag. That is, if at least one of the two values is
non-zero, enable DAD on the interface. If at least one value is
set to 2, enable DAD and disable IPv6 operation on the interface if
a MAC-based duplicate link-local address is found
- using the logical OR of global and per-interface values for the
optimistic_dad flag. If at least one of them is set to one, optimistic
duplicate address detection (RFC 4429) is enabled on the interface
- using the logical OR of global and per-interface values for the
use_optimistic flag. If at least one of them is set to one,
optimistic addresses won't be marked as deprecated during source address
selection on the interface.
While at it, as we're modifying the prototype for ipv6_use_optimistic_addr(),
drop inline, and let the compiler decide.
Fixes: 7fd2561e4ebd ("net: ipv6: Add a sysctl to make optimistic addresses useful candidates")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
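The combination rules in this commit message can be sketched in plain C. This is a minimal illustration under stated assumptions, not the kernel's code: the function names `effective_accept_dad`, `effective_optimistic_dad`, and `effective_use_optimistic` are hypothetical, and the integer arguments stand in for the `devconf_all` value and the per-interface `cnf` value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers; the names are assumptions, not kernel API. */

/* accept_dad: the effective value is the maximum of the global
 * ("all") value and the per-interface value. */
int effective_accept_dad(int all_val, int iface_val)
{
	return all_val > iface_val ? all_val : iface_val;
}

/* optimistic_dad: logical OR of the global and per-interface values. */
bool effective_optimistic_dad(int all_val, int iface_val)
{
	return all_val || iface_val;
}

/* use_optimistic: logical OR of the global and per-interface values. */
bool effective_use_optimistic(int all_val, int iface_val)
{
	return all_val || iface_val;
}

/* Small self-check showing how the helpers are meant to be called. */
void dad_flags_self_check(void)
{
	assert(effective_accept_dad(0, 2) == 2);
	assert(effective_optimistic_dad(1, 0));
	assert(!effective_use_optimistic(0, 0));
}
```

With these rules, setting a flag only under `net.ipv6.conf.all` is enough to enable the behaviour on every interface, which is the effect the commit describes.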
2017-09-12 17:46:37 +02:00
if ( ( dev_net ( idev - > dev ) - > ipv6 . devconf_all - > optimistic_dad | |
idev - > cnf . optimistic_dad ) & &
2008-07-20 18:17:02 -07:00
! dev_net ( idev - > dev ) - > ipv6 . devconf_all - > forwarding )
2018-05-27 08:09:53 -07:00
cfg . ifa_flags | = IFA_F_OPTIMISTIC ;
2007-04-25 17:08:10 -07:00
# endif
2005-04-16 15:20:36 -07:00
2018-05-27 08:09:53 -07:00
ifp = ipv6_add_addr ( idev , & cfg , true , NULL ) ;
2005-04-16 15:20:36 -07:00
if ( ! IS_ERR ( ifp ) ) {
2018-05-27 08:09:58 -07:00
addrconf_prefix_route ( & ifp - > addr , ifp - > prefix_len , 0 , idev - > dev ,
2018-04-17 17:33:22 -07:00
0 , 0 , GFP_ATOMIC ) ;
2012-04-14 21:37:40 -04:00
addrconf_dad_start ( ifp ) ;
2005-04-16 15:20:36 -07:00
in6_ifa_put ( ifp ) ;
}
}
2016-06-15 21:20:17 +02:00
EXPORT_SYMBOL_GPL ( addrconf_add_linklocal ) ;
2005-04-16 15:20:36 -07:00
ipv6: generation of stable privacy addresses for link-local and autoconf
This patch implements the stable privacy address generation for
link-local and autoconf addresses as specified in RFC7217.
RID = F(Prefix, Net_Iface, Network_ID, DAD_Counter, secret_key)
is the RID (random identifier). As the hash function F we chose one
round of sha1. Prefix will be either the link-local prefix or the
router advertised one. As Net_Iface we use the MAC address of the
device. DAD_Counter and secret_key are implemented as specified.
We don't use Network_ID, as it couples the code too closely to other
subsystems. It is specified as optional in the RFC.
As Net_Iface we only use the MAC address: we simply have no stable
identifier in the kernel we could possibly use: because this code might
run very early, we cannot depend on names, as they might be changed by
user space early on during the boot process.
A new address generation mode is introduced,
IN6_ADDR_GEN_MODE_STABLE_PRIVACY. With iproute2 one can switch back to
none or eui64 address configuration mode although the stable_secret is
already set.
We refuse writes to ipv6/conf/all/stable_secret and allow only
ipv6/conf/default/stable_secret and the interface-specific file to be
written to. The default stable_secret is used as the parameter for the
namespace; the interface-specific one can override it, e.g. when
switching a network configuration from one system to another while
inheriting the secret.
Cc: Erik Kline <ek@google.com>
Cc: Fernando Gont <fgont@si6networks.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
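The generation scheme described in this commit message can be sketched as a user-space C program fragment. It is an illustration only: `placeholder_f` is a 64-bit FNV-1a hash swapped in for brevity, not the SHA-1 round the kernel uses, and it is not cryptographically secure. The names `rid_input`, `iid_is_reserved`, `generate_iid`, and `SKETCH_HWADDR_LEN` are hypothetical. The reserved-identifier masks mirror the checks in `ipv6_reserved_interfaceid()`, expressed in host byte order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HWADDR_LEN 32	/* assumed buffer size for this sketch */

/* Input block for F(): secret, prefix, hardware address, DAD counter.
 * All members are bytes, so the struct has no padding. */
struct rid_input {
	uint8_t secret[16];
	uint8_t prefix[8];
	uint8_t hwaddr[SKETCH_HWADDR_LEN];
	uint8_t dad_count;
};

/* Placeholder for F(): 64-bit FNV-1a. NOT SHA-1, NOT secure. */
static uint64_t placeholder_f(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint64_t h = 0xcbf29ce484222325ULL;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Same three checks as ipv6_reserved_interfaceid(), on a 64-bit IID. */
bool iid_is_reserved(uint64_t iid)
{
	uint32_t hi = (uint32_t)(iid >> 32);
	uint32_t lo = (uint32_t)iid;

	if ((hi | lo) == 0)
		return true;
	if (hi == 0x02005effu && (lo & 0xfe000000u) == 0xfe000000u)
		return true;
	if (hi == 0xfdffffffu && (lo & 0xffffff80u) == 0xffffff80u)
		return true;
	return false;
}

/* Returns 0 and stores the identifier, or -1 if retries are exhausted. */
int generate_iid(const uint8_t secret[16], const uint8_t prefix[8],
		 const uint8_t *hwaddr, size_t hwaddr_len,
		 uint8_t dad_count, uint8_t max_retries, uint64_t *iid_out)
{
	if (hwaddr_len > SKETCH_HWADDR_LEN)
		return -1;

	for (;;) {
		struct rid_input in;
		uint64_t iid;

		memset(&in, 0, sizeof(in));
		memcpy(in.secret, secret, sizeof(in.secret));
		memcpy(in.prefix, prefix, sizeof(in.prefix));
		memcpy(in.hwaddr, hwaddr, hwaddr_len);
		in.dad_count = dad_count;

		iid = placeholder_f(&in, sizeof(in));
		if (!iid_is_reserved(iid)) {
			*iid_out = iid;
			return 0;
		}
		/* Reserved identifier: bump the counter and hash again. */
		dad_count++;
		if (dad_count > max_retries)
			return -1;
	}
}
```

The same inputs always produce the same identifier, while a different DAD counter, prefix, or secret produces a different one. That is the property that makes the address stable per network yet unlinkable across networks.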
2015-03-23 23:36:01 +01:00
static bool ipv6_reserved_interfaceid ( struct in6_addr address )
{
if ( ( address . s6_addr32 [ 2 ] | address . s6_addr32 [ 3 ] ) = = 0 )
return true ;
if ( address . s6_addr32 [ 2 ] = = htonl ( 0x02005eff ) & &
( ( address . s6_addr32 [ 3 ] & htonl ( 0xfe000000 ) ) = = htonl ( 0xfe000000 ) ) )
return true ;
if ( address . s6_addr32 [ 2 ] = = htonl ( 0xfdffffff ) & &
( ( address . s6_addr32 [ 3 ] & htonl ( 0xffffff80 ) ) = = htonl ( 0xffffff80 ) ) )
return true ;
return false ;
}
static int ipv6_generate_stable_address ( struct in6_addr * address ,
u8 dad_count ,
const struct inet6_dev * idev )
{
static DEFINE_SPINLOCK ( lock ) ;
2020-05-02 11:24:25 -07:00
static __u32 digest [ SHA1_DIGEST_WORDS ] ;
static __u32 workspace [ SHA1_WORKSPACE_WORDS ] ;
2015-03-23 23:36:01 +01:00
static union {
2020-05-02 11:24:25 -07:00
char __data [ SHA1_BLOCK_SIZE ] ;
2015-03-23 23:36:01 +01:00
struct {
struct in6_addr secret ;
2015-03-24 11:05:28 +01:00
__be32 prefix [ 2 ] ;
2015-03-23 23:36:01 +01:00
unsigned char hwaddr [ MAX_ADDR_LEN ] ;
u8 dad_count ;
} __packed ;
} data ;
struct in6_addr secret ;
struct in6_addr temp ;
struct net * net = dev_net ( idev - > dev ) ;
BUILD_BUG_ON ( sizeof ( data . __data ) ! = sizeof ( data ) ) ;
if ( idev - > cnf . stable_secret . initialized )
secret = idev - > cnf . stable_secret . secret ;
else if ( net - > ipv6 . devconf_dflt - > stable_secret . initialized )
secret = net - > ipv6 . devconf_dflt - > stable_secret . secret ;
else
return - 1 ;
retry :
spin_lock_bh ( & lock ) ;
2020-05-02 11:24:25 -07:00
sha1_init ( digest ) ;
2015-03-23 23:36:01 +01:00
memset ( & data , 0 , sizeof ( data ) ) ;
memset ( workspace , 0 , sizeof ( workspace ) ) ;
memcpy ( data . hwaddr , idev - > dev - > perm_addr , idev - > dev - > addr_len ) ;
2015-03-24 11:05:28 +01:00
data . prefix [ 0 ] = address - > s6_addr32 [ 0 ] ;
data . prefix [ 1 ] = address - > s6_addr32 [ 1 ] ;
2015-03-23 23:36:01 +01:00
data . secret = secret ;
data . dad_count = dad_count ;
2020-05-02 11:24:25 -07:00
sha1_transform ( digest , data . __data , workspace ) ;
2015-03-23 23:36:01 +01:00
temp = * address ;
2015-03-24 11:05:28 +01:00
temp . s6_addr32 [ 2 ] = ( __force __be32 ) digest [ 0 ] ;
temp . s6_addr32 [ 3 ] = ( __force __be32 ) digest [ 1 ] ;
2015-03-23 23:36:01 +01:00
spin_unlock_bh ( & lock ) ;
if ( ipv6_reserved_interfaceid ( temp ) ) {
dad_count + + ;
2015-03-23 23:36:05 +01:00
if ( dad_count > dev_net ( idev - > dev ) - > ipv6 . sysctl . idgen_retries )
2015-03-23 23:36:01 +01:00
return - 1 ;
goto retry ;
}
* address = temp ;
return 0 ;
}
2015-12-16 16:44:38 +01:00
static void ipv6_gen_mode_random_init ( struct inet6_dev * idev )
{
struct ipv6_stable_secret * s = & idev - > cnf . stable_secret ;
if ( s - > initialized )
return ;
get_random_bytes ( & s - > secret , sizeof ( s - > secret ) ) ;
s - > initialized = true ;
}
2014-07-11 21:10:18 +02:00
static void addrconf_addr_gen ( struct inet6_dev * idev , bool prefix_route )
{
2015-03-23 23:36:01 +01:00
struct in6_addr addr ;
2015-10-12 11:47:10 -07:00
/* no link local addresses on L3 master devices */
if ( netif_is_l3_master ( idev - > dev ) )
return ;
ipv6: don't auto-add link-local address to lag ports
Bonding slave and team port devices should not have link-local addresses
automatically added to them, as it can interfere with openvswitch being
able to properly add tc ingress.
Basic reproducer, courtesy of Marcelo:
$ ip link add name bond0 type bond
$ ip link set dev ens2f0np0 master bond0
$ ip link set dev ens2f1np2 master bond0
$ ip link set dev bond0 up
$ ip a s
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens2f0np0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc
mq master bond0 state UP group default qlen 1000
link/ether 00:0f:53:2f:ea:40 brd ff:ff:ff:ff:ff:ff
5: ens2f1np2: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc
mq master bond0 state DOWN group default qlen 1000
link/ether 00:0f:53:2f:ea:40 brd ff:ff:ff:ff:ff:ff
11: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc
noqueue state UP group default qlen 1000
link/ether 00:0f:53:2f:ea:40 brd ff:ff:ff:ff:ff:ff
inet6 fe80::20f:53ff:fe2f:ea40/64 scope link
valid_lft forever preferred_lft forever
(above trimmed to relevant entries, obviously)
$ sysctl net.ipv6.conf.ens2f0np0.addr_gen_mode=0
net.ipv6.conf.ens2f0np0.addr_gen_mode = 0
$ sysctl net.ipv6.conf.ens2f1np2.addr_gen_mode=0
net.ipv6.conf.ens2f1np2.addr_gen_mode = 0
$ ip a l ens2f0np0
2: ens2f0np0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc
mq master bond0 state UP group default qlen 1000
link/ether 00:0f:53:2f:ea:40 brd ff:ff:ff:ff:ff:ff
inet6 fe80::20f:53ff:fe2f:ea40/64 scope link tentative
valid_lft forever preferred_lft forever
$ ip a l ens2f1np2
5: ens2f1np2: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc
mq master bond0 state DOWN group default qlen 1000
link/ether 00:0f:53:2f:ea:40 brd ff:ff:ff:ff:ff:ff
inet6 fe80::20f:53ff:fe2f:ea40/64 scope link tentative
valid_lft forever preferred_lft forever
Looks like addrconf_sysctl_addr_gen_mode() bypasses the original "is
this a slave interface?" check added by commit c2edacf80e15, and
results in an address getting added, while w/the proposed patch added,
no address gets added. This simply adds the same gating check to another
code path, and thus should prevent the same devices from erroneously
obtaining an ipv6 link-local address.
Fixes: d35a00b8e33d ("net/ipv6: allow sysctl to change link-local address generation mode")
Reported-by: Moshe Levi <moshele@mellanox.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
CC: Marcelo Ricardo Leitner <mleitner@redhat.com>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:22:19 -04:00
/* no link local addresses on devices flagged as slaves */
2022-12-09 10:21:38 -05:00
if ( idev - > dev - > priv_flags & IFF_NO_ADDRCONF )
2020-03-30 11:22:19 -04:00
return ;
2015-03-23 23:36:01 +01:00
ipv6_addr_set ( & addr , htonl ( 0xFE800000 ) , 0 , 0 , 0 ) ;
2014-07-11 21:10:18 +02:00
2017-01-26 16:59:17 +13:00
switch ( idev - > cnf . addr_gen_mode ) {
2015-12-16 16:44:38 +01:00
case IN6_ADDR_GEN_MODE_RANDOM :
ipv6_gen_mode_random_init ( idev ) ;
2020-03-12 15:50:22 -07:00
fallthrough ;
2015-12-16 16:44:38 +01:00
case IN6_ADDR_GEN_MODE_STABLE_PRIVACY :
2015-03-23 23:36:01 +01:00
if ( ! ipv6_generate_stable_address ( & addr , 0 , idev ) )
2015-03-23 23:36:02 +01:00
addrconf_add_linklocal ( idev , & addr ,
IFA_F_STABLE_PRIVACY ) ;
2015-03-23 23:36:01 +01:00
else if ( prefix_route )
2018-05-27 08:09:58 -07:00
addrconf_prefix_route ( & addr , 64 , 0 , idev - > dev ,
2018-04-17 17:33:22 -07:00
0 , 0 , GFP_KERNEL ) ;
2015-12-16 16:44:38 +01:00
break ;
case IN6_ADDR_GEN_MODE_EUI64 :
2014-07-11 21:10:18 +02:00
/* addrconf_add_linklocal also adds a prefix_route and we
* only need to care about prefix routes if ipv6_generate_eui64
* couldn ' t generate one .
*/
if ( ipv6_generate_eui64 ( addr . s6_addr + 8 , idev - > dev ) = = 0 )
2015-03-23 23:36:02 +01:00
addrconf_add_linklocal ( idev , & addr , 0 ) ;
2014-07-11 21:10:18 +02:00
else if ( prefix_route )
2018-05-27 08:09:58 -07:00
addrconf_prefix_route ( & addr , 64 , 0 , idev - > dev ,
2018-04-18 15:39:06 -07:00
0 , 0 , GFP_KERNEL ) ;
2015-12-16 16:44:38 +01:00
break ;
case IN6_ADDR_GEN_MODE_NONE :
default :
/* will not add any link local address */
break ;
2014-07-11 21:10:18 +02:00
}
}
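The EUI-64 branch above fills the low 64 bits of the fe80::/64 address from the device's hardware address. As a rough userspace illustration only (not the kernel's ipv6_generate_eui64(), which handles many device types), the sketch below applies the modified EUI-64 rule from RFC 4291 Appendix A to a 48-bit MAC. The helper names eui64_from_mac() and linklocal_from_mac() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: derive a modified EUI-64 interface identifier
 * from a 48-bit MAC address (RFC 4291, Appendix A).
 */
static void eui64_from_mac(const uint8_t mac[6], uint8_t eui[8])
{
	assert(mac && eui);

	memcpy(eui, mac, 3);		/* OUI part of the MAC */
	eui[3] = 0xFF;			/* fixed filler bytes */
	eui[4] = 0xFE;
	memcpy(eui + 5, mac + 3, 3);	/* device-specific part */
	eui[0] ^= 0x02;			/* invert the universal/local bit */
}

/* Hypothetical helper: fe80::/64 prefix plus the interface identifier,
 * written into a 16-byte address in network byte order.
 */
static void linklocal_from_mac(const uint8_t mac[6], uint8_t addr[16])
{
	assert(mac && addr);

	memset(addr, 0, 16);
	addr[0] = 0xFE;			/* same prefix as htonl(0xFE800000) */
	addr[1] = 0x80;
	eui64_from_mac(mac, addr + 8);	/* low 64 bits, like addr.s6_addr + 8 */
}
```

Under these assumptions, the MAC 02:e0:f9:79:34:bd maps to the interface identifier 00e0:f9ff:fe79:34bd, because the leading 0x02 byte has its universal/local bit cleared by the XOR.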
2005-04-16 15:20:36 -07:00
static void addrconf_dev_config ( struct net_device * dev )
{
2012-04-01 07:49:08 +00:00
struct inet6_dev * idev ;
2005-04-16 15:20:36 -07:00
ASSERT_RTNL ( ) ;
2007-06-14 13:02:55 -07:00
if ( ( dev - > type ! = ARPHRD_ETHER ) & &
( dev - > type ! = ARPHRD_FDDI ) & &
( dev - > type ! = ARPHRD_ARCNET ) & &
2012-05-10 03:25:52 +00:00
( dev - > type ! = ARPHRD_INFINIBAND ) & &
2013-08-20 12:16:06 +02:00
( dev - > type ! = ARPHRD_IEEE1394 ) & &
2013-12-11 17:05:36 +02:00
( dev - > type ! = ARPHRD_TUNNEL6 ) & &
2015-12-16 16:44:38 +01:00
( dev - > type ! = ARPHRD_6LOWPAN ) & &
2017-01-26 16:59:18 +13:00
( dev - > type ! = ARPHRD_TUNNEL ) & &
2018-06-04 19:26:07 -06:00
( dev - > type ! = ARPHRD_NONE ) & &
( dev - > type ! = ARPHRD_RAWIP ) ) {
2007-06-14 13:02:55 -07:00
/* Alas, we support only Ethernet autoconfiguration. */
ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface
Rafał found that on a non-Ethernet interface, repeatedly bringing the link
down and up slowly consumes memory.
The reason is that we add the allnodes/allrouters addresses to the multicast
list in ipv6_add_dev(). On link down, we call ipv6_mc_down(), which stores all
multicast addresses via mld_add_delrec(). But on link up, we don't call
ipv6_mc_up() for non-Ethernet interfaces to remove the addresses. This makes
idev->mc_tomb grow bigger and bigger. The call stack looks like:
addrconf_notify(NETDEV_REGISTER)
	ipv6_add_dev
		ipv6_dev_mc_inc(ff01::1)
		ipv6_dev_mc_inc(ff02::1)
		ipv6_dev_mc_inc(ff02::2)
addrconf_notify(NETDEV_UP)
	addrconf_dev_config
		/* Alas, we support only Ethernet autoconfiguration. */
		return;
addrconf_notify(NETDEV_DOWN)
	addrconf_ifdown
		ipv6_mc_down
			igmp6_group_dropped(ff02::2)
				mld_add_delrec(ff02::2)
			igmp6_group_dropped(ff02::1)
			igmp6_group_dropped(ff01::1)
After investigating, I couldn't find a rule that disables multicast on
non-Ethernet interfaces. In RFC2460, the link could be Ethernet, PPP, ATM,
tunnels, etc. In IPv4, the dev type is not checked when calling ip_mc_up()
in inetdev_event(). Even for IPv6, we don't check the dev type when calling
ipv6_add_dev() and ipv6_dev_mc_inc() after registering the device.
So I think it's OK to fix this memory consumption by calling ipv6_mc_up() for
non-Ethernet interfaces.
v2: Also check IFF_MULTICAST flag to make sure the interface supports
multicast
Reported-by: Rafał Miłecki <zajec5@gmail.com>
Tested-by: Rafał Miłecki <zajec5@gmail.com>
Fixes: 74235a25c673 ("[IPV6] addrconf: Fix IPv6 on tuntap tunnels")
Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-10 15:27:37 +08:00
idev = __in6_dev_get ( dev ) ;
if ( ! IS_ERR_OR_NULL ( idev ) & & dev - > flags & IFF_UP & &
dev - > flags & IFF_MULTICAST )
ipv6_mc_up ( idev ) ;
2007-06-14 13:02:55 -07:00
return ;
}
2005-04-16 15:20:36 -07:00
idev = addrconf_add_dev ( dev ) ;
2010-07-20 10:34:30 +00:00
if ( IS_ERR ( idev ) )
2005-04-16 15:20:36 -07:00
return ;
2015-12-16 16:44:38 +01:00
/* this device type has no EUI support */
if ( dev - > type = = ARPHRD_NONE & &
2017-01-26 16:59:17 +13:00
idev - > cnf . addr_gen_mode = = IN6_ADDR_GEN_MODE_EUI64 )
idev - > cnf . addr_gen_mode = IN6_ADDR_GEN_MODE_RANDOM ;
2015-12-16 16:44:38 +01:00
2014-07-11 21:10:18 +02:00
addrconf_addr_gen ( idev , false ) ;
2005-04-16 15:20:36 -07:00
}
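The early-return path in addrconf_dev_config() tests two interface flags without inner parentheses. Since C gives bitwise `&` higher precedence than `&&`, that expression groups as (flags & IFF_UP) && (flags & IFF_MULTICAST). The userspace sketch below spells the grouping out; wants_mc_up() and its self-check are hypothetical names, and only the IFF_* constants come from the system header.

```c
#define _DEFAULT_SOURCE		/* expose IFF_* in <net/if.h> under strict -std modes */
#include <assert.h>
#include <net/if.h>		/* IFF_UP, IFF_MULTICAST */
#include <stdbool.h>

/* Hypothetical helper: true only when the interface is both
 * administratively up and multicast-capable.
 */
static bool wants_mc_up(unsigned int flags)
{
	return (flags & IFF_UP) && (flags & IFF_MULTICAST);
}

/* Usage example: each flag alone is not enough. */
static void wants_mc_up_selfcheck(void)
{
	assert(wants_mc_up(IFF_UP | IFF_MULTICAST));
	assert(!wants_mc_up(IFF_UP));
	assert(!wants_mc_up(IFF_MULTICAST));
}
```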
2012-10-29 16:23:10 +00:00
# if IS_ENABLED(CONFIG_IPV6_SIT)
2005-04-16 15:20:36 -07:00
static void addrconf_sit_config ( struct net_device * dev )
{
struct inet6_dev * idev ;
ASSERT_RTNL ( ) ;
2007-02-09 23:24:49 +09:00
/*
* Configure the tunnel with one of our IPv4
* addresses . . . we should configure all of
2005-04-16 15:20:36 -07:00
* our v4 addrs in the tunnel
*/
2014-11-23 21:28:43 +00:00
idev = ipv6_find_idev ( dev ) ;
2019-08-23 15:44:36 +02:00
if ( IS_ERR ( idev ) ) {
2012-05-15 14:11:54 +00:00
pr_debug ( " %s: add_dev failed \n " , __func__ ) ;
2005-04-16 15:20:36 -07:00
return ;
}
2007-11-29 22:11:40 +11:00
if ( dev - > priv_flags & IFF_ISATAP ) {
2014-07-11 21:10:18 +02:00
addrconf_addr_gen ( idev , false ) ;
2007-11-29 22:11:40 +11:00
return ;
}
2021-09-03 18:58:42 +02:00
add_v4_addrs ( idev ) ;
2005-04-16 15:20:36 -07:00
2012-10-01 23:19:14 +00:00
if ( dev - > flags & IFF_POINTOPOINT )
2005-04-16 15:20:36 -07:00
addrconf_add_mroute ( dev ) ;
}
2006-10-10 14:49:53 -07:00
# endif
2005-04-16 15:20:36 -07:00
2021-09-03 18:58:42 +02:00
# if IS_ENABLED(CONFIG_NET_IPGRE) || IS_ENABLED(CONFIG_IPV6_GRE)
2011-06-08 10:44:30 +00:00
static void addrconf_gre_config ( struct net_device * dev )
{
struct inet6_dev * idev ;
ASSERT_RTNL ( ) ;
2014-11-23 21:28:43 +00:00
idev = ipv6_find_idev ( dev ) ;
2019-08-23 15:44:36 +02:00
if ( IS_ERR ( idev ) ) {
2012-05-15 14:11:54 +00:00
pr_debug ( " %s: add_dev failed \n " , __func__ ) ;
2011-06-08 10:44:30 +00:00
return ;
}
2021-09-03 18:58:42 +02:00
if ( dev - > type = = ARPHRD_ETHER ) {
addrconf_addr_gen ( idev , true ) ;
return ;
}
add_v4_addrs ( idev ) ;
2015-10-08 18:19:39 +02:00
if ( dev - > flags & IFF_POINTOPOINT )
addrconf_add_mroute ( dev ) ;
2011-06-08 10:44:30 +00:00
}
# endif
2023-01-31 16:46:45 +13:00
static void addrconf_init_auto_addrs ( struct net_device * dev )
{
switch ( dev - > type ) {
# if IS_ENABLED(CONFIG_IPV6_SIT)
case ARPHRD_SIT :
addrconf_sit_config ( dev ) ;
break ;
# endif
# if IS_ENABLED(CONFIG_NET_IPGRE) || IS_ENABLED(CONFIG_IPV6_GRE)
case ARPHRD_IP6GRE :
case ARPHRD_IPGRE :
addrconf_gre_config ( dev ) ;
break ;
# endif
case ARPHRD_LOOPBACK :
init_loopback ( dev ) ;
break ;
default :
addrconf_dev_config ( dev ) ;
break ;
}
}
2018-04-17 17:33:11 -07:00
static int fixup_permanent_addr ( struct net * net ,
struct inet6_dev * idev ,
net: ipv6: Make address flushing on ifdown optional
Currently, all ipv6 addresses are flushed when the interface is configured
down, including global, static addresses:
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
<< nothing; all addresses have been flushed>>
Add a new sysctl to make this behavior optional. The new setting defaults to
flushing all addresses to maintain backwards compatibility. When set, global
addresses with no expire times are not flushed on an admin down. The sysctl
is per-interface or system-wide for all interfaces:
$ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1
or
$ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1
Will keep addresses on eth1 on an admin down.
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000
inet6 2100:1::2/120 scope global tentative
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative
valid_lft forever preferred_lft forever
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 09:25:37 -08:00
struct inet6_ifaddr * ifp )
{
2018-04-18 15:38:59 -07:00
/* !fib6_node means the host route was removed from the
2017-04-25 09:17:29 -07:00
* FIB , for example , if ' lo ' device is taken down . In that
* case regenerate the host route .
*/
2018-04-18 15:38:59 -07:00
if ( ! ifp - > rt | | ! ifp - > rt - > fib6_node ) {
2018-04-18 15:39:00 -07:00
struct fib6_info * f6i , * prev ;
2016-02-24 09:25:37 -08:00
2018-04-18 15:39:00 -07:00
f6i = addrconf_f6i_alloc ( net , idev , & ifp - > addr , false ,
2023-07-26 10:39:05 +08:00
GFP_ATOMIC , NULL ) ;
2018-04-18 15:39:00 -07:00
if ( IS_ERR ( f6i ) )
return PTR_ERR ( f6i ) ;
2016-02-24 09:25:37 -08:00
2017-04-25 09:17:29 -07:00
/* ifp->rt can be accessed outside of rtnl */
spin_lock ( & ifp - > lock ) ;
prev = ifp - > rt ;
2018-04-18 15:39:00 -07:00
ifp - > rt = f6i ;
2017-04-25 09:17:29 -07:00
spin_unlock ( & ifp - > lock ) ;
2018-04-17 17:33:25 -07:00
fib6_info_release ( prev ) ;
2016-02-24 09:25:37 -08:00
}
if ( ! ( ifp - > flags & IFA_F_NOPREFIXROUTE ) ) {
addrconf_prefix_route ( & ifp - > addr , ifp - > prefix_len ,
2018-05-27 08:09:58 -07:00
ifp - > rt_priority , idev - > dev , 0 , 0 ,
GFP_ATOMIC ) ;
2016-02-24 09:25:37 -08:00
}
2017-05-02 14:43:44 -07:00
if ( ifp - > state = = INET6_IFADDR_STATE_PREDAD )
addrconf_dad_start ( ifp ) ;
2016-02-24 09:25:37 -08:00
return 0 ;
}
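fixup_permanent_addr() swaps ifp->rt while holding ifp->lock and releases the previous route only after dropping the lock. The sketch below illustrates that pattern in userspace C11, with a toy spinlock built on atomic_flag and a toy atomic reference count. All names here (struct route, struct addr_entry, toy_spin_lock(), route_release(), addr_entry_replace_route()) are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted route object. */
struct route {
	atomic_int refcnt;
	int id;
};

/* Hypothetical stand-in for an address entry with its own lock. */
struct addr_entry {
	atomic_flag lock;
	struct route *rt;
};

static void toy_spin_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* busy-wait until the holder clears the flag */
}

static void toy_spin_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

static struct route *route_alloc(int id)
{
	struct route *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	atomic_init(&r->refcnt, 1);	/* caller owns the first reference */
	r->id = id;
	return r;
}

/* Drop one reference; tolerates NULL. Returns 1 if the object was freed. */
static int route_release(struct route *r)
{
	if (!r)
		return 0;
	if (atomic_fetch_sub(&r->refcnt, 1) != 1)
		return 0;
	free(r);
	return 1;
}

/* Install new_rt under the lock, then release the old pointer
 * after unlocking so the critical section stays short.
 */
static int addr_entry_replace_route(struct addr_entry *e, struct route *new_rt)
{
	struct route *prev;

	assert(e && new_rt);

	toy_spin_lock(&e->lock);
	prev = e->rt;
	e->rt = new_rt;
	toy_spin_unlock(&e->lock);

	return route_release(prev);
}
```

Keeping the release outside the lock means any teardown work triggered by the last reference drop never runs while other readers are spinning on the entry's lock.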
2018-04-17 17:33:11 -07:00
static void addrconf_permanent_addr ( struct net * net , struct net_device * dev )
2016-02-24 09:25:37 -08:00
{
struct inet6_ifaddr * ifp , * tmp ;
struct inet6_dev * idev ;
idev = __in6_dev_get ( dev ) ;
if ( ! idev )
return ;
write_lock_bh ( & idev - > lock ) ;
list_for_each_entry_safe ( ifp , tmp , & idev - > addr_list , if_list ) {
if ( ( ifp - > flags & IFA_F_PERMANENT ) & &
2018-04-17 17:33:11 -07:00
fixup_permanent_addr ( net , idev , ifp ) < 0 ) {
2016-02-24 09:25:37 -08:00
write_unlock_bh ( & idev - > lock ) ;
2017-10-30 22:47:09 -07:00
in6_ifa_hold ( ifp ) ;
2016-02-24 09:25:37 -08:00
ipv6_del_addr ( ifp ) ;
write_lock_bh ( & idev - > lock ) ;
net_info_ratelimited ( " %s: Failed to add prefix route for address %pI6c; dropping \n " ,
idev - > dev - > name , & ifp - > addr ) ;
}
}
write_unlock_bh ( & idev - > lock ) ;
}
2007-02-09 23:24:49 +09:00
static int addrconf_notify ( struct notifier_block * this , unsigned long event ,
2013-05-28 01:30:21 +00:00
void * ptr )
2005-04-16 15:20:36 -07:00
{
2013-05-28 01:30:21 +00:00
struct net_device * dev = netdev_notifier_info_to_dev ( ptr ) ;
2018-11-21 21:52:33 +08:00
struct netdev_notifier_change_info * change_info ;
2016-04-07 11:10:41 -07:00
struct netdev_notifier_changeupper_info * info ;
2012-08-22 21:50:59 +00:00
struct inet6_dev * idev = __in6_dev_get ( dev ) ;
2017-06-21 14:34:58 -07:00
struct net * net = dev_net ( dev ) ;
2005-12-21 22:57:44 +09:00
int run_pending = 0 ;
2007-07-30 17:04:52 -07:00
int err ;
2005-04-16 15:20:36 -07:00
2010-03-20 16:09:01 -07:00
switch ( event ) {
2007-02-15 02:07:27 +09:00
case NETDEV_REGISTER :
2007-06-14 13:02:55 -07:00
if ( ! idev & & dev - > mtu > = IPV6_MIN_MTU ) {
2007-02-15 02:07:27 +09:00
idev = ipv6_add_dev ( dev ) ;
2014-07-25 15:25:09 -07:00
if ( IS_ERR ( idev ) )
return notifier_from_errno ( PTR_ERR ( idev ) ) ;
2007-02-15 02:07:27 +09:00
}
break ;
2010-03-20 16:08:18 -07:00
2015-10-26 11:06:33 -07:00
case NETDEV_CHANGEMTU :
/* if MTU under IPV6_MIN_MTU stop IPv6 on this interface. */
if ( dev - > mtu < IPV6_MIN_MTU ) {
2017-06-21 14:34:58 -07:00
addrconf_ifdown ( dev , dev ! = net - > loopback_dev ) ;
2015-10-26 11:06:33 -07:00
break ;
}
if ( idev ) {
rt6_mtu_change ( dev , dev - > mtu ) ;
idev - > cnf . mtu6 = dev - > mtu ;
break ;
}
/* allocate new idev */
idev = ipv6_add_dev ( dev ) ;
if ( IS_ERR ( idev ) )
break ;
/* device is still not ready */
if ( ! ( idev - > if_flags & IF_READY ) )
break ;
run_pending = 1 ;
2020-03-12 15:50:22 -07:00
fallthrough ;
2005-04-16 15:20:36 -07:00
case NETDEV_UP :
2005-12-21 22:57:24 +09:00
case NETDEV_CHANGE :
2022-08-30 17:37:21 +08:00
if ( idev & & idev - > cnf . disable_ipv6 )
bonding / ipv6: no addrconf for slaves separately from master
At present, when a device is enslaved to bonding, if ipv6 is
active then addrconf will be initiated on the slave (because it is closed
then opened during the enslavement processing). This causes DAD and RS
packets to be sent from the slave. These packets in turn can confuse
switches that perform ipv6 snooping, causing them to incorrectly update
their forwarding tables (if, e.g., the slave being added is an inactive
backup that won't be used right away) and direct traffic away from the
active slave to a backup slave (where the incoming packets will be
dropped).
This patch alters the behavior so that addrconf will only run on
the master device itself. I believe this is logically correct, as it
prevents slaves from having an IPv6 identity independent from the
master. This is consistent with the IPv4 behavior for bonding.
This is accomplished by (a) having bonding set IFF_SLAVE sooner
in the enslavement processing than currently occurs (before open, not
after), and (b) having ipv6 addrconf ignore UP and CHANGE events on
slave devices.
The eql driver also uses the IFF_SLAVE flag. I inspected eql,
and I believe this change is reasonable for its usage of IFF_SLAVE, but
I did not test it.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-09 10:42:47 -07:00
break ;
2022-12-09 10:21:38 -05:00
if ( dev - > priv_flags & IFF_NO_ADDRCONF ) {
2022-08-30 17:37:21 +08:00
if ( event = = NETDEV_UP & & ! IS_ERR_OR_NULL ( idev ) & &
dev - > flags & IFF_UP & & dev - > flags & IFF_MULTICAST )
ipv6_mc_up ( idev ) ;
2014-09-11 15:07:16 -07:00
break ;
2022-08-30 17:37:21 +08:00
}
2014-09-11 15:07:16 -07:00
2005-12-21 22:57:24 +09:00
if ( event = = NETDEV_UP ) {
2016-04-21 20:56:12 -07:00
/* restore routes for permanent addresses */
2018-04-17 17:33:11 -07:00
addrconf_permanent_addr ( net , dev ) ;
2016-04-21 20:56:12 -07:00
2017-09-25 22:01:36 +01:00
if ( ! addrconf_link_ready ( dev ) ) {
2005-12-21 22:57:24 +09:00
/* device is not ready yet. */
2019-01-21 14:54:20 +01:00
pr_debug ( " ADDRCONF(NETDEV_UP): %s: link is not ready \n " ,
dev - > name ) ;
2005-12-21 22:57:24 +09:00
break ;
}
2006-02-08 16:10:53 -08:00
2007-11-30 23:36:08 +11:00
if ( ! idev & & dev - > mtu > = IPV6_MIN_MTU )
idev = ipv6_add_dev ( dev ) ;
2014-07-25 15:25:09 -07:00
if ( ! IS_ERR_OR_NULL ( idev ) ) {
2006-02-08 16:10:53 -08:00
idev - > if_flags | = IF_READY ;
ipv6: fix run pending DAD when interface becomes ready
With some net device types, an IPv6 address configured while the
interface was down can stay 'tentative' forever, even after the interface
is set up. In some cases, pending IPv6 DADs are not executed when the
device becomes ready.
I observed this while doing some tests with kvm. If I assign an IPv6
address to my interface eth0 (kvm driver rtl8139) when it is still down
then the address is flagged tentative (IFA_F_TENTATIVE). Then, I set
eth0 up, and to my surprise, the address stays 'tentative', no DAD is
executed and the address can't be pinged.
I also observed the same behaviour, without kvm, with virtual interfaces
types macvlan and veth.
Some easy steps to reproduce the issue with macvlan:
1. ip link add link eth0 type macvlan
2. ip -6 addr add 2003::ab32/64 dev macvlan0
3. ip addr show dev macvlan0
...
inet6 2003::ab32/64 scope global tentative
...
4. ip link set macvlan0 up
5. ip addr show dev macvlan0
...
inet6 2003::ab32/64 scope global tentative
...
Address is still tentative
I think there's a bug in net/ipv6/addrconf.c, addrconf_notify():
addrconf_dad_run() is not always run when the interface is flagged IF_READY.
Currently it is only run when receiving a NETDEV_CHANGE event. It looks like
some (virtual) devices don't send this event when coming up.
For both NETDEV_UP and NETDEV_CHANGE events, when the interface becomes
ready, run_pending should be set to 1. Patch below.
'run_pending = 1' could be moved below the if/else block but it makes
the code less readable.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 01:43:57 -08:00
run_pending = 1 ;
}
2015-10-26 11:06:33 -07:00
} else if ( event = = NETDEV_CHANGE ) {
2017-09-25 22:01:36 +01:00
if ( ! addrconf_link_ready ( dev ) ) {
2005-12-21 22:57:24 +09:00
/* device is still not ready. */
2018-01-07 12:45:05 +02:00
rt6_sync_down_dev ( dev , event ) ;
2005-12-21 22:57:24 +09:00
break ;
}
2018-11-21 21:52:33 +08:00
if ( ! IS_ERR_OR_NULL ( idev ) ) {
2017-02-03 08:11:03 +01:00
if ( idev - > if_flags & IF_READY ) {
/* device is already configured -
* but resend MLD reports , we might
* have roamed and need to update
* multicast snooping switches
*/
ipv6_mc_up ( idev ) ;
2018-11-21 21:52:33 +08:00
change_info = ptr ;
if ( change_info - > flags_changed & IFF_NOARP )
addrconf_dad_run ( idev , true ) ;
2018-01-07 12:45:05 +02:00
rt6_sync_up ( dev , RTNH_F_LINKDOWN ) ;
2005-12-21 22:57:24 +09:00
break ;
2017-02-03 08:11:03 +01:00
}
2005-12-21 22:57:24 +09:00
idev - > if_flags | = IF_READY ;
}
2023-06-02 11:36:07 +02:00
pr_debug ( " ADDRCONF(NETDEV_CHANGE): %s: link becomes ready \n " ,
dev - > name ) ;
2005-12-21 22:57:24 +09:00
2005-12-21 22:57:44 +09:00
run_pending = 1 ;
2005-12-21 22:57:24 +09:00
}
2023-01-31 16:46:45 +13:00
addrconf_init_auto_addrs ( dev ) ;
2010-03-20 16:08:18 -07:00
2014-07-25 15:25:09 -07:00
if ( ! IS_ERR_OR_NULL ( idev ) ) {
2005-12-21 22:57:44 +09:00
if ( run_pending )
2018-11-21 21:52:33 +08:00
addrconf_dad_run ( idev , false ) ;
2005-12-21 22:57:44 +09:00
2018-01-07 12:45:03 +02:00
/* Device has an address by now */
rt6_sync_up ( dev , RTNH_F_DEAD ) ;
2010-03-20 16:08:18 -07:00
/*
* If the MTU changed while the interface was down ,
* the new MTU must be reflected in the idev as well as
* in the routes once the interface comes up .
2005-04-16 15:20:36 -07:00
*/
2010-03-20 16:08:18 -07:00
if ( idev - > cnf . mtu6 ! = dev - > mtu & &
dev - > mtu > = IPV6_MIN_MTU ) {
2005-04-16 15:20:36 -07:00
rt6_mtu_change ( dev , dev - > mtu ) ;
idev - > cnf . mtu6 = dev - > mtu ;
}
idev - > tstamp = jiffies ;
inet6_ifinfo_notify ( RTM_NEWLINK , idev ) ;
2010-03-20 16:08:18 -07:00
/*
* If the MTU changed while the interface was down and is
* now lower than IPV6_MIN_MTU , stop IPv6 on this interface .
2005-04-16 15:20:36 -07:00
*/
if ( dev - > mtu < IPV6_MIN_MTU )
2017-06-21 14:34:58 -07:00
addrconf_ifdown ( dev , dev ! = net - > loopback_dev ) ;
2005-04-16 15:20:36 -07:00
}
break ;
case NETDEV_DOWN :
case NETDEV_UNREGISTER :
/*
* Remove all addresses from this interface .
*/
addrconf_ifdown ( dev , event ! = NETDEV_DOWN ) ;
break ;
2005-12-21 22:57:24 +09:00
2005-04-16 15:20:36 -07:00
case NETDEV_CHANGENAME :
if ( idev ) {
2007-04-28 21:16:39 -07:00
snmp6_unregister_dev ( idev ) ;
2008-01-10 17:41:21 -08:00
addrconf_sysctl_unregister ( idev ) ;
2014-07-25 15:25:09 -07:00
err = addrconf_sysctl_register ( idev ) ;
2007-07-30 17:04:52 -07:00
if ( err )
return notifier_from_errno ( err ) ;
2014-07-25 15:25:09 -07:00
err = snmp6_register_dev ( idev ) ;
if ( err ) {
addrconf_sysctl_unregister ( idev ) ;
return notifier_from_errno ( err ) ;
}
2007-04-28 21:16:39 -07:00
}
2005-04-16 15:20:36 -07:00
break ;
2010-03-20 16:08:18 -07:00
2010-03-10 10:28:56 +00:00
case NETDEV_PRE_TYPE_CHANGE :
case NETDEV_POST_TYPE_CHANGE :
2015-12-03 21:12:32 +01:00
if ( idev )
addrconf_type_change ( dev , event ) ;
2009-09-15 02:37:40 -07:00
break ;
2016-04-07 11:10:41 -07:00
case NETDEV_CHANGEUPPER :
info = ptr ;
/* flush all routes if dev is linked to or unlinked from
* an L3 master device ( e . g . , VRF )
*/
if ( info - > upper_dev & & netif_is_l3_master ( info - > upper_dev ) )
2020-07-31 15:32:07 +02:00
addrconf_ifdown ( dev , false ) ;
2007-04-20 17:09:22 -07:00
}
2005-04-16 15:20:36 -07:00
return NOTIFY_OK ;
}
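The readiness handling in addrconf_notify() above can be modelled in userspace as a rough sketch. It assumes a simplified device with a single "ready" flag; the names sketch_event, sketch_idev and dad_should_run are hypothetical and are not kernel symbols. The IFF_NOARP-triggered DAD restart and the NULL-idev cases are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for NETDEV_UP / NETDEV_CHANGE. */
enum sketch_event { SKETCH_UP, SKETCH_CHANGE };

/* Hypothetical stand-in for the IF_READY bit in idev->if_flags. */
struct sketch_idev {
	bool ready;
};

/* Returns true when pending DAD should be started for this event,
 * i.e. the case where the code above sets run_pending = 1.
 */
static bool dad_should_run(struct sketch_idev *idev, enum sketch_event ev,
			   bool link_ready)
{
	assert(idev != NULL);

	/* Both event branches bail out while the link is not ready. */
	if (!link_ready)
		return false;

	/* A CHANGE event on an already-ready device starts nothing new. */
	if (ev == SKETCH_CHANGE && idev->ready)
		return false;

	/* The interface becomes ready: mark it and run pending DAD. */
	idev->ready = true;
	return true;
}
```

In this model an UP event with a ready link triggers DAD on its own, which is the behaviour the 2008 fix describes for devices that never send a separate CHANGE event.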
/*
* addrconf module should be notified of a device going up
*/
static struct notifier_block ipv6_dev_notf = {
. notifier_call = addrconf_notify ,
2017-05-08 10:12:13 -07:00
. priority = ADDRCONF_NOTIFY_PRIORITY ,
2005-04-16 15:20:36 -07:00
} ;
2010-03-10 10:28:56 +00:00
static void addrconf_type_change ( struct net_device * dev , unsigned long event )
2009-09-15 02:37:40 -07:00
{
struct inet6_dev * idev ;
ASSERT_RTNL ( ) ;
idev = __in6_dev_get ( dev ) ;
2010-03-10 10:28:56 +00:00
if ( event = = NETDEV_POST_TYPE_CHANGE )
2009-09-15 02:37:40 -07:00
ipv6_mc_remap ( idev ) ;
2010-03-10 10:28:56 +00:00
else if ( event = = NETDEV_PRE_TYPE_CHANGE )
2009-09-15 02:37:40 -07:00
ipv6_mc_unmap ( idev ) ;
}
2016-04-08 12:01:21 -07:00
static bool addr_is_local ( const struct in6_addr * addr )
{
return ipv6_addr_type ( addr ) &
( IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK ) ;
}
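For experimenting with the same classification outside the kernel, a minimal userspace sketch might look as follows. It assumes the POSIX macros IN6_IS_ADDR_LINKLOCAL and IN6_IS_ADDR_LOOPBACK as a substitute for ipv6_addr_type(); the helper name addr_is_local_sketch is hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for addr_is_local(): true for
 * link-local (fe80::/10) and loopback (::1) addresses.
 */
static bool addr_is_local_sketch(const char *text)
{
	struct in6_addr addr;

	assert(text != NULL);

	/* inet_pton() returns 1 only when the text parses as IPv6. */
	if (inet_pton(AF_INET6, text, &addr) != 1)
		return false;

	return IN6_IS_ADDR_LINKLOCAL(&addr) || IN6_IS_ADDR_LOOPBACK(&addr);
}
```

Unparseable input is reported as "not local" here purely for convenience; the kernel function never sees textual addresses.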
2020-07-31 15:32:07 +02:00
static int addrconf_ifdown ( struct net_device * dev , bool unregister )
2005-04-16 15:20:36 -07:00
{
2020-07-31 15:32:07 +02:00
unsigned long event = unregister ? NETDEV_UNREGISTER : NETDEV_DOWN ;
2008-03-25 21:47:49 +09:00
struct net * net = dev_net ( dev ) ;
2010-03-17 20:31:13 +00:00
struct inet6_dev * idev ;
2022-04-04 01:15:24 +02:00
struct inet6_ifaddr * ifa ;
LIST_HEAD ( tmp_addr_list ) ;
2018-04-24 19:51:29 +02:00
bool keep_addr = false ;
2022-02-24 10:06:49 +01:00
bool was_ready ;
2011-01-23 23:27:15 -08:00
int state , i ;
2005-04-16 15:20:36 -07:00
ASSERT_RTNL ( ) ;
2018-01-07 12:45:04 +02:00
rt6_disable_ip ( dev , event ) ;
2005-04-16 15:20:36 -07:00
idev = __in6_dev_get ( dev ) ;
2015-03-29 14:00:04 +01:00
if ( ! idev )
2005-04-16 15:20:36 -07:00
return - ENODEV ;
2010-03-20 16:08:18 -07:00
/*
* Step 1 : remove reference to ipv6 device from parent device .
* Do not dev_put !
2005-04-16 15:20:36 -07:00
*/
2020-07-31 15:32:07 +02:00
if ( unregister ) {
2005-04-16 15:20:36 -07:00
idev - > dead = 1 ;
2006-09-22 14:44:24 -07:00
/* protected by rtnl_lock */
2011-08-01 16:19:00 +00:00
RCU_INIT_POINTER ( dev - > ip6_ptr , NULL ) ;
2005-04-16 15:20:36 -07:00
/* Step 1.5: remove snmp6 entry */
snmp6_unregister_dev ( idev ) ;
}
net: ipv6: Make address flushing on ifdown optional
Currently, all ipv6 addresses are flushed when the interface is configured
down, including global, static addresses:
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
<< nothing; all addresses have been flushed>>
Add a new sysctl to make this behavior optional. The new setting defaults to
flush all addresses to maintain backwards compatibility. When set, global
addresses with no expire times are not flushed on an admin down. The sysctl
is per-interface or system-wide for all interfaces:
$ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1
or
$ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1
Will keep addresses on eth1 on an admin down.
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000
inet6 2100:1::2/120 scope global tentative
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative
valid_lft forever preferred_lft forever
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 09:25:37 -08:00
/* combine the user config with event to determine if permanent
* addresses are to be removed from address hash table
*/
2020-07-31 15:32:07 +02:00
if ( ! unregister & & ! idev - > cnf . disable_ipv6 ) {
2018-04-24 19:51:29 +02:00
/* aggregate the system setting and interface setting */
int _keep_addr = net - > ipv6 . devconf_all - > keep_addr_on_down ;
if ( ! _keep_addr )
_keep_addr = idev - > cnf . keep_addr_on_down ;
keep_addr = ( _keep_addr > 0 ) ;
}
2016-02-24 09:25:37 -08:00
2011-01-23 23:27:15 -08:00
/* Step 2: clear hash table */
for ( i = 0 ; i < IN6_ADDR_HSIZE ; i + + ) {
2022-02-07 20:50:30 -08:00
struct hlist_head * h = & net - > ipv6 . inet6_addr_lst [ i ] ;
2011-01-23 23:27:15 -08:00
2022-02-07 20:50:30 -08:00
spin_lock_bh ( & net - > ipv6 . addrconf_hash_lock ) ;
2014-08-24 21:53:10 +01:00
restart :
hlist: drop the node parameter from iterators
I'm not sure why, but the hlist for-each-entry iterators were conceived
differently from the list ones. The list iterators take three parameters:
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
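As a rough userspace illustration of an hlist-style iterator that needs no separate node cursor, the sketch below derives the entry type from the loop variable itself. All demo_* names are hypothetical and this is not the code from linux/list.h; it relies on the GNU __typeof__ extension (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal hlist: a singly linked chain of embedded nodes. */
struct demo_hnode { struct demo_hnode *next; };
struct demo_hhead { struct demo_hnode *first; };

/* Map a node pointer back to its enclosing entry, or NULL at the end. */
#define demo_entry_safe(ptr, type, member) \
	((ptr) ? (type *)((char *)(ptr) - offsetof(type, member)) : NULL)

/* Three-parameter iterator: pos doubles as cursor and loop variable. */
#define demo_for_each_entry(pos, head, member) \
	for (pos = demo_entry_safe((head)->first, __typeof__(*(pos)), member); \
	     pos; \
	     pos = demo_entry_safe((pos)->member.next, __typeof__(*(pos)), member))

struct demo_item {
	int value;
	struct demo_hnode node;
};

static void demo_add_head(struct demo_hnode *n, struct demo_hhead *h)
{
	assert(n != NULL && h != NULL);
	n->next = h->first;
	h->first = n;
}

/* Walk the list with the three-parameter iterator and sum the values. */
static int demo_sum(struct demo_hhead *h)
{
	struct demo_item *it;
	int sum = 0;

	demo_for_each_entry(it, h, node)
		sum += it->value;
	return sum;
}
```

Because the entry pointer is NULL-checked in the loop condition, no extra node variable has to be declared by the caller.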
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small number of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foundation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 17:06:00 -08:00
hlist_for_each_entry_rcu ( ifa , h , addr_lst ) {
2011-01-23 23:27:15 -08:00
if ( ifa - > idev = = idev ) {
2014-03-27 18:28:07 +01:00
addrconf_del_dad_work ( ifa ) ;
2016-02-24 09:25:37 -08:00
/* combined flag + permanent flag decide if
* address is retained on a down event
*/
if ( ! keep_addr | |
2016-04-08 12:01:21 -07:00
! ( ifa - > flags & IFA_F_PERMANENT ) | |
addr_is_local ( & ifa - > addr ) ) {
2016-02-24 09:25:37 -08:00
hlist_del_init_rcu ( & ifa - > addr_lst ) ;
goto restart ;
}
2011-01-23 23:27:15 -08:00
}
}
2022-02-07 20:50:30 -08:00
spin_unlock_bh ( & net - > ipv6 . addrconf_hash_lock ) ;
2011-01-23 23:27:15 -08:00
}
2005-04-16 15:20:36 -07:00
write_lock_bh ( & idev - > lock ) ;
2013-06-23 18:39:01 +02:00
addrconf_del_rs_timer ( idev ) ;
2022-02-24 10:06:49 +01:00
/* Step 2: clear flags for stateless addrconf, repeated down
* detection
*/
was_ready = idev - > if_flags & IF_READY ;
2020-07-31 15:32:07 +02:00
if ( ! unregister )
2005-12-21 22:57:24 +09:00
idev - > if_flags & = ~ ( IF_RS_SENT | IF_RA_RCVD | IF_READY ) ;
2005-04-16 15:20:36 -07:00
2010-03-20 16:08:18 -07:00
/* Step 3: clear tempaddr list */
2010-03-17 20:31:09 +00:00
while ( ! list_empty ( & idev - > tempaddr_list ) ) {
ifa = list_first_entry ( & idev - > tempaddr_list ,
struct inet6_ifaddr , tmp_list ) ;
list_del ( & ifa - > tmp_list ) ;
2005-04-16 15:20:36 -07:00
write_unlock_bh ( & idev - > lock ) ;
spin_lock_bh ( & ifa - > lock ) ;
if ( ifa - > ifpub ) {
in6_ifa_put ( ifa - > ifpub ) ;
ifa - > ifpub = NULL ;
}
spin_unlock_bh ( & ifa - > lock ) ;
in6_ifa_put ( ifa ) ;
write_lock_bh ( & idev - > lock ) ;
}
2010-03-03 08:19:59 +00:00
2022-04-04 01:15:24 +02:00
list_for_each_entry ( ifa , & idev - > addr_list , if_list )
list_add_tail ( & ifa - > if_list_aux , & tmp_addr_list ) ;
write_unlock_bh ( & idev - > lock ) ;
while ( ! list_empty ( & tmp_addr_list ) ) {
2018-04-17 17:33:26 -07:00
struct fib6_info * rt = NULL ;
2017-04-10 08:36:39 +02:00
bool keep ;
2016-04-21 20:56:12 -07:00
2022-04-04 01:15:24 +02:00
ifa = list_first_entry ( & tmp_addr_list ,
struct inet6_ifaddr , if_list_aux ) ;
list_del ( & ifa - > if_list_aux ) ;
2016-02-24 09:25:37 -08:00
addrconf_del_dad_work ( ifa ) ;
2010-03-02 13:32:46 +00:00
2017-04-10 08:36:39 +02:00
keep = keep_addr & & ( ifa - > flags & IFA_F_PERMANENT ) & &
! addr_is_local ( & ifa - > addr ) ;
2015-03-23 23:36:03 +01:00
spin_lock_bh ( & ifa - > lock ) ;
2016-02-24 09:25:37 -08:00
2017-04-10 08:36:39 +02:00
if ( keep ) {
2016-02-24 09:25:37 -08:00
/* set state to skip the notifier below */
state = INET6_IFADDR_STATE_DEAD ;
2017-05-02 14:43:44 -07:00
ifa - > state = INET6_IFADDR_STATE_PREDAD ;
2016-02-24 09:25:37 -08:00
if ( ! ( ifa - > flags & IFA_F_NODAD ) )
ifa - > flags | = IFA_F_TENTATIVE ;
2016-04-21 20:56:12 -07:00
rt = ifa - > rt ;
ifa - > rt = NULL ;
2016-02-24 09:25:37 -08:00
} else {
state = ifa - > state ;
ifa - > state = INET6_IFADDR_STATE_DEAD ;
}
2015-03-23 23:36:03 +01:00
spin_unlock_bh ( & ifa - > lock ) ;
2010-11-15 20:29:21 +00:00
2016-04-21 20:56:12 -07:00
if ( rt )
2020-04-27 13:56:45 -07:00
ip6_del_rt ( net , rt , false ) ;
2016-04-21 20:56:12 -07:00
2011-01-23 23:27:15 -08:00
if ( state ! = INET6_IFADDR_STATE_DEAD ) {
__ipv6_ifa_notify ( RTM_DELADDR , ifa ) ;
2013-04-14 23:18:43 +08:00
inet6addr_notifier_call_chain ( NETDEV_DOWN , ifa ) ;
2016-07-22 18:32:11 +01:00
} else {
if ( idev - > cnf . forwarding )
addrconf_leave_anycast ( ifa ) ;
addrconf_leave_solict ( ifa - > idev , & ifa - > addr ) ;
2010-04-12 05:41:32 +00:00
}
2010-03-03 08:19:59 +00:00
2017-10-07 19:30:23 -07:00
if ( ! keep ) {
2022-04-04 01:15:24 +02:00
write_lock_bh ( & idev - > lock ) ;
2017-10-07 19:30:23 -07:00
list_del_rcu ( & ifa - > if_list ) ;
2022-04-04 01:15:24 +02:00
write_unlock_bh ( & idev - > lock ) ;
2017-10-07 19:30:23 -07:00
in6_ifa_put ( ifa ) ;
}
2011-01-23 23:27:15 -08:00
}
2010-03-03 08:19:59 +00:00
2014-09-10 23:23:02 +02:00
/* Step 5: Discard anycast and multicast list */
2020-07-31 15:32:07 +02:00
if ( unregister ) {
2014-09-10 23:23:02 +02:00
ipv6_ac_destroy_dev ( idev ) ;
2005-04-16 15:20:36 -07:00
ipv6_mc_destroy_dev ( idev ) ;
2022-02-24 10:06:49 +01:00
} else if ( was_ready ) {
2005-04-16 15:20:36 -07:00
ipv6_mc_down ( idev ) ;
2014-09-10 23:23:02 +02:00
}
2005-04-16 15:20:36 -07:00
idev - > tstamp = jiffies ;
ipv6: add IFLA_INET6_RA_MTU to expose mtu value
The kernel provides a "/proc/sys/net/ipv6/conf/<iface>/mtu"
file, which can temporarily record the mtu value of the last
received RA message when the RA mtu value is lower than the
interface mtu, but this proc has following limitations:
(1) when the interface mtu (/sys/class/net/<iface>/mtu) is
updated, mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) will
be updated to the value of interface mtu;
(2) mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) only affects
IPv6 connections and does not affect IPv4.
Therefore, when the mtu option is carried in the RA message,
there will be a problem that the user sometimes cannot obtain
RA mtu value correctly by reading mtu6.
After this patch set, if a RA message carries the mtu option,
you can send a netlink msg whose nlmsg_type is RTM_GETLINK,
and then by parsing the attribute of IFLA_INET6_RA_MTU to
get the mtu value carried in the RA message received on the
inet6 device. In addition, you can also get a link notification
when ra_mtu is updated so it doesn't have to poll.
In this way, if the MTU values that the device receives from
the network in the PCO IPv4 and the RA IPv6 procedures are
different, the user can obtain the correct ipv6 ra_mtu value
and compare the value of ra_mtu and ipv4 mtu, then the device
can use the lower MTU value for both IPv4 and IPv6.
Signed-off-by: Rocco Yue <rocco.yue@mediatek.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210827150412.9267-1-rocco.yue@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-27 23:04:12 +08:00
idev - > ra_mtu = 0 ;
2007-02-09 23:24:49 +09:00
2010-03-20 16:08:18 -07:00
/* Last: Shot the device (if unregistered) */
2020-07-31 15:32:07 +02:00
if ( unregister ) {
2008-01-10 17:41:21 -08:00
addrconf_sysctl_unregister ( idev ) ;
2005-04-16 15:20:36 -07:00
neigh_parms_release ( & nd_tbl , idev - > nd_parms ) ;
neigh_ifdown ( & nd_tbl , dev ) ;
in6_dev_put ( idev ) ;
}
return 0 ;
}
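The keep-or-flush decision made in addrconf_ifdown() above can be summarised in a small userspace sketch. It assumes plain integers for the two keep_addr_on_down settings and an illustrative flag constant; keep_addr_enabled, keep_this_addr and SKETCH_F_PERMANENT are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bit standing in for IFA_F_PERMANENT. */
#define SKETCH_F_PERMANENT 0x80u

/* Merge the "all" and per-interface keep_addr_on_down settings:
 * a non-zero "all" value wins, otherwise the interface value is used,
 * and only a positive result enables keeping. Unregistering the device
 * or having IPv6 disabled on it always means "flush".
 */
static bool keep_addr_enabled(int all_setting, int if_setting,
			      bool unregister, bool ipv6_disabled)
{
	int merged = all_setting;

	if (unregister || ipv6_disabled)
		return false;
	if (!merged)
		merged = if_setting;
	return merged > 0;
}

/* Per-address decision: keep only permanent, non-local addresses. */
static bool keep_this_addr(bool keep_addr, unsigned int flags, bool is_local)
{
	return keep_addr && (flags & SKETCH_F_PERMANENT) && !is_local;
}
```

Note that a negative "all" value overrides a positive per-interface value in this merge, because any non-zero system-wide setting is taken as is.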
treewide: setup_timer() -> timer_setup()
This converts all remaining cases of the old setup_timer() API into using
timer_setup(), where the callback argument is the structure already
holding the struct timer_list. These should have no behavioral changes,
since they just change which pointer is passed into the callback with
the same available pointers after conversion. It handles the following
examples, in addition to some other variations.
Casting from unsigned long:
void my_callback(unsigned long data)
{
struct something *ptr = (struct something *)data;
...
}
...
setup_timer(&ptr->my_timer, my_callback, ptr);
and forced object casts:
void my_callback(struct something *ptr)
{
...
}
...
setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);
become:
void my_callback(struct timer_list *t)
{
struct something *ptr = from_timer(ptr, t, my_timer);
...
}
...
timer_setup(&ptr->my_timer, my_callback, 0);
Direct function assignments:
void my_callback(unsigned long data)
{
struct something *ptr = (struct something *)data;
...
}
...
ptr->my_timer.function = my_callback;
have a temporary cast added, along with converting the args:
void my_callback(struct timer_list *t)
{
struct something *ptr = from_timer(ptr, t, my_timer);
...
}
...
ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;
And finally, callbacks without a data assignment:
void my_callback(unsigned long data)
{
...
}
...
setup_timer(&ptr->my_timer, my_callback, 0);
have their argument renamed to verify they're unused during conversion:
void my_callback(struct timer_list *unused)
{
...
}
...
timer_setup(&ptr->my_timer, my_callback, 0);
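The converted callbacks above rely on recovering the enclosing object from the embedded timer. A minimal userspace sketch of that mechanism follows. The struct timer_list, timer_setup() and container_of() definitions here are simplified stand-ins written for this example, not the kernel's definitions, and __typeof__ is a GNU C extension (GCC and Clang).

```c
#include <stddef.h>

/* Stand-in for the kernel type; only the embedding matters here. */
struct timer_list {
	void (*function)(struct timer_list *);
};

struct something {
	int counter;
	struct timer_list my_timer;
};

/* Simplified container_of(): step back from a member to its parent. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same shape as from_timer(): 'var' is used only to supply the type. */
#define from_timer(var, callback_timer, timer_fieldname) \
	container_of(callback_timer, __typeof__(*var), timer_fieldname)

static void my_callback(struct timer_list *t)
{
	/* No data argument needed: the timer pointer locates the object. */
	struct something *ptr = from_timer(ptr, t, my_timer);

	ptr->counter++;
}

/* Stand-in for timer_setup(): it only records the callback. */
static void timer_setup(struct timer_list *timer,
			void (*func)(struct timer_list *),
			unsigned int flags)
{
	(void)flags;
	timer->function = func;
}
```

Calling s.my_timer.function(&s.my_timer) after timer_setup(&s.my_timer, my_callback, 0) increments s.counter, which shows that the callback reached the right struct something without any cast from unsigned long.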
The conversion is done with the following Coccinelle script:
spatch --very-quiet --all-includes --include-headers \
-I ./arch/x86/include -I ./arch/x86/include/generated \
-I ./include -I ./arch/x86/include/uapi \
-I ./arch/x86/include/generated/uapi -I ./include/uapi \
-I ./include/generated/uapi --include ./include/linux/kconfig.h \
--dir . \
--cocci-file ~/src/data/timer_setup.cocci
@fix_address_of@
expression e;
@@
setup_timer(
-&(e)
+&e
, ...)
// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _timer;
type _cast_data;
@@
(
-setup_timer(&_E->_timer, NULL, _E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E->_timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, &_E);
+timer_setup(&_E._timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._timer, NULL, 0);
)
@change_timer_function_usage@
expression _E;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@
(
-setup_timer(&_E->_timer, _callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
_E->_timer@_stl.function = _callback;
|
_E->_timer@_stl.function = &_callback;
|
_E->_timer@_stl.function = (_cast_func)_callback;
|
_E->_timer@_stl.function = (_cast_func)&_callback;
|
_E._timer@_stl.function = _callback;
|
_E._timer@_stl.function = &_callback;
|
_E._timer@_stl.function = (_cast_func)_callback;
|
_E._timer@_stl.function = (_cast_func)&_callback;
)
// callback(unsigned long arg)
@change_callback_handle_cast
depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@
void _callback(
-_origtype _origarg
+struct timer_list *t
)
{
(
... when != _origarg
_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle;
... when != _handle
_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle;
... when != _handle
_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
)
}
// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
depends on change_timer_function_usage &&
!change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@
void _callback(
-_origtype _origarg
+struct timer_list *t
)
{
+ _handletype *_origarg = from_timer(_origarg, t, _timer);
+
... when != _origarg
- (_handletype *)_origarg
+ _origarg
... when != _origarg
}
// Avoid already converted callbacks.
@match_callback_converted
depends on change_timer_function_usage &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@
void _callback(struct timer_list *t)
{ ... }
// callback(struct something *handle)
@change_callback_handle_arg
depends on change_timer_function_usage &&
!match_callback_converted &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@
void _callback(
-_handletype *_handle
+struct timer_list *t
)
{
+ _handletype *_handle = from_timer(_handle, t, _timer);
...
}
// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
depends on change_timer_function_usage &&
change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@
void _callback(struct timer_list *t)
{
- _handletype *_handle = from_timer(_handle, t, _timer);
}
// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
depends on change_timer_function_usage &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg &&
!change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@
(
-timer_setup(&_E->_timer, _callback, 0);
+setup_timer(&_E->_timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._timer, _callback, 0);
+setup_timer(&_E._timer, _callback, (_cast_data)&_E);
)
// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
depends on change_timer_function_usage &&
(change_callback_handle_cast ||
change_callback_handle_cast_no_arg ||
change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@
(
_E->_timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
;
)
// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
depends on change_timer_function_usage &&
(change_callback_handle_cast ||
change_callback_handle_cast_no_arg ||
change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@
_callback(
(
-(_cast_data)_E
+&_E->_timer
|
-(_cast_data)&_E
+&_E._timer
|
-_E
+&_E->_timer
)
)
// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _timer;
identifier _callback;
@@
(
-setup_timer(&_E->_timer, _callback, 0);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0L);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0UL);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0L);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0UL);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0L);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0UL);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0L);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0UL);
+timer_setup(_timer, _callback, 0);
)
@change_callback_unused_data
depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@
void _callback(
-_origtype _origarg
+struct timer_list *unused
)
{
... when != _origarg
}
Signed-off-by: Kees Cook <keescook@chromium.org>
2017-10-16 14:43:17 -07:00
static void addrconf_rs_timer ( struct timer_list * t )
2005-04-16 15:20:36 -07:00
{
2017-10-16 14:43:17 -07:00
struct inet6_dev * idev = from_timer ( idev , t , rs_timer ) ;
2013-08-31 13:44:32 +08:00
struct net_device * dev = idev - > dev ;
2013-06-23 18:39:01 +02:00
struct in6_addr lladdr ;
2005-04-16 15:20:36 -07:00
2013-06-23 18:39:01 +02:00
write_lock ( & idev - > lock ) ;
2010-03-02 13:32:45 +00:00
if ( idev - > dead | | ! ( idev - > if_flags & IF_READY ) )
2005-04-16 15:20:36 -07:00
goto out ;
2012-12-02 01:44:53 +00:00
if ( ! ipv6_accept_ra ( idev ) )
2010-03-02 13:32:45 +00:00
goto out ;
/* Announcement received after solicitation was sent */
if ( idev - > if_flags & IF_RA_RCVD )
2005-04-16 15:20:36 -07:00
goto out ;
2016-09-27 23:57:58 -07:00
if ( idev - > rs_probes + + < idev - > cnf . rtr_solicits | | idev - > cnf . rtr_solicits < 0 ) {
2013-08-31 13:44:32 +08:00
write_unlock ( & idev - > lock ) ;
if ( ! ipv6_get_lladdr ( dev , & lladdr , IFA_F_TENTATIVE ) )
ndisc_send_rs ( dev , & lladdr ,
2013-06-23 18:39:01 +02:00
& in6addr_linklocal_allrouters ) ;
else
2013-08-31 13:44:32 +08:00
goto put ;
2005-04-16 15:20:36 -07:00
2013-08-31 13:44:32 +08:00
write_lock ( & idev - > lock ) ;
2016-09-27 23:57:58 -07:00
idev - > rs_interval = rfc3315_s14_backoff_update (
idev - > rs_interval , idev - > cnf . rtr_solicit_max_interval ) ;
2013-06-23 18:39:01 +02:00
/* The wait after the last probe can be shorter */
addrconf_mod_rs_timer ( idev , ( idev - > rs_probes = =
idev - > cnf . rtr_solicits ) ?
idev - > cnf . rtr_solicit_delay :
2016-09-27 23:57:58 -07:00
idev - > rs_interval ) ;
2005-04-16 15:20:36 -07:00
} else {
/*
* Note : we do not support deprecated " all on-link "
* assumption any longer .
*/
2012-05-15 14:11:54 +00:00
pr_debug ( " %s: no IPv6 routers present \n " , idev - > dev - > name ) ;
2005-04-16 15:20:36 -07:00
}
out :
2013-06-23 18:39:01 +02:00
write_unlock ( & idev - > lock ) ;
2013-08-31 13:44:32 +08:00
put :
2013-06-23 18:39:01 +02:00
in6_dev_put ( idev ) ;
2005-04-16 15:20:36 -07:00
}
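addrconf_rs_timer() above grows the solicitation interval through rfc3315_s14_backoff_update(). The sketch below models that kind of update on RFC 3315 section 14 (roughly double the timeout with about 10% jitter, and cap it near the maximum with the same jitter). It is an illustration rather than the kernel's code: backoff_update() and its jitter_ppm parameter are hypothetical, with jitter_ppm (0..200000) standing in for the random draw so the result is reproducible. It assumes rt and mrt are non-negative.

```c
#include <stdint.h>

/*
 * Hypothetical model of an RFC 3315 section 14 backoff step.
 *   rt         - previous retransmission timeout
 *   mrt        - maximum retransmission timeout
 *   jitter_ppm - value in [0, 200000] replacing the random draw
 */
static int32_t backoff_update(int32_t rt, int32_t mrt, uint32_t jitter_ppm)
{
	/* Scale rt by a factor between 1.9 and 2.1. */
	uint64_t tmp = (1900000ULL + jitter_ppm) * (uint64_t)rt / 1000000;

	if ((int32_t)tmp > mrt) {
		/* Past the cap: use mrt scaled by 0.9 .. 1.1 instead. */
		tmp = (900000ULL + jitter_ppm) * (uint64_t)mrt / 1000000;
	}
	return (int32_t)tmp;
}
```

With jitter_ppm fixed at 100000 the factor is exactly 2.0 (or 1.0 once capped), so an interval of 4000 becomes 8000 under a large cap and 5000 under a cap of 5000.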
/*
* Duplicate Address Detection
*/
2005-12-21 22:57:24 +09:00
static void addrconf_dad_kick ( struct inet6_ifaddr * ifp )
{
unsigned long rand_num ;
struct inet6_dev * idev = ifp - > idev ;
2016-12-02 14:00:08 -08:00
u64 nonce ;
2005-12-21 22:57:24 +09:00
2007-04-25 17:08:10 -07:00
if ( ifp - > flags & IFA_F_OPTIMISTIC )
rand_num = 0 ;
else
2022-10-09 20:44:02 -06:00
rand_num = get_random_u32_below ( idev - > cnf . rtr_solicit_delay ? : 1 ) ;
2007-04-25 17:08:10 -07:00
2016-12-02 14:00:08 -08:00
nonce = 0 ;
if ( idev - > cnf . enhanced_dad | |
dev_net ( idev - > dev ) - > ipv6 . devconf_all - > enhanced_dad ) {
do
get_random_bytes ( & nonce , 6 ) ;
while ( nonce = = 0 ) ;
}
ifp - > dad_nonce = nonce ;
2013-06-23 18:39:01 +02:00
ifp - > dad_probes = idev - > cnf . dad_transmits ;
2014-03-27 18:28:07 +01:00
addrconf_mod_dad_work ( ifp , rand_num ) ;
2005-12-21 22:57:24 +09:00
}
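The nonce loop in addrconf_dad_kick() above fills 6 bytes and retries until the value is non-zero, so that 0 can keep meaning "no nonce". A userspace sketch of the same pattern is shown here. make_dad_nonce(), fill_random_fn and demo_fill() are hypothetical names introduced for this example, and demo_fill() is a deterministic stand-in for a real random source.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type: fills 'len' bytes at 'buf'. */
typedef void (*fill_random_fn)(void *buf, size_t len);

/* Draw a 48-bit nonce, retrying until it is non-zero. */
static uint64_t make_dad_nonce(fill_random_fn fill)
{
	uint64_t nonce = 0;

	do {
		/* Only 6 of the 8 bytes are filled, as in the code above. */
		fill(&nonce, 6);
	} while (nonce == 0);

	return nonce;
}

/*
 * Deterministic stand-in for a random source, for demonstration only:
 * all zeros on the first call, then 0x01 in every byte.
 */
static void demo_fill(void *buf, size_t len)
{
	static int calls;

	memset(buf, calls++ == 0 ? 0x00 : 0x01, len);
}
```

Passing demo_fill exercises the retry path once: the first draw is all zeros and is rejected, and the second draw is returned.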
2014-03-27 18:28:07 +01:00
static void addrconf_dad_begin ( struct inet6_ifaddr * ifp )
2005-04-16 15:20:36 -07:00
{
struct inet6_dev * idev = ifp - > idev ;
struct net_device * dev = idev - > dev ;
2016-11-22 16:57:40 +01:00
bool bump_id , notify = false ;
2018-04-17 17:33:11 -07:00
struct net * net ;
2005-04-16 15:20:36 -07:00
addrconf_join_solict ( dev , & ifp - > addr ) ;
read_lock_bh ( & idev - > lock ) ;
2010-05-18 15:56:06 -07:00
spin_lock ( & ifp - > lock ) ;
2010-05-18 15:36:06 -07:00
if ( ifp - > state = = INET6_IFADDR_STATE_DEAD )
2005-04-16 15:20:36 -07:00
goto out ;
2018-04-17 17:33:11 -07:00
net = dev_net ( dev ) ;
2005-04-16 15:20:36 -07:00
if ( dev - > flags & ( IFF_NOARP | IFF_LOOPBACK ) | |
2018-04-17 17:33:11 -07:00
( net - > ipv6 . devconf_all - > accept_dad < 1 & &
2017-10-05 19:03:05 +02:00
idev - > cnf . accept_dad < 1 ) | |
2006-09-22 14:45:27 -07:00
! ( ifp - > flags & IFA_F_TENTATIVE ) | |
ifp - > flags & IFA_F_NODAD ) {
2018-01-25 20:16:29 -08:00
bool send_na = false ;
if ( ifp - > flags & IFA_F_TENTATIVE & &
! ( ifp - > flags & IFA_F_OPTIMISTIC ) )
send_na = true ;
2016-11-22 16:57:40 +01:00
bump_id = ifp - > flags & IFA_F_TENTATIVE ;
2009-09-09 14:41:32 +00:00
ifp - > flags & = ~ ( IFA_F_TENTATIVE | IFA_F_OPTIMISTIC | IFA_F_DADFAILED ) ;
2010-02-08 19:48:52 +00:00
spin_unlock ( & ifp - > lock ) ;
2005-04-16 15:20:36 -07:00
read_unlock_bh ( & idev - > lock ) ;
2018-01-25 20:16:29 -08:00
addrconf_dad_completed ( ifp , bump_id , send_na ) ;
2005-04-16 15:20:36 -07:00
return ;
}
2005-12-27 13:35:15 -08:00
if ( ! ( idev - > if_flags & IF_READY ) ) {
2010-02-08 19:48:52 +00:00
spin_unlock ( & ifp - > lock ) ;
2005-12-27 13:35:15 -08:00
read_unlock_bh ( & idev - > lock ) ;
2005-12-21 22:57:24 +09:00
/*
2009-06-09 10:41:12 +09:00
* If the device is not ready :
2005-12-21 22:57:24 +09:00
* - keep it tentative if it is a permanent address .
* - otherwise , kill it .
*/
in6_ifa_hold ( ifp ) ;
2009-09-09 14:41:32 +00:00
addrconf_dad_stop ( ifp , 0 ) ;
2005-12-27 13:35:15 -08:00
return ;
2005-12-21 22:57:24 +09:00
}
2007-04-25 17:08:10 -07:00
/*
* Optimistic nodes can start receiving
* frames right away
*/
net: ipv6: Add a sysctl to make optimistic addresses useful candidates
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes. Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.
This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).
The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s)), but not in the multiple distinct
networks case.
For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try to use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.
Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMISTIC
flag appropriately set). A second RTM_NEWADDR is sent if DAD
completes (the address flags have changed), otherwise an
RTM_DELADDR is sent.
Also: add an entry in ip-sysctl.txt for optimistic_dad.
Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28 18:11:14 +09:00
if ( ifp - > flags & IFA_F_OPTIMISTIC ) {
2018-04-17 17:33:11 -07:00
ip6_ins_rt ( net , ifp - > rt ) ;
if ( ipv6_use_optimistic_addr ( net , idev ) ) {
2014-10-28 18:11:14 +09:00
/* Because optimistic nodes can use this address,
* notify listeners . If DAD fails , RTM_DELADDR is sent .
*/
2016-02-02 02:11:10 +00:00
notify = true ;
2014-10-28 18:11:14 +09:00
}
}
2007-04-25 17:08:10 -07:00
2005-12-27 13:35:15 -08:00
addrconf_dad_kick ( ifp ) ;
2005-04-16 15:20:36 -07:00
out :
2010-05-18 15:56:06 -07:00
spin_unlock ( & ifp - > lock ) ;
2005-04-16 15:20:36 -07:00
read_unlock_bh ( & idev - > lock ) ;
2016-02-02 02:11:10 +00:00
if ( notify )
ipv6_ifa_notify ( RTM_NEWADDR , ifp ) ;
2005-04-16 15:20:36 -07:00
}
2014-03-27 18:28:07 +01:00
static void addrconf_dad_start ( struct inet6_ifaddr * ifp )
2005-04-16 15:20:36 -07:00
{
2014-03-27 18:28:07 +01:00
bool begin_dad = false ;
2015-03-23 23:36:03 +01:00
spin_lock_bh ( & ifp - > lock ) ;
2014-03-27 18:28:07 +01:00
if ( ifp - > state ! = INET6_IFADDR_STATE_DEAD ) {
ifp - > state = INET6_IFADDR_STATE_PREDAD ;
begin_dad = true ;
}
2015-03-23 23:36:03 +01:00
spin_unlock_bh ( & ifp - > lock ) ;
2014-03-27 18:28:07 +01:00
if ( begin_dad )
addrconf_mod_dad_work ( ifp , 0 ) ;
}
static void addrconf_dad_work ( struct work_struct * w )
{
struct inet6_ifaddr * ifp = container_of ( to_delayed_work ( w ) ,
struct inet6_ifaddr ,
dad_work ) ;
2005-04-16 15:20:36 -07:00
struct inet6_dev * idev = ifp - > idev ;
2016-11-22 16:57:40 +01:00
bool bump_id , disable_ipv6 = false ;
2005-04-16 15:20:36 -07:00
struct in6_addr mcaddr ;
2014-03-27 18:28:07 +01:00
enum {
DAD_PROCESS ,
DAD_BEGIN ,
DAD_ABORT ,
} action = DAD_PROCESS ;
rtnl_lock ( ) ;
2015-03-23 23:36:03 +01:00
spin_lock_bh ( & ifp - > lock ) ;
2014-03-27 18:28:07 +01:00
if ( ifp - > state = = INET6_IFADDR_STATE_PREDAD ) {
action = DAD_BEGIN ;
ifp - > state = INET6_IFADDR_STATE_DAD ;
} else if ( ifp - > state = = INET6_IFADDR_STATE_ERRDAD ) {
action = DAD_ABORT ;
ifp - > state = INET6_IFADDR_STATE_POSTDAD ;
net: ipv6: Remove addresses for failures with strict DAD
If DAD fails with accept_dad set to 2, global addresses and host routes
are incorrectly left in place. Even though disable_ipv6 is set,
contrary to documentation, the addresses are not dynamically deleted
from the interface. It is only on a subsequent link down/up that these
are removed. The fix is not only to set the disable_ipv6 flag, but
also to call addrconf_ifdown(), which is the action to carry out when
disabling IPv6. This results in the addresses and routes being deleted
immediately. The DAD failure for the LL addr is determined as before
via netlink, or by the absence of the LL addr (which also previously
would have had to be checked for in case of an intervening link down
and up). As the call to addrconf_ifdown() requires an rtnl lock, the
logic to disable IPv6 when DAD fails is moved to addrconf_dad_work().
Previous behavior:
root@vm1:/# sysctl net.ipv6.conf.eth3.accept_dad=2
net.ipv6.conf.eth3.accept_dad = 2
root@vm1:/# ip -6 addr add 2000::10/64 dev eth3
root@vm1:/# ip link set up eth3
root@vm1:/# ip -6 addr show dev eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2000::10/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe43:dd5a/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
root@vm1:/# ip -6 route show dev eth3
2000::/64 proto kernel metric 256
fe80::/64 proto kernel metric 256
root@vm1:/# ip link set down eth3
root@vm1:/# ip link set up eth3
root@vm1:/# ip -6 addr show dev eth3
root@vm1:/# ip -6 route show dev eth3
root@vm1:/#
New behavior:
root@vm1:/# sysctl net.ipv6.conf.eth3.accept_dad=2
net.ipv6.conf.eth3.accept_dad = 2
root@vm1:/# ip -6 addr add 2000::10/64 dev eth3
root@vm1:/# ip link set up eth3
root@vm1:/# ip -6 addr show dev eth3
root@vm1:/# ip -6 route show dev eth3
root@vm1:/#
Signed-off-by: Mike Manning <mmanning@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18 14:39:40 +01:00
ipv6: fix net.ipv6.conf.all interface DAD handlers
Currently, writing into
net.ipv6.conf.all.{accept_dad,use_optimistic,optimistic_dad} has no effect.
Fix handling of these flags by:
- using the maximum of global and per-interface values for the
accept_dad flag. That is, if at least one of the two values is
non-zero, enable DAD on the interface. If at least one value is
set to 2, enable DAD and disable IPv6 operation on the interface if a
duplicate MAC-based link-local address is found
- using the logical OR of global and per-interface values for the
optimistic_dad flag. If at least one of them is set to one, optimistic
duplicate address detection (RFC 4429) is enabled on the interface
- using the logical OR of global and per-interface values for the
use_optimistic flag. If at least one of them is set to one,
optimistic addresses won't be marked as deprecated during source address
selection on the interface.
While at it, as we're modifying the prototype for ipv6_use_optimistic_addr(),
drop inline, and let the compiler decide.
Fixes: 7fd2561e4ebd ("net: ipv6: Add a sysctl to make optimistic addresses useful candidates")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-12 17:46:37 +02:00
if ( ( dev_net ( idev - > dev ) - > ipv6 . devconf_all - > accept_dad > 1 | |
idev - > cnf . accept_dad > 1 ) & &
! idev - > cnf . disable_ipv6 & &
2016-08-18 14:39:40 +01:00
! ( ifp - > flags & IFA_F_STABLE_PRIVACY ) ) {
struct in6_addr addr ;
addr . s6_addr32 [ 0 ] = htonl ( 0xfe800000 ) ;
addr . s6_addr32 [ 1 ] = 0 ;
if ( ! ipv6_generate_eui64 ( addr . s6_addr + 8 , idev - > dev ) & &
ipv6_addr_equal ( & ifp - > addr , & addr ) ) {
/* DAD failed for link-local based on MAC */
idev - > cnf . disable_ipv6 = 1 ;
pr_info ( " %s: IPv6 being disabled! \n " ,
ifp - > idev - > dev - > name ) ;
disable_ipv6 = true ;
}
}
2014-03-27 18:28:07 +01:00
}
2015-03-23 23:36:03 +01:00
spin_unlock_bh ( & ifp - > lock ) ;
2014-03-27 18:28:07 +01:00
if ( action = = DAD_BEGIN ) {
addrconf_dad_begin ( ifp ) ;
goto out ;
} else if ( action = = DAD_ABORT ) {
2016-09-05 16:06:31 +08:00
in6_ifa_hold ( ifp ) ;
2014-03-27 18:28:07 +01:00
addrconf_dad_stop ( ifp , 1 ) ;
2016-08-18 14:39:40 +01:00
if ( disable_ipv6 )
2020-07-31 15:32:07 +02:00
addrconf_ifdown ( idev - > dev , false ) ;
2014-03-27 18:28:07 +01:00
goto out ;
}
2013-06-23 18:39:01 +02:00
if ( ! ifp - > dad_probes & & addrconf_dad_end ( ifp ) )
2010-05-18 15:55:27 -07:00
goto out ;
2014-03-27 18:28:07 +01:00
write_lock_bh ( & idev - > lock ) ;
2010-03-02 13:32:44 +00:00
if ( idev - > dead | | ! ( idev - > if_flags & IF_READY ) ) {
2014-03-27 18:28:07 +01:00
write_unlock_bh ( & idev - > lock ) ;
2005-04-16 15:20:36 -07:00
goto out ;
}
2010-02-08 19:48:52 +00:00
spin_lock ( & ifp - > lock ) ;
2010-05-18 15:56:06 -07:00
if ( ifp - > state = = INET6_IFADDR_STATE_DEAD ) {
spin_unlock ( & ifp - > lock ) ;
2014-03-27 18:28:07 +01:00
write_unlock_bh ( & idev - > lock ) ;
2010-05-18 15:56:06 -07:00
goto out ;
}
2013-06-23 18:39:01 +02:00
if ( ifp - > dad_probes = = 0 ) {
2018-01-25 20:16:29 -08:00
bool send_na = false ;
2005-04-16 15:20:36 -07:00
/*
* DAD was successful
*/
2018-01-25 20:16:29 -08:00
if ( ifp - > flags & IFA_F_TENTATIVE & &
! ( ifp - > flags & IFA_F_OPTIMISTIC ) )
send_na = true ;
2016-11-22 16:57:40 +01:00
bump_id = ifp - > flags & IFA_F_TENTATIVE ;
2009-09-09 14:41:32 +00:00
ifp - > flags & = ~ ( IFA_F_TENTATIVE | IFA_F_OPTIMISTIC | IFA_F_DADFAILED ) ;
2010-02-08 19:48:52 +00:00
spin_unlock ( & ifp - > lock ) ;
2014-03-27 18:28:07 +01:00
write_unlock_bh ( & idev - > lock ) ;
2005-04-16 15:20:36 -07:00
2018-01-25 20:16:29 -08:00
addrconf_dad_completed ( ifp , bump_id , send_na ) ;
2005-04-16 15:20:36 -07:00
goto out ;
}
2013-06-23 18:39:01 +02:00
ifp - > dad_probes - - ;
2014-03-27 18:28:07 +01:00
addrconf_mod_dad_work ( ifp ,
2020-04-01 14:46:20 +08:00
max ( NEIGH_VAR ( ifp - > idev - > nd_parms , RETRANS_TIME ) ,
HZ / 100 ) ) ;
2010-02-08 19:48:52 +00:00
spin_unlock ( & ifp - > lock ) ;
2014-03-27 18:28:07 +01:00
write_unlock_bh ( & idev - > lock ) ;
2005-04-16 15:20:36 -07:00
/* send a neighbour solicitation for our addr */
addrconf_addr_solict_mult ( & ifp - > addr , & mcaddr ) ;
2016-12-02 14:00:08 -08:00
ndisc_send_ns ( ifp - > idev - > dev , & ifp - > addr , & mcaddr , & in6addr_any ,
ifp - > dad_nonce ) ;
2005-04-16 15:20:36 -07:00
out :
in6_ifa_put ( ifp ) ;
2014-03-27 18:28:07 +01:00
rtnl_unlock ( ) ;
2005-04-16 15:20:36 -07:00
}
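The "fix net.ipv6.conf.all interface DAD handlers" commit message above describes how the global ("all") and per-interface sysctl values are combined. The following is a minimal userspace sketch of that rule, not kernel code: `struct dad_sysctls` and the three `effective_*` helpers are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>

/* Hypothetical container for the three DAD-related sysctl values. */
struct dad_sysctls {
	int accept_dad;     /* 0: no DAD, 1: DAD, 2: DAD and disable IPv6 on failure */
	int optimistic_dad; /* 0 or 1 */
	int use_optimistic; /* 0 or 1 */
};

/* accept_dad: the larger of the "all" and per-interface values applies. */
int effective_accept_dad(const struct dad_sysctls *all,
			 const struct dad_sysctls *dev)
{
	return all->accept_dad > dev->accept_dad ? all->accept_dad
						 : dev->accept_dad;
}

/* optimistic_dad: enabled if either level enables it (logical OR). */
bool effective_optimistic_dad(const struct dad_sysctls *all,
			      const struct dad_sysctls *dev)
{
	return all->optimistic_dad || dev->optimistic_dad;
}

/* use_optimistic: enabled if either level enables it (logical OR). */
bool effective_use_optimistic(const struct dad_sysctls *all,
			      const struct dad_sysctls *dev)
{
	return all->use_optimistic || dev->use_optimistic;
}
```

The check in addrconf_dad_work() that tests whether either accept_dad value is greater than 1 is equivalent to asking whether the maximum of the two exceeds 1, which is what `effective_accept_dad()` models here.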
2014-01-16 20:13:04 +01:00
/* ifp->idev must be at least read locked */
static bool ipv6_lonely_lladdr ( struct inet6_ifaddr * ifp )
{
struct inet6_ifaddr * ifpiter ;
struct inet6_dev * idev = ifp - > idev ;
2014-01-19 21:58:19 +01:00
list_for_each_entry_reverse ( ifpiter , & idev - > addr_list , if_list ) {
if ( ifpiter - > scope > IFA_LINK )
break ;
2014-01-16 20:13:04 +01:00
if ( ifp ! = ifpiter & & ifpiter - > scope = = IFA_LINK & &
( ifpiter - > flags & ( IFA_F_PERMANENT | IFA_F_TENTATIVE |
IFA_F_OPTIMISTIC | IFA_F_DADFAILED ) ) = =
IFA_F_PERMANENT )
return false ;
}
return true ;
}
2018-01-25 20:16:29 -08:00
static void addrconf_dad_completed ( struct inet6_ifaddr * ifp , bool bump_id ,
bool send_na )
2005-04-16 15:20:36 -07:00
{
2010-03-20 16:09:01 -07:00
struct net_device * dev = ifp - > idev - > dev ;
2013-06-23 18:39:01 +02:00
struct in6_addr lladdr ;
2013-06-27 00:07:01 +02:00
bool send_rs , send_mld ;
2013-06-23 18:39:01 +02:00
2014-03-27 18:28:07 +01:00
addrconf_del_dad_work ( ifp ) ;
2005-04-16 15:20:36 -07:00
/*
* Configure the address for reception . Now it is valid .
*/
ipv6_ifa_notify ( RTM_NEWADDR , ifp ) ;
ipv6: Send ICMPv6 RSes only when RAs are accepted
This patch improves the logic determining when to send ICMPv6 Router
Solicitations, so that they are 1) always sent when the kernel is
accepting Router Advertisements, and 2) never sent when the kernel is
not accepting RAs. In other words, the operational setting of the
"accept_ra" sysctl is used.
The change also makes the special "Hybrid Router" forwarding mode
("forwarding" sysctl set to 2) operate exactly the same as the standard
Router mode (forwarding=1). The only difference between the two was
that RSes were being sent in the Hybrid Router mode only. The sysctl
documentation describing the special Hybrid Router mode has therefore
been removed.
Rationale for the change:
Currently, the value of forwarding sysctl is the only thing determining
whether or not to send RSes. If it has the value 0 or 2, they are sent,
otherwise they are not. This leads to inconsistent behaviour in the
following cases:
* accept_ra=0, forwarding=0
* accept_ra=0, forwarding=2
* accept_ra=1, forwarding=2
* accept_ra=2, forwarding=1
In the first three cases, the kernel will send RSes, even though it will
not accept any RAs received in reply. In the last case, it will not send
any RSes, even though it will accept and process any RAs received. (Most
routers will send unsolicited RAs periodically, so suppressing RSes in
the last case will merely delay auto-configuration, not prevent it.)
Also, it is my opinion that having the forwarding sysctl control RS
sending behaviour (completely independent of whether RAs are being
accepted or not) is simply not what most users would intuitively expect
to be the case.
Signed-off-by: Tore Anderson <tore@fud.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-28 23:47:33 +00:00
/* If added prefix is link local and we are prepared to process
router advertisements , start sending router solicitations .
2005-04-16 15:20:36 -07:00
*/
2013-06-27 00:06:56 +02:00
read_lock_bh ( & ifp - > idev - > lock ) ;
2014-01-16 20:13:04 +01:00
send_mld = ifp - > scope = = IFA_LINK & & ipv6_lonely_lladdr ( ifp ) ;
2013-06-27 00:07:01 +02:00
send_rs = send_mld & &
ipv6_accept_ra ( ifp - > idev ) & &
2016-09-27 23:57:58 -07:00
ifp - > idev - > cnf . rtr_solicits ! = 0 & &
2022-04-29 13:38:02 +08:00
( dev - > flags & IFF_LOOPBACK ) = = 0 & &
2023-03-22 19:35:41 -04:00
( dev - > type ! = ARPHRD_TUNNEL ) & &
! netif_is_team_port ( dev ) ;
2013-06-27 00:06:56 +02:00
read_unlock_bh ( & ifp - > idev - > lock ) ;
2013-06-27 00:07:01 +02:00
/* While DAD is in progress, the MLD report's source address is in6addr_any.
* Resend with proper ll now .
*/
if ( send_mld )
ipv6_mc_dad_complete ( ifp - > idev ) ;
2018-01-25 20:16:29 -08:00
/* send unsolicited NA if enabled */
if ( send_na & &
( ifp - > idev - > cnf . ndisc_notify | |
dev_net ( dev ) - > ipv6 . devconf_all - > ndisc_notify ) ) {
ndisc_send_na ( dev , & in6addr_linklocal_allnodes , & ifp - > addr ,
/*router=*/ ! ! ifp - > idev - > cnf . forwarding ,
/*solicited=*/ false , /*override=*/ true ,
/*inc_opt=*/ true ) ;
}
2013-06-27 00:06:56 +02:00
if ( send_rs ) {
2005-04-16 15:20:36 -07:00
/*
* If a host has already performed a random delay
* [ . . . ] as part of DAD [ . . . ] there is no need
* to delay again before sending the first RS
*/
2013-06-27 00:06:56 +02:00
if ( ipv6_get_lladdr ( dev , & lladdr , IFA_F_TENTATIVE ) )
2013-06-23 18:39:01 +02:00
return ;
2013-06-27 00:06:56 +02:00
ndisc_send_rs ( dev , & lladdr , & in6addr_linklocal_allrouters ) ;
2005-04-16 15:20:36 -07:00
2013-06-23 18:39:01 +02:00
write_lock_bh ( & ifp - > idev - > lock ) ;
spin_lock ( & ifp - > lock ) ;
2016-09-27 23:57:58 -07:00
ifp - > idev - > rs_interval = rfc3315_s14_backoff_init (
ifp - > idev - > cnf . rtr_solicit_interval ) ;
2013-06-23 18:39:01 +02:00
ifp - > idev - > rs_probes = 1 ;
2005-04-16 15:20:36 -07:00
ifp - > idev - > if_flags | = IF_RS_SENT ;
2016-09-27 23:57:58 -07:00
addrconf_mod_rs_timer ( ifp - > idev , ifp - > idev - > rs_interval ) ;
2013-06-23 18:39:01 +02:00
spin_unlock ( & ifp - > lock ) ;
write_unlock_bh ( & ifp - > idev - > lock ) ;
2005-04-16 15:20:36 -07:00
}
2016-11-22 16:57:40 +01:00
if ( bump_id )
rt_genid_bump_ipv6 ( dev_net ( dev ) ) ;
ipv6: addrconf: fix generation of new temporary addresses
Under some circumstances it is possible that no new temporary addresses
will be generated.
For instance, addrconf_prefix_rcv_add_addr() indirectly calls
ipv6_create_tempaddr(), which creates a tentative temporary address and
starts dad. Next, addrconf_prefix_rcv_add_addr() indirectly calls
addrconf_verify_rtnl(). Now, assume that the previously created temporary
address has the least preferred lifetime among all existing addresses and
is still tentative (that is, dad is still running). Hence, the next run of
addrconf_verify_rtnl() is performed when the preferred lifetime of the
temporary address ends. If dad succeeds before the next run, the temporary
address becomes deprecated during the next run, but no new temporary
address is generated.
In order to fix this, schedule the next addrconf_verify_rtnl() run slightly
before the temporary address becomes deprecated, if dad succeeded.
Signed-off-by: Marcus Huewe <suse-tux@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
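As a rough illustration of "slightly before the temporary address becomes deprecated", the sketch below restates the regen_advance arithmetic that addrconf_verify_rtnl() uses in this file. It is a userspace approximation under stated assumptions: `HZ_SKETCH` is an assumed tick rate standing in for the kernel's HZ, and both function names are hypothetical.

```c
/* Assumed tick rate for this sketch; the kernel's HZ is configuration dependent. */
#define HZ_SKETCH 1000UL

/*
 * Seconds of head start needed to regenerate a temporary address:
 * retries * DAD transmissions * retransmit interval, where the interval
 * (given in ticks) is floored at HZ/100 before use.
 */
unsigned long regen_advance_secs(unsigned long regen_max_retry,
				 unsigned long dad_transmits,
				 unsigned long retrans_time_ticks)
{
	unsigned long floor = HZ_SKETCH / 100;
	unsigned long retrans = retrans_time_ticks > floor ? retrans_time_ticks
							   : floor;

	return regen_max_retry * dad_transmits * retrans / HZ_SKETCH;
}

/*
 * Tick at which the next verification run should happen so that a new
 * temporary address exists before the old one is deprecated.
 */
unsigned long regen_deadline_ticks(unsigned long tstamp_ticks,
				   unsigned long prefered_lft_secs,
				   unsigned long regen_advance)
{
	return tstamp_ticks + prefered_lft_secs * HZ_SKETCH -
	       regen_advance * HZ_SKETCH;
}
```

With three retries, one DAD transmission, and a one-second retransmit interval, the advance is three seconds, so the run is scheduled three seconds before the preferred lifetime ends.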
2017-02-06 18:34:56 +01:00
/* Make sure that a new temporary address will be created
* before this temporary address becomes deprecated .
*/
if ( ifp - > flags & IFA_F_TEMPORARY )
2022-02-07 20:50:29 -08:00
addrconf_verify_rtnl ( dev_net ( dev ) ) ;
2005-04-16 15:20:36 -07:00
}
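The "Send ICMPv6 RSes only when RAs are accepted" commit message lists four sysctl combinations where the old rule misbehaved. The sketch below compares the old and new decisions for just the accept_ra/forwarding part of the send_rs condition; the other terms (lone link-local address, rtr_solicits, loopback and tunnel checks) are left out. The function names are hypothetical, and `ra_accepted()` is an assumption based on the documented accept_ra semantics (with forwarding enabled, RAs are accepted only when accept_ra is 2).

```c
#include <stdbool.h>

/* Operational accept_ra: with forwarding on, only accept_ra == 2 accepts RAs. */
bool ra_accepted(int accept_ra, int forwarding)
{
	return forwarding ? accept_ra == 2 : accept_ra != 0;
}

/* Old rule from the commit message: only the forwarding sysctl mattered. */
bool rs_sent_old(int forwarding)
{
	return forwarding == 0 || forwarding == 2;
}

/* New rule: send Router Solicitations exactly when RAs would be accepted. */
bool rs_sent_new(int accept_ra, int forwarding)
{
	return ra_accepted(accept_ra, forwarding);
}
```

For accept_ra=2, forwarding=1 the old rule suppresses RSes even though RAs are processed, while the new rule sends them; for the three other listed combinations the old rule sends RSes whose replies would be ignored.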
2018-11-21 21:52:33 +08:00
static void addrconf_dad_run ( struct inet6_dev * idev , bool restart )
2010-03-20 16:09:01 -07:00
{
2005-12-21 22:57:44 +09:00
struct inet6_ifaddr * ifp ;
read_lock_bh ( & idev - > lock ) ;
2010-03-17 20:31:13 +00:00
list_for_each_entry ( ifp , & idev - > addr_list , if_list ) {
2010-02-08 19:48:52 +00:00
spin_lock ( & ifp - > lock ) ;
2018-11-21 21:52:33 +08:00
if ( ( ifp - > flags & IFA_F_TENTATIVE & &
ifp - > state = = INET6_IFADDR_STATE_DAD ) | | restart ) {
if ( restart )
ifp - > state = INET6_IFADDR_STATE_PREDAD ;
2010-05-18 15:55:27 -07:00
addrconf_dad_kick ( ifp ) ;
2018-11-21 21:52:33 +08:00
}
2010-02-08 19:48:52 +00:00
spin_unlock ( & ifp - > lock ) ;
2005-12-21 22:57:44 +09:00
}
read_unlock_bh ( & idev - > lock ) ;
}
2005-04-16 15:20:36 -07:00
# ifdef CONFIG_PROC_FS
struct if6_iter_state {
2008-01-10 22:42:49 -08:00
struct seq_net_private p ;
2005-04-16 15:20:36 -07:00
int bucket ;
2012-01-03 23:31:35 +00:00
int offset ;
2005-04-16 15:20:36 -07:00
} ;
2012-01-03 23:31:35 +00:00
static struct inet6_ifaddr * if6_get_first ( struct seq_file * seq , loff_t pos )
2005-04-16 15:20:36 -07:00
{
struct if6_iter_state * state = seq - > private ;
2008-03-26 02:36:06 +09:00
struct net * net = seq_file_net ( seq ) ;
2017-10-23 16:17:50 -07:00
struct inet6_ifaddr * ifa = NULL ;
2012-01-03 23:31:35 +00:00
int p = 0 ;
2005-04-16 15:20:36 -07:00
2012-01-03 23:31:35 +00:00
/* initial bucket if pos is 0 */
if ( pos = = 0 ) {
state - > bucket = 0 ;
state - > offset = 0 ;
}
for ( ; state - > bucket < IN6_ADDR_HSIZE ; + + state - > bucket ) {
2022-02-07 20:50:30 -08:00
hlist_for_each_entry_rcu ( ifa , & net - > ipv6 . inet6_addr_lst [ state - > bucket ] ,
2012-01-03 23:31:35 +00:00
addr_lst ) {
/* sync with offset */
if ( p < state - > offset ) {
p + + ;
continue ;
}
2012-10-16 07:37:27 +00:00
return ifa ;
2012-01-03 23:31:35 +00:00
}
/* prepare for next bucket */
state - > offset = 0 ;
p = 0 ;
2005-04-16 15:20:36 -07:00
}
2010-03-17 20:31:10 +00:00
return NULL ;
2005-04-16 15:20:36 -07:00
}
2010-03-17 20:31:10 +00:00
static struct inet6_ifaddr * if6_get_next ( struct seq_file * seq ,
struct inet6_ifaddr * ifa )
2005-04-16 15:20:36 -07:00
{
struct if6_iter_state * state = seq - > private ;
2008-03-26 02:36:06 +09:00
struct net * net = seq_file_net ( seq ) ;
2005-04-16 15:20:36 -07:00
2017-10-23 16:17:50 -07:00
hlist_for_each_entry_continue_rcu ( ifa , addr_lst ) {
2012-01-03 23:31:35 +00:00
state - > offset + + ;
2012-10-16 07:37:27 +00:00
return ifa ;
2012-01-03 23:31:35 +00:00
}
2008-01-10 22:42:49 -08:00
net/ipv6: Display all addresses in output of /proc/net/if_inet6
The backend handling for /proc/net/if_inet6 in addrconf.c doesn't properly
handle starting/stopping the iteration. The problem is that at some point
during the iteration, an overflow is detected and the process is
subsequently stopped. The item being shown via seq_printf() when the
overflow occurs is not actually shown, though. When start() is
subsequently called to resume iterating, it returns the next item, and
thus the item that was being processed when the overflow occurred never
gets printed.
Alter the meaning of the private data member "offset". Currently, when it
is not 0 (which only happens at the very beginning), "offset" represents
the next hlist item to be printed. After this change, "offset" always
represents the current item.
This is also consistent with the private data member "bucket", which
represents the current bucket, and also the use of "pos" as defined in
seq_file.txt:
The pos passed to start() will always be either zero, or the most
recent pos used in the previous session.
Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 00:45:27 +00:00
state - > offset = 0 ;
2010-03-17 20:31:10 +00:00
while ( + + state - > bucket < IN6_ADDR_HSIZE ) {
2017-10-23 16:17:50 -07:00
hlist_for_each_entry_rcu ( ifa ,
2022-02-07 20:50:30 -08:00
& net - > ipv6 . inet6_addr_lst [ state - > bucket ] , addr_lst ) {
2012-10-16 07:37:27 +00:00
return ifa ;
2010-03-17 20:31:10 +00:00
}
2005-04-16 15:20:36 -07:00
}
2008-01-10 22:42:49 -08:00
2010-03-17 20:31:10 +00:00
return NULL ;
2005-04-16 15:20:36 -07:00
}
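The "Display all addresses in output of /proc/net/if_inet6" commit message changes `offset` to mean the current item rather than the next one, so that a restarted iteration shows the interrupted item again instead of skipping it. The following is a simplified userspace sketch of that idea over a fixed table. `struct table`, `struct iter_state`, `iter_get_first()` and `iter_get_next()` are hypothetical stand-ins, not the kernel's seq_file or hlist API.

```c
#include <stddef.h>

#define NBUCKETS   3
#define BUCKET_CAP 4

/* Hypothetical hash table: each bucket holds up to BUCKET_CAP items. */
struct table {
	int items[NBUCKETS][BUCKET_CAP];
	int count[NBUCKETS];
};

/* (bucket, offset) always identifies the CURRENT item. */
struct iter_state {
	int bucket;
	int offset;
};

/* Start or resume: return the item at the saved position, if any. */
const int *iter_get_first(const struct table *t, struct iter_state *st,
			  long pos)
{
	if (pos == 0) {
		st->bucket = 0;
		st->offset = 0;
	}
	for (; st->bucket < NBUCKETS; ++st->bucket) {
		if (st->offset < t->count[st->bucket])
			return &t->items[st->bucket][st->offset];
		st->offset = 0; /* prepare for the next bucket */
	}
	return NULL;
}

/* Advance to the item after the current one, crossing buckets as needed. */
const int *iter_get_next(const struct table *t, struct iter_state *st)
{
	if (st->bucket >= NBUCKETS)
		return NULL;
	if (st->offset + 1 < t->count[st->bucket]) {
		st->offset++;
		return &t->items[st->bucket][st->offset];
	}
	st->offset = 0;
	while (++st->bucket < NBUCKETS) {
		if (t->count[st->bucket] > 0)
			return &t->items[st->bucket][0];
	}
	return NULL;
}
```

If output overflows while the second item is being shown, calling `iter_get_first()` again with a non-zero position returns that same second item, because the saved state still points at it.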
static void * if6_seq_start ( struct seq_file * seq , loff_t * pos )
2017-10-23 16:17:50 -07:00
__acquires ( rcu )
2005-04-16 15:20:36 -07:00
{
2017-10-23 16:17:50 -07:00
rcu_read_lock ( ) ;
2012-01-03 23:31:35 +00:00
return if6_get_first ( seq , * pos ) ;
2005-04-16 15:20:36 -07:00
}
static void * if6_seq_next ( struct seq_file * seq , void * v , loff_t * pos )
{
struct inet6_ifaddr * ifa ;
ifa = if6_get_next ( seq , v ) ;
+ + * pos ;
return ifa ;
}
static void if6_seq_stop ( struct seq_file * seq , void * v )
2017-10-23 16:17:50 -07:00
__releases ( rcu )
2005-04-16 15:20:36 -07:00
{
2017-10-23 16:17:50 -07:00
rcu_read_unlock ( ) ;
2005-04-16 15:20:36 -07:00
}
static int if6_seq_show ( struct seq_file * seq , void * v )
{
struct inet6_ifaddr * ifp = ( struct inet6_ifaddr * ) v ;
2013-12-10 13:56:29 +01:00
seq_printf ( seq , " %pi6 %02x %02x %02x %02x %8s \n " ,
2008-10-28 16:05:40 -07:00
& ifp - > addr ,
2005-04-16 15:20:36 -07:00
ifp - > idev - > dev - > ifindex ,
ifp - > prefix_len ,
ifp - > scope ,
2013-12-10 13:56:29 +01:00
( u8 ) ifp - > flags ,
2005-04-16 15:20:36 -07:00
ifp - > idev - > dev - > name ) ;
return 0 ;
}
2007-07-10 23:07:31 -07:00
static const struct seq_operations if6_seq_ops = {
2005-04-16 15:20:36 -07:00
. start = if6_seq_start ,
. next = if6_seq_next ,
. show = if6_seq_show ,
. stop = if6_seq_stop ,
} ;
2010-01-17 03:35:32 +00:00
static int __net_init if6_proc_net_init ( struct net * net )
2005-04-16 15:20:36 -07:00
{
2018-04-10 19:42:55 +02:00
if ( ! proc_create_net ( " if_inet6 " , 0444 , net - > proc_net , & if6_seq_ops ,
sizeof ( struct if6_iter_state ) ) )
2005-04-16 15:20:36 -07:00
return - ENOMEM ;
return 0 ;
}
2010-01-17 03:35:32 +00:00
static void __net_exit if6_proc_net_exit ( struct net * net )
2008-01-10 22:42:49 -08:00
{
2013-02-18 01:34:56 +00:00
remove_proc_entry ( " if_inet6 " , net - > proc_net ) ;
2008-01-10 22:42:49 -08:00
}
static struct pernet_operations if6_proc_net_ops = {
2014-08-24 21:53:10 +01:00
. init = if6_proc_net_init ,
. exit = if6_proc_net_exit ,
2008-01-10 22:42:49 -08:00
} ;
int __init if6_proc_init ( void )
{
return register_pernet_subsys ( & if6_proc_net_ops ) ;
}
2005-04-16 15:20:36 -07:00
void if6_proc_exit ( void )
{
2008-01-10 22:42:49 -08:00
unregister_pernet_subsys ( & if6_proc_net_ops ) ;
2005-04-16 15:20:36 -07:00
}
# endif /* CONFIG_PROC_FS */
2012-10-29 16:23:10 +00:00
# if IS_ENABLED(CONFIG_IPV6_MIP6)
2006-09-22 14:45:56 -07:00
/* Check if address is a home address configured on any interface. */
2011-04-22 04:53:02 +00:00
int ipv6_chk_home_addr ( struct net * net , const struct in6_addr * addr )
2006-09-22 14:45:56 -07:00
{
2017-10-23 16:17:47 -07:00
unsigned int hash = inet6_addr_hash ( net , addr ) ;
2010-03-17 20:31:10 +00:00
struct inet6_ifaddr * ifp = NULL ;
2017-10-23 16:17:47 -07:00
int ret = 0 ;
2010-03-17 20:31:10 +00:00
2017-10-23 16:17:51 -07:00
rcu_read_lock ( ) ;
2022-02-07 20:50:30 -08:00
hlist_for_each_entry_rcu ( ifp , & net - > ipv6 . inet6_addr_lst [ hash ] , addr_lst ) {
2008-04-10 15:42:07 +09:00
if ( ipv6_addr_equal ( & ifp - > addr , addr ) & &
2006-09-22 14:45:56 -07:00
( ifp - > flags & IFA_F_HOMEADDRESS ) ) {
ret = 1 ;
break ;
}
}
2017-10-23 16:17:51 -07:00
rcu_read_unlock ( ) ;
2006-09-22 14:45:56 -07:00
return ret ;
}
# endif
2020-03-27 18:00:19 -04:00
/* RFC 6554 describes an algorithm to avoid loops in segment routing by
* checking whether the segments contain any local interface address.
*
* Quote :
*
* To detect loops in the SRH , a router MUST determine if the SRH
* includes multiple addresses assigned to any interface on that router .
* If such addresses appear more than once and are separated by at least
* one address not assigned to that router .
*/
int ipv6_chk_rpl_srh_loop ( struct net * net , const struct in6_addr * segs ,
unsigned char nsegs )
{
const struct in6_addr * addr ;
int i , ret = 0 , found = 0 ;
struct inet6_ifaddr * ifp ;
bool separated = false ;
unsigned int hash ;
bool hash_found ;
rcu_read_lock ( ) ;
for ( i = 0 ; i < nsegs ; i + + ) {
addr = & segs [ i ] ;
hash = inet6_addr_hash ( net , addr ) ;
hash_found = false ;
2022-02-07 20:50:30 -08:00
hlist_for_each_entry_rcu ( ifp , & net - > ipv6 . inet6_addr_lst [ hash ] , addr_lst ) {
2020-03-27 18:00:19 -04:00
if ( ipv6_addr_equal ( & ifp - > addr , addr ) ) {
hash_found = true ;
break ;
}
}
if ( hash_found ) {
if ( found > 1 & & separated ) {
ret = 1 ;
break ;
}
separated = false ;
found + + ;
} else {
separated = true ;
}
}
rcu_read_unlock ( ) ;
return ret ;
}
2005-04-16 15:20:36 -07:00
/*
* Periodic address status verification
*/
2022-02-07 20:50:29 -08:00
static void addrconf_verify_rtnl ( struct net * net )
2005-04-16 15:20:36 -07:00
{
ipv6: Reduce timer events for addrconf_verify().
This patch reduces timer events while keeping accuracy by rounding
our timer and/or batching several address validations in addrconf_verify().
addrconf_verify() is called at earliest timeout among interface addresses'
timeouts, but at maximum ADDR_CHECK_FREQUENCY (120 secs).
In most cases, all timeouts of interface addresses are long enough
(e.g. several hours or days vs 2 minutes), so this timer usually fires
every ADDR_CHECK_FREQUENCY, and it is okay to be lazy.
(Note this timer could be eliminated if all code paths which modify
variables related to timeouts called us manually, but that is another story.)
However, in other, rarer but important cases, we try to keep accuracy.
When the real interface address timeout is coming, and the timeout
is just before the rounded timeout, we accept some error.
When a timeout has been reached, we also try to batch several other
events in the very near future.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20 16:11:12 -07:00
unsigned long now , next , next_sec , next_sched ;
2005-04-16 15:20:36 -07:00
struct inet6_ifaddr * ifp ;
int i ;
2014-03-27 18:28:07 +01:00
ASSERT_RTNL ( ) ;
2010-03-17 20:31:11 +00:00
rcu_read_lock_bh ( ) ;
2005-04-16 15:20:36 -07:00
now = jiffies ;
2010-03-20 16:11:12 -07:00
next = round_jiffies_up ( now + ADDR_CHECK_FREQUENCY ) ;
2005-04-16 15:20:36 -07:00
2022-02-07 20:50:29 -08:00
cancel_delayed_work ( & net - > ipv6 . addr_chk_work ) ;
2005-04-16 15:20:36 -07:00
2010-03-20 16:08:18 -07:00
for ( i = 0 ; i < IN6_ADDR_HSIZE ; i + + ) {
2005-04-16 15:20:36 -07:00
restart :
2022-02-07 20:50:30 -08:00
hlist_for_each_entry_rcu_bh ( ifp , & net - > ipv6 . inet6_addr_lst [ i ] , addr_lst ) {
2005-04-16 15:20:36 -07:00
unsigned long age ;
2013-12-31 12:04:19 +09:00
/* When setting preferred_lft to a value not zero or
* infinity , while valid_lft is infinity
* IFA_F_PERMANENT has a non - infinity life time .
*/
if ( ( ifp - > flags & IFA_F_PERMANENT ) & &
( ifp - > prefered_lft = = INFINITY_LIFE_TIME ) )
2005-04-16 15:20:36 -07:00
continue ;
spin_lock ( & ifp - > lock ) ;
2010-03-20 16:11:12 -07:00
/* We try to batch several events at once. */
age = ( now - ifp - > tstamp + ADDRCONF_TIMER_FUZZ_MINUS ) / HZ ;
2005-04-16 15:20:36 -07:00
2022-06-23 12:11:04 -06:00
if ( ( ifp - > flags & IFA_F_TEMPORARY ) & &
! ( ifp - > flags & IFA_F_TENTATIVE ) & &
ifp - > prefered_lft ! = INFINITY_LIFE_TIME & &
! ifp - > regen_count & & ifp - > ifpub ) {
/* This is a non-regenerated temporary addr. */
unsigned long regen_advance = ifp - > idev - > cnf . regen_max_retry *
ifp - > idev - > cnf . dad_transmits *
max ( NEIGH_VAR ( ifp - > idev - > nd_parms , RETRANS_TIME ) , HZ / 100 ) / HZ ;
if ( age + regen_advance > = ifp - > prefered_lft ) {
struct inet6_ifaddr * ifpub = ifp - > ifpub ;
if ( time_before ( ifp - > tstamp + ifp - > prefered_lft * HZ , next ) )
next = ifp - > tstamp + ifp - > prefered_lft * HZ ;
ifp - > regen_count + + ;
in6_ifa_hold ( ifp ) ;
in6_ifa_hold ( ifpub ) ;
spin_unlock ( & ifp - > lock ) ;
spin_lock ( & ifpub - > lock ) ;
ifpub - > regen_count = 0 ;
spin_unlock ( & ifpub - > lock ) ;
rcu_read_unlock_bh ( ) ;
ipv6_create_tempaddr ( ifpub , true ) ;
in6_ifa_put ( ifpub ) ;
in6_ifa_put ( ifp ) ;
rcu_read_lock_bh ( ) ;
goto restart ;
} else if ( time_before ( ifp - > tstamp + ifp - > prefered_lft * HZ - regen_advance * HZ , next ) )
next = ifp - > tstamp + ifp - > prefered_lft * HZ - regen_advance * HZ ;
}
2006-07-28 18:12:11 +09:00
if ( ifp - > valid_lft ! = INFINITY_LIFE_TIME & &
age > = ifp - > valid_lft ) {
2005-04-16 15:20:36 -07:00
spin_unlock ( & ifp - > lock ) ;
in6_ifa_hold ( ifp ) ;
2021-04-16 14:16:06 +00:00
rcu_read_unlock_bh ( ) ;
2005-04-16 15:20:36 -07:00
ipv6_del_addr ( ifp ) ;
2021-04-16 14:16:06 +00:00
rcu_read_lock_bh ( ) ;
2005-04-16 15:20:36 -07:00
goto restart ;
2006-07-28 18:12:11 +09:00
} else if ( ifp - > prefered_lft = = INFINITY_LIFE_TIME ) {
spin_unlock ( & ifp - > lock ) ;
continue ;
2005-04-16 15:20:36 -07:00
} else if ( age > = ifp - > prefered_lft ) {
IPv6: preferred lifetime of address not getting updated
There's a bug in addrconf_prefix_rcv() where it won't update the
preferred lifetime of an IPv6 address if the current valid lifetime
of the address is less than 2 hours (the minimum value in the RA).
For example, if I send a router advertisement with a prefix that
has valid lifetime = preferred lifetime = 2 hours we'll build
this address:
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
valid_lft 7175sec preferred_lft 7175sec
If I then send the same prefix with valid lifetime = preferred
lifetime = 0 it will be ignored since the minimum valid lifetime
is 2 hours:
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
valid_lft 7161sec preferred_lft 7161sec
But according to RFC 4862 we should always reset the preferred lifetime
even if the valid lifetime is invalid, which would cause the address
to immediately get deprecated. So with this patch we'd see this:
5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:1890:1109:a20:21f:29ff:fe5a:ef04/64 scope global deprecated dynamic
valid_lft 7163sec preferred_lft 0sec
The comment winds-up being 5x the size of the code to fix the problem.
Update the preferred lifetime of IPv6 addresses derived from a prefix
info option in a router advertisement even if the valid lifetime in
the option is invalid, as specified in RFC 4862 Section 5.5.3e. Fixes
an issue where an address will not immediately become deprecated.
Reported by Jens Rosenboom.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-02 07:10:52 +00:00
/* jiffies - ifp->tstamp > age >= ifp->prefered_lft */
2005-04-16 15:20:36 -07:00
int deprecate = 0 ;
if ( ! ( ifp - > flags & IFA_F_DEPRECATED ) ) {
deprecate = 1 ;
ifp - > flags | = IFA_F_DEPRECATED ;
}
2013-12-31 12:04:19 +09:00
if ( ( ifp - > valid_lft ! = INFINITY_LIFE_TIME ) & &
( time_before ( ifp - > tstamp + ifp - > valid_lft * HZ , next ) ) )
2005-04-16 15:20:36 -07:00
next = ifp - > tstamp + ifp - > valid_lft * HZ ;
spin_unlock ( & ifp - > lock ) ;
if ( deprecate ) {
in6_ifa_hold ( ifp ) ;
ipv6_ifa_notify ( 0 , ifp ) ;
in6_ifa_put ( ifp ) ;
goto restart ;
}
} else {
/* ifp->prefered_lft <= ifp->valid_lft */
if ( time_before ( ifp - > tstamp + ifp - > prefered_lft * HZ , next ) )
next = ifp - > tstamp + ifp - > prefered_lft * HZ ;
spin_unlock ( & ifp - > lock ) ;
}
}
}
2010-03-20 16:11:12 -07:00
next_sec = round_jiffies_up ( next ) ;
next_sched = next ;
/* If rounded timeout is accurate enough, accept it. */
if ( time_before ( next_sec , next + ADDRCONF_TIMER_FUZZ ) )
next_sched = next_sec ;
/* And minimum interval is ADDRCONF_TIMER_FUZZ_MAX. */
if ( time_before ( next_sched , jiffies + ADDRCONF_TIMER_FUZZ_MAX ) )
next_sched = jiffies + ADDRCONF_TIMER_FUZZ_MAX ;
2018-03-26 08:35:01 -07:00
pr_debug ( " now = %lu, schedule = %lu, rounded schedule = %lu => %lu \n " ,
now , next , next_sec , next_sched ) ;
2022-02-07 20:50:29 -08:00
mod_delayed_work ( addrconf_wq , & net - > ipv6 . addr_chk_work , next_sched - now ) ;
2010-03-17 20:31:11 +00:00
rcu_read_unlock_bh ( ) ;
2005-04-16 15:20:36 -07:00
}
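The scheduling decision at the end of the function above can be modelled in user space. The following is a minimal sketch, not the kernel code: HZ, FUZZ and FUZZ_MAX are assumed values standing in for the kernel's tick rate, ADDRCONF_TIMER_FUZZ and ADDRCONF_TIMER_FUZZ_MAX, round_up_to_second() is a simplified stand-in for round_jiffies_up() that ignores any per-CPU skew, and the helper names are hypothetical. The minimum interval is measured from the caller-supplied now instead of a live jiffies read.

```c
#include <assert.h>

/* Assumed tick rate and fuzz constants; illustrative values only. */
#define HZ 1000UL
#define FUZZ (HZ / 4)	/* stands in for ADDRCONF_TIMER_FUZZ */
#define FUZZ_MAX HZ	/* stands in for ADDRCONF_TIMER_FUZZ_MAX */

/* Wrap-safe "a is earlier than b", in the spirit of time_before(). */
static int before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Simplified stand-in for round_jiffies_up(): next multiple of HZ. */
static unsigned long round_up_to_second(unsigned long j)
{
	unsigned long rem = j % HZ;

	return rem ? j + (HZ - rem) : j;
}

/* Pick the tick at which the next verification pass should run. */
static unsigned long pick_next_sched(unsigned long now, unsigned long next)
{
	unsigned long next_sec = round_up_to_second(next);
	unsigned long next_sched = next;

	/* Accept the rounded deadline only if it adds less than FUZZ of error. */
	if (before(next_sec, next + FUZZ))
		next_sched = next_sec;
	/* Never schedule sooner than FUZZ_MAX ticks from now. */
	if (before(next_sched, now + FUZZ_MAX))
		next_sched = now + FUZZ_MAX;

	assert(!before(next_sched, now + FUZZ_MAX));
	return next_sched;
}
```

With these assumed constants, a deadline 100 ticks short of a second boundary is rounded up to it, a deadline 900 ticks short is kept exact, and a deadline closer than one second is pushed out to now + FUZZ_MAX.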
2014-03-27 18:28:07 +01:00
static void addrconf_verify_work ( struct work_struct * w )
{
2022-02-07 20:50:29 -08:00
struct net * net = container_of ( to_delayed_work ( w ) , struct net ,
ipv6 . addr_chk_work ) ;
2014-03-27 18:28:07 +01:00
rtnl_lock ( ) ;
2022-02-07 20:50:29 -08:00
addrconf_verify_rtnl ( net ) ;
2014-03-27 18:28:07 +01:00
rtnl_unlock ( ) ;
}
2022-02-07 20:50:29 -08:00
static void addrconf_verify ( struct net * net )
2014-03-27 18:28:07 +01:00
{
2022-02-07 20:50:29 -08:00
mod_delayed_work ( addrconf_wq , & net - > ipv6 . addr_chk_work , 0 ) ;
2014-03-27 18:28:07 +01:00
}
2013-05-16 22:32:00 +00:00
static struct in6_addr * extract_addr ( struct nlattr * addr , struct nlattr * local ,
struct in6_addr * * peer_pfx )
2006-09-18 00:09:49 -07:00
{
struct in6_addr * pfx = NULL ;
2013-05-16 22:32:00 +00:00
* peer_pfx = NULL ;
2006-09-18 00:09:49 -07:00
if ( addr )
pfx = nla_data ( addr ) ;
if ( local ) {
if ( pfx & & nla_memcmp ( local , pfx , sizeof ( * pfx ) ) )
2013-05-16 22:32:00 +00:00
* peer_pfx = pfx ;
pfx = nla_data ( local ) ;
2006-09-18 00:09:49 -07:00
}
return pfx ;
}
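The selection rule in extract_addr() can be illustrated outside the kernel. This is a sketch under stated assumptions: struct addr6 and pick_addr() are hypothetical stand-ins for struct in6_addr and the netlink-attribute version above, and plain pointers replace the struct nlattr arguments. The local address wins when present, and the other address is reported as the peer only when both are given and differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct in6_addr. */
struct addr6 {
	unsigned char b[16];
};

/*
 * Return the address to operate on and report a peer through *peer.
 * Either input may be NULL, mirroring an absent netlink attribute.
 */
static const struct addr6 *pick_addr(const struct addr6 *address,
				     const struct addr6 *local,
				     const struct addr6 **peer)
{
	const struct addr6 *pfx = address;

	assert(peer != NULL);
	*peer = NULL;

	if (local) {
		/* Both present and different: the first one is the peer. */
		if (pfx && memcmp(local, pfx, sizeof(*pfx)) != 0)
			*peer = pfx;
		pfx = local;
	}
	return pfx;
}
```

A NULL return corresponds to the case where the callers above fail with -EINVAL because neither attribute was supplied.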
2007-06-05 12:38:30 -07:00
static const struct nla_policy ifa_ipv6_policy [ IFA_MAX + 1 ] = {
2006-09-18 00:09:49 -07:00
[ IFA_ADDRESS ] = { . len = sizeof ( struct in6_addr ) } ,
[ IFA_LOCAL ] = { . len = sizeof ( struct in6_addr ) } ,
[ IFA_CACHEINFO ] = { . len = sizeof ( struct ifa_cacheinfo ) } ,
2013-12-06 09:45:21 +01:00
[ IFA_FLAGS ] = { . len = sizeof ( u32 ) } ,
2018-05-27 08:09:58 -07:00
[ IFA_RT_PRIORITY ] = { . len = sizeof ( u32 ) } ,
2018-09-04 21:53:50 +02:00
[ IFA_TARGET_NETNSID ] = { . type = NLA_S32 } ,
2022-02-17 16:02:02 +01:00
[ IFA_PROTO ] = { . type = NLA_U8 } ,
2006-09-18 00:09:49 -07:00
} ;
2005-04-16 15:20:36 -07:00
static int
2017-04-16 09:48:24 -07:00
inet6_rtm_deladdr ( struct sk_buff * skb , struct nlmsghdr * nlh ,
struct netlink_ext_ack * extack )
2005-04-16 15:20:36 -07:00
{
2008-03-26 02:26:21 +09:00
struct net * net = sock_net ( skb - > sk ) ;
2006-09-18 00:10:19 -07:00
struct ifaddrmsg * ifm ;
struct nlattr * tb [ IFA_MAX + 1 ] ;
2013-05-16 22:32:00 +00:00
struct in6_addr * pfx , * peer_pfx ;
2014-04-20 21:29:36 +02:00
u32 ifa_flags ;
2006-09-18 00:10:19 -07:00
int err ;
2005-04-16 15:20:36 -07:00
netlink: make validation more configurable for future strictness
We currently have two levels of strict validation:
1) liberal (default)
- undefined (type >= max) & NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
- garbage at end of message accepted
2) strict (opt-in)
- NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
Split out parsing strictness into four different options:
* TRAILING - check that there's no trailing data after parsing
attributes (in message or nested)
* MAXTYPE - reject attrs > max known type
* UNSPEC - reject attributes with NLA_UNSPEC policy entries
* STRICT_ATTRS - strictly validate attribute size
The default for future things should be *everything*.
The current *_strict() is a combination of TRAILING and MAXTYPE,
and is renamed to _deprecated_strict().
The current regular parsing has none of this, and is renamed to
*_parse_deprecated().
Additionally it allows us to selectively set one of the new flags
even on old policies. Notably, the UNSPEC flag could be useful in
this case, since it can be arranged (by filling in the policy) to
not be an incompatible userspace ABI change, but would then going
forward prevent forgetting attribute entries. Similar can apply
to the POLICY flag.
We end up with the following renames:
* nla_parse -> nla_parse_deprecated
* nla_parse_strict -> nla_parse_deprecated_strict
* nlmsg_parse -> nlmsg_parse_deprecated
* nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
* nla_parse_nested -> nla_parse_nested_deprecated
* nla_validate_nested -> nla_validate_nested_deprecated
Using spatch, of course:
@@
expression TB, MAX, HEAD, LEN, POL, EXT;
@@
-nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
+nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
@@
expression TB, MAX, NLA, POL, EXT;
@@
-nla_parse_nested(TB, MAX, NLA, POL, EXT)
+nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
@@
expression START, MAX, POL, EXT;
@@
-nla_validate_nested(START, MAX, POL, EXT)
+nla_validate_nested_deprecated(START, MAX, POL, EXT)
@@
expression NLH, HDRLEN, MAX, POL, EXT;
@@
-nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
+nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
For this patch, don't actually add the strict, non-renamed versions
yet so that it breaks compile if I get it wrong.
Also, while at it, make nla_validate and nla_parse go down to a
common __nla_validate_parse() function to avoid code duplication.
Ultimately, this allows us to have very strict validation for every
new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
next patch, while existing things will continue to work as is.
In effect then, this adds fully strict validation for any new command.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26 14:07:28 +02:00
err = nlmsg_parse_deprecated ( nlh , sizeof ( * ifm ) , tb , IFA_MAX ,
ifa_ipv6_policy , extack ) ;
2006-09-18 00:10:19 -07:00
if ( err < 0 )
return err ;
ifm = nlmsg_data ( nlh ) ;
2013-05-16 22:32:00 +00:00
pfx = extract_addr ( tb [ IFA_ADDRESS ] , tb [ IFA_LOCAL ] , & peer_pfx ) ;
2015-03-29 14:00:04 +01:00
if ( ! pfx )
2005-04-16 15:20:36 -07:00
return - EINVAL ;
2014-04-20 21:29:36 +02:00
ifa_flags = tb [ IFA_FLAGS ] ? nla_get_u32 ( tb [ IFA_FLAGS ] ) : ifm - > ifa_flags ;
/* We ignore other flags so far. */
ifa_flags & = IFA_F_MANAGETEMPADDR ;
return inet6_addr_del ( net , ifm - > ifa_index , ifa_flags , pfx ,
2023-07-26 10:39:05 +08:00
ifm - > ifa_prefixlen , extack ) ;
2005-04-16 15:20:36 -07:00
}
2018-05-27 08:09:58 -07:00
static int modify_prefix_route ( struct inet6_ifaddr * ifp ,
2020-03-03 14:37:34 +08:00
unsigned long expires , u32 flags ,
bool modify_peer )
2018-05-27 08:09:58 -07:00
{
struct fib6_info * f6i ;
2018-06-28 13:36:55 -07:00
u32 prio ;
2018-05-27 08:09:58 -07:00
2020-03-03 14:37:34 +08:00
f6i = addrconf_get_prefix_route ( modify_peer ? & ifp - > peer_addr : & ifp - > addr ,
ifp - > prefix_len ,
2019-03-27 20:53:52 -07:00
ifp - > idev - > dev , 0 , RTF_DEFAULT , true ) ;
2018-05-27 08:09:58 -07:00
if ( ! f6i )
return - ENOENT ;
2018-06-28 13:36:55 -07:00
prio = ifp - > rt_priority ? : IP6_RT_PRIO_ADDRCONF ;
if ( f6i - > fib6_metric ! = prio ) {
/* delete old one */
2020-04-27 13:56:45 -07:00
ip6_del_rt ( dev_net ( ifp - > idev - > dev ) , f6i , false ) ;
2018-06-28 13:36:55 -07:00
2018-05-27 08:09:58 -07:00
/* add new one */
2020-03-03 14:37:34 +08:00
addrconf_prefix_route ( modify_peer ? & ifp - > peer_addr : & ifp - > addr ,
ifp - > prefix_len ,
2018-05-27 08:09:58 -07:00
ifp - > rt_priority , ifp - > idev - > dev ,
expires , flags , GFP_KERNEL ) ;
} else {
if ( ! expires )
fib6_clean_expires ( f6i ) ;
else
fib6_set_expires ( f6i , expires ) ;
fib6_info_release ( f6i ) ;
}
return 0 ;
}
2022-02-07 20:50:29 -08:00
static int inet6_addr_modify ( struct net * net , struct inet6_ifaddr * ifp ,
struct ifa6_config * cfg )
2006-07-28 18:12:13 +09:00
{
2008-05-19 16:56:11 -07:00
u32 flags ;
clock_t expires ;
2008-05-27 17:37:49 +09:00
unsigned long timeout ;
2013-12-06 09:45:22 +01:00
bool was_managetempaddr ;
2014-01-15 15:36:59 +01:00
bool had_prefixroute ;
2020-03-03 14:37:35 +08:00
bool new_peer = false ;
2007-02-07 20:36:26 +09:00
2014-03-27 18:28:07 +01:00
ASSERT_RTNL ( ) ;
2018-05-27 08:09:55 -07:00
if ( ! cfg - > valid_lft | | cfg - > preferred_lft > cfg - > valid_lft )
2006-07-28 18:12:13 +09:00
return - EINVAL ;
2018-05-27 08:09:55 -07:00
if ( cfg - > ifa_flags & IFA_F_MANAGETEMPADDR & &
2013-12-06 09:45:22 +01:00
( ifp - > flags & IFA_F_TEMPORARY | | ifp - > prefix_len ! = 64 ) )
return - EINVAL ;
ipv6: allow userspace to add IFA_F_OPTIMISTIC addresses
According to RFC 4429 (section 3.1), adding new IPv6 addresses as
optimistic addresses is acceptable, as long as the implementation
follows some rules:
* Optimistic DAD SHOULD only be used when the implementation is aware
that the address is based on a most likely unique interface
identifier (such as in [RFC2464]), generated randomly [RFC3041],
or by a well-distributed hash function [RFC3972] or assigned by
Dynamic Host Configuration Protocol for IPv6 (DHCPv6) [RFC3315].
Optimistic DAD SHOULD NOT be used for manually entered
addresses.
Thus, it seems reasonable to allow userspace to set the optimistic flag
when adding new addresses.
We must not let userspace set NODAD + OPTIMISTIC, since if the kernel is
not performing DAD we would never clear the optimistic flag. We must
also ignore userspace's request to add OPTIMISTIC flag to addresses that
have already completed DAD (addresses that don't have the TENTATIVE
flag, or that have the DADFAILED flag).
Then we also need to clear the OPTIMISTIC flag on permanent addresses
when DAD fails. Otherwise, IFA_F_OPTIMISTIC addresses added by userspace
can still be used after DAD has failed, because in
ipv6_chk_addr_and_flags(), IFA_F_OPTIMISTIC overrides IFA_F_TENTATIVE.
Setting IFA_F_OPTIMISTIC from userspace is conditional on
CONFIG_IPV6_OPTIMISTIC_DAD and the optimistic_dad sysctl.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 16:40:08 +01:00
if ( ! ( ifp - > flags & IFA_F_TENTATIVE ) | | ifp - > flags & IFA_F_DADFAILED )
2018-05-27 08:09:55 -07:00
cfg - > ifa_flags & = ~ IFA_F_OPTIMISTIC ;
2018-02-28 16:40:08 +01:00
2018-05-27 08:09:55 -07:00
timeout = addrconf_timeout_fixup ( cfg - > valid_lft , HZ ) ;
2008-05-27 17:37:49 +09:00
if ( addrconf_finite_timeout ( timeout ) ) {
expires = jiffies_to_clock_t ( timeout * HZ ) ;
2018-05-27 08:09:55 -07:00
cfg - > valid_lft = timeout ;
2008-05-19 16:56:11 -07:00
flags = RTF_EXPIRES ;
2008-05-27 17:37:49 +09:00
} else {
expires = 0 ;
flags = 0 ;
2018-05-27 08:09:55 -07:00
cfg - > ifa_flags | = IFA_F_PERMANENT ;
2008-05-19 16:56:11 -07:00
}
2006-07-28 18:12:13 +09:00
2018-05-27 08:09:55 -07:00
timeout = addrconf_timeout_fixup ( cfg - > preferred_lft , HZ ) ;
2008-05-27 17:37:49 +09:00
if ( addrconf_finite_timeout ( timeout ) ) {
if ( timeout = = 0 )
2018-05-27 08:09:55 -07:00
cfg - > ifa_flags | = IFA_F_DEPRECATED ;
cfg - > preferred_lft = timeout ;
2008-05-27 17:37:49 +09:00
}
2006-07-28 18:12:13 +09:00
2020-03-03 14:37:35 +08:00
if ( cfg - > peer_pfx & &
memcmp ( & ifp - > peer_addr , cfg - > peer_pfx , sizeof ( struct in6_addr ) ) ) {
if ( ! ipv6_addr_any ( & ifp - > peer_addr ) )
cleanup_prefix_route ( ifp , expires , true , true ) ;
new_peer = true ;
}
2006-07-28 18:12:13 +09:00
spin_lock_bh ( & ifp - > lock ) ;
2013-12-06 09:45:22 +01:00
was_managetempaddr = ifp - > flags & IFA_F_MANAGETEMPADDR ;
2014-01-15 15:36:59 +01:00
had_prefixroute = ifp - > flags & IFA_F_PERMANENT & &
! ( ifp - > flags & IFA_F_NOPREFIXROUTE ) ;
2013-12-06 09:45:22 +01:00
ifp - > flags & = ~ ( IFA_F_DEPRECATED | IFA_F_PERMANENT | IFA_F_NODAD |
2014-01-15 15:36:58 +01:00
IFA_F_HOMEADDRESS | IFA_F_MANAGETEMPADDR |
IFA_F_NOPREFIXROUTE ) ;
2018-05-27 08:09:55 -07:00
ifp - > flags | = cfg - > ifa_flags ;
2006-07-28 18:12:13 +09:00
ifp - > tstamp = jiffies ;
2018-05-27 08:09:55 -07:00
ifp - > valid_lft = cfg - > valid_lft ;
ifp - > prefered_lft = cfg - > preferred_lft ;
2022-02-17 16:02:02 +01:00
ifp - > ifa_proto = cfg - > ifa_proto ;
2006-07-28 18:12:13 +09:00
2018-05-27 08:09:58 -07:00
if ( cfg - > rt_priority & & cfg - > rt_priority ! = ifp - > rt_priority )
ifp - > rt_priority = cfg - > rt_priority ;
2006-07-28 18:12:13 +09:00
2020-03-03 14:37:35 +08:00
if ( new_peer )
ifp - > peer_addr = * cfg - > peer_pfx ;
2006-07-28 18:12:13 +09:00
spin_unlock_bh ( & ifp - > lock ) ;
if ( ! ( ifp - > flags & IFA_F_TENTATIVE ) )
ipv6_ifa_notify ( 0 , ifp ) ;
2018-05-27 08:09:55 -07:00
if ( ! ( cfg - > ifa_flags & IFA_F_NOPREFIXROUTE ) ) {
2018-05-27 08:09:58 -07:00
int rc = - ENOENT ;
if ( had_prefixroute )
2020-03-03 14:37:34 +08:00
rc = modify_prefix_route ( ifp , expires , flags , false ) ;
2018-05-27 08:09:58 -07:00
/* prefix route could have been deleted; if so restore it */
if ( rc = = - ENOENT ) {
addrconf_prefix_route ( & ifp - > addr , ifp - > prefix_len ,
ifp - > rt_priority , ifp - > idev - > dev ,
expires , flags , GFP_KERNEL ) ;
}
2020-03-03 14:37:34 +08:00
if ( had_prefixroute & & ! ipv6_addr_any ( & ifp - > peer_addr ) )
rc = modify_prefix_route ( ifp , expires , flags , true ) ;
if ( rc = = - ENOENT & & ! ipv6_addr_any ( & ifp - > peer_addr ) ) {
addrconf_prefix_route ( & ifp - > peer_addr , ifp - > prefix_len ,
ifp - > rt_priority , ifp - > idev - > dev ,
expires , flags , GFP_KERNEL ) ;
}
2014-01-15 15:36:59 +01:00
} else if ( had_prefixroute ) {
enum cleanup_prefix_rt_t action ;
unsigned long rt_expires ;
write_lock_bh ( & ifp - > idev - > lock ) ;
action = check_cleanup_prefix_route ( ifp , & rt_expires ) ;
write_unlock_bh ( & ifp - > idev - > lock ) ;
if ( action ! = CLEANUP_PREFIX_RT_NOP ) {
cleanup_prefix_route ( ifp , rt_expires ,
2020-03-03 14:37:35 +08:00
action = = CLEANUP_PREFIX_RT_DEL , false ) ;
2014-01-15 15:36:59 +01:00
}
2014-01-15 15:36:58 +01:00
}
2013-12-06 09:45:22 +01:00
if ( was_managetempaddr | | ifp - > flags & IFA_F_MANAGETEMPADDR ) {
2018-05-27 08:09:55 -07:00
if ( was_managetempaddr & &
! ( ifp - > flags & IFA_F_MANAGETEMPADDR ) ) {
cfg - > valid_lft = 0 ;
cfg - > preferred_lft = 0 ;
}
manage_tempaddrs ( ifp - > idev , ifp , cfg - > valid_lft ,
cfg - > preferred_lft , ! was_managetempaddr ,
jiffies ) ;
2013-12-06 09:45:22 +01:00
}
2022-02-07 20:50:29 -08:00
addrconf_verify_rtnl ( net ) ;
2006-07-28 18:12:13 +09:00
return 0 ;
}
2005-04-16 15:20:36 -07:00
static int
2017-04-16 09:48:24 -07:00
inet6_rtm_newaddr ( struct sk_buff * skb , struct nlmsghdr * nlh ,
struct netlink_ext_ack * extack )
2005-04-16 15:20:36 -07:00
{
2008-03-26 02:26:21 +09:00
struct net * net = sock_net ( skb - > sk ) ;
2006-09-18 00:09:49 -07:00
struct ifaddrmsg * ifm ;
struct nlattr * tb [ IFA_MAX + 1 ] ;
2018-05-27 08:09:54 -07:00
struct in6_addr * peer_pfx ;
2006-09-18 00:13:46 -07:00
struct inet6_ifaddr * ifa ;
struct net_device * dev ;
2018-02-28 16:40:08 +01:00
struct inet6_dev * idev ;
2018-05-27 08:09:54 -07:00
struct ifa6_config cfg ;
2006-09-18 00:09:49 -07:00
int err ;
2005-04-16 15:20:36 -07:00
2019-04-26 14:07:28 +02:00
err = nlmsg_parse_deprecated ( nlh , sizeof ( * ifm ) , tb , IFA_MAX ,
ifa_ipv6_policy , extack ) ;
2006-09-18 00:09:49 -07:00
if ( err < 0 )
return err ;
2018-05-27 08:09:54 -07:00
memset ( & cfg , 0 , sizeof ( cfg ) ) ;
2006-09-18 00:09:49 -07:00
ifm = nlmsg_data ( nlh ) ;
2018-05-27 08:09:54 -07:00
cfg . pfx = extract_addr ( tb [ IFA_ADDRESS ] , tb [ IFA_LOCAL ] , & peer_pfx ) ;
if ( ! cfg . pfx )
2005-04-16 15:20:36 -07:00
return - EINVAL ;
2018-05-27 08:09:54 -07:00
cfg . peer_pfx = peer_pfx ;
cfg . plen = ifm - > ifa_prefixlen ;
2018-05-27 08:09:58 -07:00
if ( tb [ IFA_RT_PRIORITY ] )
cfg . rt_priority = nla_get_u32 ( tb [ IFA_RT_PRIORITY ] ) ;
2022-02-17 16:02:02 +01:00
if ( tb [ IFA_PROTO ] )
cfg . ifa_proto = nla_get_u8 ( tb [ IFA_PROTO ] ) ;
2018-05-27 08:09:54 -07:00
cfg . valid_lft = INFINITY_LIFE_TIME ;
cfg . preferred_lft = INFINITY_LIFE_TIME ;
2006-09-18 00:09:49 -07:00
if ( tb [ IFA_CACHEINFO ] ) {
2006-07-28 18:12:10 +09:00
struct ifa_cacheinfo * ci ;
2006-09-18 00:09:49 -07:00
ci = nla_data ( tb [ IFA_CACHEINFO ] ) ;
2018-05-27 08:09:54 -07:00
cfg . valid_lft = ci - > ifa_valid ;
cfg . preferred_lft = ci - > ifa_prefered ;
2006-07-28 18:12:10 +09:00
}
2008-03-05 10:46:57 -08:00
dev = __dev_get_by_index ( net , ifm - > ifa_index ) ;
2023-07-26 10:39:05 +08:00
if ( ! dev ) {
NL_SET_ERR_MSG_MOD ( extack , " Unable to find the interface " ) ;
2006-09-18 00:13:46 -07:00
return - ENODEV ;
2023-07-26 10:39:05 +08:00
}
2006-09-18 00:13:46 -07:00
2018-05-27 08:09:54 -07:00
if ( tb [ IFA_FLAGS ] )
cfg . ifa_flags = nla_get_u32 ( tb [ IFA_FLAGS ] ) ;
else
cfg . ifa_flags = ifm - > ifa_flags ;
2013-12-06 09:45:21 +01:00
2006-09-22 14:45:27 -07:00
/* We ignore other flags so far. */
2018-05-27 08:09:54 -07:00
cfg . ifa_flags & = IFA_F_NODAD | IFA_F_HOMEADDRESS |
IFA_F_MANAGETEMPADDR | IFA_F_NOPREFIXROUTE |
IFA_F_MCAUTOJOIN | IFA_F_OPTIMISTIC ;
2018-02-28 16:40:08 +01:00
idev = ipv6_find_idev ( dev ) ;
2019-08-23 15:44:36 +02:00
if ( IS_ERR ( idev ) )
return PTR_ERR ( idev ) ;
2018-02-28 16:40:08 +01:00
if ( ! ipv6_allow_optimistic_dad ( net , idev ) )
2018-05-27 08:09:54 -07:00
cfg . ifa_flags & = ~ IFA_F_OPTIMISTIC ;
2018-02-28 16:40:08 +01:00
2018-05-27 08:09:54 -07:00
if ( cfg . ifa_flags & IFA_F_NODAD & &
cfg . ifa_flags & IFA_F_OPTIMISTIC ) {
2018-02-28 16:40:08 +01:00
NL_SET_ERR_MSG ( extack , " IFA_F_NODAD and IFA_F_OPTIMISTIC are mutually exclusive " ) ;
return - EINVAL ;
}
2006-09-22 14:45:27 -07:00
2018-05-27 08:09:54 -07:00
ifa = ipv6_get_ifaddr ( net , cfg . pfx , dev , 1 ) ;
2015-03-29 14:00:04 +01:00
if ( ! ifa ) {
2006-09-18 00:13:46 -07:00
/*
* It would be best to check for ! NLM_F_CREATE here but
2014-01-12 11:26:32 -08:00
* userspace already relies on not having to provide this .
2006-09-18 00:13:46 -07:00
*/
2018-05-27 08:09:54 -07:00
return inet6_addr_add ( net , ifm - > ifa_index , & cfg , extack ) ;
2006-07-28 18:12:13 +09:00
}
2006-09-18 00:13:46 -07:00
if ( nlh - > nlmsg_flags & NLM_F_EXCL | |
2023-07-26 10:39:05 +08:00
! ( nlh - > nlmsg_flags & NLM_F_REPLACE ) ) {
NL_SET_ERR_MSG_MOD ( extack , " address already assigned " ) ;
2006-09-18 00:13:46 -07:00
err = - EEXIST ;
2023-07-26 10:39:05 +08:00
} else {
2022-02-07 20:50:29 -08:00
err = inet6_addr_modify ( net , ifa , & cfg ) ;
2023-07-26 10:39:05 +08:00
}
2006-09-18 00:13:46 -07:00
in6_ifa_put ( ifa ) ;
return err ;
2005-04-16 15:20:36 -07:00
}
2013-12-06 09:45:21 +01:00
static void put_ifaddrmsg ( struct nlmsghdr * nlh , u8 prefixlen , u32 flags ,
2006-09-18 00:11:52 -07:00
u8 scope , int ifindex )
{
struct ifaddrmsg * ifm ;
ifm = nlmsg_data ( nlh ) ;
ifm - > ifa_family = AF_INET6 ;
ifm - > ifa_prefixlen = prefixlen ;
ifm - > ifa_flags = flags ;
ifm - > ifa_scope = scope ;
ifm - > ifa_index = ifindex ;
}
2006-09-18 00:11:24 -07:00
static int put_cacheinfo ( struct sk_buff * skb , unsigned long cstamp ,
unsigned long tstamp , u32 preferred , u32 valid )
{
struct ifa_cacheinfo ci ;
2010-11-17 04:12:02 +00:00
ci . cstamp = cstamp_delta ( cstamp ) ;
ci . tstamp = cstamp_delta ( tstamp ) ;
2006-09-18 00:11:24 -07:00
ci . ifa_prefered = preferred ;
ci . ifa_valid = valid ;
return nla_put ( skb , IFA_CACHEINFO , sizeof ( ci ) , & ci ) ;
}
2006-09-18 00:11:52 -07:00
static inline int rt_scope ( int ifa_scope )
{
if ( ifa_scope & IFA_HOST )
return RT_SCOPE_HOST ;
else if ( ifa_scope & IFA_LINK )
return RT_SCOPE_LINK ;
else if ( ifa_scope & IFA_SITE )
return RT_SCOPE_SITE ;
else
return RT_SCOPE_UNIVERSE ;
}
2006-09-18 00:12:35 -07:00
static inline int inet6_ifaddr_msgsize ( void )
{
2006-11-10 14:10:15 -08:00
return NLMSG_ALIGN ( sizeof ( struct ifaddrmsg ) )
2013-05-16 22:32:00 +00:00
+ nla_total_size ( 16 ) /* IFA_LOCAL */
2006-11-10 14:10:15 -08:00
+ nla_total_size ( 16 ) /* IFA_ADDRESS */
2013-12-06 09:45:21 +01:00
+ nla_total_size ( sizeof ( struct ifa_cacheinfo ) )
2018-05-27 08:09:58 -07:00
+ nla_total_size ( 4 ) /* IFA_FLAGS */
2022-02-17 16:02:02 +01:00
+ nla_total_size ( 1 ) /* IFA_PROTO */
2018-05-27 08:09:58 -07:00
+ nla_total_size ( 4 ) /* IFA_RT_PRIORITY */ ;
2006-09-18 00:12:35 -07:00
}
2006-06-17 22:48:48 -07:00
2018-10-07 20:16:26 -07:00
enum addr_type_t {
UNICAST_ADDR ,
MULTICAST_ADDR ,
ANYCAST_ADDR ,
} ;
2018-09-04 21:53:55 +02:00
struct inet6_fill_args {
u32 portid ;
u32 seq ;
int event ;
unsigned int flags ;
int netnsid ;
2018-10-19 12:45:30 -07:00
int ifindex ;
2018-10-07 20:16:26 -07:00
enum addr_type_t type ;
2018-09-04 21:53:55 +02:00
} ;
2005-04-16 15:20:36 -07:00
static int inet6_fill_ifaddr ( struct sk_buff * skb , struct inet6_ifaddr * ifa ,
2018-09-04 21:53:55 +02:00
struct inet6_fill_args * args )
2005-04-16 15:20:36 -07:00
{
struct nlmsghdr * nlh ;
2006-09-18 00:11:24 -07:00
u32 preferred , valid ;
2005-04-16 15:20:36 -07:00
2018-09-04 21:53:55 +02:00
nlh = nlmsg_put ( skb , args - > portid , args - > seq , args - > event ,
sizeof ( struct ifaddrmsg ) , args - > flags ) ;
2015-03-29 14:00:04 +01:00
if ( ! nlh )
2007-01-31 23:16:40 -08:00
return - EMSGSIZE ;
2006-09-18 00:12:35 -07:00
2006-09-18 00:11:52 -07:00
put_ifaddrmsg ( nlh , ifa - > prefix_len , ifa - > flags , rt_scope ( ifa - > scope ) ,
ifa - > idev - > dev - > ifindex ) ;
2018-09-04 21:53:55 +02:00
if ( args - > netnsid > = 0 & &
nla_put_s32 ( skb , IFA_TARGET_NETNSID , args - > netnsid ) )
2018-09-04 21:53:50 +02:00
goto error ;
2022-02-23 14:19:56 +01:00
spin_lock_bh ( & ifa - > lock ) ;
2013-12-31 12:04:19 +09:00
if ( ! ( ( ifa - > flags & IFA_F_PERMANENT ) & &
( ifa - > prefered_lft = = INFINITY_LIFE_TIME ) ) ) {
2006-09-18 00:11:24 -07:00
preferred = ifa - > prefered_lft ;
valid = ifa - > valid_lft ;
if ( preferred ! = INFINITY_LIFE_TIME ) {
2005-04-16 15:20:36 -07:00
long tval = ( jiffies - ifa - > tstamp ) / HZ ;
2009-06-25 04:55:50 +00:00
if ( preferred > tval )
preferred - = tval ;
else
preferred = 0 ;
2010-06-26 11:37:47 +00:00
if ( valid ! = INFINITY_LIFE_TIME ) {
if ( valid > tval )
valid - = tval ;
else
valid = 0 ;
}
2005-04-16 15:20:36 -07:00
}
} else {
2006-09-18 00:11:24 -07:00
preferred = INFINITY_LIFE_TIME ;
valid = INFINITY_LIFE_TIME ;
}
2022-02-23 14:19:56 +01:00
spin_unlock_bh ( & ifa - > lock ) ;
2006-09-18 00:11:24 -07:00
2013-05-22 05:41:06 +00:00
if ( ! ipv6_addr_any ( & ifa - > peer_addr ) ) {
2015-03-29 16:59:25 +02:00
if ( nla_put_in6_addr ( skb , IFA_LOCAL , & ifa - > addr ) < 0 | |
nla_put_in6_addr ( skb , IFA_ADDRESS , & ifa - > peer_addr ) < 0 )
2013-05-16 22:32:00 +00:00
goto error ;
} else
2015-03-29 16:59:25 +02:00
if ( nla_put_in6_addr ( skb , IFA_ADDRESS , & ifa - > addr ) < 0 )
2013-05-16 22:32:00 +00:00
goto error ;
2018-05-27 08:09:58 -07:00
if ( ifa - > rt_priority & &
nla_put_u32 ( skb , IFA_RT_PRIORITY , ifa - > rt_priority ) )
goto error ;
2013-05-16 22:32:00 +00:00
if ( put_cacheinfo ( skb , ifa - > cstamp , ifa - > tstamp , preferred , valid ) < 0 )
goto error ;
2005-04-16 15:20:36 -07:00
2013-12-06 09:45:21 +01:00
if ( nla_put_u32 ( skb , IFA_FLAGS , ifa - > flags ) < 0 )
goto error ;
2022-02-17 16:02:02 +01:00
if ( ifa - > ifa_proto & &
nla_put_u8 ( skb , IFA_PROTO , ifa - > ifa_proto ) )
goto error ;
2015-01-16 22:09:00 +01:00
nlmsg_end ( skb , nlh ) ;
return 0 ;
2013-05-16 22:32:00 +00:00
error :
nlmsg_cancel ( skb , nlh ) ;
return - EMSGSIZE ;
2005-04-16 15:20:36 -07:00
}
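The remaining-lifetime arithmetic in inet6_fill_ifaddr() can be tried out in isolation with a small userspace sketch. The names `SK_INFINITY_LIFE_TIME` and `sk_remaining_lifetimes()` are hypothetical, and `elapsed_sec` is assumed to play the role of `(jiffies - ifa->tstamp) / HZ`. The case of a permanent address with an infinite preferred lifetime, which reports both values as infinite, is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's INFINITY_LIFE_TIME. */
#define SK_INFINITY_LIFE_TIME 0xffffffffu

/*
 * Subtract the elapsed time from both lifetimes, clamping at zero.
 * As in the function above, the valid lifetime is only adjusted
 * when the preferred lifetime is finite.
 */
static void sk_remaining_lifetimes(uint32_t *preferred, uint32_t *valid,
				   unsigned long elapsed_sec)
{
	if (*preferred == SK_INFINITY_LIFE_TIME)
		return;

	*preferred = *preferred > elapsed_sec ? *preferred - elapsed_sec : 0;

	if (*valid != SK_INFINITY_LIFE_TIME)
		*valid = *valid > elapsed_sec ? *valid - elapsed_sec : 0;
}
```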
static int inet6_fill_ifmcaddr ( struct sk_buff * skb , struct ifmcaddr6 * ifmca ,
2018-09-04 21:53:55 +02:00
struct inet6_fill_args * args )
2005-04-16 15:20:36 -07:00
{
struct nlmsghdr * nlh ;
2006-09-18 00:11:52 -07:00
u8 scope = RT_SCOPE_UNIVERSE ;
int ifindex = ifmca - > idev - > dev - > ifindex ;
if ( ipv6_addr_scope ( & ifmca - > mca_addr ) & IFA_SITE )
scope = RT_SCOPE_SITE ;
2005-04-16 15:20:36 -07:00
2018-09-04 21:53:55 +02:00
nlh = nlmsg_put ( skb , args - > portid , args - > seq , args - > event ,
sizeof ( struct ifaddrmsg ) , args - > flags ) ;
2015-03-29 14:00:04 +01:00
if ( ! nlh )
2007-01-31 23:16:40 -08:00
return - EMSGSIZE ;
2006-09-18 00:11:24 -07:00
2018-09-04 21:53:55 +02:00
if ( args - > netnsid > = 0 & &
2020-11-12 16:09:50 +08:00
nla_put_s32 ( skb , IFA_TARGET_NETNSID , args - > netnsid ) ) {
nlmsg_cancel ( skb , nlh ) ;
2018-09-04 21:53:50 +02:00
return - EMSGSIZE ;
2020-11-12 16:09:50 +08:00
}
2018-09-04 21:53:50 +02:00
2006-09-18 00:12:35 -07:00
put_ifaddrmsg ( nlh , 128 , IFA_F_PERMANENT , scope , ifindex ) ;
2015-03-29 16:59:25 +02:00
if ( nla_put_in6_addr ( skb , IFA_MULTICAST , & ifmca - > mca_addr ) < 0 | |
2006-09-18 00:12:35 -07:00
put_cacheinfo ( skb , ifmca - > mca_cstamp , ifmca - > mca_tstamp ,
2007-01-31 23:16:40 -08:00
INFINITY_LIFE_TIME , INFINITY_LIFE_TIME ) < 0 ) {
nlmsg_cancel ( skb , nlh ) ;
return - EMSGSIZE ;
}
2006-09-18 00:11:24 -07:00
2015-01-16 22:09:00 +01:00
nlmsg_end ( skb , nlh ) ;
return 0 ;
2005-04-16 15:20:36 -07:00
}
static int inet6_fill_ifacaddr ( struct sk_buff * skb , struct ifacaddr6 * ifaca ,
2018-09-04 21:53:55 +02:00
struct inet6_fill_args * args )
2005-04-16 15:20:36 -07:00
{
2018-04-18 15:39:01 -07:00
struct net_device * dev = fib6_info_nh_dev ( ifaca - > aca_rt ) ;
int ifindex = dev ? dev - > ifindex : 1 ;
2005-04-16 15:20:36 -07:00
struct nlmsghdr * nlh ;
2006-09-18 00:11:52 -07:00
u8 scope = RT_SCOPE_UNIVERSE ;
if ( ipv6_addr_scope ( & ifaca - > aca_addr ) & IFA_SITE )
scope = RT_SCOPE_SITE ;
2005-04-16 15:20:36 -07:00
2018-09-04 21:53:55 +02:00
nlh = nlmsg_put ( skb , args - > portid , args - > seq , args - > event ,
sizeof ( struct ifaddrmsg ) , args - > flags ) ;
2015-03-29 14:00:04 +01:00
if ( ! nlh )
2007-01-31 23:16:40 -08:00
return - EMSGSIZE ;
2006-09-18 00:11:24 -07:00
2018-09-04 21:53:55 +02:00
if ( args - > netnsid > = 0 & &
2020-11-12 16:09:50 +08:00
nla_put_s32 ( skb , IFA_TARGET_NETNSID , args - > netnsid ) ) {
nlmsg_cancel ( skb , nlh ) ;
2018-09-04 21:53:50 +02:00
return - EMSGSIZE ;
2020-11-12 16:09:50 +08:00
}
2018-09-04 21:53:50 +02:00
2006-09-18 00:12:35 -07:00
put_ifaddrmsg ( nlh , 128 , IFA_F_PERMANENT , scope , ifindex ) ;
2015-03-29 16:59:25 +02:00
if ( nla_put_in6_addr ( skb , IFA_ANYCAST , & ifaca - > aca_addr ) < 0 | |
2006-09-18 00:12:35 -07:00
put_cacheinfo ( skb , ifaca - > aca_cstamp , ifaca - > aca_tstamp ,
2007-01-31 23:16:40 -08:00
INFINITY_LIFE_TIME , INFINITY_LIFE_TIME ) < 0 ) {
nlmsg_cancel ( skb , nlh ) ;
return - EMSGSIZE ;
}
2005-04-16 15:20:36 -07:00
2015-01-16 22:09:00 +01:00
nlmsg_end ( skb , nlh ) ;
return 0 ;
2005-04-16 15:20:36 -07:00
}
2009-11-12 04:11:50 +00:00
/* called with rcu_read_lock() */
static int in6_dump_addrs ( struct inet6_dev * idev , struct sk_buff * skb ,
2018-10-19 12:45:28 -07:00
struct netlink_callback * cb , int s_ip_idx ,
2018-10-07 20:16:26 -07:00
struct inet6_fill_args * fillargs )
2009-11-12 04:11:50 +00:00
{
struct ifmcaddr6 * ifmca ;
struct ifacaddr6 * ifaca ;
2018-10-19 12:45:28 -07:00
int ip_idx = 0 ;
2009-11-12 04:11:50 +00:00
int err = 1 ;
read_lock_bh ( & idev - > lock ) ;
2018-10-07 20:16:26 -07:00
switch ( fillargs - > type ) {
2010-03-17 20:31:13 +00:00
case UNICAST_ADDR : {
struct inet6_ifaddr * ifa ;
2018-10-07 20:16:26 -07:00
fillargs - > event = RTM_NEWADDR ;
2010-03-17 20:31:13 +00:00
2009-11-12 04:11:50 +00:00
/* unicast address incl. temp addr */
2010-03-17 20:31:13 +00:00
list_for_each_entry ( ifa , & idev - > addr_list , if_list ) {
2018-10-19 10:00:19 -07:00
if ( ip_idx < s_ip_idx )
goto next ;
2018-10-07 20:16:26 -07:00
err = inet6_fill_ifaddr ( skb , ifa , fillargs ) ;
2015-01-16 22:09:00 +01:00
if ( err < 0 )
2009-11-12 04:11:50 +00:00
break ;
2013-03-22 06:28:43 +00:00
nl_dump_check_consistent ( cb , nlmsg_hdr ( skb ) ) ;
2018-10-19 10:00:19 -07:00
next :
ip_idx + + ;
2009-11-12 04:11:50 +00:00
}
break ;
2010-03-17 20:31:13 +00:00
}
2009-11-12 04:11:50 +00:00
case MULTICAST_ADDR :
2021-03-25 16:16:55 +00:00
read_unlock_bh ( & idev - > lock ) ;
2018-10-07 20:16:26 -07:00
fillargs - > event = RTM_GETMULTICAST ;
2018-09-04 21:53:55 +02:00
2009-11-12 04:11:50 +00:00
/* multicast address */
2022-06-28 12:12:48 +00:00
for ( ifmca = rtnl_dereference ( idev - > mc_list ) ;
2021-03-25 16:16:55 +00:00
ifmca ;
2022-06-28 12:12:48 +00:00
ifmca = rtnl_dereference ( ifmca - > next ) , ip_idx + + ) {
2009-11-12 04:11:50 +00:00
if ( ip_idx < s_ip_idx )
continue ;
2018-10-07 20:16:26 -07:00
err = inet6_fill_ifmcaddr ( skb , ifmca , fillargs ) ;
2015-01-16 22:09:00 +01:00
if ( err < 0 )
2009-11-12 04:11:50 +00:00
break ;
}
2021-03-25 16:16:55 +00:00
read_lock_bh ( & idev - > lock ) ;
2009-11-12 04:11:50 +00:00
break ;
case ANYCAST_ADDR :
2018-10-07 20:16:26 -07:00
fillargs - > event = RTM_GETANYCAST ;
2009-11-12 04:11:50 +00:00
/* anycast address */
for ( ifaca = idev - > ac_list ; ifaca ;
ifaca = ifaca - > aca_next , ip_idx + + ) {
if ( ip_idx < s_ip_idx )
continue ;
2018-10-07 20:16:26 -07:00
err = inet6_fill_ifacaddr ( skb , ifaca , fillargs ) ;
2015-01-16 22:09:00 +01:00
if ( err < 0 )
2009-11-12 04:11:50 +00:00
break ;
}
break ;
default :
break ;
}
read_unlock_bh ( & idev - > lock ) ;
2018-10-19 12:45:28 -07:00
cb - > args [ 2 ] = ip_idx ;
2009-11-12 04:11:50 +00:00
return err ;
}
2018-10-07 20:16:29 -07:00
static int inet6_valid_dump_ifaddr_req ( const struct nlmsghdr * nlh ,
struct inet6_fill_args * fillargs ,
struct net * * tgt_net , struct sock * sk ,
2018-10-19 12:45:30 -07:00
struct netlink_callback * cb )
2018-10-07 20:16:29 -07:00
{
2018-10-19 12:45:30 -07:00
struct netlink_ext_ack * extack = cb - > extack ;
2018-10-07 20:16:29 -07:00
struct nlattr * tb [ IFA_MAX + 1 ] ;
struct ifaddrmsg * ifm ;
int err , i ;
if ( nlh - > nlmsg_len < nlmsg_msg_size ( sizeof ( * ifm ) ) ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid header for address dump request " ) ;
return - EINVAL ;
}
ifm = nlmsg_data ( nlh ) ;
if ( ifm - > ifa_prefixlen | | ifm - > ifa_flags | | ifm - > ifa_scope ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid values in header for address dump request " ) ;
return - EINVAL ;
}
2018-10-19 12:45:30 -07:00
fillargs - > ifindex = ifm - > ifa_index ;
if ( fillargs - > ifindex ) {
cb - > answer_flags | = NLM_F_DUMP_FILTERED ;
fillargs - > flags | = NLM_F_DUMP_FILTERED ;
2018-10-07 20:16:29 -07:00
}
netlink: make validation more configurable for future strictness
We currently have two levels of strict validation:
1) liberal (default)
- undefined (type >= max) & NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
- garbage at end of message accepted
2) strict (opt-in)
- NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
Split out parsing strictness into four different options:
* TRAILING - check that there's no trailing data after parsing
attributes (in message or nested)
* MAXTYPE - reject attrs > max known type
* UNSPEC - reject attributes with NLA_UNSPEC policy entries
* STRICT_ATTRS - strictly validate attribute size
The default for future things should be *everything*.
The current *_strict() is a combination of TRAILING and MAXTYPE,
and is renamed to _deprecated_strict().
The current regular parsing has none of this, and is renamed to
*_parse_deprecated().
Additionally it allows us to selectively set one of the new flags
even on old policies. Notably, the UNSPEC flag could be useful in
this case, since it can be arranged (by filling in the policy) to
not be an incompatible userspace ABI change, but would then going
forward prevent forgetting attribute entries. Similar can apply
to the POLICY flag.
We end up with the following renames:
* nla_parse -> nla_parse_deprecated
* nla_parse_strict -> nla_parse_deprecated_strict
* nlmsg_parse -> nlmsg_parse_deprecated
* nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
* nla_parse_nested -> nla_parse_nested_deprecated
* nla_validate_nested -> nla_validate_nested_deprecated
Using spatch, of course:
@@
expression TB, MAX, HEAD, LEN, POL, EXT;
@@
-nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
+nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
@@
expression TB, MAX, NLA, POL, EXT;
@@
-nla_parse_nested(TB, MAX, NLA, POL, EXT)
+nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
@@
expression START, MAX, POL, EXT;
@@
-nla_validate_nested(START, MAX, POL, EXT)
+nla_validate_nested_deprecated(START, MAX, POL, EXT)
@@
expression NLH, HDRLEN, MAX, POL, EXT;
@@
-nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
+nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
For this patch, don't actually add the strict, non-renamed versions
yet so that it breaks compile if I get it wrong.
Also, while at it, make nla_validate and nla_parse go down to a
common __nla_validate_parse() function to avoid code duplication.
Ultimately, this allows us to have very strict validation for every
new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
next patch, while existing things will continue to work as is.
In effect then, this adds fully strict validation for any new command.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
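The strictness options listed in the commit message above can be modelled as a bitmask. The following is a minimal userspace sketch under stated assumptions, not the kernel's parser: every `sketch_`/`SK_` name is hypothetical, the policy is reduced to an expected length per attribute type, and a length of 0 is assumed to mean "no policy entry" (the NLA_UNSPEC case). TRAILING applies to the message as a whole, so it appears only as a flag here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit flags named after the four options in the commit text. */
enum sketch_validate {
	SK_VALIDATE_TRAILING     = 1u << 0, /* message-level, not checked per attribute */
	SK_VALIDATE_MAXTYPE      = 1u << 1,
	SK_VALIDATE_UNSPEC       = 1u << 2,
	SK_VALIDATE_STRICT_ATTRS = 1u << 3,
};

#define SK_VALIDATE_LIBERAL           0u
#define SK_VALIDATE_DEPRECATED_STRICT (SK_VALIDATE_TRAILING | SK_VALIDATE_MAXTYPE)
#define SK_VALIDATE_STRICT            (SK_VALIDATE_TRAILING | SK_VALIDATE_MAXTYPE | \
				       SK_VALIDATE_UNSPEC | SK_VALIDATE_STRICT_ATTRS)

struct sketch_attr   { unsigned int type; size_t len; };
struct sketch_policy { size_t expected_len; /* 0: no policy entry */ };

/* Decide whether one attribute passes under the given strictness mask. */
static bool sketch_attr_ok(const struct sketch_attr *a,
			   const struct sketch_policy *pol,
			   unsigned int maxtype, unsigned int validate)
{
	/* Unknown type: tolerated unless MAXTYPE is requested. */
	if (a->type > maxtype)
		return !(validate & SK_VALIDATE_MAXTYPE);

	/* No policy entry: tolerated unless UNSPEC is requested. */
	if (pol[a->type].expected_len == 0)
		return !(validate & SK_VALIDATE_UNSPEC);

	/* Too short is always an error. */
	if (a->len < pol[a->type].expected_len)
		return false;

	/* Longer than expected: tolerated unless STRICT_ATTRS is requested. */
	if (a->len > pol[a->type].expected_len)
		return !(validate & SK_VALIDATE_STRICT_ATTRS);

	return true;
}
```

In this model the "deprecated strict" combination still accepts over-long attributes and attributes without a policy entry, which matches the second level described in the commit message.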
2019-04-26 14:07:28 +02:00
err = nlmsg_parse_deprecated_strict ( nlh , sizeof ( * ifm ) , tb , IFA_MAX ,
ifa_ipv6_policy , extack ) ;
2018-10-07 20:16:29 -07:00
if ( err < 0 )
return err ;
for ( i = 0 ; i < = IFA_MAX ; + + i ) {
if ( ! tb [ i ] )
continue ;
if ( i = = IFA_TARGET_NETNSID ) {
struct net * net ;
fillargs - > netnsid = nla_get_s32 ( tb [ i ] ) ;
net = rtnl_get_net_ns_capable ( sk , fillargs - > netnsid ) ;
if ( IS_ERR ( net ) ) {
2018-10-25 21:18:25 +02:00
fillargs - > netnsid = - 1 ;
2018-10-07 20:16:29 -07:00
NL_SET_ERR_MSG_MOD ( extack , " Invalid target network namespace id " ) ;
return PTR_ERR ( net ) ;
}
* tgt_net = net ;
} else {
NL_SET_ERR_MSG_MOD ( extack , " Unsupported attribute in dump request " ) ;
return - EINVAL ;
}
}
return 0 ;
}
2005-04-16 15:20:36 -07:00
static int inet6_dump_addr ( struct sk_buff * skb , struct netlink_callback * cb ,
enum addr_type_t type )
{
2018-10-07 20:16:29 -07:00
const struct nlmsghdr * nlh = cb - > nlh ;
2018-10-07 20:16:26 -07:00
struct inet6_fill_args fillargs = {
. portid = NETLINK_CB ( cb - > skb ) . portid ,
. seq = cb - > nlh - > nlmsg_seq ,
. flags = NLM_F_MULTI ,
. netnsid = - 1 ,
. type = type ,
} ;
2021-07-15 22:26:43 +08:00
struct net * tgt_net = sock_net ( skb - > sk ) ;
2018-10-19 12:45:28 -07:00
int idx , s_idx , s_ip_idx ;
2009-11-12 04:11:50 +00:00
int h , s_h ;
2005-04-16 15:20:36 -07:00
struct net_device * dev ;
2009-11-12 04:11:50 +00:00
struct inet6_dev * idev ;
struct hlist_head * head ;
2018-10-24 12:59:00 -07:00
int err = 0 ;
2007-02-09 23:24:49 +09:00
2009-11-12 04:11:50 +00:00
s_h = cb - > args [ 0 ] ;
s_idx = idx = cb - > args [ 1 ] ;
2018-10-19 12:45:28 -07:00
s_ip_idx = cb - > args [ 2 ] ;
2008-01-22 17:29:40 +09:00
2018-10-07 20:16:29 -07:00
if ( cb - > strict_check ) {
err = inet6_valid_dump_ifaddr_req ( nlh , & fillargs , & tgt_net ,
2018-10-19 12:45:30 -07:00
skb - > sk , cb ) ;
2018-10-07 20:16:29 -07:00
if ( err < 0 )
2018-10-24 12:59:00 -07:00
goto put_tgt_net ;
2018-10-19 12:45:30 -07:00
2018-10-24 12:59:00 -07:00
err = 0 ;
2018-10-19 12:45:30 -07:00
if ( fillargs . ifindex ) {
dev = __dev_get_by_index ( tgt_net , fillargs . ifindex ) ;
2018-10-24 12:59:00 -07:00
if ( ! dev ) {
err = - ENODEV ;
goto put_tgt_net ;
}
2018-10-19 12:45:30 -07:00
idev = __in6_dev_get ( dev ) ;
if ( idev ) {
err = in6_dump_addrs ( idev , skb , cb , s_ip_idx ,
& fillargs ) ;
2019-01-22 14:47:19 -08:00
if ( err > 0 )
err = 0 ;
2018-10-19 12:45:30 -07:00
}
goto put_tgt_net ;
}
2018-09-04 21:53:50 +02:00
}
2009-11-12 04:11:50 +00:00
rcu_read_lock ( ) ;
2024-02-15 17:21:07 +00:00
cb - > seq = inet6_base_seq ( tgt_net ) ;
2009-11-12 04:11:50 +00:00
for ( h = s_h ; h < NETDEV_HASHENTRIES ; h + + , s_idx = 0 ) {
idx = 0 ;
2018-09-04 21:53:50 +02:00
head = & tgt_net - > dev_index_head [ h ] ;
hlist: drop the node parameter from iterators
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
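The point of the commit message above, that the iterator needs no separate node cursor, can be illustrated with a simplified userspace sketch. It is not the macro from linux/list.h: the `sk_` names are hypothetical, the list is singly linked without `pprev`, and `__typeof__` is assumed to be available (GCC or Clang).

```c
#include <stddef.h>

struct sk_hnode { struct sk_hnode *next; };
struct sk_hhead { struct sk_hnode *first; };

/* Map a node pointer back to its containing entry, or NULL at the list end. */
#define sk_entry_or_null(ptr, type, member) \
	((ptr) ? (type *)((char *)(ptr) - offsetof(type, member)) : NULL)

/* Only the entry cursor 'pos' is needed; the node is reached via pos->member. */
#define sk_for_each_entry(pos, head, member)                                  \
	for (pos = sk_entry_or_null((head)->first, __typeof__(*(pos)), member); \
	     pos;                                                             \
	     pos = sk_entry_or_null((pos)->member.next, __typeof__(*(pos)), member))

/* Hypothetical entry type, loosely modelled on a device in a hash bucket. */
struct sk_dev {
	int ifindex;
	struct sk_hnode link;
};

/* Walk one bucket and add up the ifindex of every entry. */
static int sk_sum_ifindex(struct sk_hhead *head)
{
	struct sk_dev *dev;
	int sum = 0;

	sk_for_each_entry(dev, head, link)
		sum += dev->ifindex;
	return sum;
}
```

The call site now has the same three-argument shape as list_for_each_entry(pos, head, member), which is the symmetry the commit message asks for.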
2013-02-27 17:06:00 -08:00
hlist_for_each_entry_rcu ( dev , head , index_hlist ) {
2009-11-12 04:11:50 +00:00
if ( idx < s_idx )
goto cont ;
2010-03-26 20:27:49 -07:00
if ( h > s_h | | idx > s_idx )
2009-11-12 04:11:50 +00:00
s_ip_idx = 0 ;
2010-03-20 16:09:01 -07:00
idev = __in6_dev_get ( dev ) ;
if ( ! idev )
2009-11-12 04:11:50 +00:00
goto cont ;
2018-10-19 12:45:28 -07:00
if ( in6_dump_addrs ( idev , skb , cb , s_ip_idx ,
2018-10-07 20:16:26 -07:00
& fillargs ) < 0 )
2009-11-12 04:11:50 +00:00
goto done ;
2007-05-03 15:13:45 -07:00
cont :
2009-11-12 04:11:50 +00:00
idx + + ;
}
2005-04-16 15:20:36 -07:00
}
2009-11-12 04:11:50 +00:00
done :
rcu_read_unlock ( ) ;
cb - > args [ 0 ] = h ;
cb - > args [ 1 ] = idx ;
2018-10-19 12:45:30 -07:00
put_tgt_net :
2018-10-07 20:16:26 -07:00
if ( fillargs . netnsid > = 0 )
2018-09-04 21:53:50 +02:00
put_net ( tgt_net ) ;
2009-11-12 04:11:50 +00:00
2018-12-31 02:10:58 +00:00
return skb - > len ? : err ;
2005-04-16 15:20:36 -07:00
}
static int inet6_dump_ifaddr ( struct sk_buff * skb , struct netlink_callback * cb )
{
enum addr_type_t type = UNICAST_ADDR ;
2007-12-01 00:21:31 +11:00
2005-04-16 15:20:36 -07:00
return inet6_dump_addr ( skb , cb , type ) ;
}
static int inet6_dump_ifmcaddr ( struct sk_buff * skb , struct netlink_callback * cb )
{
enum addr_type_t type = MULTICAST_ADDR ;
2007-12-01 00:21:31 +11:00
2005-04-16 15:20:36 -07:00
return inet6_dump_addr ( skb , cb , type ) ;
}
static int inet6_dump_ifacaddr ( struct sk_buff * skb , struct netlink_callback * cb )
{
enum addr_type_t type = ANYCAST_ADDR ;
2007-12-01 00:21:31 +11:00
2005-04-16 15:20:36 -07:00
return inet6_dump_addr ( skb , cb , type ) ;
}
2019-01-18 10:46:21 -08:00
static int inet6_rtm_valid_getaddr_req ( struct sk_buff * skb ,
const struct nlmsghdr * nlh ,
struct nlattr * * tb ,
struct netlink_ext_ack * extack )
{
struct ifaddrmsg * ifm ;
int i , err ;
if ( nlh - > nlmsg_len < nlmsg_msg_size ( sizeof ( * ifm ) ) ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid header for get address request " ) ;
return - EINVAL ;
}
2019-12-11 22:20:16 +08:00
if ( ! netlink_strict_get_check ( skb ) )
return nlmsg_parse_deprecated ( nlh , sizeof ( * ifm ) , tb , IFA_MAX ,
ifa_ipv6_policy , extack ) ;
2019-01-18 10:46:21 -08:00
ifm = nlmsg_data ( nlh ) ;
if ( ifm - > ifa_prefixlen | | ifm - > ifa_flags | | ifm - > ifa_scope ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid values in header for get address request " ) ;
return - EINVAL ;
}
2019-04-26 14:07:28 +02:00
err = nlmsg_parse_deprecated_strict ( nlh , sizeof ( * ifm ) , tb , IFA_MAX ,
ifa_ipv6_policy , extack ) ;
2019-01-18 10:46:21 -08:00
if ( err )
return err ;
for ( i = 0 ; i < = IFA_MAX ; i + + ) {
if ( ! tb [ i ] )
continue ;
switch ( i ) {
case IFA_TARGET_NETNSID :
case IFA_ADDRESS :
case IFA_LOCAL :
break ;
default :
NL_SET_ERR_MSG_MOD ( extack , " Unsupported attribute in get address request " ) ;
return - EINVAL ;
}
}
return 0 ;
}
2017-04-16 09:48:24 -07:00
static int inet6_rtm_getaddr ( struct sk_buff * in_skb , struct nlmsghdr * nlh ,
struct netlink_ext_ack * extack )
2006-07-28 18:12:12 +09:00
{
2021-07-15 22:26:43 +08:00
struct net * tgt_net = sock_net ( in_skb - > sk ) ;
2018-09-04 21:53:55 +02:00
struct inet6_fill_args fillargs = {
. portid = NETLINK_CB ( in_skb ) . portid ,
. seq = nlh - > nlmsg_seq ,
. event = RTM_NEWADDR ,
. flags = 0 ,
. netnsid = - 1 ,
} ;
2006-09-18 00:10:50 -07:00
struct ifaddrmsg * ifm ;
struct nlattr * tb [ IFA_MAX + 1 ] ;
2013-05-16 22:32:00 +00:00
struct in6_addr * addr = NULL , * peer ;
2006-07-28 18:12:12 +09:00
struct net_device * dev = NULL ;
struct inet6_ifaddr * ifa ;
struct sk_buff * skb ;
int err ;
2019-01-18 10:46:21 -08:00
err = inet6_rtm_valid_getaddr_req ( in_skb , nlh , tb , extack ) ;
2006-09-18 00:10:50 -07:00
if ( err < 0 )
2017-10-11 10:28:01 +02:00
return err ;
2006-09-18 00:10:50 -07:00
2018-09-04 21:53:50 +02:00
if ( tb [ IFA_TARGET_NETNSID ] ) {
2018-09-04 21:53:55 +02:00
fillargs . netnsid = nla_get_s32 ( tb [ IFA_TARGET_NETNSID ] ) ;
2018-09-04 21:53:50 +02:00
tgt_net = rtnl_get_net_ns_capable ( NETLINK_CB ( in_skb ) . sk ,
2018-09-04 21:53:55 +02:00
fillargs . netnsid ) ;
2018-09-04 21:53:50 +02:00
if ( IS_ERR ( tgt_net ) )
return PTR_ERR ( tgt_net ) ;
}
2013-05-16 22:32:00 +00:00
addr = extract_addr ( tb [ IFA_ADDRESS ] , tb [ IFA_LOCAL ] , & peer ) ;
2024-02-22 12:17:47 +00:00
if ( ! addr ) {
err = - EINVAL ;
goto errout ;
}
2006-09-18 00:10:50 -07:00
ifm = nlmsg_data ( nlh ) ;
2006-07-28 18:12:12 +09:00
if ( ifm - > ifa_index )
2018-09-04 21:53:50 +02:00
dev = dev_get_by_index ( tgt_net , ifm - > ifa_index ) ;
2006-07-28 18:12:12 +09:00
2018-09-04 21:53:50 +02:00
ifa = ipv6_get_ifaddr ( tgt_net , addr , dev , 1 ) ;
2010-03-20 16:09:01 -07:00
if ( ! ifa ) {
2006-09-18 00:10:50 -07:00
err = - EADDRNOTAVAIL ;
goto errout ;
}
2006-07-28 18:12:12 +09:00
2010-03-20 16:09:01 -07:00
skb = nlmsg_new ( inet6_ifaddr_msgsize ( ) , GFP_KERNEL ) ;
if ( ! skb ) {
2006-07-28 18:12:12 +09:00
err = - ENOBUFS ;
2006-09-18 00:10:50 -07:00
goto errout_ifa ;
2006-07-28 18:12:12 +09:00
}
2018-09-04 21:53:55 +02:00
err = inet6_fill_ifaddr ( skb , ifa , & fillargs ) ;
2007-01-31 23:16:40 -08:00
if ( err < 0 ) {
/* -EMSGSIZE implies BUG in inet6_ifaddr_msgsize() */
WARN_ON ( err = = - EMSGSIZE ) ;
kfree_skb ( skb ) ;
goto errout_ifa ;
}
2018-09-04 21:53:50 +02:00
err = rtnl_unicast ( skb , tgt_net , NETLINK_CB ( in_skb ) . portid ) ;
2006-09-18 00:10:50 -07:00
errout_ifa :
2006-07-28 18:12:12 +09:00
in6_ifa_put ( ifa ) ;
2006-09-18 00:10:50 -07:00
errout :
2021-08-05 19:55:27 +08:00
dev_put ( dev ) ;
2018-09-04 21:53:55 +02:00
if ( fillargs . netnsid > = 0 )
2018-09-04 21:53:50 +02:00
put_net ( tgt_net ) ;
2006-07-28 18:12:12 +09:00
return err ;
}
2005-04-16 15:20:36 -07:00
static void inet6_ifa_notify ( int event , struct inet6_ifaddr * ifa )
{
struct sk_buff * skb ;
2008-03-25 21:47:49 +09:00
struct net * net = dev_net ( ifa - > idev - > dev ) ;
2018-09-04 21:53:55 +02:00
struct inet6_fill_args fillargs = {
. portid = 0 ,
. seq = 0 ,
. event = event ,
. flags = 0 ,
. netnsid = - 1 ,
} ;
2006-08-15 00:35:02 -07:00
int err = - ENOBUFS ;
2005-04-16 15:20:36 -07:00
2006-09-18 00:12:35 -07:00
skb = nlmsg_new ( inet6_ifaddr_msgsize ( ) , GFP_ATOMIC ) ;
2015-03-29 14:00:04 +01:00
if ( ! skb )
2006-08-15 00:35:02 -07:00
goto errout ;
2018-09-04 21:53:55 +02:00
err = inet6_fill_ifaddr ( skb , ifa , & fillargs ) ;
2007-01-31 23:16:40 -08:00
if ( err < 0 ) {
/* -EMSGSIZE implies BUG in inet6_ifaddr_msgsize() */
WARN_ON ( err = = - EMSGSIZE ) ;
kfree_skb ( skb ) ;
goto errout ;
}
2009-02-24 23:18:28 -08:00
rtnl_notify ( skb , net , 0 , RTNLGRP_IPV6_IFADDR , NULL , GFP_ATOMIC ) ;
return ;
2006-08-15 00:35:02 -07:00
errout :
if ( err < 0 )
2008-03-05 10:47:47 -08:00
rtnl_set_sk_err ( net , RTNLGRP_IPV6_IFADDR , err ) ;
2005-04-16 15:20:36 -07:00
}
2007-03-22 12:27:49 -07:00
static inline void ipv6_store_devconf ( struct ipv6_devconf * cnf ,
2005-04-16 15:20:36 -07:00
__s32 * array , int bytes )
{
2006-11-14 19:53:58 -08:00
BUG_ON ( bytes < ( DEVCONF_MAX * 4 ) ) ;
2005-04-16 15:20:36 -07:00
memset ( array , 0 , bytes ) ;
array [ DEVCONF_FORWARDING ] = cnf - > forwarding ;
array [ DEVCONF_HOPLIMIT ] = cnf - > hop_limit ;
array [ DEVCONF_MTU6 ] = cnf - > mtu6 ;
array [ DEVCONF_ACCEPT_RA ] = cnf - > accept_ra ;
array [ DEVCONF_ACCEPT_REDIRECTS ] = cnf - > accept_redirects ;
array [ DEVCONF_AUTOCONF ] = cnf - > autoconf ;
array [ DEVCONF_DAD_TRANSMITS ] = cnf - > dad_transmits ;
array [ DEVCONF_RTR_SOLICITS ] = cnf - > rtr_solicits ;
2010-11-17 01:44:24 +00:00
array [ DEVCONF_RTR_SOLICIT_INTERVAL ] =
jiffies_to_msecs ( cnf - > rtr_solicit_interval ) ;
2016-09-27 23:57:58 -07:00
array [ DEVCONF_RTR_SOLICIT_MAX_INTERVAL ] =
jiffies_to_msecs ( cnf - > rtr_solicit_max_interval ) ;
2010-11-17 01:44:24 +00:00
array [ DEVCONF_RTR_SOLICIT_DELAY ] =
jiffies_to_msecs ( cnf - > rtr_solicit_delay ) ;
2005-04-16 15:20:36 -07:00
array [ DEVCONF_FORCE_MLD_VERSION ] = cnf - > force_mld_version ;
2013-08-14 01:03:46 +02:00
array [ DEVCONF_MLDV1_UNSOLICITED_REPORT_INTERVAL ] =
jiffies_to_msecs ( cnf - > mldv1_unsolicited_report_interval ) ;
array [ DEVCONF_MLDV2_UNSOLICITED_REPORT_INTERVAL ] =
jiffies_to_msecs ( cnf - > mldv2_unsolicited_report_interval ) ;
2005-04-16 15:20:36 -07:00
array [ DEVCONF_USE_TEMPADDR ] = cnf - > use_tempaddr ;
array [ DEVCONF_TEMP_VALID_LFT ] = cnf - > temp_valid_lft ;
array [ DEVCONF_TEMP_PREFERED_LFT ] = cnf - > temp_prefered_lft ;
array [ DEVCONF_REGEN_MAX_RETRY ] = cnf - > regen_max_retry ;
array [ DEVCONF_MAX_DESYNC_FACTOR ] = cnf - > max_desync_factor ;
array [ DEVCONF_MAX_ADDRESSES ] = cnf - > max_addresses ;
2006-03-20 16:55:08 -08:00
array [ DEVCONF_ACCEPT_RA_DEFRTR ] = cnf - > accept_ra_defrtr ;
net: allow user to set metric on default route learned via Router Advertisement
For IPv4, the default route is learned via DHCPv4 and the user can change its
metric through configuration in /etc/network/interfaces. For IPv6, the default
route can be learned via RA, for which a fixed metric value of 1024 is
currently used. Ideally, the user should be able to configure the metric of
the IPv6 default route as for IPv4. This patch adds a sysctl for that purpose.
Logs:
For IPv4:
Config in /etc/network/interfaces:
auto eth0
iface eth0 inet dhcp
metric 4261413864
IPv4 Kernel Route Table:
$ ip route list
default via 172.21.47.1 dev eth0 metric 4261413864
FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over DHCPv4 default route.]
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
> - selected route, * - FIB route
S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03
K 0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m
i.e. the user can prefer a default route learned via a routing protocol in IPv4.
Similar behavior is not possible for IPv6 without this fix.
After fix [for IPv6]:
sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489705
IP monitor: [When IPv6 RA is received]
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 pref high
Kernel IPv6 routing table
$ ip -6 route list
default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 21sec hoplimit 64 pref high
FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over IPv6 RA default route.]
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
> - selected route, * - FIB route
S>* ::/0 [20/0] is directly connected, eth0, 00:00:06
K ::/0 [119/1001] via fe80::xx16:xxxx:feb3:ce8e, eth0, 6d07h43m
If the metric is changed later, the effect is seen only when the next IPv6
RA is received, because the default route must be fully controlled by RA msgs.
Below, the metric is changed from 1996489705 to 1996489704.
$ sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489704
net.ipv6.conf.eth0.ra_defrtr_metric = 1996489704
IP monitor:
[On next IPv6 RA msg, Kernel deletes prev route and installs new route with updated metric]
Deleted default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 3sec hoplimit 64 pref high
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489704 pref high
Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210125214430.24079-1-pchaudhary@linkedin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 13:44:30 -08:00
array [ DEVCONF_RA_DEFRTR_METRIC ] = cnf - > ra_defrtr_metric ;
2015-07-30 14:28:42 +08:00
array [ DEVCONF_ACCEPT_RA_MIN_HOP_LIMIT ] = cnf - > accept_ra_min_hop_limit ;
2006-03-20 16:55:26 -08:00
array [ DEVCONF_ACCEPT_RA_PINFO ] = cnf - > accept_ra_pinfo ;
2006-03-20 17:05:30 -08:00
# ifdef CONFIG_IPV6_ROUTER_PREF
array [ DEVCONF_ACCEPT_RA_RTR_PREF ] = cnf - > accept_ra_rtr_pref ;
2010-11-17 01:44:24 +00:00
array [ DEVCONF_RTR_PROBE_INTERVAL ] =
jiffies_to_msecs ( cnf - > rtr_probe_interval ) ;
2007-01-30 14:30:10 -08:00
# ifdef CONFIG_IPV6_ROUTE_INFO
2017-03-22 18:19:04 +09:00
array [ DEVCONF_ACCEPT_RA_RT_INFO_MIN_PLEN ] = cnf - > accept_ra_rt_info_min_plen ;
2006-03-20 17:07:03 -08:00
array [ DEVCONF_ACCEPT_RA_RT_INFO_MAX_PLEN ] = cnf - > accept_ra_rt_info_max_plen ;
# endif
2006-03-20 17:05:30 -08:00
# endif
2006-09-22 14:43:49 -07:00
array [ DEVCONF_PROXY_NDP ] = cnf - > proxy_ndp ;
2007-04-24 14:58:30 -07:00
array [ DEVCONF_ACCEPT_SOURCE_ROUTE ] = cnf - > accept_source_route ;
2007-04-25 17:08:10 -07:00
# ifdef CONFIG_IPV6_OPTIMISTIC_DAD
array [ DEVCONF_OPTIMISTIC_DAD ] = cnf - > optimistic_dad ;
net: ipv6: Add a sysctl to make optimistic addresses useful candidates
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes. Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.
This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).
The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s)), but not in the multiple distinct
networks case.
For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try to use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.
Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMISTIC
flag appropriately set). A second RTM_NEWADDR is sent if DAD
completes (the address flags have changed), otherwise an
RTM_DELADDR is sent.
Also: add an entry in ip-sysctl.txt for optimistic_dad.
Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28 18:11:14 +09:00
array [ DEVCONF_USE_OPTIMISTIC ] = cnf - > use_optimistic ;
2007-04-25 17:08:10 -07:00
# endif
2008-04-03 09:22:53 +09:00
# ifdef CONFIG_IPV6_MROUTE
2022-02-04 12:15:45 -08:00
array [ DEVCONF_MC_FORWARDING ] = atomic_read ( & cnf - > mc_forwarding ) ;
2008-04-03 09:22:53 +09:00
# endif
2008-06-28 14:17:11 +09:00
array [ DEVCONF_DISABLE_IPV6 ] = cnf - > disable_ipv6 ;
2008-06-28 14:18:38 +09:00
array [ DEVCONF_ACCEPT_DAD ] = cnf - > accept_dad ;
2009-10-09 03:11:14 +00:00
array [ DEVCONF_FORCE_TLLAO ] = cnf - > force_tllao ;
2012-11-06 16:46:20 +00:00
array [ DEVCONF_NDISC_NOTIFY ] = cnf - > ndisc_notify ;
2013-08-27 01:36:51 +02:00
array [ DEVCONF_SUPPRESS_FRAG_NDISC ] = cnf - > suppress_frag_ndisc ;
2014-06-25 14:44:53 -07:00
array [ DEVCONF_ACCEPT_RA_FROM_LOCAL ] = cnf - > accept_ra_from_local ;
2015-01-20 10:06:05 -07:00
array [ DEVCONF_ACCEPT_RA_MTU ] = cnf - > accept_ra_mtu ;
2015-08-13 10:39:01 -04:00
array [ DEVCONF_IGNORE_ROUTES_WITH_LINKDOWN ] = cnf - > ignore_routes_with_linkdown ;
2015-03-23 23:36:00 +01:00
/* we omit DEVCONF_STABLE_SECRET for now */
2015-07-22 16:38:25 +09:00
array [ DEVCONF_USE_OIF_ADDRS_ONLY ] = cnf - > use_oif_addrs_only ;
2016-02-04 13:31:19 +01:00
array [ DEVCONF_DROP_UNICAST_IN_L2_MULTICAST ] = cnf - > drop_unicast_in_l2_multicast ;
2016-02-04 13:31:20 +01:00
array [ DEVCONF_DROP_UNSOLICITED_NA ] = cnf - > drop_unsolicited_na ;
net: ipv6: Make address flushing on ifdown optional
Currently, all ipv6 addresses are flushed when the interface is configured
down, including global, static addresses:
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
<< nothing; all addresses have been flushed>>
Add a new sysctl to make this behavior optional. The new setting defaults to
flushing all addresses to maintain backwards compatibility. When it is set,
global addresses with no expiry times are not flushed on an admin down. The
sysctl is per-interface or system-wide for all interfaces:
$ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1
or
$ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1
Will keep addresses on eth1 on an admin down.
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000
inet6 2100:1::2/120 scope global tentative
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative
valid_lft forever preferred_lft forever
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 09:25:37 -08:00
array [ DEVCONF_KEEP_ADDR_ON_DOWN ] = cnf - > keep_addr_on_down ;
2016-11-08 14:57:39 +01:00
array [ DEVCONF_SEG6_ENABLED ] = cnf - > seg6_enabled ;
2016-11-08 14:57:42 +01:00
# ifdef CONFIG_IPV6_SEG6_HMAC
array [ DEVCONF_SEG6_REQUIRE_HMAC ] = cnf - > seg6_require_hmac ;
# endif
2016-12-02 14:00:08 -08:00
array [ DEVCONF_ENHANCED_DAD ] = cnf - > enhanced_dad ;
2017-01-26 16:59:17 +13:00
array [ DEVCONF_ADDR_GEN_MODE ] = cnf - > addr_gen_mode ;
2017-02-23 16:27:18 +00:00
array [ DEVCONF_DISABLE_POLICY ] = cnf - > disable_policy ;
net: ipv6: sysctl to specify IPv6 ND traffic class
Add a per-device sysctl to specify the default traffic class to use for
kernel originated IPv6 Neighbour Discovery packets.
Currently this includes:
- Router Solicitation (ICMPv6 type 133)
ndisc_send_rs() -> ndisc_send_skb() -> ip6_nd_hdr()
- Neighbour Solicitation (ICMPv6 type 135)
ndisc_send_ns() -> ndisc_send_skb() -> ip6_nd_hdr()
- Neighbour Advertisement (ICMPv6 type 136)
ndisc_send_na() -> ndisc_send_skb() -> ip6_nd_hdr()
- Redirect (ICMPv6 type 137)
ndisc_send_redirect() -> ndisc_send_skb() -> ip6_nd_hdr()
and if the kernel ever gets around to generating RAs,
it would presumably also include:
- Router Advertisement (ICMPv6 type 134)
(radvd daemon could pick up on the kernel setting and use it)
Interface drivers may examine the Traffic Class value and translate
the DiffServ Code Point into a link-layer appropriate traffic
prioritization scheme. An example of mapping IETF DSCP values to
IEEE 802.11 User Priority values can be found here:
https://tools.ietf.org/html/draft-ietf-tsvwg-ieee-802-11
The expected primary use case is to properly prioritize ND over wifi.
Testing:
jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
0
jzem22:~# echo -1 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
-bash: echo: write error: Invalid argument
jzem22:~# echo 256 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
-bash: echo: write error: Invalid argument
jzem22:~# echo 0 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
jzem22:~# echo 255 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
255
jzem22:~# echo 34 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
34
jzem22:~# echo $[0xDC] > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
jzem22:~# tcpdump -v -i eth0 icmp6 and src host jzem22.pgc and dst host fe80::1
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
IP6 (class 0xdc, hlim 255, next-header ICMPv6 (58) payload length: 24)
jzem22.pgc > fe80::1: [icmp6 sum ok] ICMP6, neighbor advertisement,
length 24, tgt is jzem22.pgc, Flags [solicited]
(based on original change written by Erik Kline, with minor changes)
v2: fix 'suspicious rcu_dereference_check() usage'
by explicitly grabbing the rcu_read_lock.
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Erik Kline <ek@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-07 21:52:09 -08:00
array [ DEVCONF_NDISC_TCLASS ] = cnf - > ndisc_tclass ;
2020-03-27 18:00:20 -04:00
array [ DEVCONF_RPL_SEG_ENABLED ] = cnf - > rpl_seg_enabled ;
ipv6: ioam: Data plane support for Pre-allocated Trace
Implement support for processing the IOAM Pre-allocated Trace with IPv6,
see [1] and [2]. Introduce a new IPv6 Hop-by-Hop TLV option, see IANA [3].
A new per-interface sysctl is introduced. The value is a boolean to accept (=1)
or ignore (=0, by default) IPv6 IOAM options on ingress for an interface:
- net.ipv6.conf.XXX.ioam6_enabled
Two other sysctls are introduced to define IOAM IDs, represented by an integer.
They are respectively per-namespace and per-interface:
- net.ipv6.ioam6_id
- net.ipv6.conf.XXX.ioam6_id
The value of the first one represents the IOAM ID of the node itself (u32; max
and default value = U32_MAX>>8, due to hop limit concatenation) while the other
represents the IOAM ID of an interface (u16; max and default value = U16_MAX).
Each "ioam6_id" sysctl has a "_wide" equivalent:
- net.ipv6.ioam6_id_wide
- net.ipv6.conf.XXX.ioam6_id_wide
The value of the first one represents the wide IOAM ID of the node itself (u64;
max and default value = U64_MAX>>8, due to hop limit concatenation) while the
other represents the wide IOAM ID of an interface (u32; max and default value
= U32_MAX).
The use of short and wide equivalents is not exclusive, a deployment could
choose to leverage both. For example, net.ipv6.conf.XXX.ioam6_id (short format)
could be an identifier for a physical interface, whereas
net.ipv6.conf.XXX.ioam6_id_wide (wide format) could be an identifier for a
logical sub-interface. Documentation about new sysctls is provided at the end
of this patchset.
Two relativistic hash tables are used: one for IOAM namespaces, the other for
IOAM schemas. A namespace can only have a single active schema and a schema
can only be attached to a single namespace (1:1 relationship).
[1] https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options
[2] https://tools.ietf.org/html/draft-ietf-ippm-ioam-data
[3] https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml#ipv6-parameters-2
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 21:42:57 +02:00
array [ DEVCONF_IOAM6_ENABLED ] = cnf - > ioam6_enabled ;
array [ DEVCONF_IOAM6_ID ] = cnf - > ioam6_id ;
array [ DEVCONF_IOAM6_ID_WIDE ] = cnf - > ioam6_id_wide ;
2021-11-01 10:36:29 -07:00
array [ DEVCONF_NDISC_EVICT_NOCARRIER ] = cnf - > ndisc_evict_nocarrier ;
2022-05-30 10:14:14 +00:00
array [ DEVCONF_ACCEPT_UNTRACKED_NA ] = cnf - > accept_untracked_na ;
2023-07-26 16:07:01 -07:00
array [ DEVCONF_ACCEPT_RA_MIN_LFT ] = cnf - > accept_ra_min_lft ;
2005-04-16 15:20:36 -07:00
}
2010-11-16 04:33:57 +00:00
static inline size_t inet6_ifla6_size ( void )
{
return nla_total_size ( 4 ) /* IFLA_INET6_FLAGS */
+ nla_total_size ( sizeof ( struct ifla_cacheinfo ) )
+ nla_total_size ( DEVCONF_MAX * 4 ) /* IFLA_INET6_CONF */
+ nla_total_size ( IPSTATS_MIB_MAX * 8 ) /* IFLA_INET6_STATS */
net: ipv6: add tokenized interface identifier support
This patch adds support for IPv6 tokenized IIDs, that allow
for administrators to assign well-known host-part addresses
to nodes whilst still obtaining global network prefix from
Router Advertisements. It is currently in draft status.
The primary target for such support is server platforms
where addresses are usually manually configured, rather
than using DHCPv6 or SLAAC. By using tokenised identifiers,
hosts can still determine their network prefix by use of
SLAAC, but more readily be automatically renumbered should
their network prefix change. [...]
The disadvantage with static addresses is that they are
likely to require manual editing should the network prefix
in use change. If instead there were a method to only
manually configure the static identifier part of the IPv6
address, then the address could be automatically updated
when a new prefix was introduced, as described in [RFC4192]
for example. In such cases a DNS server might be
configured with such a tokenised interface identifier of
::53, and SLAAC would use the token in constructing the
interface address, using the advertised prefix. [...]
http://tools.ietf.org/html/draft-chown-6man-tokenised-ipv6-identifiers-02
The implementation is partially based on Mark K. Thompson's
proof of concept. However, it uses the Netlink interface for
configuration and data retrieval, so that it can easily be
extended in the future. Successfully tested by myself.
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-08 04:01:30 +00:00
+ nla_total_size ( ICMP6_MIB_MAX * 8 ) /* IFLA_INET6_ICMP6STATS */
2018-07-09 12:25:16 +02:00
+ nla_total_size ( sizeof ( struct in6_addr ) ) /* IFLA_INET6_TOKEN */
+ nla_total_size ( 1 ) /* IFLA_INET6_ADDR_GEN_MODE */
ipv6: add IFLA_INET6_RA_MTU to expose mtu value
The kernel provides a "/proc/sys/net/ipv6/conf/<iface>/mtu"
file, which can temporarily record the mtu value of the last
received RA message when the RA mtu value is lower than the
interface mtu, but this proc file has the following limitations:
(1) when the interface mtu (/sys/class/net/<iface>/mtu) is
updated, mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) will
be updated to the value of the interface mtu;
(2) mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) only affects
IPv6 connections, not IPv4.
Therefore, when the mtu option is carried in the RA message,
there will be a problem that the user sometimes cannot obtain
RA mtu value correctly by reading mtu6.
After this patch set, if an RA message carries the mtu option,
you can send a netlink msg whose nlmsg_type is RTM_GETLINK
and parse the IFLA_INET6_RA_MTU attribute to get the mtu
value carried in the RA message received on the inet6 device.
In addition, you can also get a link notification when ra_mtu
is updated, so you do not have to poll.
In this way, if the MTU values that the device receives from
the network in the PCO IPv4 and the RA IPv6 procedures are
different, the user can obtain the correct ipv6 ra_mtu value
and compare the value of ra_mtu and ipv4 mtu, then the device
can use the lower MTU value for both IPv4 and IPv6.
Signed-off-by: Rocco Yue <rocco.yue@mediatek.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210827150412.9267-1-rocco.yue@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-27 23:04:12 +08:00
+ nla_total_size ( 4 ) /* IFLA_INET6_RA_MTU */
2018-07-09 12:25:16 +02:00
+ 0 ;
2010-11-16 04:33:57 +00:00
}
2006-11-10 14:10:15 -08:00
static inline size_t inet6_if_nlmsg_size ( void )
{
return NLMSG_ALIGN ( sizeof ( struct ifinfomsg ) )
+ nla_total_size ( IFNAMSIZ ) /* IFLA_IFNAME */
+ nla_total_size ( MAX_ADDR_LEN ) /* IFLA_ADDRESS */
+ nla_total_size ( 4 ) /* IFLA_MTU */
+ nla_total_size ( 4 ) /* IFLA_LINK */
2015-08-13 15:26:35 -04:00
+ nla_total_size ( 1 ) /* IFLA_OPERSTATE */
2010-11-16 04:33:57 +00:00
+ nla_total_size ( inet6_ifla6_size ( ) ) ; /* IFLA_PROTINFO */
2006-11-10 14:10:15 -08:00
}
2006-06-17 22:48:48 -07:00
2011-05-19 01:14:23 +00:00
static inline void __snmp6_fill_statsdev ( u64 * stats , atomic_long_t * mib ,
2016-09-30 11:29:03 +08:00
int bytes )
2007-04-24 21:54:09 -07:00
{
int i ;
2016-09-30 11:29:03 +08:00
int pad = bytes - sizeof ( u64 ) * ICMP6_MIB_MAX ;
2007-04-24 21:54:09 -07:00
BUG_ON ( pad < 0 ) ;
/* Use put_unaligned() because stats may not be aligned for u64. */
2016-09-30 11:29:03 +08:00
put_unaligned ( ICMP6_MIB_MAX , & stats [ 0 ] ) ;
for ( i = 1 ; i < ICMP6_MIB_MAX ; i + + )
2011-05-19 01:14:23 +00:00
put_unaligned ( atomic_long_read ( & mib [ i ] ) , & stats [ i ] ) ;
2007-04-24 21:54:09 -07:00
2016-09-30 11:29:03 +08:00
memset ( & stats [ ICMP6_MIB_MAX ] , 0 , pad ) ;
2007-04-24 21:54:09 -07:00
}
2014-05-05 15:55:55 -07:00
static inline void __snmp6_fill_stats64 ( u64 * stats , void __percpu * mib ,
2015-08-30 11:29:42 +05:30
int bytes , size_t syncpoff )
2010-06-30 13:31:19 -07:00
{
2015-08-30 11:29:42 +05:30
int i , c ;
u64 buff [ IPSTATS_MIB_MAX ] ;
int pad = bytes - sizeof ( u64 ) * IPSTATS_MIB_MAX ;
2010-06-30 13:31:19 -07:00
BUG_ON ( pad < 0 ) ;
2015-08-30 11:29:42 +05:30
memset ( buff , 0 , sizeof ( buff ) ) ;
buff [ 0 ] = IPSTATS_MIB_MAX ;
2010-06-30 13:31:19 -07:00
2015-08-30 11:29:42 +05:30
for_each_possible_cpu ( c ) {
for ( i = 1 ; i < IPSTATS_MIB_MAX ; i + + )
buff [ i ] + = snmp_get_cpu_field64 ( mib , c , i , syncpoff ) ;
}
memcpy ( stats , buff , IPSTATS_MIB_MAX * sizeof ( u64 ) ) ;
memset ( & stats [ IPSTATS_MIB_MAX ] , 0 , pad ) ;
2010-06-30 13:31:19 -07:00
}
2007-04-24 21:54:09 -07:00
static void snmp6_fill_stats ( u64 * stats , struct inet6_dev * idev , int attrtype ,
int bytes )
{
2010-03-20 16:09:01 -07:00
switch ( attrtype ) {
2007-04-24 21:54:09 -07:00
case IFLA_INET6_STATS :
2015-08-30 11:29:42 +05:30
__snmp6_fill_stats64 ( stats , idev - > stats . ipv6 , bytes ,
offsetof ( struct ipstats_mib , syncp ) ) ;
2007-04-24 21:54:09 -07:00
break ;
case IFLA_INET6_ICMP6STATS :
2016-09-30 11:29:03 +08:00
__snmp6_fill_statsdev ( stats , idev - > stats . icmpv6dev - > mibs , bytes ) ;
2007-04-24 21:54:09 -07:00
break ;
}
}
2015-09-11 16:48:48 -04:00
static int inet6_fill_ifla6_attrs ( struct sk_buff * skb , struct inet6_dev * idev ,
u32 ext_filter_mask )
2010-11-16 04:33:57 +00:00
{
struct nlattr * nla ;
struct ifla_cacheinfo ci ;
2012-04-01 20:27:33 -04:00
if ( nla_put_u32 ( skb , IFLA_INET6_FLAGS , idev - > if_flags ) )
goto nla_put_failure ;
2010-11-16 04:33:57 +00:00
ci . max_reasm_len = IPV6_MAXPLEN ;
2010-11-19 13:13:47 -08:00
ci . tstamp = cstamp_delta ( idev - > tstamp ) ;
ci . reachable_time = jiffies_to_msecs ( idev - > nd_parms - > reachable_time ) ;
2013-12-07 19:26:53 +01:00
ci . retrans_time = jiffies_to_msecs ( NEIGH_VAR ( idev - > nd_parms , RETRANS_TIME ) ) ;
2012-04-01 20:27:33 -04:00
if ( nla_put ( skb , IFLA_INET6_CACHEINFO , sizeof ( ci ) , & ci ) )
goto nla_put_failure ;
2010-11-16 04:33:57 +00:00
nla = nla_reserve ( skb , IFLA_INET6_CONF , DEVCONF_MAX * sizeof ( s32 ) ) ;
2015-03-29 14:00:04 +01:00
if ( ! nla )
2010-11-16 04:33:57 +00:00
goto nla_put_failure ;
ipv6_store_devconf ( & idev - > cnf , nla_data ( nla ) , nla_len ( nla ) ) ;
/* XXX - MC not implemented */
2015-09-11 16:48:48 -04:00
if ( ext_filter_mask & RTEXT_FILTER_SKIP_STATS )
return 0 ;
2010-11-16 04:33:57 +00:00
nla = nla_reserve ( skb , IFLA_INET6_STATS , IPSTATS_MIB_MAX * sizeof ( u64 ) ) ;
2015-03-29 14:00:04 +01:00
if ( ! nla )
2010-11-16 04:33:57 +00:00
goto nla_put_failure ;
snmp6_fill_stats ( nla_data ( nla ) , idev , IFLA_INET6_STATS , nla_len ( nla ) ) ;
nla = nla_reserve ( skb , IFLA_INET6_ICMP6STATS , ICMP6_MIB_MAX * sizeof ( u64 ) ) ;
2015-03-29 14:00:04 +01:00
if ( ! nla )
2010-11-16 04:33:57 +00:00
goto nla_put_failure ;
snmp6_fill_stats ( nla_data ( nla ) , idev , IFLA_INET6_ICMP6STATS , nla_len ( nla ) ) ;
2013-04-08 04:01:30 +00:00
nla = nla_reserve ( skb , IFLA_INET6_TOKEN , sizeof ( struct in6_addr ) ) ;
2015-03-29 14:00:04 +01:00
if ( ! nla )
2013-04-08 04:01:30 +00:00
goto nla_put_failure ;
read_lock_bh ( & idev - > lock ) ;
memcpy ( nla_data ( nla ) , idev - > token . s6_addr , nla_len ( nla ) ) ;
read_unlock_bh ( & idev - > lock ) ;
2019-09-30 14:02:16 +02:00
if ( nla_put_u8 ( skb , IFLA_INET6_ADDR_GEN_MODE , idev - > cnf . addr_gen_mode ) )
goto nla_put_failure ;
2021-08-27 23:04:12 +08:00
	if (idev->ra_mtu &&
	    nla_put_u32(skb, IFLA_INET6_RA_MTU, idev->ra_mtu))
		goto nla_put_failure;
2010-11-16 04:33:57 +00:00
	return 0;
nla_put_failure:
	return -EMSGSIZE;
}
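The IFLA_INET6_RA_MTU commit message above describes a consumer that compares the RA-provided MTU with the IPv4 MTU and uses the lower one for both families. A minimal userspace sketch of that comparison follows. The helper name pick_common_mtu() is hypothetical and not part of the kernel or any library. It assumes a value of 0 stands for "no IFLA_INET6_RA_MTU attribute was present", which mirrors the fill code above only emitting the attribute when idev->ra_mtu is non-zero.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only).
 * ipv4_mtu: MTU learned for IPv4 (for example via PCO).
 * ra_mtu:   value parsed from IFLA_INET6_RA_MTU, or 0 if the
 *           attribute was absent from the RTM_GETLINK reply.
 * Returns the MTU to apply to both address families.
 */
static uint32_t pick_common_mtu(uint32_t ipv4_mtu, uint32_t ra_mtu)
{
	/* No RA MTU known: nothing to compare against. */
	if (ra_mtu == 0)
		return ipv4_mtu;

	/* Otherwise use the lower of the two values. */
	return ra_mtu < ipv4_mtu ? ra_mtu : ipv4_mtu;
}
```

Parsing the netlink reply itself is left out on purpose; only the decision rule from the commit message is shown.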
2015-10-19 09:23:28 -07:00
static size_t inet6_get_link_af_size(const struct net_device *dev,
				     u32 ext_filter_mask)
2010-11-16 04:33:57 +00:00
{
	if (!__in6_dev_get(dev))
		return 0;
	return inet6_ifla6_size();
}
2015-09-11 16:48:48 -04:00
static int inet6_fill_link_af(struct sk_buff *skb, const struct net_device *dev,
			      u32 ext_filter_mask)
2010-11-16 04:33:57 +00:00
{
	struct inet6_dev *idev = __in6_dev_get(dev);
	if (!idev)
		return -ENODATA;
2015-09-11 16:48:48 -04:00
	if (inet6_fill_ifla6_attrs(skb, idev, ext_filter_mask) < 0)
2010-11-16 04:33:57 +00:00
		return -EMSGSIZE;
	return 0;
}
2021-04-07 08:59:12 -07:00
static int inet6_set_iftoken(struct inet6_dev *idev, struct in6_addr *token,
			     struct netlink_ext_ack *extack)
net: ipv6: add tokenized interface identifier support
This patch adds support for IPv6 tokenized IIDs, that allow
for administrators to assign well-known host-part addresses
to nodes whilst still obtaining global network prefix from
Router Advertisements. It is currently in draft status.
The primary target for such support is server platforms
where addresses are usually manually configured, rather
than using DHCPv6 or SLAAC. By using tokenised identifiers,
hosts can still determine their network prefix by use of
SLAAC, but more readily be automatically renumbered should
their network prefix change. [...]
The disadvantage with static addresses is that they are
likely to require manual editing should the network prefix
in use change. If instead there were a method to only
manually configure the static identifier part of the IPv6
address, then the address could be automatically updated
when a new prefix was introduced, as described in [RFC4192]
for example. In such cases a DNS server might be
configured with such a tokenised interface identifier of
::53, and SLAAC would use the token in constructing the
interface address, using the advertised prefix. [...]
http://tools.ietf.org/html/draft-chown-6man-tokenised-ipv6-identifiers-02
The implementation is partially based on top of Mark K.
Thompson's proof of concept. However, it uses the Netlink
interface for configuration and data retrieval, so that
it can be easily extended in future. Successfully tested
by myself.
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-08 04:01:30 +00:00
{
	struct inet6_ifaddr *ifp;
	struct net_device *dev = idev->dev;
2016-04-08 15:55:00 +02:00
	bool clear_token, update_rs = false;
2013-06-24 21:42:40 +02:00
	struct in6_addr ll_addr;
2013-04-08 04:01:30 +00:00
2014-03-27 18:28:07 +01:00
	ASSERT_RTNL();
2015-03-29 14:00:04 +01:00
	if (!token)
2013-04-08 04:01:30 +00:00
		return -EINVAL;
2021-04-07 08:59:12 -07:00
	if (dev->flags & IFF_LOOPBACK) {
		NL_SET_ERR_MSG_MOD(extack, "Device is loopback");
2013-04-08 04:01:30 +00:00
		return -EINVAL;
2021-04-07 08:59:12 -07:00
	}
	if (dev->flags & IFF_NOARP) {
		NL_SET_ERR_MSG_MOD(extack,
				   "Device does not do neighbour discovery");
		return -EINVAL;
	}
	if (!ipv6_accept_ra(idev)) {
		NL_SET_ERR_MSG_MOD(extack,
				   "Router advertisement is disabled on device");
2013-04-08 04:01:30 +00:00
		return -EINVAL;
2021-04-07 08:59:12 -07:00
	}
	if (idev->cnf.rtr_solicits == 0) {
		NL_SET_ERR_MSG(extack,
			       "Router solicitation is disabled on device");
2013-04-08 04:01:30 +00:00
		return -EINVAL;
2021-04-07 08:59:12 -07:00
	}
2013-04-08 04:01:30 +00:00
	write_lock_bh(&idev->lock);
	BUILD_BUG_ON(sizeof(token->s6_addr) != 16);
	memcpy(idev->token.s6_addr + 8, token->s6_addr + 8, 8);
	write_unlock_bh(&idev->lock);
2016-04-08 15:55:00 +02:00
	clear_token = ipv6_addr_any(token);
	if (clear_token)
		goto update_lft;
2013-06-24 21:42:40 +02:00
	if (!idev->dead && (idev->if_flags & IF_READY) &&
	    !ipv6_get_lladdr(dev, &ll_addr, IFA_F_TENTATIVE |
			     IFA_F_OPTIMISTIC)) {
2013-04-09 03:47:15 +00:00
		/* If we're not ready, then normal ifup will take care
		 * of this. Otherwise, we need to request our rs here.
		 */
		ndisc_send_rs(dev, &ll_addr, &in6addr_linklocal_allrouters);
		update_rs = true;
	}
2013-04-08 04:01:30 +00:00
2016-04-08 15:55:00 +02:00
update_lft:
2013-04-08 04:01:30 +00:00
	write_lock_bh(&idev->lock);
2013-04-09 03:47:15 +00:00
2013-06-26 03:41:49 +02:00
	if (update_rs) {
2013-04-09 03:47:15 +00:00
		idev->if_flags |= IF_RS_SENT;
2016-09-27 23:57:58 -07:00
		idev->rs_interval = rfc3315_s14_backoff_init(
			idev->cnf.rtr_solicit_interval);
2013-06-26 03:41:49 +02:00
		idev->rs_probes = 1;
2016-09-27 23:57:58 -07:00
		addrconf_mod_rs_timer(idev, idev->rs_interval);
2013-06-26 03:41:49 +02:00
	}
2013-04-08 04:01:30 +00:00
/* Well, that's kinda nasty ... */
	list_for_each_entry(ifp, &idev->addr_list, if_list) {
		spin_lock(&ifp->lock);
2013-04-09 03:47:16 +00:00
		if (ifp->tokenized) {
2013-04-08 04:01:30 +00:00
			ifp->valid_lft = 0;
			ifp->prefered_lft = 0;
		}
		spin_unlock(&ifp->lock);
	}
	write_unlock_bh(&idev->lock);
2014-10-27 17:39:16 +01:00
	inet6_ifinfo_notify(RTM_NEWLINK, idev);
2022-02-07 20:50:29 -08:00
	addrconf_verify_rtnl(dev_net(dev));
2013-04-08 04:01:30 +00:00
	return 0;
}
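inet6_set_iftoken() above stores only the lower 64 bits of the supplied token; per the commit message, SLAAC then combines that token with the advertised prefix. The following userspace sketch illustrates that combination for a /64 prefix. The function name tokenised_address() is hypothetical, and the /64 split is an assumption taken from the 8-byte memcpy in the kernel code, not a statement about how the kernel's prefix handling is implemented.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical userspace helper (illustration only).
 * Combines an advertised /64 prefix with a tokenised interface
 * identifier and writes the textual address into buf.
 * Returns 0 on success, -1 if parsing or formatting fails.
 */
static int tokenised_address(const char *prefix, const char *token,
			     char *buf, socklen_t len)
{
	struct in6_addr p, t, out;

	if (inet_pton(AF_INET6, prefix, &p) != 1 ||
	    inet_pton(AF_INET6, token, &t) != 1)
		return -1;

	/* Upper 64 bits: network prefix from the Router Advertisement. */
	memcpy(out.s6_addr, p.s6_addr, 8);
	/* Lower 64 bits: the token, the same half that
	 * inet6_set_iftoken() copies into idev->token. */
	memcpy(out.s6_addr + 8, t.s6_addr + 8, 8);

	return inet_ntop(AF_INET6, &out, buf, len) ? 0 : -1;
}
```

With the DNS server example from the commit message, a prefix of 2001:db8:1:2:: (a documentation prefix chosen here for illustration) and a token of ::53 give 2001:db8:1:2::53. If the prefix changes, only the upper half changes and the token part stays fixed.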
2015-02-05 14:39:11 +01:00
static const struct nla_policy inet6_af_policy[IFLA_INET6_MAX + 1] = {
	[IFLA_INET6_ADDR_GEN_MODE]	= { .type = NLA_U8 },
	[IFLA_INET6_TOKEN]		= { .len = sizeof(struct in6_addr) },
2021-08-27 23:04:12 +08:00
	[IFLA_INET6_RA_MTU]		= { .type = NLA_REJECT,
					    .reject_message =
						"IFLA_INET6_RA_MTU can not be set" },
2015-02-05 14:39:11 +01:00
};
2017-01-26 16:59:17 +13:00
static int check_addr_gen_mode(int mode)
{
	if (mode != IN6_ADDR_GEN_MODE_EUI64 &&
	    mode != IN6_ADDR_GEN_MODE_NONE &&
	    mode != IN6_ADDR_GEN_MODE_STABLE_PRIVACY &&
	    mode != IN6_ADDR_GEN_MODE_RANDOM)
		return -EINVAL;
	return 1;
}
static int check_stable_privacy(struct inet6_dev *idev, struct net *net,
				int mode)
{
	if (mode == IN6_ADDR_GEN_MODE_STABLE_PRIVACY &&
	    !idev->cnf.stable_secret.initialized &&
	    !net->ipv6.devconf_dflt->stable_secret.initialized)
		return -EINVAL;
	return 1;
}
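The two checks above can be read as one rule: the mode must be one of the four known values, and stable-privacy mode additionally needs a stable secret on either the device or the namespace defaults. Below is a userspace sketch of that rule. Everything in it is hypothetical: the enum and the function validate_gen_mode() are local stand-ins, the two booleans replace the kernel's stable_secret.initialized fields, and the sketch returns 0 on success, whereas the kernel helpers return 1 and callers test for a value below zero.

```c
#include <errno.h>
#include <stdbool.h>

/* Local stand-ins for the kernel's IN6_ADDR_GEN_MODE_* constants
 * (names are hypothetical, used for illustration only). */
enum gen_mode {
	GEN_MODE_EUI64,
	GEN_MODE_NONE,
	GEN_MODE_STABLE_PRIVACY,
	GEN_MODE_RANDOM,
};

/*
 * Hypothetical combined check.
 * dev_secret_set / dflt_secret_set model whether a stable secret is
 * initialized on the device or in the namespace defaults.
 * Returns 0 if the mode may be applied, -EINVAL otherwise.
 */
static int validate_gen_mode(int mode, bool dev_secret_set,
			     bool dflt_secret_set)
{
	switch (mode) {
	case GEN_MODE_EUI64:
	case GEN_MODE_NONE:
	case GEN_MODE_STABLE_PRIVACY:
	case GEN_MODE_RANDOM:
		break;
	default:
		/* Unknown mode value. */
		return -EINVAL;
	}

	/* Stable privacy addresses need a secret from at least one source. */
	if (mode == GEN_MODE_STABLE_PRIVACY &&
	    !dev_secret_set && !dflt_secret_set)
		return -EINVAL;

	return 0;
}
```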
2019-05-21 06:40:04 +00:00
static int inet6_validate_link_af(const struct net_device *dev,
2021-08-03 20:02:50 +08:00
				  const struct nlattr *nla,
				  struct netlink_ext_ack *extack)
2019-05-21 06:40:04 +00:00
{
	struct nlattr *tb[IFLA_INET6_MAX + 1];
	struct inet6_dev *idev = NULL;
	int err;
	if (dev) {
		idev = __in6_dev_get(dev);
		if (!idev)
			return -EAFNOSUPPORT;
	}
	err = nla_parse_nested_deprecated(tb, IFLA_INET6_MAX, nla,
2021-08-03 20:02:50 +08:00
					  inet6_af_policy, extack);
2019-05-21 06:40:04 +00:00
	if (err)
		return err;
	if (!tb[IFLA_INET6_TOKEN] && !tb[IFLA_INET6_ADDR_GEN_MODE])
		return -EINVAL;
	if (tb[IFLA_INET6_ADDR_GEN_MODE]) {
		u8 mode = nla_get_u8(tb[IFLA_INET6_ADDR_GEN_MODE]);
		if (check_addr_gen_mode(mode) < 0)
			return -EINVAL;
		if (dev && check_stable_privacy(idev, dev_net(dev), mode) < 0)
			return -EINVAL;
	}
	return 0;
}
2021-04-07 08:59:12 -07:00
static int inet6_set_link_af(struct net_device *dev, const struct nlattr *nla,
			     struct netlink_ext_ack *extack)
2013-04-08 04:01:30 +00:00
{
	struct inet6_dev *idev = __in6_dev_get(dev);
	struct nlattr *tb[IFLA_INET6_MAX + 1];
2019-05-21 06:40:04 +00:00
	int err;
net: ipv6: add tokenized interface identifier support
This patch adds support for IPv6 tokenized IIDs, that allow
for administrators to assign well-known host-part addresses
to nodes whilst still obtaining global network prefix from
Router Advertisements. It is currently in draft status.
The primary target for such support is server platforms
where addresses are usually manually configured, rather
than using DHCPv6 or SLAAC. By using tokenised identifiers,
hosts can still determine their network prefix by use of
SLAAC, but more readily be automatically renumbered should
their network prefix change. [...]
The disadvantage with static addresses is that they are
likely to require manual editing should the network prefix
in use change. If instead there were a method to only
manually configure the static identifier part of the IPv6
address, then the address could be automatically updated
when a new prefix was introduced, as described in [RFC4192]
for example. In such cases a DNS server might be
configured with such a tokenised interface identifier of
::53, and SLAAC would use the token in constructing the
interface address, using the advertised prefix. [...]
http://tools.ietf.org/html/draft-chown-6man-tokenised-ipv6-identifiers-02
The implementation is partially based on top of Mark K.
Thompson's proof of concept. However, it uses the Netlink
interface for configuration resp. data retrieval, so that
it can be easily extended in future. Successfully tested
by myself.
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-08 04:01:30 +00:00
ipv6/addrconf: fix potential NULL deref in inet6_set_link_af()
__in6_dev_get(dev) called from inet6_set_link_af() can return NULL.
The needed check has been recently removed, let's add it back.
While do_setlink() does call validate_linkmsg() :
...
err = validate_linkmsg(dev, tb); /* OK at this point */
...
It is possible that the following call happening before the
->set_link_af() removes IPv6 if MTU is less than 1280 :
if (tb[IFLA_MTU]) {
err = dev_set_mtu_ext(dev, nla_get_u32(tb[IFLA_MTU]), extack);
if (err < 0)
goto errout;
status |= DO_SETLINK_MODIFIED;
}
...
if (tb[IFLA_AF_SPEC]) {
...
err = af_ops->set_link_af(dev, af);
->inet6_set_link_af() // CRASH because idev is NULL
Please note that IPv4 is immune to the bug since inet_set_link_af() does :
struct in_device *in_dev = __in_dev_get_rcu(dev);
if (!in_dev)
return -EAFNOSUPPORT;
This problem has been mentioned in commit cf7afbfeb8ce ("rtnl: make
link af-specific updates atomic") changelog :
This method is not fail proof, while it is currently sufficient
to make set_link_af() inerrable and thus 100% atomic, the
validation function method will not be able to detect all error
scenarios in the future, there will likely always be errors
depending on states which are f.e. not protected by rtnl_mutex
and thus may change between validation and setting.
IPv6: ADDRCONF(NETDEV_CHANGE): lo: link becomes ready
general protection fault, probably for non-canonical address 0xdffffc0000000056: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000002b0-0x00000000000002b7]
CPU: 0 PID: 9698 Comm: syz-executor712 Not tainted 5.5.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:inet6_set_link_af+0x66e/0xae0 net/ipv6/addrconf.c:5733
Code: 38 d0 7f 08 84 c0 0f 85 20 03 00 00 48 8d bb b0 02 00 00 45 0f b6 64 24 04 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 1a 03 00 00 44 89 a3 b0 02 00
RSP: 0018:ffffc90005b06d40 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff86df39a6
RDX: 0000000000000056 RSI: ffffffff86df3e74 RDI: 00000000000002b0
RBP: ffffc90005b06e70 R08: ffff8880a2ac0380 R09: ffffc90005b06db0
R10: fffff52000b60dbe R11: ffffc90005b06df7 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8880a1fcc424 R15: dffffc0000000000
FS: 0000000000c46880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055f0494ca0d0 CR3: 000000009e4ac000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
do_setlink+0x2a9f/0x3720 net/core/rtnetlink.c:2754
rtnl_group_changelink net/core/rtnetlink.c:3103 [inline]
__rtnl_newlink+0xdd1/0x1790 net/core/rtnetlink.c:3257
rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3377
rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5438
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5456
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:672
____sys_sendmsg+0x753/0x880 net/socket.c:2343
___sys_sendmsg+0x100/0x170 net/socket.c:2397
__sys_sendmsg+0x105/0x1d0 net/socket.c:2430
__do_sys_sendmsg net/socket.c:2439 [inline]
__se_sys_sendmsg net/socket.c:2437 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4402e9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fffd62fbcf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004402e9
RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000008 R09: 00000000004002c8
R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000401b70
R13: 0000000000401c00 R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
---[ end trace cfa7664b8fdcdff3 ]---
RIP: 0010:inet6_set_link_af+0x66e/0xae0 net/ipv6/addrconf.c:5733
Code: 38 d0 7f 08 84 c0 0f 85 20 03 00 00 48 8d bb b0 02 00 00 45 0f b6 64 24 04 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 1a 03 00 00 44 89 a3 b0 02 00
RSP: 0018:ffffc90005b06d40 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff86df39a6
RDX: 0000000000000056 RSI: ffffffff86df3e74 RDI: 00000000000002b0
RBP: ffffc90005b06e70 R08: ffff8880a2ac0380 R09: ffffc90005b06db0
R10: fffff52000b60dbe R11: ffffc90005b06df7 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8880a1fcc424 R15: dffffc0000000000
FS: 0000000000c46880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000004 CR3: 000000009e4ac000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Fixes: 7dc2bccab0ee ("Validate required parameters in inet6_validate_link_af")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Bisected-and-reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-07 07:16:37 -08:00
if ( ! idev )
return - EAFNOSUPPORT ;
netlink: make validation more configurable for future strictness
We currently have two levels of strict validation:
1) liberal (default)
- undefined (type >= max) & NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
- garbage at end of message accepted
2) strict (opt-in)
- NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
Split out parsing strictness into four different options:
* TRAILING - check that there's no trailing data after parsing
attributes (in message or nested)
* MAXTYPE - reject attrs > max known type
* UNSPEC - reject attributes with NLA_UNSPEC policy entries
* STRICT_ATTRS - strictly validate attribute size
The default for future things should be *everything*.
The current *_strict() is a combination of TRAILING and MAXTYPE,
and is renamed to _deprecated_strict().
The current regular parsing has none of this, and is renamed to
*_parse_deprecated().
Additionally it allows us to selectively set one of the new flags
even on old policies. Notably, the UNSPEC flag could be useful in
this case, since it can be arranged (by filling in the policy) to
not be an incompatible userspace ABI change, but would then, going
forward, prevent forgetting attribute entries. The same can apply
to the POLICY flag.
We end up with the following renames:
* nla_parse -> nla_parse_deprecated
* nla_parse_strict -> nla_parse_deprecated_strict
* nlmsg_parse -> nlmsg_parse_deprecated
* nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
* nla_parse_nested -> nla_parse_nested_deprecated
* nla_validate_nested -> nla_validate_nested_deprecated
Using spatch, of course:
@@
expression TB, MAX, HEAD, LEN, POL, EXT;
@@
-nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
+nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
@@
expression TB, MAX, NLA, POL, EXT;
@@
-nla_parse_nested(TB, MAX, NLA, POL, EXT)
+nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
@@
expression START, MAX, POL, EXT;
@@
-nla_validate_nested(START, MAX, POL, EXT)
+nla_validate_nested_deprecated(START, MAX, POL, EXT)
@@
expression NLH, HDRLEN, MAX, POL, EXT;
@@
-nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
+nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
For this patch, don't actually add the strict, non-renamed versions
yet so that it breaks compile if I get it wrong.
Also, while at it, make nla_validate and nla_parse go down to a
common __nla_validate_parse() function to avoid code duplication.
Ultimately, this allows us to have very strict validation for every
new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
next patch, while existing things will continue to work as is.
In effect then, this adds fully strict validation for any new command.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26 14:07:28 +02:00
if ( nla_parse_nested_deprecated ( tb , IFLA_INET6_MAX , nla , NULL , NULL ) < 0 )
2021-06-08 09:53:15 +08:00
return - EINVAL ;
2013-04-08 04:01:30 +00:00
2014-07-11 21:10:18 +02:00
if ( tb [ IFLA_INET6_TOKEN ] ) {
2021-04-07 08:59:12 -07:00
err = inet6_set_iftoken ( idev , nla_data ( tb [ IFLA_INET6_TOKEN ] ) ,
extack ) ;
2014-07-11 21:10:18 +02:00
if ( err )
return err ;
}
if ( tb [ IFLA_INET6_ADDR_GEN_MODE ] ) {
u8 mode = nla_get_u8 ( tb [ IFLA_INET6_ADDR_GEN_MODE ] ) ;
2017-01-26 16:59:17 +13:00
idev - > cnf . addr_gen_mode = mode ;
2014-07-11 21:10:18 +02:00
}
2013-04-08 04:01:30 +00:00
2019-05-21 06:40:04 +00:00
return 0 ;
2013-04-08 04:01:30 +00:00
}
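The tokenized-identifier idea handled by the IFLA_INET6_TOKEN branch above can be sketched in userspace. This is a minimal illustration, not the kernel's code path: it assumes a 64-bit advertised prefix, and the helper name combine_prefix_token() is hypothetical. It only uses the POSIX inet_pton() and inet_ntop() calls.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not kernel code): build an address from the upper
 * 64 bits of an advertised prefix and the lower 64 bits of a token.
 * Assumes a /64 prefix. Returns 0 on success, -1 if either string fails
 * to parse or the output buffer is too small.
 */
static int combine_prefix_token(const char *prefix, const char *token,
				char *out, socklen_t outlen)
{
	struct in6_addr p, t, addr;

	assert(prefix && token && out);

	if (inet_pton(AF_INET6, prefix, &p) != 1 ||
	    inet_pton(AF_INET6, token, &t) != 1)
		return -1;

	memcpy(&addr.s6_addr[0], &p.s6_addr[0], 8);	/* network prefix */
	memcpy(&addr.s6_addr[8], &t.s6_addr[8], 8);	/* interface identifier */

	return inet_ntop(AF_INET6, &addr, out, outlen) ? 0 : -1;
}
```

With a token of ::53, calling the helper again with a different advertised prefix yields the renumbered address without any change to the configured identifier, which is the property the commit message describes.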
2007-02-09 23:24:49 +09:00
static int inet6_fill_ifinfo ( struct sk_buff * skb , struct inet6_dev * idev ,
2012-09-07 20:12:54 +00:00
u32 portid , u32 seq , int event , unsigned int flags )
2005-04-16 15:20:36 -07:00
{
2006-11-14 19:53:58 -08:00
struct net_device * dev = idev - > dev ;
struct ifinfomsg * hdr ;
struct nlmsghdr * nlh ;
void * protoinfo ;
2012-09-07 20:12:54 +00:00
nlh = nlmsg_put ( skb , portid , seq , event , sizeof ( * hdr ) , flags ) ;
2015-03-29 14:00:04 +01:00
if ( ! nlh )
2007-01-31 23:16:40 -08:00
return - EMSGSIZE ;
2006-11-14 19:53:58 -08:00
hdr = nlmsg_data ( nlh ) ;
hdr - > ifi_family = AF_INET6 ;
hdr - > __ifi_pad = 0 ;
hdr - > ifi_type = dev - > type ;
hdr - > ifi_index = dev - > ifindex ;
hdr - > ifi_flags = dev_get_flags ( dev ) ;
hdr - > ifi_change = 0 ;
2012-04-01 20:27:33 -04:00
if ( nla_put_string ( skb , IFLA_IFNAME , dev - > name ) | |
( dev - > addr_len & &
nla_put ( skb , IFLA_ADDRESS , dev - > addr_len , dev - > dev_addr ) ) | |
nla_put_u32 ( skb , IFLA_MTU , dev - > mtu ) | |
2015-04-02 17:07:00 +02:00
( dev - > ifindex ! = dev_get_iflink ( dev ) & &
2015-08-13 15:26:35 -04:00
nla_put_u32 ( skb , IFLA_LINK , dev_get_iflink ( dev ) ) ) | |
nla_put_u8 ( skb , IFLA_OPERSTATE ,
netif_running ( dev ) ? dev - > operstate : IF_OPER_DOWN ) )
2012-04-01 20:27:33 -04:00
goto nla_put_failure ;
2019-04-26 11:13:06 +02:00
protoinfo = nla_nest_start_noflag ( skb , IFLA_PROTINFO ) ;
2015-03-29 14:00:04 +01:00
if ( ! protoinfo )
2006-11-14 19:53:58 -08:00
goto nla_put_failure ;
2005-04-16 15:20:36 -07:00
2015-09-11 16:48:48 -04:00
if ( inet6_fill_ifla6_attrs ( skb , idev , 0 ) < 0 )
2007-04-20 15:56:20 -07:00
goto nla_put_failure ;
2005-04-16 15:20:36 -07:00
2006-11-14 19:53:58 -08:00
nla_nest_end ( skb , protoinfo ) ;
2015-01-16 22:09:00 +01:00
nlmsg_end ( skb , nlh ) ;
return 0 ;
2005-04-16 15:20:36 -07:00
2006-11-14 19:53:58 -08:00
nla_put_failure :
2007-01-31 23:16:40 -08:00
nlmsg_cancel ( skb , nlh ) ;
return - EMSGSIZE ;
2005-04-16 15:20:36 -07:00
}
2018-10-07 20:16:33 -07:00
static int inet6_valid_dump_ifinfo ( const struct nlmsghdr * nlh ,
struct netlink_ext_ack * extack )
{
struct ifinfomsg * ifm ;
if ( nlh - > nlmsg_len < nlmsg_msg_size ( sizeof ( * ifm ) ) ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid header for link dump request " ) ;
return - EINVAL ;
}
if ( nlmsg_attrlen ( nlh , sizeof ( * ifm ) ) ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid data after header " ) ;
return - EINVAL ;
}
ifm = nlmsg_data ( nlh ) ;
if ( ifm - > __ifi_pad | | ifm - > ifi_type | | ifm - > ifi_flags | |
ifm - > ifi_change | | ifm - > ifi_index ) {
NL_SET_ERR_MSG_MOD ( extack , " Invalid values in header for dump request " ) ;
return - EINVAL ;
}
return 0 ;
}
2005-04-16 15:20:36 -07:00
static int inet6_dump_ifinfo ( struct sk_buff * skb , struct netlink_callback * cb )
{
2008-03-26 02:26:21 +09:00
struct net * net = sock_net ( skb - > sk ) ;
2009-11-09 12:11:28 +00:00
int h , s_h ;
2009-11-11 18:53:00 -08:00
int idx = 0 , s_idx ;
2005-04-16 15:20:36 -07:00
struct net_device * dev ;
struct inet6_dev * idev ;
2009-11-09 12:11:28 +00:00
struct hlist_head * head ;
2005-04-16 15:20:36 -07:00
2018-10-07 20:16:33 -07:00
/* only requests using strict checking can pass data to
* influence the dump
*/
if ( cb - > strict_check ) {
int err = inet6_valid_dump_ifinfo ( cb - > nlh , cb - > extack ) ;
if ( err < 0 )
return err ;
}
2009-11-09 12:11:28 +00:00
s_h = cb - > args [ 0 ] ;
s_idx = cb - > args [ 1 ] ;
rcu_read_lock ( ) ;
for ( h = s_h ; h < NETDEV_HASHENTRIES ; h + + , s_idx = 0 ) {
idx = 0 ;
head = & net - > dev_index_head [ h ] ;
hlist: drop the node parameter from iterators
I'm not sure why, but the hlist for-each-entry iterators were conceived
differently from the list ones, which look like this:
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
do they not really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small number of places were using the 'node' parameter; these
were modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 17:06:00 -08:00
hlist_for_each_entry_rcu ( dev , head , index_hlist ) {
2009-11-09 12:11:28 +00:00
if ( idx < s_idx )
goto cont ;
idev = __in6_dev_get ( dev ) ;
if ( ! idev )
goto cont ;
if ( inet6_fill_ifinfo ( skb , idev ,
2012-09-07 20:12:54 +00:00
NETLINK_CB ( cb - > skb ) . portid ,
2009-11-09 12:11:28 +00:00
cb - > nlh - > nlmsg_seq ,
2015-01-16 22:09:00 +01:00
RTM_NEWLINK , NLM_F_MULTI ) < 0 )
2009-11-09 12:11:28 +00:00
goto out ;
2007-05-03 15:13:45 -07:00
cont :
2009-11-09 12:11:28 +00:00
idx + + ;
}
2005-04-16 15:20:36 -07:00
}
2009-11-09 12:11:28 +00:00
out :
rcu_read_unlock ( ) ;
cb - > args [ 1 ] = idx ;
cb - > args [ 0 ] = h ;
2005-04-16 15:20:36 -07:00
return skb - > len ;
}
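inet6_dump_ifinfo() above saves its position in cb->args[0] (hash bucket) and cb->args[1] (index inside the bucket) so that a dump can continue in the next call once the buffer fills up. The same resume pattern can be sketched in plain C. The table layout, the cursor struct and dump_some() are hypothetical stand-ins chosen for illustration, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS   4	/* hypothetical stand-in for NETDEV_HASHENTRIES */
#define BUCKET_MAX 8

struct table {
	int items[NBUCKETS][BUCKET_MAX];
	int count[NBUCKETS];
};

struct cursor {
	int h;		/* bucket to resume in, like cb->args[0] */
	int idx;	/* entries already handled there, like cb->args[1] */
};

/*
 * Copy up to 'room' items into 'out', starting at the cursor, and record
 * where to resume. Returns the number of items written; 0 means the walk
 * is complete.
 */
static size_t dump_some(const struct table *t, struct cursor *c,
			int *out, size_t room)
{
	size_t n = 0;
	int h, idx = 0, s_idx = c->idx;

	assert(t && c && out && room > 0);

	/* s_idx only applies to the first bucket visited in this call */
	for (h = c->h; h < NBUCKETS; h++, s_idx = 0) {
		for (idx = 0; idx < t->count[h]; idx++) {
			if (idx < s_idx)
				continue;	/* emitted by an earlier call */
			if (n == room)
				goto out;	/* buffer full: resume here */
			out[n++] = t->items[h][idx];
		}
	}
out:
	c->h = h;
	c->idx = idx;
	return n;
}
```

A caller keeps invoking dump_some() with the same cursor until it returns 0, much as the netlink core keeps calling the dump callback until it adds nothing more to the skb.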
void inet6_ifinfo_notify ( int event , struct inet6_dev * idev )
{
struct sk_buff * skb ;
2008-03-25 21:47:49 +09:00
struct net * net = dev_net ( idev - > dev ) ;
2006-08-15 00:35:47 -07:00
int err = - ENOBUFS ;
2007-02-09 23:24:49 +09:00
2006-11-10 14:10:15 -08:00
skb = nlmsg_new ( inet6_if_nlmsg_size ( ) , GFP_ATOMIC ) ;
2015-03-29 14:00:04 +01:00
if ( ! skb )
2006-08-15 00:35:47 -07:00
goto errout ;
err = inet6_fill_ifinfo ( skb , idev , 0 , 0 , event , 0 ) ;
2007-01-31 23:16:40 -08:00
if ( err < 0 ) {
/* -EMSGSIZE implies BUG in inet6_if_nlmsg_size() */
WARN_ON ( err = = - EMSGSIZE ) ;
kfree_skb ( skb ) ;
goto errout ;
}
2010-12-07 23:38:31 +00:00
rtnl_notify ( skb , net , 0 , RTNLGRP_IPV6_IFINFO , NULL , GFP_ATOMIC ) ;
2009-02-24 23:18:28 -08:00
return ;
2006-08-15 00:35:47 -07:00
errout :
if ( err < 0 )
2010-12-07 23:38:31 +00:00
rtnl_set_sk_err ( net , RTNLGRP_IPV6_IFINFO , err ) ;
2005-04-16 15:20:36 -07:00
}
2006-11-10 14:10:15 -08:00
static inline size_t inet6_prefix_nlmsg_size ( void )
{
return NLMSG_ALIGN ( sizeof ( struct prefixmsg ) )
+ nla_total_size ( sizeof ( struct in6_addr ) )
+ nla_total_size ( sizeof ( struct prefix_cacheinfo ) ) ;
}
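inet6_prefix_nlmsg_size() above adds an aligned fixed header to the padded size of each attribute. The arithmetic can be sketched in userspace as follows, assuming the usual netlink conventions of 4-byte alignment and a 4-byte attribute header. The names align4(), attr_total_size() and msg_size() are hypothetical and merely mirror what NLMSG_ALIGN() and nla_total_size() compute.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNTO     4u	/* netlink pads headers and attributes to 4 bytes */
#define ATTR_HDRLEN 4u	/* 2-byte length + 2-byte type, as in struct nlattr */

/* Round len up to the next multiple of ALIGNTO. */
static size_t align4(size_t len)
{
	return (len + ALIGNTO - 1) & ~(size_t)(ALIGNTO - 1);
}

/* Space one attribute occupies: header plus payload, padded. */
static size_t attr_total_size(size_t payload)
{
	return align4(ATTR_HDRLEN + payload);
}

/* Fixed header plus n attributes with the given payload sizes. */
static size_t msg_size(size_t fixed_hdr, const size_t *payloads, size_t n)
{
	size_t total = align4(fixed_hdr);
	size_t i;

	assert(payloads != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += attr_total_size(payloads[i]);
	return total;
}
```

For a 12-byte fixed header, a 16-byte address attribute and an 8-byte cache-info attribute, this gives 12 + 20 + 12 = 44 bytes. Underestimating this size is what the WARN_ON(err == -EMSGSIZE) checks in the notify functions are meant to catch.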
2006-06-17 22:48:48 -07:00
2005-04-16 15:20:36 -07:00
static int inet6_fill_prefix ( struct sk_buff * skb , struct inet6_dev * idev ,
2012-09-07 20:12:54 +00:00
struct prefix_info * pinfo , u32 portid , u32 seq ,
2006-11-14 19:54:19 -08:00
int event , unsigned int flags )
2005-04-16 15:20:36 -07:00
{
2006-11-14 19:54:19 -08:00
struct prefixmsg * pmsg ;
struct nlmsghdr * nlh ;
2005-04-16 15:20:36 -07:00
struct prefix_cacheinfo ci ;
2012-09-07 20:12:54 +00:00
nlh = nlmsg_put ( skb , portid , seq , event , sizeof ( * pmsg ) , flags ) ;
2015-03-29 14:00:04 +01:00
if ( ! nlh )
2007-01-31 23:16:40 -08:00
return - EMSGSIZE ;
2006-11-14 19:54:19 -08:00
pmsg = nlmsg_data ( nlh ) ;
2005-04-16 15:20:36 -07:00
pmsg - > prefix_family = AF_INET6 ;
2005-06-28 12:56:45 -07:00
pmsg - > prefix_pad1 = 0 ;
pmsg - > prefix_pad2 = 0 ;
2005-04-16 15:20:36 -07:00
pmsg - > prefix_ifindex = idev - > dev - > ifindex ;
pmsg - > prefix_len = pinfo - > prefix_len ;
pmsg - > prefix_type = pinfo - > type ;
2005-06-28 12:56:45 -07:00
pmsg - > prefix_pad3 = 0 ;
2023-12-06 09:36:12 -08:00
pmsg - > prefix_flags = pinfo - > flags ;
2005-04-16 15:20:36 -07:00
2012-04-01 20:27:33 -04:00
if ( nla_put ( skb , PREFIX_ADDRESS , sizeof ( pinfo - > prefix ) , & pinfo - > prefix ) )
goto nla_put_failure ;
2005-04-16 15:20:36 -07:00
ci . preferred_time = ntohl ( pinfo - > prefered ) ;
ci . valid_time = ntohl ( pinfo - > valid ) ;
2012-04-01 20:27:33 -04:00
if ( nla_put ( skb , PREFIX_CACHEINFO , sizeof ( ci ) , & ci ) )
goto nla_put_failure ;
2015-01-16 22:09:00 +01:00
nlmsg_end ( skb , nlh ) ;
return 0 ;
2005-04-16 15:20:36 -07:00
2006-11-14 19:54:19 -08:00
nla_put_failure :
2007-01-31 23:16:40 -08:00
nlmsg_cancel ( skb , nlh ) ;
return - EMSGSIZE ;
2005-04-16 15:20:36 -07:00
}
2007-02-09 23:24:49 +09:00
static void inet6_prefix_notify ( int event , struct inet6_dev * idev ,
2005-04-16 15:20:36 -07:00
struct prefix_info * pinfo )
{
struct sk_buff * skb ;
2008-03-25 21:47:49 +09:00
struct net * net = dev_net ( idev - > dev ) ;
2006-08-15 00:36:07 -07:00
int err = - ENOBUFS ;
2005-04-16 15:20:36 -07:00
2006-11-10 14:10:15 -08:00
skb = nlmsg_new ( inet6_prefix_nlmsg_size ( ) , GFP_ATOMIC ) ;
2015-03-29 14:00:04 +01:00
if ( ! skb )
2006-08-15 00:36:07 -07:00
goto errout ;
err = inet6_fill_prefix ( skb , idev , pinfo , 0 , 0 , event , 0 ) ;
2007-01-31 23:16:40 -08:00
if ( err < 0 ) {
/* -EMSGSIZE implies BUG in inet6_prefix_nlmsg_size() */
WARN_ON ( err = = - EMSGSIZE ) ;
kfree_skb ( skb ) ;
goto errout ;
}
2009-02-24 23:18:28 -08:00
rtnl_notify ( skb , net , 0 , RTNLGRP_IPV6_PREFIX , NULL , GFP_ATOMIC ) ;
return ;
2006-08-15 00:36:07 -07:00
errout :
if ( err < 0 )
2008-03-05 10:47:47 -08:00
rtnl_set_sk_err ( net , RTNLGRP_IPV6_PREFIX , err ) ;
2005-04-16 15:20:36 -07:00
}
static void __ipv6_ifa_notify ( int event , struct inet6_ifaddr * ifp )
{
2013-03-22 06:28:43 +00:00
struct net * net = dev_net ( ifp - > idev - > dev ) ;
2014-03-27 18:28:07 +01:00
if ( event )
ASSERT_RTNL ( ) ;
2005-04-16 15:20:36 -07:00
inet6_ifa_notify ( event ? : RTM_NEWADDR , ifp ) ;
switch ( event ) {
case RTM_NEWADDR :
2007-04-25 17:08:10 -07:00
/*
2019-10-04 08:03:09 -07:00
* If the address was optimistic we inserted the route at the
* start of our DAD process , so we don ' t need to do it again .
* If the device was taken down in the middle of the DAD
* cycle there is a race where we could get here without a
* host route , so nothing to insert . That will be fixed when
* the device is brought up .
2007-04-25 17:08:10 -07:00
*/
2019-10-04 08:03:09 -07:00
if ( ifp - > rt & & ! rcu_access_pointer ( ifp - > rt - > fib6_node ) ) {
2018-04-17 17:33:11 -07:00
ip6_ins_rt ( net , ifp - > rt ) ;
2019-10-04 08:03:09 -07:00
} else if ( ! ifp - > rt & & ( ifp - > idev - > dev - > flags & IFF_UP ) ) {
pr_warn ( " BUG: Address %pI6c on device %s is missing its host route. \n " ,
& ifp - > addr , ifp - > idev - > dev - > name ) ;
}
2005-04-16 15:20:36 -07:00
if ( ifp - > idev - > cnf . forwarding )
addrconf_join_anycast ( ifp ) ;
2013-05-22 05:41:06 +00:00
if ( ! ipv6_addr_any ( & ifp - > peer_addr ) )
2020-02-29 17:27:13 +08:00
addrconf_prefix_route ( & ifp - > peer_addr , 128 ,
ifp - > rt_priority , ifp - > idev - > dev ,
0 , 0 , GFP_ATOMIC ) ;
2005-04-16 15:20:36 -07:00
break ;
case RTM_DELADDR :
if ( ifp - > idev - > cnf . forwarding )
addrconf_leave_anycast ( ifp ) ;
addrconf_leave_solict ( ifp - > idev , & ifp - > addr ) ;
2013-05-22 05:41:06 +00:00
if ( ! ipv6_addr_any ( & ifp - > peer_addr ) ) {
2018-04-17 17:33:26 -07:00
struct fib6_info * rt ;
2013-05-16 22:32:00 +00:00
2014-09-03 23:59:22 +02:00
rt = addrconf_get_prefix_route ( & ifp - > peer_addr , 128 ,
2019-03-27 20:53:52 -07:00
ifp - > idev - > dev , 0 , 0 ,
false ) ;
2015-09-15 14:30:08 -07:00
if ( rt )
2020-04-27 13:56:45 -07:00
ip6_del_rt ( net , rt , false ) ;
2013-05-16 22:32:00 +00:00
}
2016-04-21 20:56:12 -07:00
if ( ifp - > rt ) {
2020-04-27 13:56:45 -07:00
ip6_del_rt ( net , ifp - > rt , false ) ;
2018-04-17 17:33:25 -07:00
ifp - > rt = NULL ;
2016-04-21 20:56:12 -07:00
}
2014-09-28 00:46:06 +02:00
rt_genid_bump_ipv6 ( net ) ;
2005-04-16 15:20:36 -07:00
break ;
}
2013-03-22 06:28:43 +00:00
atomic_inc ( & net - > ipv6 . dev_addr_genid ) ;
2005-04-16 15:20:36 -07:00
}
static void ipv6_ifa_notify ( int event , struct inet6_ifaddr * ifp )
{
if ( likely ( ifp - > idev - > dead = = 0 ) )
__ipv6_ifa_notify ( event , ifp ) ;
}
# ifdef CONFIG_SYSCTL
2020-04-24 08:43:38 +02:00
static int addrconf_sysctl_forward ( struct ctl_table * ctl , int write ,
void * buffer , size_t * lenp , loff_t * ppos )
2005-04-16 15:20:36 -07:00
{
int * valp = ctl - > data ;
int val = * valp ;
2010-02-19 13:22:59 +00:00
loff_t pos = * ppos ;
2013-06-11 23:04:25 -07:00
struct ctl_table lctl ;
2005-04-16 15:20:36 -07:00
int ret ;
2012-01-16 10:40:10 +00:00
/*
* ctl - > data points to idev - > cnf . forwarding , we should
* not modify it until we get the rtnl lock .
*/
lctl = * ctl ;
lctl . data = & val ;
ret = proc_dointvec ( & lctl , write , buffer , lenp , ppos ) ;
2005-04-16 15:20:36 -07:00
2007-12-05 01:50:24 -08:00
if ( write )
2009-02-26 06:55:31 +00:00
ret = addrconf_fixup_forwarding ( ctl , valp , val ) ;
2010-02-19 13:22:59 +00:00
if ( ret )
* ppos = pos ;
2007-02-09 23:24:49 +09:00
return ret ;
2005-04-16 15:20:36 -07:00
}
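The comment in addrconf_sysctl_forward() describes a staged write: the new value is parsed into a local copy, and the live field is only touched once the lock is held. A userspace sketch of that pattern follows. It assumes a pthread mutex in place of the rtnl lock and strtol() in place of proc_dointvec(); struct setting and setting_write() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <pthread.h>
#include <stdlib.h>

struct setting {
	pthread_mutex_t lock;	/* stands in for the rtnl lock */
	int value;		/* the live value that readers see */
};

/*
 * Parse 'text' into a local variable first; touch s->value only after
 * the parse succeeded and the lock is held. Returns 0 on success or
 * -EINVAL if the text is not a complete decimal int.
 */
static int setting_write(struct setting *s, const char *text)
{
	char *end;
	long val;

	assert(s && text);

	errno = 0;
	val = strtol(text, &end, 10);
	if (errno || end == text || *end != '\0' ||
	    val < INT_MIN || val > INT_MAX)
		return -EINVAL;	/* live value left untouched */

	pthread_mutex_lock(&s->lock);
	s->value = (int)val;
	pthread_mutex_unlock(&s->lock);
	return 0;
}
```

Because a failed parse returns before the lock is taken, a rejected write never leaves the live setting half-updated.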
2020-04-24 08:43:38 +02:00
static int addrconf_sysctl_mtu ( struct ctl_table * ctl , int write ,
void * buffer , size_t * lenp , loff_t * ppos )
2015-02-23 11:17:13 -03:00
{
struct inet6_dev * idev = ctl - > extra1 ;
int min_mtu = IPV6_MIN_MTU ;
struct ctl_table lctl ;
lctl = * ctl ;
lctl . extra1 = & min_mtu ;
lctl . extra2 = idev ? & idev - > dev - > mtu : NULL ;
return proc_dointvec_minmax ( & lctl , write , buffer , lenp , ppos ) ;
}
2009-06-01 03:07:33 -07:00
static void dev_disable_change ( struct inet6_dev * idev )
{
2013-05-29 11:30:50 +08:00
struct netdev_notifier_info info ;
2009-06-01 03:07:33 -07:00
if ( ! idev | | ! idev - > dev )
return ;
2013-05-29 11:30:50 +08:00
netdev_notifier_info_init ( & info , idev - > dev ) ;
2009-06-01 03:07:33 -07:00
if ( idev - > cnf . disable_ipv6 )
2013-05-29 11:30:50 +08:00
addrconf_notify ( NULL , NETDEV_DOWN , & info ) ;
2009-06-01 03:07:33 -07:00
else
2013-05-29 11:30:50 +08:00
addrconf_notify ( NULL , NETDEV_UP , & info ) ;
2009-06-01 03:07:33 -07:00
}
static void addrconf_disable_change ( struct net * net , __s32 newf )
{
struct net_device * dev ;
struct inet6_dev * idev ;
2017-01-19 16:26:21 +08:00
for_each_netdev ( net , dev ) {
2009-06-01 03:07:33 -07:00
idev = __in6_dev_get ( dev ) ;
if ( idev ) {
int changed = ( ! idev - > cnf . disable_ipv6 ) ^ ( ! newf ) ;
idev - > cnf . disable_ipv6 = newf ;
if ( changed )
dev_disable_change ( idev ) ;
}
}
}
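/*
 * Illustrative sketch, not part of addrconf.c: the "changed" test used above,
 * ( ! old ) ^ ( ! new ), compares two integers by truthiness only, so moving
 * from one non-zero value to another does not count as a change. The helper
 * name flag_state_changed() is hypothetical and exists only for this example.
 */
```c
#include <stdbool.h>

/* Hypothetical helper: true only when exactly one of the two values is zero. */
static bool flag_state_changed(int old_val, int new_val)
{
	/* '!' folds each value to 0 or 1 before the XOR */
	return (!old_val) ^ (!new_val);
}
```
/*
 * Under this sketch, writing 2 over an existing 1 would not trigger a
 * NETDEV_UP / NETDEV_DOWN style notification, while 0 -> 1 or 1 -> 0 would.
 */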
2012-01-16 10:40:10 +00:00
static int addrconf_disable_ipv6 ( struct ctl_table * table , int * p , int newf )
2009-06-01 03:07:33 -07:00
{
struct net * net ;
2012-01-16 10:40:10 +00:00
int old ;
if ( ! rtnl_trylock ( ) )
return restart_syscall ( ) ;
2009-06-01 03:07:33 -07:00
net = ( struct net * ) table - > extra2 ;
2012-01-16 10:40:10 +00:00
old = * p ;
* p = newf ;
2009-06-01 03:07:33 -07:00
2012-01-16 10:40:10 +00:00
if ( p = = & net - > ipv6 . devconf_dflt - > disable_ipv6 ) {
rtnl_unlock ( ) ;
2009-06-01 03:07:33 -07:00
return 0 ;
2010-02-19 13:22:59 +00:00
}
2009-06-01 03:07:33 -07:00
if ( p = = & net - > ipv6 . devconf_all - > disable_ipv6 ) {
net - > ipv6 . devconf_dflt - > disable_ipv6 = newf ;
addrconf_disable_change ( net , newf ) ;
2012-01-16 10:40:10 +00:00
} else if ( ( ! newf ) ^ ( ! old ) )
2009-06-01 03:07:33 -07:00
dev_disable_change ( ( struct inet6_dev * ) table - > extra1 ) ;
rtnl_unlock ( ) ;
return 0 ;
}
2020-04-24 08:43:38 +02:00
static int addrconf_sysctl_disable ( struct ctl_table * ctl , int write ,
void * buffer , size_t * lenp , loff_t * ppos )
2009-06-01 03:07:33 -07:00
{
int * valp = ctl - > data ;
int val = * valp ;
2010-02-19 13:22:59 +00:00
loff_t pos = * ppos ;
2013-06-11 23:04:25 -07:00
struct ctl_table lctl ;
2009-06-01 03:07:33 -07:00
int ret ;
2012-01-16 10:40:10 +00:00
/*
* ctl - > data points to idev - > cnf . disable_ipv6 , we should
* not modify it until we get the rtnl lock .
*/
lctl = * ctl ;
lctl . data = & val ;
ret = proc_dointvec ( & lctl , write , buffer , lenp , ppos ) ;
2009-06-01 03:07:33 -07:00
if ( write )
ret = addrconf_disable_ipv6 ( ctl , valp , val ) ;
2010-02-19 13:22:59 +00:00
if ( ret )
* ppos = pos ;
2009-06-01 03:07:33 -07:00
return ret ;
}
2020-04-24 08:43:38 +02:00
static int addrconf_sysctl_proxy_ndp ( struct ctl_table * ctl , int write ,
void * buffer , size_t * lenp , loff_t * ppos )
2013-12-17 22:37:14 -08:00
{
int * valp = ctl - > data ;
int ret ;
int old , new ;
old = * valp ;
ret = proc_dointvec ( ctl , write , buffer , lenp , ppos ) ;
new = * valp ;
if ( write & & old ! = new ) {
struct net * net = ctl - > extra2 ;
if ( ! rtnl_trylock ( ) )
return restart_syscall ( ) ;
if ( valp = = & net - > ipv6 . devconf_dflt - > proxy_ndp )
2017-03-28 14:28:04 -07:00
inet6_netconf_notify_devconf ( net , RTM_NEWNETCONF ,
NETCONFA_PROXY_NEIGH ,
2013-12-17 22:37:14 -08:00
NETCONFA_IFINDEX_DEFAULT ,
net - > ipv6 . devconf_dflt ) ;
else if ( valp = = & net - > ipv6 . devconf_all - > proxy_ndp )
2017-03-28 14:28:04 -07:00
inet6_netconf_notify_devconf ( net , RTM_NEWNETCONF ,
NETCONFA_PROXY_NEIGH ,
2013-12-17 22:37:14 -08:00
NETCONFA_IFINDEX_ALL ,
net - > ipv6 . devconf_all ) ;
else {
struct inet6_dev * idev = ctl - > extra1 ;
2017-03-28 14:28:04 -07:00
inet6_netconf_notify_devconf ( net , RTM_NEWNETCONF ,
NETCONFA_PROXY_NEIGH ,
2013-12-17 22:37:14 -08:00
idev - > dev - > ifindex ,
& idev - > cnf ) ;
}
rtnl_unlock ( ) ;
}
return ret ;
}
2017-01-26 16:59:17 +13:00
static int addrconf_sysctl_addr_gen_mode ( struct ctl_table * ctl , int write ,
2020-04-24 08:43:38 +02:00
void * buffer , size_t * lenp ,
2017-01-26 16:59:17 +13:00
loff_t * ppos )
{
int ret = 0 ;
net/ipv6: fix addrconf_sysctl_addr_gen_mode
addrconf_sysctl_addr_gen_mode() has multiple problems. First, it ignores
the errors returned by proc_dointvec().
addrconf_sysctl_addr_gen_mode() calls proc_dointvec() directly, which
writes the value to memory, and then checks if it's valid and may return
EINVAL. If a bad value is given, the value displayed when reading
net.ipv6.conf.foo.addr_gen_mode next time will be invalid. In case the
value provided by the user was valid, addrconf_dev_config() won't be
called since idev->cnf.addr_gen_mode has already been updated.
Fix this in the usual way we deal with values that need to be checked
after the proc_do*() helper has returned: define a local ctl_table and
storage, call proc_dointvec() on that temporary area, then check and
store.
addrconf_sysctl_addr_gen_mode() also writes the new value to the global
ipv6_devconf_dflt, when we're writing to some netns's default, so that
new netns will inherit the value that was set by the change occurring in
any netns. That doesn't make any sense, so let's drop this assignment.
Finally, since addr_gen_mode is a __u32, switch to proc_douintvec().
Fixes: d35a00b8e33d ("net/ipv6: allow sysctl to change link-local address generation mode")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 12:25:14 +02:00
u32 new_val ;
2017-01-26 16:59:17 +13:00
struct inet6_dev * idev = ( struct inet6_dev * ) ctl - > extra1 ;
struct net * net = ( struct net * ) ctl - > extra2 ;
2018-07-09 12:25:14 +02:00
struct ctl_table tmp = {
. data = & new_val ,
. maxlen = sizeof ( new_val ) ,
. mode = ctl - > mode ,
} ;
2017-01-26 16:59:17 +13:00
2017-02-27 12:41:23 +13:00
if ( ! rtnl_trylock ( ) )
return restart_syscall ( ) ;
2018-07-09 12:25:14 +02:00
new_val = * ( ( u32 * ) ctl - > data ) ;
2017-01-26 16:59:17 +13:00
2018-07-09 12:25:14 +02:00
ret = proc_douintvec ( & tmp , write , buffer , lenp , ppos ) ;
if ( ret ! = 0 )
goto out ;
2017-01-26 16:59:17 +13:00
2018-07-09 12:25:14 +02:00
if ( write ) {
2017-02-27 12:41:23 +13:00
if ( check_addr_gen_mode ( new_val ) < 0 ) {
ret = - EINVAL ;
goto out ;
}
2017-01-26 16:59:17 +13:00
2018-07-09 12:25:14 +02:00
if ( idev ) {
2017-02-27 12:41:23 +13:00
if ( check_stable_privacy ( idev , net , new_val ) < 0 ) {
ret = - EINVAL ;
goto out ;
}
2017-01-26 16:59:17 +13:00
if ( idev - > cnf . addr_gen_mode ! = new_val ) {
idev - > cnf . addr_gen_mode = new_val ;
2023-01-31 16:46:45 +13:00
addrconf_init_auto_addrs ( idev - > dev ) ;
2017-01-26 16:59:17 +13:00
}
2018-07-09 12:25:17 +02:00
} else if ( & net - > ipv6 . devconf_all - > addr_gen_mode = = ctl - > data ) {
struct net_device * dev ;
net - > ipv6 . devconf_dflt - > addr_gen_mode = new_val ;
for_each_netdev ( net , dev ) {
idev = __in6_dev_get ( dev ) ;
if ( idev & &
idev - > cnf . addr_gen_mode ! = new_val ) {
idev - > cnf . addr_gen_mode = new_val ;
2023-01-31 16:46:45 +13:00
addrconf_init_auto_addrs ( idev - > dev ) ;
2018-07-09 12:25:17 +02:00
}
}
2017-01-26 16:59:17 +13:00
}
2018-07-09 12:25:14 +02:00
* ( ( u32 * ) ctl - > data ) = new_val ;
2017-01-26 16:59:17 +13:00
}
2017-02-27 12:41:23 +13:00
out :
rtnl_unlock ( ) ;
2017-01-26 16:59:17 +13:00
return ret ;
}
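/*
 * Illustrative user-space sketch, not part of addrconf.c: the commit message
 * above describes parsing into temporary storage, validating, and only then
 * storing to the live value. The names parse_u32() and set_checked(), and the
 * max_valid bound, are assumptions made for this example; the kernel code uses
 * proc_douintvec() on a local ctl_table plus check_addr_gen_mode() instead.
 */
```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical parser: accepts a plain decimal string that fits in 32 bits. */
static int parse_u32(const char *s, unsigned int *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno || end == s || *end != '\0' || v > 0xffffffffUL)
		return -EINVAL;
	*out = (unsigned int)v;
	return 0;
}

/* Parse into a temporary, validate, and commit to *live only on success. */
static int set_checked(unsigned int *live, const char *input,
		       unsigned int max_valid)
{
	unsigned int tmp = *live;	/* start from the current value */
	int ret = parse_u32(input, &tmp);

	if (ret)
		return ret;		/* parse error: *live untouched */
	if (tmp > max_valid)
		return -EINVAL;		/* invalid value: *live untouched */
	*live = tmp;			/* commit */
	return 0;
}
```
/*
 * Because the live value is written last, a rejected write leaves the value
 * that a later read reports unchanged, which is the behaviour the fix aims for.
 */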
2015-03-23 23:36:00 +01:00
static int addrconf_sysctl_stable_secret ( struct ctl_table * ctl , int write ,
2020-04-24 08:43:38 +02:00
void * buffer , size_t * lenp ,
2015-03-23 23:36:00 +01:00
loff_t * ppos )
{
int err ;
struct in6_addr addr ;
char str [ IPV6_MAX_STRLEN ] ;
struct ctl_table lctl = * ctl ;
ipv6: generation of stable privacy addresses for link-local and autoconf
This patch implements the stable privacy address generation for
link-local and autoconf addresses as specified in RFC7217.
RID = F(Prefix, Net_Iface, Network_ID, DAD_Counter, secret_key)
is the RID (random identifier). As the hash function F we chose one
round of sha1. Prefix will be either the link-local prefix or the
router advertised one. As Net_Iface we use the MAC address of the
device. DAD_Counter and secret_key are implemented as specified.
We don't use Network_ID, as it couples the code too closely to other
subsystems. It is specified as optional in the RFC.
As Net_Iface we only use the MAC address: we simply have no stable
identifier in the kernel we could possibly use: because this code might
run very early, we cannot depend on names, as they might be changed by
user space early on during the boot process.
A new address generation mode is introduced,
IN6_ADDR_GEN_MODE_STABLE_PRIVACY. With iproute2 one can switch back to
none or eui64 address configuration mode although the stable_secret is
already set.
We refuse writes to ipv6/conf/all/stable_secret but only allow
ipv6/conf/default/stable_secret and the interface specific file to be
written to. The default stable_secret is used as the parameter for the
namespace, the interface specific can overwrite the secret, e.g. when
switching a network configuration from one system to another while
inheriting the secret.
Cc: Erik Kline <ek@google.com>
Cc: Fernando Gont <fgont@si6networks.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23 23:36:01 +01:00
struct net * net = ctl - > extra2 ;
2015-03-23 23:36:00 +01:00
struct ipv6_stable_secret * secret = ctl - > data ;
2015-03-23 23:36:01 +01:00
if ( & net - > ipv6 . devconf_all - > stable_secret = = ctl - > data )
return - EIO ;
2015-03-23 23:36:00 +01:00
lctl . maxlen = IPV6_MAX_STRLEN ;
lctl . data = str ;
if ( ! rtnl_trylock ( ) )
return restart_syscall ( ) ;
if ( ! write & & ! secret - > initialized ) {
err = - EIO ;
goto out ;
}
2015-12-21 10:55:45 -08:00
err = snprintf ( str , sizeof ( str ) , " %pI6 " , & secret - > secret ) ;
if ( err > = sizeof ( str ) ) {
err = - EIO ;
goto out ;
2015-03-23 23:36:00 +01:00
}
err = proc_dostring ( & lctl , write , buffer , lenp , ppos ) ;
if ( err | | ! write )
goto out ;
if ( in6_pton ( str , - 1 , addr . in6_u . u6_addr8 , - 1 , NULL ) ! = 1 ) {
err = - EIO ;
goto out ;
}
secret - > initialized = true ;
secret - > secret = addr ;
2015-03-23 23:36:01 +01:00
if ( & net - > ipv6 . devconf_dflt - > stable_secret = = ctl - > data ) {
struct net_device * dev ;
for_each_netdev ( net , dev ) {
struct inet6_dev * idev = __in6_dev_get ( dev ) ;
if ( idev ) {
2017-01-26 16:59:17 +13:00
idev - > cnf . addr_gen_mode =
2015-03-23 23:36:01 +01:00
IN6_ADDR_GEN_MODE_STABLE_PRIVACY ;
}
}
} else {
struct inet6_dev * idev = ctl - > extra1 ;
2017-01-26 16:59:17 +13:00
idev - > cnf . addr_gen_mode = IN6_ADDR_GEN_MODE_STABLE_PRIVACY ;
2015-03-23 23:36:01 +01:00
}
2015-03-23 23:36:00 +01:00
out :
rtnl_unlock ( ) ;
return err ;
}
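/*
 * Illustrative user-space sketch, not part of addrconf.c: the handler above
 * accepts the 128-bit stable secret in IPv6 address notation and marks it
 * initialized only after a successful parse. This sketch uses POSIX
 * inet_pton() as a stand-in for the kernel's in6_pton(); the struct and
 * function names are hypothetical.
 */
```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mirror of a secret plus its "has been set" flag. */
struct stable_secret_sketch {
	bool initialized;
	struct in6_addr secret;
};

/* Parse text as an IPv6 address and store it as the secret on success. */
static int store_secret(struct stable_secret_sketch *s, const char *text)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return -EIO;	/* reject malformed input, state unchanged */

	s->secret = addr;
	s->initialized = true;
	return 0;
}
```
/*
 * Parsing into a local first means a malformed write cannot leave a
 * half-written secret behind.
 */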
2013-12-17 22:37:14 -08:00
2015-08-13 10:39:01 -04:00
static
int addrconf_sysctl_ignore_routes_with_linkdown ( struct ctl_table * ctl ,
2020-04-24 08:43:38 +02:00
int write , void * buffer ,
2015-08-13 10:39:01 -04:00
size_t * lenp ,
loff_t * ppos )
{
int * valp = ctl - > data ;
int val = * valp ;
loff_t pos = * ppos ;
struct ctl_table lctl ;
int ret ;
/* ctl->data points to idev->cnf.ignore_routes_with_linkdown ,
* we should not modify it until we get the rtnl lock .
*/
lctl = * ctl ;
lctl . data = & val ;
ret = proc_dointvec ( & lctl , write , buffer , lenp , ppos ) ;
if ( write )
ret = addrconf_fixup_linkdown ( ctl , valp , val ) ;
if ( ret )
* ppos = pos ;
return ret ;
}
2017-02-23 16:27:18 +00:00
static
void addrconf_set_nopolicy ( struct rt6_info * rt , int action )
{
if ( rt ) {
if ( action )
rt - > dst . flags | = DST_NOPOLICY ;
else
rt - > dst . flags & = ~ DST_NOPOLICY ;
}
}
static
void addrconf_disable_policy_idev ( struct inet6_dev * idev , int val )
{
struct inet6_ifaddr * ifa ;
read_lock_bh ( & idev - > lock ) ;
list_for_each_entry ( ifa , & idev - > addr_list , if_list ) {
spin_lock ( & ifa - > lock ) ;
if ( ifa - > rt ) {
2019-06-03 20:19:52 -07:00
/* host routes only use builtin fib6_nh */
2019-05-22 20:27:59 -07:00
struct fib6_nh * nh = ifa - > rt - > fib6_nh ;
2017-02-23 16:27:18 +00:00
int cpu ;
ipv6: replace rwlock with rcu and spinlock in fib6_table
With all the preparation work before, we are now ready to replace rwlock
with rcu and spinlock in fib6_table.
That means now all fib6_node in fib6_table are protected by rcu. And
when freeing fib6_node, call_rcu() is used to wait for the rcu grace
period before releasing the memory.
When accessing fib6_node, corresponding rcu APIs need to be used.
And all previous sessions protected by the write lock will now be
protected by the spin lock per table.
All previous sessions protected by read lock will now be protected by
rcu_read_lock().
A couple of things to note here:
1. As part of the work of replacing rwlock with rcu, the linked list of
fn->leaf now has to be rcu protected as well. So both fn->leaf and
rt->dst.rt6_next are now __rcu tagged and corresponding rcu APIs are
used when manipulating them.
2. For fn->rr_ptr, first of all, it also needs to be rcu protected now
and is tagged with __rcu and rcu APIs are used in corresponding places.
Secondly, fn->rr_ptr is changed in rt6_select() which is a reader
thread. This makes the issue a bit complicated. We think a valid
solution for it is to let rt6_select() grab the tb6_lock if it decides
to change it. As it is not in the normal operation and only happens when
there is no valid neighbor cache for the route, we think the performance
impact should be low.
3. fib6_walk_continue() has to be called with tb6_lock held even in the
route dumping related functions, e.g. inet6_dump_fib(),
fib6_tables_dump() and ipv6_route_seq_ops. It is because
fib6_walk_continue() makes modifications to the walker structure, and so
are fib6_repair_tree() and fib6_del_route(). In order to do proper
syncing between them, we need to let fib6_walk_continue() hold the lock.
We may be able to do further improvement on the way we do the tree walk
to get rid of the need for holding the spin lock. But not for now.
4. When fib6_del_route() removes a route from the tree, we no longer
mark rt->dst.rt6_next to NULL to make simultaneous reader be able to
further traverse the list with rcu. However, rt->dst.rt6_next is only
valid within this same rcu period. No one should access it later.
5. All the operation of atomic_inc(rt->rt6i_ref) is changed to be
performed before we publish this route (either by linking it to fn->leaf
or insert it in the list pointed by fn->leaf) just to be safe because as
soon as we publish the route, some read thread will be able to access it.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 12:06:10 -07:00
rcu_read_lock ( ) ;
2018-04-17 17:33:20 -07:00
ifa - > rt - > dst_nopolicy = val ? true : false ;
2019-05-22 20:27:55 -07:00
if ( nh - > rt6i_pcpu ) {
2017-02-23 16:27:18 +00:00
for_each_possible_cpu ( cpu ) {
struct rt6_info * * rtp ;
2019-05-22 20:27:55 -07:00
rtp = per_cpu_ptr ( nh - > rt6i_pcpu , cpu ) ;
2017-02-23 16:27:18 +00:00
addrconf_set_nopolicy ( * rtp , val ) ;
}
}
2017-10-06 12:06:10 -07:00
rcu_read_unlock ( ) ;
2017-02-23 16:27:18 +00:00
}
spin_unlock ( & ifa - > lock ) ;
}
read_unlock_bh ( & idev - > lock ) ;
}
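Point 5 of the fib6_table commit message above (take the reference before publishing the route) can be illustrated outside the kernel. The following is only a userspace sketch using C11 atomics, not the kernel's RCU API: `demo_route`, `demo_slot`, `demo_publish` and `demo_lookup` are hypothetical names, and the release store merely stands in for the publish step that the kernel performs with its rcu APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a route entry; not the kernel's struct. */
struct demo_route {
	atomic_int ref;		/* reference count */
	int metric;
};

/* Hypothetical stand-in for fn->leaf: a slot that readers load locklessly. */
struct demo_slot {
	_Atomic(struct demo_route *) leaf;
};

/* Writer: take the reference first, then publish with release ordering. */
static void demo_publish(struct demo_slot *slot, struct demo_route *rt)
{
	assert(slot && rt);
	atomic_fetch_add_explicit(&rt->ref, 1, memory_order_relaxed);
	atomic_store_explicit(&slot->leaf, rt, memory_order_release);
}

/* Reader: acquire-load the pointer; writes made before publish are visible. */
static struct demo_route *demo_lookup(struct demo_slot *slot)
{
	return atomic_load_explicit(&slot->leaf, memory_order_acquire);
}
```

Because the increment happens before the release store, any reader that observes the pointer also observes the raised reference count. Doing the two steps in the opposite order would leave a window in which a reader sees a route whose count has not been taken yet.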
static
int addrconf_disable_policy ( struct ctl_table * ctl , int * valp , int val )
{
struct inet6_dev * idev ;
struct net * net ;
if ( ! rtnl_trylock ( ) )
return restart_syscall ( ) ;
* valp = val ;
net = ( struct net * ) ctl - > extra2 ;
if ( valp = = & net - > ipv6 . devconf_dflt - > disable_policy ) {
rtnl_unlock ( ) ;
return 0 ;
}
if ( valp = = & net - > ipv6 . devconf_all - > disable_policy ) {
struct net_device * dev ;
for_each_netdev ( net , dev ) {
idev = __in6_dev_get ( dev ) ;
if ( idev )
addrconf_disable_policy_idev ( idev , val ) ;
}
} else {
idev = ( struct inet6_dev * ) ctl - > extra1 ;
addrconf_disable_policy_idev ( idev , val ) ;
}
rtnl_unlock ( ) ;
return 0 ;
}
2020-04-24 08:43:38 +02:00
static int addrconf_sysctl_disable_policy ( struct ctl_table * ctl , int write ,
void * buffer , size_t * lenp , loff_t * ppos )
2017-02-23 16:27:18 +00:00
{
int * valp = ctl - > data ;
int val = * valp ;
loff_t pos = * ppos ;
struct ctl_table lctl ;
int ret ;
lctl = * ctl ;
lctl . data = & val ;
ret = proc_dointvec ( & lctl , write , buffer , lenp , ppos ) ;
if ( write & & ( * valp ! = val ) )
ret = addrconf_disable_policy ( ctl , valp , val ) ;
if ( ret )
* ppos = pos ;
return ret ;
}
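The handler above parses the user's input into a local copy (`val`) and only calls the commit routine when a write actually changes the value. A minimal userspace sketch of that pattern follows; `demo_write_int`, `demo_commit_fn` and `demo_commit` are hypothetical names, string parsing with `strtol` replaces `proc_dointvec`, and the file-position restore and `restart_syscall()` handling of the real code are left out.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical commit hook: returns 0 on success or a negative errno. */
typedef int (*demo_commit_fn)(int *slot, int new_val);

/*
 * Parse `text` into a local value first. The live value is only touched
 * by the commit hook, and only when the parsed value differs from it.
 */
static int demo_write_int(int *slot, const char *text, demo_commit_fn commit)
{
	char *end;
	long parsed;

	assert(slot && text && commit);
	errno = 0;
	parsed = strtol(text, &end, 10);
	if (errno || end == text || *end != '\0' ||
	    parsed < INT_MIN || parsed > INT_MAX)
		return -EINVAL;
	if ((int)parsed == *slot)
		return 0;	/* unchanged: skip the side effects */
	return commit(slot, (int)parsed);
}

/* Counts how often the (potentially expensive) commit path runs. */
static int demo_commit_calls;

static int demo_commit(int *slot, int new_val)
{
	demo_commit_calls++;
	*slot = new_val;
	return 0;
}
```

Writing the same value twice runs the commit hook once, and malformed input never reaches the live value.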
2016-10-07 01:00:49 -07:00
static int minus_one = - 1 ;
2016-09-29 00:33:43 -07:00
static const int two_five_five = 255 ;
2021-07-22 09:55:04 +02:00
static u32 ioam6_if_id_max = U16_MAX ;
2016-09-29 00:33:43 -07:00
2016-04-18 14:41:10 +03:00
static const struct ctl_table addrconf_sysctl [ ] = {
2016-04-18 14:41:17 +03:00
{
. procname = " forwarding " ,
. data = & ipv6_devconf . forwarding ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = addrconf_sysctl_forward ,
} ,
{
. procname = " hop_limit " ,
. data = & ipv6_devconf . hop_limit ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
2016-09-29 00:33:43 -07:00
. proc_handler = proc_dointvec_minmax ,
proc/sysctl: add shared variables for range check
In the sysctl code the proc_dointvec_minmax() function is often used to
validate the user supplied value between an allowed range. This
function uses the extra1 and extra2 members from struct ctl_table as
minimum and maximum allowed value.
On sysctl handler declaration, in every source file there are some
readonly variables containing just an integer whose address is assigned
to the extra1 and extra2 members, so the sysctl range is enforced.
The special values 0, 1 and INT_MAX are very often used as range
boundary, leading to duplication of variables like zero=0, one=1,
int_max=INT_MAX in different source files:
$ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
248
Add a const int array containing the most commonly used values, some
macros to refer more easily to the correct array member, and use them
instead of creating a local one for every object file.
This is the bloat-o-meter output comparing the old and new binary
compiled with the default Fedora config:
# scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
Data old new delta
sysctl_vals - 12 +12
__kstrtab_sysctl_vals - 12 +12
max 14 10 -4
int_max 16 - -16
one 68 - -68
zero 128 28 -100
Total: Before=20583249, After=20583085, chg -0.00%
[mcroce@redhat.com: tipc: remove two unused variables]
Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
[akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
[arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
[akpm@linux-foundation.org: fix fs/eventpoll.c]
Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-18 15:58:50 -07:00
. extra1 = ( void * ) SYSCTL_ONE ,
2016-09-29 00:33:43 -07:00
. extra2 = ( void * ) & two_five_five ,
2016-04-18 14:41:17 +03:00
} ,
{
. procname = " mtu " ,
. data = & ipv6_devconf . mtu6 ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = addrconf_sysctl_mtu ,
} ,
{
. procname = " accept_ra " ,
. data = & ipv6_devconf . accept_ra ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " accept_redirects " ,
. data = & ipv6_devconf . accept_redirects ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " autoconf " ,
. data = & ipv6_devconf . autoconf ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " dad_transmits " ,
. data = & ipv6_devconf . dad_transmits ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " router_solicitations " ,
. data = & ipv6_devconf . rtr_solicits ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
2016-10-07 01:00:49 -07:00
. proc_handler = proc_dointvec_minmax ,
. extra1 = & minus_one ,
2016-04-18 14:41:17 +03:00
} ,
{
. procname = " router_solicitation_interval " ,
. data = & ipv6_devconf . rtr_solicit_interval ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec_jiffies ,
} ,
2016-09-27 23:57:58 -07:00
{
. procname = " router_solicitation_max_interval " ,
. data = & ipv6_devconf . rtr_solicit_max_interval ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec_jiffies ,
} ,
2016-04-18 14:41:17 +03:00
{
. procname = " router_solicitation_delay " ,
. data = & ipv6_devconf . rtr_solicit_delay ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec_jiffies ,
} ,
{
. procname = " force_mld_version " ,
. data = & ipv6_devconf . force_mld_version ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " mldv1_unsolicited_report_interval " ,
. data =
& ipv6_devconf . mldv1_unsolicited_report_interval ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec_ms_jiffies ,
} ,
{
. procname = " mldv2_unsolicited_report_interval " ,
. data =
& ipv6_devconf . mldv2_unsolicited_report_interval ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec_ms_jiffies ,
} ,
{
. procname = " use_tempaddr " ,
. data = & ipv6_devconf . use_tempaddr ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " temp_valid_lft " ,
. data = & ipv6_devconf . temp_valid_lft ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " temp_prefered_lft " ,
. data = & ipv6_devconf . temp_prefered_lft ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " regen_max_retry " ,
. data = & ipv6_devconf . regen_max_retry ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " max_desync_factor " ,
. data = & ipv6_devconf . max_desync_factor ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " max_addresses " ,
. data = & ipv6_devconf . max_addresses ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " accept_ra_defrtr " ,
. data = & ipv6_devconf . accept_ra_defrtr ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
net: allow user to set metric on default route learned via Router Advertisement
For IPv4, the default route is learned via DHCPv4 and the user can change its
metric in /etc/network/interfaces. For IPv6, the default route can be learned
via RA, for which a fixed metric value of 1024 is currently used.
Ideally, the user should be able to configure the metric on the default route
for IPv6 just as for IPv4. This patch adds a sysctl for that purpose.
Logs:
For IPv4:
Config in etc/network/interfaces:
auto eth0
iface eth0 inet dhcp
metric 4261413864
IPv4 Kernel Route Table:
$ ip route list
default via 172.21.47.1 dev eth0 metric 4261413864
FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over DHCPv4 default route.]
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
> - selected route, * - FIB route
S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03
K 0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m
i.e. User can prefer Default Router learned via Routing Protocol in IPv4.
Similar behavior is not possible for IPv6, without this fix.
After fix [for IPv6]:
sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489705
IP monitor: [When IPv6 RA is received]
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 pref high
Kernel IPv6 routing table
$ ip -6 route list
default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 21sec hoplimit 64 pref high
FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over IPv6 RA default route.]
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
> - selected route, * - FIB route
S>* ::/0 [20/0] is directly connected, eth0, 00:00:06
K ::/0 [119/1001] via fe80::xx16:xxxx:feb3:ce8e, eth0, 6d07h43m
If the metric is changed later, the effect will be seen only when next IPv6
RA is received, because the default route must be fully controlled by RA msg.
Below metric is changed from 1996489705 to 1996489704.
$ sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489704
net.ipv6.conf.eth0.ra_defrtr_metric = 1996489704
IP monitor:
[On next IPv6 RA msg, Kernel deletes prev route and installs new route with updated metric]
Deleted default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 3sec hoplimit 64 pref high
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489704 pref high
Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210125214430.24079-1-pchaudhary@linkedin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 13:44:30 -08:00
{
. procname = " ra_defrtr_metric " ,
. data = & ipv6_devconf . ra_defrtr_metric ,
. maxlen = sizeof ( u32 ) ,
. mode = 0644 ,
. proc_handler = proc_douintvec_minmax ,
. extra1 = ( void * ) SYSCTL_ONE ,
} ,
2016-04-18 14:41:17 +03:00
{
. procname = " accept_ra_min_hop_limit " ,
. data = & ipv6_devconf . accept_ra_min_hop_limit ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
2023-07-19 07:52:13 -07:00
{
2023-07-26 16:07:01 -07:00
. procname = " accept_ra_min_lft " ,
. data = & ipv6_devconf . accept_ra_min_lft ,
2023-07-19 07:52:13 -07:00
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
2016-04-18 14:41:17 +03:00
{
. procname = " accept_ra_pinfo " ,
. data = & ipv6_devconf . accept_ra_pinfo ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
2023-09-25 14:47:11 -07:00
{
. procname = " ra_honor_pio_life " ,
. data = & ipv6_devconf . ra_honor_pio_life ,
. maxlen = sizeof ( u8 ) ,
. mode = 0644 ,
. proc_handler = proc_dou8vec_minmax ,
. extra1 = SYSCTL_ZERO ,
. extra2 = SYSCTL_ONE ,
} ,
2006-03-20 17:05:30 -08:00
# ifdef CONFIG_IPV6_ROUTER_PREF
2016-04-18 14:41:17 +03:00
{
. procname = " accept_ra_rtr_pref " ,
. data = & ipv6_devconf . accept_ra_rtr_pref ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " router_probe_interval " ,
. data = & ipv6_devconf . rtr_probe_interval ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec_jiffies ,
} ,
2007-01-30 14:30:10 -08:00
# ifdef CONFIG_IPV6_ROUTE_INFO
2017-03-22 18:19:04 +09:00
{
. procname = " accept_ra_rt_info_min_plen " ,
. data = & ipv6_devconf . accept_ra_rt_info_min_plen ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
2016-04-18 14:41:17 +03:00
{
. procname = " accept_ra_rt_info_max_plen " ,
. data = & ipv6_devconf . accept_ra_rt_info_max_plen ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
2006-03-20 17:07:03 -08:00
# endif
2006-03-20 17:05:30 -08:00
# endif
2016-04-18 14:41:17 +03:00
{
. procname = " proxy_ndp " ,
. data = & ipv6_devconf . proxy_ndp ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = addrconf_sysctl_proxy_ndp ,
} ,
{
. procname = " accept_source_route " ,
. data = & ipv6_devconf . accept_source_route ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
2007-04-25 17:08:10 -07:00
# ifdef CONFIG_IPV6_OPTIMISTIC_DAD
2016-04-18 14:41:17 +03:00
{
. procname = " optimistic_dad " ,
. data = & ipv6_devconf . optimistic_dad ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " use_optimistic " ,
. data = & ipv6_devconf . use_optimistic ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
2008-04-03 09:22:53 +09:00
# endif
# ifdef CONFIG_IPV6_MROUTE
2016-04-18 14:41:17 +03:00
{
. procname = " mc_forwarding " ,
. data = & ipv6_devconf . mc_forwarding ,
. maxlen = sizeof ( int ) ,
. mode = 0444 ,
. proc_handler = proc_dointvec ,
} ,
2007-04-25 17:08:10 -07:00
# endif
2016-04-18 14:41:17 +03:00
{
. procname = " disable_ipv6 " ,
. data = & ipv6_devconf . disable_ipv6 ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = addrconf_sysctl_disable ,
} ,
{
. procname = " accept_dad " ,
. data = & ipv6_devconf . accept_dad ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " force_tllao " ,
. data = & ipv6_devconf . force_tllao ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " ndisc_notify " ,
. data = & ipv6_devconf . ndisc_notify ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " suppress_frag_ndisc " ,
. data = & ipv6_devconf . suppress_frag_ndisc ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " accept_ra_from_local " ,
. data = & ipv6_devconf . accept_ra_from_local ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " accept_ra_mtu " ,
. data = & ipv6_devconf . accept_ra_mtu ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " stable_secret " ,
. data = & ipv6_devconf . stable_secret ,
. maxlen = IPV6_MAX_STRLEN ,
. mode = 0600 ,
. proc_handler = addrconf_sysctl_stable_secret ,
} ,
{
. procname = " use_oif_addrs_only " ,
. data = & ipv6_devconf . use_oif_addrs_only ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " ignore_routes_with_linkdown " ,
. data = & ipv6_devconf . ignore_routes_with_linkdown ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = addrconf_sysctl_ignore_routes_with_linkdown ,
} ,
{
. procname = " drop_unicast_in_l2_multicast " ,
. data = & ipv6_devconf . drop_unicast_in_l2_multicast ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " drop_unsolicited_na " ,
. data = & ipv6_devconf . drop_unsolicited_na ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
{
. procname = " keep_addr_on_down " ,
. data = & ipv6_devconf . keep_addr_on_down ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
2016-11-08 14:57:39 +01:00
{
. procname = " seg6_enabled " ,
. data = & ipv6_devconf . seg6_enabled ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
2016-11-08 14:57:42 +01:00
# ifdef CONFIG_IPV6_SEG6_HMAC
{
. procname = " seg6_require_hmac " ,
. data = & ipv6_devconf . seg6_require_hmac ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
# endif
2016-12-02 14:00:08 -08:00
{
. procname = " enhanced_dad " ,
. data = & ipv6_devconf . enhanced_dad ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
2017-01-26 16:59:17 +13:00
{
2021-05-30 19:38:11 +08:00
. procname = " addr_gen_mode " ,
. data = & ipv6_devconf . addr_gen_mode ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
2017-01-26 16:59:17 +13:00
. proc_handler = addrconf_sysctl_addr_gen_mode ,
} ,
2017-02-23 16:27:18 +00:00
{
. procname = " disable_policy " ,
. data = & ipv6_devconf . disable_policy ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = addrconf_sysctl_disable_policy ,
} ,
net: ipv6: sysctl to specify IPv6 ND traffic class
Add a per-device sysctl to specify the default traffic class to use for
kernel originated IPv6 Neighbour Discovery packets.
Currently this includes:
- Router Solicitation (ICMPv6 type 133)
ndisc_send_rs() -> ndisc_send_skb() -> ip6_nd_hdr()
- Neighbour Solicitation (ICMPv6 type 135)
ndisc_send_ns() -> ndisc_send_skb() -> ip6_nd_hdr()
- Neighbour Advertisement (ICMPv6 type 136)
ndisc_send_na() -> ndisc_send_skb() -> ip6_nd_hdr()
- Redirect (ICMPv6 type 137)
ndisc_send_redirect() -> ndisc_send_skb() -> ip6_nd_hdr()
and if the kernel ever gets around to generating RA's,
it would presumably also include:
- Router Advertisement (ICMPv6 type 134)
(radvd daemon could pick up on the kernel setting and use it)
Interface drivers may examine the Traffic Class value and translate
the DiffServ Code Point into a link-layer appropriate traffic
prioritization scheme. An example of mapping IETF DSCP values to
IEEE 802.11 User Priority values can be found here:
https://tools.ietf.org/html/draft-ietf-tsvwg-ieee-802-11
The expected primary use case is to properly prioritize ND over wifi.
Testing:
jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
0
jzem22:~# echo -1 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
-bash: echo: write error: Invalid argument
jzem22:~# echo 256 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
-bash: echo: write error: Invalid argument
jzem22:~# echo 0 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
jzem22:~# echo 255 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
255
jzem22:~# echo 34 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
34
jzem22:~# echo $[0xDC] > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
jzem22:~# tcpdump -v -i eth0 icmp6 and src host jzem22.pgc and dst host fe80::1
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
IP6 (class 0xdc, hlim 255, next-header ICMPv6 (58) payload length: 24)
jzem22.pgc > fe80::1: [icmp6 sum ok] ICMP6, neighbor advertisement,
length 24, tgt is jzem22.pgc, Flags [solicited]
(based on original change written by Erik Kline, with minor changes)
v2: fix 'suspicious rcu_dereference_check() usage'
by explicitly grabbing the rcu_read_lock.
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Erik Kline <ek@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-07 21:52:09 -08:00
{
. procname = " ndisc_tclass " ,
. data = & ipv6_devconf . ndisc_tclass ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec_minmax ,
2019-07-18 15:58:50 -07:00
. extra1 = ( void * ) SYSCTL_ZERO ,
2017-11-07 21:52:09 -08:00
. extra2 = ( void * ) & two_five_five ,
} ,
2020-03-27 18:00:20 -04:00
{
. procname = " rpl_seg_enabled " ,
. data = & ipv6_devconf . rpl_seg_enabled ,
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
. proc_handler = proc_dointvec ,
} ,
ipv6: ioam: Data plane support for Pre-allocated Trace
Implement support for processing the IOAM Pre-allocated Trace with IPv6,
see [1] and [2]. Introduce a new IPv6 Hop-by-Hop TLV option, see IANA [3].
A new per-interface sysctl is introduced. The value is a boolean to accept (=1)
or ignore (=0, by default) IPv6 IOAM options on ingress for an interface:
- net.ipv6.conf.XXX.ioam6_enabled
Two other sysctls are introduced to define IOAM IDs, represented by an integer.
They are respectively per-namespace and per-interface:
- net.ipv6.ioam6_id
- net.ipv6.conf.XXX.ioam6_id
The value of the first one represents the IOAM ID of the node itself (u32; max
and default value = U32_MAX>>8, due to hop limit concatenation) while the other
represents the IOAM ID of an interface (u16; max and default value = U16_MAX).
Each "ioam6_id" sysctl has a "_wide" equivalent:
- net.ipv6.ioam6_id_wide
- net.ipv6.conf.XXX.ioam6_id_wide
The value of the first one represents the wide IOAM ID of the node itself (u64;
max and default value = U64_MAX>>8, due to hop limit concatenation) while the
other represents the wide IOAM ID of an interface (u32; max and default value
= U32_MAX).
The use of short and wide equivalents is not exclusive, a deployment could
choose to leverage both. For example, net.ipv6.conf.XXX.ioam6_id (short format)
could be an identifier for a physical interface, whereas
net.ipv6.conf.XXX.ioam6_id_wide (wide format) could be an identifier for a
logical sub-interface. Documentation about new sysctls is provided at the end
of this patchset.
Two relativistic hash tables are used: one for IOAM namespaces, the other for
IOAM schemas. A namespace can only have a single active schema and a schema
can only be attached to a single namespace (1:1 relationship).
[1] https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options
[2] https://tools.ietf.org/html/draft-ietf-ippm-ioam-data
[3] https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml#ipv6-parameters-2
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 21:42:57 +02:00
{
. procname = " ioam6_enabled " ,
. data = & ipv6_devconf . ioam6_enabled ,
. maxlen = sizeof ( u8 ) ,
. mode = 0644 ,
. proc_handler = proc_dou8vec_minmax ,
. extra1 = ( void * ) SYSCTL_ZERO ,
. extra2 = ( void * ) SYSCTL_ONE ,
} ,
{
. procname = " ioam6_id " ,
. data = & ipv6_devconf . ioam6_id ,
. maxlen = sizeof ( u32 ) ,
. mode = 0644 ,
. proc_handler = proc_douintvec_minmax ,
. extra1 = ( void * ) SYSCTL_ZERO ,
. extra2 = ( void * ) & ioam6_if_id_max ,
} ,
{
. procname = " ioam6_id_wide " ,
. data = & ipv6_devconf . ioam6_id_wide ,
. maxlen = sizeof ( u32 ) ,
. mode = 0644 ,
. proc_handler = proc_douintvec ,
} ,
2021-11-01 10:36:29 -07:00
{
. procname = " ndisc_evict_nocarrier " ,
. data = & ipv6_devconf . ndisc_evict_nocarrier ,
. maxlen = sizeof ( u8 ) ,
. mode = 0644 ,
. proc_handler = proc_dou8vec_minmax ,
. extra1 = ( void * ) SYSCTL_ZERO ,
. extra2 = ( void * ) SYSCTL_ONE ,
} ,
2022-04-15 08:34:02 +00:00
{
2022-05-30 10:14:14 +00:00
. procname = " accept_untracked_na " ,
. data = & ipv6_devconf . accept_untracked_na ,
2022-04-15 08:34:02 +00:00
. maxlen = sizeof ( int ) ,
. mode = 0644 ,
2022-07-20 14:36:32 -04:00
. proc_handler = proc_dointvec_minmax ,
. extra1 = SYSCTL_ZERO ,
. extra2 = SYSCTL_TWO ,
2022-04-15 08:34:02 +00:00
} ,
2016-04-18 14:41:17 +03:00
{
/* sentinel */
}
2005-04-16 15:20:36 -07:00
} ;
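Several entries in the table above (for example hop_limit with 1..255, ndisc_tclass with 0..255, and router_solicitations with only a lower bound of -1) rely on extra1 and extra2 as range limits. The sketch below models that check in userspace, assuming a missing bound means "unbounded" and that an out-of-range write fails with -EINVAL, as the ndisc_tclass test log suggests. `demo_ctl` and `demo_store_minmax` are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down table entry; only the fields used here. */
struct demo_ctl {
	int *data;
	const int *min;	/* plays the role of extra1; NULL: no lower bound */
	const int *max;	/* plays the role of extra2; NULL: no upper bound */
};

/* Store `val` only if it lies within the configured bounds. */
static int demo_store_minmax(const struct demo_ctl *ctl, int val)
{
	assert(ctl && ctl->data);
	if (ctl->min && val < *ctl->min)
		return -EINVAL;
	if (ctl->max && val > *ctl->max)
		return -EINVAL;
	*ctl->data = val;
	return 0;
}

/* Shared bounds, in the spirit of SYSCTL_ONE and two_five_five. */
static const int demo_one = 1;
static const int demo_two_five_five = 255;
```

Sharing the bound variables between entries is what the "shared variables for range check" change is about: each entry stores only a pointer, so one constant can serve many tables.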
2008-01-10 17:42:13 -08:00
static int __addrconf_sysctl_register ( struct net * net , char * dev_name ,
2009-11-05 13:32:03 -08:00
struct inet6_dev * idev , struct ipv6_devconf * p )
2005-04-16 15:20:36 -07:00
{
2016-08-30 10:09:22 +02:00
int i , ifindex ;
2016-04-18 14:41:10 +03:00
struct ctl_table * table ;
2012-04-19 13:41:24 +00:00
char path [ sizeof ( " net/ipv6/conf/ " ) + IFNAMSIZ ] ;
2007-12-02 00:59:38 +11:00
2022-05-02 15:15:51 +03:00
table = kmemdup ( addrconf_sysctl , sizeof ( addrconf_sysctl ) , GFP_KERNEL_ACCOUNT ) ;
2016-04-18 14:41:10 +03:00
if ( ! table )
2007-12-02 00:21:52 +11:00
goto out ;
2016-04-18 14:41:10 +03:00
for ( i = 0 ; table [ i ] . data ; i + + ) {
table [ i ] . data + = ( char * ) p - ( char * ) & ipv6_devconf ;
2016-09-29 00:33:43 -07:00
/* If one of these is already set, then it is not safe to
* overwrite either of them : this makes proc_dointvec_minmax
* usable .
*/
if ( ! table [ i ] . extra1 & & ! table [ i ] . extra2 ) {
table [ i ] . extra1 = idev ; /* embedded; no ref */
table [ i ] . extra2 = net ;
}
2005-04-16 15:20:36 -07:00
}
2012-04-19 13:41:24 +00:00
snprintf ( path , sizeof ( path ) , " net/ipv6/conf/%s " , dev_name ) ;
2005-04-16 15:20:36 -07:00
2023-08-09 12:50:03 +02:00
p - > sysctl_header = register_net_sysctl_sz ( net , path , table ,
ARRAY_SIZE ( addrconf_sysctl ) ) ;
2016-04-18 14:41:10 +03:00
if ( ! p - > sysctl_header )
2012-04-19 13:41:24 +00:00
goto free ;
2007-12-02 00:21:52 +11:00
2016-08-30 10:09:22 +02:00
if ( ! strcmp ( dev_name , " all " ) )
ifindex = NETCONFA_IFINDEX_ALL ;
else if ( ! strcmp ( dev_name , " default " ) )
ifindex = NETCONFA_IFINDEX_DEFAULT ;
else
ifindex = idev - > dev - > ifindex ;
2017-03-28 14:28:04 -07:00
inet6_netconf_notify_devconf ( net , RTM_NEWNETCONF , NETCONFA_ALL ,
ifindex , p ) ;
2008-01-10 17:41:45 -08:00
return 0 ;
2005-04-16 15:20:36 -07:00
2007-12-02 00:21:52 +11:00
free :
2016-04-18 14:41:10 +03:00
kfree ( table ) ;
2007-12-02 00:21:52 +11:00
out :
2008-01-10 17:41:45 -08:00
return - ENOBUFS ;
2005-04-16 15:20:36 -07:00
}
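__addrconf_sysctl_register() above duplicates the template table and shifts every data pointer from the global ipv6_devconf onto the per-device copy `p`. The following userspace sketch shows the same idea with hypothetical names (`demo_devconf`, `demo_entry`, `demo_clone_table`). Unlike the kernel line, it first computes each field's offset inside the template and then adds it to `p`, which keeps the pointer arithmetic within a single object.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-device settings, loosely modelled on ipv6_devconf. */
struct demo_devconf {
	int forwarding;
	int hop_limit;
	int mtu6;
};

/* Global template that the table entries point into. */
static struct demo_devconf demo_template = { 0, 64, 1280 };

struct demo_entry {
	const char *name;
	void *data;
};

static const struct demo_entry demo_table[] = {
	{ "forwarding", &demo_template.forwarding },
	{ "hop_limit",  &demo_template.hop_limit },
	{ "mtu",        &demo_template.mtu6 },
	{ NULL, NULL },	/* sentinel */
};

/* Clone the table and retarget every data pointer at the fields of *p. */
static struct demo_entry *demo_clone_table(struct demo_devconf *p)
{
	struct demo_entry *t;
	size_t i;

	assert(p);
	t = malloc(sizeof(demo_table));
	if (!t)
		return NULL;
	memcpy(t, demo_table, sizeof(demo_table));
	for (i = 0; t[i].data; i++) {
		/* Offset of the field inside the template struct. */
		size_t off = (size_t)((char *)t[i].data - (char *)&demo_template);

		t[i].data = (char *)p + off;
	}
	return t;
}
```

After cloning, a write through the copied table changes the per-device struct and leaves the template untouched. The caller owns the returned table and frees it, much as the unregister path does with kfree().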
2017-03-28 14:28:05 -07:00
static void __addrconf_sysctl_unregister ( struct net * net ,
struct ipv6_devconf * p , int ifindex )
2008-01-10 17:41:21 -08:00
{
2016-04-18 14:41:10 +03:00
struct ctl_table * table ;
2008-01-10 17:41:21 -08:00
2016-04-18 14:41:10 +03:00
if ( ! p - > sysctl_header )
2008-01-10 17:41:21 -08:00
return ;
2016-04-18 14:41:10 +03:00
table = p - > sysctl_header - > ctl_table_arg ;
unregister_net_sysctl_table ( p - > sysctl_header ) ;
p - > sysctl_header = NULL ;
kfree ( table ) ;
2017-03-28 14:28:05 -07:00
inet6_netconf_notify_devconf ( net , RTM_DELNETCONF , 0 , ifindex , NULL ) ;
2008-01-10 17:41:21 -08:00
}
2014-07-25 15:25:09 -07:00
static int addrconf_sysctl_register ( struct inet6_dev * idev )
2007-12-02 00:58:37 +11:00
{
2014-07-25 15:25:09 -07:00
int err ;
if ( ! sysctl_dev_name_is_allowed ( idev - > dev - > name ) )
return - EINVAL ;
err = neigh_sysctl_register ( idev - > dev , idev - > nd_parms ,
& ndisc_ifinfo_sysctl_change ) ;
if ( err )
return err ;
err = __addrconf_sysctl_register ( dev_net ( idev - > dev ) , idev - > dev - > name ,
idev , & idev - > cnf ) ;
if ( err )
neigh_sysctl_unregister ( idev - > nd_parms ) ;
return err ;
2007-12-02 00:58:37 +11:00
}
2008-01-10 17:41:21 -08:00
static void addrconf_sysctl_unregister ( struct inet6_dev * idev )
2005-04-16 15:20:36 -07:00
{
2017-03-28 14:28:05 -07:00
__addrconf_sysctl_unregister ( dev_net ( idev - > dev ) , & idev - > cnf ,
idev - > dev - > ifindex ) ;
2008-01-10 17:41:21 -08:00
neigh_sysctl_unregister ( idev - > nd_parms ) ;
2005-04-16 15:20:36 -07:00
}
# endif
2010-01-17 03:35:32 +00:00
static int __net_init addrconf_init_net ( struct net * net )
2008-01-10 17:42:55 -08:00
{
2013-03-26 01:52:45 +08:00
int err = - ENOMEM ;
2008-01-10 17:42:55 -08:00
struct ipv6_devconf * all , * dflt ;
2022-02-07 20:50:28 -08:00
spin_lock_init ( & net - > ipv6 . addrconf_hash_lock ) ;
2022-02-07 20:50:29 -08:00
INIT_DEFERRABLE_WORK ( & net - > ipv6 . addr_chk_work , addrconf_verify_work ) ;
2022-02-07 20:50:28 -08:00
net - > ipv6 . inet6_addr_lst = kcalloc ( IN6_ADDR_HSIZE ,
sizeof ( struct hlist_head ) ,
GFP_KERNEL ) ;
if ( ! net - > ipv6 . inet6_addr_lst )
goto err_alloc_addr ;
2013-03-26 01:52:45 +08:00
all = kmemdup ( & ipv6_devconf , sizeof ( ipv6_devconf ) , GFP_KERNEL ) ;
2015-03-29 14:00:04 +01:00
if ( ! all )
2013-03-26 01:52:45 +08:00
goto err_alloc_all ;
2008-01-10 17:42:55 -08:00
2013-03-26 01:52:45 +08:00
dflt = kmemdup ( & ipv6_devconf_dflt , sizeof ( ipv6_devconf_dflt ) , GFP_KERNEL ) ;
2015-03-29 14:00:04 +01:00
if ( ! dflt )
2013-03-26 01:52:45 +08:00
goto err_alloc_dflt ;
2008-01-10 17:42:55 -08:00
2022-08-23 10:46:57 -07:00
if ( ! net_eq ( net , & init_net ) ) {
switch ( net_inherit_devconf ( ) ) {
2020-05-13 15:58:43 +02:00
case 1 : /* copy from init_net */
memcpy ( all , init_net . ipv6 . devconf_all ,
sizeof ( ipv6_devconf ) ) ;
memcpy ( dflt , init_net . ipv6 . devconf_dflt ,
sizeof ( ipv6_devconf_dflt ) ) ;
break ;
case 3 : /* copy from the current netns */
memcpy ( all , current - > nsproxy - > net_ns - > ipv6 . devconf_all ,
sizeof ( ipv6_devconf ) ) ;
memcpy ( dflt ,
current - > nsproxy - > net_ns - > ipv6 . devconf_dflt ,
sizeof ( ipv6_devconf_dflt ) ) ;
break ;
case 0 :
case 2 :
/* use compiled values */
break ;
}
2019-01-17 23:27:11 -08:00
}
2013-03-26 01:52:45 +08:00
/* these will be inherited by all namespaces */
dflt - > autoconf = ipv6_defaults . autoconf ;
dflt - > disable_ipv6 = ipv6_defaults . disable_ipv6 ;
2008-01-10 17:42:55 -08:00
2015-03-23 23:36:00 +01:00
dflt - > stable_secret . initialized = false ;
all - > stable_secret . initialized = false ;
2008-01-10 17:42:55 -08:00
net - > ipv6 . devconf_all = all ;
net - > ipv6 . devconf_dflt = dflt ;
# ifdef CONFIG_SYSCTL
2009-11-05 13:32:03 -08:00
err = __addrconf_sysctl_register ( net , " all " , NULL , all ) ;
2008-01-10 17:42:55 -08:00
if ( err < 0 )
goto err_reg_all ;
2009-11-05 13:32:03 -08:00
err = __addrconf_sysctl_register ( net , " default " , NULL , dflt ) ;
2008-01-10 17:42:55 -08:00
if ( err < 0 )
goto err_reg_dflt ;
# endif
return 0 ;
# ifdef CONFIG_SYSCTL
err_reg_dflt :
2017-03-28 14:28:05 -07:00
__addrconf_sysctl_unregister ( net , all , NETCONFA_IFINDEX_ALL ) ;
2008-01-10 17:42:55 -08:00
err_reg_all :
kfree ( dflt ) ;
2022-10-17 16:03:31 +08:00
net - > ipv6 . devconf_dflt = NULL ;
2008-01-10 17:42:55 -08:00
# endif
err_alloc_dflt :
kfree ( all ) ;
2022-10-17 16:03:31 +08:00
net - > ipv6 . devconf_all = NULL ;
2008-01-10 17:42:55 -08:00
err_alloc_all :
2022-02-07 20:50:28 -08:00
kfree ( net - > ipv6 . inet6_addr_lst ) ;
err_alloc_addr :
2008-01-10 17:42:55 -08:00
return err ;
}
2010-01-17 03:35:32 +00:00
static void __net_exit addrconf_exit_net ( struct net * net )
2008-01-10 17:42:55 -08:00
{
2022-02-07 20:50:28 -08:00
int i ;
2008-01-10 17:42:55 -08:00
# ifdef CONFIG_SYSCTL
2017-03-28 14:28:05 -07:00
__addrconf_sysctl_unregister ( net , net - > ipv6 . devconf_dflt ,
NETCONFA_IFINDEX_DEFAULT ) ;
__addrconf_sysctl_unregister ( net , net - > ipv6 . devconf_all ,
NETCONFA_IFINDEX_ALL ) ;
2008-01-10 17:42:55 -08:00
# endif
2014-11-26 10:25:58 +08:00
kfree ( net - > ipv6 . devconf_dflt ) ;
2022-02-06 07:56:00 -08:00
net - > ipv6 . devconf_dflt = NULL ;
2014-11-26 10:25:58 +08:00
kfree ( net - > ipv6 . devconf_all ) ;
2022-02-06 07:56:00 -08:00
net - > ipv6 . devconf_all = NULL ;
2022-02-07 20:50:28 -08:00
2022-02-16 10:20:37 -08:00
cancel_delayed_work_sync ( & net - > ipv6 . addr_chk_work ) ;
2022-02-07 20:50:28 -08:00
/*
* Check hash table , then free it .
*/
for ( i = 0 ; i < IN6_ADDR_HSIZE ; i + + )
WARN_ON_ONCE ( ! hlist_empty ( & net - > ipv6 . inet6_addr_lst [ i ] ) ) ;
kfree ( net - > ipv6 . inet6_addr_lst ) ;
net - > ipv6 . inet6_addr_lst = NULL ;
2008-01-10 17:42:55 -08:00
}
static struct pernet_operations addrconf_ops = {
. init = addrconf_init_net ,
. exit = addrconf_exit_net ,
} ;
2015-01-29 12:15:03 +01:00
static struct rtnl_af_ops inet6_ops __read_mostly = {
2010-11-16 04:33:57 +00:00
. family = AF_INET6 ,
. fill_link_af = inet6_fill_link_af ,
. get_link_af_size = inet6_get_link_af_size ,
2015-02-05 14:39:11 +01:00
. validate_link_af = inet6_validate_link_af ,
net: ipv6: add tokenized interface identifier support
This patch adds support for IPv6 tokenized IIDs, that allow
for administrators to assign well-known host-part addresses
to nodes whilst still obtaining global network prefix from
Router Advertisements. It is currently in draft status.
The primary target for such support is server platforms
where addresses are usually manually configured, rather
than using DHCPv6 or SLAAC. By using tokenised identifiers,
hosts can still determine their network prefix by use of
SLAAC, but more readily be automatically renumbered should
their network prefix change. [...]
The disadvantage with static addresses is that they are
likely to require manual editing should the network prefix
in use change. If instead there were a method to only
manually configure the static identifier part of the IPv6
address, then the address could be automatically updated
when a new prefix was introduced, as described in [RFC4192]
for example. In such cases a DNS server might be
configured with such a tokenised interface identifier of
::53, and SLAAC would use the token in constructing the
interface address, using the advertised prefix. [...]
http://tools.ietf.org/html/draft-chown-6man-tokenised-ipv6-identifiers-02
The implementation is partially based on top of Mark K.
Thompson's proof of concept. However, it uses the Netlink
interface for configuration resp. data retrieval, so that
it can be easily extended in future. Successfully tested
by myself.
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-08 04:01:30 +00:00
. set_link_af = inet6_set_link_af ,
2010-11-16 04:33:57 +00:00
} ;
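The tokenized interface identifier idea described in the commit message above (a fixed host part such as ::53 combined with whatever prefix is advertised) can be illustrated in user space. The following is only a minimal sketch, not the kernel's implementation: it assumes a /64 prefix, and the helper names combine_prefix_token() and tokenized_addr_str() are hypothetical, introduced here purely for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not a kernel function): build an address from the
 * upper 64 bits of an advertised prefix and the lower 64 bits of a token.
 * Assumes a /64 prefix length.
 */
static void combine_prefix_token(const struct in6_addr *prefix,
				 const struct in6_addr *token,
				 struct in6_addr *out)
{
	memcpy(&out->s6_addr[0], &prefix->s6_addr[0], 8);	/* network part */
	memcpy(&out->s6_addr[8], &token->s6_addr[8], 8);	/* token (host part) */
}

/*
 * Hypothetical convenience wrapper: parse the textual prefix and token,
 * combine them, and write the textual result into buf.
 * Returns 0 on success, -1 on a parse or formatting error.
 */
static int tokenized_addr_str(const char *prefix_str, const char *token_str,
			      char *buf, socklen_t len)
{
	struct in6_addr prefix, token, addr;

	if (inet_pton(AF_INET6, prefix_str, &prefix) != 1 ||
	    inet_pton(AF_INET6, token_str, &token) != 1)
		return -1;

	combine_prefix_token(&prefix, &token, &addr);

	return inet_ntop(AF_INET6, &addr, buf, len) ? 0 : -1;
}
```

Calling tokenized_addr_str() with the documentation prefix "2001:db8:1:2::" and the token "::53" yields an address whose host part stays ::53; if the advertised prefix changes, only the first argument changes and the token is reused as-is.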
2005-04-16 15:20:36 -07:00
/*
* Init / cleanup code
*/
int __init addrconf_init ( void )
{
2014-07-25 15:25:09 -07:00
struct inet6_dev * idev ;
2022-02-07 20:50:30 -08:00
int err ;
2007-11-14 15:56:23 +09:00
2010-03-20 16:09:01 -07:00
err = ipv6_addr_label_init ( ) ;
if ( err < 0 ) {
2012-05-15 14:11:53 +00:00
pr_crit ( " %s: cannot initialize default policy table: %d \n " ,
__func__ , err ) ;
2010-09-24 09:55:52 +00:00
goto out ;
2007-11-14 15:56:23 +09:00
}
2005-04-16 15:20:36 -07:00
2010-09-24 09:55:52 +00:00
err = register_pernet_subsys ( & addrconf_ops ) ;
if ( err < 0 )
goto out_addrlabel ;
2008-01-10 17:42:55 -08:00
2014-03-27 18:28:07 +01:00
addrconf_wq = create_workqueue ( " ipv6_addrconf " ) ;
if ( ! addrconf_wq ) {
err = - ENOMEM ;
goto out_nowq ;
}
2005-04-16 15:20:36 -07:00
rtnl_lock ( ) ;
2022-02-10 13:42:29 -08:00
idev = ipv6_add_dev ( blackhole_netdev ) ;
2005-04-16 15:20:36 -07:00
rtnl_unlock ( ) ;
2014-07-25 15:25:09 -07:00
if ( IS_ERR ( idev ) ) {
err = PTR_ERR ( idev ) ;
2008-01-10 17:42:55 -08:00
goto errlo ;
2014-07-25 15:25:09 -07:00
}
2005-04-16 15:20:36 -07:00
2017-05-03 22:07:31 -07:00
ip6_route_init_special_entries ( ) ;
2005-04-16 15:20:36 -07:00
register_netdevice_notifier ( & ipv6_dev_notf ) ;
2022-02-07 20:50:29 -08:00
addrconf_verify ( & init_net ) ;
2007-03-22 11:58:32 -07:00
2013-12-30 10:41:32 -08:00
rtnl_af_register ( & inet6_ops ) ;
2010-11-16 04:33:57 +00:00
2017-12-02 21:44:08 +01:00
err = rtnl_register_module ( THIS_MODULE , PF_INET6 , RTM_GETLINK ,
NULL , inet6_dump_ifinfo , 0 ) ;
2007-03-22 11:58:32 -07:00
if ( err < 0 )
goto errout ;
2017-12-02 21:44:08 +01:00
err = rtnl_register_module ( THIS_MODULE , PF_INET6 , RTM_NEWADDR ,
inet6_rtm_newaddr , NULL , 0 ) ;
if ( err < 0 )
goto errout ;
err = rtnl_register_module ( THIS_MODULE , PF_INET6 , RTM_DELADDR ,
inet6_rtm_deladdr , NULL , 0 ) ;
if ( err < 0 )
goto errout ;
err = rtnl_register_module ( THIS_MODULE , PF_INET6 , RTM_GETADDR ,
inet6_rtm_getaddr , inet6_dump_ifaddr ,
RTNL_FLAG_DOIT_UNLOCKED ) ;
if ( err < 0 )
goto errout ;
err = rtnl_register_module ( THIS_MODULE , PF_INET6 , RTM_GETMULTICAST ,
NULL , inet6_dump_ifmcaddr , 0 ) ;
if ( err < 0 )
goto errout ;
err = rtnl_register_module ( THIS_MODULE , PF_INET6 , RTM_GETANYCAST ,
NULL , inet6_dump_ifacaddr , 0 ) ;
if ( err < 0 )
goto errout ;
err = rtnl_register_module ( THIS_MODULE , PF_INET6 , RTM_GETNETCONF ,
inet6_netconf_get_devconf ,
inet6_netconf_dump_devconf ,
RTNL_FLAG_DOIT_UNLOCKED ) ;
if ( err < 0 )
goto errout ;
2017-12-04 19:19:18 +01:00
err = ipv6_addr_label_rtnl_register ( ) ;
if ( err < 0 )
goto errout ;
2007-11-14 15:56:23 +09:00
2005-04-16 15:20:36 -07:00
return 0 ;
2007-03-22 11:58:32 -07:00
errout :
2017-12-02 21:44:08 +01:00
rtnl_unregister_all ( PF_INET6 ) ;
2010-11-16 04:33:57 +00:00
rtnl_af_unregister ( & inet6_ops ) ;
2007-03-22 11:58:32 -07:00
unregister_netdevice_notifier ( & ipv6_dev_notf ) ;
2008-01-10 17:42:55 -08:00
errlo :
2014-03-27 18:28:07 +01:00
destroy_workqueue ( addrconf_wq ) ;
out_nowq :
2008-01-10 17:42:55 -08:00
unregister_pernet_subsys ( & addrconf_ops ) ;
2010-09-24 09:55:52 +00:00
out_addrlabel :
ipv6_addr_label_cleanup ( ) ;
out :
2007-03-22 11:58:32 -07:00
return err ;
2005-04-16 15:20:36 -07:00
}
2007-12-13 05:34:58 -08:00
void addrconf_cleanup ( void )
2005-04-16 15:20:36 -07:00
{
2009-03-03 01:06:45 -08:00
struct net_device * dev ;
2005-04-16 15:20:36 -07:00
unregister_netdevice_notifier ( & ipv6_dev_notf ) ;
2008-01-10 17:42:55 -08:00
unregister_pernet_subsys ( & addrconf_ops ) ;
2010-09-24 09:55:52 +00:00
ipv6_addr_label_cleanup ( ) ;
2005-04-16 15:20:36 -07:00
2017-10-04 15:58:49 +02:00
rtnl_af_unregister ( & inet6_ops ) ;
2005-04-16 15:20:36 -07:00
2017-10-04 15:58:49 +02:00
rtnl_lock ( ) ;
2010-11-16 04:33:57 +00:00
2009-03-03 01:06:45 -08:00
/* clean dev list */
for_each_netdev ( & init_net , dev ) {
if ( __in6_dev_get ( dev ) = = NULL )
continue ;
2020-07-31 15:32:07 +02:00
addrconf_ifdown ( dev , true ) ;
2009-03-03 01:06:45 -08:00
}
2020-07-31 15:32:07 +02:00
addrconf_ifdown ( init_net . loopback_dev , true ) ;
2009-03-03 01:06:45 -08:00
2005-04-16 15:20:36 -07:00
rtnl_unlock ( ) ;
2014-03-27 18:28:07 +01:00
destroy_workqueue ( addrconf_wq ) ;
2005-04-16 15:20:36 -07:00
}