License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX license identifier should be applied
to a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                              # files
-----------------------------------------------------|------
GPL-2.0                                                11139
and resulted in the first patch in this series.
If the file was under a */uapi/* path, it was tagged "GPL-2.0 WITH
Linux-syscall-note"; otherwise it was tagged "GPL-2.0". The results were:
SPDX license identifier                              # files
-----------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                          930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                              # files
-----------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                          270
GPL-2.0+ WITH Linux-syscall-note                         169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)       21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)       17
LGPL-2.1+ WITH Linux-syscall-note                         15
GPL-1.0+ WITH Linux-syscall-note                          14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)       5
LGPL-2.0+ WITH Linux-syscall-note                          4
LGPL-2.1 WITH Linux-syscall-note                           3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)                1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
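
As a rough illustration of the header versus .c distinction mentioned above, here is a minimal sketch. It is not the script Thomas and Greg used. The helper name `spdx_line` is hypothetical, and the `//` style for .c files is assumed from the kernel's license-rules convention; only the block-comment form for headers is visible in this file.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the script described above): write the SPDX
 * line for `path` into `buf`, picking the comment style from the suffix.
 * Headers get a block comment, .c files a line comment (assumed convention).
 * Returns 0 on success, -1 for an unknown suffix or a buffer that is too small.
 */
static int spdx_line(const char *path, const char *license,
                     char *buf, size_t len)
{
    const char *dot = strrchr(path, '.');
    int n;

    if (!dot)
        return -1;
    if (strcmp(dot, ".h") == 0)
        n = snprintf(buf, len, "/* SPDX-License-Identifier: %s */", license);
    else if (strcmp(dot, ".c") == 0)
        n = snprintf(buf, len, "// SPDX-License-Identifier: %s", license);
    else
        return -1;
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A real tool would also have to handle Makefiles, assembly and scripts, each with its own comment syntax; the sketch rejects those instead of guessing.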
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 15:07:57 +01:00
/* SPDX-License-Identifier: GPL-2.0 */
2007-09-12 11:50:50 +02:00
/*
* Operations on the network namespace
*/
# ifndef __NET_NET_NAMESPACE_H
# define __NET_NET_NAMESPACE_H
2011-07-26 16:09:06 -07:00
# include <linux/atomic.h>
2017-06-30 13:08:08 +03:00
# include <linux/refcount.h>
2007-09-12 11:50:50 +02:00
# include <linux/workqueue.h>
# include <linux/list.h>
2011-05-26 16:40:37 -04:00
# include <linux/sysctl.h>
2007-09-12 11:50:50 +02:00
ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
As suggested by Julian:
Simply, flowi4_iif must not contain 0, it does not
look logical to ignore all ip rules with specified iif.
because in fib_rule_match() we do:
if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
goto out;
flowi4_iif should be LOOPBACK_IFINDEX by default.
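
To make the effect of that check concrete, here is a small user-space sketch. The `struct rule` and `struct flow` types are simplified stand-ins invented for illustration, `iif_matches` reproduces only the quoted iif test, and the value 1 for LOOPBACK_IFINDEX is an assumption taken from the kernel headers.

```c
#include <stdbool.h>

#define LOOPBACK_IFINDEX 1  /* assumed value of the kernel constant */

/* Simplified stand-ins for the kernel's fib_rule and flowi (illustration only). */
struct rule { int iifindex; };   /* 0 means "no iif configured on the rule" */
struct flow { int flowi_iif; };

/* Mirrors only the iif test quoted above from fib_rule_match(). */
static bool iif_matches(const struct rule *r, const struct flow *fl)
{
    if (r->iifindex && r->iifindex != fl->flowi_iif)
        return false;
    return true;
}
```

With `flowi_iif` left at 0, a rule that names the loopback interface can never match, while the same flow initialised to LOOPBACK_IFINDEX does match it.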
We need to move LOOPBACK_IFINDEX to include/net/flow.h:
1) It is mostly used by flowi_iif
2) Fix the following compile error if we use it in flow.h
in the later patches:
In file included from include/linux/netfilter.h:277:0,
from include/net/netns/netfilter.h:5,
from include/net/net_namespace.h:21,
from include/linux/netdevice.h:43,
from include/linux/icmpv6.h:12,
from include/linux/ipv6.h:61,
from include/net/ipv6.h:16,
from include/linux/sunrpc/clnt.h:27,
from include/linux/nfs_fs.h:30,
from init/do_mounts.c:32:
include/net/flow.h: In function ‘flowi4_init_output’:
include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 16:25:34 -07:00
# include <net/flow.h>
2008-03-31 19:41:14 -07:00
# include <net/netns/core.h>
2008-07-18 04:01:24 -07:00
# include <net/netns/mib.h>
2007-12-11 04:19:17 -08:00
# include <net/netns/unix.h>
2007-12-11 04:19:54 -08:00
# include <net/netns/packet.h>
2007-12-16 13:29:36 -08:00
# include <net/netns/ipv4.h>
2008-01-10 02:49:06 -08:00
# include <net/netns/ipv6.h>
2014-02-28 07:32:49 +01:00
# include <net/netns/ieee802154_6lowpan.h>
2012-08-06 08:42:04 +00:00
# include <net/netns/sctp.h>
2008-04-13 22:28:42 -07:00
# include <net/netns/dccp.h>
2013-03-24 23:50:39 +00:00
# include <net/netns/netfilter.h>
2008-01-31 04:02:13 -08:00
# include <net/netns/x_tables.h>
2008-10-08 11:35:02 +02:00
# if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
# include <net/netns/conntrack.h>
# endif
2013-10-10 23:28:33 +02:00
# include <net/netns/nftables.h>
2008-11-25 17:14:31 -08:00
# include <net/netns/xfrm.h>
2015-03-03 19:10:47 -06:00
# include <net/netns/mpls.h>
2017-02-21 12:19:47 +01:00
# include <net/netns/can.h>
2014-10-31 22:56:04 -04:00
# include <linux/ns_common.h>
2015-06-17 10:28:25 -05:00
# include <linux/idr.h>
# include <linux/skbuff.h>
2007-12-11 04:19:17 -08:00
2012-06-14 02:31:10 -07:00
struct user_namespace ;
2007-09-12 12:01:34 +02:00
struct proc_dir_entry ;
2007-09-26 22:10:56 -07:00
struct net_device ;
2007-11-19 22:26:51 -08:00
struct sock ;
2007-12-01 23:51:01 +11:00
struct ctl_table_header ;
2008-04-15 00:36:08 -07:00
struct net_generic ;
2018-03-19 13:17:30 +01:00
struct uevent_sock ;
2011-03-04 12:18:07 +02:00
struct netns_ipvs ;
2007-12-01 23:51:01 +11:00
2009-10-24 06:13:17 -07:00
# define NETDEV_HASHBITS 8
# define NETDEV_HASHENTRIES (1 << NETDEV_HASHBITS)
2007-09-12 11:50:50 +02:00
struct net {
2017-06-30 13:08:08 +03:00
refcount_t passive ; /* To decide when the network
2011-06-08 21:13:01 -04:00
* namespace should be freed .
*/
2018-01-12 18:28:31 +03:00
refcount_t count ; /* To decide when the network
2011-06-08 21:13:01 -04:00
* namespace should be shut down .
2007-09-12 11:50:50 +02:00
*/
2010-10-14 05:56:18 +00:00
spinlock_t rules_mod_lock ;
2015-03-11 18:53:14 -07:00
atomic64_t cookie_gen ;
2007-09-12 11:50:50 +02:00
struct list_head list ; /* list of network namespaces */
2018-02-19 12:58:38 +03:00
struct list_head exit_list ; /* Linked to call pernet exit
2018-03-27 18:02:23 +03:00
* methods on dead net (
* pernet_ops_rwsem read locked ) ,
* or to unregister pernet ops
* ( pernet_ops_rwsem write locked ) .
2018-02-19 12:58:38 +03:00
*/
2018-02-19 12:58:45 +03:00
struct llist_node cleanup_list ; /* namespaces on death row */
2012-06-14 02:31:10 -07:00
struct user_namespace * user_ns ; /* Owning user namespace */
2016-08-08 14:33:23 -05:00
struct ucounts * ucounts ;
2015-05-15 14:47:32 -07:00
spinlock_t nsid_lock ;
2015-01-15 15:11:15 +01:00
struct idr netns_ids ;
2012-06-14 02:31:10 -07:00
2014-10-31 22:56:04 -04:00
struct ns_common ns ;
2011-06-15 10:21:48 -07:00
2007-09-12 12:01:34 +02:00
struct proc_dir_entry * proc_net ;
struct proc_dir_entry * proc_net_stat ;
2007-09-17 11:56:21 -07:00
2008-07-14 21:22:20 -04:00
# ifdef CONFIG_SYSCTL
struct ctl_table_set sysctls ;
# endif
2007-11-30 23:55:42 +11:00
2010-10-14 05:56:18 +00:00
struct sock * rtnl ; /* rtnetlink socket */
struct sock * genl_sock ;
2007-09-26 22:10:56 -07:00
2018-03-19 13:17:30 +01:00
struct uevent_sock * uevent_sock ; /* uevent socket */
2007-09-17 11:56:21 -07:00
struct list_head dev_base_head ;
struct hlist_head * dev_name_head ;
struct hlist_head * dev_index_head ;
2011-06-21 03:11:20 +00:00
unsigned int dev_base_seq ; /* protected by rtnl_mutex */
2012-08-08 21:53:19 +00:00
int ifindex ;
2013-09-23 21:19:49 -07:00
unsigned int dev_unreg_count ;
2007-11-19 22:26:51 -08:00
2008-01-10 03:20:28 -08:00
/* core fib_rules */
struct list_head rules_ops ;
2018-03-27 18:02:23 +03:00
struct list_head fib_notifier_ops ; /* Populated by
* register_pernet_subsys ( )
*/
2010-10-14 05:56:18 +00:00
struct net_device * loopback_dev ; /* The loopback */
2008-03-31 19:41:14 -07:00
struct netns_core core ;
2008-07-18 04:01:24 -07:00
struct netns_mib mib ;
2007-12-11 04:19:54 -08:00
struct netns_packet packet ;
2007-12-11 04:19:17 -08:00
struct netns_unix unx ;
2007-12-16 13:29:36 -08:00
struct netns_ipv4 ipv4 ;
2011-12-10 09:48:31 +00:00
# if IS_ENABLED(CONFIG_IPV6)
2008-01-10 02:49:06 -08:00
struct netns_ipv6 ipv6 ;
# endif
2014-02-28 07:32:49 +01:00
# if IS_ENABLED(CONFIG_IEEE802154_6LOWPAN)
struct netns_ieee802154_lowpan ieee802154_lowpan ;
# endif
2012-08-06 08:42:04 +00:00
# if defined(CONFIG_IP_SCTP) || defined(CONFIG_IP_SCTP_MODULE)
struct netns_sctp sctp ;
# endif
2008-04-13 22:28:42 -07:00
# if defined(CONFIG_IP_DCCP) || defined(CONFIG_IP_DCCP_MODULE)
struct netns_dccp dccp ;
# endif
2008-01-31 04:02:13 -08:00
# ifdef CONFIG_NETFILTER
2013-03-24 23:50:39 +00:00
struct netns_nf nf ;
2008-01-31 04:02:13 -08:00
struct netns_xt xt ;
2008-10-08 11:35:02 +02:00
# if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
struct netns_ct ct ;
2012-09-18 16:50:08 +00:00
# endif
2013-10-10 23:28:33 +02:00
# if defined(CONFIG_NF_TABLES) || defined(CONFIG_NF_TABLES_MODULE)
struct netns_nftables nft ;
# endif
2012-09-18 16:50:08 +00:00
# if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
struct netns_nf_frag nf_frag ;
2008-10-08 11:35:02 +02:00
# endif
2010-01-13 16:02:14 +01:00
struct sock * nfnl ;
struct sock * nfnl_stash ;
2015-08-05 17:51:45 +02:00
# if IS_ENABLED(CONFIG_NETFILTER_NETLINK_ACCT)
struct list_head nfnl_acct_list ;
# endif
2015-12-09 14:07:40 +01:00
# if IS_ENABLED(CONFIG_NF_CT_NETLINK_TIMEOUT)
struct list_head nfct_timeout_list ;
# endif
2008-11-25 17:14:31 -08:00
# endif
2009-09-29 23:27:28 +02:00
# ifdef CONFIG_WEXT_CORE
2009-06-24 01:34:48 +00:00
struct sk_buff_head wext_nlevents ;
2008-01-31 04:02:13 -08:00
# endif
2010-10-25 03:20:11 +00:00
struct net_generic __rcu * gen ;
2010-10-14 05:56:18 +00:00
/* Note : following structs are cache line aligned */
# ifdef CONFIG_XFRM
struct netns_xfrm xfrm ;
# endif
2013-06-26 16:40:06 +08:00
# if IS_ENABLED(CONFIG_IP_VS)
2011-01-03 14:44:42 +01:00
struct netns_ipvs * ipvs ;
2015-03-03 19:10:47 -06:00
# endif
# if IS_ENABLED(CONFIG_MPLS)
struct netns_mpls mpls ;
2017-02-21 12:19:47 +01:00
# endif
# if IS_ENABLED(CONFIG_CAN)
struct netns_can can ;
2013-06-26 16:40:06 +08:00
# endif
2012-07-16 04:28:49 +00:00
struct sock * diag_nlsk ;
2013-05-27 20:46:33 +00:00
atomic_t fnhe_genid ;
2016-10-28 01:22:25 -07:00
} __randomize_layout ;
2007-09-12 11:50:50 +02:00
2008-04-02 00:10:28 -07:00
# include <linux/seq_file_net.h>
2007-09-13 09:16:29 +02:00
/* Init's network namespace */
2007-09-12 11:50:50 +02:00
extern struct net init_net ;
2008-04-03 13:04:33 -07:00
2012-06-14 02:16:42 -07:00
# ifdef CONFIG_NET_NS
2013-09-21 10:22:48 -07:00
struct net * copy_net_ns ( unsigned long flags , struct user_namespace * user_ns ,
struct net * old_net ) ;
2008-04-02 00:09:29 -07:00
2017-05-30 11:38:12 +02:00
void net_ns_barrier ( void ) ;
2012-06-14 02:16:42 -07:00
# else /* CONFIG_NET_NS */
# include <linux/sched.h>
# include <linux/nsproxy.h>
2012-06-14 02:31:10 -07:00
static inline struct net * copy_net_ns ( unsigned long flags ,
struct user_namespace * user_ns , struct net * old_net )
2007-09-26 22:04:26 -07:00
{
2012-06-14 02:16:42 -07:00
if ( flags & CLONE_NEWNET )
return ERR_PTR ( - EINVAL ) ;
return old_net ;
2007-09-26 22:04:26 -07:00
}
2017-05-30 11:38:12 +02:00
static inline void net_ns_barrier ( void ) { }
2012-06-14 02:16:42 -07:00
# endif /* CONFIG_NET_NS */
2008-04-02 00:09:29 -07:00
extern struct list_head net_namespace_list ;
2007-09-26 22:04:26 -07:00
2013-09-21 10:22:48 -07:00
struct net * get_net_ns_by_pid ( pid_t pid ) ;
2016-11-18 09:41:46 +00:00
struct net * get_net_ns_by_fd ( int fd ) ;
2009-07-10 09:51:35 +00:00
2014-02-09 22:29:14 +05:30
# ifdef CONFIG_SYSCTL
void ipx_register_sysctl ( void ) ;
void ipx_unregister_sysctl ( void ) ;
# else
# define ipx_register_sysctl()
# define ipx_unregister_sysctl()
# endif
2007-11-01 00:43:49 -07:00
# ifdef CONFIG_NET_NS
2013-09-21 10:22:48 -07:00
void __put_net ( struct net * net ) ;
2007-09-12 11:50:50 +02:00
static inline struct net * get_net ( struct net * net )
{
2018-01-12 18:28:31 +03:00
refcount_inc(&net->count);
2007-09-12 11:50:50 +02:00
return net ;
}
2007-09-13 09:18:57 +02:00
static inline struct net * maybe_get_net ( struct net * net )
{
/* Used when we know struct net exists but we
* aren't guaranteed a previous reference count
* exists . If the reference count is zero this
* function fails and returns NULL .
*/
2018-01-12 18:28:31 +03:00
if (!refcount_inc_not_zero(&net->count))
2007-09-13 09:18:57 +02:00
net = NULL ;
return net ;
}
2007-09-12 11:50:50 +02:00
static inline void put_net ( struct net * net )
{
2018-01-12 18:28:31 +03:00
if (refcount_dec_and_test(&net->count))
2007-09-12 11:50:50 +02:00
__put_net ( net ) ;
}
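
The three helpers above follow a common reference-counting pattern. The sketch below models it in user space with C11 atomics so the behaviour can be tried outside the kernel. The `fake_` names and the `torn_down` flag are hypothetical, and the kernel's refcount_t additionally saturates and warns on misuse, which this model does not attempt.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for struct net; only the lifetime counter is modelled. */
struct fake_net {
    atomic_uint count;
    bool torn_down;      /* set where the kernel would call __put_net() */
};

/* Unconditional increment: the caller must already hold a reference. */
static struct fake_net *fake_get_net(struct fake_net *net)
{
    atomic_fetch_add(&net->count, 1);
    return net;
}

/* Take a reference only if the count has not already dropped to zero. */
static struct fake_net *fake_maybe_get_net(struct fake_net *net)
{
    unsigned int old = atomic_load(&net->count);

    while (old != 0) {
        /* On failure `old` is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(&net->count, &old, old + 1))
            return net;
    }
    return NULL;
}

/* Drop a reference; the caller that drops the last one triggers teardown. */
static void fake_put_net(struct fake_net *net)
{
    /* fetch_sub returns the previous value, so 1 means "we were last". */
    if (atomic_fetch_sub(&net->count, 1) == 1)
        net->torn_down = true;
}
```

The point of the compare-and-swap loop is that a count of zero is never resurrected: once teardown has started, `fake_maybe_get_net()` can only return NULL.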
2008-03-26 03:57:35 +09:00
static inline
int net_eq ( const struct net * net1 , const struct net * net2 )
{
return net1 == net2;
}
2011-06-08 21:13:01 -04:00
net: tcp: close sock if net namespace is exiting
When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.
For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open. However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open. In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence. The sock's dst(s)
hold a reference to their interface, which are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:
unregister_netdevice: waiting for lo to become free. Usage count = 1
These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.
After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman <ddstreet@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 16:14:26 -05:00
static inline int check_net ( const struct net * net )
{
2018-01-29 10:14:59 -05:00
return refcount_read(&net->count) != 0;
2018-01-18 16:14:26 -05:00
}
2013-09-21 10:22:48 -07:00
void net_drop_ns ( void * ) ;
2011-06-08 21:13:01 -04:00
2007-11-01 00:43:49 -07:00
# else
2008-06-20 22:16:51 -07:00
2007-11-01 00:43:49 -07:00
static inline struct net * get_net ( struct net * net )
{
return net ;
}
static inline void put_net ( struct net * net )
{
}
2008-04-16 01:58:04 -07:00
static inline struct net * maybe_get_net ( struct net * net )
{
return net ;
}
static inline
int net_eq ( const struct net * net1 , const struct net * net2 )
{
return 1 ;
}
2011-06-08 21:13:01 -04:00
2018-01-18 16:14:26 -05:00
static inline int check_net ( const struct net * net )
{
return 1 ;
}
2011-06-08 21:13:01 -04:00
# define net_drop_ns NULL
2008-04-16 01:58:04 -07:00
# endif
2015-03-11 23:06:44 -05:00
typedef struct {
2008-11-12 00:53:30 -08:00
# ifdef CONFIG_NET_NS
2015-03-11 23:06:44 -05:00
struct net * net ;
# endif
} possible_net_t ;
2008-11-12 00:53:30 -08:00
2015-03-11 23:06:44 -05:00
static inline void write_pnet ( possible_net_t * pnet , struct net * net )
2008-11-12 00:53:30 -08:00
{
2015-03-11 23:06:44 -05:00
# ifdef CONFIG_NET_NS
pnet->net = net;
# endif
2008-11-12 00:53:30 -08:00
}
2015-03-11 23:06:44 -05:00
static inline struct net * read_pnet ( const possible_net_t * pnet )
2008-11-12 00:53:30 -08:00
{
2015-03-11 23:06:44 -05:00
# ifdef CONFIG_NET_NS
return pnet->net;
2008-11-12 00:53:30 -08:00
# else
2015-03-11 23:06:44 -05:00
return & init_net ;
2008-11-12 00:53:30 -08:00
# endif
2015-03-11 23:06:44 -05:00
}
2008-04-16 01:58:04 -07:00
net: Introduce net_rwsem to protect net_namespace_list
rtnl_lock() is used everywhere, and contention is very high.
When someone wants to iterate over alive net namespaces,
there is no way to do that without the exclusive lock.
But the exclusive rtnl_lock() in such places is overkill,
and it just increases the contention. Yes, there is already
for_each_net_rcu() in the kernel, but it requires rcu_read_lock(),
so the iteration cannot sleep. Also, sometimes it may be necessary
to really prevent net_namespace_list growth, so for_each_net_rcu()
does not fit there.
This patch introduces a new rw_semaphore, which will be used
instead of rtnl_mutex to protect net_namespace_list. It is
sleepable and allows non-exclusive iteration over the net
namespaces list. It allows us to stop using rtnl_lock()
in several places (which is done in the next patches) and reduces
the time we hold rtnl_mutex. Here we just add the new lock,
while the explanation of why we can remove rtnl_lock() is
in the next patches.
Fine-grained locks are generally better than one big lock,
so let's do that with net_namespace_list, while the situation
allows it.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29 19:20:32 +03:00
/* Protected by net_rwsem */
2007-09-12 11:50:50 +02:00
# define for_each_net(VAR) \
list_for_each_entry ( VAR , & net_namespace_list , list )
2009-07-10 09:51:33 +00:00
# define for_each_net_rcu(VAR) \
list_for_each_entry_rcu ( VAR , & net_namespace_list , list )
2007-10-08 20:38:39 -07:00
# ifdef CONFIG_NET_NS
# define __net_init
# define __net_exit
2007-11-13 03:23:50 -08:00
# define __net_initdata
2012-10-04 17:12:11 -07:00
# define __net_initconst
2007-10-08 20:38:39 -07:00
# else
# define __net_init __init
2016-08-02 14:03:33 -07:00
# define __net_exit __ref
2007-11-13 03:23:50 -08:00
# define __net_initdata __initdata
2012-10-04 17:12:11 -07:00
# define __net_initconst __initconst
2007-10-08 20:38:39 -07:00
# endif
2007-09-12 11:50:50 +02:00
2015-05-07 11:02:49 +02:00
int peernet2id_alloc ( struct net * net , struct net * peer ) ;
2015-05-07 11:02:53 +02:00
int peernet2id ( struct net * net , struct net * peer ) ;
bool peernet_has_id ( struct net * net , struct net * peer ) ;
2015-01-15 15:11:15 +01:00
struct net * get_net_ns_by_id ( struct net * net , int id ) ;
2007-09-12 11:50:50 +02:00
struct pernet_operations {
struct list_head list ;
2018-03-13 13:55:55 +03:00
/*
* Below methods are called without any exclusive locks .
* More than one net may be constructed and destructed
* in parallel on several cpus . Every pernet_operations
* has to keep in mind all other pernet_operations and
* introduce locking if they share common resources .
*
2018-03-27 18:02:32 +03:00
* The only time they are called with exclusive lock is
* from register_pernet_subsys ( ) , unregister_pernet_subsys ( ) ,
* register_pernet_device ( ) and unregister_pernet_device ( ) .
*
2018-03-13 13:55:55 +03:00
* Exit methods using blocking RCU primitives , such as
* synchronize_rcu ( ) , should be implemented via exit_batch .
* Then , destruction of a group of net requires single
* synchronize_rcu ( ) related to these pernet_operations ,
* instead of separate synchronize_rcu ( ) for every net .
* Please , avoid synchronize_rcu ( ) at all , where it's possible .
*/
2007-09-12 11:50:50 +02:00
int ( * init ) ( struct net * net ) ;
void ( * exit ) ( struct net * net ) ;
2009-12-03 02:29:03 +00:00
void ( * exit_batch ) ( struct list_head * net_exit_list ) ;
netns: make struct pernet_operations::id unsigned int
Make struct pernet_operations::id unsigned.
There are 2 reasons to do so:
1)
This field is really an index into a zero-based array and
thus is an unsigned entity. Using a negative value is an out-of-bounds
access by definition.
2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preferred to signed 32-bit data.
"int" being used as an array index needs to be sign-extended
to 64-bit before being used.
void f(long *p, int i)
{
g(p[i]);
}
roughly translates to
movsx rsi, esi
mov rdi, [rsi+...]
call g
MOVSX is a 3-byte instruction which isn't necessary if the variable is
unsigned because x86_64 zero-extends by default.
Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:
static inline void *net_generic(const struct net *net, int id)
{
...
ptr = ng->ptr[id - 1];
...
}
And this function is used a lot, so those sign extensions add up.
Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):
add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
Unfortunately some functions actually grow bigger.
This is a seemingly random artefact of code generation with the register
allocator being used differently. gcc decides that some variable
needs to live in the new r8+ registers and every access now requires a REX
prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be
used, which is longer than [r8].
However, overall balance is in negative direction:
add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
function                       old     new   delta
nfsd4_lock                    3886    3959     +73
tipc_link_build_proto_msg     1096    1140     +44
mac80211_hwsim_new_radio      2776    2808     +32
tipc_mon_rcv                  1032    1058     +26
svcauth_gss_legacy_init       1413    1429     +16
tipc_bcbase_select_primary     379     392     +13
nfsd4_exchange_id             1247    1260     +13
nfsd4_setclientid_confirm      782     793     +11
...
put_client_renew_locked        494     480     -14
ip_set_sockfn_get              730     716     -14
geneve_sock_add                829     813     -16
nfsd4_sequence_done            721     703     -18
nlmclnt_lookup_host            708     686     -22
nfsd4_lockt                   1085    1063     -22
nfs_get_client                1077    1050     -27
tcf_bpf_init                  1106    1076     -30
nfsd4_encode_fattr            5997    5930     -67
Total: Before=154856051, After=154854321, chg -0.00%
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-17 04:58:21 +03:00
unsigned int * id ;
2009-11-29 22:25:28 +00:00
size_t size ;
2007-09-12 11:50:50 +02:00
} ;
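
The comment inside the struct recommends `exit_batch` for teardown that involves blocking RCU primitives. The following user-space sketch shows the shape of that idea under simplifying assumptions: an array replaces the kernel's `list_head`, all `model_` and `demo_` names are hypothetical, and a counter marks where a single `synchronize_rcu()` would sit.

```c
#include <stddef.h>

struct model_net { int id; int initialised; };

/* Cut-down, hypothetical analogue of struct pernet_operations. */
struct model_pernet_ops {
    int  (*init)(struct model_net *net);
    void (*exit)(struct model_net *net);
    void (*exit_batch)(struct model_net **nets, size_t n);
};

/* Tear down a group of namespaces: per-net exit first, then one batched call. */
static void model_ops_exit_list(const struct model_pernet_ops *ops,
                                struct model_net **nets, size_t n)
{
    size_t i;

    if (ops->exit)
        for (i = 0; i < n; i++)
            ops->exit(nets[i]);
    if (ops->exit_batch)
        ops->exit_batch(nets, n);
}

/* Example ops: count how often the expensive batched step runs. */
static int batch_calls;
static int nets_released;

static int demo_init(struct model_net *net)
{
    net->initialised = 1;
    return 0;
}

static void demo_exit_batch(struct model_net **nets, size_t n)
{
    size_t i;

    batch_calls++;   /* a single synchronize_rcu() would go here */
    for (i = 0; i < n; i++) {
        nets[i]->initialised = 0;
        nets_released++;
    }
}

static const struct model_pernet_ops demo_ops = {
    .init = demo_init,
    .exit_batch = demo_exit_batch,
};
```

Tearing down three namespaces through `model_ops_exit_list()` runs the batched step once rather than three times, which is the saving the comment describes.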
2009-02-22 00:11:09 -08:00
/*
* Use these carefully . If you implement a network device and it
* needs per network namespace operations use device pernet operations ,
* otherwise use pernet subsys operations .
*
2009-07-15 06:16:34 +00:00
* Network interfaces need to be removed from a dying netns _before_
* subsys notifiers can be called , as most of the network code cleanup
* ( which is done from subsys notifiers ) runs with the assumption that
* dev_remove_pack has been called so no new packets will arrive during
* and after the cleanup functions have been called . dev_remove_pack
* is not per namespace so instead the guarantee of no more packets
* arriving in a network namespace is provided by ensuring that all
* network devices and all sockets have left the network namespace
* before the cleanup methods are called .
2009-02-22 00:11:09 -08:00
*
* For the longest time the ipv4 icmp code was registered as a pernet
* device which caused kernel oops , and panics during network
* namespace cleanup . So please don't get this wrong .
*/
2013-09-21 10:22:48 -07:00
int register_pernet_subsys ( struct pernet_operations * ) ;
void unregister_pernet_subsys ( struct pernet_operations * ) ;
int register_pernet_device ( struct pernet_operations * ) ;
void unregister_pernet_device ( struct pernet_operations * ) ;
2009-11-29 22:25:28 +00:00
2007-11-30 23:55:42 +11:00
struct ctl_table ;
struct ctl_table_header ;
2008-05-19 13:45:33 -07:00
2012-04-19 13:20:32 +00:00
# ifdef CONFIG_SYSCTL
2013-09-21 10:22:48 -07:00
int net_sysctl_init ( void ) ;
struct ctl_table_header * register_net_sysctl ( struct net * net , const char * path ,
struct ctl_table * table ) ;
void unregister_net_sysctl_table ( struct ctl_table_header * header ) ;
2012-04-23 12:13:02 +00:00
# else
static inline int net_sysctl_init ( void ) { return 0 ; }
static inline struct ctl_table_header * register_net_sysctl ( struct net * net ,
const char * path , struct ctl_table * table )
{
return NULL ;
}
static inline void unregister_net_sysctl_table ( struct ctl_table_header * header )
{
}
# endif
2013-07-30 08:33:53 +08:00
static inline int rt_genid_ipv4 ( struct net * net )
2012-09-10 22:09:44 +00:00
{
2013-07-30 08:33:53 +08:00
return atomic_read(&net->ipv4.rt_genid);
2012-09-10 22:09:44 +00:00
}
2013-07-30 08:33:53 +08:00
static inline void rt_genid_bump_ipv4 ( struct net * net )
2012-09-10 22:09:44 +00:00
{
2013-07-30 08:33:53 +08:00
atomic_inc(&net->ipv4.rt_genid);
}
2014-09-28 00:46:06 +02:00
extern void ( * __fib6_flush_trees ) ( struct net * net ) ;
2013-07-30 08:33:53 +08:00
static inline void rt_genid_bump_ipv6 ( struct net * net )
{
2014-09-28 00:46:06 +02:00
if ( __fib6_flush_trees )
__fib6_flush_trees ( net ) ;
2013-07-30 08:33:53 +08:00
}
2014-04-17 18:22:54 -07:00
# if IS_ENABLED(CONFIG_IEEE802154_6LOWPAN)
static inline struct netns_ieee802154_lowpan *
net_ieee802154_lowpan ( struct net * net )
{
return &net->ieee802154_lowpan;
}
# endif
2013-07-30 08:33:53 +08:00
/* For callers who don't really care about whether it's IPv4 or IPv6 */
static inline void rt_genid_bump_all ( struct net * net )
{
rt_genid_bump_ipv4 ( net ) ;
rt_genid_bump_ipv6 ( net ) ;
2012-09-10 22:09:44 +00:00
}
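
The genid helpers above read and bump a generation counter. A typical use of such a counter, and an assumption here since this header does not show the callers, is cheap cache invalidation: each cached entry records the generation it was created in and is treated as stale once the global value has moved on. A minimal user-space sketch with hypothetical names:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Plays the role of net->ipv4.rt_genid; static storage starts at zero. */
static atomic_int model_rt_genid;

struct cached_route {
    int genid;      /* generation seen when the entry was created */
    int nexthop;    /* arbitrary payload for the sketch */
};

/* Create an entry stamped with the current generation. */
static struct cached_route cache_route(int nexthop)
{
    struct cached_route r = { atomic_load(&model_rt_genid), nexthop };
    return r;
}

/* An entry is stale as soon as the global generation differs from its stamp. */
static bool route_is_stale(const struct cached_route *r)
{
    return r->genid != atomic_load(&model_rt_genid);
}

/* Invalidate every existing entry at once, without walking the cache. */
static void model_genid_bump(void)
{
    atomic_fetch_add(&model_rt_genid, 1);
}
```

One increment invalidates all outstanding entries in constant time; the cost moves to the lookup path, which has to compare generations before trusting an entry.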
2007-11-30 23:55:42 +11:00
2013-05-27 20:46:33 +00:00
static inline int fnhe_genid ( struct net * net )
{
return atomic_read(&net->fnhe_genid);
}
static inline void fnhe_genid_bump ( struct net * net )
{
atomic_inc(&net->fnhe_genid);
}
2007-09-12 11:50:50 +02:00
# endif /* __NET_NET_NAMESPACE_H */