5920 Commits

Author SHA1 Message Date
David Ahern
6f74b6c259 net: Align ip_multipath_l3_keys and ip6_multipath_l3_keys
Symmetry is good and allows easy comparison that ipv4 and ipv6 are
doing the same thing. To that end, change ip_multipath_l3_keys to
set addresses at the end after the icmp compares, and move the
initialization of ipv6 flow keys to rt6_multipath_hash.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 13:04:21 -05:00
David S. Miller
4a0c7191c7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter fixes for your net tree,
they are:

1) Put back reference on CLUSTERIP configuration structure from the
   error path, patch from Florian Westphal.

2) Put reference on CLUSTERIP configuration instead of freeing it,
   another cpu may still be walking over it, also from Florian.

3) Refetch pointer to IPv6 header from nf_nat_ipv6_manip_pkt() given
   packet manipulation may reallocation the skbuff header, from Florian.

4) Missing match size sanity checks in ebt_among, from Florian.

5) Convert BUG_ON to WARN_ON in ebtables, from Florian.

6) Sanity check userspace offsets from ebtables kernel, from Florian.

7) Missing checksum replace call in flowtable IPv4 DNAT, from Felix
   Fietkau.

8) Bump the right stats on checksum error from bridge netfilter,
   from Taehee Yoo.

9) Unset interface flag in IPv6 fib lookups otherwise we get
   misleading routing lookup results, from Florian.

10) Missing sk_to_full_sk() in ip6_route_me_harder() from Eric Dumazet.

11) Don't allow devices to be part of multiple flowtables at the same
    time, this may break setups.

12) Missing netlink attribute validation in flowtable deletion.

13) Wrong array index in nf_unregister_net_hook() call from error path
    in flowtable addition path.

14) Fix FTP IPVS helper when NAT mangling is in place, patch from
    Julian Anastasov.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-02 20:32:15 -05:00
Sabrina Dubroca
f1c02cfb7b ipv6: allow userspace to add IFA_F_OPTIMISTIC addresses
According to RFC 4429 (section 3.1), adding new IPv6 addresses as
optimistic addresses is acceptable, as long as the implementation
follows some rules:

   * Optimistic DAD SHOULD only be used when the implementation is aware
        that the address is based on a most likely unique interface
        identifier (such as in [RFC2464]), generated randomly [RFC3041],
        or by a well-distributed hash function [RFC3972] or assigned by
        Dynamic Host Configuration Protocol for IPv6 (DHCPv6) [RFC3315].
        Optimistic DAD SHOULD NOT be used for manually entered
        addresses.

Thus, it seems reasonable to allow userspace to set the optimistic flag
when adding new addresses.

We must not let userspace set NODAD + OPTIMISTIC, since if the kernel is
not performing DAD we would never clear the optimistic flag. We must
also ignore userspace's request to add OPTIMISTIC flag to addresses that
have already completed DAD (addresses that don't have the TENTATIVE
flag, or that have the DADFAILED flag).

Then we also need to clear the OPTIMISTIC flag on permanent addresses
when DAD fails. Otherwise, IFA_F_OPTIMISTIC addresses added by userspace
can still be used after DAD has failed, because in
ipv6_chk_addr_and_flags(), IFA_F_OPTIMISTIC overrides IFA_F_TENTATIVE.

Setting IFA_F_OPTIMISTIC from userspace is conditional on
CONFIG_IPV6_OPTIMISTIC_DAD and the optimistic_dad sysctl.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:43:06 -05:00
Yuval Mintz
7b0db85737 ipmr, ip6mr: Unite dumproute flows
The various MFC entries are being held in the same kind of mr_tables
for both ipmr and ip6mr, and their traversal logic is identical.
Also, with the exception of the addresses [and other small tidbits]
the major bulk of the nla setting is identical.

Unite as much of the dumping as possible between the two.
Notice this requires creating an mr_table iterator for each, as the
for-each preprocessor macro can't be used by the common logic.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Yuval Mintz
889cd83cbe ip6mr: Remove MFC_NOTIFY and refactor flags
MFC_NOTIFY exists in ip6mr, probably as some legacy code
[was already removed for ipmr in commit
06bd6c0370bb ("net: ipmr: remove unused MFC_NOTIFY flag and make the flags enum").
Remove it from ip6mr as well, and move the enum into a common file;
Notice MFC_OFFLOAD is currently only used by ipmr.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Yuval Mintz
3feda6b46f ipmr, ip6mr: Unite vif seq functions
Same as previously done with the mfc seq, the logic for the vif seq is
refactored to be shared between ipmr and ip6mr.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Yuval Mintz
c8d6196803 ipmr, ip6mr: Unite mfc seq logic
With the exception of the final dump, ipmr and ip6mr have the exact same
seq logic for traversing a given mr_table. Refactor that code and make
it common.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Yuval Mintz
845c9a7ae7 ipmr, ip6mr: Unite logic for searching in MFC cache
ipmr and ip6mr utilize the exact same methods for searching the
hashed resolved connections, difference being only in the construction
of the hash comparison key.

In order to unite the flow, introduce an mr_table operation set that
would contain the protocol specific information required for common
flows, in this case - the hash parameters and a comparison key
representing a (*,*) route.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Yuval Mintz
494fff5637 ipmr, ip6mr: Make mfc_cache a common structure
mfc_cache and mfc6_cache are almost identical - the main difference is
in the origin/group addresses and comparison-key. Make a common
structure encapsulating most of the multicast routing logic  - mr_mfc
and convert both ipmr and ip6mr into using it.

For easy conversion [casting, in this case] mr_mfc has to be the first
field inside every multicast routing abstraction utilizing it.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Yuval Mintz
0bbbf0e7d0 ipmr, ip6mr: Unite creation of new mr_table
Now that both ipmr and ip6mr are using the same mr_table structure,
we can have a common function to allocate & initialize a new instance.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Yuval Mintz
b70432f731 mroute*: Make mr_table a common struct
Following previous changes to ip6mr, mr_table and mr6_table are
basically the same [up to mr6_table having additional '6' suffixes to
its variable names].
Move the common structure definition into a common header; This
requires renaming all references in ip6mr to variables that had the
distinct suffix.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Yuval Mintz
87c418bf13 ip6mr: Align hash implementation to ipmr
Since commit 8fb472c09b9d ("ipmr: improve hash scalability") ipmr has
been using rhashtable as a basis for its mfc routes, but ip6mr is
currently still using the old private MFC hash implementation.

Align ip6mr to the current ipmr implementation.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Yuval Mintz
8571ab479a ip6mr: Make mroute_sk rcu-based
In ipmr the mr_table socket is handled under RCU. Introduce the same
for ip6mr.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Yuval Mintz
6853f21f76 ipmr,ipmr6: Define a uniform vif_device
The two implementations have almost identical structures - vif_device and
mif_device. As a step toward uniforming the mr_tables, eliminate the
mif_device and relocate the vif_device definition into a new common
header file.

Also, introduce a common initializing function for setting most of the
vif_device fields in a new common source file. This requires modifying
the ipv{4,6] Kconfig and ipv4 makefile as we're introducing a new common
config option - CONFIG_IP_MROUTE_COMMON.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 13:13:23 -05:00
Roopa Prabhu
5e5d6fed37 ipv6: route: dissect flow in input path if fib rules need it
Dissect flow in fwd path if fib rules require it. Controlled by
a flag to avoid penatly for the common case. Flag is set when fib
rules with sport, dport and proto match that require flow dissect
are installed. Also passes the dissected hash keys to the multipath
hash function when applicable to avoid dissecting the flow again.
icmp packets will continue to use inner header for hash
calculations.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 22:44:44 -05:00
Roopa Prabhu
bb0ad1987e ipv6: fib6_rules: support for match on sport, dport and ip proto
support to match on src port, dst port and ip protocol.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 22:44:43 -05:00
Stephen Hemminger
82695b30ff inet: whitespace cleanup
Ran simple script to find/remove trailing whitespace and blank lines
at EOF because that kind of stuff git whines about and editors leave
behind.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 11:43:28 -05:00
Petr Machata
d1b2a6c4be net: GRE: Add is_gretap_dev, is_ip6gretap_dev
Determining whether a device is a GRE device is easily done by
inspecting struct net_device.type. However, for the tap variants, the
type is just ARPHRD_ETHER.

Therefore introduce two predicate functions that use netdev_ops to tell
the tap devices.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 14:46:26 -05:00
Xin Long
2b3957c34b sit: fix IFLA_MTU ignored on NEWLINK
Commit 128bb975dc3c ("ip6_gre: init dev->mtu and dev->hard_header_len
correctly") fixed IFLA_MTU ignored on NEWLINK for ip6_gre. The same
mtu fix is also needed for sit.

Note that dev->hard_header_len setting for sit works fine, no need to
fix it. sit is actually ipv4 tunnel, it can't call ip6_tnl_change_mtu
to set mtu.

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 14:36:28 -05:00
Xin Long
a6aa804462 ip6_tunnel: fix IFLA_MTU ignored on NEWLINK
Commit 128bb975dc3c ("ip6_gre: init dev->mtu and dev->hard_header_len
correctly") fixed IFLA_MTU ignored on NEWLINK for ip6_gre. The same
mtu fix is also needed for ip6_tunnel.

Note that dev->hard_header_len setting for ip6_tunnel works fine,
no need to fix it.

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 14:36:28 -05:00
Kirill Tkhai
9532ce17f7 net: Convert defrag6_net_ops
These pernet_operations only unregister nf hooks.
So, they are able to be marked as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 11:01:39 -05:00
Kirill Tkhai
afd7b3eb13 net: Convert ila_net_ops
These pernet_operations register and unregister nf hooks.
Also they populate and depopulate ila_net_id-pointed hash
table. The table is changed by hooks during skb processing
and via netlink request. It looks impossible for another
net pernet_operations to force the table reading or writing,
so, they are able to be marked as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 11:01:39 -05:00
Kirill Tkhai
989d9812b7 net: Convert sit_net_ops
These pernet_operations are similar to ip6_tnl_net_ops. Exit method
unregisters all net sit devices, and it looks like another
pernet_operations are not interested in foreign net sit list.
Init method registers netdevice. So, it's possible to mark them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 11:01:38 -05:00
Kirill Tkhai
5ecc29550a net: Convert vti6_net_ops
These pernet_operations are similar to ip6_tnl_net_ops. Exit method
unregisters all net vti6 tunnels, and it looks like another
pernet_operations are not interested in foreign net vti6 list.
Init method registers netdevice. So, it's possible to mark them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 11:01:38 -05:00
Kirill Tkhai
66997ba083 net: Convert ip6_tnl_net_ops
These pernet_operations are similar to ip6gre_net_ops. Exit method
unregisters all net ip6_tnl tunnels, and it looks like another
pernet_operations are not interested in foreign net tunnels list.
So, it's possible to mark them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 11:01:38 -05:00
Kirill Tkhai
5c155c5024 net: Convert ip6gre_net_ops
These pernet_operations are similar to bond_net_ops. Exit method
unregisters all net ip6gre devices, and it looks like another
pernet_operations are not interested in foreign net ip6gre list
or net_generic()->tunnels_wc. Init method registers net device.
So, it's possible to mark them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 11:01:38 -05:00
Kirill Tkhai
02df428ca2 net: Convert simple pernet_operations
These pernet_operations make pretty simple actions
like variable initialization on init, debug checks
on exit, and so on, and they obviously are able
to be executed in parallel with any others:

vrf_net_ops
lockd_net_ops
grace_net_ops
xfrm6_tunnel_net_ops
kcm_net_ops
tcf_net_ops

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 11:01:35 -05:00
Shannon Nelson
5211fcfb81 esp: check the NETIF_F_HW_ESP_TX_CSUM bit before segmenting
If I understand correctly, we should not be asking for a
checksum offload on an ipsec packet if the netdev isn't
advertising NETIF_F_HW_ESP_TX_CSUM.  In that case, we should
clear the NETIF_F_CSUM_MASK bits.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-02-27 10:46:01 +01:00
Eric Dumazet
7d98386d55 netfilter: use skb_to_full_sk in ip6_route_me_harder
For some reason, Florian forgot to apply to ip6_route_me_harder
the fix that went in commit 29e09229d9f2 ("netfilter: use
skb_to_full_sk in ip_route_me_harder")

Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") 
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-25 20:51:13 +01:00
Florian Westphal
47b7e7f828 netfilter: don't set F_IFACE on ipv6 fib lookups
"fib" starts to behave strangely when an ipv6 default route is
added - the FIB lookup returns a route using 'oif' in this case.

This behaviour was inherited from ip6tables rpfilter so change
this as well.

Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1221
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-25 20:15:53 +01:00
Florian Westphal
b078556aec netfilter: ipv6: fix use-after-free Write in nf_nat_ipv6_manip_pkt
l4proto->manip_pkt() can cause reallocation of skb head so pointer
to the ipv6 header must be reloaded.

Reported-and-tested-by: <syzbot+10005f4292fc9cc89de7@syzkaller.appspotmail.com>
Fixes: 58a317f1061c89 ("netfilter: ipv6: add IPv6 NAT support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-25 20:03:54 +01:00
David S. Miller
f74290fdb3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-02-24 00:04:20 -05:00
Arnd Bergmann
ca79bec237 ipv6 sit: work around bogus gcc-8 -Wrestrict warning
gcc-8 has a new warning that detects overlapping input and output arguments
in memcpy(). It triggers for sit_init_net() calling ipip6_tunnel_clone_6rd(),
which is actually correct:

net/ipv6/sit.c: In function 'sit_init_net':
net/ipv6/sit.c:192:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]

The problem here is that the logic detecting the memcpy() arguments finds them
to be the same, but the conditional that tests for the input and output of
ipip6_tunnel_clone_6rd() to be identical is not a compile-time constant.

We know that netdev_priv(t->dev) is the same as t for a tunnel device,
and comparing "dev" directly here lets the compiler figure out as well
that 'dev == sitn->fb_tunnel_dev' when called from sit_init_net(), so
it no longer warns.

This code is old, so Cc stable to make sure that we don't get the warning
for older kernels built with new gcc.

Cc: Martin Sebor <msebor@gmail.com>
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83456
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-23 10:53:26 -05:00
David S. Miller
943a0d4a9b Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains large batch with Netfilter fixes for
your net tree, mostly due to syzbot report fixups and pr_err()
ratelimiting, more specifically, they are:

1) Get rid of superfluous unnecessary check in x_tables before vmalloc(),
   we don't hit BUG there anymore, patch from Michal Hock, suggested by
   Andrew Morton.

2) Race condition in proc file creation in ipt_CLUSTERIP, from Cong Wang.

3) Drop socket lock that results in circular locking dependency, patch
   from Paolo Abeni.

4) Drop packet if case of malformed blob that makes backpointer jump
   in x_tables, from Florian Westphal.

5) Fix refcount leak due to race in ipt_CLUSTERIP in
   clusterip_config_find_get(), from Cong Wang.

6) Several patches to ratelimit pr_err() for x_tables since this can be
   a problem where CAP_NET_ADMIN semantics can protect us in untrusted
   namespace, from Florian Westphal.

7) Missing .gitignore update for new autogenerated asn1 state machine
   for the SNMP NAT helper, from Zhu Lingshan.

8) Missing timer initialization in xt_LED, from Paolo Abeni.

9) Do not allow negative port range in NAT, also from Paolo.

10) Lock imbalance in the xt_hashlimit rate match mode, patch from
    Eric Dumazet.

11) Initialize workqueue before timer in the idletimer match,
    from Eric Dumazet.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21 14:49:55 -05:00
David S. Miller
f5c0c6f429 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-02-19 18:46:11 -05:00
Kirill Tkhai
4d6b80762b net: Convert ip_tables_net_ops, udplite6_net_ops and xt_net_ops
ip_tables_net_ops and udplite6_net_ops create and destroy /proc entries.
xt_net_ops does nothing.

So, we are able to mark them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:12 -05:00
Kirill Tkhai
5fc094f5b8 net: Convert ip6_frags_ops
Exit methods calls inet_frags_exit_net() with global ip6_frags
as argument. So, after we make the pernet_operations async,
a pair of exit methods may be called to iterate this hash table.
Since there is inet_frag_worker(), which already may work
in parallel with inet_frags_exit_net(), and it can make the same
cleanup, that inet_frags_exit_net() does, it's safe. So we may
mark these pernet_operations as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:11 -05:00
Kirill Tkhai
d16784d9fb net: Convert fib6_net_ops, ipv6_addr_label_ops and ip6_segments_ops
These pernet_operations register and unregister tables
and lists for packets forwarding. All of the entities
are per-net. Init methods makes simple initializations,
and since net is not visible for foreigners at the time
it is working, it can't race with anything. Exit method
is executed when there are only local devices, and there
mustn't be packets in-flight. Also, it looks like no one
pernet_operations want to send ipv6 packets to foreign
net. The same reasons are for ipv6_addr_label_ops and
ip6_segments_ops. So, we are able to mark all them as
async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:11 -05:00
Kirill Tkhai
b489141369 net: Convert xfrm6_net_ops
These pernet_operations create sysctl tables and
initialize net::xfrm.xfrm6_dst_ops used for routing.
It doesn't look like another pernet_operations send
ipv6 packets to foreign net namespaces, so it should
be safe to mark the pernet_operations as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:11 -05:00
Kirill Tkhai
a7852a76f4 net: Convert ip6_flowlabel_net_ops
These pernet_operations create and destroy /proc entries.
ip6_fl_purge() makes almost the same actions as timer
ip6_fl_gc_timer does, and as it can be executed in parallel
with ip6_fl_purge(), two parallel ip6_fl_purge() may be
executed. So, we can mark it async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:11 -05:00
Kirill Tkhai
ac34cb6c0c net: Convert ping_v6_net_ops
These pernet_operations only register and unregister /proc
entries, so it's possible to mark them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:11 -05:00
Kirill Tkhai
58708caef5 net: Convert ipv6_sysctl_net_ops
These pernet_operations create and destroy sysctl tables.
They are not touched by another net pernet_operations.
So, it's possible to execute them in parallel with others.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:11 -05:00
Kirill Tkhai
fef65a2c6c net: Convert tcpv6_net_ops
These pernet_operations create and destroy net::ipv6.tcp_sk
socket, which is used in tcp_v6_send_response() only. It looks
like foreign pernet_operations don't want to set ipv6 connection
inside destroyed net, so this socket may be created in destroyed
in parallel with anything else. inet_twsk_purge() is also safe
for that, as described in patch for tcp_sk_ops. So, it's possible
to mark them as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:10 -05:00
Kirill Tkhai
7b7dd180b8 net: Convert fib6_rules_net_ops
These pernet_operations register and unregister
net::ipv6.fib6_rules_ops, which are used for
routing. It looks like there are no pernet_operations,
which send ipv6 packages to another net, so we
are able to mark them as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:10 -05:00
Kirill Tkhai
85ca51b2a2 net: Convert ipv6_inetpeer_ops
net->ipv6.peers is dereferenced in three places via inet_getpeer_v6(),
and it's used to handle skb. All the users of inet_getpeer_v6() do not
look like be able to be called from foreign net pernet_operations, so
we may mark them as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:10 -05:00
Kirill Tkhai
509114112d net: Convert raw6_net_ops, udplite6_net_ops, ipv6_proc_ops, if6_proc_net_ops and ip6_route_net_late_ops
These pernet_operations create and destroy /proc entries
and safely may be converted and safely may be mark as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:10 -05:00
Kirill Tkhai
1a2e93329d net: Convert icmpv6_sk_ops, ndisc_net_ops and igmp6_net_ops
These pernet_operations create and destroy net::ipv6.icmp_sk
socket, used to send ICMP or error reply.

Nobody can dereference the socket to handle a packet before
net is initialized, as there is no routing; nobody can do
that in parallel with exit, as all of devices are moved
to init_net or destroyed and there are no packets it-flight.
So, it's possible to mark these pernet_operations as async.

The same for ndisc_net_ops and for igmp6_net_ops. The last
one also creates and destroys /proc entries.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:10 -05:00
Kirill Tkhai
b01a59a488 net: Convert ip6mr_net_ops
These pernet_operations create and destroy /proc entries,
populate and depopulate net::rules_ops and multiroute table.
All the structures are pernet, and they are not touched
by foreign net pernet_operations. So, it's possible to mark
them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:10 -05:00
Kirill Tkhai
753d525a08 net: Convert inet6_net_ops
init method initializes sysctl defaults, allocates
percpu arrays and creates /proc entries.
exit method reverts the above.

There are no pernet_operations, which are interested
in the above entities of foreign net namespace, so
inet6_net_ops are able to be marked as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:19:09 -05:00
Alexey Kodanev
15f35d49c9 udplite: fix partial checksum initialization
Since UDP-Lite is always using checksum, the following path is
triggered when calculating pseudo header for it:

  udp4_csum_init() or udp6_csum_init()
    skb_checksum_init_zero_check()
      __skb_checksum_validate_complete()

The problem can appear if skb->len is less than CHECKSUM_BREAK. In
this particular case __skb_checksum_validate_complete() also invokes
__skb_checksum_complete(skb). If UDP-Lite is using partial checksum
that covers only part of a packet, the function will return bad
checksum and the packet will be dropped.

It can be fixed if we skip skb_checksum_init_zero_check() and only
set the required pseudo header checksum for UDP-Lite with partial
checksum before udp4_csum_init()/udp6_csum_init() functions return.

Fixes: ed70fcfcee95 ("net: Call skb_checksum_init in IPv4")
Fixes: e4f45b7f40bd ("net: Call skb_checksum_init in IPv6")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 15:57:42 -05:00