Commit Graph

40489 Commits

Author SHA1 Message Date
Marek Lindner
143d157c9e batman-adv: remove leftovers of unused BATADV_PRIMARIES_FIRST_HOP flag
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
2016-01-09 20:56:00 +08:00
Sven Eckelmann
008a374487 batman-adv: Fix lockdep annotation of batadv_tlv_container_remove
The function handles tlv containers and not tlv handlers. Thus the
lockdep_assert_held has to check for the container_list lock.

Fixes: 2c72d655b0 ("batman-adv: Annotate deleting functions with external lock via lockdep")
Signed-off-by: Sven Eckelmann <sven@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
2016-01-09 20:56:00 +08:00
Simon Wunderlich
4a4d045eb2 batman-adv: Start new development cycle
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
2016-01-09 20:56:00 +08:00
Lance Richardson
ad64b8be71 ipv4: eliminate lock count warnings in ping.c
Add lock release/acquire annotations to ping_seq_start() and
ping_seq_stop() to satisfy sparse.

Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-08 21:30:43 -05:00
Lance Richardson
30d3d83a7d ipv4: fix endianness warnings in ip_tunnel_core.c
Eliminate endianness mismatch warnings (reported by sparse) in this file by
using appropriate nla_put_*()/nla_get_*() calls.

Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-08 21:30:43 -05:00
David S. Miller
9b59377b75 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next, they are:

1) Release nf_tables objects on netns destructions via
   nft_release_afinfo().

2) Destroy basechain and rules on netdevice removal in the new netdev
   family.

3) Get rid of defensive check against removal of inactive objects in
   nf_tables.

4) Pass down netns pointer to our existing nfnetlink callbacks, as well
   as commit() and abort() nfnetlink callbacks.

5) Allow to invert limit expression in nf_tables, so we can throttle
   overlimit traffic.

6) Add packet duplication for the netdev family.

7) Add forward expression for the netdev family.

8) Define pr_fmt() in conntrack helpers.

9) Don't leave nfqueue configuration on inconsistent state in case of
   errors, from Ken-ichirou MATSUZAWA, follow up patches are also from
   him.

10) Skip queue option handling after unbind.

11) Return error on unknown both in nfqueue and nflog command.

12) Autoload ctnetlink when NFQA_CFG_F_CONNTRACK is set.

13) Add new NFTA_SET_USERDATA attribute to store user data in sets,
    from Carlos Falgueras.

14) Add support for 64 bit byteordering changes nf_tables, from Florian
    Westphal.

15) Add conntrack byte/packet counter matching support to nf_tables,
    also from Florian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-08 20:53:16 -05:00
David S. Miller
250fbf129e Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2016-01-08

Here's one more bluetooth-next pull request for the 4.5 kernel:

 - Support for CRC check and promiscuous mode for CC2520
 - Fixes to btmrvl driver
 - New ACPI IDs for hci_bcm driver
 - Limited Discovery support for the Bluetooth mgmt interface
 - Minor other cleanups here and there

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-08 13:17:31 -05:00
Florian Westphal
48f66c905a netfilter: nft_ct: add byte/packet counter support
If the accounting extension isn't present, we'll return a counter
value of 0.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-08 14:44:09 +01:00
Florian Westphal
ce1e7989d9 netfilter: nft_byteorder: provide 64bit le/be conversion
Needed to convert the (64bit) conntrack counters to BE ordering.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-08 13:32:11 +01:00
Carlos Falgueras García
e6d8ecac9e netfilter: nf_tables: Add new attributes into nft_set to store user data.
User data is stored at after 'nft_set_ops' private data into 'data[]'
flexible array. The field 'udata' points to user data and 'udlen' stores
its length.

Add new flag NFTA_SET_USERDATA.

Signed-off-by: Carlos Falgueras García <carlosfg@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-08 13:25:08 +01:00
Ken-ichirou MATSUZAWA
eb075954e9 netfilter: nfnetlink_log: just returns error for unknown command
This patch stops processing options for unknown command.

Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-08 13:25:07 +01:00
Ken-ichirou MATSUZAWA
71b2e5f5ca netfilter: nfnetlink_queue: autoload nf_conntrack_netlink module NFQA_CFG_F_CONNTRACK config flag
This patch enables to load nf_conntrack_netlink module if
NFQA_CFG_F_CONNTRACK config flag is specified.

Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-08 13:25:06 +01:00
Ken-ichirou MATSUZAWA
21c3c971d1 netfilter: nfnetlink_queue: just returns error for unknown command
This patch stops processing options for unknown command.

Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-08 13:25:05 +01:00
Ken-ichirou MATSUZAWA
17bc6b4884 netfilter: nfnetlink_queue: don't handle options after unbind
This patch stops processing after destroying a queue instance.

Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-08 13:25:04 +01:00
Ken-ichirou MATSUZAWA
60d2c7f9ab netfilter: nfnetlink_queue: validate dependencies to avoid breaking atomicity
Check that dependencies are fulfilled before updating the queue
instance, otherwise we can leave things in intermediate state on errors
in nfqnl_recv_config().

Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-08 13:25:03 +01:00
Andrew Lunn
0071f56e46 dsa: Register netdev before phy
When the phy is connected, an info message is printed. If the netdev
it is attached to has not been registered yet, the name
'uninitialised' in the output. By registering the netdev first, then
connecting they phy, we can avoid this.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-07 14:31:26 -05:00
Andrew Lunn
7f854420fb phy: Add API for {un}registering an mdio device to a bus.
Rather than have drivers directly manipulate the mii_bus structure,
provide and API for registering and unregistering devices on an MDIO
bus, and performing lookups.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-07 14:31:26 -05:00
Andrew Lunn
e5a03bfd87 phy: Add an mdio_device structure
Not all devices attached to an MDIO bus are phys. So add an
mdio_device structure to represent the generic parts of an mdio
device, and place this structure into the phy_device.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-07 14:31:26 -05:00
Andrew Lunn
2220943a21 phy: Centralise print about attached phy
Many Ethernet drivers contain the same netdev_info() print statement
about the attached phy. Move it into the phy device code. Additionally
add a varargs function which can be used to append additional
information.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-07 14:31:25 -05:00
David S. Miller
9e0efaf6b4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-01-06 22:54:18 -05:00
Yuchung Cheng
8b8a321ff7 tcp: fix zero cwnd in tcp_cwnd_reduction
Patch 3759824da8 ("tcp: PRR uses CRB mode by default and SS mode
conditionally") introduced a bug that cwnd may become 0 when both
inflight and sndcnt are 0 (cwnd = inflight + sndcnt). This may lead
to a div-by-zero if the connection starts another cwnd reduction
phase by setting tp->prior_cwnd to the current cwnd (0) in
tcp_init_cwnd_reduction().

To prevent this we skip PRR operation when nothing is acked or
sacked. Then cwnd must be positive in all cases as long as ssthresh
is positive:

1) The proportional reduction mode
   inflight > ssthresh > 0

2) The reduction bound mode
  a) inflight == ssthresh > 0

  b) inflight < ssthresh
     sndcnt > 0 since newly_acked_sacked > 0 and inflight < ssthresh

Therefore in all cases inflight and sndcnt can not both be 0.
We check invalid tp->prior_cwnd to avoid potential div0 bugs.

In reality this bug is triggered only with a sequence of less common
events.  For example, the connection is terminating an ECN-triggered
cwnd reduction with an inflight 0, then it receives reordered/old
ACKs or DSACKs from prior transmission (which acks nothing). Or the
connection is in fast recovery stage that marks everything lost,
but fails to retransmit due to local issues, then receives data
packets from other end which acks nothing.

Fixes: 3759824da8 ("tcp: PRR uses CRB mode by default and SS mode conditionally")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 16:39:56 -05:00
David S. Miller
c7f5d10549 net: Add eth_platform_get_mac_address() helper.
A repeating pattern in drivers has become to use OF node information
and, if not found, platform specific host information to extract the
ethernet address for a given device.

Currently this is done with a call to of_get_mac_address() and then
some ifdef'd stuff for SPARC.

Consolidate this into a portable routine, and provide the
arch_get_platform_mac_address() weak function hook for all
architectures to implement if they want.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 16:31:56 -05:00
Francesco Ruggeri
07a5d38453 net: possible use after free in dst_release
dst_release should not access dst->flags after decrementing
__refcnt to 0. The dst_entry may be in dst_busy_list and
dst_gc_task may dst_destroy it before dst_release gets a chance
to access dst->flags.

Fixes: d69bbf88c8 ("net: fix a race in dst_release()")
Fixes: 27b75c95f1 ("net: avoid RCU for NOCACHE dst")
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 15:00:27 -05:00
Elad Raz
404cdbf089 bridge: add vlan filtering change for new bridged device
Notifying hardware about newly bridged port vlan-aware changes.

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 14:42:41 -05:00
Elad Raz
6b72a77020 bridge: add vlan filtering change notification
Notifying hardware about bridge vlan-aware changes.

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 14:42:40 -05:00
Elad Raz
08474cc1e6 bridge: Propagate vlan add failure to user
Disallow adding interfaces to a bridge when vlan filtering operation
failed. Send the failure code to the user.

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 14:42:40 -05:00
Johannes Berg
787b306cf3 Bluetooth: avoid rebuilding hci_sock all the time
Instead, allow using string formatting with send_monitor_note()
and access init_utsname().

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-01-06 16:36:44 +01:00
John Fastabend
73c20a8b72 net: sched: fix missing free per cpu on qstats
When a qdisc is using per cpu stats (currently just the ingress
qdisc) only the bstats are being freed. This also free's the qstats.

Fixes: b0ab6f9275 ("net: sched: enable per cpu qstats")
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 01:40:21 -05:00
Craig Gallek
00ce3a15d8 soreuseport: change consume_skb to kfree_skb in error case
Fixes: 538950a1b7 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 01:30:27 -05:00
Craig Gallek
1134158ba3 soreuseport: pass skb to secondary UDP socket lookup
This socket-lookup path did not pass along the skb in question
in my original BPF-based socket selection patch.  The skb in the
udpN_lib_lookup2 path can be used for BPF-based socket selection just
like it is in the 'traditional' udpN_lib_lookup path.

udpN_lib_lookup2 kicks in when there are greater than 10 sockets in
the same hlist slot.  Coincidentally, I chose 10 sockets per
reuseport group in my functional test, so the lookup2 path was not
excersised. This adds an additional set of tests with 20 sockets.

Fixes: 538950a1b7 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
Fixes: 3ca8e40299 ("soreuseport: BPF selection functional test")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06 01:28:04 -05:00
Florian Westphal
a72a5e2d34 inet: kill unused skb_free op
The only user was removed in commit
029f7f3b87 ("netfilter: ipv6: nf_defrag: avoid/free clone operations").

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 22:25:57 -05:00
Hannes Frederic Sowa
ff62198553 bridge: Only call /sbin/bridge-stp for the initial network namespace
[I stole this patch from Eric Biederman. He wrote:]

> There is no defined mechanism to pass network namespace information
> into /sbin/bridge-stp therefore don't even try to invoke it except
> for bridge devices in the initial network namespace.
>
> It is possible for unprivileged users to cause /sbin/bridge-stp to be
> invoked for any network device name which if /sbin/bridge-stp does not
> guard against unreasonable arguments or being invoked twice on the
> same network device could cause problems.

[Hannes: changed patch using netns_eq]

Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 16:46:17 -05:00
Xin Long
c79c066691 sctp: remove the local_bh_disable/enable in sctp_endpoint_lookup_assoc
sctp_endpoint_lookup_assoc is called in the protection of sock lock
there is no need to call local_bh_disable in this function. so remove
them.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 12:24:02 -05:00
Xin Long
b5eff71283 sctp: drop the old assoc hashtable of sctp
transport hashtable will replace the association hashtable,
so association hashtable is not used in sctp any more, so
drop the codes about that.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 12:24:01 -05:00
Xin Long
39f66a7dce sctp: apply rhashtable api to sctp procfs
Traversal the transport rhashtable, get the association only once through
the condition assoc->peer.primary_path != transport.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 12:24:01 -05:00
Xin Long
4f00878126 sctp: apply rhashtable api to send/recv path
apply lookup apis to two functions, for __sctp_endpoint_lookup_assoc
and __sctp_lookup_association, it's invoked in the protection of sock
lock, it will be safe, but sctp_lookup_association need to call
rcu_read_lock() and to detect the t->dead to protect it.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 12:24:01 -05:00
Xin Long
d6c0256a60 sctp: add the rhashtable apis for sctp global transport hashtable
tranport hashtbale will replace the association hashtable to do the
lookup for transport, and then get association by t->assoc, rhashtable
apis will be used because of it's resizable, scalable and using rcu.

lport + rport + paddr will be the base hashkey to locate the chain,
with net to protect one netns from another, then plus the laddr to
compare to get the target.

this patch will provider the lookup functions:
- sctp_epaddr_lookup_transport
- sctp_addrs_lookup_transport

hash/unhash functions:
- sctp_hash_transport
- sctp_unhash_transport

init/destroy functions:
- sctp_transport_hashtable_init
- sctp_transport_hashtable_destroy

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 12:24:00 -05:00
Johan Hedberg
78b781ca0d Bluetooth: Add support for Start Limited Discovery command
This patch implements the mgmt Start Limited Discovery command. Most
of existing Start Discovery code is reused since the only difference
is the presence of a 'limited' flag as part of the discovery state.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-01-05 17:02:50 +01:00
Johan Hedberg
0d3b7f64c8 Bluetooth: Change eir_has_data_type() to more generic eir_get_data()
To make the EIR parsing helper more general purpose, make it return
the found data and its length rather than just saying whether the data
was present or not.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-01-05 17:02:49 +01:00
Rainer Weikusat
c845acb324 af_unix: Fix splice-bind deadlock
On 2015/11/06, Dmitry Vyukov reported a deadlock involving the splice
system call and AF_UNIX sockets,

http://lists.openwall.net/netdev/2015/11/06/24

The situation was analyzed as

(a while ago) A: socketpair()
B: splice() from a pipe to /mnt/regular_file
	does sb_start_write() on /mnt
C: try to freeze /mnt
	wait for B to finish with /mnt
A: bind() try to bind our socket to /mnt/new_socket_name
	lock our socket, see it not bound yet
	decide that it needs to create something in /mnt
	try to do sb_start_write() on /mnt, block (it's
	waiting for C).
D: splice() from the same pipe to our socket
	lock the pipe, see that socket is connected
	try to lock the socket, block waiting for A
B:	get around to actually feeding a chunk from
	pipe to file, try to lock the pipe.  Deadlock.

on 2015/11/10 by Al Viro,

http://lists.openwall.net/netdev/2015/11/10/4

The patch fixes this by removing the kern_path_create related code from
unix_mknod and executing it as part of unix_bind prior acquiring the
readlock of the socket in question. This means that A (as used above)
will sb_start_write on /mnt before it acquires the readlock, hence, it
won't indirectly block B which first did a sb_start_write and then
waited for a thread trying to acquire the readlock. Consequently, A
being blocked by C waiting for B won't cause a deadlock anymore
(effectively, both A and B acquire two locks in opposite order in the
situation described above).

Dmitry Vyukov(<dvyukov@google.com>) tested the original patch.

Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 23:22:49 -05:00
David Ahern
b5bdacf3bb net: Propagate lookup failure in l3mdev_get_saddr to caller
Commands run in a vrf context are not failing as expected on a route lookup:
    root@kenny:~# ip ro ls table vrf-red
    unreachable default

    root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
    ping: Warning: source address might be selected on device other than vrf-red.
    PING 10.100.1.254 (10.100.1.254) from 0.0.0.0 vrf-red: 56(84) bytes of data.

    --- 10.100.1.254 ping statistics ---
    2 packets transmitted, 0 received, 100% packet loss, time 999ms

Since the vrf table does not have a route for 10.100.1.254 the ping
should have failed. The saddr lookup causes a full VRF table lookup.
Propogating a lookup failure to the user allows the command to fail as
expected:

    root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
    connect: No route to host

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 22:58:30 -05:00
Craig Gallek
538950a1b7 soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF
Expose socket options for setting a classic or extended BPF program
for use when selecting sockets in an SO_REUSEPORT group.  These options
can be used on the first socket to belong to a group before bind or
on any socket in the group after bind.

This change includes refactoring of the existing sk_filter code to
allow reuse of the existing BPF filter validation checks.

Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 22:49:59 -05:00
Craig Gallek
e32ea7e747 soreuseport: fast reuseport UDP socket selection
Include a struct sock_reuseport instance when a UDP socket binds to
a specific address for the first time with the reuseport flag set.
When selecting a socket for an incoming UDP packet, use the information
available in sock_reuseport if present.

This required adding an additional field to the UDP source address
equality function to differentiate between exact and wildcard matches.
The original use case allowed wildcard matches when checking for
existing port uses during bind.  The new use case of adding a socket
to a reuseport group requires exact address matching.

Performance test (using a machine with 2 CPU sockets and a total of
48 cores):  Create reuseport groups of varying size.  Use one socket
from this group per user thread (pinning each thread to a different
core) calling recvmmsg in a tight loop.  Record number of messages
received per second while saturating a 10G link.
  10 sockets: 18% increase (~2.8M -> 3.3M pkts/s)
  20 sockets: 14% increase (~2.9M -> 3.3M pkts/s)
  40 sockets: 13% increase (~3.0M -> 3.4M pkts/s)

This work is based off a similar implementation written by
Ying Cai <ycai@google.com> for implementing policy-based reuseport
selection.

Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 22:49:58 -05:00
Craig Gallek
ef456144da soreuseport: define reuseport groups
struct sock_reuseport is an optional shared structure referenced by each
socket belonging to a reuseport group.  When a socket is bound to an
address/port not yet in use and the reuseport flag has been set, the
structure will be allocated and attached to the newly bound socket.
When subsequent calls to bind are made for the same address/port, the
shared structure will be updated to include the new socket and the
newly bound socket will reference the group structure.

Usually, when an incoming packet was destined for a reuseport group,
all sockets in the same group needed to be considered before a
dispatching decision was made.  With this structure, an appropriate
socket can be found after looking up just one socket in the group.

This shared structure will also allow for more complicated decisions to
be made when selecting a socket (eg a BPF filter).

This work is based off a similar implementation written by
Ying Cai <ycai@google.com> for implementing policy-based reuseport
selection.

Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 22:49:58 -05:00
David S. Miller
15ab90f400 NFC 4.5 pull request
This is the first NFC pull request for 4.5 and it brings:
 
 - A new driver for the STMicroelectronics ST95HF NFC chipset.
   The ST95HF is an NFC digital transceiver with an embedded analog
   front-end and as such relies on the Linux NFC digital
   implementation. This is the 3rd user of the NFC digital stack.
 
 - ACPI support for the ST st-nci and st21nfca drivers.
 
 - A small improvement for the nfcsim driver, as we can now tune
   the Rx delay through sysfs.
 
 - A bunch of minor cleanups and small fixes from Christophe Ricard,
   for a few drivers and the NFC core code.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWgsv0AAoJEIqAPN1PVmxKvukP/3eJwA+chAUF89/fqwqFTaJN
 fffLoOxx2OBIbTXD2VV36yw4bAo9tbKDAYBZiot3Ig7Kg0SeCJ5oPA9xCVnWPrEY
 hxFAldvl+lWQs8nrUOgUZItiFBeUPdfW9YX/yKhUZVc+602nUG/e/+6x8B5MhIce
 SAfgCyd0c16DApltP7sw1muyZMvsO6Ow6dyNzDUVYZuabvEhe3SLSj9KFJi7Thsp
 h41Iv+bwPLhwF4RXGA6rei/gdEDSMRohprdj3uTDiTarGW+OpcAO0zWACS5m4eR0
 zF19+HjGPUk/LpFRaU31xX0ZQQjOTmmfsOt4FBb3P7oJx47egycsadLYPexeG7nj
 ruyS6ezlRX1I/tZsnyLNJK92mK5TXYLz2uJ8r2ii/BgPNE+AErB3zKCC+EjXzWhh
 AvClGu5b88WJLxoq3I3l5evPwGhebGZ8N/1uiFsHOxvzKVLgxwOmNLRGN4XXxB2i
 UbIHgBb6smsu/l+3q9R83kfoMaoWnr+OUIi2QPQVDt/K7t1LfsCuIhzcGSgo1VuW
 fGlA1iu+CNDknofeCl4JDo2UXAETO4gdKWw87GXeUcbbraLUczZeO7FFLZqxbMYc
 OCaPYshmVFeZRypYdRWDHw67ivj0/h+9iq4PP1XOROkRFH746dD/p4yamJwVi20B
 samZ8VPwzgH3/ohQJyX3
 =VFUH
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz says:

====================
NFC 4.5 pull request

This is the first NFC pull request for 4.5 and it brings:

- A new driver for the STMicroelectronics ST95HF NFC chipset.
  The ST95HF is an NFC digital transceiver with an embedded analog
  front-end and as such relies on the Linux NFC digital
  implementation. This is the 3rd user of the NFC digital stack.

- ACPI support for the ST st-nci and st21nfca drivers.

- A small improvement for the nfcsim driver, as we can now tune
  the Rx delay through sysfs.

- A bunch of minor cleanups and small fixes from Christophe Ricard,
  for a few drivers and the NFC core code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 21:48:15 -05:00
Eric Dumazet
197c949e77 udp: properly support MSG_PEEK with truncated buffers
Backport of this upstream commit into stable kernels :
89c22d8c3b ("net: Fix skb csum races when peeking")
exposed a bug in udp stack vs MSG_PEEK support, when user provides
a buffer smaller than skb payload.

In this case,
skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr),
                                 msg->msg_iov);
returns -EFAULT.

This bug does not happen in upstream kernels since Al Viro did a great
job to replace this into :
skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
This variant is safe vs short buffers.

For the time being, instead reverting Herbert Xu patch and add back
skb->ip_summed invalid changes, simply store the result of
udp_lib_checksum_complete() so that we avoid computing the checksum a
second time, and avoid the problematic
skb_copy_and_csum_datagram_iovec() call.

This patch can be applied on recent kernels as it avoids a double
checksumming, then backported to stable kernels as a bug fix.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 17:23:36 -05:00
Guillaume Nault
98f40b3e22 l2tp: rely on ppp layer for skb scrubbing
Since 79c441ae50 ("ppp: implement x-netns support"), the PPP layer
calls skb_scrub_packet() whenever the skb is received on the PPP
device. Manually resetting packet meta-data in the L2TP layer is thus
redundant.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04 16:45:24 -05:00
Julia Lawall
29663b0cc1 mac802154: constify ieee802154_llsec_ops structure
The ieee802154_llsec_ops structure is never modified, so declare it as
const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-01-04 20:40:41 +01:00
Heikki Krogerus
d3d2072540 Bluetooth: hci_bcm: move all Broadcom ACPI IDs to BCM HCI driver
The IDs should all be for Broadcom BCM43241 module, and
hci_bcm is now the proper driver for them. This removes one
of two different ways of handling PM with the module.

Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-01-04 19:22:05 +01:00
Pablo Neira Ayuso
ad6d950393 netfilter: nf_ct_helper: define pr_fmt()
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-04 17:48:51 +01:00