linux

iv/linux

Author	SHA1	Message	Date
Eric W. Biederman	6493517eae	tcp_metrics: panic when tcp_metrics_init fails. There is not a practical way to cleanup during boot so just panic if there is a problem initializing tcp_metrics. That will at least give us a clear place to start debugging if something does go wrong. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-13 01:57:07 -04:00
Eric W. Biederman	76fecd8275	mpls: In mpls_egress verify the packet length. Reobert Shearman noticed that mpls_egress is failing to verify that the bytes to be examined are in fact present in the packet before mpls_egress reads those bytes. As suggested by David Miller reduce this to a single pskb_may_pull call so that we don't do unnecessary work in the fast path. Reported-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 23:05:04 -04:00
Eric Dumazet	3f66b083a5	inet: introduce ireq_family Before inserting request socks into general hash table, fill their socket family. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 22:58:13 -04:00
Eric Dumazet	d4f06873b6	inet: get_openreq4() & get_openreq6() do not need listener ireq->ir_num contains local port, use it. Also, get_openreq4() dumping listen_sk->refcnt makes litle sense. inet_diag_fill_req() can also use ireq->ir_num Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 22:58:13 -04:00
Eric Dumazet	41b822c59e	inet: prepare sock_edemux() & sock_gen_put() for new SYN_RECV state sock_edemux() & sock_gen_put() should be ready to cope with request socks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 22:58:13 -04:00
Eric Dumazet	0159dfd3d7	net: add req_prot_cleanup() & req_prot_init() helpers Make proto_register() & proto_unregister() a bit nicer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 22:58:13 -04:00
Eric Dumazet	bd337c581b	ipv6: add missing ireq_net & ir_cookie initializations I forgot to update dccp_v6_conn_request() & cookie_v6_check(). They both need to set ireq->ireq_net and ireq->ir_cookie Lets clear ireq->ir_cookie in inet_reqsk_alloc() Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `33cf7c90fe` ("net: add real socket cookies") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 22:58:12 -04:00
Daniel Borkmann	54720df130	cls_bpf: do eBPF invocation under non-bh RCU lock variant for maps Currently, it is possible in cls_bpf to access eBPF maps only under rcu_read_lock_bh() variants: while on ingress side, that is, handle_ing(), the classifier would be called from __netif_receive_skb_core() under rcu_read_lock(); on egress side, however, it's rcu_read_lock_bh() via __dev_queue_xmit(). This rcu/rcu_bh mix doesn't work together with eBPF maps as they require soley to be called under rcu_read_lock(). eBPF maps could also be shared among various other eBPF programs (possibly even with other eBPF program types, f.e. tracing) and user space processes, so any context is assumed. Therefore, a possible fix for cls_bpf is to wrap/nest eBPF program invocation under non-bh RCU lock variant. Fixes: `e2e9b6541d` ("cls_bpf: add initial eBPF support for programmable classifiers") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 18:33:15 -04:00
Alexander Duyck	0b65bd97ba	fib_trie: Provide a deterministic order for fib_alias w/ tables merged This change makes it so that we should always have a deterministic ordering for the main and local aliases within the merged table when two leaves overlap. So for example if we have a leaf with a key of 192.168.254.0. If we previously added two aliases with a prefix length of 24 from both local and main the first entry would be first and the second would be second. When I was coding this I had added a WARN_ON should such a situation occur as I wasn't sure how likely it would be. However this WARN_ON has been triggered so this is something that should be addressed. With this patch the ordering of the aliases is as follows. First they are sorted on prefix length, then on their table ID, then tos, and finally priority. This way what we end up doing is essentially interleaving the two tables on what used to be leaf_info structure boundaries. Fixes: `0ddcf43d5` ("ipv4: FIB Local/MAIN table collapse") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 18:26:51 -04:00
Alexander Duyck	3c9e9f7320	fib_trie: Avoid NULL pointer if local table is not allocated The function fib_unmerge assumed the local table had already been allocated. If that is not the case however when custom rules are applied then this can result in a NULL pointer dereference. In order to prevent this we must check the value of the local table pointer and if it is NULL simply return 0 as there is no local table to separate from the main. Fixes: `0ddcf43d5` ("ipv4: FIB Local/MAIN table collapse") Reported-by: Madhu Challa <challa@noironetworks.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 18:26:51 -04:00
Eric W. Biederman	0c5c9fb551	net: Introduce possible_net_t Having to say > #ifdef CONFIG_NET_NS > struct net net; > #endif in structures is a little bit wordy and a little bit error prone. Instead it is possible to say: > typedef struct { > #ifdef CONFIG_NET_NS > struct net net; > #endif > } possible_net_t; And then in a header say: > possible_net_t net; Which is cleaner and easier to use and easier to test, as the possible_net_t is always there no matter what the compile options. Further this allows read_pnet and write_pnet to be functions in all cases which is better at catching typos. This change adds possible_net_t, updates the definitions of read_pnet and write_pnet, updates optional struct net * variables that write_pnet uses on to have the type possible_net_t, and finally fixes up the b0rked users of read_pnet and write_pnet. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 14:39:40 -04:00
Eric W. Biederman	efd7ef1c19	net: Kill hold_net release_net hold_net and release_net were an idea that turned out to be useless. The code has been disabled since 2008. Kill the code it is long past due. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 14:39:40 -04:00
Marcel Holtmann	983f9814c0	Bluetooth: Remove two else branches that are not needed The SMP code contains two else branches that are not needed since the successful test will actually leave the function. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-12 09:00:48 +02:00
Eric Dumazet	d77c555d32	net: fix CONFIG_NET_NS=n compilation I forgot to use write_pnet() in three locations. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `33cf7c90fe` ("net: add real socket cookies") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 23:28:49 -04:00
Lubomir Rintel	c78ba6d64c	ipv6: expose RFC4191 route preference via rtnetlink This makes it possible to retain the route preference when RAs are handled in userspace. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 23:28:09 -04:00
Simon Horman	ac70c05b6f	switchdev: correct spelling of notifier in comments Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 23:06:39 -04:00
Eric Dumazet	33cf7c90fe	net: add real socket cookies A long standing problem in netlink socket dumps is the use of kernel socket addresses as cookies. 1) It is a security concern. 2) Sockets can be reused quite quickly, so there is no guarantee a cookie is used once and identify a flow. 3) request sock, establish sock, and timewait socks for a given flow have different cookies. Part of our effort to bring better TCP statistics requires to switch to a different allocator. In this patch, I chose to use a per network namespace 64bit generator, and to use it only in the case a socket needs to be dumped to netlink. (This might be refined later if needed) Note that I tried to carry cookies from request sock, to establish sock, then timewait sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eric Salo <salo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 21:55:28 -04:00
Alexander Duyck	654eff4516	fib_trie: Only display main table in /proc/net/route When we merged the tries for local and main I had overlooked the iterator for /proc/net/route. As a result it was outputting both local and main when the two tries were merged. This patch resolves that by only providing output for aliases that are actually in the main trie. As a result we should go back to the original behavior which I assume will be necessary to maintain legacy support. Fixes: `0ddcf43d5` ("ipv4: FIB Local/MAIN table collapse") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 21:24:32 -04:00
Florian Fainelli	cd28a1a9ba	net: dsa: fully divert PHY reads/writes if requested In case a PHY is found via Device Tree, and is also flagged by the switch driver as needing indirect reads/writes using the switch driver implemented MDIO bus, make sure that we bind this PHY to the slave MII bus in order for this to happen. Without this, we would succeed in having the PHY driver probe()'s function to use slave MII bus read/write functions, because this is done during dsa_slave_mii_init(), but past that point, the PHY driver would not go through these diverted reads and writes. Fixes: `0d8bcdd383` ("net: dsa: allow for more complex PHY setups") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 17:56:29 -04:00
Florian Fainelli	c305c1651c	net: dsa: move PHY setup on DSA MII bus to its own function In preparation for dealing with indirect reads and writes towards certain PHY devices, move the code which deals with binding the PHY device to the slave MII bus created by DSA to its own function: dsa_slave_phy_connect(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 17:56:28 -04:00
Alexander Duyck	61f0d861fc	fib_trie: Fix uninitialized variable warning The 0-day kernel test infrastructure reported a use of uninitialized variable warning for local_table due to the fact that the local and main allocations had been swapped from the original setup. This change corrects that by making it so that we free the main table if the local table allocation fails. Fixes: `0ddcf43d5` ("ipv4: FIB Local/MAIN table collapse") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 17:33:44 -04:00
Sabrina Dubroca	6dede75b7e	fib_trie: call fib_table_flush_external under RTNL Move rtnl_lock() before the call to fib4_rules_exit so that fib_table_flush_external is called under RTNL. Fixes: `104616e74e` ("switchdev: don't support custom ip rules, for now") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 16:46:26 -04:00
Robert Shearman	8a08919f43	mpls: Allow mpls_gso and mpls_router to be built as modules CONFIG_MPLS=m doesn't result in a kernel module being built because it applies to the net/mpls directory, rather than to .o files. So revert the MPLS menuitem to being a boolean and make MPLS_GSO and MPLS_ROUTING tristates to allow mpls_gso and mpls_router modules to be produced as desired. Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 16:38:54 -04:00
Alexander Duyck	0ddcf43d5d	ipv4: FIB Local/MAIN table collapse This patch is meant to collapse local and main into one by converting tb_data from an array to a pointer. Doing this allows us to point the local table into the main while maintaining the same variables in the table. As such the tb_data was converted from an array to a pointer, and a new array called data is added in order to still provide an object for tb_data to point to. In order to track the origin of the fib aliases a tb_id value was added in a hole that existed on 64b systems. Using this we can also reverse the merge in the event that custom FIB rules are enabled. With this patch I am seeing an improvement of 20ns to 30ns for routing lookups as long as custom rules are not enabled, with custom rules enabled we fall back to split tables and the original behavior. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 16:22:14 -04:00
Johan Hedberg	4ba9faf35f	Bluetooth: Check for matching IRK when looking for paired LE devices If we're given an RPA when checking whether we're paired or not, we should consult the local RPA storage whether there's a matching IRK. This we we ensure that hci_bdaddr_is_paired() gives the right result even when trying to pair a second time with the same device with an RPA. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-11 15:54:23 +01:00
Johan Hedberg	87c8b28d29	Bluetooth: Fix missing rcu_read_unlock() in hci_bdaddr_is_paired() When finding a matching LTK the rcu_read_unlock() function was failing to release the RCU read lock. This patch adds the missing call to rcu_reaD_unlock(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-11 08:52:32 +01:00
Marcel Holtmann	beb1c21b8e	Bluetooth: Increment management interface revision This patch increments the management interface revision due to introduction of new static address setting and fixes for the fast connectable feature. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-11 09:28:41 +02:00
Jon Paul Maloy	169bf9121b	tipc: ensure that idle links are deleted when a bearer is disabled commit `afaa3f65f6` (tipc: purge links when bearer is disabled) was an attempt to resolve a problem that turned out to have a more profound reason. When we disable a bearer, we delete all its pertaining links if there is no other bearer to perform failover to, or if the module is shutting down. In case there are dual bearers, we wait with deleting links until the failover procedure is finished. However, this misses the case when a link on the removed bearer was already down, so that there will be no failover procedure to finish the link delete. This causes confusion if a new bearer is added to replace the removed one, and also entails a small memory leak. This commit takes the current state of the link into account when deciding when to delete it, and also reverses the above-mentioned commit. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 18:37:36 -04:00
Alexander Duyck	ddb4b9a132	fib_trie: Address possible NULL pointer dereference in resize If the inflate call failed it would return NULL. As a result tp would be set to NULL and cause use to trigger a NULL pointer dereference in should_halve if the inflate failed on the first attempt. In order to prevent this we should decrement max_work before we actually attempt to inflate as this will force us to exit before attempting to halve a node we should have inflated. In order to keep things symmetric between inflate and halve I went ahead and also moved the decrement of max_work for the halve case as well so we take care of that before we actually attempt to halve the tnode. Fixes: `88bae714` ("fib_trie: Add key vector to root, return parent key_vector in resize") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 18:36:56 -04:00
Johan Hedberg	55e76b3898	Bluetooth: Add 'Already Paired' error for Pair Device command To make the behavior predictable when attempting to pair with a device for which we already have a Link Key or Long Term Key, this patch adds a new 'Already Paired' error which gets sent in such a scenario. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-10 21:42:05 +01:00
Alexander Duyck	3ec320dd5c	fib_trie: Correctly handle case of key == 0 in leaf_walk_rcu In the case of a trie that had no tnodes with a key of 0 the initial look-up would fail resulting in an out-of-bounds cindex on the first tnode. This resulted in an entire trie being skipped. In order resolve this I have updated the cindex logic in the initial look-up so that if the key is zero we will always traverse the child zero path. Fixes: `8be33e95` ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf") Reported-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 16:13:55 -04:00
Johan Hedberg	406ef2a67b	Bluetooth: Make Fast Connectable available while powered off To maximize the usability of the Fast Connectable feature we should make it possible to set (or unset) it at any given moment. This means removing the dependency on the 'connectable' setting as well as the 'powered' setting. The former makes also sense since page scan may get enabled through add_device even if 'connectable' is false. To keep the setting available over power cycles its flag also needs to be removed from the flags that are cleared upon HCI_Reset. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-10 19:37:02 +01:00
Eric Dumazet	34160ea3f9	inet_diag: add const to inet_diag_req_v2 diag dumpers should not modify the request. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 13:45:28 -04:00
Eric Dumazet	e31c5e0e48	inet_diag: cleanups Remove all inline keywords, add some const, and cleanup style. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 13:45:28 -04:00
Eric Dumazet	491da2a477	net: constify sock_diag_check_cookie() sock_diag_check_cookie() second parameter is constant Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 13:45:28 -04:00
David S. Miller	515fb5c317	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter fixes for net-next The following batch contains a couple of fixes to address some fallout from the previous pull request, they are: 1) Address link problems in the bridge code after `e5de75b`. Fix it by using rcu hook to address to avoid ifdef pollution and hard dependency between bridge and br_netfilter. 2) Address sparse warnings in the netfilter reject code, patch from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-10 12:48:47 -04:00
Pablo Neira Ayuso	1a4ba64d16	netfilter: bridge: use rcu hook to resolve br_netfilter dependency `e5de75b` ("netfilter: bridge: move DNAT helper to br_netfilter") results in the following link problem: net/bridge/br_device.c:29: undefined reference to `br_nf_prerouting_finish_bridge` Moreover it creates a hard dependency between br_netfilter and the bridge core, which is what we've been trying to avoid so far. Resolve this problem by using a hook structure so we reduce #ifdef pollution and keep bridge netfilter specific code under br_netfilter.c which was the original intention. Reported-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-10 15:03:02 +01:00
Florian Westphal	a03a8dbe20	netfilter: fix sparse warnings in reject handling make C=1 CF=-D__CHECK_ENDIAN__ shows following: net/bridge/netfilter/nft_reject_bridge.c:65:50: warning: incorrect type in argument 3 (different base types) net/bridge/netfilter/nft_reject_bridge.c:65:50: expected restricted __be16 [usertype] protocol [..] net/bridge/netfilter/nft_reject_bridge.c:102:37: warning: cast from restricted __be16 net/bridge/netfilter/nft_reject_bridge.c:102:37: warning: incorrect type in argument 1 (different base types) [..] net/bridge/netfilter/nft_reject_bridge.c:121:50: warning: incorrect type in argument 3 (different base types) [..] net/bridge/netfilter/nft_reject_bridge.c:168:52: warning: incorrect type in argument 3 (different base types) [..] net/bridge/netfilter/nft_reject_bridge.c:233:52: warning: incorrect type in argument 3 (different base types) [..] Caused by two (harmless) errors: 1. htons() instead of ntohs() 2. __be16 for protocol in nf_reject_ipXhdr_put API, use u8 instead. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-10 15:01:32 +01:00
Scott Feldman	f8f2147150	switchdev: add netlink flags to IPv4 FIB add op Pass in the netlink flags (NLM_F_*) into switchdev driver for IPv4 FIB add op to allow driver to 1) optimize hardware updates, 2) handle ip route prepend and append commands correctly. Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 23:56:52 -04:00
Florian Fainelli	769a020289	net: dsa: utilize of_find_net_device_by_node Using of_find_device_by_node() restricts the search to platform_device that match the specified device_node pointer. This is not even remotely true for network devices backed by a pci_device for instance. of_find_net_device_by_node() allows us to do a more thorough lookup to find the struct net_device corresponding to a particular device_node pointer. For symetry with the non-OF code path, we hold the net_device pointer in dsa_probe() just like what dev_to_net_dev() does when we call this function. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 23:50:21 -04:00
Florian Fainelli	aa836df958	net: core: add of_find_net_device_by_node() Add a helper function which allows getting the struct net_device pointer associated with a given struct device_node pointer. This is useful for instance for DSA Ethernet devices not backed by a platform_device, but a PCI device. Since we need to access net_class which is not accessible outside of net/core/net-sysfs.c, this helper function is also added here and gated with CONFIG_OF_NET. Network devices initialized with SET_NETDEV_DEV() are also taken into account by checking for dev->parent first and then falling back to checking the device pointer within struct net_device. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 23:50:20 -04:00
David S. Miller	3cef5c5b0b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/cadence/macb.c Overlapping changes in macb driver, mostly fixes and cleanups in 'net' overlapping with the integration of at91_ether into macb in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 23:38:02 -04:00
Linus Torvalds	36bef88380	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) nft_compat accidently truncates ethernet protocol to 8-bits, from Arturo Borrero. 2) Memory leak in ip_vs_proc_conn(), from Julian Anastasov. 3) Don't allow the space required for nftables rules to exceed the maximum value representable in the dlen field. From Patrick McHardy. 4) bcm63xx_enet can accidently leave interrupts permanently disabled due to errors in the NAPI polling exit logic. Fix from Nicolas Schichan. 5) Fix OOPSes triggerable by the ping protocol module, due to missing address family validations etc. From Lorenzo Colitti. 6) Don't use RCU locking in sleepable context in team driver, from Jiri Pirko. 7) xen-netback miscalculates statistic offset pointers when reporting the stats to userspace. From David Vrabel. 8) Fix a leak of up to 256 pages per VIF destroy in xen-netaback, also from David Vrabel. 9) ip_check_defrag() cannot assume that skb_network_offset(), particularly when it is used by the AF_PACKET fanout defrag code. From Alexander Drozdov. 10) gianfar driver doesn't query OF node names properly when trying to determine the number of hw queues available. Fix it to explicitly check for OF nodes named queue-group. From Tobias Waldekranz. 11) MID field in macb driver should be 12 bits, not 16. From Punnaiah Choudary Kalluri. 12) Fix unintentional regression in traceroute due to timestamp socket option changes. Empty ICMP payloads should be allowed in non-timestamp cases. From Willem de Bruijn. 13) When devices are unregistered, we have to get rid of AF_PACKET multicast list entries that point to it via ifindex. Fix from Francesco Ruggeri. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits) tipc: fix bug in link failover handling net: delete stale packet_mclist entries net: macb: constify macb configuration data MAINTAINERS: add Marc Kleine-Budde as co maintainer for CAN networking layer MAINTAINERS: linux-can moved to github can: kvaser_usb: Read all messages in a bulk-in URB buffer can: kvaser_usb: Avoid double free on URB submission failures can: peak_usb: fix missing ctrlmode_ init for every dev can: add missing initialisations in CAN related skbuffs ip: fix error queue empty skb handling bgmac: Clean warning messages tcp: align tcp_xmit_size_goal() on tcp_tso_autosize() net: fec: fix unbalanced clk disable on driver unbind net: macb: Correct the MID field length value net: gianfar: correctly determine the number of queue groups ipv4: ip_check_defrag should not assume that skb_network_offset is zero net: bcmgenet: properly disable password matching net: eth: xgene: fix booting with devicetree bnx2x: Force fundamental reset for EEH recovery xen-netback: refactor xenvif_handle_frag_list() ...	2015-03-09 18:17:21 -07:00
Jon Paul Maloy	e6441bae32	tipc: fix bug in link failover handling In commit `c637c10355` ("tipc: resolve race problem at unicast message reception") we introduced a new mechanism for delivering buffers upwards from link to socket layer. That code contains a bug in how we handle the new link input queue during failover. When a link is reset, some of its users may be blocked because of congestion, and in order to resolve this, we add any pending wakeup pseudo messages to the link's input queue, and deliver them to the socket. This misses the case where the other, remaining link also may have congested users. Currently, the owner node's reference to the remaining link's input queue is unconditionally overwritten by the reset link's input queue. This has the effect that wakeup events from the remaining link may be unduely delayed (but not lost) for a potentially long period. We fix this by adding the pending events from the reset link to the input queue that is currently referenced by the node, whichever one it is. This commit should be applied to both net and net-next. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 16:20:41 -04:00
Francesco Ruggeri	82f17091e6	net: delete stale packet_mclist entries When an interface is deleted from a net namespace the ifindex in the corresponding entries in PF_PACKET sockets' mclists becomes stale. This can create inconsistencies if later an interface with the same ifindex is moved from a different namespace (not that unlikely since ifindexes are per-namespace). In particular we saw problems with dev->promiscuity, resulting in "promiscuity touches roof, set promiscuity failed. promiscuity feature of device might be broken" warnings and EOVERFLOW failures of setsockopt(PACKET_ADD_MEMBERSHIP). This patch deletes the mclist entries for interfaces that are deleted. Since this now causes setsockopt(PACKET_DROP_MEMBERSHIP) to fail with EADDRNOTAVAIL if called after the interface is deleted, also make packet_mc_drop not fail. Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 16:17:43 -04:00
Eric W. Biederman	ddb3b6033c	net: Remove protocol from struct dst_ops After my change to neigh_hh_init to obtain the protocol from the neigh_table there are no more users of protocol in struct dst_ops. Remove the protocol field from dst_ops and all of it's initializers. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 16:06:10 -04:00
David S. Miller	5428aef811	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. Basically, improvements for the packet rejection infrastructure, deprecation of CLUSTERIP, cleanups for nf_tables and some untangling for br_netfilter. More specifically they are: 1) Send packet to reset flow if checksum is valid, from Florian Westphal. 2) Fix nf_tables reject bridge from the input chain, also from Florian. 3) Deprecate the CLUSTERIP target, the cluster match supersedes it in functionality and it's known to have problems. 4) A couple of cleanups for nf_tables rule tracing infrastructure, from Patrick McHardy. 5) Another cleanup to place transaction declarations at the bottom of nf_tables.h, also from Patrick. 6) Consolidate Kconfig dependencies wrt. NF_TABLES. 7) Limit table names to 32 bytes in nf_tables. 8) mac header copying in bridge netfilter is already required when calling ip_fragment(), from Florian Westphal. 9) move nf_bridge_update_protocol() to br_netfilter.c, also from Florian. 10) Small refactor in br_netfilter in the transmission path, again from Florian. 11) Move br_nf_pre_routing_finish_bridge_slow() to br_netfilter. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:58:21 -04:00
Geert Uytterhoeven	26c459a807	mpls: Spelling: s/conceved/conceived/, s/as/a/ Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:51:59 -04:00
Erik Hugne	143fe22f50	tipc: fix inconsistent signal handling regression Commit `9bbb4ecc68` ("tipc: standardize recvmsg routine") changed the sleep/wakeup behaviour for sockets entering recv() or accept(). In this process the order of reporting -EAGAIN/-EINTR was reversed. This caused problems with wrong errno being reported back if the timeout expires. The same problem happens if the socket is nonblocking and recv()/accept() is called when the process have pending signals. If there is no pending data read or connections to accept, -EINTR will be returned instead of -EAGAIN. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reported-by László Benedek <laszlo.benedek@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:42:19 -04:00
Jiri Pirko	f4427bc3e2	switchdev: use gpl variant of symbol export Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:37:39 -04:00
Erik Hugne	4fee6be813	tipc: sparse: fix htons conversion warnings Commit `d0f91938be` ("tipc: add ip/udp media type") introduced some new sparse warnings. Clean them up. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:37:04 -04:00
Cong Wang	1e052be69d	net_sched: destroy proto tp when all filters are gone Kernel automatically creates a tp for each (kind, protocol, priority) tuple, which has handle 0, when we add a new filter, but it still is left there after we remove our own, unless we don't specify the handle (literally means all the filters under the tuple). For example this one is left: # tc filter show dev eth0 filter parent 8001: protocol arp pref 49152 basic The user-space is hard to clean up these for kernel because filters like u32 are organized in a complex way. So kernel is responsible to remove it after all filters are gone. Each type of filter has its own way to store the filters, so each type has to provide its way to check if all filters are gone. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim<jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 15:35:55 -04:00
Pablo Neira Ayuso	e5de75bf88	netfilter: bridge: move DNAT helper to br_netfilter Only one caller, there is no need to keep this in a header. Move it to br_netfilter.c where this belongs to. Based on patch from Florian Westphal. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-09 17:56:07 +01:00
Florian Westphal	7a8d831df5	netfilter: bridge: refactor conditional in br_nf_dev_queue_xmit simpilifies followup patch that re-works brnf ip_fragment handling. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-09 13:22:58 +01:00
Florian Westphal	4a9d2f2008	netfilter: bridge: move nf_bridge_update_protocol to where its used no need to keep it in a header file. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-09 13:21:31 +01:00
Florian Westphal	8bd63cf1a4	bridge: move mac header copying into br_netfilter The mac header only has to be copied back into the skb for fragments generated by ip_fragment(), which only happens for bridge forwarded packets with nf-call-iptables=1 && active nf_defrag. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-09 13:20:48 +01:00
Oliver Hartkopp	969439016d	can: add missing initialisations in CAN related skbuffs When accessing CAN network interfaces with AF_PACKET sockets e.g. by dhclient this can lead to a skb_under_panic due to missing skb initialisations. Add the missing initialisations at the CAN skbuff creation times on driver level (rx path) and in the network layer (tx path). Reported-by: Austin Schuh <austin@peloton-tech.com> Reported-by: Daniel Steer <daniel.steer@mclaren.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2015-03-09 10:22:24 +01:00
Willem de Bruijn	c247f0534c	ip: fix error queue empty skb handling When reading from the error queue, msg_name and msg_control are only populated for some errors. A new exception for empty timestamp skbs added a false positive on icmp errors without payload. `traceroute -M udpconn` only displayed gateways that return payload with the icmp error: the embedded network headers are pulled before sock_queue_err_skb, leaving an skb with skb->len == 0 otherwise. Fix this regression by refining when msg_name and msg_control branches are taken. The solutions for the two fields are independent. msg_name only makes sense for errors that configure serr->port and serr->addr_offset. Test the first instead of skb->len. This also fixes another issue. saddr could hold the wrong data, as serr->addr_offset is not initialized in some code paths, pointing to the start of the network header. It is only valid when serr->port is set (non-zero). msg_control support differs between IPv4 and IPv6. IPv4 only honors requests for ICMP and timestamps with SOF_TIMESTAMPING_OPT_CMSG. The skb->len test can simply be removed, because skb->dev is also tested and never true for empty skbs. IPv6 honors requests for all errors aside from local errors and timestamps on empty skbs. In both cases, make the policy more explicit by moving this logic to a new function that decides whether to process msg_control and that optionally prepares the necessary fields in skb->cb[]. After this change, the IPv4 and IPv6 paths are more similar. The last case is rxrpc. Here, simply refine to only match timestamps. Fixes: `49ca0d8bfa` ("net-timestamp: no-payload option") Reported-by: Jan Niehusmann <jan@gondor.com> Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Changes v1->v2 - fix local origin test inversion in ip6_datagram_support_cmsg - make v4 and v6 code paths more similar by introducing analogous ipv4_datagram_support_cmsg - fix compile bug in rxrpc Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 23:01:54 -04:00
Eric W. Biederman	b79bda3d38	neigh: Use neigh table index for neigh_packet_xmit Remove a little bit of unnecessary work when transmitting a packet with neigh_packet_xmit. Use the neighbour table index not the address family as a parameter. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Eric W. Biederman	7d5f41f276	mpls: Fix the openvswitch select of NET_MPLS_GSO Fix the OPENVSWITCH Kconfig option and old Kconfigs by having OPENVSWITCH select both NET_MPLS_GSO and MPLSO. A Kbuild test robot reported that when NET_MPLS_GSO is selected by OPENVSWITCH the generated .config is broken because MPLS is not selected. Cc: Simon Horman <horms@verge.net.au> Fixes: `cec9166ca4` mpls: Refactor how the mpls module is built Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Eric W. Biederman	aa7da93756	mpls: Correct the ttl decrement. According to RFC3032 section 2.4.2 packets with an outgoing ttl of 0 MUST NOT be forwarded. According to section 2.4.1 an outgoing TTL of 0 comes from an incomming TTL <= 1. Therefore any packets that is received with a ttl <= 1 should not have it's ttl decremented and forwarded. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Eric W. Biederman	0f7bbd5805	mpls: Better error code for unsupported option. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Eric W. Biederman	19d0c341d9	mpls: Cleanup the rcu usage in the code. Sparse was generating a lot of warnings mostly from missing annotations in the code. Add missing annotations and in a few cases tweak the code for performance by moving work before loops. This also fixes a problematic ommision of rcu_assign_pointer and rcu_dereference. Hopefully with complete rcu annotations any new rcu errors will stick out like a sore thumb. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Eric W. Biederman	d865616e18	mpls: Fix the kzalloc argument order in mpls_rt_alloc Blink I got the argument order wrong to kzalloc and the code was working properly when tested. Blink Fix that. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 19:30:06 -04:00
Al Viro	1711fd9add	sunrpc: fix braino in ->poll() POLL_OUT isn't what callers of ->poll() are expecting to see; it's actually __SI_POLL \| 2 and it's a siginfo code, not a poll bitmap bit... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Cc: Bruce Fields <bfields@fieldses.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-03-08 12:53:46 -07:00
Linus Torvalds	bbbce516bb	TTY/Serial fixes for 4.0-rc3 Here are some tty and serial driver fixes for 4.0-rc3. Along with the atime fix that you know about, here are some other serial driver bugfixes as well. Most notable is a wait_until_sent bugfix that was traced back to being around since before 2.6.12 that Johan has fixed up. All have been in linux-next successfully. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlT8RCYACgkQMUfUDdst+yk62QCgycxS4giC2hyRver3dyvaNR6g zYYAn2w0uRndW+AqP4Tls54isRz6owpF =gA2k -----END PGP SIGNATURE----- Merge tag 'tty-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some tty and serial driver fixes for 4.0-rc3. Along with the atime fix that you know about, here are some other serial driver bugfixes as well. Most notable is a wait_until_sent bugfix that was traced back to being around since before 2.6.12 that Johan has fixed up. All have been in linux-next successfully" * tag 'tty-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: TTY: fix tty_wait_until_sent maximum timeout TTY: fix tty_wait_until_sent on 64-bit machines USB: serial: fix infinite wait_until_sent timeout TTY: bfin_jtag_comm: remove incorrect wait_until_sent operation net: irda: fix wait_until_sent poll timeout serial: uapi: Declare all userspace-visible io types serial: core: Fix iotype userspace breakage serial: sprd: Fix missing spin_unlock in sprd_handle_irq() console: Fix console name size mismatch tty: fix up atime/mtime mess, take four serial: 8250_dw: Fix get_mctrl behaviour serial:8250:8250_pci: delete unneeded quirk entries serial:8250:8250_pci: fix redundant entry report for WCH_CH352_2S Change email address for 8250_pci serial: 8250: Revert "tty: serial: 8250_core: read only RX if there is something in the FIFO" Revert "tty/serial: of_serial: add DT alias ID handling"	2015-03-08 12:25:40 -07:00
Alexander Aring	0402d9f233	Bluetooth: fix sco_exit compile warning While compiling the following warning occurs: WARNING: net/built-in.o(.init.text+0x602c): Section mismatch in reference from the function bt_init() to the function .exit.text:sco_exit() The function __init bt_init() references a function __exit sco_exit(). This is often seen when error handling in the init function uses functionality in the exit path. The fix is often to remove the __exit annotation of sco_exit() so it may be used outside an exit section. Since commit `6d785aa345` ("Bluetooth: Convert mgmt to use HCI chan registration API") the function "sco_exit" is used inside of function "bt_init". The suggested solution by remove the __exit annotation solved this issue. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-07 22:13:17 +02:00
Eric Dumazet	58025e46ea	net: gro: remove obsolete code from skb_gro_receive() Some drivers use copybreak to copy tiny frames into smaller skb, and this smaller skb might not have skb->head_frag set for various reasons. skb_gro_receive() currently doesn't allow to aggregate the smaller skb into the previous GRO packet if this GRO packet has at least 2 MSS in it. Following workload easily demonstrates the problem. netperf -t TCP_RR -H target -- -r 3000,3000 (tcpdump shows one GRO packet with 2 MSS, plus one additional packet of 104 bytes that should have been appended.) It turns out that we can remove code from skb_gro_receive(), because commit `8a29111c7c` ("net: gro: allow to build full sized skb") and its followups removed the assumption that a GRO packet with a frag_list had to have an empty head. Removing this code allows the aggregation of the last (incomplete) frame in some RPC workloads. Note that tcp_gro_receive() already takes care of forcing a flush if necessary, including this case. If we want to avoid using frag_list in the first place (in forwarding workloads for example, as the outgoing NIC is generally not able to cope with skbs having a frag_list), we need to address this separately. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 21:50:55 -05:00
Shani Michaeli	c93682477b	net/dcb: Add IEEE QCN attribute As specified in 802.1Qau spec. Add this optional attribute to the DCB netlink layer. To allow for application to use the new attribute, NIC drivers should implement and register the callbacks ieee_getqcn, ieee_setqcn and ieee_getqcnstats. The QCN attribute holds a set of parameters for management, and a set of statistics to provide informative data on Congestion-Control defined by this spec. Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 21:50:02 -05:00
Johan Hovold	2c3fbe3cf2	net: irda: fix wait_until_sent poll timeout In case an infinite timeout (0) is requested, the irda wait_until_sent implementation would use a zero poll timeout rather than the default 200ms. Note that wait_until_sent is currently never called with a 0-timeout argument due to a bug in tty_wait_until_sent. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable <stable@vger.kernel.org> # v2.6.12 Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-03-07 03:44:14 +01:00
Alexander Duyck	88bae7149a	fib_trie: Add key vector to root, return parent key_vector in resize This change makes it so that the root of the trie contains a key_vector, by doing this we make room to essentially collapse the entire trie by at least one cache line as we can store the information about the tnode or leaf that is pointed to in the root. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	f23e59fbd7	fib_trie: Move parent from key_vector to tnode This change pulls the parent pointer from the key_vector and places it in the tnode structure. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	6e22d174ba	fib_trie: Pull empty_children and full_children into tnode This pulls the information about the child array out of the key_vector and places it in the tnode since that is where it is needed. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	56ca2adf6a	fib_trie: Move rcu from key_vector to tnode, add accessors. RCU is only needed once for the entire node, not once per key_vector so we can pull that out and move it to the tnode structure. In addition add accessors to be used inside the RCU functions so that we can more easily get from the key vector to either the tnode or the trie pointers. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	dc35dbeda3	fib_trie: Add tnode struct as a container for fields not needed in key_vector This change pulls the fields not explicitly needed in the key_vector and placed them in the new tnode structure. By doing this we will eventually be able to reduce the key_vector down to 16 bytes on 64 bit systems, and 12 bytes on 32 bit systems. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	2e1ac88a48	fib_trie: Rename tnode_child_length to child_length We are now checking the length of a key_vector instead of a tnode so it makes sense to probably just rename this to child_length since it would probably even be applicable to a leaf. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:28 -05:00
Alexander Duyck	754baf8dec	fib_trie: replace tnode_get_child functions with get_child macros I am replacing the tnode_get_child call with get_child since we are techically pulling the child out of a key_vector now and not a tnode. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:27 -05:00
Alexander Duyck	35c6edac19	fib_trie: Rename tnode to key_vector Rename the tnode to key_vector. The key_vector will be the eventual container for all of the information needed by either a leaf or a tnode. The final result should be much smaller than the 40 bytes currently needed for either one. This also updates the trie struct so that it contains an array of size 1 of tnode pointers. This is to bring the structure more inline with how an actual tnode itself is configured. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:27 -05:00
Alexander Duyck	8d8e810ca8	fib_trie: Return pointer to tnode pointer in resize/inflate/halve Resize related functions now all return a pointer to the pointer that references the object that was resized. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:27 -05:00
Alexander Duyck	72be72607a	fib_trie: Minor cleanups to fib_table_flush_external This change just does a couple of minor cleanups on fib_table_flush_external. Specifically it addresses the fact that resize was being called even though nothing was being removed from the table, and it drops an unecessary indent since we could just call continue on the inverse of the fi && flag check. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:49:27 -05:00
Robert Shearman	f8d54afc4c	mpls: Properly validate RTA_VIA payload length If the nla length is less than 2 then the nla data could be accessed beyond the accessible bounds. So ensure that the nla is big enough to at least read the via_family before doing so. Replace magic value of 2. Fixes: `03c0566542` ("mpls: Basic support for adding and removing routes") Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 15:19:06 -05:00
Fan Du	05cbc0db03	ipv4: Create probe timer for tcp PMTU as per RFC4821 As per RFC4821 7.3. Selecting Probe Size, a probe timer should be armed once probing has converged. Once this timer expired, probing again to take advantage of any path PMTU change. The recommended probing interval is 10 minutes per RFC1981. Probing interval could be sysctled by sysctl_tcp_probe_interval. Eric Dumazet suggested to implement pseudo timer based on 32bits jiffies tcp_time_stamp instead of using classic timer for such rare event. Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 14:57:42 -05:00
Fan Du	6b58e0a5f3	ipv4: Use binary search to choose tcp PMTU probe_size Current probe_size is chosen by doubling mss_cache, the probing process will end shortly with a sub-optimal mss size, and the link mtu will not be taken full advantage of, in return, this will make user to tweak tcp_base_mss with care. Use binary search to choose probe_size in a fine granularity manner, an optimal mss will be found to boost performance as its maxmium. In addition, introduce a sysctl_tcp_probe_threshold to control when probing will stop in respect to the width of search range. Test env: Docker instance with vxlan encapuslation(82599EB) iperf -c 10.0.0.24 -t 60 before this patch: 1.26 Gbits/sec After this patch: increase 26% 1.59 Gbits/sec Signed-off-by: Fan Du <fan.du@intel.com> Acked-by: John Heffner <johnwheffner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 14:57:41 -05:00
Eric W. Biederman	aaa4e70404	DECnet: Only use neigh_ops for adding the link layer header Other users users of the neighbour table use neigh->output as the method to decided when and which link-layer header to place on a packet. DECnet has been using neigh->output to decide which DECnet headers to place on a packet depending which neighbour the packet is destined for. The DECnet usage isn't totally wrong but it can run into problems if the neighbour output function is run for a second time as the teql driver and the bridge netfilter code can do. Therefore to avoid pathologic problems later down the line and make the neighbour code easier to understand by refactoring the decnet output code to only use a neighbour method to add a link layer header to a packet. This is done by moving the neigbhour operations lookup from dn_to_neigh_output to dn_neigh_output_packet. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 14:54:22 -05:00
Johan Hedberg	7a00ff445f	Bluetooth: Add mgmt_send_event() helper to send to any HCI channel Currently the mgmt_event() function is only capable of sending to HCI_CHANNEL_CONTROL. To void having to change all users of it, add a new mgmt_send_event() function that takes a channel parameter, and make the old mgmt_event() a wrapper that passes MGMT_CHANNEL_CONTROL to it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:22 +01:00
Johan Hedberg	3b0602cd01	Bluetooth: Rename pending_cmd to mgmt_pending_cmd This patch renames the pending_cmd struct (used for tracking pending mgmt commands) to mgmt_pending_cmd, so that it can be moved to a more generic place and be used also by other modules using other HCI channels. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Johan Hedberg	2a1afb5ac8	Bluetooth: Rename cmd_complete() to mgmt_cmd_complete() This patch renames the cmd_complete() function to mgmt_cmd_complete() in preparation of making it a generic helper for other modules to use too. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Johan Hedberg	a69e8375a1	Bluetooth: Rename cmd_status() to mgmt_cmd_status() This patch renames the cmd_status() function to mgmt_cmd_status() in preparation of making it a generic helper for other modules to use too. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Johan Hedberg	b9a245fb12	Bluetooth: Move all mgmt command quirks to handler table In order to completely generalize the mgmt command handling we need to move away command-specific information from mgmt_control() into the actual command table. This patch adds a new 'flags' field to the handler entries which can now contain the following command specific information: - Command takes variable length parameters - Command doesn't target any specific HCI device - Command can be sent when the HCI device is unconfigured After this the mgmt_control() function is completely generic and can potentially be reused by new HCI channels. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Johan Hedberg	6d785aa345	Bluetooth: Convert mgmt to use HCI chan registration API This patch converts the existing mgmt code to use the newly introduced generic API for registering HCI channels with mgmt-like semantics. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Johan Hedberg	801c1e8da5	Bluetooth: Add mgmt HCI channel registration API This patch adds an API for registering HCI channels with mgmt-like semantics. For now the only user will be HCI_CHANNEL_CONTROL, but e.g. 6lowpan is intended to use this as well in the future. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-06 20:15:21 +01:00
Marcel Holtmann	93690c227a	Bluetooth: Introduce controller setting information for static address Currently it is not possible to determine if the static address is used by the controller. It is also not possible to determine if using a static on a dual-mode controller with disabled BR/EDR is possible or not. To address this issue, introduce a new setting called static-address. If support for this setting is signaled that means that the kernel supports using static addresses. And if used on dual-mode controllers with BR/EDR disabled it means that a configured static address can be used. In addition utilize the same setting for the list of current active settings that indicates if a static address is configured and if that address will be actually used. With this in mind the existing Set Static Address management command has been extended to return the current settings. That way the caller of that command can easily determine if the programmed address will be used or if extra steps are required. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-06 20:43:07 +02:00
Linus Torvalds	1b1bd56191	NFS client bugfixes for Linux 4.0 Highlights include: - Fix a regression in the NFSv4 open state recovery code - Fix a regression in the NFSv4 close code - Fix regressions and side-effects of the loop-back mounted NFS fixes in 3.18, that cause the NFS read() syscall to return EBUSY. - Fix regressions around the readdirplus code and how it interacts with the VFS lazy unmount changes that went into v3.18. - Fix issues with out-of-order RPC call replies replacing updated attributes with stale ones (particularly after a truncate()). - Fix an underflow checking issue with RPC/RDMA credits - Fix a number of issues with the NFSv4 delegation return/free code. - Fix issues around stale NFSv4.1 leases when doing a mount -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJU+SJ6AAoJEGcL54qWCgDyVgQQAKNsF/O2O9ip2uAHZRZvM6TS Ev8c3Spj/FmRI1tlCcGi1zZ8uCSwvPQz3uAN0vOTMocUWjokT5sAgN5yBIxHasem 6YK7jxs9WHiP7MGyReaAFJwG/W6LZndnlNqPWPs9KiaWwKVXIsQ3uFm/Y0lr90Fi ew16DQm0DUd4Yvv42WJR9ay7UwUPvT7wmaGIVVK2hjQqr2lx02jspt5kfrC+vsMU OYU/0YDofb2ajs5krbah6tUHf3VDnSVrXP6if66IrukCM9S4AvowpnMJQ5QJALh+ cPlqHDm2ZzuIecpqZEgYLM73wQ2q+KBXTlDLcgYg6LjnqBEivwO9RDn6tfCwKTcS tCFohQc9iDOj9rTZ9EQlQME6u/FdxWncpxyTsMyBk7FlcLsOQRqio/FXZhbyJGuH gvIdIW3fPseijtejpYkxgabe6JL9NRvjv3SnOay7xHs9Vn4tRkFF2mkkQDZOG6HT atkxQp8kB3m9gMoeAmTLdTcJkcFdk6AKnNrcyJaa1GW4msmMuGZtq/6vayKJsBZb OIw788bDSOVpcVR+6SAC24/dutcl+WJHlSJShqvIrTKejBxPCc7IVSeg63FJsWTO sxfXdUr3wVJ1ooDFCspeBj+3zAimIq7qDmRRs85ekgEtxrgdUldhg/VwbDjvLjmb whXFpiCS7Ii0fVYtZvjz =qMB7 -----END PGP SIGNATURE----- Merge tag 'nfs-for-4.0-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: "Highlights include: - Fix a regression in the NFSv4 open state recovery code - Fix a regression in the NFSv4 close code - Fix regressions and side-effects of the loop-back mounted NFS fixes in 3.18, that cause the NFS read() syscall to return EBUSY. - Fix regressions around the readdirplus code and how it interacts with the VFS lazy unmount changes that went into v3.18. - Fix issues with out-of-order RPC call replies replacing updated attributes with stale ones (particularly after a truncate()). - Fix an underflow checking issue with RPC/RDMA credits - Fix a number of issues with the NFSv4 delegation return/free code. - Fix issues around stale NFSv4.1 leases when doing a mount" * tag 'nfs-for-4.0-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits) NFSv4.1: Clear the old state by our client id before establishing a new lease NFSv4: Fix a race in NFSv4.1 server trunking discovery NFS: Don't write enable new pages while an invalidation is proceeding NFS: Fix a regression in the read() syscall NFSv4: Ensure we skip delegations that are already being returned NFSv4: Pin the superblock while we're returning the delegation NFSv4: Ensure we honour NFS_DELEGATION_RETURNING in nfs_inode_set_delegation() NFSv4: Ensure that we don't reap a delegation that is being returned NFS: Fix stateid used for NFS v4 closes NFSv4: Don't call put_rpccred() under the rcu_read_lock() NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache() NFSv3: Use the readdir fileid as the mounted-on-fileid NFS: Don't invalidate a submounted dentry in nfs_prime_dcache() NFSv4: Set a barrier in the update_changeattr() helper NFS: Fix nfs_post_op_update_inode() to set an attribute barrier NFS: Remove size hack in nfs_inode_attrs_need_update() NFSv4: Add attribute update barriers to delegreturn and pNFS layoutcommit NFS: Add attribute update barriers to NFS writebacks NFS: Set an attribute barrier on all updates NFS: Add attribute update barriers to nfs_setattr_update_inode() ...	2015-03-06 10:09:57 -08:00
Scott Feldman	e1315db17d	switchdev: fix CONFIG_IP_MULTIPLE_TABLES compile issue Signed-off-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 12:43:54 -05:00
David S. Miller	23375a0fd5	ipv4: Fix unused variable warnings in fib_table_flush_external. net/ipv4/fib_trie.c: In function ‘fib_table_flush_external’: net/ipv4/fib_trie.c:1572:6: warning: unused variable ‘found’ [-Wunused-variable] int found = 0; ^ net/ipv4/fib_trie.c:1571:16: warning: unused variable ‘slen’ [-Wunused-variable] unsigned char slen; ^ Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:38:35 -05:00
Scott Feldman	8e05fd7166	fib: hook IPv4 fib for hardware offload Call into the switchdev driver any time an IPv4 fib entry is added/modified/deleted from the kernel's FIB. The switchdev driver may or may not install the route to the offload device. In the case where the driver tries to install the route and something goes wrong (device's routing table is full, etc), then all of the offloaded routes will be flushed from the device, route forwarding falls back to the kernel, and no more routes are offloading. We can refine this logic later. For now, use the simplist model of offloading routes up to the point of failure, and then on failure, undo everything and mark IPv4 offloading disabled. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	b5d6fbdeed	switchdev: implement IPv4 fib ndo wrappers Flesh out ndo wrappers to call into device driver. To call into device driver, the wrapper must interate over route's nexthops to ensure all nexthop devs belong to the same switch device. Currently, there is no support for route's nexthops spanning offloaded and non-offloaded devices, or spanning ports of multiple offload devices. Since switch device ports may be stacked under virtual interfaces (bonds and/or bridges), and the route's nexthop may be on the virtual interface, the wrapper will traverse the nexthop dev down to the base dev. It's the base dev that's passed to the switchdev driver's ndo ops. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	104616e74e	switchdev: don't support custom ip rules, for now Keep switchdev FIB offload model simple for now and don't allow custom ip rules. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Scott Feldman	5e8d90497d	switchdev: add IPv4 fib ndo ops wrappers Add IPv4 fib ndo wrapper funcs and stub them out for now. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:24:58 -05:00
Florian Fainelli	c86e59b9e6	net: dsa: extract dsa switch tree setup and removal Extract the core logic that setups a 'struct dsa_switch_tree' and removes it, update dsa_probe() and dsa_remove() to use the two helper functions. This will be useful to allow for other callers to setup this structure differently. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 00:18:20 -05:00

1 2 3 4 5 ...

36755 Commits