iv/linux

Author	SHA1	Message	Date
Daniel Borkmann	94caee8c31	ebpf: add sched_act_type and map it to sk_filter's verifier ops In order to prepare eBPF support for tc action, we need to add sched_act_type, so that the eBPF verifier is aware of what helper function act_bpf may use, that it can load skb data and read out currently available skb fields. This is basically analogous to `96be4325f4` ("ebpf: add sched_cls_type and map it to sk_filter's verifier ops"). BPF_PROG_TYPE_SCHED_CLS and BPF_PROG_TYPE_SCHED_ACT need to be separate since both will have a different set of functionality in future (classifier vs action), thus we won't run into ABI troubles when the point in time comes to diverge functionality from the classifier. The future plan for act_bpf would be that it will be able to write into skb->data and alter selected fields mirrored in struct __sk_buff. For initial support, it's sufficient to map it to sk_filter_ops. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 19:10:44 -04:00
David S. Miller	0fa74a4be4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/emulex/benet/be_main.c net/core/sysctl_net_core.c net/ipv4/inet_diag.c The be_main.c conflict resolution was really tricky. The conflict hunks generated by GIT were very unhelpful, to say the least. It split functions in half and moved them around, when the actual conflict existed solely inside of one function, that being be_map_pci_bars(). So instead, to resolve this, I checked out be_main.c from the top of net-next, then I applied the be_main.c changes from 'net' since the last time I merged. And this worked beautifully. The inet_diag.c and sysctl_net_core.c conflicts were simple overlapping changes, and were easy to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 18:51:09 -04:00
Herbert Xu	6626af6926	rhashtable: Fix undeclared EEXIST build error on ia64 We need to include linux/errno.h in rhashtable.h since it doesn't always get included otherwise. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 18:18:45 -04:00
Al Viro	4de930efc2	net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:38:06 -04:00
David S. Miller	b4c11cb437	Merge branch 'amd-xgbe-next' Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver updates 2015-03-19 The following series of patches includes functional updates and changes to the driver. - Use the phydev->advertising field instead of the phydev->supported field when configuring for auto-negotiation, etc. - Use the phy_driver flags field for setting the transceiver type instead of hardcoding it in the ethtool support. - Provide an auto-negotiation timeout check - Clarify the Tx/Rx queue information messages - Use the new DMA memory barrier operations - Set the device DMA mask based on what the hardware reports - Remove the software implementation of Tx coalescing - Fix the reporting of the Rx coalescing value - Use napi_alloc_skb when allocating an SKB in softirq This patch series is based on net-next. Changes from v2: - Use jiffies instead of timespec for the auto-negotiation timeout check - Remove the Rx path SKB allocation re-work patch since we should only inline the headers and the current code guards better against any hardware bugs Changes from v1: - Default to 32-bit DMA width (minimum supported) if hardware returns an unexpected DMA width value ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:34:03 -04:00
Lendacky, Thomas	385565a1f0	amd-xgbe: Use napi_alloc_skb when allocating skb in softirq Use the napi_alloc_skb function to allocate an skb when running within the softirq context to avoid calls to local_irq_save/restore. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	4a57ebcc2c	amd-xgbe: Fix Rx coalescing reporting The Rx coalescing value is internally converted from usecs to a value that the hardware can use. When reporting the Rx coalescing value, this internal value is converted back to usecs. During the conversion from and back to usecs some rounding occurs. So, for example, when setting an Rx usec of 30, it will be reported as 29. Fix this reporting issue by keeping the original usec value and using that during reporting. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
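The rounding described in the Rx coalescing fix above can be reproduced with plain integer arithmetic. The following is a minimal sketch only: the 250 MHz clock, the 256-cycle timer unit, and the function names are assumptions made for illustration and are not taken from the commit.

```python
# Hypothetical parameters, chosen only to illustrate the rounding effect.
CLOCK_MHZ = 250          # assumed device clock, in MHz
CYCLES_PER_TICK = 256    # assumed clock cycles per hardware timer tick


def usec_to_ticks(usec: int) -> int:
    """Convert microseconds to hardware ticks, truncating like C integer division."""
    return (usec * CLOCK_MHZ) // CYCLES_PER_TICK


def ticks_to_usec(ticks: int) -> int:
    """Convert hardware ticks back to microseconds, truncating again."""
    return (ticks * CYCLES_PER_TICK) // CLOCK_MHZ


requested = 30
# Round trip: the value the user set versus the value derived from hardware units.
reported = ticks_to_usec(usec_to_ticks(requested))
print(requested, reported)
```

Under these assumed values a requested 30 usecs reads back as 29, which is the kind of discrepancy that keeping the original usec value for reporting avoids.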
Lendacky, Thomas	c635eaacbf	amd-xgbe: Remove Tx coalescing The Tx coalescing support in the driver was a software implementation for something lacking in the hardware. Using hrtimers, the idea was to trigger a timer interrupt after having queued a packet for transmit. Unfortunately, as the timer value was lowered, the timer expired before the hardware actually did the transmit and so it was racy and resulted in unnecessary interrupts. Remove the Tx coalescing support and hrtimer and replace with a Tx timer that is used as a reclaim timer in case of inactivity. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	386d325dbd	amd-xgbe: Set DMA mask based on hardware register value The hardware supplies a value that indicates the DMA range that it is capable of using. Use this value rather than hard-coding it in the driver. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	ceb8f6be7e	amd-xgbe: Use the new DMA memory barriers where appropriate Use the new lighter weight memory barriers when working with the device descriptors. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	600c8811d3	amd-xgbe: Clarify output message about queues Clarify that the queues referred to in a message when the device is brought up are hardware queues and not necessarily related to the Linux network queues. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:57 -04:00
Lendacky, Thomas	9ae5eecdba	amd-xgbe-phy: Provide support for auto-negotiation timeout Currently, there is no interrupt code that indicates auto-negotiation has timed out. If the auto-negotiation has timed out then the start of a new auto-negotiation will begin again with a new base page being received. The state machine could be in a state that is not expecting this interrupt code which results in an error during auto-negotiation. Update the code to timestamp when the auto-negotiation starts. Should another page received interrupt code occur before auto-negotiation has completed but after the auto-negotiation timeout, then reset the state machine to allow the auto-negotiation to continue. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:56 -04:00
Lendacky, Thomas	65f57cb152	amd-xgbe-phy: Use the phy_driver flags field Remove the setting of the transceiver type when retrieving the device settings using ethtool and instead set the transceiver type in the phy_driver structure flags field. Change the transceiver type to be internal, also. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:56 -04:00
Lendacky, Thomas	d9663c8c21	amd-xgbe-phy: Use phydev advertising field vs supported With ethtool being able to control what is advertised, the advertising field is what should be used for priming the auto-negotiation registers and for various other checks, instead of the supported field. Also, move the initial setting of the supported and advertising fields into the probe function so that they are not reset each time the device is brought up, thus allowing the user to set as desired before bringing the device up. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:33:56 -04:00
Catalin Marinas	91edd096e2	net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour Commit `db31c55a6f` (net: clamp ->msg_namelen instead of returning an error) introduced the clamping of msg_namelen when the unsigned value was larger than sizeof(struct sockaddr_storage). This caused a msg_namelen of -1 to be valid. The native code was subsequently fixed by commit `dbb490b965` (net: socket: error on a negative msg_namelen). In addition, the native code sets msg_namelen to 0 when msg_name is NULL. This was done in commit (`6a2a2b3ae0` net:socket: set msg_namelen to 0 if msg_name is passed as NULL in msghdr struct from userland) and subsequently updated by `08adb7dabd` (fold verify_iovec() into copy_msghdr_from_user()). This patch brings the get_compat_msghdr() in line with copy_msghdr_from_user(). Fixes: `db31c55a6f` (net: clamp ->msg_namelen instead of returning an error) Cc: David S. Miller <davem@davemloft.net> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:31:09 -04:00
David S. Miller	ebd6af092a	Merge branch 'rhashtable-inlined-interface' Herbert Xu says: ==================== rhashtable: Introduce inlined interface This series of patches introduces the inlined rhashtable interface. The idea is to make all the function pointers visible to the compiler by providing the rhashtable_params structure explicitly to each inline rhashtable function. For example, instead of doing obj = rhashtable_lookup(ht, key); you would now do obj = rhashtable_lookup_fast(ht, key, params); Where params is the same data that you would give to rhashtable_init. In particular, within rhashtable.c itself we would simply supply ht->p. So to convert users over, you simply have to make params globally accessible, e.g., by placing it in a static const variable, which can then be used at each inlined call site, as well as by the rhashtable_init call. The only tricky bit is that some users (i.e., netfilter) have a dynamic key length. This is dealt with by using params.key_len in the inline functions when it is non-zero, and otherwise falling back on ht->p.key_len. Note that I've only tested this on one compiler, gcc 4.7.2. So please test this with your compilers as well and make sure that the code is actually inlined without indirect function calls. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:32 -04:00
Herbert Xu	dc0ee268d8	rhashtable: Rip out obsolete out-of-line interface Now that all rhashtable users have been converted over to the inline interface, this patch removes the unused out-of-line interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	6cca7289d5	tipc: Use inlined rhashtable interface This patch converts tipc to the inlined rhashtable interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	b182aa6e96	test_rhashtable: Use inlined rhashtable interface This patch converts test_rhashtable to the inlined rhashtable interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	fa3773211e	netfilter: Convert nft_hash to inlined rhashtable This patch converts nft_hash to the inlined rhashtable interface. This patch also replaces the call to rhashtable_lookup_compare with a straight rhashtable_lookup_fast because it's simply doing a memcmp (in fact nft_hash_lookup already uses memcmp instead of nft_data_cmp). Furthermore, the compare function is only meant to compare, it is not supposed to have side-effects. The current side-effect code can simply be moved into the nft_hash_get. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	c428ecd1a2	netlink: Move namespace into hash key Currently the name space is a de facto key because it has to match before we find an object in the hash table. However, it isn't in the hash value so all objects from different name spaces with the same port ID hash to the same bucket. This is bad as the number of name spaces is unbounded. This patch fixes this by using the namespace when doing the hash. Because the namespace field doesn't lie next to the portid field in the netlink socket, this patch switches over to the rhashtable interface without a fixed key. This patch also uses the new inlined rhashtable interface where possible. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	02fd97c3d4	rhashtable: Allow hash/comparison functions to be inlined This patch deals with the complaint that we make indirect function calls on the fast paths unnecessarily in rhashtable. We resolve it by moving the fast paths into inline functions that take struct rhashtable_params (which obviously must be the same set of parameters supplied to rhashtable_init) as an argument. The only remaining indirect call is to obj_hashfn (or key_hashfn if obj_hashfn is unset) on the rehash as well as the insert-during-rehash slow path. This patch also extends the support of variable-length keys to include those where the key is fixed but scattered in the object. For example, in netlink we want to key off the namespace and the portid but they're not next to each other. This patch does this by directly using the object hash function as the indicator of whether the key is accessible or not. It also adds a new function obj_cmpfn to compare a key against an object. This means that the caller no longer needs to supply explicit compare functions. All this is done in a backwards compatible manner so no existing users are affected until they convert to the new interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	488fb86ee9	rhashtable: Make rhashtable_init params argument const This patch marks the rhashtable_init params argument const as there is no reason to modify it since we will always make a copy of it in the rhashtable. This patch also fixes a bug where we don't actually round up the value of min_size unless it is less than HASH_MIN_SIZE. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Daniel Borkmann	0b8c707ddf	ebpf, filter: do not convert skb->protocol to host endianness during runtime Commit `c249739579` ("bpf: allow BPF programs access 'protocol' and 'vlan_tci' fields") has added support for accessing protocol, vlan_present and vlan_tci into the skb offset map. As referenced in the below discussion, accessing skb->protocol from an eBPF program should be converted without handling endianness. The reason for this is that an eBPF program could simply do a check more naturally, e.g. by testing skb->protocol == htons(ETH_P_IP), where the LLVM compiler resolves htons() against a constant automatically during compilation time, as opposed to an otherwise needed run time conversion. After all, the way of programming both from a user perspective differs quite a lot, i.e. bpf_asm ["ld proto"] versus a C subset/LLVM. Reference: https://patchwork.ozlabs.org/patch/450819/ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 15:24:26 -04:00
Marcelo Ricardo Leitner	c4a6853d8f	ipv6: invert join/leave anycast rtnl/socket locking order Commit `baf606d9c9` ("ipv4,ipv6: grab rtnl before locking the socket") failed to update two setsockopt options, IPV6_JOIN_ANYCAST and IPV6_LEAVE_ANYCAST, causing a lock inversion with respect to the updated ones. As ipv6_sock_ac_join and ipv6_sock_ac_leave are only called from do_ipv6_setsockopt, we are good to just move the rtnl lock up. Fixes: `baf606d9c9` ("ipv4,ipv6: grab rtnl before locking the socket") Reported-by: Ying Huang <ying.huang@intel.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:32:38 -04:00
Marcelo Ricardo Leitner	149d7549c2	vxlan: fix possible use of uninitialized in vxlan_igmp_{join, leave} Test robot noticed that we check the return of vxlan_igmp_join and leave but inside them there was a path on which the return value could be used uninitialized. It's not really possible because those if() inside these igmp functions would always match as we can't have sockets of other type in there, but this way we keep the compiler happy. Fixes: `56ef9c909b` ("vxlan: Move socket initialization to within rtnl scope") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:31:24 -04:00
David S. Miller	de58a6da85	Merge branch 'be2net' Sathya Perla says: ==================== be2net: patch set Hi David, this patch set includes 3 bug fixes to the be2net driver. Patch 1 fixes a vlan isolation issue with VFs. When a VF is placed in promiscuous mode, it could receive packets belonging to any vlan, as the PF driver grants vlan promisc capability to VFs. The PF driver now disables the vlan promisc capability for VFs to fix this problem. Patch 2 fixes the call to MODIFY_EQ_DELAY FW cmd to not include more than 8 EQs per cmd. The FW is not capable of handling more than 8 EQs per cmd. Patch 3 fixes an EEH error detection issue. On Power platforms, when an EEH error occurs, the slot disconnect state is more reliably detected via an MMIO read compared to a config read. So, the error register reads that occur every second are now done via MMIO. Pls apply this patch set to the "net" tree. Thanks! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:25:56 -04:00
Suresh Reddy	25848c9015	be2net: use PCI MMIO read instead of config read for errors When an EEH error occurs, the device/slot is disconnected. This condition is more reliably detected (i.e., returns all ones) with an MMIO read rather than a config read -- especially on power platforms. Hence, this patch fixes EEH error detection by replacing config reads with MMIO reads for reading the error registers. The error registers in Skyhawk-R/BE2/BE3 are accessible both via the config space and the PCICFG (BAR0) memory space. Reported-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:25:51 -04:00
Suresh Reddy	c8ba4ad0b5	be2net: restrict MODIFY_EQ_DELAY cmd to a max of 8 EQs Issuing this cmd for more than 8 EQs does not have the intended effect even on BEx and Skyhawk-R. This patch fixes this by issuing this cmd for up to 8 EQs at a time. Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:25:51 -04:00
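The batching idea in the MODIFY_EQ_DELAY fix above, issuing a command for at most 8 event queues at a time, can be sketched as a simple chunking loop. This is an illustration only, not the driver's code: the `batches` helper and the EQ ids are hypothetical, and only the limit of 8 comes from the commit message.

```python
from typing import Iterator, List, Sequence

MAX_EQS_PER_CMD = 8  # limit stated in the commit message


def batches(eqs: Sequence[int], size: int = MAX_EQS_PER_CMD) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` event queues (hypothetical helper)."""
    for start in range(0, len(eqs), size):
        yield list(eqs[start:start + size])


eq_ids = list(range(20))  # 20 hypothetical event-queue ids
for batch in batches(eq_ids):
    # In a real driver, each batch would correspond to one firmware command.
    print(len(batch), batch)
```

With 20 queues this produces three batches of 8, 8, and 4 entries, so no single command exceeds the firmware limit.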
Vasundhara Volam	435452aa88	be2net: Prevent VFs from enabling VLAN promiscuous mode Currently, a PF does not restrict its VF interface from enabling vlan promiscuous mode. This breaks vlan isolation when a vlan (transparent tagging) is configured on a VF. This patch fixes this problem by disabling the vlan promisc capability for VFs. Reported-by: Yoann Juet <veilletechno-irts@univ-nantes.fr> Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:25:51 -04:00
Josh Hunt	d22e153718	tcp: fix tcp fin memory accounting tcp_send_fin() does not account for the memory it allocates properly, so sk_forward_alloc can be negative in cases where we've sent a FIN: ss example output (ss -amn \| grep -B1 f4294): tcp FIN-WAIT-1 0 1 192.168.0.1:45520 192.0.2.1:8080 skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0) Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:18:52 -04:00
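The `f4294966016` field in the ss output above looks like a huge number, but it is what a small negative value looks like when printed as unsigned. The sketch below assumes the counter is a signed 32-bit integer that ss prints as unsigned; the helper name `as_signed32` is hypothetical.

```python
def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a two's-complement signed one."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


# The "f" value taken from the skmem line quoted in the commit message.
forward_alloc = as_signed32(4294966016)
print(forward_alloc)
```

Under that assumption the value is -1280, which matches in magnitude the `w1280` field on the same skmem line.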
Steven Barth	73ba57bfae	ipv6: fix backtracking for throw routes For throw routes to trigger evaluation of other policy rules, EAGAIN needs to be propagated up to fib_rules_lookup, similar to how it's done for IPv4. A simple testcase for verification is: ip -6 rule add lookup 33333 priority 33333; ip -6 route add throw 2001:db8::1; ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333; ip route get 2001:db8::1 Signed-off-by: Steven Barth <cyrus@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:57:23 -04:00
Markos Chandras	87f966d97b	net: ethernet: pcnet32: Setup the SRAM and NOUFLO on Am79C97{3, 5} On a MIPS Malta board, tons of fifo underflow errors have been observed when using u-boot as bootloader instead of YAMON. The reason for that is that YAMON used to set the pcnet device to SRAM mode but u-boot does not. As a result, the default Tx threshold (64 bytes) is now too small to keep the fifo relatively used and it can result in Tx fifo underflow errors. Consequently, it's best to setup the SRAM on supported controllers so we can always use the NOUFLO bit. Cc: <netdev@vger.kernel.org> Cc: <stable@vger.kernel.org> Cc: <linux-kernel@vger.kernel.org> Cc: Don Fry <pcnet32@frontier.com> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:56:40 -04:00
Sabrina Dubroca	8e199dfd82	ipv6: call ipv6_proxy_select_ident instead of ipv6_select_ident in udp6_ufo_fragment Matt Grant reported frequent crashes in ipv6_select_ident when udp6_ufo_fragment is called from openvswitch on a skb that doesn't have a dst_entry set. ipv6_proxy_select_ident generates the frag_id without using the dst associated with the skb. This approach was suggested by Vladislav Yasevich. Fixes: `0508c07f5e` ("ipv6: Select fragment id during UFO segmentation if not set.") Cc: Vladislav Yasevich <vyasevic@redhat.com> Reported-by: Matt Grant <matt@mattgrant.net.nz> Tested-by: Matt Grant <matt@mattgrant.net.nz> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:56:11 -04:00
Palik, Imre	edafc132ba	xen-netback: making the bandwidth limiter runtime settable With the current netback, the bandwidth limiter's parameters are only settable during vif setup time. This patch registers a watch on them, and thus makes them runtime changeable. When the watch fires, the timer is reset. The timer's mutex is used for fencing the change. Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: Imre Palik <imrep@amazon.de> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:55:15 -04:00
David S. Miller	750f2f9165	Merge branch 'listener_refactor_part_14' Eric Dumazet says: ==================== inet: tcp listener refactoring part 14 OK, we have serious patches here. We get rid of the central timer handling SYNACK rtx, which is killing us under even medium SYN flood. We still use the listener specific hash table. This will be done in next round ;) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:33 -04:00
Eric Dumazet	becb74f0ac	net: increase sk_[max_]ack_backlog sk_ack_backlog & sk_max_ack_backlog were 16bit fields, meaning listen() backlog was limited to 65535. It is time to increase the width to allow much bigger backlog, if admins change /proc/sys/net/core/somaxconn & /proc/sys/net/ipv4/tcp_max_syn_backlog default values. Tested: echo 5000000 >/proc/sys/net/core/somaxconn echo 5000000 >/proc/sys/net/ipv4/tcp_max_syn_backlog Ran a SYNFLOOD test against a listener using listen(fd, 5000000) myhost~# grep request_sock_TCP /proc/slabinfo request_sock_TCP 4185642 4411940 304 13 1 : tunables 54 27 8 : slabdata 339380 339380 0 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Eric Dumazet	fa76ce7328	inet: get rid of central tcp/dccp listener timer One of the major issues for TCP is the SYNACK rtx handling, done by inet_csk_reqsk_queue_prune(), fired by the keepalive timer of a TCP_LISTEN socket. This function runs for awfully long times, with socket lock held, meaning that other cpus needing this lock have to spin for hundreds of ms. SYNACK are sent in huge bursts, likely to cause severe drops anyway. This model was OK 15 years ago when memory was very tight. We now can afford to have a timer per request sock. Timer invocations no longer need to lock the listener, and can be run from all cpus in parallel. With following patch increasing somaxconn width to 32 bits, I tested a listener with more than 4 million active request sockets, and a steady SYNFLOOD of ~200,000 SYN per second. Host was sending ~830,000 SYNACK per second. This is ~100 times more than what we could achieve before this patch. Later, we will get rid of the listener hash and use ehash instead. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Eric Dumazet	52452c5425	inet: drop prev pointer handling in request sock When request socks are put in ehash table, the whole notion of having a previous request to update dl_next is pointless. Also, following patch will get rid of big purge timer, so we want to delete a request sock without holding listener lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Thomas Graf	a998f712f7	rhashtable: Round up/down min/max_size to ensure we respect limit Round up min_size respectively round down max_size to the next power of two to make sure we always respect the limit specified by the user. This is required because we compare the table size against the limit before we expand or shrink. Also fixes a minor bug where we modified min_size in the params provided instead of the copy stored in struct rhashtable. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 21:02:23 -04:00
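The rounding rule in the rhashtable min/max_size commit above can be illustrated with two small helpers. This is a Python sketch, not the kernel implementation: the function names merely echo the kernel's power-of-two helpers, and the example sizes are made up. It assumes table sizes are always powers of two.

```python
def roundup_pow_of_two(n: int) -> int:
    """Smallest power of two that is >= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return 1 << (n - 1).bit_length()


def rounddown_pow_of_two(n: int) -> int:
    """Largest power of two that is <= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return 1 << (n.bit_length() - 1)


# Hypothetical user-supplied limits.
min_size = roundup_pow_of_two(100)    # never shrink below the requested minimum
max_size = rounddown_pow_of_two(100)  # never grow beyond the requested maximum
print(min_size, max_size)
```

Rounding the minimum up and the maximum down keeps every permitted table size inside the range the user asked for, so a direct comparison of the table size against the stored limit stays correct.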
Linus Torvalds	b314acaccd	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: "An update to Synaptics driver that makes it usable with the 2015 lineup from Lenovo" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Revert "Input: synaptics - use dmax in input_mt_assign_slots" Input: synaptics - remove X250 from the topbuttonpad list Input: synaptics - remove X1 Carbon 3rd gen from the topbuttonpad list Input: synaptics - re-route tracksticks buttons on the Lenovo 2015 series Input: synaptics - remove TOPBUTTONPAD property for Lenovos 2015 Input: synaptics - retrieve the extended capabilities in query $10 Input: synaptics - do not retrieve the board id on old firmwares Input: synaptics - handle spurious release of trackstick buttons Input: synaptics - fix middle button on Lenovo 2015 products Input: synaptics - skip quirks when post-2013 dimensions Input: synaptics - support min/max board id in min_max_pnpid_table Input: synaptics - remove obsolete min/max quirk for X240 Input: synaptics - query min dimensions for fw v8.1 Input: synaptics - log queried and quirked dimension values Input: synaptics - split synaptics_resolution(), query first	2015-03-19 16:43:10 -07:00
Linus Torvalds	1e744c938d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "This fixes bugs in zero-copy splice to the fuse device" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: explicitly set /dev/fuse file's private_data fuse: set stolen page uptodate fuse: notify: don't move pages	2015-03-19 16:36:24 -07:00
Linus Torvalds	e409ac3550	Merge branch 'overlayfs-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: "This fixes minor issues with the multi-layer update in v4.0" * 'overlayfs-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: upper fs should not be R/O ovl: check lowerdir amount for non-upper mount ovl: print error message for invalid mount options	2015-03-19 16:27:36 -07:00
Linus Torvalds	32dafb94a6	MMC core: - Fix error path in mmc_pwrseq_simple_alloc() Merge tag 'mmc-v4.0-rc4' of git://git.linaro.org/people/ulf.hansson/mmc Pull MMC fix from Ulf Hansson: "MMC core: fix error path in mmc_pwrseq_simple_alloc()" * tag 'mmc-v4.0-rc4' of git://git.linaro.org/people/ulf.hansson/mmc: mmc: pwrseq_simple: fix error path in mmc_pwrseq_simple_alloc	2015-03-19 16:18:30 -07:00
Linus Torvalds	01d62ee520	A set of pin control fixes for the v4.0 release cycle: - Fix up consumer return values on pin control stubs. - Four patches fixing up the interrupt handling and sleep context save in the Baytrail driver. - Make default output directions work properly in the Cherryview driver. - Fix interrupt locking in the AT91 driver. - Fix setting interrupt generating lines as input in the sunxi driver. Merge tag 'pinctrl-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Here is a slew of pin control fixes I've accumulated for the v4.0 kernel. Nothing special, just driver fixes (mainly embedded Intel it seems) and a misunderstanding regarding the stub functions was reverted: - Fix up consumer return values on pin control stubs. - Four patches fixing up the interrupt handling and sleep context save in the Baytrail driver. - Make default output directions work properly in the Cherryview driver. - Fix interrupt locking in the AT91 driver. - Fix setting interrupt generating lines as input in the sunxi driver" * tag 'pinctrl-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: sun4i: GPIOs configured as irq must be set to input before reading pinctrl: at91: move lock/unlock_as_irq calls into request/release pinctrl: update direction_output function of cherryview driver pinctrl: baytrail: Save pin context over system sleep pinctrl: baytrail: Rework interrupt handling pinctrl: baytrail: Clear interrupt triggering from pins that are in GPIO mode pinctrl: baytrail: Relax GPIO request rules Revert "pinctrl: consumer: use correct retval for placeholder functions"	2015-03-19 15:52:28 -07:00
Linus Torvalds	18eda522c2	nios2 fixes for v4.0-rc5 Merge tag 'nios2-fixes-v4.0-rc5' of git://git.rocketboards.org/linux-socfpga-next Pull two arch/nios2 fixes from Ley Foon Tan: - Remove ucontext.h from exported arch headers - nios2: mm: do not invoke OOM killer on kernel fault OOM * tag 'nios2-fixes-v4.0-rc5' of git://git.rocketboards.org/linux-socfpga-next: nios2: mm: do not invoke OOM killer on kernel fault OOM nios2: Remove ucontext.h from exported arch headers	2015-03-19 15:24:28 -07:00
Shannon Nelson	91a0f93056	i40e: add NVM update events to AQ clean Quit complaining about a couple of events that we actually expect to see during an NVM update. Reported-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 17:52:04 -04:00
Linus Torvalds	a93fc153b1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide Pull IDE fix from David Miller: "Just one fix to convert a by-hand conversion of jiffies to msecs, from Nicholas McGuire" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide: ide_tape: convert jiffies with jiffies_to_msecs	2015-03-19 13:16:49 -07:00
Linus Torvalds	22283c8260	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc Pull sparc fixes from David Miller: 1) Some command cases of semtimedop() not even handled due to miscoded comparison on sparc64. From Rob Gardner. 2) Due to two bugs, /proc/kcore wasn't working properly on sparc. 3) Make sure fatal traps stop all running cpus, from Dave Kleikamp. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: Fix /proc/kcore sparc: semtimedop() unreachable due to comparison error sparc: io_64.h: Replace io function-link macros sparc64: fatal trap should stop all cpus arch: sparc: kernel: starfire.c: Remove unused function arch: sparc: kernel: traps_64.c: Remove some unused functions	2015-03-19 13:11:55 -07:00
Marcelo Ricardo Leitner	446981e5fc	tipc: fix build issue when building without IPv6 We can't directly call ipv6_sock_mc_join() but should use the stub instead and protect it around IS_ENABLED. Fixes: `d0f91938be` ("tipc: add ip/udp media type") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 16:06:27 -04:00


507673 Commits