linux

iv/linux

Author	SHA1	Message	Date
Haiyang Zhang	f9cbce34c3	hv_netvsc: Add support to set MTU reservation from guest side When packet encapsulation is in use, the MTU needs to be reduced for headroom reservation. The existing code takes the updated MTU value only from the host side. But vSwitch extensions, such as Open vSwitch, require the flexibility to change the MTU to different values from within a guest during the lifecycle of a vNIC, when the encapsulation protocol is changed. The patch supports this kind of MTU changes. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 16:00:56 -07:00
Eric Dumazet	9e29e21a9b	ifb: add multiqueue operation Add multiqueue capabilities to ifb netdevice. This removes last bottleneck for ingress when mq qdisc can be used to shard load from multiple RX queues on physical device. Tested: # netem based setup, installed at receiver side ETH=eth0 IFB=ifb10 EST="est 1sec 4sec" # Optional rate estimator RTT_HALF=2ms #REORDER=20us #LOSS="loss 1" TXQ=8 ip link add ifb10 numtxqueues $TXQ type ifb ip link set dev $IFB up tc qdisc add dev $ETH ingress 2>/dev/null tc filter add dev $ETH parent ffff: \ protocol ip u32 match u32 0 0 flowid 1:1 \ action mirred egress redirect dev $IFB tc qdisc del dev $IFB root 2>/dev/null tc qdisc add dev $IFB root handle 1: mq for i in `seq 1 $TXQ` do slot=$( printf %x $(( i )) ) tc qd add dev $IFB parent 1:$slot $EST netem \ limit 100000 delay $RTT_HALF $REORDER $LOSS done lpaa24:~# tc -s -d qd sh dev ifb10 qdisc mq 1: root Sent 316544766 bytes 5265927 pkt (dropped 0, overlimits 0 requeues 0) backlog 98880b 1648p requeues 0 qdisc netem 8002: parent 1:1 limit 100000 delay 2.0ms Sent 39601416 bytes 658721 pkt (dropped 0, overlimits 0 requeues 0) rate 38235Kbit 79657pps backlog 12240b 204p requeues 0 qdisc netem 8003: parent 1:2 limit 100000 delay 2.0ms Sent 39472866 bytes 657227 pkt (dropped 0, overlimits 0 requeues 0) rate 38234Kbit 79655pps backlog 10620b 176p requeues 0 qdisc netem 8004: parent 1:3 limit 100000 delay 2.0ms Sent 39703417 bytes 659699 pkt (dropped 0, overlimits 0 requeues 0) rate 38320Kbit 79831pps backlog 12780b 213p requeues 0 qdisc netem 8005: parent 1:4 limit 100000 delay 2.0ms Sent 39565149 bytes 658011 pkt (dropped 0, overlimits 0 requeues 0) rate 38174Kbit 79530pps backlog 11880b 198p requeues 0 qdisc netem 8006: parent 1:5 limit 100000 delay 2.0ms Sent 39506078 bytes 657354 pkt (dropped 0, overlimits 0 requeues 0) rate 38195Kbit 79571pps backlog 12480b 208p requeues 0 qdisc netem 8007: parent 1:6 limit 100000 delay 2.0ms Sent 39675994 bytes 658849 pkt (dropped 0, overlimits 0 requeues 0) rate 38323Kbit 79838pps backlog 12600b 210p requeues 0 qdisc netem 8008: parent 1:7 limit 100000 delay 2.0ms Sent 39532042 bytes 658367 pkt (dropped 0, overlimits 0 requeues 0) rate 38177Kbit 79536pps backlog 13140b 219p requeues 0 qdisc netem 8009: parent 1:8 limit 100000 delay 2.0ms Sent 39488164 bytes 657705 pkt (dropped 0, overlimits 0 requeues 0) rate 38192Kbit 79568pps backlog 13Kb 222p requeues 0 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 16:00:09 -07:00
Andrew Lunn	6c3e921b18	net: fec: Ensure clocks are enabled while using mdio bus When a switch is attached to the mdio bus, the mdio bus can be used while the interface is not open. If the IPG clock is not enabled, MDIO reads/writes will simply time out. Add support for runtime PM to control this clock. Enable/disable this clock using runtime PM, with open()/close() and mdio read()/write() function triggering runtime PM operations. Since PM is optional, the IPG clock is enabled at probe and is no longer modified by fec_enet_clk_enable(), thus if PM is not enabled in the kernel, it is guaranteed the clock is running when MDIO operations are performed. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 15:59:04 -07:00
Lendacky, Thomas	cfbfd86bfd	amd-xgbe: Fix DMA API debug warning When running a kernel configured with CONFIG_DMA_API_DEBUG=y a warning is issued: DMA-API: device driver tries to sync DMA memory it has not allocated This warning is the result of mapping the full range of the Rx buffer pages allocated and then performing a dma_sync_single_for_cpu against a calculated DMA address. The proper thing to do is to use the dma_sync_single_range_for_cpu with a base DMA address and an offset. Reported-by: Kim Phillips <kim.phillips@arm.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Kim Phillips <kim.phillips@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 15:57:14 -07:00
Chris Metcalf	673c2c34f6	modpost: work correctly with tile coldtext sections The tilegx and tilepro compilers use .coldtext for their unlikely executed text section name, so an __attribute__((cold)) function will (when compiled with higher optimization levels) land in the .coldtext section. Modify modpost to add .coldtext to the set of OTHER_TEXT_SECTIONS so we don't get warnings about referencing such a section in an __ex_table block, and then also modify arch/tile/lib/memcpy_user_64.c so that it uses plain ".coldtext" instead of ".coldtext.memcpy". The latter naming is a relic of an earlier use of -ffunction-sections, which we no longer use by default. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au>	2015-07-08 18:53:49 -04:00
Hariprasad Shenai	81aa507981	cxgb4: Add PCI device ids for few more T5 and T6 adapters Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 15:53:11 -07:00
Nicolas Dichtel	95ec655bc4	Revert "dev: set iflink to 0 for virtual interfaces" This reverts commit e1622baf54df8cc958bf29d71de5ad545ea7d93c. The side effect of this commit is to add a '@NONE' after each virtual interface name with a 'ip link'. It may break existing scripts. Reported-by: Olivier Hartkopp <socketcan@hartkopp.net> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 15:52:33 -07:00
Eric Dumazet	d339727c2b	net: graceful exit from netif_alloc_netdev_queues() User space can crash kernel with ip link add ifb10 numtxqueues 100000 type ifb We must replace a BUG_ON() by proper test and return -EINVAL for crazy values. Fixes: 60877a32bce00 ("net: allow large number of tx queues") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 15:46:17 -07:00
Carol Soto	0beb44b065	net/mlx4_core: Add extra check for total vfs for SRIOV Add extra check for total vfs for SRIOV to check if that value is bigger than total vfs in pci SRIOV capabalities. Fix a check and print of the number of maximum vfs that hw can handle. Fix a check and print of the number of maximum vfs per port that driver can handle. Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 15:19:54 -07:00
Michael Holzheu	d912557b34	samples: bpf: enable trace samples for s390x The trace bpf samples do not compile on s390x because they use x86 specific fields from the "pt_regs" structure. Fix this and access the fields via new PT_REGS macros. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 15:17:45 -07:00
Phil Sutter	142b942a75	rhashtable: fix for resize events during table walk If rhashtable_walk_next detects a resize operation in progress, it jumps to the new table and continues walking that one. But it misses to drop the reference to it's current item, leading it to continue traversing the new table's bucket in which the current item is sorted into, and after reaching that bucket's end continues traversing the new table's second bucket instead of the first one, thereby potentially missing items. This fixes the rhashtable runtime test for me. Bug probably introduced by Herbert Xu's patch eddee5ba ("rhashtable: Fix walker behaviour during rehash") although not explicitly tested. Fixes: eddee5ba ("rhashtable: Fix walker behaviour during rehash") Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 14:53:49 -07:00
Satish Ashok	f7e2965db1	bridge: mdb: start delete timer for temp static entries Start the delete timer when adding temp static entries so they can expire. Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 14:52:58 -07:00
Kristina Martsenko	9ccd608070	arm64: dts: add device tree for ARM SMM-A53x2 on LogicTile Express 20MG Add a DTS file for the MP2 Cortex-A53 Soft Macrocell Model implemented on a LogicTile Express 20MG (V2F-1XV7) daughterboard. This is based on the version that's currently available from the ARM DTS repository [1]. [1] git://linux-arm.org/arm-dts.git Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Kevin Hilman <khilman@linaro.org>	2015-07-08 14:44:56 -07:00
Sudeep Holla	3adf7aaa76	arm: dts: vexpress: add missing CCI PMU device node to TC2 The CCI device node was added to vexpress CA15_A7(i.e. TC2) much before the CCI PMU support and binding was added. This patch adds the missing PMU node so that CCI PMUs can be used on TC2. Cc: Liviu Dudau <liviu.dudau@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Punit Agrawal <punit.agrawal@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Kevin Hilman <khilman@linaro.org>	2015-07-08 14:44:55 -07:00
Mark Rutland	4d44f2a026	arm: dts: vexpress: describe all PMUs in TC2 dts The dts for the CoreTile Express A15x2 A7x3 (TC2) only describes the PMUs of the Cortex-A15 CPUs, and not the Cortex-A7 CPUs. Now that we have a mechanism for describing disparate PMUs and their interrupts in device tree, this patch makes use of these to describe the PMUs for all CPUs in the system. For consistency, the existing A15 PMU interrupt-affinity property is reflowed across two lines. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Liviu Dudau <liviu.dudau@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Kevin Hilman <khilman@linaro.org>	2015-07-08 14:44:55 -07:00
Punnaiah Choudary Kalluri	7baaa9092d	net: macb: Add SG support for Zynq SOC family Enable SG support for Zynq SOC family devices. Signed-off-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 14:42:46 -07:00
Peter Zijlstra	758556bdc1	module: Fix load_module() error path The load_module() error path frees a module but forgot to take it out of the mod_tree, leaving a dangling entry in the tree, causing havoc. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reported-by: Arthur Marsh <arthur.marsh@internode.on.net> Tested-by: Arthur Marsh <arthur.marsh@internode.on.net> Fixes: 93c2e105f6bc ("module: Optimize __module_address() using a latched RB-tree") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2015-07-09 06:57:12 +09:30
Tirumalesh Chalamarla	efc5120b82	GICv3: Add ITS entry to THUNDER dts The PCIe host controller uses MSIs provided by GICv3 ITS. Enable it on Thunder SoCs by adding an entry to DT. Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Kevin Hilman <khilman@linaro.org>	2015-07-08 14:24:57 -07:00
Kevin Hilman	8dfaf05682	move CSR rtc iobrg read/write API to be regmap this moves to general APIs, and all drivers will be changed based on it. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJVd+OkAAoJEDIv4aC191RhuWkP/2LzJBRoBP1WJmTOvR0Kofl7 4PG5Xi/+1NtSHubvTSJJX2kZ9I9FHemQAJba8pqZ0pgZkb+7t0dA2+G1mDi8SOr8 VyWlMotI8Im0PoUHXmt5GPHEI4E6WqyX1TQFUD4JM/Ln07mevH/ocjd94LN79eO6 tc1z8AEw1/d5MCcEI34FP+vNXyLOE/9hM7IsmBtf4Niq7VrO8SWS5y1tJBOz+d2v HmZZYEP/WS2mX8E77cz89pf0+BkmhF6nj1zoFCmlLVxCXrlKncs+Uvnsb+n/k1BI gA4UNOfSoA+k+h+1F/7BhyEABqYkEuBDYs9m5NsypL+WITaVnI9pmNWNAA7IZNTg dEYoVKqpJX3lVPWJ0i0hvgK1xXNMKlbwFb6iV80upvUVNfzZs9JveSL97Iiy9WjI LlRdCTlteBvgsNBJyVG//788VU5+dr8HXV1UCfm+drs3H43TudCnJ8AhF3JaToO+ wv045McJTUL/CI+WQiXGNlAoXDjprgjLSiXRyEKF1xNsem3u78Z5bpjGX5crpzgC HIdnsxb39jCHCNVJLuK/eCJ9Oocpo9LdKU0wSqIx0GNAx+xdoPo5H8jmlE4bnSWo 9f/9784A1SI/tpc0mKRz/z3FEKiP0i3CTJfcmC9EAd4LWFYk54WgtlzPztRkE625 YQMHjM+cavYWpzzgnQ+B =Beg0 -----END PGP SIGNATURE----- Merge tag 'sirf-iobrg2regmap-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/baohua/linux into fixes Merge "CSR SiRFSoC rtc iobrg move to regmap for 4.2" from Barry Song: move CSR rtc iobrg read/write API to be regmap this moves to general APIs, and all drivers will be changed based on it. * tag 'sirf-iobrg2regmap-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/baohua/linux: ARM: prima2: move to use REGMAP APIs for rtciobrg	2015-07-08 14:20:12 -07:00
Kevin Hilman	b649125350	add atlas7 pinctrl dts stuff add atlas7 pin groups and gpio/pin mapping descriptions -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJVd+HMAAoJEDIv4aC191RhuuoQAK1Gv4ZiK54aACFYcXUGv93g KaUXxekDkVbt+9ggs7tvmule1ABEEextcLBoKM7vck7jpbpZTYfQ7rI7QainCC23 f9c50ywOGd9w34k6vM23r4d8lTUnAfSm0CfDPHdu5zqDBxTW1eiSg2ASklYXpdGr uBTAPxzydUd/IF8kEC37itEv8IKZ+/PyIR94o8JOj5tQYsrw0W6HGM3blB/pmW+i FdviiaeSNVcAj90nCt2DmkJQ39JfDzW850eY/fZpeNozu7TD/UOi90RECL2US3/Q SLRucwYZ2yIJpH0+EPPpnenv3rxsl6KLnbcKS5Wp3ZGXzkuqRT1AzAorlV+reL1P jUw6LdYjfa4NpX/sd/4lgvF1IGAhGNKJnNM+WJfZr7abqxrf1Pg3t5IHRbqB3ux2 ptCK+P5qFdBs72TNHUl1cVKb+eL4RkvvkQAGXm7kHxCJ8UOAr3bPscTa/9QLoUK8 wNd0NcyvH2WsscNPcS09G7v54fX34RudEMLxAGJN0iYPr1tV3jwr3qf8ilClIZa9 Kdr1TodsZDqPNwomynLK+23tNFw0B1FZRcq2IZbnM1bgCArg6tOepXONhW5SDxHA rX8H7AAhctSAyziKD4zWi7uWhc2Phxw0g1lL9y1jJwmrGXixXzWizOOgNXmo7KgI yHqLobkYSAKK9N0K6jOJ =21Po -----END PGP SIGNATURE----- Merge tag 'atlas7-pinctrl-dts-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/baohua/linux into fixes Merge "CSR atlas7 pinctrl descriptions for 4.2" from Barry Song: add atlas7 pinctrl dts stuff add atlas7 pin groups and gpio/pin mapping descriptions * tag 'atlas7-pinctrl-dts-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/baohua/linux: ARM: dts: atlas7: add pinctrl and gpio descriptions	2015-07-08 14:18:45 -07:00
Li, Liang Z	6ab13b2769	xen-netback: remove duplicated function definition There are two duplicated xenvif_zerocopy_callback() definitions. Remove one of them. Signed-off-by: Liang Li <liang.z.li@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 14:12:53 -07:00
Rob Herring	cfa5200564	net: phy: add dependency on HAS_IOMEM to MDIO_BUS_MUX_MMIOREG On UML builds, mdio-mux-mmioreg.c fails to compile: drivers/net/phy/mdio-mux-mmioreg.c:50:3: error: implicit declaration of function ‘ioremap’ [-Werror=implicit-function-declaration] drivers/net/phy/mdio-mux-mmioreg.c:63:3: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration] This is due to CONFIG_OF now being user selectable. Add a dependency on HAS_IOMEM to fix this. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: David S. Miller <davem@davemloft.net> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 14:10:38 -07:00
Y Vo	3d8cc14152	arm64: dts: Add poweroff button device node for APM X-Gene platform This patch adds poweroff button device node to support poweroff feature on APM X-Gene Mustang platform. Signed-off-by: Y Vo <yvo@apm.com> Signed-off-by: Kevin Hilman <khilman@linaro.org>	2015-07-08 14:09:18 -07:00
Ralf Baechle	aa43c5ff73	NET: hamradio: Fix IP over bpq encapsulation. Since 1d5da757da860a6916adbf68b09e868062b4b3b8 (ax25: Stop using magic neighbour cache operations.) any attempt to transmit IP packets over a bpqether device will result in a message like "Dead loop on virtual device bpq0, fix it urgently!" Fix suggested by Eric W. Biederman <ebiederm@xmission.com>. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: <stable@vger.kernel.org> # 4.1 Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 14:09:03 -07:00
Eric Dumazet	32f675bbc9	net_sched: gen_estimator: extend pps limit rate estimators are limited to 4 Mpps, which was fine years ago, but too small with current hardware generation. Lets use 2^5 scaling instead of 2^10 to get 128 Mpps new limit. On 64bit arch, use an "unsigned long" for temp storage and remove limit. (We do not expect 32bit arches to be able to reach this point) Tested: tc -s -d filter sh dev eth0 parent ffff: filter protocol ip pref 1 u32 filter protocol ip pref 1 u32 fh 800: ht divisor 1 filter protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:15 match 07000000/ff000000 at 12 action order 1: gact action drop random type none pass val 0 index 1 ref 1 bind 1 installed 166 sec Action statistics: Sent 39734251496 bytes 863788076 pkt (dropped 863788117, overlimits 0 requeues 0) rate 4067Mbit 11053596pps backlog 0b 0p requeues 0 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:59:20 -07:00
David S. Miller	8685255ec5	Merge branch 'sch_act_lockless' Eric Dumazet says: ==================== net_sched: act: lockless operation As mentioned by Alexei last week in Budapest, it is a bit weird to take a spinlock in order to drop a packet in a tc filter... Lets add percpu infra for tc actions and use it for gact & mirred. Before changes, my host with 8 RX queues was handling 5 Mpps with gact, and more than 11 Mpps after. Mirred change is not yet visible if ifb+qdisc is used, as ifb is not yet multi queue enabled, but is a step forward. ==================== Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:50:42 -07:00
Eric Dumazet	2ee22a90c7	net_sched: act_mirred: remove spinlock in fast path Like act_gact, act_mirred can be lockless in packet processing 1) Use percpu stats 2) update lastuse only every clock tick to avoid false sharing 3) use rcu to protect tcfm_dev 4) Remove spinlock usage, as it is no longer needed. Next step : add multi queue capability to ifb device Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.fastabend@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:50:42 -07:00
Eric Dumazet	56e5d1ca18	net_sched: act_gact: remove spinlock in fast path Final step for gact RCU operation : 1) Use percpu stats 2) update lastuse only every clock tick to avoid false sharing 3) Remove spinlock acquisition, as it is no longer needed. Since this is the last contended lock in packet RX when tc gact is used, this gives impressive gain. My host with 8 RX queues was handling 5 Mpps before the patch, and more than 11 Mpps after patch. Tested: On receiver : dev=eth0 tc qdisc del dev $dev ingress 2>/dev/null tc qdisc add dev $dev ingress tc filter del dev $dev root pref 10 2>/dev/null tc filter del dev $dev pref 10 2>/dev/null tc filter add dev $dev est 1sec 4sec parent ffff: protocol ip prio 1 \ u32 match ip src 7.0.0.0/8 flowid 1:15 action drop Sender sends packets flood from 7/8 network Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:50:42 -07:00
Eric Dumazet	8f2ae965b7	net_sched: act_gact: read tcfg_ptype once Third step for gact RCU operation : Following patch will get rid of spinlock protection, so we need to read tcfg_ptype once. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:50:42 -07:00
Eric Dumazet	cc6510a950	net_sched: act_gact: use a separate packet counters for gact_determ() Second step for gact RCU operation : We want to get rid of the spinlock protecting gact operations. Stats (packets/bytes) will soon be per cpu. gact_determ() would not work without a central packet counter, so lets add it for this mode. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:50:42 -07:00
Eric Dumazet	cef5ecf96b	net_sched: act_gact: make tcfg_pval non zero First step for gact RCU operation : Instead of testing if tcfg_pval is zero or not, just make it 1. No change in behavior, but slightly faster code. The smp_rmb()/smp_wmb() barriers, while not strictly needed at this stage are added for upcoming spinlock removal. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:50:42 -07:00
Eric Dumazet	519c818e8f	net: sched: add percpu stats to actions Reuse existing percpu infrastructure John Fastabend added for qdisc. This patch adds a new cpustats parameter to tcf_hash_create() and all actions pass false, meaning this patch should have no effect yet. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:50:41 -07:00
Eric Dumazet	24ea591d22	net: sched: extend percpu stats helpers qdisc_bstats_update_cpu() and other helpers were added to support percpu stats for qdisc. We want to add percpu stats for tc action, so this patch add common helpers. qdisc_bstats_update_cpu() is renamed to qdisc_bstats_cpu_update() qdisc_qstats_drop_cpu() is renamed to qdisc_qstats_cpu_drop() Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:50:41 -07:00
Eric Dumazet	0a6d424569	mlx4: TCP/UDP packets have L4 hash Mellanox driver has the knowledge if rxhash is a L4 hash, if it receives a non fragmented TCP or UDP frame and NETIF_F_RXCSUM is enabled on netdev. ip_summed value is CHECKSUM_UNNECESSARY in this case. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Ido Shamay <idos@mellanox.com> Acked-by: Ido Shamay <idos@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:45:22 -07:00
David S. Miller	71c3353775	Merge branch 'tcp-policer-drops' Yuchung Cheng says: ==================== tcp: reducing lost retransmits in recovery This patch series reduces lost retransmits in recovery, in particular when dealing with traffic policers. The main problem is that slow start in recovery under policing can cause massive lost and retransmit storms: any excess sending rate turns into drops. The solution is to avoid doing slow start when lost retransmit is detected and use packet conservation instead. On networks with traffic policers the patches have lowered the TCP loss rates by ~20% from Google servers without latency regressions. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:29:46 -07:00
Yuchung Cheng	3759824da8	tcp: PRR uses CRB mode by default and SS mode conditionally PRR slow start is often too aggressive especially when drops are caused by traffic policers. The policers mainly use token bucket to enforce the rate so sending (twice) faster than the delivery rate causes excessive drops. This patch changes PRR to the conservative reduction bound (CRB) mode in RFC 6937 by default. CRB follows the packet conservation rule to send at most the delivery rate by default. But if many packets are lost and the pipe is empty, CRB may take N round trips to repair N losses. We conditionally turn on slow start mode if all these conditions are made to speed up the recovery: 1) on the second round or later in recovery 2) retransmission sent in the previous round is delivered on this ACK 3) no retransmission is marked lost on this ACK By using packet conservation by default, this change reduces the loss retransmits signicantly on networks that deploy traffic policers, up to 20% reduction of overall loss rate. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:29:46 -07:00
Yuchung Cheng	291a00d1a7	tcp: reduce cwnd if retransmit is lost in CA_Loss If the retransmission in CA_Loss is lost again, we should not continue to slow start or raise cwnd in congestion avoidance mode. Instead we should enter fast recovery and use PRR to reduce cwnd, following the principle in RFC5681: "... or the loss of a retransmission, should be taken as two indications of congestion and, therefore, cwnd (and ssthresh) MUST be lowered twice in this case." This is especially important to reduce loss when the CA_Loss state was caused by a traffic policer dropping the entire inflight. The CA_Loss state has a problem where a loss of L packets causes the sender to send a burst of L packets. So a policer that's dropping most packets in a given RTT can cause a huge retransmit storm. By contrast, PRR includes logic to bound the number of outbound packets that result from a given ACK. So switching to CA_Recovery on lost retransmits in CA_Loss avoids this retransmit storm problem when in CA_Loss. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:29:45 -07:00
Mark Rutland	1b42804d27	arm64: entry: handle debug exceptions in el_inv Currently we enable debug exceptions before reading ESR_EL1 in both el0_inv and el1_inv. If a debug exception is taken before we read ESR_EL1, the value will have been corrupted. As el_inv is typically fatal, an intervening debug exception results in misleading debug information being logged to the console, but is not otherwise harmful. As with the other entry paths, we can use the ESR_EL1 value stashed earlier in the exception entry (in x25 for el0_sync{,_compat}, and x1 for el1_sync), giving us better error reporting in this case. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2015-07-08 18:03:48 +01:00
Dan Carpenter	866a920403	drm/radeon: fix underflow in r600_cp_dispatch_texture() The "if (pass_size > buf->total)" can underflow so I have changed the type of size and pass_size to unsigned to avoid this problem. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-07-08 12:41:43 -04:00
Grigori Goronzy	5e3c4f9070	drm/radeon: default to 2048 MB GART size on SI+ Newer ASICs have more VRAM on average and allocating more GART as well can have advantages. Also see commit edcd26e8. Ideally, we should scale GART size based on actual VRAM size, but that requires significant restructuring of initialization. v2: extract small helper, apply to error paths Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Grigori Goronzy <greg@chown.ath.cx> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-07-08 12:41:41 -04:00
Grigori Goronzy	54e0398613	drm/radeon: fix HDP flushing This was regressed by commit 39e7f6f8, although I don't know of any actual issues caused by it. The storage domain is read without TTM locking now, but the lock never helped to prevent any races. Reviewed-by: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Grigori Goronzy <greg@chown.ath.cx> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-07-08 12:41:40 -04:00
Grigori Goronzy	828202a382	drm/radeon: use RCU query for GEM_BUSY syscall We don't need to call the (expensive) radeon_bo_wait, checking the fences via RCU is much faster. The reservation done by radeon_bo_wait does not save us from any race conditions. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Grigori Goronzy <greg@chown.ath.cx> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-07-08 12:41:39 -04:00
Mario Kleiner	bd833144a2	drm/amdgpu: Handle irqs only based on irq ring, not irq status regs. This is a translation of the patch ... "drm/radeon: Handle irqs only based on irq ring, not irq status regs." ... for the vblank irq handling, to fix the same problem described in that patch on the new driver. Only compile tested due to lack of suitable hw. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> CC: Michel Dänzer <michel.daenzer@amd.com> CC: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-07-08 12:41:37 -04:00
Mario Kleiner	07f18f0bb8	drm/radeon: Handle irqs only based on irq ring, not irq status regs. Trying to resolve issues with missed vblanks and impossible values inside delivered kms pageflip completion events showed that radeon's irq handling sometimes doesn't handle valid irqs, but silently skips them. This was observed for vblank interrupts. Although those irqs have corresponding events queued in the gpu's irq ring at time of interrupt, and therefore the corresponding handling code gets triggered by these events, the handling code sometimes silently skipped processing the irq. The reason for those skips is that the handling code double-checks for each irq event if the corresponding irq status bits in the irq status registers are set. Sometimes those bits are not set at time of check for valid irqs, maybe due to some hardware race on some setups? The problem only seems to happen on some machine + card combos sometimes, e.g., never happened during my testing of different PC cards of the DCE-2/3/4 generation a year ago, but happens consistently now on two different Apple Mac cards (RV730, DCE-3, Apple iMac and Evergreen JUNIPER, DCE-4 in a Apple MacPro). It also doesn't happen at each interrupt but only occassionally every couple of hundred or thousand vblank interrupts. This results in XOrg warning messages like "[ 7084.472] (WW) RADEON(0): radeon_dri2_flip_event_handler: Pageflip completion event has impossible msc 420120 < target_msc 420121" as well as skipped frames and problems for applications that use kms pageflip events or vblank events, e.g., users of DRI2 and DRI3/Present, Waylands Weston compositor, etc. See also https://bugs.freedesktop.org/show_bug.cgi?id=85203 After some talking to Alex and Michel, we decided to fix this by turning the double-check for asserted irq status bits into a warning. Whenever a irq event is queued in the IH ring, always execute the corresponding interrupt handler. Still check the irq status bits, but only to log a DRM_DEBUG message on a mismatch. This fixed the problems reliably on both previously failing cards, RV-730 dual-head tested on both crtcs (pipes D1 and D2) and a triple-output Juniper HD-5770 card tested on all three available crtcs (D1/D2/D3). The r600 and evergreen irq handling is therefore tested, but the cik an si handling is only compile tested due to lack of hw. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> CC: Michel Dänzer <michel.daenzer@amd.com> CC: Alex Deucher <alexander.deucher@amd.com> CC: <stable@vger.kernel.org> # v3.16+ Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-07-08 12:41:36 -04:00
Linus Torvalds	45820c294f	Fix broken audit tests for exec arg len The "fix" in commit 0b08c5e5944 ("audit: Fix check of return value of strnlen_user()") didn't fix anything, it broke things. As reported by Steven Rostedt: "Yes, strnlen_user() returns 0 on fault, but if you look at what len is set to, than you would notice that on fault len would be -1" because we just subtracted one from the return value. So testing against 0 doesn't test for a fault condition, it tests against a perfectly valid empty string. Also fix up the usual braindamage wrt using WARN_ON() inside a conditional - make it part of the conditional and remove the explicit unlikely() (which is already part of the WARN_ON*() logic, exactly so that you don't have to write unreadable code. Reported-and-tested-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jan Kara <jack@suse.cz> Cc: Paul Moore <pmoore@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-07-08 09:33:38 -07:00
Daniel Vetter	dec4f799d0	drm/i915: Use crtc_state->active in primary check_plane func Since commit 8c7b5ccb729870e606321b3703e2c2e698c49a95 Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Date: Tue Apr 21 17:13:19 2015 +0300 drm/i915: Use atomic helpers for computing changed flags we compute the plane state for a modeset before actually committing any changes, which means crtc->active won't be correct yet. Looking at future work in the modeset conversion targetting 4.3 the only places where crtc_state->active isn't accurate is when disabling other CRTCs than the one the modeset is for (when stealing connectors). Which isn't the case here. And that's also confirmed by an audit, we do unconditionally update crtc_state->active for the current pipe. We also don't need to update any other plane check functions since we only ever add the primary state to the modeset update right now. Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-07-08 16:42:25 +02:00
Daniel Vetter	63fef06ada	drm/i915: Check crtc->active in intel_crtc_disable_planes This was lost in commit ce22dba92de22e951dee2ff89937a39754d2dd91 Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Tue Apr 21 17:12:56 2015 +0300 drm/i915: Move toggling planes out of crtc enable/disable. and we still need that crtc->active check since the overall modeset flow doesn't yet take dpms state into account properly. Fixes WARNING backtraces on at least bdw/hsw due to the ips disabling code being upset about being run on a switched-off pipe. We don't need a corresponding change on the enable side since with the old setCrtc semantics we always force-enable the pipe after a modeset. And the dpms function intel_crtc_control already checks for ->active. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-07-08 16:42:25 +02:00
Thomas Gleixner	a899418167	hotplug: Prevent alloc/free of irq descriptors during cpu up/down When a cpu goes up some architectures (e.g. x86) have to walk the irq space to set up the vector space for the cpu. While this needs extra protection at the architecture level we can avoid a few race conditions by preventing the concurrent allocation/free of irq descriptors and the associated data. When a cpu goes down it moves the interrupts which are targeted to this cpu away by reassigning the affinities. While this happens interrupts can be allocated and freed, which opens a can of race conditions in the code which reassignes the affinities because interrupt descriptors might be freed underneath. Example: CPU1 CPU2 cpu_up/down irq_desc = irq_to_desc(irq); remove_from_radix_tree(desc); raw_spin_lock(&desc->lock); free(desc); We could protect the irq descriptors with RCU, but that would require a full tree change of all accesses to interrupt descriptors. But fortunately these kind of race conditions are rather limited to a few things like cpu hotplug. The normal setup/teardown is very well serialized. So the simpler and obvious solution is: Prevent allocation and freeing of interrupt descriptors accross cpu hotplug. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: xiao jin <jin.xiao@intel.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Borislav Petkov <bp@suse.de> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Link: http://lkml.kernel.org/r/20150705171102.063519515@linutronix.de	2015-07-08 11:32:25 +02:00
Tvrtko Ursulin	2c3d99845e	drm/i915: Restore all GGTT VMAs on resume When rotated and partial views were added no one spotted the resume path which assumes only one GGTT VMA per object and hence is now skipping rebind of alternative views. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-07-08 11:28:47 +02:00
Sébastien Hinderer	69711ca19b	x86/kconfig: Fix typo in the CONFIG_CMDLINE_BOOL help text Signed-off-by: Sébastien Hinderer <Sebastien.Hinderer@ens-lyon.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samuel Thibault <Samuel.Thibault@ens-lyon.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-07-08 11:10:56 +02:00

... 3 4 5 6 7 ...

533486 Commits