638270 Commits

Author SHA1 Message Date
Linus Torvalds
ce38aa9cbe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Platform regulatory domain support for ath10k, from Bartosz
    Markowski.

 2) Centralize min/max MTU checking, thus removing tons of duplicated
    code all of the the various drivers. From Jarod Wilson.

 3) Support ingress actions in act_mirred, from Shmulik Ladkani.

 4) Improve device adjacency tracking, from David Ahern.

 5) Add support for LED triggers on PHY link state changes, from Zach
    Brown.

 6) Improve UDP socket memory accounting, from Paolo Abeni.

 7) Set SK_MEM_QUANTUM to a fixed size of 4096, instead of PAGE_SIZE.
    From Eric Dumazet.

 8) Collapse TCP SKBs at retransmit time even if the right side SKB has
    frags. Also from Eric Dumazet.

 9) Add IP_RECVFRAGSIZE and IPV6_RECVFRAGSIZE cmsgs, from Willem de
    Bruijn.

10) Support routing by UID, from Lorenzo Colitti.

11) Handle L3 domain binding (ie. VRF) for RAW sockets, from David
    Ahern.

12) tcp_get_info() can run lockless, from Eric Dumazet.

13) 4-tuple UDP hashing in SFC driver, from Edward Cree.

14) Avoid reorders in GRO code, from Eric Dumazet.

15) IPV6 Segment Routing support, from David Lebrun.

16) Support MPLS push and pop for L3 packets in openvswitch, from Jiri
    Benc.

17) Add LRU datastructure support for BPF, Martin KaFai Lau.

18) VF support in liquidio driver, from Raghu Vatsavayi.

19) Multiqueue support in alx driver, from Tobias Regnery.

20) Networking cgroup BPF support, from Daniel Mack.

21) TCP chronograph measurements, from Francis Yan.

22) XDP support for qed driver, from Yuval Mintz.

23) BPF based lwtunnels, from Thomas Graf.

24) Consistent FIB dumping to offloading drivers, from Ido Schimmel.

25) Many optimizations for UDP under high load, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  netfilter: nft_counter: rework atomic dump and reset
  e1000: use disable_hardirq() for e1000_netpoll()
  i40e: don't truncate match_method assignment
  net: ethernet: ti: netcp: add support of cpts
  net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause
  net: l2tp: ppp: change PPPOL2TP_MSG_* => L2TP_MSG_*
  net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*
  net: l2tp: export debug flags to UAPI
  net: ethernet: stmmac: remove private tx queue lock
  net: ethernet: sxgbe: remove private tx queue lock
  net: bridge: shorten ageing time on topology change
  net: bridge: add helper to set topology change
  net: bridge: add helper to offload ageing time
  net: nicvf: use new api ethtool_{get|set}_link_ksettings
  net: ethernet: ti: cpsw: sync rates for channels in dual emac mode
  net: ethernet: ti: cpsw: re-split res only when speed is changed
  net: ethernet: ti: cpsw: combine budget and weight split and check
  net: ethernet: ti: cpsw: don't start queue twice
  net: ethernet: ti: cpsw: use same macros to get active slave
  net: mvneta: select GENERIC_ALLOCATOR
  ...
2016-12-12 07:54:15 -08:00
Randy Dunlap
7c7808ce10 openrisc: prevent VGA console, fix builds
OpenRISC does not support VGA console, so prevent that kconfig symbol
from being enabled for OpenRISC, thus fixing these build errors:

   drivers/built-in.o: In function `vgacon_save_screen':
   vgacon.c:(.text+0x20e0): undefined reference to `screen_info'
   vgacon.c:(.text+0x20e8): undefined reference to `screen_info'
   drivers/built-in.o: In function `vgacon_init':
   vgacon.c:(.text+0x284c): undefined reference to `screen_info'
   vgacon.c:(.text+0x2850): undefined reference to `screen_info'
   drivers/built-in.o: In function `vgacon_startup':
   vgacon.c:(.text+0x28d8): undefined reference to `screen_info'
   drivers/built-in.o:vgacon.c:(.text+0x28f0): more undefined references to `screen_info' follow

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Chen Gang <gang.chen@asianux.com>
Cc: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2016-12-12 23:10:29 +09:00
Stefan Kristiansson
cdb75442fe openrisc: include l.swa in check for write data pagefault
During page fault handling we check the last instruction to understand
if the fault was for a read or for a write.  By default we fall back to
read.  New instructions were added to the openrisc 1.1 spec for an
atomic load/store pair (l.lwa/l.swa).

This patch adds the opcode for l.swa (0x33) allowing it to be treated as
a write operation.

Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
[shorne@gmail.com: expanded a bit on the comment]
Signed-off-by: Stafford Horne <shorne@gmail.com>
2016-12-12 23:10:26 +09:00
Stafford Horne
d01e1f35fe openrisc: Updates after openrisc.net has been lost
The openrisc.net domain expired and was taken over by squatters.
These updates point documentation to the new domain, mailing lists
and git repos.

Also, Jonas is not the main maintainer anylonger, he reviews changes
but does not maintain a repo or sent pull requests.  Updating this to
add Stafford and Stefan who are the active maintainers.

Acked-by: Olof Kindgren <olof.kindgren@gmail.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2016-12-12 23:10:19 +09:00
Stafford Horne
266c7fad15 openrisc: Consolidate setup to use memblock instead of bootmem
Clearing out one todo item. Use the memblock boot time memory
which is the current standard.

Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Jonas <jonas@southpole.se>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2016-12-12 23:10:00 +09:00
Rob Herring
994894c3f7 openrisc: remove the redundant of_platform_populate
The of_platform_populate call in the openrisc arch code is now redundant
as the DT core provides a default call. Openrisc has a NULL match table
which means only top level nodes with compatible strings will have
devices creates. The default version will also descend nodes in the
match table such as "simple-bus" which should be fine as openrisc
doesn't have any of these (though it is preferred that memory-mapped
peripherals be grouped under a bus node(s)).

Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Jonas Bonn <jonas@southpole.se>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2016-12-12 23:09:57 +09:00
Stafford Horne
34bbdcdcda openrisc: add NR_CPUS Kconfig default value
The build system now expects that NR_CPUS is defined.

Follow 4cbbbb4 ("microblaze: Fix missing NR_CPUS in menuconfig")

Signed-off-by: Stafford Horne <shorne@gmail.com>
2016-12-12 23:09:51 +09:00
Guenter Roeck
1f43e235df openrisc: Support both old (or32) and new (or1k) toolchain
The output file format for or1k has changed from "elf32-or32"
to "elf32-or1k". Select the correct output format automatically
to be able to compile the kernel with both toolchain variants.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2016-12-12 23:09:40 +09:00
Christian Svensson
e60aa2fba4 openrisc: Add thread-local storage (TLS) support
Historically OpenRISC GCC has reserved r10 which we now use to hold
the thread pointer for thread-local storage (TLS).

Signed-off-by: Christian Svensson <blue@cmd.nu>
Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2016-12-12 23:09:28 +09:00
Jonas Bonn
c79902190f openrisc: restore all regs on rt_sigreturn
Fix signal handling for when signals are handled as the result of timers
or exceptions, previous code assumed syscalls. This was noticeable with X
crashing where it uses SIGALRM.

This patch restores all regs before returning to userspace via
_resume_userspace instead of via syscall return path.

The rt_sigreturn syscall is more like a context switch than a function
call; it entails a return from one context (the signal handler) to another
(the process in question).  For a context switch like this there are
effectively no call-saved regs that remain constant across the transition.

Reported-by: Sebastian Macke <sebastian@macke.de>
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Tested-by: Guenter Roeck <linux@roeck-us.net>
[shorne@gmail.com: Updated comment better reflect change and issue]
Signed-off-by: Stafford Horne <shorne@gmail.com>
2016-12-12 23:09:16 +09:00
Stefan Kristiansson
f47706099b openrisc: fix PTRS_PER_PGD define
On OpenRISC, with its 8k pages, PAGE_SHIFT is defined to be 13.
That makes the expression (1UL << (PAGE_SHIFT-2)) evaluate
to 2048.
The correct value for PTRS_PER_PGD should be 256.

Correcting the PTRS_PER_PGD define unveiled a bug in map_ram(),
where PTRS_PER_PGD was used when the intent was to iterate
over a set of page table entries.
This patch corrects that issue as well.

Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Acked-by: Jonas Bonn <jonas@southpole.se>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2016-12-12 23:09:06 +09:00
Hans-Christian Noren Egtvedt
8712a5b807 avr32: wire up pkey syscalls
This patch wires up the new pkey_mprotect, pkey_alloc and pkey_free syscalls on
AVR32.
2016-12-12 09:24:13 +01:00
Markus Elfring
79ba1814da AVR32-pio: Replace two seq_printf() calls by seq_puts() in pio_bank_show()
Strings which did not contain data format specifications should be put
into a sequence. Thus use the corresponding function "seq_puts".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
2016-12-12 09:05:46 +01:00
Markus Elfring
78b7c6bff1 AVR32-pio: Use seq_putc() in pio_bank_show()
A single character (line break) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
2016-12-12 09:05:46 +01:00
Markus Elfring
017e7d0246 AVR32-clock: Combine nine seq_printf() calls into one call in clk_show()
Some data were printed into a sequence by nine separate function calls.
Print the same data by a single function call instead.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
2016-12-12 09:05:46 +01:00
Markus Elfring
65c0787ce0 AVR32-clock: Use seq_putc() in two functions
A single character (line break) should be put into two sequences.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
2016-12-12 09:05:46 +01:00
Gonglei \(Arei\)
541cc39433 sparc: fix a building error reported by kbuild
>> arch/sparc/include/asm/topology_64.h:44:44:
error: implicit declaration of function 'cpu_data'
[-Werror=implicit-function-declaration]

 #define topology_physical_package_id(cpu) (cpu_data(cpu).proc_id)
                                               ^
Let's include cpudata.h in topology_64.h.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-11 18:25:20 -08:00
Kirill A. Shutemov
acff7fdbc3 sparc64: fix typo in pgd_clear()
It really has to be pgdp, not pgd.

It just happend to work since all callers have 'pgd' as an argument.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-11 18:15:49 -08:00
Dan Carpenter
e241cfd3bd sparc64: restore irq in error paths in iommu
There are some error paths where we should restore IRQs but we don't.

Fixes: bb620c3d3925 ("sparc: Make sparc64 use scalable lib/iommu-common.c functions")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-11 18:15:49 -08:00
Dan Carpenter
601e6e3cc5 sparc: leon: Fix a retry loop in leon_init_timers()
The original code causes a static checker warning because it has a
continue inside a do { } while (0); loop.  In that context, a continue
and a break are equivalent.  The intent was to go back to the start of
the loop so the continue was a bug.

I've added a retry label at the start and changed the continue to a goto
retry.  Then I removed the do { } while (0) loop and pulled the code in
one indent level.

Fixes: 2791c1a43900 ("SPARC/LEON: added support for selecting Timer Core and Timer within core")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-11 18:15:49 -08:00
Dan Carpenter
b5c3206190 sparc64: make string buffers large enough
My static checker complains that if "lvl" is ULONG_MAX (this is 64 bit)
then some of the strings will overflow.  I don't know if that's possible
but it seems simple enough to make the buffers slightly larger.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-11 18:15:48 -08:00
Dan Carpenter
efca4885b5 sparc64: move dereference after check for NULL
We shouldn't dereference "iommu" until after we have checked that it is
non-NULL.

Fixes: f08978b0fdbf ("sparc64: Enable sun4v dma ops to use IOMMU v2 APIs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-11 18:15:48 -08:00
Geliang Tang
b1ebb97550 sparc: kernel: use builtin_platform_driver
Use builtin_platform_driver() helper to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-11 18:15:48 -08:00
Allen Pais
e8f4aa6087 sparc64:Support User Probes for sparc
Signed-off-by: Eric Saint Etienne <eric.saint.etienne@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-11 18:01:51 -08:00
Daniel Vetter
868c97a846 dma-buf: Extract dma-buf.rst
Just prep work to polish and consolidate all the dma-buf related
documenation.

Unfortunately I didn't discover a way to both integrate this new file
into the overall toc while keeping it at the current place. Work
around that by moving it into the overall driver-api/index.rst.

Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2016-12-11 13:37:55 -07:00
Linus Torvalds
69973b8308 Linux 4.9 2016-12-11 11:17:54 -08:00
Linus Torvalds
2e4333c14d Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "Two more MIPS fixes for 4.9:

   - RTC: Return -ENODEV so an external RTC will be tried

   - Fix mask of GPE frequency

  These two have been tested on Imagination's automated test system and
  also both received positive reviews on the linux-mips mailing list"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Lantiq: Fix mask of GPE frequency
  MIPS: Return -ENODEV from weak implementation of rtc_mips_set_time
2016-12-11 10:17:39 -08:00
Pablo Neira
d84701ecbc netfilter: nft_counter: rework atomic dump and reset
Dump and reset doesn't work unless cmpxchg64() is used both from packet
and control plane paths. This approach is going to be slow though.
Instead, use a percpu seqcount to fetch counters consistently, then
subtract bytes and packets in case a reset was requested.

The cpu that running over the reset code is guaranteed to own this stats
exclusively, we have to turn counters into signed 64bit though so stats
update on reset don't get wrong on underflow.

This patch is based on original sketch from Eric Dumazet.

Fixes: 43da04a593d8 ("netfilter: nf_tables: atomic dump and reset for stateful objects")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-11 10:01:05 -05:00
Vincent Guittot
6b94780e45 sched/core: Use load_avg for selecting idlest group
find_idlest_group() only compares the runnable_load_avg when looking
for the least loaded group. But on fork intensive use case like
hackbench where tasks blocked quickly after the fork, this can lead to
selecting the same CPU instead of other CPUs, which have similar
runnable load but a lower load_avg.

When the runnable_load_avg of 2 CPUs are close, we now take into
account the amount of blocked load as a 2nd selection factor. There is
now 3 zones for the runnable_load of the rq:

 - [0 .. (runnable_load - imbalance)]:
	Select the new rq which has significantly less runnable_load

 - [(runnable_load - imbalance) .. (runnable_load + imbalance)]:
	The runnable loads are close so we use load_avg to chose
	between the 2 rq

 - [(runnable_load + imbalance) .. ULONG_MAX]:
	Keep the current rq which has significantly less runnable_load

The scale factor that is currently used for comparing runnable_load,
doesn't work well with small value. As an example, the use of a
scaling factor fails as soon as this_runnable_load == 0 because we
always select local rq even if min_runnable_load is only 1, which
doesn't really make sense because they are just the same. So instead
of scaling factor, we use an absolute margin for runnable_load to
detect CPUs with similar runnable_load and we keep using scaling
factor for blocked load.

For use case like hackbench, this enable the scheduler to select
different CPUs during the fork sequence and to spread tasks across the
system.

Tests have been done on a Hikey board (ARM based octo cores) for
several kernel. The result below gives min, max, avg and stdev values
of 18 runs with each configuration.

The patches depend on the "no missing update_rq_clock()" work.

hackbench -P -g 1

         ea86cb4b7621  7dc603c9028e  v4.8        v4.8+patches
  min    0.049         0.050         0.051       0,048
  avg    0.057         0.057(0%)     0.057(0%)   0,055(+5%)
  max    0.066         0.068         0.070       0,063
  stdev  +/-9%         +/-9%         +/-8%       +/-9%

More performance numbers here:

  https://lkml.kernel.org/r/20161203214707.GI20785@codeblueprint.co.uk

Tested-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: kernellwp@gmail.com
Cc: umgwanakikbuti@gmail.com
Cc: yuyang.du@intel.comc
Link: http://lkml.kernel.org/r/1481216215-24651-3-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:10:57 +01:00
Vincent Guittot
f519a3f1c6 sched/core: Fix find_idlest_group() for fork
During fork, the utilization of a task is init once the rq has been
selected because the current utilization level of the rq is used to
set the utilization of the fork task. As the task's utilization is
still 0 at this step of the fork sequence, it doesn't make sense to
look for some spare capacity that can fit the task's utilization.
Furthermore, I can see perf regressions for the test:

   hackbench -P -g 1

because the least loaded policy is always bypassed and tasks are not
spread during fork.

With this patch and the fix below, we are back to same performances as
for v4.8. The fix below is only a temporary one used for the test
until a smarter solution is found because we can't simply remove the
test which is useful for others benchmarks

| @@ -5708,13 +5708,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
|
|	avg_cost = this_sd->avg_scan_cost;
|
| -	/*
| -	 * Due to large variance we need a large fuzz factor; hackbench in
| -	 * particularly is sensitive here.
| -	 */
| -	if ((avg_idle / 512) < avg_cost)
| -		return -1;
| -
|	time = local_clock();
|
|	for_each_cpu_wrap(cpu, sched_domain_span(sd), target, wrap) {

Tested-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: kernellwp@gmail.com
Cc: umgwanakikbuti@gmail.com
Cc: yuyang.du@intel.comc
Link: http://lkml.kernel.org/r/1481216215-24651-2-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:10:56 +01:00
Ingo Molnar
6643aab30f Merge branch 'linus' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:10:40 +01:00
Peter Zijlstra
11f254dbb3 x86/paravirt: Fix bool return type for PVOP_CALL()
Commit:

  3cded4179481 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()")

introduced a paravirt op with bool return type [*]

It turns out that the PVOP_CALL*() macros miscompile when rettype is
bool. Code that looked like:

   83 ef 01                sub    $0x1,%edi
   ff 15 32 a0 d8 00       callq  *0xd8a032(%rip)        # ffffffff81e28120 <pv_lock_ops+0x20>
   84 c0                   test   %al,%al

ended up looking like so after PVOP_CALL1() was applied:

   83 ef 01                sub    $0x1,%edi
   48 63 ff                movslq %edi,%rdi
   ff 14 25 20 81 e2 81    callq  *0xffffffff81e28120
   48 85 c0                test   %rax,%rax

Note how it tests the whole of %rax, even though a typical bool return
function only sets %al, like:

  0f 95 c0                setne  %al
  c3                      retq

This is because ____PVOP_CALL() does:

		__ret = (rettype)__eax;

and while regular integer type casts truncate the result, a cast to
bool tests for any !0 value. Fix this by explicitly truncating to
sizeof(rettype) before casting.

[*] The actual bug should've been exposed in commit:
      446f3dc8cc0a ("locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests")
    but that didn't properly implement the paravirt call.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 3cded4179481 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()")
Link: http://lkml.kernel.org/r/20161208154349.346057680@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:09:20 +01:00
Peter Zijlstra
45dbea5f55 x86/paravirt: Fix native_patch()
While chasing a regression I noticed we potentially patch the wrong
code in native_patch().

If we do not select the native code sequence, we must use the default
patcher, not fall-through the switch case.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel test robot <xiaolong.ye@intel.com>
Fixes: 3cded4179481 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()")
Link: http://lkml.kernel.org/r/20161208154349.270616999@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:09:19 +01:00
Ingo Molnar
6f38751510 Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:07:13 +01:00
Andi Kleen
b0c1ef5295 perf/x86: Fix exclusion of BTS and LBR for Goldmont
An earlier patch allowed enabling PT and LBR at the same
time on Goldmont. However it also allowed enabling BTS and LBR
at the same time, which is still not supported. Fix this by
bypassing the check only for PT.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: alexander.shishkin@intel.com
Cc: kan.liang@intel.com
Cc: <stable@vger.kernel.org>
Fixes: ccbebba4c6bf ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it")
Link: http://lkml.kernel.org/r/20161209001417.4713-1-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:06:09 +01:00
Ingo Molnar
6de75a37b8 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:05:59 +01:00
Hauke Mehrtens
ba735155b9 MIPS: Lantiq: Fix mask of GPE frequency
The hardware documentation says bit 11:10 are used for the GPE
frequency selection. Fix the mask in the define to match these bits.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Thomas Langer <thomas.langer@intel.com>
Cc: linux-mips@linux-mips.org
Cc: john@phrozen.org
Patchwork: https://patchwork.linux-mips.org/patch/14648/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-12-11 11:20:25 +01:00
Luuk Paulussen
edb6fa1a64 MIPS: Return -ENODEV from weak implementation of rtc_mips_set_time
The sync_cmos_clock function in kernel/time/ntp.c first tries to update
the internal clock of the cpu by calling the "update_persistent_clock64"
architecture specific function.  If this returns -ENODEV, it then tries
to update an external RTC using "rtc_set_ntp_time".

On the mips architecture, the weak implementation of the underlying
function would return 0 if it wasn't overridden.  This meant that the
sync_cmos_clock function would never try to update an external RTC
(if both CONFIG_GENERIC_CMOS_UPDATE and CONFIG_RTC_SYSTOHC are
configured)

Returning -ENODEV instead, means that an external RTC will be tried.

Signed-off-by: Luuk Paulussen <luuk.paulussen@alliedtelesis.co.nz>
Reviewed-by: Richard Laing <richard.laing@alliedtelesis.co.nz>
Reviewed-by: Scott Parlane <scott.parlane@alliedtelesis.co.nz>
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14649/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-12-11 11:19:04 +01:00
WANG Cong
3111912971 e1000: use disable_hardirq() for e1000_netpoll()
In commit 02cea3958664 ("genirq: Provide disable_hardirq()")
Peter introduced disable_hardirq() for netpoll, but it is forgotten
to use it for e1000.

This patch changes disable_irq() to disable_hardirq() for e1000.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:31:19 -05:00
Keller, Jacob E
0266ac4536 i40e: don't truncate match_method assignment
The .match_method field is a u8, so we shouldn't be casting to a u16,
and because it is only one byte, we do not need to byte swap anything.
Just assign the value directly. This avoids issues on Big Endian
architectures which would have byte swapped and then incorrectly
truncated the value.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Bimmy Pujari <bimmy.pujari@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:31:19 -05:00
WingMan Kwok
6246168b4a net: ethernet: ti: netcp: add support of cpts
This patch adds support of the cpts device found in the
gbe and 10gbe ethernet switches on the keystone 2 SoCs
(66AK2E/L/Hx, 66AK2Gx).

Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:31:19 -05:00
Timur Tabi
529ed12752 net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause
Instead of having individual PHY drivers set the SUPPORTED_Pause and
SUPPORTED_Asym_Pause flags, phylib itself should set those flags,
unless there is a hardware erratum or other special case.  During
autonegotiation, the PHYs will determine whether to enable pause
frame support.

Pause frames are a feature that is supported by the MAC.  It is the MAC
that generates the frames and that processes them.  The PHY can only be
configured to allow them to pass through.

This commit also effectively reverts the recently applied c7a61319
("net: phy: dp83848: Support ethernet pause frames").

So the new process is:

1) Unless the PHY driver overrides it, phylib sets the SUPPORTED_Pause
and SUPPORTED_AsymPause bits in phydev->supported.  This indicates that
the PHY supports pause frames.

2) The MAC driver checks phydev->supported before it calls phy_start().
If (SUPPORTED_Pause | SUPPORTED_AsymPause) is set, then the MAC driver
sets those bits in phydev->advertising, if it wants to enable pause
frame support.

3) When the link state changes, the MAC driver checks phydev->pause and
phydev->asym_pause,  If the bits are set, then it enables the corresponding
features in the MAC.  The algorithm is:

	if (phydev->pause)
		The MAC should be programmed to receive and honor
                pause frames it receives, i.e. enable receive flow control.

	if (phydev->pause != phydev->asym_pause)
		The MAC should be programmed to transmit pause
		frames when needed, i.e. enable transmit flow control.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:31:19 -05:00
Asbjørn Sloth Tønnesen
fba40c632c net: l2tp: ppp: change PPPOL2TP_MSG_* => L2TP_MSG_*
Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:29:11 -05:00
Asbjørn Sloth Tønnesen
47c3e7783b net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*
PPPOL2TP_MSG_* and L2TP_MSG_* are duplicates, and are being used
interchangeably in the kernel, so let's standardize on L2TP_MSG_*
internally, and keep PPPOL2TP_MSG_* defined in UAPI for compatibility.

Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:29:11 -05:00
Asbjørn Sloth Tønnesen
41c43fbee6 net: l2tp: export debug flags to UAPI
Move the L2TP_MSG_* definitions to UAPI, as it is part of
the netlink API.

Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:29:11 -05:00
David S. Miller
0972770eea Merge branch 'sxgbe-stmmac-remove-private-tx-lock'
Lino Sanfilippo says:

====================
Remove private tx queue locks

this patch series removes unnecessary private locks in the sxgbe and the
stmmac driver.

v2:
- adjust commit message
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:27:02 -05:00
Lino Sanfilippo
739c8e149a net: ethernet: stmmac: remove private tx queue lock
The driver uses a private lock for synchronization of the xmit function and
the xmit completion handler, but since the NETIF_F_LLTX flag is not set,
the xmit function is also called with the xmit_lock held.

On the other hand the completion handler uses the reverse locking order by
first taking the private lock and (in case that the tx queue had been
stopped) then the xmit_lock.

Improve the locking by removing the private lock and using only the
xmit_lock for synchronization instead.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:26:54 -05:00
Lino Sanfilippo
980f140493 net: ethernet: sxgbe: remove private tx queue lock
The driver uses a private lock for synchronization of the xmit function and
the xmit completion handler, but since the NETIF_F_LLTX flag is not set,
the xmit function is also called with the xmit_lock held.

On the other hand the completion handler uses the reverse locking order by
first taking the private lock and (in case that the tx queue had been
stopped) then the xmit_lock.

Improve the locking by removing the private lock and using only the
xmit_lock for synchronization instead.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:23:35 -05:00
David S. Miller
c280b48266 Merge branch 'bridge-fast-ageing-on-topology-change'
Vivien Didelot says:

====================
net: bridge: fast ageing on topology change

802.1D [1] specifies that the bridges in a network must use a short
value to age out dynamic entries in the Filtering Database for a period,
once a topology change has been communicated by the root bridge.

This patchset fixes this for the in-kernel STP implementation.

Once the topology change flag is set in a net_bridge instance, the
ageing time value is shorten to twice the forward delay used by the
topology.

When the topology change flag is cleared, the ageing time configured for
the bridge is restored.

To accomplish that, a new bridge_ageing_time member is added to the
net_bridge structure, to store the user configured bridge ageing time.

Two helpers are added to offload the ageing time and set the topology
change flag in the net_bridge instance. Then the required logic is added
in the topology change helper if in-kernel STP is used.

This has been tested on the following topology:

    +--------------+
    | root bridge  |
    |  1  2  3  4  |
    +--+--+--+--+--+
       |  |  |  |      +--------+
       |  |  |  +------| laptop |
       |  |  |         +--------+
    +--+--+--+-----+
    |  1  2  3     |
    | slave bridge |
    +--------------+

When unplugging/replugging the laptop, the slave bridge (under test)
gets the topology change flag sent by the root bridge, and fast ageing
is triggered on the bridges. Once the topology change timer of the root
bridge expires, the topology change flag is cleared and the configured
ageing time is restored on the bridges.

A similar test has been done between two bridges under test.
When changing the forward delay of the root bridge with:

    # echo 3000 > /sys/class/net/br0/bridge/forward_delay

the ageing time correctly changes on both bridges from 300s to 60s while
the TOPOLOGY_CHANGE flag is present.

[1] "8.3.5 Notifying topology changes",
    http://profesores.elo.utfsm.cl/~agv/elo309/doc/802.1D-1998.pdf

No change since RFC: https://lkml.org/lkml/2016/10/19/828
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 21:28:39 -05:00
Vivien Didelot
34d8acd8aa net: bridge: shorten ageing time on topology change
802.1D [1] specifies that the bridges must use a short value to age out
dynamic entries in the Filtering Database for a period, once a topology
change has been communicated by the root bridge.

Add a bridge_ageing_time member in the net_bridge structure to store the
bridge ageing time value configured by the user (ioctl/netlink/sysfs).

If we are using in-kernel STP, shorten the ageing time value to twice
the forward delay used by the topology when the topology change flag is
set. When the flag is cleared, restore the configured ageing time.

[1] "8.3.5 Notifying topology changes ",
    http://profesores.elo.utfsm.cl/~agv/elo309/doc/802.1D-1998.pdf

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 21:28:28 -05:00