29907 Commits

Author SHA1 Message Date
Peng Sun
781e62823c bpf: decrease usercnt if bpf_map_new_fd() fails in bpf_map_get_fd_by_id()
In bpf/syscall.c, bpf_map_get_fd_by_id() use bpf_map_inc_not_zero()
to increase the refcount, both map->refcnt and map->usercnt. Then, if
bpf_map_new_fd() fails, should handle map->usercnt too.

Fixes: bd5f5f4ecb78 ("bpf: Add BPF_MAP_GET_FD_BY_ID")
Signed-off-by: Peng Sun <sironhide0null@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-26 19:08:30 +01:00
David S. Miller
70f3522614 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Three conflicts, one of which, for marvell10g.c is non-trivial and
requires some follow-up from Heiner or someone else.

The issue is that Heiner converted the marvell10g driver over to
use the generic c45 code as much as possible.

However, in 'net' a bug fix appeared which makes sure that a new
local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0
is cleared.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:06:19 -08:00
Linus Torvalds
c4eb1e1852 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Hopefully the last pull request for this release. Fingers crossed:

   1) Only refcount ESP stats on full sockets, from Martin Willi.

   2) Missing barriers in AF_UNIX, from Al Viro.

   3) RCU protection fixes in ipv6 route code, from Paolo Abeni.

   4) Avoid false positives in untrusted GSO validation, from Willem de
      Bruijn.

   5) Forwarded mesh packets in mac80211 need more tailroom allocated,
      from Felix Fietkau.

   6) Use operstate consistently for linkup in team driver, from George
      Wilkie.

   7) ThunderX bug fixes from Vadim Lomovtsev. Mostly races between VF
      and PF code paths.

   8) Purge ipv6 exceptions during netdevice removal, from Paolo Abeni.

   9) nfp eBPF code gen fixes from Jiong Wang.

  10) bnxt_en firmware timeout fix from Michael Chan.

  11) Use after free in udp/udpv6 error handlers, from Paolo Abeni.

  12) Fix a race in x25_bind triggerable by syzbot, from Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
  net: phy: realtek: Dummy IRQ calls for RTL8366RB
  tcp: repaired skbs must init their tso_segs
  net/x25: fix a race in x25_bind()
  net: dsa: Remove documentation for port_fdb_prepare
  Revert "bridge: do not add port to router list when receives query with source 0.0.0.0"
  selftests: fib_tests: sleep after changing carrier. again.
  net: set static variable an initial value in atl2_probe()
  net: phy: marvell10g: Fix Multi-G advertisement to only advertise 10G
  bpf, doc: add bpf list as secondary entry to maintainers file
  udp: fix possible user after free in error handler
  udpv6: fix possible user after free in error handler
  fou6: fix proto error handler argument type
  udpv6: add the required annotation to mib type
  mdio_bus: Fix use-after-free on device_register fails
  net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255
  bnxt_en: Wait longer for the firmware message response to complete.
  bnxt_en: Fix typo in firmware message timeout logic.
  nfp: bpf: fix ALU32 high bits clearance bug
  nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K
  Documentation: networking: switchdev: Update port parent ID section
  ...
2019-02-24 09:28:26 -08:00
Thomas Gleixner
a324ca9cad irqchip updates for Linux 5.1
- Core pseudo-NMI handling code
 - Allow the default irq domain to be retrieved
 - A new interrupt controller for the Loongson LS1X platform
 - Affinity support for the SiFive PLIC
 - Better support for the iMX irqsteer driver
 - NUMA aware memory allocations for GICv3
 - A handful of other fixes (i8259, GICv3, PLIC)
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlxwGtgVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpD+2YP/2m9cVU3Z9ak8+HdSblq2Sw8QPfd
 RshYS+DzppLUzhzj2w2jnz9eP2fWEqBwrQmvtOI8Fo+id0PvdE3ngaP4hPMJDyuU
 Ou02TV6YwE4jknoO02RXOdeBJArccc1WR5++YZjp1gGUABFUPCHwKLoZgysurapV
 sZQ1Ten3wlsrZKKNTdWfYFWB36d7J3eqFYeGy3sll1wQ6XUbHmUJPPrSfXMqDYzY
 giDD/DH8IIhfnRs+T2TxGzKtTDMnJRYJYQK2bNgtNAW+wEY2BtCLSHj8//3bK0R9
 Jek9xg1NLpbQE+T8f2ZUd6BjbVxmDd3mGPvshXKyHFESl4fvC9yrddC86dBzHwrN
 VJmaES974PBuMtE2xPZGInh77EcelVC7OPeXsnjVMrUZo0s7tFY/TWA+rqCOLmgC
 A+0jagCDx1nTTYGXsqoyrHThoQoYZRX6AnXFeDJb9OLo3cV7x4w/FPORstM0PbAc
 butyZulVg1YQ+Y+oJK/UvIkdFL7FFqB/kgZK/lrL0InvbQMj4CBt3bsWY5OxgInF
 E02tgzEnrx1nHGi1XPnCTOs7DnKeaPR/h/u3PjoT7FeiZLClyiGDw7V/NuF+buLB
 w7Pqpn835CnkXC27MycTjPo23eZv690M4vcHL4vrhN+iuGp+2hZdXUiR15mZnH6m
 g0N8anZbL1iol0Gm
 =M6YA
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier

- Core pseudo-NMI handling code
- Allow the default irq domain to be retrieved
- A new interrupt controller for the Loongson LS1X platform
- Affinity support for the SiFive PLIC
- Better support for the iMX irqsteer driver
- NUMA aware memory allocations for GICv3
- A handful of other fixes (i8259, GICv3, PLIC)
2019-02-23 10:53:31 +01:00
David S. Miller
ea34a00364 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-02-23

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a bug in BPF's LPM deletion logic to match correct prefix
   length, from Alban.

2) Fix AF_XDP teardown by not destroying umem prematurely as it
   is still needed till all outstanding skbs are freed, from Björn.

3) Fix unkillable BPF_PROG_TEST_RUN under preempt kernel by checking
   signal_pending() outside need_resched() condition which is never
   triggered there, from Stanislav.

4) Fix two nfp JIT bugs, one in code emission for K-based xor, and
   another one to explicitly clear upper bits in alu32, from Jiong.

5) Add bpf list address to maintainers file, from Daniel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 20:45:38 -08:00
Alexander Shishkin
c60f83b813 perf, pt, coresight: Fix address filters for vmas with non-zero offset
Currently, the address range calculation for file-based filters works as
long as the vma that maps the matching part of the object file starts
from offset zero into the file (vm_pgoff==0). Otherwise, the resulting
filter range would be off by vm_pgoff pages. Another related problem is
that in case of a partially matching vma, that is, a vma that matches
part of a filter region, the filter range size wouldn't be adjusted.

Fix the arithmetics around address filter range calculations, taking
into account vma offset, so that the entire calculation is done before
the filter configuration is passed to the PMU drivers instead of having
those drivers do the final bit of arithmetics.

Based on the patch by Adrian Hunter <adrian.hunter.intel.com>.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 375637bc5249 ("perf/core: Introduce address range filtering")
Link: http://lkml.kernel.org/r/20190215115655.63469-3-alexander.shishkin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-22 16:52:07 -03:00
Alexander Shishkin
18736eef12 perf: Copy parent's address filter offsets on clone
When a child event is allocated in the inherit_event() path, the VMA
based filter offsets are not copied from the parent, even though the
address space mapping of the new task remains the same, which leads to
no trace for the new task until exec.

Reported-by: Mansour Alharthi <malharthi9@gatech.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 375637bc5249 ("perf/core: Introduce address range filtering")
Link: http://lkml.kernel.org/r/20190215115655.63469-2-alexander.shishkin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-22 16:52:07 -03:00
Alban Crequy
7c0cdf0b39 bpf, lpm: fix lookup bug in map_delete_elem
trie_delete_elem() was deleting an entry even though it was not matching
if the prefixlen was correct. This patch adds a check on matchlen.

Reproducer:

$ sudo bpftool map create /sys/fs/bpf/mylpm type lpm_trie key 8 value 1 entries 128 name mylpm flags 1
$ sudo bpftool map update pinned /sys/fs/bpf/mylpm key hex 10 00 00 00 aa bb cc dd value hex 01
$ sudo bpftool map dump pinned /sys/fs/bpf/mylpm
key: 10 00 00 00 aa bb cc dd  value: 01
Found 1 element
$ sudo bpftool map delete pinned /sys/fs/bpf/mylpm key hex 10 00 00 00 ff ff ff ff
$ echo $?
0
$ sudo bpftool map dump pinned /sys/fs/bpf/mylpm
Found 0 elements

A similar reproducer is added in the selftests.

Without the patch:

$ sudo ./tools/testing/selftests/bpf/test_lpm_map
test_lpm_map: test_lpm_map.c:485: test_lpm_delete: Assertion `bpf_map_delete_elem(map_fd, key) == -1 && errno == ENOENT' failed.
Aborted

With the patch: test_lpm_map runs without errors.

Fixes: e454cf595853 ("bpf: Implement map_delete_elem for BPF_MAP_TYPE_LPM_TRIE")
Cc: Craig Gallek <kraig@google.com>
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-22 16:17:53 +01:00
Alexei Starovoitov
e80d02dd76 seccomp, bpf: disable preemption before calling into bpf prog
All BPF programs must be called with preemption disabled.

Fixes: 568f196756ad ("bpf: check that BPF programs run with preemption disabled")
Reported-by: syzbot+8bf19ee2aa580de7a2a7@syzkaller.appspotmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-22 00:14:19 +01:00
Johannes Weiner
4e37504d1c psi: avoid divide-by-zero crash inside virtual machines
We've been seeing hard-to-trigger psi crashes when running inside VM
instances:

    divide error: 0000 [#1] SMP PTI
    Modules linked in: [...]
    CPU: 0 PID: 212 Comm: kworker/0:2 Not tainted 4.16.18-119_fbk9_3817_gfe944c98d695 #119
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
    Workqueue: events psi_clock
    RIP: 0010:psi_update_stats+0x270/0x490
    RSP: 0018:ffffc90001117e10 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800a35a13f8
    RDX: 0000000000000000 RSI: ffff8800a35a1340 RDI: 0000000000000000
    RBP: 0000000000000658 R08: ffff8800a35a1470 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: 00000000000f8502
    FS:  0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fbe370fa000 CR3: 00000000b1e3a000 CR4: 00000000000006f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     psi_clock+0x12/0x50
     process_one_work+0x1e0/0x390
     worker_thread+0x2b/0x3c0
     ? rescuer_thread+0x330/0x330
     kthread+0x113/0x130
     ? kthread_create_worker_on_cpu+0x40/0x40
     ? SyS_exit_group+0x10/0x10
     ret_from_fork+0x35/0x40
    Code: 48 0f 47 c7 48 01 c2 45 85 e4 48 89 16 0f 85 e6 00 00 00 4c 8b 49 10 4c 8b 51 08 49 69 d9 f2 07 00 00 48 6b c0 64 4c 8b 29 31 d2 <48> f7 f7 49 69 d5 8d 06 00 00 48 89 c5 4c 69 f0 00 98 0b 00 48

The Code-line points to `period` being 0 inside update_stats(), and we
divide by that when calculating that period's pressure percentage.

The elapsed period should never be 0.  The reason this can happen is due
to an off-by-one in the idle time / missing period calculation combined
with a coarse sched_clock() in the virtual machine.

The target time for aggregation is advanced into the future on a fixed
grid to prevent clock drift.  So when an aggregation runs after some idle
period, we can not just set it to "now + psi_period", but have to
calculate the downtime and advance the target time relative to itself.

However, if the aggregator was disabled exactly one psi_period (ns), we
drop one idle period in the calculation due to a > when we should do >=.
In that case, next_update will be advanced from 'now - psi_period' to
'now' when it should be moved to 'now + psi_period'.  The run finishes
with last_update == next_update == sched_clock().

With hardware clocks, this exact nanosecond match isn't likely in the
first place; but if it does happen, the clock will still have moved on and
the period non-zero by the time the worker runs.  A pointlessly short
period, but besides the extra work, no harm no foul.  However, a slow
sched_clock() like we have on VMs might not have advanced either by the
time the worker runs again.  And when we calculate the elapsed period, the
result, our pressure divisor, will be 0.  Ouch.

Fix this by correctly handling the situation when the elapsed time between
aggregation runs is precisely two periods, and advance the expiration
timestamp correctly to period into the future.

Link: http://lkml.kernel.org/r/20190214193157.15788-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Łukasz Siudut <lsiudut@fb.com
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-21 09:01:00 -08:00
Liu Song
8bdc620178 workqueue: fix typo in comment
qeueue/queue

Signed-off-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-02-21 08:03:38 -08:00
Jann Horn
83540fbc88 tracing/perf: Use strndup_user() instead of buggy open-coded version
The first version of this method was missing the check for
`ret == PATH_MAX`; then such a check was added, but it didn't call kfree()
on error, so there was still a small memory leak in the error case.
Fix it by using strndup_user() instead of open-coding it.

Link: http://lkml.kernel.org/r/20190220165443.152385-1-jannh@google.com

Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 0eadcc7a7bc0 ("perf/core: Fix perf_uprobe_init()")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-21 10:35:10 -05:00
Michael Ellerman
d0055df0c9 Merge branch 'topic/dma' into next
Merge hch's big DMA rework series. This is in a topic branch in case he
wants to merge it to minimise conflicts.
2019-02-21 23:15:10 +11:00
Linus Walleij
3dda927fdb Merge branch 'ib-qcom-ssbi' into devel 2019-02-21 12:58:31 +01:00
Marc Zyngier
9f199dd34c irqdomain: Allow the default irq domain to be retrieved
The default irq domain allows legacy code to create irqdomain
mappings without having to track the domain it is allocating
from. Setting the default domain is a one shot, fire and forget
operation, and no effort was made to be able to retrieve this
information at a later point in time.

Newer irqdomain APIs (the hierarchical stuff) relies on both
the irqchip code to track the irqdomain it is allocating from,
as well as some form of firmware abstraction to easily identify
which piece of HW maps to which irq domain (DT, ACPI).

For systems without such firmware (or legacy platform that are
getting dragged into the 21st century), things are a bit harder.
For these cases (and these cases only!), let's provide a way
to retrieve the default domain, allowing the use of the v2 API
without having to resort to platform-specific hacks.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-21 10:32:28 +00:00
Tetsuo Handa
cbae05d32f printk: Pass caller information to log_store().
When thread1 called printk() which did not end with '\n', and then thread2
called printk() which ends with '\n' before thread1 calls pr_cont(), the
partial content saved into "struct cont" is flushed by thread2 despite the
partial content was generated by thread1. This leads to confusing output
as if the partial content was generated by thread2. Fix this problem by
passing correct caller information to log_store().

Before:

  [ T8533] abcdefghijklm
  [ T8533] ABCDEFGHIJKLMNOPQRSTUVWXYZ
  [ T8532] nopqrstuvwxyz
  [ T8532] abcdefghijklmnopqrstuvwxyz
  [ T8533] abcdefghijklm
  [ T8533] ABCDEFGHIJKLMNOPQRSTUVWXYZ
  [ T8532] nopqrstuvwxyz

After:

  [ T8507] abcdefghijklm
  [ T8508] ABCDEFGHIJKLMNOPQRSTUVWXYZ
  [ T8507] nopqrstuvwxyz
  [ T8507] abcdefghijklmnopqrstuvwxyz
  [ T8507] abcdefghijklm
  [ T8508] ABCDEFGHIJKLMNOPQRSTUVWXYZ
  [ T8507] nopqrstuvwxyz

Link: http://lkml.kernel.org/r/1550314773-8607-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
To: Dmitry Vyukov <dvyukov@google.com>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
[pmladek: broke 80-column rule where it made more harm than good]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2019-02-21 10:41:15 +01:00
Colin Ian King
9e5a36a337 tracing: Fix spelling mistake: "analagous" -> "analogous"
There is a spelling mistake in the mini-howto help text. Fix it.

Link: http://lkml.kernel.org/r/20190217223222.16479-1-colin.king@canonical.com

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-20 13:51:08 -05:00
Steven Rostedt (VMware)
1c347a94ca tracing: Comment why cond_snapshot is checked outside of max_lock protection
Before setting tr->cond_snapshot, it must be NULL before it can be updated.
It can go to NULL when a trace event hist trigger is created or removed, and
can only be modified under the max_lock spin lock. But because it can only
be set to something other than NULL under both the max_lock spin lock as
well as the trace_types_lock, we can perform the check if it is not NULL
only under the trace_types_lock and fail out without having to grab the
max_lock spin lock.

This is very subtle, and deserves a comment.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-20 13:51:08 -05:00
Tom Zanussi
e91eefd731 tracing: Add alternative synthetic event trace action syntax
Add a 'trace(synthetic_event_name, params)' alternative to
synthetic_event_name(params).

Currently, the syntax used for generating synthetic events is to
invoke synthetic_event_name(params) i.e. use the synthetic event name
as a function call.

Users requested a new form that more explicitly shows that the
synthetic event is in effect being traced.  In this version, a new
'trace()' keyword is used, and the synthetic event name is passed in
as the first argument.

In addition, for the sake of consistency with other actions, change
the documention to emphasize the trace() form over the function-call
form, which remains documented as equivalent.

Link: http://lkml.kernel.org/r/d082773e50232a001480cf837679a1e01c1a2eb7.1550100284.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-20 13:51:07 -05:00
Tom Zanussi
dff81f5592 tracing: Add hist trigger onchange() handler
Add support for a hist:onchange($var) handler, similar to the onmax()
handler but triggering whenever there's any change in $var, not just a
max.

Link: http://lkml.kernel.org/r/dfbc7e4ada242603e9ec3f049b5ad076a07dfd03.1550100284.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-20 13:51:07 -05:00
Tom Zanussi
a3785b7eca tracing: Add hist trigger snapshot() action
Add support for hist:handlerXXX($var).snapshot(), which will take a
snapshot of the current trace buffer whenever handlerXXX is hit.

As a first user, this also adds snapshot() action support for the
onmax() handler i.e. hist:onmax($var).snapshot().

Also, the hist trigger key printing is moved into a separate function
so the snapshot() action can print a histogram key outside the
histogram display - add and use hist_trigger_print_key() for that
purpose.

Link: http://lkml.kernel.org/r/2f1a952c0dcd8aca8702ce81269581a692396d45.1550100284.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-20 13:51:07 -05:00
Tom Zanussi
a35873a099 tracing: Add conditional snapshot
Currently, tracing snapshots are context-free - they capture the ring
buffer contents at the time the tracing_snapshot() function was
invoked, and nothing else.  Additionally, they're always taken
unconditionally - the calling code can decide whether or not to take a
snapshot, but the data used to make that decision is kept separately
from the snapshot itself.

This change adds the ability to associate with each trace instance
some user data, along with an 'update' function that can use that data
to determine whether or not to actually take a snapshot.  The update
function can then update that data along with any other state (as part
of the data presumably), if warranted.

Because snapshots are 'global' per-instance, only one user can enable
and use a conditional snapshot for any given trace instance.  To
enable a conditional snapshot (see details in the function and data
structure comments), the user calls tracing_snapshot_cond_enable().
Similarly, to disable a conditional snapshot and free it up for other
users, tracing_snapshot_cond_disable() should be called.

To actually initiate a conditional snapshot, tracing_snapshot_cond()
should be called.  tracing_snapshot_cond() will invoke the update()
callback, allowing the user to decide whether or not to actually take
the snapshot and update the user-defined data associated with the
snapshot.  If the callback returns 'true', tracing_snapshot_cond()
will then actually take the snapshot and return.

This scheme allows for flexibility in snapshot implementations - for
example, by implementing slightly different update() callbacks,
snapshots can be taken in situations where the user is only interested
in taking a snapshot when a new maximum in hit versus when a value
changes in any way at all.  Future patches will demonstrate both
cases.

Link: http://lkml.kernel.org/r/1bea07828d5fd6864a585f83b1eed47ce097eb45.1550100284.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-20 13:51:06 -05:00
Tom Zanussi
466f4528fb tracing: Generalize hist trigger onmax and save action
The action refactor code allowed actions and handlers to be separated,
but the existing onmax handler and save action code is still not
flexible enough to handle arbitrary coupling.  This change generalizes
them and in the process makes additional handlers and actions easier
to implement.

The onmax action can be broken up and thought of as two separate
components - a variable to be tracked (the parameter given to the
onmax($var_to_track) function) and an invisible variable created to
save the ongoing result of doing something with that variable, such as
saving the max value of that variable so far seen.

Separating it out like this and renaming it appropriately allows us to
use the same code for similar tracking functions such as
onchange($var_to_track), which would just track the last value seen
rather than the max seen so far, which is useful in some situations.

Additionally, because different handlers and actions may want to save
and access data differently e.g. save and retrieve tracking values as
local variables vs something more global, save_val() and get_val()
interface functions are introduced and max-specific implementations
are used instead.

The same goes for the code that checks whether a maximum has been hit
- a generic check_val() interface and max-checking implementation is
used instead, which allows future patches to make use of he same code
using their own implemetations of similar functionality.

Link: http://lkml.kernel.org/r/980ea73dd8e3f36db3d646f99652f8fed42b77d4.1550100284.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-20 13:51:06 -05:00
Tom Zanussi
c3e49506a0 tracing: Split up onmatch action data
Currently, the onmatch action data binds the onmatch action to data
related to synthetic event generation.  Since we want to allow the
onmatch handler to potentially invoke a different action, and because
we expect other handlers to generate synthetic events, we need to
separate the data related to these two functions.

Also rename the onmatch data to something more descriptive, and create
and use common action data destroy function.

Link: http://lkml.kernel.org/r/b9abbf9aae69fe3920cdc8ddbcaad544dd258d78.1550100284.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-20 13:51:06 -05:00
Tom Zanussi
7d18a10c31 tracing: Refactor hist trigger action code
The hist trigger action code currently implements two essentially
hard-coded pairs of 'actions' - onmax(), which tracks a variable and
saves some event fields when a max is hit, and onmatch(), which is
hard-coded to generate a synthetic event.

These hardcoded pairs (track max/save fields and detect match/generate
synthetic event) should really be decoupled into separate components
that can then be arbitrarily combined.  The first component of each
pair (track max/detect match) is called a 'handler' in the new code,
while the second component (save fields/generate synthetic event) is
called an 'action' in this scheme.

This change refactors the action code to reflect this split by adding
two handlers, HANDLER_ONMATCH and HANDLER_ONMAX, along with two
actions, ACTION_SAVE and ACTION_TRACE.

The new code combines them to produce the existing ONMATCH/TRACE and
ONMAX/SAVE functionality, but doesn't implement the other combinations
now possible.  Future patches will expand these to further useful
cases, such as ONMAX/TRACE, as well as add additional handlers and
actions such as ONCHANGE and SNAPSHOT.

Also, add abbreviated documentation for handlers and actions to
README.

Link: http://lkml.kernel.org/r/98bfdd48c1b4ff29fc5766442f99f5bc3c34b76b.1550100284.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-20 13:51:06 -05:00
zhangyi (F)
e7f0c424d0 tracing: Do not free iter->trace in fail path of tracing_open_pipe()
Commit d716ff71dd12 ("tracing: Remove taking of trace_types_lock in
pipe files") use the current tracer instead of the copy in
tracing_open_pipe(), but it forget to remove the freeing sentence in
the error path.

There's an error path that can call kfree(iter->trace) after the iter->trace
was assigned to tr->current_trace, which would be bad to free.

Link: http://lkml.kernel.org/r/1550060946-45984-1-git-send-email-yi.zhang@huawei.com

Cc: stable@vger.kernel.org
Fixes: d716ff71dd12 ("tracing: Remove taking of trace_types_lock in pipe files")
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-20 13:47:08 -05:00
Christoph Hellwig
82c5de0ab8 dma-mapping: remove the DMA_MEMORY_EXCLUSIVE flag
All users of dma_declare_coherent want their allocations to be
exclusive, so default to exclusive allocations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-20 07:27:00 -07:00
Christoph Hellwig
91a6fda95c dma-mapping: remove dma_mark_declared_memory_occupied
This API is not used anywhere, so remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-02-20 07:27:00 -07:00
Christoph Hellwig
ddb26d8e1e dma-mapping: move CONFIG_DMA_CMA to kernel/dma/Kconfig
This is where all the related code already lives.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-20 07:27:00 -07:00
Christoph Hellwig
ff4c25f26a dma-mapping: improve selection of dma_declare_coherent availability
This API is primarily used through DT entries, but two architectures
and two drivers call it directly.  So instead of selecting the config
symbol for random architectures pull it in implicitly for the actual
users.  Also rename the Kconfig option to describe the feature better.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Burton <paul.burton@mips.com> # MIPS
Acked-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-20 07:26:35 -07:00
Linus Walleij
2f7db3c70f gpio: updates for v5.1 - part 2
- gpio-mockup updates improving the user-space testing interface and
   adding line state tracking for correct edge interrupts
 - interrupt simulator patch exposing the irq type configuration to
   users
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAlxsN0kACgkQEacuoBRx
 13KDpQ/9GJvxpPn81vm8z8+IK+qQcbZ55lKIB90FB6kz0+ru6g+9gXWSA3FAifOr
 8ZxqwALIx52rLDYXpN0gQbOz4bIIXRm6eSKwm6QHyWlsW+59wvS4kAhu4a1j5jmy
 /jJlTQc+zOIkoYJX1EdRn581ehsmwftWm1kBIMudybC99vq3ks3a7nJjcNL/OYYo
 quvsVabB2n2/At4g4SfP4BRA1Hfgb1+X8rUcqHiIKlYHy6bVggrLLOcyEAwECoIT
 uvmXNGGD5g8W//sTi/Ex8xmR9xSdF3tI3PQQODvrRU4nbp7gYOIP7qFzUrBeZ+dP
 tRWmy0FVj6DaHbm9SBhVa/i8na7K04ibUAr/oknrPkBc55eSf3A9SWuqDa7HoC+L
 voFlndfhx8l8yEMQAui+S/NapRSOLm/UeOSBdkpe/NQSEOoi6QYDT+fqc0xCYd9F
 tvAjTLwDVpP17AxUHiDQFog/XESiNyhfGTB0Ca4utYfk6dhSyS7M2O4pO7p+RPt+
 /IMoL5KTkuJ3HMyC22M/yCyQrIYrNBc0dKG1JVKCCLQSLAfdtZ7zjQSiwp6uPlL8
 hW9D4k+FKX93jQm4Z0c/JSIC/O04pElILTdf0Oy5ki4QpgJvJpYqMupVaJ/rUyTF
 FsYcmRS3dXfGLbfTJUSnx8P293qiRktO8968RHX4PWRi2sF/L5w=
 =mEAi
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v5.1-updates-for-linus-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into devel

gpio: updates for v5.1 - part 2

- gpio-mockup updates improving the user-space testing interface and
  adding line state tracking for correct edge interrupts
- interrupt simulator patch exposing the irq type configuration to
  users
2019-02-20 09:43:45 +01:00
David S. Miller
375ca548f7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Two easily resolvable overlapping change conflicts, one in
TCP and one in the eBPF verifier.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-20 00:34:07 -08:00
Linus Torvalds
40e196a906 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix suspend and resume in mt76x0u USB driver, from Stanislaw
    Gruszka.

 2) Missing memory barriers in xsk, from Magnus Karlsson.

 3) rhashtable fixes in mac80211 from Herbert Xu.

 4) 32-bit MIPS eBPF JIT fixes from Paul Burton.

 5) Fix for_each_netdev_feature() on big endian, from Hauke Mehrtens.

 6) GSO validation fixes from Willem de Bruijn.

 7) Endianness fix for dwmac4 timestamp handling, from Alexandre Torgue.

 8) More strict checks in tcp_v4_err(), from Eric Dumazet.

 9) af_alg_release should NULL out the sk after the sock_put(), from Mao
    Wenan.

10) Missing unlock in mac80211 mesh error path, from Wei Yongjun.

11) Missing device put in hns driver, from Salil Mehta.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
  sky2: Increase D3 delay again
  vhost: correctly check the return value of translate_desc() in log_used()
  net: netcp: Fix ethss driver probe issue
  net: hns: Fixes the missing put_device in positive leg for roce reset
  net: stmmac: Fix a race in EEE enable callback
  qed: Fix iWARP syn packet mac address validation.
  qed: Fix iWARP buffer size provided for syn packet processing.
  r8152: Add support for MAC address pass through on RTL8153-BD
  mac80211: mesh: fix missing unlock on error in table_path_del()
  net/mlx4_en: fix spelling mistake: "quiting" -> "quitting"
  net: crypto set sk to NULL when af_alg_release.
  net: Do not allocate page fragments that are not skb aligned
  mm: Use fixed constant in page_frag_alloc instead of size + 1
  tcp: tcp_v4_err() should be more careful
  tcp: clear icsk_backoff in tcp_write_queue_purge()
  net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe()
  qmi_wwan: apply SET_DTR quirk to Sierra WP7607
  net: stmmac: handle endianness in dwmac4_get_timestamp
  doc: Mention MSG_ZEROCOPY implementation for UDP
  mlxsw: __mlxsw_sp_port_headroom_set(): Fix a use of local variable
  ...
2019-02-19 16:13:19 -08:00
Peter Zijlstra
568f196756 bpf: check that BPF programs run with preemption disabled
Introduce cant_sleep() macro for annotation of functions that
cannot sleep.

Use it in BPF_PROG_RUN to catch execution of BPF programs in
preemptable context.

Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-19 21:53:07 +01:00
Bartosz Golaszewski
8d91ecc84d irq/irq_sim: add irq_set_type() callback
Implement the irq_set_type() callback and call irqd_set_trigger_type()
internally so that users interested in the configured trigger type can
later retrieve it using irqd_get_trigger_type(). We only support edge
trigger types.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-19 17:42:28 +01:00
Masahiro Yamada
6a613d24ef cpuset: remove unused task_has_mempolicy()
This is a remnant of commit 5f155f27cb7f ("mm, cpuset: always use
seqlock when changing task's nodemask").

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2019-02-19 06:26:02 -08:00
Linus Torvalds
10f4902173 Two more tracing fixes
- Have kprobes not use copy_from_user() to access kernel addresses,
    because kprobes can legitimately poke at bad kernel memory, which
    will fault. Copy from user code should never fault in kernel space.
    Using probe_mem_read() can handle kernel address space faulting.
 
  - Put back the entries counter in the tracing output that was accidentally
    removed.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXGb7BxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qqvaAQC66gQ79frSW7xPjJ4Y+qLIm0YDV18i
 aCHowAXxDeK3qAEA3sDeELAPVupacPrzZc6zejI+bf0HArPe08n3vlHwAQw=
 =uUQj
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.0-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Two more tracing fixes

   - Have kprobes not use copy_from_user() to access kernel addresses,
     because kprobes can legitimately poke at bad kernel memory, which
     will fault. Copy from user code should never fault in kernel space.
     Using probe_mem_read() can handle kernel address space faulting.

   - Put back the entries counter in the tracing output that was
     accidentally removed"

* tag 'trace-v5.0-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix number of entries in trace header
  kprobe: Do not use uaccess functions to access kernel memory that can fault
2019-02-18 09:40:16 -08:00
Christoph Hellwig
feee96440c swiotlb: remove swiotlb_dma_supported
The only user left is powerpc, but even there the generic dma-direct
version works just as well, given that we guarantee that the swiotlb
buffer must always be addressable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-18 22:41:04 +11:00
Christoph Hellwig
11ddce1545 dma-mapping, powerpc: simplify the arch dma_set_mask override
Instead of letting the architecture supply all of dma_set_mask just
give it an additional hook selected by Kconfig.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-18 22:41:03 +11:00
Christoph Hellwig
ffe3dfd4e3 powerpc/dma: stop overriding dma_get_required_mask
The ppc_md and pci_controller_ops methods are unused now and can be
removed.  The dma_nommu implementation is generic to the generic one
except for using max_pfn instead of calling into the memblock API,
and all other dma_map_ops instances implement a method of their own.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-18 22:41:03 +11:00
Christoph Hellwig
fbce251baa dma-direct: we might need GFP_DMA for 32-bit dma masks
If there is no ZONE_DMA32 we might need GFP_DMA to be able to
allocate memory that satisfies a 32-bit DMA mask.

Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-18 22:41:01 +11:00
Thomas Gleixner
a6a309edba genirq/affinity: Remove the leftovers of the original set support
Now that the NVME driver is converted over to the calc_set() callback, the
workarounds of the original set support can be removed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.689834224@linutronix.de
2019-02-18 11:21:29 +01:00
Ming Lei
c66d4bd110 genirq/affinity: Add new callback for (re)calculating interrupt sets
The interrupt affinity spreading mechanism supports to spread out
affinities for one or more interrupt sets. A interrupt set contains one or
more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queus of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilites and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via a
pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires to have a loop
in the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources.  This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and their
size, in the core code. As the core code does not have any knowledge about the
underlying device, a driver specific callback is required in struct
irq_affinity, which can be invoked by the core code. The callback gets the
number of available interupts as an argument, so the driver can calculate the
corresponding number and size of interrupt sets.

At the moment the struct irq_affinity pointer which is handed in from the
driver and passed through to several core functions is marked 'const', but for
the callback to be able to modify the data in the struct it's required to
remove the 'const' qualifier.

Add the optional callback to struct irq_affinity, which allows drivers to
recalculate the number and size of interrupt sets and remove the 'const'
qualifier.

For simple invocations, which do not supply a callback, a default callback
is installed, which just sets nr_sets to 1 and transfers the number of
spreadable vectors to the set_size array at index 0.

This is for now guarded by a check for nr_sets != 0 to keep the NVME driver
working until it is converted to the callback mechanism.

To make sure that the driver configuration is correct under all circumstances
the callback is invoked even when there are no interrupts for queues left,
i.e. the pre/post requirements already exhaust the numner of available
interrupts.

At the PCI layer irq_create_affinity_masks() has to be invoked even for the
case where the legacy interrupt is used. That ensures that the callback is
invoked and the device driver can adjust to that situation.

[ tglx: Fixed the simple case (no sets required). Moved the sanity check
  	for nr_sets after the invocation of the callback so it catches
  	broken drivers. Fixed the kernel doc comments for struct
  	irq_affinity and de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.512444498@linutronix.de
2019-02-18 11:21:28 +01:00
Ming Lei
9cfef55bb5 genirq/affinity: Store interrupt sets size in struct irq_affinity
The interrupt affinity spreading mechanism supports to spread out
affinities for one or more interrupt sets. A interrupt set contains one
or more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queus of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilites and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via
a pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires to have a
loop in the driver to determine the maximum number of interrupts which
are provided by the PCI capabilities and the underlying CPU resources.
This loop would have to be replicated in every driver which wants to
utilize this mechanism. That's unwanted code duplication and error
prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and
their size, in the core code. As the core code does not have any
knowledge about the underlying device, a driver specific callback will
be added to struct affinity_desc, which will be invoked by the core
code. The callback will get the number of available interupts as an
argument, so the driver can calculate the corresponding number and size
of interrupt sets.

To support this, two modifications for the handling of struct irq_affinity
are required:

1) The (optional) interrupt sets size information is contained in a
   separate array of integers and struct irq_affinity contains a
   pointer to it.

   This is cumbersome and as the maximum number of interrupt sets is small,
   there is no reason to have separate storage. Moving the size array into
   struct affinity_desc avoids indirections and makes the code simpler.

2) At the moment the struct irq_affinity pointer which is handed in from
   the driver and passed through to several core functions is marked
   'const'.

   With the upcoming callback to recalculate the number and size of
   interrupt sets, it's necessary to remove the 'const'
   qualifier. Otherwise the callback would not be able to update the data.

Implement #1 and store the interrupt sets size in 'struct irq_affinity'.

No functional change.

[ tglx: Fixed the memcpy() size so it won't copy beyond the size of the
  	source. Fixed the kernel doc comments for struct irq_affinity and
  	de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.423723127@linutronix.de
2019-02-18 11:21:27 +01:00
Thomas Gleixner
0145c30e89 genirq/affinity: Code consolidation
All information and calculations in the interrupt affinity spreading code
is strictly unsigned int. Though the code uses int all over the place.

Convert it over to unsigned int.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.336424556@linutronix.de
2019-02-18 11:21:27 +01:00
Linus Walleij
8fab3d713c gpio updates for v5.1
- support for a new variant of pca953x
 - documentation fix from Wolfram
 - some tegra186 name changes
 - two minor fixes for madera and altera-a10sr
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAlxleLcACgkQEacuoBRx
 13I45Q//YMGUYzkMjOL+lp2DYnnVhVNqrF4hoLjinWVrnhZ6gqu88RgV2Cea4Pta
 oxVxnSsE8LK7kY8VZ8tcBmIqLLkQAJdSVtqkeSoZF2vhWBAbE9ZaSOYb17SIkSXK
 Ok16lZgZ+ZWOM5EjEvuRpB/qYGjX2glD5/Y2Kl7+wsX1W6U2pXasP0IjhcvDU8mJ
 NXNgfkr6kluMUqHJyqKo8eT/P3Hdv0CK9GsN2vGyfJenCdTSd7EC6KuhWAivi+fG
 /lf1bVuc2cCiXjxdSOXx+Yz7SjNe56viTaqnn/K6OlfLgErjKnRW+AxPkTZXNtDi
 pfMMpPXiwPcbQR2wrXG/7OMmJ1kUsfWoIUCx5RDwhF1KbEQVqgaSITLylk+4Yp/3
 eM0fYsQ+KvOdAnWKSgfxBhaaiO7z5XDdrnkSHBDoiBrm07BqBgK/v3Rivzf2GMEv
 QvM4OBfThS9I8skV5BaOBRDfHZs4N0EU/vhsW9gt50urtlSM0vSYx6kdMq/8R0k4
 NkJT43u+1vi5koMljBAsZYZiyXOQ2B+PlfpTMfMu+93QH8wlu9mOt1r3YTQyA1Xf
 jiOK8M2yQKP5g7RuPM6MtMsqlZKDM5nAlSf7S280Z3+vBd+LaELbXvT2/JL5ViGU
 hfH/gaNwUGUYd8EsWvfhHVdPAAecDCwxfKyKEnFGhMrtunTgwfI=
 =nV64
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v5.1-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into devel

gpio updates for v5.1

- support for a new variant of pca953x
- documentation fix from Wolfram
- some tegra186 name changes
- two minor fixes for madera and altera-a10sr
2019-02-17 21:59:33 +01:00
Linus Torvalds
dd6f29da69 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Two fixes on the kernel side: fix an over-eager condition that failed
  larger perf ring-buffer sizes, plus fix crashes in the Intel BTS code
  for a corner case, found by fuzzing"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix impossible ring-buffer sizes warning
  perf/x86: Add check_period PMU callback
2019-02-17 08:38:13 -08:00
David S. Miller
885e631959 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-02-16

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong.

2) test all bpf progs in alu32 mode, from Jiong.

3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin.

4) support for IP encap in lwt bpf progs, from Peter.

5) remove XDP_QUERY_XSK_UMEM dead code, from Jan.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-16 22:56:34 -08:00
David S. Miller
6e1077f514 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2019-02-16

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) fix lockdep false positive in bpf_get_stackid(), from Alexei.

2) several AF_XDP fixes, from Bjorn, Magnus, Davidlohr.

3) fix narrow load from struct bpf_sock, from Martin.

4) mips JIT fixes, from Paul.

5) gso handling fix in bpf helpers, from Willem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-16 22:34:07 -08:00
YueHaibing
22cb45d769 swiotlb: drop pointless static qualifier in swiotlb_create_debugfs()
There is no need to have the 'struct dentry *d_swiotlb_usage' variable
static since new value always be assigned before use it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2019-02-16 11:36:34 -05:00